Context-Aware Web Attack Detection in Open-Source SIEM Systems via MITRE ATT&CK-Enriched Behavioral Profiling

Abstract: Security Information and Event Management (SIEM) systems aggregate log data from heterogeneous sources to detect coordinated attacks. Traditional rule-based correlation engines struggle to classify multi-step web application attacks because they examine each event without reference to the behavioural history of the originating host.
We present Smart-SIEM, an AI module for the open-source Wazuh SIEM platform with two contributions: (1) a per-source-IP behavioural context vector encoding HTTP response-status distributions, peak rule activation counts, and MITRE ATT&CK technique frequencies from the N most recent prior events; (2) a two-stage hybrid cascade combining LightGBM for binary attack detection and XGBoost for six-class attack categorisation.
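As a rough illustration of contribution (1), the sketch below builds a per-source-IP context vector from the N most recent prior events. It is not the Smart-SIEM implementation: the event field names (`src_ip`, `status`, `rule_fired`, `techniques`), the window size, the technique vocabulary, and the `ContextTracker` class are all assumptions made for the example.

```python
from collections import Counter, defaultdict, deque

N_RECENT = 5  # assumed window size; the abstract only says "N most recent prior events"
STATUS_CLASSES = ("2xx", "3xx", "4xx", "5xx")
TECHNIQUES = ("T1110", "T1190")  # illustrative ATT&CK technique IDs, not the paper's list


class ContextTracker:
    """Hypothetical helper that keeps a bounded event history per source IP."""

    def __init__(self, n_recent=N_RECENT):
        # deque(maxlen=...) silently drops the oldest event once the window is full
        self._history = defaultdict(lambda: deque(maxlen=n_recent))

    def context_vector(self, src_ip):
        """Summarise the stored prior events for one source IP."""
        events = self._history[src_ip]
        total = len(events)
        status_counts = Counter(f"{e['status'] // 100}xx" for e in events)
        technique_counts = Counter(t for e in events for t in e["techniques"])

        # HTTP response-status distribution as fractions of the window
        vector = {
            f"status_{c}": (status_counts[c] / total if total else 0.0)
            for c in STATUS_CLASSES
        }
        # Peak rule activation count seen in the window
        vector["peak_rule_fired"] = max((e["rule_fired"] for e in events), default=0)
        # Frequency of each tracked MITRE ATT&CK technique
        for t in TECHNIQUES:
            vector[f"mitre_{t}"] = technique_counts[t]
        return vector

    def observe(self, event):
        """Return the context for this event, then add the event to the history."""
        vector = self.context_vector(event["src_ip"])
        self._history[event["src_ip"]].append(event)
        return vector
```

Computing the vector before appending the event keeps the context restricted to prior events, so the features for an event never include that event itself.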
Evaluated on 46,454 purpose-built Wazuh security events, context features improve all tested gradient boosting algorithms from ~0.705 macro F1 to 0.947-0.967 (Stage 1) and 0.876-0.914 (Stage 2), average gains of +0.254 and +0.324, respectively. The hybrid cascade achieves an F1 of 0.967 (binary) and 0.914 (six-class). Wazuh's native rule engine detects 0% of Brute Force and Broken Authentication events, whereas the AI module detects 100% and 98.3%, respectively. A self-adaptive retraining mechanism recovers from concept drift: F1 drops from 0.905 to 0.465 when unseen attack types emerge and recovers to 0.814 after retraining on the combined corpus.
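To make the cascade structure and the reported metric concrete, the following sketch shows a two-stage prediction in which Stage 1 gates Stage 2, together with a plain macro F1 computation (the unweighted mean of per-class F1 scores). The stage functions are toy threshold rules standing in for the trained LightGBM and XGBoost models; their names, thresholds, feature keys, and class labels are assumptions for illustration only.

```python
def cascade_predict(event, is_attack, categorise, benign_label="benign"):
    """Stage 1 gates Stage 2: only events flagged as attacks are categorised."""
    return categorise(event) if is_attack(event) else benign_label


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("macro_f1 needs at least one label")
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical stand-ins for the trained Stage 1 and Stage 2 models
def toy_stage1(event):
    return event["status_4xx"] > 0.5


def toy_stage2(event):
    return "brute_force" if event["mitre_T1110"] > 0 else "other_attack"


events = [
    {"status_4xx": 0.9, "mitre_T1110": 3},
    {"status_4xx": 0.1, "mitre_T1110": 0},
    {"status_4xx": 0.8, "mitre_T1110": 0},
]
predictions = [cascade_predict(e, toy_stage1, toy_stage2) for e in events]
```

Because macro F1 weights every class equally, a rare attack class that is detected poorly lowers the score as much as a common one would.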

Comments:	38 pages, 13 figures, 13 tables
Subjects:	Cryptography and Security (cs.CR); Machine Learning (cs.LG)
ACM classes:	C.2.0; K.6.5
Cite as:	arXiv:2605.13337 [cs.CR]
	(or arXiv:2605.13337v1 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2605.13337 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Badr Alboushy
[v1] Wed, 13 May 2026 10:54:36 UTC (1,177 KB)