Intrusion Detection Systems (IDS) deploy sensors at the network or host level to monitor traffic and system activity. Security operations centers (SOCs) or managed security services analyze the resulting alerts. Alert correlation studies relationships among these alerts—grouping duplicates, inferring cause and effect, and reconstructing multi-step attack scenarios for better understanding and prediction.
Alerts can relate to each other in several ways. Correlation techniques group and link them so analysts can see the bigger picture instead of isolated events.
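Duplicates: the same event is reported by multiple sensors or at multiple times. Deduplication reduces noise.
Cause and effect: the link may be obvious when alerts match known tools or patterns, or less straightforward when it has to be derived from an attack's impact or conditions.
Multi-step attack scenarios: statistical patterns are used when there is no pre-defined attack graph, with the goal of inferring the attacker’s multi-step plan.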
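The deduplication case is the easiest to illustrate. The sketch below is a minimal example, not a description of any particular IDS: the `Alert` fields, the matching key (signature, source, destination) and the 5-second window are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Alert:
    # Hypothetical alert record; real IDS alerts carry many more fields.
    time: float       # seconds since an arbitrary start
    sensor: str
    signature: str
    src: str
    dst: str

def deduplicate(alerts, window=5.0):
    """Keep the first alert of each burst of matching alerts.

    Two alerts match when signature, src and dst are equal, regardless of
    which sensor reported them. A matching alert is dropped if it arrives
    within `window` seconds of the previous matching alert (kept or dropped).
    """
    last_seen = {}
    kept = []
    for a in sorted(alerts, key=lambda a: a.time):
        key = (a.signature, a.src, a.dst)
        prev = last_seen.get(key)
        if prev is None or a.time - prev > window:
            kept.append(a)
        last_seen[key] = a.time
    return kept

alerts = [
    Alert(0.0, "net-1", "PortScan", "10.0.0.5", "10.0.0.9"),
    Alert(1.5, "host-7", "PortScan", "10.0.0.5", "10.0.0.9"),  # same event, other sensor
    Alert(2.0, "net-1", "SQLi", "10.0.0.5", "10.0.0.9"),
    Alert(20.0, "net-1", "PortScan", "10.0.0.5", "10.0.0.9"),  # same signature, much later
]
kept = deduplicate(alerts)
```

Here the second `PortScan` alert is folded into the first, while the one at 20 seconds is treated as a new event because it falls outside the window.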
Granger causality is a statistical notion of “cause”: if past values of one time series (e.g., alert type u) help predict another (y), then u is said to Granger-cause y. It captures temporal precedence and correlation, but not necessarily true causation (e.g., a common underlying factor could explain both).
A baseline autoregressive (AR) model predicts y using only its own past values. Its residual (prediction error) measures how well y can be predicted by itself.
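A second model, the ARMA model, adds past values of u as extra predictors. If the ARMA residual is significantly smaller than the residual obtained from y's own past alone, then u’s past carries useful information for predicting y.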
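Written out, the two models might look as follows. This is a sketch in notation introduced here, not taken from the text: the lag order p, the coefficients θ, α, β and the residuals e_0, e_1 are assumed names, and both models are assumed to use the same number of lags.

```latex
\begin{align}
y(k) &= \sum_{i=1}^{p} \theta_i\, y(k-i) + e_0(k) \\
y(k) &= \sum_{i=1}^{p} \alpha_i\, y(k-i) + \sum_{i=1}^{p} \beta_i\, u(k-i) + e_1(k)
\end{align}
```

The first line uses only y's own past. The second adds the past of u, so comparing the sizes of e_0 and e_1 shows how much u contributes.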
The Granger Causality Index (GCI) measures how much adding u improves the prediction of y. An F-test compares the two models: if the GCI exceeds a chosen threshold, we say u Granger-causes y. Ranking alert pairs by GCI helps identify which alerts are statistically related as precursors or consequences.
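As a rough illustration of the comparison, the sketch below fits both models by ordinary least squares and returns the standard F-statistic for nested regressions, ((R0 - R1) / p) / (R1 / (n - 2p)), where R0 and R1 are the residual sums of squares without and with u. This is an assumption about the exact form: the degrees of freedom used in a given GCI definition may differ, no intercept is fitted, and the function names and synthetic data are invented for the example. The small `solve` helper only keeps the block dependency-free; in practice a library routine such as `numpy.linalg.lstsq` would normally replace it.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.
    Raises ZeroDivisionError if A is singular (e.g. an all-zero series)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def rss(rows, target):
    """Residual sum of squares of the least-squares fit target ~ rows."""
    m = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    Xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(m)]
    coef = solve(XtX, Xty)
    return sum((t - sum(c * v for c, v in zip(coef, r))) ** 2
               for r, t in zip(rows, target))

def gci(u, y, p):
    """F-statistic for 'u Granger-causes y' with lag order p."""
    ks = range(p, len(y))
    target = [y[k] for k in ks]
    own = [[y[k - i] for i in range(1, p + 1)] for k in ks]
    both = [row + [u[k - i] for i in range(1, p + 1)] for row, k in zip(own, ks)]
    r0 = rss(own, target)    # y predicted from its own past only
    r1 = rss(both, target)   # y predicted from its own past plus u's past
    dof = len(target) - 2 * p
    return ((r0 - r1) / p) / (r1 / dof)

# Synthetic series: y follows u with a one-step delay plus a little noise.
rng = random.Random(0)
u = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * u[k - 1] + 0.1 * rng.gauss(0, 1) for k in range(1, 200)]
forward = gci(u, y, p=2)    # large: u's past predicts y
backward = gci(y, u, p=2)   # small: y's past says little about u
```

In use, `gci` would be evaluated for every ordered pair of alert time series (for example, per-interval alert counts) and the pairs ranked by the result.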
In a worm scenario, Loki has the highest GCI with DB_NewClient. The worm sends data out, then downloads more from the same site (a feedback loop). Granger causality works well for strong temporal patterns and complements other correlation techniques.
A Bayesian network is a directed acyclic graph (DAG) where nodes are random variables (e.g., alert types, attack stages) and edges encode direct dependencies. Each node has a conditional probability table (CPT) that defines how its probability depends on its parents. This lets us encode expert knowledge about attack prerequisites, update beliefs when new alerts arrive, and infer hidden relationships.
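A toy example may make the belief update concrete. The three-node chain below (Scan → Exploit → RootShell), its node names and all probabilities are invented for illustration. Inference is done by brute-force enumeration of every assignment, which is only practical for very small networks.

```python
from itertools import product

# Hypothetical DAG: each node lists its parents.
parents = {"Scan": [], "Exploit": ["Scan"], "RootShell": ["Exploit"]}

# cpt[node][parent values] = P(node is True | parents); numbers are made up.
cpt = {
    "Scan": {(): 0.30},
    "Exploit": {(True,): 0.40, (False,): 0.02},
    "RootShell": {(True,): 0.70, (False,): 0.01},
}

def joint(assign):
    """Probability of one full True/False assignment to all nodes."""
    p = 1.0
    for node, pars in parents.items():
        p_true = cpt[node][tuple(assign[q] for q in pars)]
        p *= p_true if assign[node] else 1.0 - p_true
    return p

def posterior(query, evidence):
    """P(query is True | evidence), summing over all consistent assignments."""
    nodes = list(parents)
    num = den = 0.0
    for values in product([True, False], repeat=len(nodes)):
        assign = dict(zip(nodes, values))
        if any(assign[k] != v for k, v in evidence.items()):
            continue
        pr = joint(assign)
        den += pr
        if assign[query]:
            num += pr
    return num / den

prior = posterior("Exploit", {})                    # about 0.134
updated = posterior("Exploit", {"RootShell": True})  # rises once the alert is seen
```

Observing the `RootShell` alert raises the belief in `Exploit`, and through it the belief in the earlier `Scan` stage, even though neither was observed directly.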
Typical layers: Info Gathering → System Performance → Service → Confidentiality → Root Privilege → Integrity → Suspicious Connection → User Privilege. Given observed alerts (evidence), we compute P(A1 correlates with A2 | evidence) and use depth-first search to find the path with the highest correlation score; that path represents the likely attack scenario.
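The path search can be sketched as follows. Several details are assumptions made for the example: the alert names and edge values are invented, each edge weight stands for the correlation probability between two alerts, a path's score is taken to be the product of its edge weights, and the graph is assumed to be acyclic.

```python
def best_path(graph, node, score=1.0, path=None):
    """Depth-first search for the highest-scoring path from `node` to any
    node without outgoing edges. Returns (score, path). Assumes a DAG."""
    path = (path or []) + [node]
    successors = graph.get(node, {})
    if not successors:
        return score, path
    return max(
        (best_path(graph, nxt, score * w, path) for nxt, w in successors.items()),
        key=lambda result: result[0],
    )

# Hypothetical correlation graph: graph[a][b] = P(a correlates with b | evidence).
graph = {
    "A1_scan": {"A2_service_probe": 0.9, "A3_dos": 0.3},
    "A2_service_probe": {"A4_exploit": 0.8},
    "A3_dos": {"A4_exploit": 0.9},
    "A4_exploit": {"A5_root_shell": 0.7},
}
score, scenario = best_path(graph, "A1_scan")
```

With these numbers the search prefers the route through `A2_service_probe` (0.9 × 0.8 × 0.7) over the one through `A3_dos` (0.3 × 0.9 × 0.7). Other scoring rules, such as the minimum edge weight along the path, would fit the same search with a one-line change.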