Attackers will try to defeat ML-based security mechanisms: pollute training data so that learning produces a wrong model, or study the features and adapt attacks to evade detection. The bar is higher than for other ML applications because security implies an adversarial environment.
ML for security has a long history (e.g., Wenke Lee's PhD 1994–99, MADAM ID - cited ~1500 times). Early challenges: hard to get labeled data, no standard datasets, manual feature construction. Security experts were skeptical (false positives, “why not expert rules?”). Recent work: more data (malware samples, org data logging, IRB approval), but standard datasets still rare.
Attacker generates examples to probe the ML system, infer the decision boundary, then craft attacks to evade.
Attacker injects malicious examples into training data; ML produces an ineffective model.
Multi-step bank-fraud attack that evades defenses and uses social engineering.
Measures and models the n-gram frequency distribution in payloads of each network service. For n=1, models byte/character frequency.
Each service (web, mail, etc.) has unique characteristics. User normally gets certain web/email content. Attacks (e.g., malware in attachment) are very different.
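For each byte value x_i: compute relative frequency f(x_i) and standard deviation s(x_i) in normal traffic. Profile = {f(x_i), s(x_i)} for all 256 byte values. Anomaly score = Σ_i |observed(x_i) − f(x_i)| / (s(x_i) + α), where α is a small smoothing constant that avoids division by zero.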
Separate model per packet length. Unusual length → anomalous. Advantages: simple, efficient; detects zero-day and polymorphic attacks.
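The 1-gram profile and anomaly score described above can be sketched in a few lines of Python. This is a minimal illustration, not PAYL's actual implementation: the function names and the default `alpha = 0.001` are assumptions, and the per-packet-length models are left out (one could keep one profile per length in a dictionary).

```python
from statistics import mean, pstdev

def byte_freq(payload: bytes) -> list[float]:
    """Relative frequency of each of the 256 byte values in one payload."""
    counts = [0] * 256
    for b in payload:
        counts[b] += 1
    n = max(len(payload), 1)  # avoid division by zero on empty payloads
    return [c / n for c in counts]

def train_profile(normal_payloads: list[bytes]):
    """Profile = per-byte mean frequency f and standard deviation s."""
    freqs = [byte_freq(p) for p in normal_payloads]
    f = [mean(col) for col in zip(*freqs)]
    s = [pstdev(col) for col in zip(*freqs)]
    return f, s

def anomaly_score(payload: bytes, f, s, alpha: float = 0.001) -> float:
    """Sum over byte values of |observed - f| / (s + alpha)."""
    obs = byte_freq(payload)
    return sum(abs(o - fi) / (si + alpha) for o, fi, si in zip(obs, f, s))

# Toy usage: three "normal" requests, then one similar and one binary payload.
normal = [
    b"GET /index.html HTTP/1.1",
    b"GET /about.html HTTP/1.1",
    b"GET /img/logo.png HTTP/1.1",
]
f, s = train_profile(normal)
score_similar = anomaly_score(b"GET /news.html HTTP/1.1", f, s)
score_binary = anomaly_score(bytes(range(128, 152)), f, s)
print(score_similar, score_binary)
```

The binary payload uses byte values never seen in training, where s is 0, so each of them is divided by α alone and dominates the score.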
A polymorphic attack changes its appearance with every instance and has no predictable signature. It is false that each instance has a “different but normal” appearance: polymorphic code does not resemble normal traffic.
Polymorphic attacks have encrypted/transformed code → different byte frequency from legit traffic → detectable. Blending makes each instance match normal byte frequency so it evades anomaly detection.
Each byte in attack body mapped to byte from legitimate traffic. Greedy: most frequent attack byte → most frequent normal byte; second → second; etc. Add padding so full packet matches profile. Decryptor removes padding and reverses substitution.
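A rough sketch of the greedy substitution and padding steps, under simplifying assumptions: the mapping is strictly one-to-one (so the attack body may not use more distinct byte values than the normal sample), and the padding is simply appended, so the decoder must know the original body length. All function names are hypothetical.

```python
from collections import Counter

def rank_by_frequency(data: bytes) -> list[int]:
    """Byte values sorted from most to least frequent (ties broken by value)."""
    counts = Counter(data)
    return sorted(counts, key=lambda b: (-counts[b], b))

def build_substitution(attack: bytes, normal: bytes) -> dict[int, int]:
    """Greedy table: k-th most frequent attack byte -> k-th most frequent normal byte."""
    attack_rank = rank_by_frequency(attack)
    normal_rank = rank_by_frequency(normal)
    if len(attack_rank) > len(normal_rank):
        raise ValueError("one-to-one sketch needs enough distinct normal bytes")
    return dict(zip(attack_rank, normal_rank))

def blend(attack: bytes, table: dict[int, int]) -> bytes:
    return bytes(table[b] for b in attack)

def unblend(blended: bytes, table: dict[int, int]) -> bytes:
    """What the decryptor does after stripping padding: reverse the substitution."""
    inverse = {v: k for k, v in table.items()}
    return bytes(inverse[b] for b in blended)

def padding_for(blended: bytes, normal: bytes, total_len: int) -> bytes:
    """Extra bytes that pull the packet's byte counts toward the normal profile."""
    have, want = Counter(blended), Counter(normal)
    pad = bytearray()
    for b, c in want.most_common():
        target = round(total_len * c / len(normal))
        pad.extend([b] * max(target - have[b], 0))
    return bytes(pad[: max(total_len - len(blended), 0)])

# Toy usage with made-up byte strings.
attack = b"\x90\x90\x90\x90\xeb\xeb\x31\xc0"
normal = b"GET /index.html HTTP/1.1 Host: example.com"
table = build_substitution(attack, normal)
body = blend(attack, table)
packet = body + padding_for(body, normal, total_len=32)
recovered = unblend(packet[: len(attack)], table)
```

The size limit on `padding_for` reflects the requirement that blending stay economical: the packet never grows past `total_len`.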
Process should not result in an abnormally large attack size; blending must be economical in time and space. It is not true that the attacker must collect a lot of data: the artificial profile can converge with fewer than 100 packets.
False. Attacker only approximates real profile. Packet just needs to match within variance. Simpler IDSs (e.g., PAYL) are more frequently evaded; sophisticated models are harder.
Goals of a successful poisoning attack: undetected; subtle (no one knows data poisoned for a while); permanent (damage hard to repair).
Neighbors tired of diverted traffic falsely reported congestion. Hope: app learns wrong data and stops routing through neighborhood. Did not work - far more drivers reported streets not congested; signal outweighed noise. In other cases, attackers can control noise level.
Automatic signature generators (e.g., Polygraph) extract invariants from polymorphic worm flows. If attacker injects fake anomalous flows into training - flows that look anomalous but need not exploit any vulnerability - flow classifiers (honeynet, port-scan detector, anomaly IDS) cannot distinguish them. Result: useless signatures (too many FP/FN).
Polygraph generates worm signatures for polymorphic worms. Authors assumed it handles noise. Not true when attacker deliberately injects well-crafted fake anomalous flows.
Set of tokens common to all suspicious flows (protocol framework + true invariants); a flow matches if it contains every token, in any order
Sequence of tokens; regex-like pattern
Tokens with scores lj = log(Psf/Pif); sum scores; threshold for detection
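The three signature types above can be illustrated with simple matchers. This sketch assumes they correspond to Polygraph's conjunction, token-subsequence, and Bayes signatures, and it reads Psf and Pif as a token's probability of appearing in suspicious and innocuous flows respectively (an interpretation, not stated in the notes). The conjunction check ignores token order, as in Polygraph's definition. Tokens, probabilities, and the threshold are invented for the example.

```python
import math

def conjunction_match(flow: bytes, tokens: list[bytes]) -> bool:
    """Match if every token occurs somewhere in the flow, in any order."""
    return all(t in flow for t in tokens)

def subsequence_match(flow: bytes, tokens: list[bytes]) -> bool:
    """Match if the tokens occur in the given order without overlapping."""
    pos = 0
    for t in tokens:
        idx = flow.find(t, pos)
        if idx < 0:
            return False
        pos = idx + len(t)
    return True

def token_score(p_suspicious: float, p_innocuous: float) -> float:
    """Per-token score l_j = log(Psf / Pif)."""
    return math.log(p_suspicious / p_innocuous)

def bayes_match(flow: bytes, scores: dict[bytes, float], threshold: float) -> bool:
    """Sum the scores of the tokens present in the flow and compare to a threshold."""
    total = sum(score for tok, score in scores.items() if tok in flow)
    return total >= threshold

# Toy usage (hypothetical flow and tokens).
flow = b"GET /cgi-bin/x HTTP/1.1\r\nHost: a\r\n\xff\xbf"
scores = {
    b"\xff\xbf": token_score(0.95, 0.001),  # rare in innocuous traffic
    b"Host:": token_score(0.99, 0.98),      # protocol framework, near-zero score
}
print(bayes_match(flow, scores, threshold=3.0))
```

A token that is common in innocuous traffic gets a score near zero, which is why the score-multiplier attack has to inflate such scores through the injected fake flows.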
Conjunction/Token-subsequence: Permute worm body; inject fake invariants (P(fake|innocuous) < P(true|innocuous)) so signatures based on fake invariants are selected → no true invariants → many FN. Craft so clustering puts one fake flow per worm cluster.
Bayes: Inject score multiplier strings - normal HTTP substrings (e.g., "Pragma: no-cache") into all fake flows. Innocuous flows containing that string then match many tokens → L ≫ threshold → FP.
Noise injection has high chance to mislead syntactic worm signature generators. Mitigation: need a precise flow classifier that filters noise - open problem.
Against polymorphic blending: use more complex models. Simpler approaches are more frequently used (efficient, scalable) but more easily evaded.
v at random for
2-gram variant; combine several such systems
Need a precise flow classifier to filter noise - open problem.
Train on adversarial examples (e.g., PGD). Improves robustness; trade-off with clean accuracy.
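As a small illustration of adversarial training with PGD, the sketch below trains a logistic-regression classifier on worst-case perturbed inputs. The choice of model, the L-infinity budget `eps = 0.3`, the step size, and the toy data are all assumptions made to keep the example dependency-free; real systems would use a deep-learning framework, and PGD is usually run with a random start, which is omitted here.

```python
import math

def sign(v: float) -> int:
    return (v > 0) - (v < 0)

def predict(w, b, x) -> float:
    """P(y = 1 | x) for a logistic model; z is clamped to avoid overflow."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def pgd(w, b, x, y, eps=0.3, step=0.1, iters=10):
    """L-infinity PGD: signed-gradient ascent on the loss, projected into the eps-ball."""
    adv = list(x)
    for _ in range(iters):
        p = predict(w, b, adv)
        grad = [(p - y) * wi for wi in w]  # d(log loss)/d(input)
        adv = [a + step * sign(g) for a, g in zip(adv, grad)]
        adv = [min(xi + eps, max(xi - eps, a)) for a, xi in zip(adv, x)]
    return adv

def adversarial_train(data, dim, epochs=200, lr=0.1, eps=0.3):
    """SGD where each update uses the PGD-perturbed version of the example."""
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in data:
            x_adv = pgd(w, b, x, y, eps=eps)
            p = predict(w, b, x_adv)
            w = [wi - lr * (p - y) * xi for wi, xi in zip(w, x_adv)]
            b -= lr * (p - y)
    return w, b

# Toy, linearly separable data: label 0 near (-1, -1), label 1 near (1, 1).
DATA = [([-1.0, -1.0], 0), ([-1.2, -0.8], 0), ([1.0, 1.0], 1), ([0.8, 1.2], 1)]
w, b = adversarial_train(DATA, dim=2)
```

On this toy data the classes are far enough apart that robustness costs nothing; the clean-accuracy trade-off appears when the eps-balls of different classes start to overlap.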
Formal guarantees (e.g., randomized smoothing) - classify correctly for all perturbations in ε-ball with high probability.
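A sketch of the prediction side of randomized smoothing: classify many Gaussian-noised copies of the input and take the majority vote. The radius function uses the standard Gaussian-smoothing result R = σ · Φ⁻¹(p) for the L2 norm, where p must be a lower confidence bound on the top-class probability. Passing the raw vote fraction, as the usage below does for brevity, does not give a valid certificate. The names, `sigma`, and sample count are illustrative assumptions.

```python
import random
from collections import Counter
from statistics import NormalDist

def smoothed_predict(base_classifier, x, sigma=0.5, n=1000, seed=0):
    """Majority vote of base_classifier over n Gaussian-noised copies of x."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[base_classifier(noisy)] += 1
    label, count = votes.most_common(1)[0]
    return label, count / n  # predicted label and its empirical vote share

def l2_radius(p_lower: float, sigma: float) -> float:
    """R = sigma * Phi^{-1}(p_lower); zero when the top class has no clear majority."""
    if p_lower <= 0.5:
        return 0.0
    p = min(p_lower, 1.0 - 1e-9)  # inv_cdf is undefined at exactly 1.0
    return sigma * NormalDist().inv_cdf(p)

# Toy usage with a hypothetical linear base classifier.
base = lambda x: int(x[0] + x[1] > 0)
label, share = smoothed_predict(base, [2.0, 2.0], sigma=0.5, n=500)
print(label, share, l2_radius(share, sigma=0.5))
```

Larger `sigma` can certify larger radii but makes the noisy inputs harder for the base classifier, which is one form of the accuracy-robustness trade-off.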
Outlier removal, clustering, anomaly detection on training data. Can remove some poisons; adaptive attacks may evade.
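One simple form of outlier removal on training data is to drop points that lie unusually far from the center of their own class. The sketch below uses the coordinate-wise median as the center and a cutoff of `factor` times the median distance; both choices, and the names, are assumptions for illustration. As noted above, a poison crafted to sit inside the cutoff would survive this filter.

```python
import math
from statistics import median

def sanitize(data, factor: float = 3.0):
    """Keep (x, y) pairs whose distance to their class center is within the cutoff."""
    by_label = {}
    for x, y in data:
        by_label.setdefault(y, []).append(x)
    kept = []
    for y, points in by_label.items():
        # Coordinate-wise median is less affected by a few poisons than the mean.
        center = [median(col) for col in zip(*points)]
        dists = [math.dist(p, center) for p in points]
        cutoff = factor * median(dists)
        kept.extend((p, y) for p, d in zip(points, dists) if d <= cutoff)
    return kept

# Toy usage: one mislabeled far-away point injected into class 0.
poison = ((5.0, 5.0), 0)
train = [
    ((0.0, 0.0), 0), ((0.1, 0.0), 0), ((0.0, 0.1), 0),
    ((-0.1, 0.0), 0), ((0.0, -0.1), 0), poison,
    ((3.0, 3.0), 1), ((3.1, 3.0), 1), ((3.0, 3.1), 1),
]
clean = sanitize(train)
print(len(train), len(clean))
```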
Generate poisons during training, inject into batches. Desensitizes model; withstands adaptive attacks; can outperform DP-SGD.
Defenses often trade accuracy for robustness. Adversaries adapt. Defense-in-depth: combine multiple techniques, monitor for drift, retrain periodically.