Data Poisoning & Model Evasion

Topics: Poisoning · Evasion · Adversarial Examples · Defenses
01 //

ML for Security & Adversarial ML

Attackers will try to defeat ML-based security mechanisms in two ways: by polluting the training data so that a wrong model is learned, or by studying the features a model uses and adapting their attacks to evade it. The bar is higher than for other ML applications because security is inherently an adversarial environment.

History

ML for security has a long history (e.g., Wenke Lee's PhD, 1994–99; MADAM ID, cited ~1500 times). Early challenges: labeled data was hard to get, there were no standard datasets, and features had to be constructed manually. Security experts were skeptical (false positives; “why not expert rules?”). Recent work has more data to draw on (malware samples, organizational data logging, IRB approval), but standard datasets are still rare.

Adversarial ML (Early Research)

Two Attack Types

Exploratory (Evasion)

Probe & Evade

The attacker generates examples to probe the ML system, infers the decision boundary, then crafts attacks that evade it.

Causative (Poisoning)

Inject Bad Data

The attacker injects malicious examples into the training data, so the learner produces an ineffective model.

Evasion Tactics (O, C, E, T)
  • O - Obfuscating internal data: tricks that keep an analysis system from detecting the malicious code
  • C - Confusing automated tools: e.g., avoiding detection by signature-based AV
  • E - Environmental awareness: detecting the runtime environment
  • T - Timing-based: running only at certain times or after certain user actions
02 //

Real-Life Evasion: DyreWolf

Multi-step bank-fraud attack that evades defenses and uses social engineering.

  1. Spear phishing - employee receives email with malware attachment
  2. Install - employee opens attachment; malware installs
  3. C2 - malware connects to attack server, downloads new malware
  4. Bank response modification - when victim logs into bank, malware changes response so victim is told to call illegitimate number
  5. Social engineering - victim calls; attacker extracts personal info
  6. Money transfer - attacker bypasses fraud detection, convinces bank to transfer funds
  7. DoS - attacker launches noisy DoS to distract investigation
03 //

PAYL: Payload-Based Anomaly Detection

PAYL measures and models the n-gram frequency distribution in the payloads of each network service. For n = 1, this is the byte (character) frequency distribution.

Intuition

Each service (web, mail, etc.) has its own characteristic payloads, and a user normally receives a certain kind of web or email content. Attack payloads (e.g., malware in an attachment) look very different.

Model

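For each byte value x_i: compute its relative frequency f(x_i) and standard deviation s(x_i) in normal traffic. Profile = {f(x_i), s(x_i)} for all byte values. Anomaly score of a payload = Σ_i |observed(x_i) − f(x_i)| / (s(x_i) + α), where α is a small smoothing constant that avoids division by zero.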

Separate model per packet length. Unusual length → anomalous. Advantages: simple, efficient; detects zero-day and polymorphic attacks.
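The 1-gram model and score above can be sketched in a few lines of Python. This is a minimal illustration, not PAYL itself: it trains a single profile (no per-service or per-length models), and the names `byte_freqs`, `train_profile`, `anomaly_score`, the value `alpha=0.001`, and the sample payloads are all assumptions made for the example.

```python
from collections import Counter
from statistics import mean, pstdev

def byte_freqs(payload: bytes) -> list:
    """Relative frequency of each of the 256 byte values in one payload."""
    counts = Counter(payload)
    n = max(len(payload), 1)
    return [counts.get(b, 0) / n for b in range(256)]

def train_profile(payloads):
    """Per-byte mean frequency f and standard deviation s over normal payloads."""
    per_payload = [byte_freqs(p) for p in payloads]
    columns = list(zip(*per_payload))          # one column per byte value
    f = [mean(col) for col in columns]
    s = [pstdev(col) for col in columns]
    return f, s

def anomaly_score(payload: bytes, f, s, alpha=0.001) -> float:
    """Sum over byte values of |observed - f| / (s + alpha)."""
    observed = byte_freqs(payload)
    return sum(abs(o - fi) / (si + alpha) for o, fi, si in zip(observed, f, s))

# Hypothetical "normal" HTTP requests used as training data.
normal = [
    b"GET /index.html HTTP/1.1",
    b"GET /about.html HTTP/1.1",
    b"GET /img/logo.png HTTP/1.1",
]
f, s = train_profile(normal)

normal_score = anomaly_score(b"GET /index.htm HTTP/1.1", f, s)
# Placeholder binary payload dominated by byte values never seen in training.
attack_score = anomaly_score(bytes([0x90] * 20 + [0xCC, 0xEB, 0xFE]), f, s)
```

The binary payload scores far higher than the benign request because its dominant byte values have zero mean and zero deviation in the profile, so each contributes roughly its observed frequency divided by α.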

Polymorphism

A polymorphic attack changes its appearance with every instance, so it has no predictable signature. It is false, however, that each instance has a “different but normal” appearance: polymorphic code does not resemble normal traffic.

04 //

Evading PAYL: Polymorphic Blending

Polymorphic attacks have encrypted/transformed code → different byte frequency from legit traffic → detectable. Blending makes each instance match normal byte frequency so it evades anomaly detection.

Polymorphic Components

Blending Steps

  1. Adversary compromises host; observes normal traffic A → B
  2. Uses IDS algorithm (e.g., PAYL) to generate artificial normal profile
  3. Creates attack instance matching profile (shellcode encryption + padding)
  4. Launches attack - IDS cannot detect

Substitution Cipher

Each byte in attack body mapped to byte from legitimate traffic. Greedy: most frequent attack byte → most frequent normal byte; second → second; etc. Add padding so full packet matches profile. Decryptor removes padding and reverses substitution.
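A rough sketch of the greedy rank-by-rank substitution and the padding step, to make the mechanics concrete. The function names, the sample traffic, the placeholder "attack" bytes, and the padding rule (repeatedly append the byte that is most under-represented relative to the normal distribution) are assumptions for illustration; the sketch also assumes the decoder knows the length of the attack body so it can strip the padding.

```python
from collections import Counter

def build_table(attack: bytes, normal: bytes) -> dict:
    """Map attack bytes to normal bytes by frequency rank (one-to-one)."""
    attack_ranked = [b for b, _ in Counter(attack).most_common()]
    normal_ranked = [b for b, _ in Counter(normal).most_common()]
    if len(attack_ranked) > len(normal_ranked):
        raise ValueError("need at least as many distinct normal bytes as attack bytes")
    return dict(zip(attack_ranked, normal_ranked))

def encode(attack: bytes, table: dict) -> bytes:
    return bytes(table[b] for b in attack)

def decode(body: bytes, table: dict) -> bytes:
    inverse = {v: k for k, v in table.items()}
    return bytes(inverse[b] for b in body)

def pad_to_profile(body: bytes, normal: bytes, total_len: int) -> bytes:
    """Append padding, always picking the byte with the largest deficit
    relative to the normal frequency distribution."""
    target = {b: c / len(normal) for b, c in Counter(normal).items()}
    counts = Counter(body)
    out = bytearray(body)
    while len(out) < total_len:
        best = max(target, key=lambda b: target[b] * total_len - counts[b])
        out.append(best)
        counts[best] += 1
    return bytes(out)

# Hypothetical observed traffic and placeholder attack bytes.
normal = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n"
attack = b"\x90\x90\x90\xaa\xbb\xcc\xdd"

table = build_table(attack, normal)
blended = pad_to_profile(encode(attack, table), normal, total_len=32)

# Decoder side: drop the padding, then reverse the substitution.
recovered = decode(blended[:len(attack)], table)
```

Every byte of `blended` is drawn from the normal traffic's alphabet, which is the property a byte-frequency detector would need to see; how closely the frequencies match depends on the padding budget.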

Blending Constraints

The blending process should not produce an abnormally large attack, and it must be economical in time and space. It is not true that the attacker must collect a lot of data: the artificial profile can converge with fewer than 100 packets.

Perfect Match?

The blended attack does not need to match the profile perfectly. The attacker only approximates the real profile, and a packet just needs to match within the profile's variance. Simpler IDSs (e.g., PAYL) are more easily evaded; more sophisticated models are harder to evade.

05 //

Data Poisoning

Goals of a successful poisoning attack: undetected; subtle (no one knows data poisoned for a while); permanent (damage hard to repair).

LA Residents & Waze

Residents tired of traffic being diverted through their neighborhood falsely reported congestion, hoping the app would learn from the wrong data and stop routing drivers through it. It did not work: far more drivers reported that the streets were not congested, so the signal outweighed the noise. In other cases, attackers can control the noise level.

Noise Injection on Worm Signatures

Automatic signature generators (e.g., Polygraph) extract invariants from polymorphic worm flows. Suppose the attacker injects fake anomalous flows into the training data: flows that look anomalous but need not exploit any vulnerability. The flow classifiers that feed the generator (honeynet, port-scan detector, anomaly IDS) cannot distinguish them from real worm flows. Result: useless signatures (too many FP/FN).

Flow Classifiers
  • Simulated Honeynet - traffic to non-existent hosts → suspicious pool
  • Double Honeynet - real honeypots infected → redirect to simulated; worm + fake flows both in pool
  • Port-scanning detector - scanning traffic → suspicious pool
  • Anomaly IDS (e.g., PAYL) - anomalous flows → suspicious pool
06 //

Case Study: Polygraph

Polygraph generates worm signatures for polymorphic worms. Authors assumed it handles noise. Not true when attacker deliberately injects well-crafted fake anomalous flows.

Signature Types

Conjunction

Unordered set of tokens common to all suspicious flows (protocol framework + true invariants); matches when every token is present, in any order

Token-Subsequence

Ordered sequence of tokens; regex-like pattern
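A small sketch of how the two matching rules differ, assuming the Polygraph definitions in which a conjunction signature ignores token order while a token-subsequence signature enforces it. The tokens and payloads are made up for the example.

```python
import re

def matches_conjunction(payload: bytes, tokens) -> bool:
    """True if every token occurs somewhere in the payload, in any order."""
    return all(t in payload for t in tokens)

def matches_subsequence(payload: bytes, tokens) -> bool:
    """True if the tokens occur in the given order, with arbitrary gaps."""
    pattern = b".*".join(re.escape(t) for t in tokens)
    return re.search(pattern, payload, re.DOTALL) is not None

# Hypothetical signature: two protocol-framework tokens plus one invariant.
tokens = [b"GET ", b"HTTP/1.1", b"\xff\xbf"]

in_order = b"GET /x HTTP/1.1\r\nHost: \xff\xbf"
reordered = b"\xff\xbf GET /x HTTP/1.1"

conj_results = (matches_conjunction(in_order, tokens),
                matches_conjunction(reordered, tokens))
subseq_results = (matches_subsequence(in_order, tokens),
                  matches_subsequence(reordered, tokens))
```

Both payloads satisfy the conjunction, but only the first satisfies the token-subsequence signature, since the invariant appears before the framework tokens in the second.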

Bayes

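Each token tj gets a score lj = log(Psf/Pif), where Psf and Pif are the probabilities that the token appears in a suspicious flow and in an innocuous flow. The scores of the tokens found in a flow are summed, and the flow is flagged when the sum exceeds a threshold.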
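The scoring rule can be illustrated as follows. This is a sketch under assumptions: token probabilities are estimated as the fraction of flows containing the token, a small `eps` keeps the logarithm finite, and the token list, flow pools, and threshold of 5.0 are invented for the example rather than taken from Polygraph.

```python
import math

def token_scores(tokens, suspicious, innocuous, eps=1e-6):
    """Score each token by log(P(token | suspicious) / P(token | innocuous))."""
    scores = {}
    for t in tokens:
        p_sf = sum(t in flow for flow in suspicious) / len(suspicious)
        p_if = sum(t in flow for flow in innocuous) / len(innocuous)
        scores[t] = math.log((p_sf + eps) / (p_if + eps))
    return scores

def classify(payload: bytes, scores: dict, threshold: float):
    """Sum the scores of the tokens present; flag if the sum exceeds threshold."""
    total = sum(s for t, s in scores.items() if t in payload)
    return total > threshold, total

# Hypothetical training pools.
suspicious = [b"GET /a \xff\xbf RET1", b"GET /b \xff\xbf RET1", b"GET /c \xff\xbf RET1"]
innocuous = [b"GET /index.html", b"GET /news", b"POST /login", b"GET /img.png"]
tokens = [b"GET ", b"\xff\xbf", b"RET1"]

scores = token_scores(tokens, suspicious, innocuous)
worm_flagged, worm_total = classify(b"GET /z \xff\xbf RET1", scores, threshold=5.0)
benign_flagged, benign_total = classify(b"GET /index.html", scores, threshold=5.0)
```

The common token `GET ` earns a score near zero because it is almost as likely in innocuous traffic, while the tokens seen only in suspicious flows dominate the sum.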

Crafting the Noise

Conjunction/Token-subsequence: Permute the worm body and inject fake invariants into the fake flows, chosen so that P(fake|innocuous) < P(true|innocuous). Signatures based on the fake invariants are then the ones selected, so the result contains no true invariants → many FN. Craft the noise so that clustering puts one fake flow per worm cluster.

Bayes: Inject score multiplier strings - substrings of normal HTTP traffic (e.g., "Pragma: no-cache") - into all fake flows. An innocuous flow containing such a string then matches many tokens, so its total score L ≫ threshold → FP.
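A toy illustration of the score-multiplier effect on a Bayes-style signature. Everything specific here is an assumption: the way the multiplier string is split into three tokens, the pool sizes, and the fixed threshold of 5.0 (a real generator would tune its threshold on the innocuous pool). The point is only that several individually weak tokens from one benign string can add up past the threshold.

```python
import math

def bayes_scores(tokens, suspicious, innocuous, eps=1e-6):
    """Token score = log(P(token | suspicious) / P(token | innocuous))."""
    def frac(token, pool):
        return sum(token in flow for flow in pool) / len(pool)
    return {t: math.log((frac(t, suspicious) + eps) / (frac(t, innocuous) + eps))
            for t in tokens}

def total_score(payload: bytes, scores: dict) -> float:
    return sum(s for t, s in scores.items() if t in payload)

MULTIPLIER = b"Pragma: no-cache"
# Hypothetical tokenization of the multiplier string into several tokens.
multiplier_tokens = [b"Pragma", b": no", b"-cache"]
worm_tokens = [b"\xff\xbf"]

# Innocuous pool: the multiplier string is legitimate but uncommon (1 in 20).
innocuous = [b"GET /a HTTP/1.1"] * 19 + [b"GET /b HTTP/1.1\r\n" + MULTIPLIER]
worm_flows = [b"GET /x \xff\xbf"] * 3
fake_flows = [b"GET /y " + MULTIPLIER] * 3   # attacker-injected noise

clean = bayes_scores(worm_tokens, worm_flows, innocuous)
poisoned = bayes_scores(worm_tokens + multiplier_tokens,
                        worm_flows + fake_flows, innocuous)

THRESHOLD = 5.0  # assumed fixed for the example
benign = b"GET /c HTTP/1.1\r\n" + MULTIPLIER
benign_clean = total_score(benign, clean)        # no worm token present
benign_poisoned = total_score(benign, poisoned)  # three multiplier tokens add up
```

With the clean signature the benign request scores zero; with the poisoned one each multiplier token contributes about log(0.5 / 0.05), and the three together exceed the threshold, producing a false positive.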

Conclusion

Noise injection has a high chance of misleading syntactic worm signature generators. Mitigation requires a precise flow classifier that filters out the noise, which is an open problem.

Quiz
  • If we completely control the training data and can ascertain its integrity → no need to worry about poisoning. True
  • If the training data comes from an open environment (e.g., the Web) → there is always potential for poisoning; it cannot be eliminated. True
07 //

Defenses

Against polymorphic blending: use more complex models. Simpler approaches are more frequently used (efficient, scalable) but more easily evaded.

PAYL Countermeasures

  • Complex models - syntactic/semantic info, not just byte frequency
  • Multiple simple IDSs - each models a different feature (e.g., pairs of bytes that are v characters apart)
  • Randomness - choose the gap v at random for the 2-gram variant; combine several such systems
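The last two countermeasures can be sketched together: several simple detectors, each modeling byte pairs at a different, randomly chosen gap v. The scoring rule used here (fraction of pairs never seen in training) and the max-combination are simplifications assumed for the example, not the method of any particular system; all names are hypothetical.

```python
import random
from collections import Counter

def gapped_pairs(payload: bytes, v: int) -> Counter:
    """Count pairs (payload[i], payload[i + v]) of bytes v positions apart."""
    return Counter(zip(payload, payload[v:]))

class GappedPairDetector:
    def __init__(self, v: int):
        self.v = v
        self.known = set()

    def train(self, normal_payloads):
        for p in normal_payloads:
            self.known.update(gapped_pairs(p, self.v))

    def score(self, payload: bytes) -> float:
        """Fraction of the payload's pairs that never appeared in training."""
        pairs = gapped_pairs(payload, self.v)
        total = sum(pairs.values())
        if total == 0:
            return 0.0
        unseen = sum(c for pair, c in pairs.items() if pair not in self.known)
        return unseen / total

def build_ensemble(normal_payloads, k=3, max_gap=8, rng=random):
    """Pick k distinct gaps at random; the attacker does not know which."""
    gaps = rng.sample(range(1, max_gap + 1), k)
    detectors = [GappedPairDetector(v) for v in gaps]
    for d in detectors:
        d.train(normal_payloads)
    return detectors

def ensemble_score(detectors, payload: bytes) -> float:
    return max(d.score(payload) for d in detectors)

normal = [b"GET /index.html HTTP/1.1", b"GET /about.html HTTP/1.1"]
detectors = build_ensemble(normal, rng=random.Random(0))

seen_score = ensemble_score(detectors, normal[0])
odd_score = ensemble_score(detectors, bytes([0x90]) * 24)
```

An attacker who blends against 1-gram frequencies, or against one fixed gap, still has to match the pair statistics at gaps chosen secretly by the defender.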

Against Poisoning

Need a precise flow classifier to filter noise - open problem.

General Defenses

Adversarial Training

Train on adversarial examples (e.g., generated with PGD, projected gradient descent). Improves robustness, at some cost in clean accuracy.
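A toy sketch of PGD-based adversarial training on a two-feature logistic regression model, in pure Python. Practical adversarial training targets deep networks with automatic differentiation; the model, data, step sizes, and function names here are assumptions chosen to keep the loop readable.

```python
import math

def sign(g):
    return (g > 0) - (g < 0)

def predict(w, b, x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    p = min(max(predict(w, b, x), 1e-12), 1 - 1e-12)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def pgd(w, b, x, y, eps, step, iters):
    """L-infinity PGD: step along the sign of the loss gradient w.r.t. x,
    then project back into the eps-box around the original x."""
    adv = list(x)
    for _ in range(iters):
        err = predict(w, b, adv) - y              # d(loss)/dz for logistic loss
        grad = [err * wi for wi in w]             # d(loss)/dx
        adv = [a + step * sign(g) for a, g in zip(adv, grad)]
        adv = [min(max(a, xi - eps), xi + eps) for a, xi in zip(adv, x)]
    return adv

def train(data, eps=0.0, epochs=200, lr=0.1):
    """SGD on logistic loss; if eps > 0, train on PGD examples instead."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            if eps > 0:
                x = pgd(w, b, x, y, eps, step=eps / 4, iters=5)
            err = predict(w, b, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

data = [([2.0, 2.0], 1), ([1.5, 2.5], 1), ([-2.0, -2.0], 0), ([-1.5, -2.5], 0)]
w_clean, b_clean = train(data)
w_robust, b_robust = train(data, eps=0.5)

x0, y0 = data[0]
x_adv = pgd(w_clean, b_clean, x0, y0, eps=0.5, step=0.125, iters=5)
```

The adversarial example stays within 0.5 of the original in every coordinate while increasing the loss; the robust model sees such worst-case points at every update instead of the clean ones.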

Certified Robustness

Formal guarantees (e.g., randomized smoothing) - the prediction is guaranteed, with high probability, to stay the same for all perturbations within an ε-ball around the input.
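A minimal sketch of the prediction side of randomized smoothing: classify many Gaussian-noised copies of the input and take the majority vote. The radius formula σ·Φ⁻¹(p) is the standard ℓ2 form for the binary case, but plugging in the raw vote share, as done below, is only illustrative; a real certificate needs a statistical lower confidence bound on p. The base classifier and all names are assumptions.

```python
import random
from collections import Counter
from statistics import NormalDist

def smoothed_predict(classify, x, sigma, n=1000, rng=random):
    """Majority vote of the base classifier over n Gaussian-noised copies of x."""
    votes = Counter(
        classify([xi + rng.gauss(0.0, sigma) for xi in x]) for _ in range(n)
    )
    label, count = votes.most_common(1)[0]
    return label, count / n

def l2_radius(p_lower, sigma):
    """sigma * Phi^-1(p_lower); meaningful only if p_lower is a valid
    lower bound on the top-class probability and exceeds 0.5."""
    if p_lower <= 0.5:
        return 0.0
    return sigma * NormalDist().inv_cdf(min(p_lower, 1.0 - 1e-9))

# Hypothetical base classifier: a fixed linear rule on two features.
def base(x):
    return int(x[0] + x[1] > 0)

label, share = smoothed_predict(base, [2.0, 2.0], sigma=0.5, rng=random.Random(0))
radius = l2_radius(share, sigma=0.5)  # illustrative only, see note above
```

Larger σ allows a larger certified radius for the same vote share, but it also blurs the input more, which is where the accuracy trade-off comes from.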

Against Poisoning

Data Sanitization

Outlier removal, clustering, anomaly detection on training data. Can remove some poisons; adaptive attacks may evade.
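One simple form of outlier removal, sketched under assumptions: for each class, measure every point's distance to the coordinate-wise median of that class and drop points beyond median distance + k × MAD. The rule, the constant `k=3.0`, and the data are illustrative; an adaptive attacker can place poisons inside the accepted region.

```python
import math
from statistics import median

def sanitize(points, labels, k=3.0, floor=1e-9):
    """Keep (point, label) pairs whose distance to their class's
    coordinate-wise median is at most median distance + k * MAD."""
    kept = []
    for cls in sorted(set(labels)):
        members = [p for p, y in zip(points, labels) if y == cls]
        center = tuple(median(col) for col in zip(*members))
        dists = [math.dist(p, center) for p in members]
        med = median(dists)
        mad = max(median(abs(d - med) for d in dists), floor)
        kept += [(p, cls) for p, d in zip(members, dists) if d <= med + k * mad]
    return kept

# Hypothetical training set: two tight clusters plus one mislabeled poison.
poison = (-5.0, -5.0)
points = [(2.0, 2.0), (2.1, 1.9), (1.9, 2.1), (2.0, 2.2), poison,
          (-2.0, -2.0), (-2.1, -1.9), (-1.9, -2.1)]
labels = [1, 1, 1, 1, 1, 0, 0, 0]

kept = sanitize(points, labels)
kept_points = [p for p, _ in kept]
```

The median is used as the center rather than the mean because a single far-away poison would drag the mean toward itself and make the clean points look like outliers.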

Adversarial Training for Poisoning

Generate poisons during training and inject them into the training batches. This desensitizes the model to poisoned data, withstands adaptive attacks, and can outperform DP-SGD (differentially private SGD) used as a defense.

No Silver Bullet

Defenses often trade accuracy for robustness. Adversaries adapt. Defense-in-depth: combine multiple techniques, monitor for drift, retrain periodically.

08 //

Summary

Data Poisoning & Model Evasion - Takeaways
  • Exploratory (evasion) - probe decision boundary, craft inputs to evade (e.g., PAYL, polymorphic blending)
  • Causative (poisoning) - inject bad data → wrong model (e.g., noise injection on worm signatures)
  • PAYL - byte frequency anomaly detection; polymorphic blending defeats it via substitution + padding
  • Polygraph - conjunction, token-subsequence, Bayes signatures; all defeated by crafted fake anomalous flows
  • Defenses - complex models, multiple IDSs, randomness; precise flow classifier for poisoning is open