Botnet Detection

Topics: Bot · Botnet · BotHunter · BotMiner · DNS-Based
01 //

Botnets

A bot, often called a zombie, is a computer controlled by malware without the user's knowledge or consent. A botnet is a network of bots controlled by a botmaster. Botnets are a key platform for fraud and other for-profit exploits.

Botnet Definition

A coordinated group of malware instances controlled via command-and-control (C&C) channels. C&C architectures: centralized (IRC, HTTP) or distributed (P2P).

Botnet Tasks

>95% of spam
DDoS attacks
Click fraud
Phishing & pharming
Key logging, identity theft
Malware distribution (spyware)

Also: anonymized criminal/terrorist communication.

Network Monitoring Context

Traditional firewalls/NIDS identify obvious attack traffic (e.g. exploit payloads). Advanced monitoring is needed because: (1) botnet HTTP-based C&C looks like normal web traffic; (2) mobile devices compromised outside the perimeter bypass traditional defenses.

02 //

Why Traditional Measures Fail

Anti-Virus

Bots evade AV with packers, rootkits, and frequent updates. AV sees one host at a time and has no big picture, while bots operate over the long term. Bots can detect honeypots.

Honeypot

Not scalable; mostly passive. Bots can discover and avoid honeynets.

IDS/IPS

Examine only specific aspects of traffic; rely on exploit-based signatures.

Detection Challenges

Bot infection is multi-faceted and multi-phased. Bots are stealthy and dynamically evolving. C&C design is flexible. Static/signature approaches may fail.

Detection Guidelines
  • Bot: non-human - distinguish from normal traffic and older attacks
  • Net: bots connected; activities coordinated
  • For profit: long-term use, updates; coordinated C&C

03 //

BotHunter: Vertical Dialog Correlation

Monitors two-way flows between the internal network and the Internet. Correlates inbound intrusion alarms with outbound communication patterns. Produces a bot infection profile. The correlation is vertical (dialog-based): it follows a single host through an infection lifecycle model.

Infection Lifecycle (E1–E5)
E1: Inbound Scan
E2: Infection (A→V)
E3: Egg Download (V→A)
E4: C&C Comms (V→C)
E5: Outbound Scan

A=Attacker, V=Victim, C=C&C. External stimulus alone cannot trigger an alert; an alert requires at least two signs of internal (outbound) bot behavior.
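
The alert rule above can be sketched as a small per-host event tracker. This is a minimal illustration, not BotHunter's actual correlation engine: the class name, method names, and the choice to count distinct event types among E3–E5 are assumptions based on a simplified reading of these notes.

```python
from collections import defaultdict

# E1 and E2 are inbound (external stimulus); E3-E5 are outbound behavior
# of the internal host, following the lifecycle listed above.
OUTBOUND_EVENTS = {"E3", "E4", "E5"}


class DialogCorrelator:
    """Hypothetical tracker of lifecycle events seen per internal host."""

    def __init__(self, min_outbound=2):
        # min_outbound=2 mirrors the "two signs of internal behavior" rule.
        self.min_outbound = min_outbound
        self.events = defaultdict(set)

    def observe(self, host, event):
        """Record one sensor alert (e.g. "E4") for an internal host."""
        self.events[host].add(event)

    def is_infected(self, host):
        """True once enough distinct outbound event types have been seen."""
        outbound = self.events.get(host, set()) & OUTBOUND_EVENTS
        return len(outbound) >= self.min_outbound


correlator = DialogCorrelator()
correlator.observe("10.0.0.5", "E1")  # inbound scan
correlator.observe("10.0.0.5", "E2")  # inbound exploit
print(correlator.is_infected("10.0.0.5"))  # inbound evidence only
correlator.observe("10.0.0.5", "E3")  # egg download
correlator.observe("10.0.0.5", "E4")  # C&C communication
print(correlator.is_infected("10.0.0.5"))  # two outbound signs
```

Inbound events are still recorded because they enrich the infection profile, even though they cannot raise an alert by themselves in this sketch.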

BotHunter Sensor Suite

SCADE

Statistical Scan Anomaly Detection Engine. Weighted scan detection: inbound (E1), outbound (E5). Bounded memory; failed connections to vulnerable ports = high weight.
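
A rough sketch of weighted scan scoring follows. The port list, weights, and threshold are hypothetical values chosen for illustration and are not taken from SCADE; the bounded-memory bookkeeping is omitted.

```python
# Hypothetical values for illustration only; not taken from BotHunter.
VULNERABLE_PORTS = {135, 139, 445, 1433}
HIGH_WEIGHT = 3.0  # failed connection to a commonly exploited port
LOW_WEIGHT = 1.0   # failed connection to any other port


def scan_score(failed_connections):
    """Sum weights over failed attempts given as (dst_ip, dst_port) pairs."""
    score = 0.0
    for _dst_ip, dst_port in failed_connections:
        score += HIGH_WEIGHT if dst_port in VULNERABLE_PORTS else LOW_WEIGHT
    return score


def is_scanner(failed_connections, threshold=10.0):
    """Flag a source once its weighted score reaches the threshold."""
    return scan_score(failed_connections) >= threshold


attempts = [(f"192.0.2.{i}", 445) for i in range(4)]
print(scan_score(attempts), is_scanner(attempts))
```

With these assumed weights, four failed connections to port 445 score as much as twelve failures on ordinary ports, which is the intent of weighting vulnerable ports more heavily.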

SLADE

Statistical payLoad Anomaly Detection Engine. Lossy n-gram model (4-grams hashed into a 2048-element vector). Detects suspicious payloads; lower false-positive rate than PAYL.
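
The lossy n-gram idea can be sketched as hashing every 4-byte window of a payload into a fixed 2048-bucket frequency vector. The CRC32 hash and the L1 distance below are assumptions standing in for whatever hash and anomaly score SLADE actually uses.

```python
import zlib

N = 4               # n-gram length, from the notes above
VECTOR_SIZE = 2048  # number of buckets, from the notes above


def lossy_ngram_vector(payload: bytes):
    """Hash each 4-byte window into one of 2048 buckets; return frequencies."""
    counts = [0] * VECTOR_SIZE
    total = max(len(payload) - N + 1, 0)
    for i in range(total):
        # Many distinct n-grams share a bucket: this is the "lossy" part.
        bucket = zlib.crc32(payload[i:i + N]) % VECTOR_SIZE
        counts[bucket] += 1
    if total == 0:
        return [0.0] * VECTOR_SIZE
    return [c / total for c in counts]


def distance(vec_a, vec_b):
    """L1 distance between two frequency vectors (stand-in anomaly score)."""
    return sum(abs(a - b) for a, b in zip(vec_a, vec_b))


normal = lossy_ngram_vector(b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n")
odd = lossy_ngram_vector(b"\x90" * 64 + b"\xcc\xeb\x1f\x5e")
print(distance(normal, odd))
```

The fixed vector size keeps memory constant regardless of how many distinct 4-grams appear, at the cost of collisions between unrelated n-grams.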

Signature Engine

Snort/Bleeding Edge rules. e1–e5.rules: exploits, egg downloads, C&C, outbound scans.

04 //

BotMiner: Protocol & Structure Independent

Botnets can change C&C content (encryption), protocols (IRC, HTTP), structures (P2P), servers, infection models. BotMiner uses both vertical and horizontal correlation. Key insight: bots are for long-term use; communication and activities are coordinated/similar.

BotMiner Pipeline
  1. C-Plane: Cluster C-flows (protocol, srcIP, dstIP, dstPort, time, bytes). Features: FPH (flows per hour), PPF (packets per flow), BPP (bytes per packet), BPS (bytes per second). Two-step clustering: coarse (8 features, X-means) → refined (52 features).
  2. A-Plane: Cluster hosts by activity type (scan, exploit, binary download, spam), then by activity features.
  3. Cross-Plane: Intersect A-plane and C-plane clusters. More intersections = stronger evidence. Hosts sharing the same activity and the same C-cluster → same botnet.
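
The cross-plane step can be illustrated with a simplified score: for each host, sum the Jaccard overlap of every A-plane cluster and C-plane cluster that both contain it. This is an assumed simplification for illustration, not BotMiner's exact scoring formula, and the function names are hypothetical.

```python
def jaccard(a, b):
    """Overlap of two host sets, between 0.0 and 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cross_plane_scores(a_clusters, c_clusters):
    """Score every host that appears in at least one A-plane cluster.

    A host gains evidence for each (A-cluster, C-cluster) pair containing
    it, weighted by how strongly the two clusters overlap.
    """
    scores = {host: 0.0 for cluster in a_clusters for host in cluster}
    for a in a_clusters:
        for c in c_clusters:
            overlap = jaccard(a, c)
            for host in a & c:
                scores[host] += overlap
    return scores


a_clusters = [{"h1", "h2", "h3"}, {"h2", "h3"}]  # e.g. scanners, spammers
c_clusters = [{"h2", "h3", "h4"}]                # similar C-flows
for host, score in sorted(cross_plane_scores(a_clusters, c_clusters).items()):
    print(host, round(score, 3))
```

Here h2 and h3 appear in two activity clusters and share a communication cluster, so they accumulate the most evidence; h1 shows activity but no matching communication pattern, and h4 shows no malicious activity and is never scored.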

Suspicious Behavior

Indicative: a link to a C&C server, IRC traffic on specific ports, SMTP traffic. Plain DNS traffic is not suspicious by itself, but simultaneous identical DNS requests are. Noticeable performance reduction is not typical bot behavior (bots are stealthy).

05 //

DNS-Based Botnet Detection

Botnets use DNS to locate C&C servers and malware hosting sites. Recursive DNS monitoring at the ISP: analyze traffic from internal hosts to the recursive DNS server and detect abnormal patterns.

Dynamic DNS Abuse

Botnet authors reuse one second-level domain (SLD) with many third-level domains (3LDs), since SLD purchases are traceable and reuse aids stealth. Cluster 3LDs by name similarity and resolved IP subnets. Sum look-ups per cluster.
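
A minimal sketch of this clustering follows, under several assumptions: the last two labels of a name are treated as the SLD (which ignores public suffixes such as co.uk), "same subnet" means the same /24, and name similarity uses `difflib.SequenceMatcher` with an arbitrary 0.8 threshold. The record layout and function names are hypothetical.

```python
import ipaddress
from difflib import SequenceMatcher


def split_name(fqdn):
    """Naive split: returns (3LD label, SLD) using the last three labels."""
    labels = fqdn.lower().rstrip(".").split(".")
    third = labels[-3] if len(labels) >= 3 else ""
    return third, ".".join(labels[-2:])


def subnet24(ip):
    """The /24 network containing an IPv4 address."""
    return ipaddress.ip_network(f"{ip}/24", strict=False)


def similar(a, b, threshold=0.8):
    return SequenceMatcher(None, a, b).ratio() >= threshold


def cluster_lookups(records):
    """Greedy clustering of (fqdn, resolved_ip, lookup_count) records."""
    clusters = []
    for fqdn, ip, count in records:
        third, sld = split_name(fqdn)
        net = subnet24(ip)
        for cl in clusters:
            if cl["sld"] != sld:
                continue
            same_net = net in cl["subnets"]
            if same_net or any(similar(third, n) for n in cl["names"]):
                break  # join this existing cluster
        else:
            cl = {"sld": sld, "names": set(), "subnets": set(), "lookups": 0}
            clusters.append(cl)
        cl["names"].add(third)
        cl["subnets"].add(net)
        cl["lookups"] += count  # sum look-ups per cluster
    return clusters


records = [
    ("bot1.dyn.example", "203.0.113.10", 40),
    ("bot2.dyn.example", "203.0.113.77", 35),
    ("www.dyn.example", "198.51.100.5", 3),
    ("bot1.other.example", "203.0.113.12", 9),
]
for cl in cluster_lookups(records):
    print(cl["sld"], sorted(cl["names"]), cl["lookups"])
```

Summing per cluster matters because a botnet that spreads its look-ups across many 3LDs can keep each individual name below any per-name threshold.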

Look-up Arrival Rate

Bots resolve C&C immediately after boot. Exponential/spike arrival (time zones, schedules). Normal users have smoother patterns.
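
One simple way to quantify "spiky versus smooth" is the peak-to-mean ratio of hourly look-up counts. This metric and its threshold are assumptions chosen for illustration; the notes do not specify how the arrival pattern is scored.

```python
def peak_to_mean(hourly_counts):
    """Ratio of the busiest hour to the average hour (0.0 if no look-ups)."""
    total = sum(hourly_counts)
    if total == 0:
        return 0.0
    return max(hourly_counts) / (total / len(hourly_counts))


def looks_spiky(hourly_counts, threshold=4.0):
    """Hypothetical rule: flag names whose look-ups concentrate in one hour."""
    return peak_to_mean(hourly_counts) >= threshold


smooth = [10, 12, 11, 9, 10, 12, 11, 10]  # steady human-driven look-ups
spiky = [0, 0, 90, 5, 1, 0, 0, 0]         # burst when many bots come online
print(peak_to_mean(smooth), looks_spiky(smooth))
print(peak_to_mean(spiky), looks_spiky(spiky))
```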

Anomalous Domain Names

Botnet domains are often random-looking (e.g. wbghid.1dumb.com), with long, random 3LDs. Train a Bloom filter and a Markov model on known names; a name is "new and suspicious" if it is not in the filter and does not fit the model.
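
The "not in filter, doesn't fit model" test can be sketched with a small Bloom filter and a character-bigram Markov model. Everything here is an assumption for illustration: the filter size, the SHA-256 based hashing, add-one smoothing, the tiny training list, and the -3.3 threshold.

```python
import hashlib
import math
from collections import defaultdict


class BloomFilter:
    """Minimal Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=8192, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


class BigramModel:
    """First-order character Markov model with add-one smoothing."""

    ALPHABET = 38  # assumption: a-z, 0-9, hyphen, and an end marker

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))
        self.totals = defaultdict(int)

    def train(self, names):
        for name in names:
            padded = f"^{name}$"
            for prev, cur in zip(padded, padded[1:]):
                self.counts[prev][cur] += 1
                self.totals[prev] += 1

    def avg_log_prob(self, name):
        """Mean log-probability per transition; lower means less name-like."""
        padded = f"^{name}$"
        pairs = list(zip(padded, padded[1:]))
        logp = 0.0
        for prev, cur in pairs:
            num = self.counts.get(prev, {}).get(cur, 0) + 1
            den = self.totals.get(prev, 0) + self.ALPHABET
            logp += math.log(num / den)
        return logp / len(pairs)


def is_new_and_suspicious(label, seen, model, threshold=-3.3):
    return label not in seen and model.avg_log_prob(label) < threshold


known = ["mail", "www", "ftp", "login", "shop", "news", "blog", "home", "secure", "portal"]
seen, model = BloomFilter(), BigramModel()
model.train(known)
for name in known:
    seen.add(name)

for label in ["mail", "wbghid"]:
    print(label, round(model.avg_log_prob(label), 3), is_new_and_suspicious(label, seen, model))
```

A real deployment would train on a large corpus of legitimate names and tune the threshold on held-out data; with a training list this small the scores only illustrate the direction of the effect.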

Propagation & Growth

Exploit-based: exponential. Email-based: exponential or linear. Drive-by: sublinear. Monitor popularity growth of suspicious names.

Sinkholing

A DynDNS CNAME record can be updated to point to a sinkhole, which redirects bots to a researcher-controlled server for analysis. dnstop alerts on updates.

06 //

BotMiner Evasion

Botnets can evade detection by manipulating patterns or using undetectable channels.

C-Plane Evasion

Manipulate communication patterns. Introduce random packets to reduce similarity between C&C flows.

A-Plane Evasion

Spam slowly. Send spam through Gmail (HTTPS) and download executables over HTTPS; the traffic is encrypted and hard to inspect.

07 //

Summary

Botnet Detection Takeaways
  • BotHunter - Vertical dialog correlation; infection lifecycle E1–E5; SCADE, SLADE, signatures
  • BotMiner - C-plane (communication) + A-plane (activity) clustering; cross-plane correlation; protocol independent
  • DNS-based - Dynamic DNS abuse; clustered 3LDs; look-up arrival spikes; anomalous names
  • Evasion - Noise in C-plane; slow/encrypted activity in A-plane