Malware Analysis

Topics: Polymorphism · Metamorphism · Packers · Unpackers · Evasion · Behavioral Detection
01 //

Malware Categories

Malware is software designed to infiltrate or damage systems. Understanding how attackers classify and hide malware helps defenders build better detection tools.

Viruses

Attach to executables; spread by running infected files

Worms

Exploit network services to spread automatically

Trojans

Claim to be useful software; hide malicious payload

02 //

Traditional Detection

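Signature-based detection matches known byte sequences extracted from malware samples. Producing those signatures depends on manual analysis and reverse engineering. Purely syntactic signatures are easily evaded, because the bytes can change while the behavior stays the same.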
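As a rough illustration of byte-sequence matching and why it is brittle, here is a minimal sketch. The signature names, byte patterns, and the `scan` helper are invented for this example and do not come from any real scanner.

```python
# Toy signature scanner: flags a sample if a known byte pattern occurs in it.
# Signature names and patterns are hypothetical, chosen only for illustration.
SIGNATURES = {
    "Demo.A": bytes.fromhex("deadbeef"),
    "Demo.B": b"EVIL_MARKER",
}

def scan(sample):
    """Return the sorted names of all signatures whose bytes appear in sample."""
    return sorted(name for name, pattern in SIGNATURES.items() if pattern in sample)

original = b"\x90\x90" + bytes.fromhex("deadbeef") + b"\xc3"

# Splitting the pattern with one extra byte breaks the exact match.
# (In real code the inserted byte would have to be a harmless instruction.)
mutated = b"\x90\x90" + bytes.fromhex("dead") + b"\x90" + bytes.fromhex("beef") + b"\xc3"

print(scan(original))  # ['Demo.A']
print(scan(mutated))   # []
```

The exact-match test is what makes the signature syntactic: it says nothing about what the bytes do, only what they are.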

03 //

Evasion Methods

Malware authors use various techniques to evade detection. These fall into three broad categories depending on what the defender is doing.

Vs Signature

Polymorphism and metamorphism alter the binary so static byte signatures fail.

Vs Dynamic

Anti-debugging, anti-VM, and emulator detection—malware exits or sleeps if run in a lab.

Vs Static

Anti-disassembly, packing, and control-flow obfuscation hide the real code.

04 //

Polymorphic Code

Polymorphic malware encrypts its main body and uses a different encryption key for each copy. The decryptor (which must run in the clear) may have several variants or be obfuscated. Byte-sequence signatures on the body fail because the ciphertext changes every time.

Detection Approaches
  • Signature the decryptor – Works if the decryptor is not heavily obfuscated
  • Emulation – Run the decryptor in a safe emulator; once decrypted, the body can be scanned
  • Malware response – Many polymorphic samples counter emulation with anti-emulation checks and refuse to decrypt or run inside an emulator or sandbox
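The scheme above can be sketched with a toy XOR cipher. This is a simplified model, not how any particular sample works: `BODY`, `SIGNATURE`, the keys, and the helper names are all assumptions made for the example, and "emulation" is reduced to calling the decryption function.

```python
BODY = b"PAYLOAD: toy body with a scannable MARKER inside"
SIGNATURE = b"MARKER"  # hypothetical body signature

def xor(data, key):
    """XOR data with a repeating key; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def make_copy(key):
    """Model one polymorphic copy: the key it carries plus the encrypted body."""
    return {"key": key, "ciphertext": xor(BODY, key)}

def emulate_decryptor(copy):
    """Stand-in for running the cleartext decryptor inside an emulator."""
    return xor(copy["ciphertext"], copy["key"])

copy_a = make_copy(b"\x13\x37\xc0\xde")
copy_b = make_copy(b"\xa5\x5a\x0f\xf0")

# Each copy has different ciphertext, so a body signature never matches it.
print(copy_a["ciphertext"] != copy_b["ciphertext"])  # True
print(SIGNATURE in copy_a["ciphertext"])             # False

# After emulated decryption the same body reappears and can be scanned.
print(SIGNATURE in emulate_decryptor(copy_a))        # True
print(SIGNATURE in emulate_decryptor(copy_b))        # True
```

A sample with anti-emulation checks would simply never reach the point where `emulate_decryptor` returns the cleartext, which is why the third bullet undermines the second.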
05 //

Metamorphic Code

Metamorphic malware avoids encryption entirely. Instead, it rewrites its own code so each instance looks different but behaves the same. The entire body is obfuscated through transformations like code reordering, garbage insertion, equivalent instruction replacement, jump insertion, and packing.

Why Detection Is Hard

Deciding whether two programs are semantically equivalent is undecidable in general. Syntactic signatures fail because the surface form changes from one instance to the next. Semantics-based detection (e.g., behavior, data-flow) is more robust than pure syntax.
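To make the gap between syntax and semantics concrete, the sketch below applies two of the transformations named above (garbage insertion and equivalent instruction replacement) to a toy register program. The instruction set, the `run` interpreter, and the `mutate` function are hypothetical and far simpler than a real metamorphic engine.

```python
import random

def run(program):
    """Interpret a toy register program and return the final register state."""
    regs = {}
    for op, *args in program:
        if op == "mov":
            regs[args[0]] = args[1]
        elif op == "add":
            regs[args[0]] = regs.get(args[0], 0) + args[1]
        elif op == "sub":
            regs[args[0]] = regs.get(args[0], 0) - args[1]
        elif op == "nop":
            pass
        else:
            raise ValueError(f"unknown op {op!r}")
    return regs

def mutate(program, rng):
    """Return a variant with the same behavior but a different surface form."""
    out = []
    for op, *args in program:
        if rng.random() < 0.5:
            out.append(("nop",))                    # garbage insertion
        if op == "add":
            out.append(("sub", args[0], -args[1]))  # add r, n  ==  sub r, -n
        elif op == "sub":
            out.append(("add", args[0], -args[1]))  # sub r, n  ==  add r, -n
        else:
            out.append((op, *args))
    return out

original = [("mov", "eax", 5), ("add", "eax", 3),
            ("mov", "ebx", 2), ("add", "ebx", 10)]
variant = mutate(original, random.Random(1))

print(variant != original)            # True: the instruction list differs
print(run(variant) == run(original))  # True: the final state is identical
```

Here equivalence is easy to check because the toy language has no loops or branches. The undecidability result concerns general programs, where no such exhaustive check exists.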

06 //

Anti-Static Analysis

These techniques make static inspection (disassembly, decompilation) difficult or misleading.

07 //

Dynamic Analysis

Running malware in a sandbox (e.g., VM) defeats most anti-static techniques—the real code must execute to be observed. However, malware can detect analysis environments and refuse to run.

Tracing Approaches

Hook system calls via DLL injection, a kernel driver, or the virtual machine monitor. Tools like CWSandbox and TTAnalyze capture API calls and file/network activity.
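The idea of hooking can be sketched in a few lines, with the caveat that this is only an analogy: real tracers patch native API entry points or intercept calls in the kernel or hypervisor. `ToyAPI`, `install_hooks`, and `sample` are hypothetical names invented here, and the "API" does nothing real.

```python
class ToyAPI:
    """Stand-in for an OS API surface; the methods perform no real I/O."""
    def create_file(self, path):
        return f"handle:{path}"

    def connect(self, host, port):
        return True

def install_hooks(api, names, log):
    """Replace each named method with a wrapper that logs, then forwards."""
    def make_hook(name, original):
        def hooked(*args):
            log.append((name, args))   # record the call for the analyst
            return original(*args)     # then behave exactly like the original
        return hooked

    for name in names:
        setattr(api, name, make_hook(name, getattr(api, name)))

def sample(api):
    """Toy 'program under analysis' that touches files and the network."""
    api.create_file("dropped.bin")
    api.connect("203.0.113.5", 8080)

api = ToyAPI()
trace = []
install_hooks(api, ["create_file", "connect"], trace)
sample(api)

for name, args in trace:
    print(name, args)
```

Because the wrapper forwards to the original, the monitored program sees unchanged results. It can still notice the hook by inspecting the patched entry point.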

Evasion

Malware can detect the tracer or VM and then exit or sleep, so defenders must hide the analysis environment. Rootkits can also detect the kernel drivers used for tracing.

08 //

Unpackers

Packed malware hides its real code until runtime. Unpackers automatically reveal the hidden code so it can be analyzed statically or used for signatures.

PolyUnpack

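Builds a static model of the original code from the packed binary; any code that executes outside that model is treated as an unpacked region. The check is made by single-stepping the program and tracking EIP.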
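A heavily simplified sketch of the divergence check follows. It assumes the static model is just a set of instruction addresses and the trace is a list of EIP values. The addresses and the function name are made up for illustration, and a real implementation would compare instructions, not only addresses.

```python
def first_unpacked_address(static_model, eip_trace):
    """Return the first executed address absent from the static model, or None."""
    for eip in eip_trace:
        if eip not in static_model:
            return eip
    return None

# Addresses the static disassembly found in the packed binary (hypothetical).
static_model = {0x401000, 0x401002, 0x401005, 0x401007}

# EIP values observed while single-stepping (hypothetical): the stub runs,
# then control jumps to code that was not present statically.
eip_trace = [0x401000, 0x401002, 0x401005, 0x401007, 0x402000, 0x402003]

hit = first_unpacked_address(static_model, eip_trace)
print(hex(hit))  # 0x402000
```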

Renovo / OmniUnpack

Heuristic: memory that is freshly written and later executed is treated as unpacked code. Tracking can be fine-grained (per instruction, as in Renovo) or coarse (per page, checked at system calls, as in OmniUnpack).
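The write-then-execute heuristic can be sketched over a recorded event stream. This is a toy model under stated assumptions: the event format, the addresses, and the `granularity` parameter are invented here, and real tools track memory inside an emulator or through page protections instead of a Python list.

```python
def find_unpacked(events, granularity=1):
    """Return executed addresses whose tracking unit was written earlier.

    events      -- sequence of ("write", addr) or ("exec", addr) tuples
    granularity -- size of the tracking unit in bytes (1 = per address,
                   0x1000 = per 4 KiB page)
    """
    dirty = set()   # units written since the program started
    hits = []
    for kind, addr in events:
        unit = addr // granularity
        if kind == "write":
            dirty.add(unit)
        elif kind == "exec" and unit in dirty:
            hits.append(addr)
    return hits

events = [
    ("exec", 0x1000),   # unpacking stub runs
    ("write", 0x2000),  # stub writes decoded bytes
    ("write", 0x2001),
    ("exec", 0x1004),
    ("exec", 0x2000),   # control reaches freshly written memory
    ("exec", 0x2010),   # same page, but this address was never written
]

print([hex(a) for a in find_unpacked(events)])                      # ['0x2000']
print([hex(a) for a in find_unpacked(events, granularity=0x1000)])  # ['0x2000', '0x2010']
```

The second call shows the trade-off: page-level tracking is cheaper but also reports `0x2010`, which shares a page with written bytes without having been written itself.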

Packers

Examples include UPX, Armadillo, and commercial protectors. Packers often include anti-debugging, anti-tracing, and obfuscation features.

09 //

Trigger-based Behavior

Malware often hides malicious behavior behind a trigger—a condition that may not be met during a short sandbox run. A single execution path misses the hidden code.

Multipath Exploration

Systems such as the one by Moser et al. and BitScope combine an emulator, taint tracking, and path constraints. They save and reload execution state and use an SMT solver to find inputs that drive execution down alternative paths. Hash functions are non-linear, so symbolic execution struggles to discover inputs that satisfy conditions like Hash(input)==target. In those cases, input discovery often requires heuristics or concrete execution.
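The following sketch illustrates both points with standard-library code only. It is not how the tools above are built: branch outcomes are forced directly, where the real systems snapshot state and solve constraints, and the hash condition is attacked by guessing from a candidate list. The branch labels, `sample`, `explore`, and the trigger word are all hypothetical.

```python
import hashlib
from itertools import product

def sample(decide):
    """Toy sample; decide(label) stands in for evaluating a branch condition."""
    behavior = ["start"]
    if decide("date_is_trigger_day"):
        behavior.append("delete_files")
    if decide("hash_of_command_matches"):
        behavior.append("open_backdoor")
    return behavior

def explore(program, labels):
    """Run the program once per combination of forced branch outcomes."""
    results = {}
    for outcome in product([False, True], repeat=len(labels)):
        forced = dict(zip(labels, outcome))
        results[outcome] = program(lambda label: forced[label])
    return results

labels = ["date_is_trigger_day", "hash_of_command_matches"]
paths = explore(sample, labels)
single_run = sample(lambda label: False)   # what one short sandbox run sees
all_behaviors = {b for trace in paths.values() for b in trace}

print(single_run)             # ['start']
print(sorted(all_behaviors))  # ['delete_files', 'open_backdoor', 'start']

# Forcing a branch reveals the hidden code, but not an input that triggers it.
# For a hash condition the input is found here by concrete guessing.
TARGET = hashlib.sha256(b"wake").hexdigest()  # hypothetical trigger command

def find_input(candidates):
    """Return the first candidate whose SHA-256 digest equals TARGET, or None."""
    for candidate in candidates:
        if hashlib.sha256(candidate).hexdigest() == TARGET:
            return candidate
    return None

print(find_input([b"run", b"wake", b"stop"]))  # b'wake'
```

The number of forced combinations doubles with every branch, which is one reason practical systems restrict exploration to branches that depend on tainted input.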

10 //

Summary

Key Takeaways
  • Polymorphic: Encrypted body, varying keys; decryptor may be obfuscated; signatures fail on the body
  • Metamorphic: Full code rewrite; semantics preserved; equivalence undecidable; syntax-based signatures fail
  • Unpackers: PolyUnpack (static model divergence), Renovo/OmniUnpack (write-then-execute heuristic)
  • Behavioral detection: Runtime templates and system-call monitoring; evaded by trigger-based and conditional behavior