Bugs are common in software. Programmers make implementation errors, and finding them becomes harder as codebases grow. Even though testing often consumes ~80% of the development budget, bugs still slip through. When a bug can be abused by an attacker, it becomes a security vulnerability.
The path from a coding mistake to an actual attack typically follows:
Security researchers and vendors disclose vulnerabilities at cve.mitre.org. Developers then issue patches. Public disclosure helps defenders fix systems before attackers exploit them.
A buffer overflow occurs when a program writes more data into a fixed-size buffer than it can hold. The extra bytes spill into adjacent memory, overwriting whatever is there. On the stack, that includes local variables, saved registers, and, crucially, the return address that tells the CPU where to go when the function finishes.
Data is laid out in order: buffer → other locals → saved registers → return address. An overflow writes upward in memory, so excess bytes overwrite variables and eventually the return address.
Overwrite passwordok or userid to bypass authentication. A crafted input can flip a "false" to "true" or change a user ID to gain privileged access.
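The effect on a neighboring variable can be imitated without undefined behavior by modeling the frame as a struct. This is only a sketch: struct frame, its 8-byte buffer, copy_unchecked, and copy_checked are hypothetical names, and on a real stack the compiler decides the actual layout.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a stack frame: the flag sits right after the buffer. */
struct frame {
    char buffer[8];
    int  passwordok;   /* 0 = not authenticated */
};

/* Copies input into the frame starting at the buffer, treating the frame as
 * raw bytes. The only limit is the size of the whole struct, so input longer
 * than the buffer spills into passwordok, as an unchecked copy would on a
 * real stack. */
static void copy_unchecked(struct frame *f, const char *input, size_t len)
{
    size_t room = sizeof *f - offsetof(struct frame, buffer);
    if (len > room)
        len = room;   /* keeps the demonstration itself inside the struct */
    memcpy((unsigned char *)f + offsetof(struct frame, buffer), input, len);
}

/* The fix: never copy more than the buffer can hold. */
static void copy_checked(struct frame *f, const char *input, size_t len)
{
    if (len > sizeof f->buffer)
        len = sizeof f->buffer;
    memcpy(f->buffer, input, len);
}
```

With a zeroed frame and twelve bytes of input, copy_unchecked leaves passwordok nonzero, while copy_checked leaves it at 0.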
Overwrite the return address to point into the buffer itself, where the attacker has placed machine code (shellcode). When the function returns, the CPU executes the attacker's code instead of the real caller.
gets(password) reads input with no length limit into a fixed-size buffer, which is extremely dangerous (gets was removed from the C standard in C11). Safer alternatives: fgets with a size limit, explicit bounds checking, or higher-level safe APIs.
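One way to apply the fgets alternative is sketched below. The helper name read_line_bounded and its return convention are assumptions made for illustration, not part of any standard API.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line into buf without ever writing more than size bytes.
 * Returns 0 if a complete line was stored (newline stripped),
 * -1 on EOF/error or if the line was too long for the buffer. */
static int read_line_bounded(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';   /* strip the trailing newline */
        return 0;
    }

    /* No newline stored: either the input ended here, or the line is
     * longer than the buffer. Look at the next character to decide. */
    int c = getc(in);
    if (c == EOF || c == '\n')
        return 0;

    /* Too long: discard the rest so it is not mistaken for the next line. */
    while (c != '\n' && c != EOF)
        c = getc(in);
    return -1;
}
```

A caller would typically reject the login attempt when the function returns -1 rather than work with a truncated password.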
A heap overflow happens when a program writes past the end of a heap-allocated buffer (from malloc). Unlike the stack, the heap stores dynamically allocated data. Exploits often target the internal structures of the memory allocator (e.g., ptmalloc in glibc, which is derived from dlmalloc).
If an adjacent buffer holds a function pointer (e.g., callback, vtable), overflowing into it lets the attacker replace the pointer with an address of their choosing. The next call goes to attacker-controlled code.
When a chunk is freed and merged with a neighboring free chunk, the allocator "unlinks" that neighbor from a doubly-linked list of free chunks. The unlink macro does FD->bk = BK and BK->fd = FD. By corrupting chunk headers, the attacker controls FD and BK, and thus which memory locations get overwritten with which values.
Free chunks are in a doubly-linked list with forward (FD) and backward (BK) pointers. The unlink macro updates the neighboring chunks' pointers. If the attacker overflows and fakes a free-chunk header, they choose FD and BK so that the unlink writes a chosen value to a chosen address (e.g., return address or GOT entry). Modern glibc has hardened unlink checks to mitigate this.
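The two pointer writes, and the kind of consistency check that defeats a faked header, can be sketched as follows. The struct chunk here is a simplified, hypothetical stand-in for a real allocator's header, and unlink_checked is only loosely modeled on the hardened check; glibc aborts the program rather than returning an error code.

```c
#include <stddef.h>

/* Simplified free-chunk header; real allocators keep more state. */
struct chunk {
    size_t prev_size;
    size_t size;
    struct chunk *fd;   /* forward pointer: next free chunk */
    struct chunk *bk;   /* backward pointer: previous free chunk */
};

/* Classic unlink: two writes driven entirely by the header of p.
 * If an attacker controls p->fd and p->bk, they choose both the
 * addresses written to and the values written. */
static void unlink_classic(struct chunk *p)
{
    struct chunk *FD = p->fd;
    struct chunk *BK = p->bk;
    FD->bk = BK;
    BK->fd = FD;
}

/* Hardened variant: refuses to write unless both neighbors point back
 * at p. Returns 0 on success, -1 if the list looks corrupted. */
static int unlink_checked(struct chunk *p)
{
    struct chunk *FD = p->fd;
    struct chunk *BK = p->bk;
    if (FD->bk != p || BK->fd != p)
        return -1;
    FD->bk = BK;
    BK->fd = FD;
    return 0;
}
```

A forged header whose FD and BK point at unrelated memory fails the check, because that memory does not contain pointers back to the chunk being unlinked.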
Integers have a fixed size (e.g., 32 bits). When a computation exceeds the maximum value, it wraps around instead of raising an error. Examples with 32-bit unsigned integers: 0xffffffff + 1 = 0 and 0x80000001 * 2 = 2. (In C, wrapping is guaranteed only for unsigned types; signed overflow is undefined behavior.) Integer overflow by itself does not change control flow, but it often leads to buffer overflows or bypassed security checks.
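The two wraparound examples can be reproduced with fixed-width unsigned types. The helper names add_u32 and mul_u32 are made up for this sketch.

```c
#include <stdint.h>

/* 32-bit unsigned addition: the result is reduced modulo 2^32. */
static uint32_t add_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}

/* 32-bit unsigned multiplication: the result is reduced modulo 2^32. */
static uint32_t mul_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a * b);
}
```

Here add_u32(0xffffffff, 1) yields 0 and mul_u32(0x80000001, 2) yields 2, with no error reported in either case.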
Suppose p = malloc(x * 4) and x is attacker-controlled. When x * 4 overflows in 32-bit arithmetic (e.g., x = 0x40000001 gives 4), the allocation is tiny (4 bytes). A later p[x] = 0 then writes far out of bounds.
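A common defense is to validate the multiplication before allocating. The sketch below models the size computation as a 32-bit program would perform it; byte_size_wrapping and byte_size_checked are hypothetical helper names.

```c
#include <stdint.h>

/* The size as computed by the vulnerable code: wraps modulo 2^32. */
static uint32_t byte_size_wrapping(uint32_t x)
{
    return (uint32_t)(x * 4u);
}

/* Checked version: stores x * 4 in *out and returns 0,
 * or returns -1 without touching *out if the product would not fit. */
static int byte_size_checked(uint32_t x, uint32_t *out)
{
    if (x > UINT32_MAX / 4u)
        return -1;
    *out = x * 4u;
    return 0;
}
```

A caller would pass the checked size to malloc only when the function returns 0, and treat -1 as a rejected request.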
Example: if (x + 4 > y) error(); buffer[x] = 0; The check is meant to block large x. But if x = 0xffffffff, then x + 4 wraps around to 3. The check passes for any y >= 3, yet buffer[x] is a huge out-of-bounds access.
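The check can be rearranged so that no intermediate value wraps. In this sketch, check_naive mirrors the flawed test and check_safe is one possible rewrite; both names are made up, and x and y are assumed to be 32-bit unsigned values.

```c
#include <stdint.h>

/* Mirrors the flawed check: returns 1 ("allowed") when x + 4 > y is false.
 * The addition can wrap, so a huge x may be allowed. */
static int check_naive(uint32_t x, uint32_t y)
{
    return !((uint32_t)(x + 4u) > y);
}

/* Same intent without overflow: the subtraction is performed only
 * after confirming that y is at least 4. */
static int check_safe(uint32_t x, uint32_t y)
{
    return y >= 4u && x <= y - 4u;
}
```

For x = 0xffffffff and y = 100, check_naive allows the access while check_safe rejects it. For values that do not wrap, the two functions agree.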
Functions like printf take a format string (e.g., "%d %s") and optional arguments. The format string tells printf how many arguments to expect and where to read them from the stack. If the format string is user-controlled (e.g., printf(user_input)), the attacker can make printf read from arbitrary stack locations or even write to memory.
With printf(argv[1]), an attacker passes %d %d %d .... Each %d makes printf treat the next stack slot as an integer and print it. This leaks locals, return addresses, and other sensitive data.
The %n specifier writes the count of bytes printed so far to the address given by the corresponding argument. For example, printf("AAAA%n", &i) sets i = 4. By controlling the format string and the arguments it consumes from the stack, the attacker can write chosen values to return addresses, GOT entries, etc.
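The write performed by %n can be observed safely when the format string is a literal and the argument is a valid pointer. The function name bytes_before_n is hypothetical, and the sketch assumes a C library that still honors %n; some libraries disable it by default for exactly the reasons described here.

```c
#include <stdio.h>

/* Formats "AAAA" into a scratch buffer and lets %n report how many
 * bytes had been produced when it was reached. */
static int bytes_before_n(void)
{
    char out[16];
    int count = -1;   /* overwritten by %n */
    snprintf(out, sizeof out, "AAAA%n", &count);
    return count;
}
```

In an attack, the pointer consumed by %n is not a legitimate variable like count but whatever value the attacker has arranged to sit in the corresponding stack slot.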
Never pass user input directly as the format string. Use printf("%s", user_input) so the user's input is treated as a plain string, not as format directives.
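The same rule applies to every function in the printf family. A minimal sketch using snprintf follows; the wrapper name format_user_text is made up for illustration.

```c
#include <stdio.h>

/* Copies user-supplied text into out. The format string is a constant
 * "%s", so any % sequences in user_input are treated as ordinary
 * characters. Returns the length snprintf reports for the full text. */
static int format_user_text(char *out, size_t size, const char *user_input)
{
    return snprintf(out, size, "%s", user_input);
}
```

Passing the input "%d %n %s" through this wrapper stores those eight characters verbatim, with no stack reads and no memory writes triggered by the directives.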