AiTechWorlds
Run cat /proc/self/maps in a terminal. You will see a map of every memory region of the cat process — code, libraries, heap, stack. Each row is a VMA (Virtual Memory Area). Together they define a process's entire virtual address space.
55a3c1200000-55a3c1201000 r--p 00000000 08:01 1234567 /usr/bin/cat
55a3c1201000-55a3c1205000 r-xp 00001000 08:01 1234567 /usr/bin/cat
55a3c1205000-55a3c1207000 r--p 00005000 08:01 1234567 /usr/bin/cat
55a3c1207000-55a3c1208000 r--p 00006000 08:01 1234567 /usr/bin/cat
55a3c1208000-55a3c1209000 rw-p 00007000 08:01 1234567 /usr/bin/cat
55a3c2a1e000-55a3c2a3f000 rw-p 00000000 00:00 0 [heap]
7f8c14000000-7f8c14200000 r--p 00000000 08:01 2345678 /usr/lib/x86_64-linux-gnu/libc.so.6
7f8c14200000-7f8c14378000 r-xp 00200000 08:01 2345678 /usr/lib/x86_64-linux-gnu/libc.so.6
7fff89a00000-7fff89a21000 rw-p 00000000 00:00 0 [stack]
7fff89bcd000-7fff89bd1000 r--p 00000000 00:00 0 [vvar]
7fff89bd1000-7fff89bd3000 r-xp 00000000 00:00 0 [vdso]
Six regions in this abridged listing (real output often runs to dozens of lines), one small program. Every byte of virtual address space that cat can legally access is described by this map. Any access outside these regions raises a page fault with no matching VMA, and the kernel delivers SIGSEGV, which by default kills the process before it can read or corrupt memory it does not own.
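To make the row format concrete, here is a minimal Python sketch that splits one maps row into its fields. The `MapsEntry` class and `parse_maps_line` function are names invented for this illustration, not part of any library, and the sketch assumes the usual six-column layout shown above.

```python
import os
from dataclasses import dataclass


@dataclass
class MapsEntry:
    start: int   # inclusive start address
    end: int     # exclusive end address
    perms: str   # e.g. "r-xp"
    offset: int  # offset into the backing file
    dev: str     # major:minor device number
    inode: int   # 0 for anonymous mappings
    path: str    # file path, "[heap]", "[stack]", or "" if unnamed


def parse_maps_line(line: str) -> MapsEntry:
    """Split one /proc/<pid>/maps row into its fields."""
    # The path column is optional and may contain spaces, so split at most 5 times.
    parts = line.split(maxsplit=5)
    addr_range, perms, offset, dev, inode = parts[:5]
    path = parts[5].strip() if len(parts) > 5 else ""
    start, end = (int(x, 16) for x in addr_range.split("-"))
    return MapsEntry(start, end, perms, int(offset, 16), dev, int(inode), path)


if __name__ == "__main__" and os.path.exists("/proc/self/maps"):  # Linux only
    with open("/proc/self/maps") as f:
        for row in f:
            e = parse_maps_line(row)
            size_kib = (e.end - e.start) // 1024
            print(f"{e.start:#x}-{e.end:#x} {e.perms} {size_kib:>8} KiB {e.path}")
```

Run as a script on Linux, this prints the mappings of the Python interpreter itself, with the size of each region computed from the half-open address range.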
The data structure behind each line is vm_area_struct. Understanding it is understanding how Linux implements process isolation, demand paging, copy-on-write, and memory protection in a unified framework.
Every mapped region in a process is described by a struct vm_area_struct (defined in include/linux/mm_types.h):
struct vm_area_struct {
    unsigned long vm_start;   // start virtual address (inclusive)
    unsigned long vm_end;     // end virtual address (exclusive)
    unsigned long vm_flags;   // VM_READ, VM_WRITE, VM_EXEC, VM_SHARED, ...
    pgoff_t vm_pgoff;         // offset within file (for file-backed VMAs)
    struct file *vm_file;     // pointer to file struct (NULL for anonymous)
    const struct vm_operations_struct *vm_ops;  // fault, open, close handlers
    struct mm_struct *vm_mm;  // back-pointer to the process's mm_struct
    // ... red-black tree links, anonymous VMA chain, etc.
};
Key fields and their meaning:
vm_start, vm_end: The half-open range [vm_start, vm_end) of virtual addresses this VMA covers. Always page-aligned (multiples of 4096 with 4 KiB pages).
vm_flags: VM_READ, VM_WRITE, VM_EXEC determine the protection bits set in page table entries. VM_SHARED distinguishes shared from private mappings.
vm_file: Pointer to the struct file for file-backed mappings. NULL for anonymous (heap, stack).
vm_ops: The fault, open, and close handlers. For file-backed mappings, the fault handler reads the page in through the filesystem's readpage. For anonymous mappings, the fault path allocates a zero page.
A process can have thousands of VMAs (every mmap() call, every loaded library section, every malloc() call above the mmap threshold creates VMAs). Linear search would be unacceptable for address lookup.
Before Linux 6.1, all VMAs were organized in a red-black tree anchored at mm_struct.mm_rb. Red-black trees are self-balancing binary search trees guaranteeing O(log n) lookup, insertion, and deletion. Because the height of a red-black tree is at most 2·log2(n+1), a process with 1000 VMAs needs at most about 20 comparisons (typically closer to 10) to find which VMA, if any, contains a given faulting address.
Linux 6.1 added maple trees as a replacement for the VMA red-black tree — a B-tree variant optimized for range operations, improving performance for processes with many VMAs (common in JVM-based applications).
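The address lookup can be sketched in userland with a binary search over a sorted array. This is a stand-in that has the same O(log n) lookup cost, not a red-black tree or maple tree implementation, and `Vma` and `VmaIndex` are names invented for this illustration. The addresses are taken from the sample maps output.

```python
import bisect
from typing import NamedTuple, Optional


class Vma(NamedTuple):
    start: int  # inclusive, like vm_start
    end: int    # exclusive, like vm_end
    name: str


class VmaIndex:
    """Sorted-array stand-in for the kernel's tree: O(log n) lookup by address."""

    def __init__(self, vmas):
        # VMAs never overlap, so sorting by start also sorts by end.
        self.vmas = sorted(vmas)
        self.starts = [v.start for v in self.vmas]

    def find(self, addr: int) -> Optional[Vma]:
        # Binary search for the rightmost VMA whose start is <= addr.
        i = bisect.bisect_right(self.starts, addr) - 1
        if i >= 0 and addr < self.vmas[i].end:
            return self.vmas[i]
        return None  # no mapping covers addr: a fault here ends in SIGSEGV


index = VmaIndex([
    Vma(0x55a3c1201000, 0x55a3c1205000, "/usr/bin/cat"),
    Vma(0x55a3c2a1e000, 0x55a3c2a3f000, "[heap]"),
    Vma(0x7fff89a00000, 0x7fff89a21000, "[stack]"),
])
```

For example, `index.find(0x55a3c1204fff)` returns the cat text mapping, while `index.find(0x55a3c1205000)` returns `None` because the end address is exclusive. The kernel's own find_vma() behaves slightly differently: it returns the first VMA that ends above the address, so callers still have to check that the address is not below vm_start.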
cat /proc/<pid>/smaps # detailed per-VMA info including RSS, swap, etc.
cat /proc/<pid>/smaps_rollup # aggregated totals
pmap -x <pid> # human-readable VMA map with RSS
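The per-VMA counters in smaps are plain "Name: N kB" lines, so they are easy to total in a script. The sketch below is one way to do it. `sum_smaps_fields` is a name invented here, and the Rss and Swap numbers in `SAMPLE` are made-up values for illustration, not measurements.

```python
import re

# Matches counter lines such as "Rss:                  16 kB".
_FIELD = re.compile(r"^(\w+):\s+(\d+) kB$")


def sum_smaps_fields(text: str) -> dict:
    """Sum every 'Name:  N kB' counter across all VMAs in smaps-style text."""
    totals = {}
    for line in text.splitlines():
        m = _FIELD.match(line.strip())
        if m:  # address-range header lines and VmFlags lines do not match
            name, kb = m.group(1), int(m.group(2))
            totals[name] = totals.get(name, 0) + kb
    return totals


# Hypothetical two-VMA excerpt in smaps format (values are illustrative).
SAMPLE = """\
55a3c1201000-55a3c1205000 r-xp 00001000 08:01 1234567 /usr/bin/cat
Size:                 16 kB
Rss:                  16 kB
Swap:                  0 kB
55a3c2a1e000-55a3c2a3f000 rw-p 00000000 00:00 0 [heap]
Size:                132 kB
Rss:                   8 kB
Swap:                  4 kB
"""

print(sum_smaps_fields(SAMPLE))
```

On a live system the same function can be fed the contents of /proc/<pid>/smaps. The result should roughly agree with what smaps_rollup reports for the same process.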
On x86-64 Linux with 4-level paging (48-bit virtual addresses), the virtual address space is 128 TiB for user space and 128 TiB for the kernel (256 TiB total), separated by a large non-canonical gap in the middle:
Null pointer region (0x0 – 0x10000): Normally left unmapped. The vm.mmap_min_addr sysctl (typically 65536) stops unprivileged processes from mapping anything here. Any access to address 0 (or close to it) is a page fault with no valid VMA → SIGSEGV. This is why null pointer dereferences are caught immediately rather than silently reading garbage.
Text segment (r-xp): Read + execute, private. The program's machine code. Multiple processes running the same binary share the same physical pages — the kernel maps the same page cache frames into each process. No physical duplication.
Data and BSS (rw-p): Read + write, private. Initialized global variables (.data) and zero-initialized globals (.bss). The BSS segment in the binary contains only a size, not actual zeros — the kernel zero-fills on demand.
Heap: Anonymous, read + write, private. Grows upward via the brk() system call (for small increments) or mmap(MAP_ANONYMOUS) for large allocations. malloc() manages this region; the kernel only sees the raw pages.
Memory-mapped region: Where mmap() allocations land, between heap top and stack bottom. Shared libraries, mapped files, and malloc() allocations above glibc's MMAP_THRESHOLD (128KB default) live here. With the default x86-64 layout, the kernel places new mappings top-down from just below the stack, searching for a free gap that is large enough.
Stack: Starts at a high virtual address and grows downward. Limited to RLIMIT_STACK (default 8MB). The kernel auto-expands the stack VMA when a page fault occurs just below the current stack bottom (stack growth fault) — no brk() needed.
vDSO (virtual dynamic shared object): A small shared library the kernel maps into every process. It provides fast implementations of frequently called syscalls such as gettimeofday(), clock_gettime(), and time(). These read kernel-exported data without a full syscall (no ring transition), saving on the order of 100ns per call. Calls that need per-process kernel state, such as getpid(), still go through the normal syscall path.
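The 48-bit split behind this layout can be checked with a little arithmetic. The sketch below assumes 4-level paging. The constant and function names are invented for this illustration, and it ignores details such as the exact top-of-user-space limit the kernel enforces.

```python
VA_BITS = 48                   # 4-level paging on x86-64 (assumption of this sketch)
USER_TOP = 1 << (VA_BITS - 1)  # first address above the user half
TIB = 1 << 40


def is_canonical(addr: int) -> bool:
    """True if bits 63..47 are all equal (a sign-extended 48-bit address)."""
    top = addr >> (VA_BITS - 1)  # the upper 17 bits of a 64-bit address
    return top == 0 or top == (1 << (64 - VA_BITS + 1)) - 1


def is_user_address(addr: int) -> bool:
    """True if addr falls in the lower (user) half of the address space."""
    return 0 <= addr < USER_TOP


print(USER_TOP // TIB)                    # size of the user half in TiB
print(is_user_address(0x7fff89a00000))    # the [stack] start from the sample
print(is_canonical(0x0000800000000000))   # first address inside the gap
```

The user half works out to 2^47 bytes, which is the 128 TiB figure. Addresses whose upper bits are neither all zeros nor all ones fall into the gap and are never valid.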
Without ASLR, every process of the same binary starts at the same addresses. An attacker who can trigger a buffer overflow knows exactly where to redirect execution: the return address, the stack, and the libc functions are all at predictable locations.
ASLR randomizes the base addresses of text, stack, heap, and mmap regions on each execution:
# Control ASLR:
cat /proc/sys/kernel/randomize_va_space
# 0 = disabled
# 1 = randomize stack, mmap, VDSO
# 2 = randomize stack, mmap, VDSO, and heap (default on Linux)
echo 2 > /proc/sys/kernel/randomize_va_space
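One way to see the randomization is to compare the maps output of two separate runs of the same program. The sketch below is a rough illustration. `region_starts`, `moved_regions`, and `snapshot` are names invented here, and `snapshot` assumes a Linux system with `cat` on the PATH.

```python
import subprocess


def region_starts(maps_text: str) -> dict:
    """Map each named region to its lowest start address."""
    starts = {}
    for line in maps_text.splitlines():
        parts = line.split(maxsplit=5)
        if len(parts) < 6:
            continue  # unnamed anonymous mapping: nothing to key on
        name = parts[5].strip()
        start = int(parts[0].split("-")[0], 16)
        starts[name] = min(start, starts.get(name, start))
    return starts


def moved_regions(run_a: str, run_b: str) -> list:
    """Names of regions whose base address differs between two runs."""
    a, b = region_starts(run_a), region_starts(run_b)
    return sorted(name for name in a.keys() & b.keys() if a[name] != b[name])


def snapshot() -> str:
    """Maps of a fresh `cat` process (Linux only)."""
    result = subprocess.run(["cat", "/proc/self/maps"],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `moved_regions(snapshot(), snapshot())` with randomize_va_space set to 2 would be expected to list the stack, heap, libraries, and (for a PIE binary) the executable itself. With the setting at 0 the list should come back empty.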
On x86-64 with ASLR=2:
PIE (Position Independent Executable): ASLR randomizes the text segment only if the binary is compiled as PIE (-fPIE -pie). Most modern Linux distributions build system binaries as PIE by default. The text segment is then loaded at a random address via mmap.
file /usr/bin/ls
# /usr/bin/ls: ELF 64-bit LSB pie executable, x86-64 ...
# ^^^
# "pie executable" confirms PIE
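The same information can be read straight from the ELF header. The following sketch checks the e_type field, which is ET_DYN for position-independent objects and ET_EXEC for fixed-address executables. Note the limitation: ET_DYN covers both PIE executables and shared libraries, so this is a coarser test than the one `file` performs. The function names are invented for this illustration.

```python
import struct

ET_EXEC = 2  # fixed-address executable
ET_DYN = 3   # position-independent: PIE executable or shared library


def elf_type(header: bytes) -> int:
    """Return e_type from the first 18 (or more) bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] gives the byte order: 1 = little-endian, 2 = big-endian.
    fmt = "<H" if header[5] == 1 else ">H"
    # e_type is the 2-byte field right after the 16-byte e_ident array.
    return struct.unpack_from(fmt, header, 16)[0]


def is_position_independent(path: str) -> bool:
    """True if the ELF file at `path` has e_type ET_DYN."""
    with open(path, "rb") as f:
        return elf_type(f.read(18)) == ET_DYN
```

On a typical modern distribution, `is_position_independent("/usr/bin/ls")` would be expected to return `True`, matching the "pie executable" label above.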
ASLR bypass techniques (why it is not a complete defense):
Despite limitations, ASLR significantly raises the cost of exploitation and is considered essential defense-in-depth.
Linux allows allocating more virtual memory than physical RAM + swap. This is called memory overcommit.
cat /proc/sys/vm/overcommit_memory
# 0 = Heuristic: allow overcommit up to a reasonable limit
# 1 = Always allow: never fail mmap/malloc (dangerous but useful for certain workloads)
# 2 = Strict: total commit limited to swap + (overcommit_ratio% of RAM)
cat /proc/sys/vm/overcommit_ratio # default 50
grep -i commit /proc/meminfo # CommitLimit (ceiling used in mode 2) vs Committed_AS (currently committed)
Why overcommit works: Most allocated memory is never used. A process that allocates 1GB with malloc() rarely writes to all of it. The kernel bets that the combined peak usage of all processes will stay within physical capacity.
The risk: If the bet fails and physical memory is genuinely exhausted, the OOM killer fires. This is a calculated tradeoff that Linux makes by default — it would rather occasionally kill a process than refuse memory allocations that usually succeed.
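The strict-mode limit (overcommit_memory = 2) is simple enough to compute by hand. The sketch below applies the "swap + overcommit_ratio% of RAM" rule to values in /proc/meminfo format. It ignores huge pages and the alternative overcommit_kbytes setting, both function names are invented here, and the figures in the usage line are made-up examples.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {name: value}, values in kB where given."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            info[name.strip()] = int(fields[0])
    return info


def strict_commit_limit_kb(mem_total_kb: int, swap_total_kb: int, ratio: int = 50) -> int:
    """Mode 2 limit: swap + ratio% of RAM (huge pages ignored in this sketch)."""
    return swap_total_kb + mem_total_kb * ratio // 100


# Hypothetical machine: about 16 GB of RAM, 2 GB of swap, default ratio of 50.
info = parse_meminfo("MemTotal:       16000000 kB\nSwapTotal:       2000000 kB\n")
print(strict_commit_limit_kb(info["MemTotal"], info["SwapTotal"]))
```

On a real system, the result can be compared with the CommitLimit line in /proc/meminfo. Once Committed_AS would exceed that limit in strict mode, further allocations fail instead of being overcommitted.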
| Region | Permissions | Backed By | ASLR | Typical Size | Notes |
|---|---|---|---|---|---|
| Null page | Unmapped | — | No | ~64KB | Catches null pointer dereferences |
| Text (.text) | r-xp | Binary file | Yes (PIE) | KB–MB | Shared between processes running same binary |
| Data (.data) | rw-p | Binary file | Yes (PIE) | KB–MB | COW — private copy on write |
| BSS | rw-p | Zero (anonymous) | Yes (PIE) | KB–MB | Zero-filled on demand |
| Heap | rw-p | Anonymous (swap) | Yes (partial) | 0–GBs | Managed by malloc, extended by brk() |
| Shared libraries | r-xp / r--p | .so file | Yes | MB–100s MB | Shared physical pages across all users |
| Stack | rw-p | Anonymous (swap) | Yes (28 bits) | Up to 8MB | Auto-expands downward on fault |
| VDSO | r-xp | Kernel | Yes | ~8KB | Fast syscalls without ring transition |
The VMA is the kernel's unit of virtual address space management. Every permission check, every page fault dispatch, every COW operation traces back to finding the right VMA. The tree index (a red-black tree before Linux 6.1, a maple tree since) keeps this lookup O(log n) even when processes have thousands of mappings, as a JVM process or a database server with hundreds of mmapped files routinely does.
ASLR's randomization sits on top of this infrastructure: the VMAs exist at randomized base addresses, but once placed, the same demand-paging and fault-handling machinery handles them identically. From the kernel's perspective, a text segment at 0x400000 and one at a randomized 0x55a3c1200000 are handled by identical code — only the addresses differ.
The maps file is one of the most useful diagnostic tools available. When a process consumes unexpected memory, smaps_rollup shows exactly which categories (anonymous, file, shared) are consuming RSS and swap. That data, combined with knowledge of the VMA types, points directly to whether the problem is a memory leak (growing anonymous VMA), an oversized file cache (file-backed VMA), or simply a large working set that genuinely needs the RAM it is using.