AiTechWorlds
Imagine a library with 500,000 books but no catalog, no shelving system, no librarian. Every visitor who wants a book must physically walk through every shelf until they find it — or don't. New books get placed wherever there is empty space. Old books are "removed" by putting a sticky note on them that says "available," but the book stays in place until someone puts a new book there.
This would be chaos. The Dewey Decimal System (or Library of Congress Classification) exists precisely to bring structure: every book gets a unique identifier, a fixed location, and metadata (title, author, subject) that makes retrieval predictable and fast.
A file system is that organizational system for your storage device. Without it, your 1 TB hard drive is just a flat sequence of 2 billion 512-byte sectors with no way to find anything.
A file is a named, typed collection of data stored persistently on a storage device. Every file has attributes maintained by the file system:
| Attribute | Description | Example |
|---|---|---|
| Name | Human-readable identifier | report.pdf |
| Type | Extension or magic bytes | .pdf, .txt, ELF binary |
| Size | Current size in bytes | 2,048,576 bytes |
| Location | Pointer to data blocks on disk | Block 48239 |
| Timestamps | Created, modified, accessed | 2026-06-01 14:30 |
| Permissions | Who can read/write/execute | rwxr-xr-- |
| Owner | User and group ownership | uid=1000, gid=100 |
Directories organize files into a hierarchical tree. Each directory is itself a file — its content is a list of (name, inode_number) pairs mapping file names to their metadata.
/
├── etc/
│   ├── passwd
│   └── hosts
├── home/
│   └── alice/
│       ├── documents/
│       │   └── report.pdf
│       └── .bashrc
└── var/
    └── log/
        └── syslog
Absolute path: starts from root (/): /home/alice/documents/report.pdf
Relative path: starts from current directory: documents/report.pdf
Path resolution: the OS traverses the path one component at a time, looking up each name in the current directory's entries to get the next inode number, until it reaches the final file.
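The lookup loop can be sketched in Python over a toy in-memory file system. The `INODES` table, the inode numbers, and the `resolve` function are hypothetical and exist only to mirror the tree above; a real kernel also handles symlinks, mount points, `.` and `..`, and permission checks.

```python
# Toy inode table: inode number -> directory entries (dict) or None for a file.
# All inode numbers are made up for illustration.
ROOT_INODE = 2

INODES = {
    2:  {"etc": 10, "home": 11, "var": 12},    # /
    10: {"passwd": 20, "hosts": 21},            # /etc
    11: {"alice": 30},                          # /home
    12: {"log": 40},                            # /var
    30: {"documents": 31, ".bashrc": 32},       # /home/alice
    31: {"report.pdf": 33},                     # /home/alice/documents
    40: {"syslog": 41},                         # /var/log
    20: None, 21: None, 32: None, 33: None, 41: None,  # regular files
}


def resolve(path, cwd_inode=ROOT_INODE):
    """Return the inode number for path, walking one component at a time."""
    current = ROOT_INODE if path.startswith("/") else cwd_inode
    for name in path.split("/"):
        if not name:  # skip empty pieces from leading or doubled slashes
            continue
        entries = INODES[current]
        if entries is None:
            raise NotADirectoryError(name)
        if name not in entries:
            raise FileNotFoundError(name)
        current = entries[name]
    return current


print(resolve("/home/alice/documents/report.pdf"))    # absolute path
print(resolve("documents/report.pdf", cwd_inode=30))  # relative to /home/alice
```

Both calls end at the same inode, because a relative path only changes where the walk starts.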
How does the file system store file data across physical disk blocks?
Contiguous allocation: all blocks of a file are stored in consecutive disk locations.
File A: blocks 10, 11, 12, 13
File B: blocks 20, 21, 22
Pros: Fast sequential access, simple to implement (just store start block + length)
Cons: External fragmentation (holes between files); files cannot grow easily without moving
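Both the simplicity and the fragmentation problem show up in a small simulation. This is a sketch under assumptions: the disk is modeled as a list of booleans (True means used), and `first_fit`, `allocate`, and `block_address` are hypothetical helper names, not part of any real file system.

```python
def first_fit(disk, length):
    """Return the start of the first run of `length` free blocks, or None."""
    run_start, run_len = 0, 0
    for i, used in enumerate(disk):
        if used:
            run_len = 0
            continue
        if run_len == 0:
            run_start = i
        run_len += 1
        if run_len == length:
            return run_start
    return None


def allocate(disk, length):
    """Mark a contiguous run as used and return its start block (or None)."""
    start = first_fit(disk, length)
    if start is not None:
        for i in range(start, start + length):
            disk[i] = True
    return start


def block_address(start, offset):
    """Logical block `offset` of a contiguous file lives at start + offset."""
    return start + offset


disk = [False] * 12        # a 12-block disk, all free
a = allocate(disk, 4)      # file A
b = allocate(disk, 3)      # file B
c = allocate(disk, 4)      # file C

for i in range(a, a + 4):  # delete file A, leaving a hole at the front
    disk[i] = False

free_blocks = disk.count(False)
d = allocate(disk, 5)      # needs 5 consecutive blocks

print("free blocks:", free_blocks, "allocation result:", d)
```

After file A is deleted there are five free blocks in total, but they sit in two separate holes, so the five-block request fails. That is external fragmentation.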
Linked allocation: each block contains a pointer to the next block, so a file is a linked list of blocks.
Block 10 → Block 25 → Block 3 → Block 47 → NULL
Pros: No external fragmentation; files can grow freely
Cons: Random access is O(n) — must follow pointers; pointer takes space in each block; pointer corruption loses entire file tail
FAT (File Allocation Table): a linked allocation scheme that keeps the next-block pointers in a central table (the FAT) instead of inside each data block. Used in FAT16, FAT32, and exFAT (USB drives, SD cards).
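A minimal sketch of following a FAT chain, reusing the block numbers from the linked-list example above. The dictionary `fat`, the `END_OF_CHAIN` marker, and both function names are assumptions for illustration; real FAT variants use reserved table values for end-of-chain and store the table as a fixed-size array.

```python
END_OF_CHAIN = -1  # stand-in for the reserved end-of-chain value

# fat[block] = next block of the same file
fat = {10: 25, 25: 3, 3: 47, 47: END_OF_CHAIN}


def file_blocks(fat, start):
    """Collect every block of a file by following the chain from `start`."""
    blocks = []
    block = start
    while block != END_OF_CHAIN:
        blocks.append(block)
        block = fat[block]
    return blocks


def nth_block(fat, start, n):
    """Find logical block n. Still O(n) steps, but each step is a table
    lookup rather than a read of the previous data block."""
    block = start
    for _ in range(n):
        block = fat[block]
        if block == END_OF_CHAIN:
            raise IndexError("offset is past the end of the file")
    return block


print(file_blocks(fat, 10))
print(nth_block(fat, 10, 2))
```

Because the whole table can be cached in memory, random access no longer needs one disk read per hop, which is the main gain over pointers stored in the blocks themselves.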
Indexed allocation: a dedicated index block (or inode) stores all pointers to the file's data blocks.
inode:
  direct[0]      → block 120
  direct[1]      → block 345
  ...
  direct[11]     → block 892
  indirect       → [block of pointers to more blocks]
  dbl_indirect   → [block of pointers to indirect blocks]
  trbl_indirect  → [block of pointers to double indirect blocks]
Pros: Fast random access (any block reachable in 1–3 steps); no external fragmentation
Cons: Small files waste an entire inode block; large files need multiple levels of indirection
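This is the foundation of Unix/Linux file systems (ext2, ext3, ext4, XFS, ZFS). ext2 and ext3 use this pointer layout directly; ext4 and XFS keep the inode but map most file data with extents (a start block plus a length) rather than one pointer per block.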
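To see how far the 12 direct pointers and three indirect levels reach, the arithmetic can be worked through in Python. The 4 KiB block size and 4-byte pointer size are assumptions (typical of an ext2-style layout, but not stated in the text), and `locate` is a hypothetical helper.

```python
BLOCK_SIZE = 4096    # assumed block size in bytes
POINTER_SIZE = 4     # assumed size of one block pointer in bytes
NUM_DIRECT = 12      # direct[0] .. direct[11], as in the layout above
PTRS = BLOCK_SIZE // POINTER_SIZE  # pointers that fit in one indirect block


def locate(n):
    """Map logical block n to (level, list of indexes to follow)."""
    if n < NUM_DIRECT:
        return ("direct", [n])
    n -= NUM_DIRECT
    if n < PTRS:
        return ("indirect", [n])
    n -= PTRS
    if n < PTRS ** 2:
        return ("double", [n // PTRS, n % PTRS])
    n -= PTRS ** 2
    if n < PTRS ** 3:
        return ("triple", [n // PTRS ** 2, (n // PTRS) % PTRS, n % PTRS])
    raise ValueError("block index is beyond the maximum file size")


max_blocks = NUM_DIRECT + PTRS + PTRS ** 2 + PTRS ** 3
max_bytes = max_blocks * BLOCK_SIZE

print(locate(5))
print(locate(5000))
print("max file size in TiB:", max_bytes / 2 ** 40)
```

Under these assumptions the direct pointers cover only the first 48 KiB, while the triple indirect level alone accounts for 4 TiB, so small files stay cheap and large files pay for extra lookups.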
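An inode (index node) stores all metadata about a file except its name: type and permissions, owner (user and group), size, timestamps, link count, and the pointers to its data blocks.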
The directory entry stores: (filename → inode number). The inode stores everything else.
stat /etc/passwd
# Output:
# File: /etc/passwd
# Size: 2847 Blocks: 8 Inode: 524308
# Access: -rw-r--r-- Uid: 0 Gid: 0
# Access: 2026-05-28 10:12:03
# Modify: 2026-05-01 09:33:21
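The same inode fields can be read programmatically. The sketch below uses Python's standard `os.stat` on a throwaway temporary file (the file contents are arbitrary) so that it runs anywhere without depending on `/etc/passwd`.

```python
import os
import stat
import tempfile

# Create a small scratch file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello, inode\n")
    path = f.name

info = os.stat(path)  # reads the file's metadata, not its contents

print("inode:      ", info.st_ino)
print("size:       ", info.st_size)
print("permissions:", stat.filemode(info.st_mode))
print("links:      ", info.st_nlink)
print("owner:      ", info.st_uid, info.st_gid)
print("modified:   ", info.st_mtime)

os.remove(path)
```

Note that the file name never appears in the result: it lives in the directory entry, not in the inode.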
Hard link: creates a new directory entry pointing to the same inode. Both names refer to the same physical data. Deleting one name does not delete the data — only when link count reaches 0 is the inode freed.
ln file.txt hardlink.txt # same inode, both equally valid names
Symbolic link (symlink): a special file whose content is a path string pointing to another file. Like a shortcut. If the target is deleted, the symlink becomes dangling (broken).
ln -s /etc/hosts hosts_link # symlink — stores the string "/etc/hosts"
| Feature | Hard Link | Symbolic Link |
|---|---|---|
| Same inode? | Yes | No (new inode) |
| Works across filesystems? | No | Yes |
| Survives target deletion? | Yes (inode persists) | No (dangling link) |
| Can link directories? | No (usually) | Yes |
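The rows of the table can be checked with a short experiment. This sketch assumes a POSIX system where the user may create symlinks; the file names are arbitrary and everything happens inside a temporary directory.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "file.txt")
hard = os.path.join(workdir, "hardlink.txt")
soft = os.path.join(workdir, "symlink.txt")

with open(original, "w") as f:
    f.write("data")

os.link(original, hard)     # second directory entry for the same inode
os.symlink(original, soft)  # new file whose content is the target path

same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
link_count = os.stat(original).st_nlink
symlink_has_own_inode = os.lstat(soft).st_ino != os.stat(original).st_ino
stored_target = os.readlink(soft)

os.remove(original)         # drops one name; the inode still has a link

with open(hard) as f:
    hard_still_readable = f.read() == "data"
symlink_dangling = os.path.islink(soft) and not os.path.exists(soft)

print(same_inode, link_count, hard_still_readable, symlink_dangling)

shutil.rmtree(workdir)
```

`os.lstat` inspects the symlink itself, while `os.stat` and `os.path.exists` follow it, which is why the dangling link still counts as a link but no longer "exists".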
Without journaling, a power failure mid-write can leave the file system in an inconsistent state — a file half-written, a directory update incomplete, an inode partially updated. fsck (file system check) on a large disk takes minutes to hours.
Journaling solves this by writing a description of each change to a log (the journal) and committing it there before modifying the actual data structures.
On crash recovery, the OS replays any committed but not-yet-applied journal entries. Recovery takes seconds instead of hours.
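The write-ahead idea and the replay step can be sketched as a toy model. Everything here is an assumption for illustration: the journal is a Python list, the "disk" is a dictionary, the record names (`begin`, `commit`, `done`) and the `crash_after` parameter are invented, and real implementations are considerably more involved.

```python
def journaled_update(journal, disk, txid, changes, crash_after=None):
    """Apply `changes` via the journal; `crash_after` simulates power loss."""
    journal.append(("begin", txid, dict(changes)))  # 1. log the intent
    if crash_after == "begin":
        return
    journal.append(("commit", txid))                # 2. mark it committed
    if crash_after == "commit":
        return
    disk.update(changes)                            # 3. modify real structures
    journal.append(("done", txid))                  # 4. mark it applied


def recover(journal, disk):
    """Replay transactions that committed but were never marked done."""
    pending, committed = {}, []
    for record in journal:
        kind, txid = record[0], record[1]
        if kind == "begin":
            pending[txid] = record[2]
        elif kind == "commit":
            committed.append(txid)
        elif kind == "done":
            committed.remove(txid)
    for txid in committed:
        disk.update(pending[txid])
        journal.append(("done", txid))
    return committed  # uncommitted transactions are simply ignored


disk = {"inode 7 size": 0}
journal = []

# Crash after commit: the disk was not touched, but the journal knows the plan.
journaled_update(journal, disk, 1,
                 {"inode 7 size": 4096, "bitmap block 120": "used"},
                 crash_after="commit")
replayed = recover(journal, disk)

# Crash before commit: the half-logged change is discarded on recovery.
journaled_update(journal, disk, 2, {"inode 7 size": 8192}, crash_after="begin")
replayed_again = recover(journal, disk)

print(replayed, replayed_again, disk)
```

Recovery only scans the journal, not the whole disk, which is why it finishes quickly regardless of volume size.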
| File System | OS | Max File Size | Max Volume Size | Journaling | Notes |
|---|---|---|---|---|---|
| FAT32 | Cross-platform | 4 GB | 2 TB | No | USB drives, legacy |
| exFAT | Cross-platform | 16 EB | 128 PB | No | Modern USB, SD cards |
| NTFS | Windows | 16 EB | 16 EB | Yes | Windows default |
| ext4 | Linux | 16 TB | 1 EB | Yes | Linux default |
| XFS | Linux | 8 EB | 8 EB | Yes | Large files, servers |
| APFS | macOS/iOS | 8 EB | 8 EB | No (copy-on-write instead) | SSD-optimized |
| ZFS | Linux/FreeBSD | 16 EB | 256 ZB | No (copy-on-write instead) | Data integrity, RAID |
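The file system is the invisible infrastructure that makes storage usable: it names files, organizes them into directories, allocates disk blocks, tracks metadata in inodes, and keeps its structures consistent after a crash.
Every time you save a file, open a document, or run a program, the file system performs dozens of these operations transparently, typically in microseconds when the data is already cached.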