AiTechWorlds
The year is 2003. Brad Fitzpatrick, the 23-year-old founder of LiveJournal, is watching his database servers buckle under load. LiveJournal had grown to millions of users posting blog entries, commenting, and refreshing pages — and every single page load hit the database. The MySQL servers were maxing out their CPU, queries were timing out, and users were seeing blank pages.
Fitzpatrick did not buy bigger servers. Instead, he built Memcached, a simple in-memory key-value store that sat in front of the database and answered repeated requests for the same data without touching the database at all. The result: database load reportedly dropped by 95% overnight.
That insight — serve the same data from memory instead of re-reading it from disk — is now fundamental to how every high-traffic website on the planet operates. Understanding caching is understanding why systems can scale.
Every database read involves disk I/O, query parsing, index lookups, and network round trips. For popular content read thousands of times per second, repeating that work is wasteful.
Caching solves three interconnected problems:
The principle is simple: if data is expensive to compute or fetch, and the same data is requested repeatedly, remember the answer.
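Cache-aside is the most common caching pattern: the application manages the cache directly.
How it works: the application checks the cache first. On a hit, it returns the cached value. On a miss, it reads from the database, stores the result in the cache, and returns it.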
Pros: Only data that is actually requested gets cached (no wasted memory). Cache failures are non-fatal — the app falls back to the database.
Cons: The first request after a cache miss always pays the full database cost. On a fresh cache restart, every request is a miss — this is called a cold start.
Used by: Amazon product pages, Twitter timelines, most web applications.
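The cache-aside read path can be sketched in a few lines. This is a minimal illustration, not a production implementation: two plain dictionaries stand in for the database and the cache, and the names `query_database`, `get`, and `db_reads` are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical stand-ins: one dict plays the database, another plays the cache.
database: Dict[str, str] = {"user:42": "Ada"}
cache: Dict[str, str] = {}
db_reads = 0  # counts how often the "database" is actually queried


def query_database(key: str) -> Optional[str]:
    """Pretend to run an expensive database query."""
    global db_reads
    db_reads += 1
    return database.get(key)


def get(key: str) -> Optional[str]:
    # 1. Try the cache first.
    if key in cache:
        return cache[key]
    # 2. On a miss, fall back to the database.
    value = query_database(key)
    # 3. Populate the cache so the next read is a hit.
    if value is not None:
        cache[key] = value
    return value


first = get("user:42")   # miss: reads the database and fills the cache
second = get("user:42")  # hit: served from the cache
```

After both calls, `db_reads` is 1: only the first request paid the database cost. In a real deployment the cache dictionary would typically be replaced by a client for Redis or Memcached.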
With write-through, every write goes to both the cache and the database before it is acknowledged, so the cache is always up to date.
Pros: No stale data in cache. Reads are always fast.
Cons: Every write is slower because it must complete two operations. Cache fills with data that may never be read.
Used by: Banking dashboards, inventory systems — anywhere stale reads are unacceptable.
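A minimal sketch of write-through, assuming dictionaries as stand-ins for the real database and cache. The class name `WriteThroughStore` is hypothetical, and writing the database before the cache is one reasonable ordering rather than the only one.

```python
class WriteThroughStore:
    """Every write updates the database and the cache before returning."""

    def __init__(self) -> None:
        self.database = {}  # hypothetical stand-in for a real database
        self.cache = {}     # hypothetical stand-in for Redis or Memcached

    def write(self, key, value) -> None:
        # Database first, so the cache never holds a value the database rejected.
        self.database[key] = value
        self.cache[key] = value

    def read(self, key):
        # Anything written through this class is already in the cache.
        return self.cache.get(key)


store = WriteThroughStore()
store.write("balance:7", 120)
store.write("balance:7", 95)
current = store.read("balance:7")
```

The write path does two operations, which is where the extra write latency comes from; the read path never needs to consult the database for keys written this way.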
With write-back (also called write-behind), writes go to the cache immediately, while the database write is deferred and handled asynchronously in the background.
Pros: Extremely fast writes — the application does not wait for the database. Good for high-write workloads like gaming leaderboards.
Cons: If the cache crashes before the async write completes, data is lost. Not suitable for financial or transactional data.
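The trade-off can be made concrete with a small simulation. This is a sketch under simplifying assumptions: the class `WriteBackCache` and its `flush` and `crash` methods are hypothetical, and `flush` is called by hand here where a real system would run it from a background worker.

```python
class WriteBackCache:
    """Writes land in the cache at once; the database is updated later."""

    def __init__(self) -> None:
        self.database = {}  # hypothetical stand-in for a real database
        self.cache = {}
        self.dirty = set()  # keys changed in the cache but not yet persisted

    def write(self, key, value) -> None:
        # Fast path: no database work on the caller's request.
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self) -> int:
        """Persist all dirty keys and return how many were written."""
        flushed = 0
        for key in list(self.dirty):
            self.database[key] = self.cache[key]
            flushed += 1
        self.dirty.clear()
        return flushed

    def crash(self) -> None:
        """Simulate the cache losing its memory before the next flush."""
        self.cache.clear()
        self.dirty.clear()


board = WriteBackCache()
board.write("score:alice", 10)
board.write("score:alice", 25)      # two writes, still no database work
persisted_early = "score:alice" in board.database
flushed = board.flush()             # one database write covers both updates
board.write("score:bob", 5)
board.crash()                       # bob's score was never flushed
```

Alice's two updates collapse into a single database write, which is the source of the write speed-up. Bob's score is gone after the simulated crash, which is the data-loss risk described above.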
With read-through, the cache sits transparently in front of the database. The application only talks to the cache. On a miss, the cache fetches from the database and populates itself.
Pros: Simpler application code — one data source to manage.
Cons: First-request latency. Less control over what gets cached.
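The difference from cache-aside is where the loading logic lives: inside the cache rather than in the application. A minimal sketch, assuming a hypothetical `ReadThroughCache` class that is handed a loader function at construction time:

```python
from typing import Callable, Dict, Optional


class ReadThroughCache:
    """The cache owns the loader; callers never touch the database directly."""

    def __init__(self, loader: Callable[[str], Optional[str]]) -> None:
        self._loader = loader
        self._store: Dict[str, str] = {}
        self.loads = 0  # how many times the loader was invoked

    def get(self, key: str) -> Optional[str]:
        if key not in self._store:
            # Miss: the cache itself fetches from the backing store.
            self.loads += 1
            value = self._loader(key)
            if value is None:
                return None
            self._store[key] = value
        return self._store[key]


# Hypothetical backing data; dict.get plays the role of a database query.
products = {"sku:1": "Keyboard"}
catalog = ReadThroughCache(products.get)

a = catalog.get("sku:1")  # miss: loaded by the cache
b = catalog.get("sku:1")  # hit: no loader call
```

The application code is reduced to `catalog.get(...)`, which is the "one data source" simplification. The cost is that caching decisions are made by the cache layer, not by the caller.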
Redis (Remote Dictionary Server) was created by Salvatore Sanfilippo in 2009 and is now one of the most widely deployed caches in the world. Unlike Memcached, Redis is far more than a key-value store: its values can be rich data structures.
| Data Structure | Example Use Case |
|---|---|
| String | Session tokens, counters, API responses |
| List | Activity feeds, job queues |
| Hash | User profile objects |
| Set | Unique visitors, tag collections |
| Sorted Set | Leaderboards, rate limiting windows |
| Stream | Real-time event logs |
Redis operates entirely in memory and delivers 100,000+ read/write operations per second on a single node. Key features:
Built-in key expiry: SET user:42 "data" EX 3600 stores a value that expires in one hour.
Fitzpatrick's Memcached remains relevant for one specific use case: pure, high-throughput key-value caching spread across multiple CPU cores. It is multi-threaded (Redis was historically single-threaded, though Redis 6.0 added I/O threading), has no persistence, and supports no data structures beyond strings. If you need the absolute maximum throughput for simple caches and do not need any of Redis's advanced features, Memcached remains competitive.
A Content Delivery Network (CDN) is a geographically distributed network of cache servers, called edge nodes, placed close to users worldwide.
When a user in Tokyo requests a video thumbnail, the edge node in Tokyo serves it directly from its local cache — not from a server in Virginia. Round-trip time drops from ~200ms to ~5ms.
Major CDN providers: CloudFront (AWS), Fastly, Cloudflare, Akamai.
CDNs cache static assets: images, CSS, JavaScript files, fonts, and videos. They can also cache entire HTML pages for anonymous users.
Netflix is the canonical CDN success story. Netflix built its own CDN called Open Connect and embedded its appliances directly inside ISP networks. Today, Netflix serves roughly 15% of all global internet traffic through this CDN, with most streams never leaving the ISP's own network. A user in London watching Stranger Things is almost certainly streaming from a server inside their ISP's data centre, not from Netflix's cloud.
Phil Karlton, a Netscape engineer, famously said: "There are only two hard things in Computer Science: cache invalidation and naming things."
When the underlying data changes, the cache must be updated or the application serves stale data. There are two common approaches, plus one failure mode to guard against:
TTL-based expiration: Every cached item has a time-to-live. After expiry, the next request fetches fresh data. Simple, but data can be stale for the full TTL duration.
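A minimal sketch of TTL-based expiration, assuming a hypothetical `TTLCache` class. The clock is injectable so that expiry can be demonstrated without actually waiting; a real deployment would more likely rely on the cache server's own expiry.

```python
import time


class TTLCache:
    """Each entry remembers when it expires; expired entries read as misses."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}  # key -> (value, expires_at)

    def set(self, key, value) -> None:
        self._items[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller refetches fresh data.
            del self._items[key]
            return None
        return value


# A fake clock (hypothetical) lets the example step through time.
now = [0.0]
prices = TTLCache(ttl_seconds=60, clock=lambda: now[0])
prices.set("price:widget", 100)

now[0] = 59.0
fresh = prices.get("price:widget")    # still inside the 60-second TTL
now[0] = 60.0
expired = prices.get("price:widget")  # TTL reached: treated as a miss
```

Between second 0 and second 60 the cache keeps returning 100 even if the real price changed, which is the staleness window the TTL trades for simplicity.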
Event-driven invalidation: When data changes, explicitly delete or update the relevant cache key. Ensures freshness but requires careful coordination between services.
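In its simplest form, event-driven invalidation means the write path deletes the cached key. The sketch below assumes dictionaries in place of a real database and cache, and the function names `read_user` and `update_user` are hypothetical.

```python
# Hypothetical stand-ins for a database table and a cache.
database = {"user:42": {"name": "Ada"}}
cache = {}


def read_user(key):
    if key not in cache:
        # Cache a copy so later database changes do not leak in silently.
        cache[key] = dict(database[key])
    return cache[key]


def update_user(key, **changes):
    database[key].update(changes)
    # The "event": data changed, so the cached copy is explicitly removed.
    cache.pop(key, None)


before = read_user("user:42")["name"]  # caches the original record
update_user("user:42", name="Grace")   # writes and invalidates
after = read_user("user:42")["name"]   # miss, so the fresh record is loaded
```

When several services write the same data, each of them has to perform the invalidation step, often by publishing a change event that cache owners subscribe to. That is the coordination cost mentioned above.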
Cache stampede (Thundering Herd): When a popular cached item expires, thousands of requests simultaneously hit the database before the cache is repopulated. Solutions: mutex locking (only one request populates the cache), probabilistic early expiration (proactively refresh before TTL ends), or staggered TTLs.
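The mutex approach can be sketched with a lock and a second cache check inside it. This is a single-process illustration under simplifying assumptions: the names are hypothetical, one global lock covers all keys, and a distributed system would need a per-key or distributed lock instead.

```python
import threading
import time

cache = {}
lock = threading.Lock()
db_calls = 0  # only modified while holding the lock


def slow_database_query(key):
    """Hypothetical expensive query; the sleep simulates latency."""
    global db_calls
    db_calls += 1
    time.sleep(0.05)
    return f"value-for-{key}"


def get(key):
    value = cache.get(key)
    if value is not None:
        return value
    with lock:
        # Re-check: another thread may have filled the cache while we waited.
        value = cache.get(key)
        if value is None:
            value = slow_database_query(key)
            cache[key] = value
        return value


# Twenty concurrent requests for the same missing key.
threads = [threading.Thread(target=get, args=("hot-key",)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the lock, every thread that arrives during the 50 ms query would issue its own database call. With it, the first thread repopulates the cache and the rest find the value on the re-check.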
When the cache is full, something must be removed to make room for new data.
| Policy | How It Works | Best For |
|---|---|---|
| LRU (Least Recently Used) | Evict the item not accessed for the longest time | General-purpose caching |
| LFU (Least Frequently Used) | Evict the item accessed fewest times overall | Long-lived caches with clear hot/cold data |
| FIFO (First In, First Out) | Evict the oldest item regardless of access | Simple queues, not general caching |
| Random | Evict a random item | Surprisingly effective, extremely fast |
Redis supports approximated LRU and LFU eviction, along with random and TTL-based policies, selected per instance through the maxmemory-policy setting.
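To make the LRU row of the table concrete, here is a minimal sketch built on Python's `collections.OrderedDict`. The class name `LRUCache` is hypothetical, and this exact-LRU version is for illustration only; it is not how Redis implements its approximated policy.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items = OrderedDict()  # oldest access first, newest last

    def get(self, key):
        if key not in self._items:
            return None
        # A read counts as a use: move the key to the "recent" end.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            # Evict from the "least recent" end.
            self._items.popitem(last=False)


lru = LRUCache(capacity=2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")     # "a" is now the most recently used key
lru.put("c", 3)  # cache is full, so "b" is evicted
```

After these calls the cache holds "a" and "c". Under FIFO, "a" would have been evicted instead, because it was inserted first regardless of the later read.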
| Strategy | Read Performance | Write Performance | Complexity | Data Consistency | Use Case |
|---|---|---|---|---|---|
| Cache-Aside | Fast (after warm-up) | Unchanged | Low | Eventual | General web apps, APIs |
| Write-Through | Fast | Slower | Medium | Strong | Financial dashboards |
| Write-Back | Fast | Very fast | High | Weak (risk of loss) | Gaming scores, analytics |
| Read-Through | Fast (after warm-up) | Unchanged | Low | Eventual | ORM-level caching |
| CDN | Very fast (edge) | Not applicable | Low (managed) | Eventual | Static assets, media |
Caching is one of the highest-leverage optimisations in system design. Brad Fitzpatrick's 2003 insight still holds: the fastest database query is the one you never make. Modern systems layer caches at every level — in-process memory, distributed cache (Redis), and edge CDN — each absorbing a different class of request.
The discipline is knowing what to cache, how long to cache it, and when to invalidate it. Those three decisions determine whether a cache is a performance multiplier or a source of subtle, hard-to-debug data corruption.