They are not asking you to write code. They are not asking for a data structure or an algorithm. They want to watch how you think about building a system used by 400 million people simultaneously — a system that must accept 6,000 tweets per second, store them durably, and deliver them to followers across the world in under 200 milliseconds.

This question separates engineers who can code from engineers who can architect. System design is the discipline of making high-level structural decisions before a single line of production code is written.

What System Design Is

System design is the process of defining the architecture, components, interfaces, and data flows of a system to satisfy specified requirements — particularly non-functional requirements like scale, availability, and latency.

It answers questions that no individual class or function can answer:

How do we handle 10 million users when our current system handles 10,000?
What happens when our primary database server crashes at 3am?
How do we deliver a video file to a user in Tokyo from servers in Virginia?
How do we let 100 engineers deploy independently without breaking each other's work?

The System Design Process

Experienced engineers follow a consistent process when tackling system design questions — in interviews and in real projects.

Step 1 — Clarify Requirements: Before drawing anything, ask questions. What exactly must the system do? How many users? What is the read-to-write ratio? Do we need strong consistency or eventual consistency? Candidates often fail real interviews because they assume instead of asking.

Step 2 — Estimate Scale: Back-of-envelope calculations anchor the design in reality. A system for 1,000 users is radically different from one for 1 billion.

Step 3 — High-Level Design: Identify the major components and how data flows between them. Boxes and arrows.

Step 4 — Detailed Design: Deep-dive into the 2–3 most critical or interesting components.

Step 5 — Identify Bottlenecks: Where will the system fail? How does it recover? How does it scale?

Back-of-Envelope Estimation

Estimation is a required skill. Nobody expects exact numbers — they want to see how you reason.

Useful constants to memorise:

Metric	Value
Seconds per day	86,400 (~10^5)
1 million users × 1 request/day	~12 requests/second
1 KB text	1,000 bytes
1 image (compressed)	~300 KB
1 minute HD video (compressed)	~100 MB
1 TB storage	~1 billion KB

Twitter estimation example:

300 million DAU (Daily Active Users)
Each user reads 20 tweets per day
Read QPS = 300M × 20 / 86,400 = ~70,000 reads/second
Each user writes 1 tweet per 5 days = 0.2 tweets/day
Write QPS = 300M × 0.2 / 86,400 = ~700 writes/second
Each tweet: 280 characters × 2 bytes UTF-16 = 560 bytes ≈ 1 KB with metadata
Daily write storage = 700 writes/sec × 86,400 sec × 1 KB = ~60 GB/day (using 1 KB = 1,000 bytes, as in the table above)
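
The arithmetic above can be reproduced in a few lines. This is a minimal sketch, not a prescribed tool: the function names are illustrative, the inputs are the figures from the example, and decimal units are assumed throughout (1 KB = 1,000 bytes, 1 GB = 10^9 bytes), which puts the daily storage figure at roughly 60 GB.

```python
SECONDS_PER_DAY = 86_400


def qps(daily_active_users: int, actions_per_user_per_day: float) -> float:
    """Average requests per second for a given per-user daily rate."""
    return daily_active_users * actions_per_user_per_day / SECONDS_PER_DAY


def daily_storage_gb(writes_per_second: float, bytes_per_write: int) -> float:
    """Daily write volume in decimal gigabytes (1 GB = 10**9 bytes)."""
    return writes_per_second * SECONDS_PER_DAY * bytes_per_write / 1e9


DAU = 300_000_000                        # daily active users, from the example
read_qps = qps(DAU, 20)                  # 20 tweets read per user per day
write_qps = qps(DAU, 0.2)                # 1 tweet written per user per 5 days
storage = daily_storage_gb(write_qps, 1_000)  # ~1 KB per tweet with metadata

print(f"read QPS  ~ {read_qps:,.0f}")
print(f"write QPS ~ {write_qps:,.0f}")
print(f"storage   ~ {storage:,.0f} GB/day")
```

These are averages. Peak traffic is usually a multiple of the average, so a real estimate would add a peak factor on top.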

Key Non-Functional Requirements

These are the dimensions that determine how the system is built, not what it does.

Availability

The percentage of time a system is operational.

SLA	Annual Downtime	Example
99% ("two nines")	87.6 hours	Internal tools
99.9% ("three nines")	8.76 hours	Small businesses
99.99% ("four nines")	52 minutes	AWS EC2 SLA
99.999% ("five nines")	5 minutes	Telecom, payment processors
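
The downtime column follows directly from the availability percentage. As a small sketch (the function name is illustrative, and a 365-day year is assumed):

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def annual_downtime_hours(availability_percent: float) -> float:
    """Maximum downtime per year allowed by an availability target."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for sla in (99.0, 99.9, 99.99, 99.999):
    hours = annual_downtime_hours(sla)
    print(f"{sla}% -> {hours:.2f} hours ({hours * 60:.1f} minutes)")
```

Each extra nine cuts the allowed downtime by a factor of ten, which is why the step from three nines to four usually requires redundancy rather than just careful operations.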

A widely cited Amazon finding from the mid-2000s is that every 100ms of added latency cost about 1% in sales. Google reported that a 500ms delay in search results caused a 20% drop in traffic. Availability and latency are not academic: they are directly tied to revenue.

Scalability

The ability to handle increasing load. There are two approaches:

Vertical scaling: bigger machine — more RAM, faster CPU. Limited by the largest available hardware.
Horizontal scaling: more machines. Theoretically unlimited, but requires stateless services and data partitioning.
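
Data partitioning can be illustrated with a simple hash-based shard lookup. This is a sketch under assumptions: the key format, the shard count, and the function name are hypothetical, and hash-modulo is only one of several partitioning schemes.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash.

    hashlib is used instead of the built-in hash(), which is randomised
    per process for strings and so is not stable across servers.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


NUM_SHARDS = 4  # hypothetical cluster size
for user_id in ("user:1", "user:2", "user:3"):
    print(user_id, "-> shard", shard_for(user_id, NUM_SHARDS))
```

Because every app server computes the same shard for the same key, no server has to remember where a user's data lives, which is what keeps the service tier stateless. The weakness of plain modulo hashing is that changing the shard count remaps most keys; consistent hashing is the usual remedy.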

Consistency

In a distributed system, all replicas should agree on the current state of data. But enforcing strong consistency requires replicas to coordinate on every write, which means slower responses and lower availability.

The CAP Theorem

Eric Brewer's CAP theorem (2000, formally proven 2002): in a distributed system experiencing a network partition, you must choose between:

Choice	Sacrifice	Examples	Use When
CP (Consistent + Partition Tolerant)	Availability	MongoDB, HBase, Zookeeper	Banking transactions, inventory (correctness critical)
AP (Available + Partition Tolerant)	Strong Consistency	Cassandra, DynamoDB, CouchDB	Social feeds, shopping carts (availability critical)
CA (Consistent + Available)	Partition Tolerance	Traditional RDBMS (single-node)	Not viable in distributed systems — partitions always happen

Real example: Amazon's Dynamo, the internal key-value store described in Amazon's 2007 paper and the design ancestor of DynamoDB, is an AP system. It allows a shopping cart to accept items even when some replicas are unavailable. Two replicas might briefly disagree on cart contents; Amazon accepts this eventual-consistency trade-off because a cart that refuses items loses sales.
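
The CP versus AP choice can be made concrete with a toy model of the shopping cart. This is a deliberately simplified sketch, not how any real database is implemented: the Replica and Cart classes, the majority rule for CP mode, and the demo function are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    reachable: bool = True
    items: set = field(default_factory=set)


class Cart:
    """Toy replicated shopping cart; mode is 'CP' or 'AP'."""

    def __init__(self, replicas, mode):
        self.replicas = replicas
        self.mode = mode

    def add_item(self, item: str) -> bool:
        up = [r for r in self.replicas if r.reachable]
        majority = len(self.replicas) // 2 + 1
        if self.mode == "CP" and len(up) < majority:
            return False  # refuse the write rather than let replicas diverge
        if not up:
            return False  # nothing reachable at all
        for replica in up:
            replica.items.add(item)  # AP: accept, unreachable replicas fall behind
        return True


def demo(mode):
    replicas = [Replica("a"), Replica("b"), Replica("c")]
    cart = Cart(replicas, mode)
    replicas[1].reachable = False
    replicas[2].reachable = False  # partition: only replica 'a' is reachable
    accepted = cart.add_item("book")
    return accepted, [sorted(r.items) for r in replicas]


print("CP:", demo("CP"))
print("AP:", demo("AP"))
```

In CP mode the write is rejected and all replicas stay identical. In AP mode the write succeeds but the replicas disagree until the partition heals; a real AP store also needs a reconciliation step at that point, which this sketch omits.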

Latency

Average latency can be misleading — 10% of your requests being slow means millions of users have bad experiences.
p99 latency (99th percentile): the response time below which 99% of requests fall. This is what SREs actually monitor.
p99.9 latency: critical for financial systems, real-time services.
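
A short example shows why the average hides tail latency. The sample data is invented for illustration, and the percentile function uses the nearest-rank definition, which is one of several common conventions; monitoring systems may interpolate differently.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in milliseconds: 98 fast requests and 2 slow ones.
latencies_ms = [20] * 98 + [2000] * 2

print("mean:", statistics.mean(latencies_ms))
print("p50: ", percentile(latencies_ms, 50))
print("p99: ", percentile(latencies_ms, 99))
```

The mean of this sample is under 60 ms, which looks healthy, while the p99 is 2,000 ms: the two slow requests are invisible in the average but obvious in the tail.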

Core System Design Components

Every large-scale system is assembled from a standard toolkit of components:

Component	Role	Examples
DNS	Translates domain names to IP addresses	Route 53, Cloudflare
CDN	Serves static assets from edge locations near users	Cloudflare, AWS CloudFront, Akamai
Load Balancer	Distributes requests across app servers	AWS ALB, Nginx, HAProxy
API Gateway	Single entry point for clients; handles auth, rate limiting	AWS API Gateway, Kong
Cache	In-memory store for frequent reads	Redis, Memcached
Message Queue	Asynchronous communication between services	Kafka, AWS SQS, RabbitMQ
Database	Durable data storage	PostgreSQL, MySQL, MongoDB, DynamoDB

Designing Twitter: High-Level Architecture

Applying the process to Twitter's core features (post tweet, read home timeline):

Functional requirements clarified: Post tweets, follow users, read home timeline (tweets from followed users, most recent first).

Scale estimates (from above): ~700 writes/sec, ~70,000 reads/sec — this is a read-heavy system (100:1 read/write ratio).

Key design decisions:

Pre-compute timelines (fan-out on write): when a user tweets, push the tweet ID to all followers' timeline caches immediately. Reading a timeline is a single Redis lookup — O(1).
Exception for celebrities: users with 10M+ followers (Obama, Taylor Swift) use fan-out on read — pushing to 10M caches per tweet is too expensive.
CDN for media: images and videos served from edge locations, not origin servers.
Eventual consistency for timeline: it is acceptable if a follower sees a new tweet 1–2 seconds after it is posted.
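
The hybrid fan-out strategy described above can be sketched in memory. This is an illustration under assumptions, not Twitter's actual implementation: plain Python dictionaries stand in for the Redis timeline caches, and the TimelineService class, its method names, and the TIMELINE_LENGTH cap are hypothetical.

```python
from collections import defaultdict, deque

CELEBRITY_THRESHOLD = 10_000_000  # follower count taken from the text above
TIMELINE_LENGTH = 800             # hypothetical cap on cached entries per user


class TimelineService:
    def __init__(self, celebrity_threshold=CELEBRITY_THRESHOLD):
        self.followers = defaultdict(set)   # author -> users following them
        self.following = defaultdict(set)   # user -> authors they follow
        # Stand-in for per-user timeline caches (Redis in the text).
        self.timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LENGTH))
        self.celebrity_tweets = defaultdict(list)  # author -> [(ts, tweet_id)]
        self.threshold = celebrity_threshold

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= self.threshold

    def post(self, author, tweet_id, ts):
        if self.is_celebrity(author):
            # Fan-out on read: store once, merge when a follower reads.
            self.celebrity_tweets[author].append((ts, tweet_id))
            return
        # Fan-out on write: push the tweet ID to every follower's cache.
        for user in self.followers[author]:
            self.timelines[user].appendleft((ts, tweet_id))

    def home_timeline(self, user, limit=20):
        entries = list(self.timelines[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                entries.extend(self.celebrity_tweets[author])
        entries.sort(reverse=True)  # most recent first
        return [tweet_id for _, tweet_id in entries[:limit]]
```

A small usage example, with the threshold lowered to 2 so the celebrity path is exercised:

```python
svc = TimelineService(celebrity_threshold=2)
svc.follow("alice", "bob")
svc.follow("alice", "star")
svc.follow("carol", "star")
svc.post("bob", "t1", ts=1)
svc.post("star", "t2", ts=2)
svc.post("bob", "t3", ts=3)
print(svc.home_timeline("alice"))
```

Posting is cheap for the celebrity account and proportional to follower count for everyone else, while reads pay a small merge cost only for the celebrities a user follows.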

Non-Functional Req	Metric	How to Achieve	Trade-off
Availability	99.99%	Multi-region deployment, read replicas	Higher cost and complexity
Scalability	10× traffic spike	Horizontal app servers, auto-scaling	Stateless services required
Consistency	Eventual (seconds)	AP design (Cassandra for timelines)	Old timeline briefly visible
Latency	p99 < 200ms	Redis caching, CDN, read replicas	Cache invalidation complexity
Durability	No tweet loss	Replication factor 3, write-ahead log	Storage cost

Key Takeaways

System design is about architectural decisions at scale — the questions no algorithm can answer alone.
The five-step process (clarify → estimate → high-level design → detailed design → bottlenecks) provides a framework for tackling any design problem.
Back-of-envelope estimation is not about precision — it is about reasoning in the right order of magnitude.
The CAP theorem forces a choice between consistency and availability during network partitions. Stores such as Cassandra and DynamoDB favour availability (AP); systems where correctness is critical, such as banking, favour consistency (CP).
p99 latency, not average latency, is what matters for user experience and SRE monitoring.
Every large system uses the same component toolkit: CDN, load balancer, cache, message queue, database replicas — assembled differently for each problem.
Twitter's 100:1 read-to-write ratio drives its core architectural choice: pre-compute timelines on write, serve on read.
