AiTechWorlds
The year is 2006. Netflix mails red envelopes containing DVDs to subscribers across America. Their entire technology platform runs on a single monolithic application — one codebase, one database, deployed as a single unit on a handful of servers. It works fine. Until it doesn't.
In August 2008, a critical database corruption incident brought Netflix's entire service down for three days. No DVDs could be shipped. The monolith had a catastrophic flaw: one failure could take down everything, because everything was one thing.
That incident forced a decision. Netflix migrated to Amazon Web Services starting in 2008 and spent the next seven years dismantling their monolith into independent services. By 2015, Netflix was running on more than 700 microservices — each one a separate, independently deployable application. Today, Netflix serves over 280 million subscribers in 190 countries. When the service that handles thumbnail images has a problem, your video keeps playing. Services fail independently, and the system keeps running.
Understanding microservices is understanding how modern software survives failure at scale.
Before diving into microservices, it is essential to state something that is often forgotten: monoliths are not bad. For most companies at most stages, a well-structured monolith is the right architecture.
| Aspect | Monolith | Microservices |
|---|---|---|
| Deployment | Single unit, simple | Many units, complex orchestration |
| Development | Easier to start, one codebase | Harder to start, multiple repos/pipelines |
| Scaling | Scale the whole app | Scale individual services independently |
| Failure isolation | One failure can cascade | Failures contained per service |
| Data | Shared database, easy queries | Each service owns its data, harder joins |
| Team size | Works well with small teams | Suited to large, autonomous teams |
| When to choose | Early stage, small team, unclear domain | Large scale, multiple teams, clear domain boundaries |
Martin Fowler, who helped popularise microservices, also coined the Monolith First principle: build a monolith, learn your domain boundaries, then extract services when you feel genuine pain from the monolith. Premature decomposition into microservices creates complexity without benefit.
When microservices are the right choice, they rest on four principles:
1. Single Responsibility: Each service owns exactly one business capability. The Payment Service handles payments. The User Service handles accounts. The Notification Service handles emails and SMS. No service does two unrelated things.
2. Own Your Data: Each service has its own database. No two services share a database schema. This is the hardest principle to follow and the most important. Shared databases create invisible coupling: a change in one service's schema silently breaks another.
3. Communicate via APIs: Services interact through well-defined interfaces: HTTP/REST, gRPC, or message queues. The internal implementation of a service (what language it uses, how it stores data) is invisible to other services.
4. Design for Failure: Any service can fail at any time. Every service must handle the failure of its dependencies gracefully: with timeouts, retries, fallbacks, and circuit breakers.
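The fourth principle is the easiest to show in code. The following is a minimal sketch, not a production implementation: `call_with_retry`, `flaky_recommendations`, and `popular_titles` are hypothetical names invented for illustration, and the "remote" call is simulated with a local function that fails twice before succeeding.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    fallback: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.01,
) -> T:
    """Try `operation` up to `attempts` times, then use `fallback`.

    Sleeps base_delay, 2 * base_delay, ... between attempts
    (exponential backoff).
    """
    for attempt in range(attempts):
        try:
            return operation()
        except (TimeoutError, ConnectionError):
            # Only wait if another attempt is still to come.
            if attempt < attempts - 1:
                time.sleep(base_delay * (2 ** attempt))
    return fallback()


# Hypothetical dependency: times out twice, then answers.
calls = {"count": 0}


def flaky_recommendations() -> List[str]:
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("recommendation service timed out")
    return ["Title A", "Title B"]


def popular_titles() -> List[str]:
    # Degraded but useful answer when the dependency stays down.
    return ["Popular 1", "Popular 2"]


result = call_with_retry(flaky_recommendations, popular_titles)
```

If the dependency never recovers within the allowed attempts, the caller still gets the fallback list instead of an exception, which is the "graceful" part of the principle.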
The hardest question in microservices is: where do you draw the boundaries? The answer comes from Domain-Driven Design (DDD), introduced by Eric Evans in 2003.
A Bounded Context is a clear boundary within which a particular domain model applies. Inside the "Order" bounded context, "customer" means the person placing an order. Inside the "Support" bounded context, "customer" means a ticket-holder. They are different models of the same real-world entity.
Each bounded context maps naturally to a microservice. Domain events — immutable facts that something happened — cross context boundaries. OrderPlaced, PaymentFailed, UserRegistered are domain events that other services can react to without tight coupling.
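As a rough illustration of these two ideas, the sketch below models "customer" differently in two contexts and publishes an immutable `OrderPlaced` event. Everything except the event name is an assumption made for the example: the field names, the `EventBus` class, and the in-memory delivery, which stands in for a real message broker.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


# Two bounded contexts, two different models of "customer".
@dataclass
class OrderCustomer:
    """Order context: the person placing an order."""
    customer_id: str
    shipping_address: str


@dataclass
class SupportCustomer:
    """Support context: a ticket-holder."""
    customer_id: str
    open_ticket_ids: List[str]


@dataclass(frozen=True)  # frozen=True makes instances immutable
class OrderPlaced:
    order_id: str
    customer_id: str
    total_cents: int


class EventBus:
    """In-memory stand-in for a message broker."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[type, List[Callable]] = defaultdict(list)

    def subscribe(self, event_type: type, handler: Callable) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: object) -> None:
        # The publisher does not know who is listening.
        for handler in self._handlers[type(event)]:
            handler(event)


bus = EventBus()
sent_emails: List[str] = []

# A hypothetical notification handler reacts to the event.
bus.subscribe(
    OrderPlaced,
    lambda event: sent_emails.append(f"Order {event.order_id} confirmed"),
)
event = OrderPlaced(order_id="o-1", customer_id="c-42", total_cents=1999)
bus.publish(event)
```

The publishing side only knows the event type; adding a second subscriber requires no change to the code that places orders.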
Synchronous (Request/Response):
Asynchronous (Event-Driven):
Rule of thumb: Use synchronous for queries (you need the answer now). Use asynchronous for commands and events (trigger something and move on).
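The difference between the two styles can be sketched in a few lines. This is a simplified, single-process illustration: `get_user`, `place_order`, and `drain` are hypothetical functions, a plain function call stands in for an HTTP or gRPC request, and `queue.Queue` stands in for a message broker.

```python
import queue
from typing import List


# Synchronous query: the caller blocks until it has the answer.
def get_user(user_id: str) -> dict:
    # Stand-in for a request to a hypothetical User Service.
    return {"id": user_id, "name": "Ada"}


user = get_user("u-1")  # the result is needed right now

# Asynchronous event: the caller enqueues a message and moves on.
events: "queue.Queue[dict]" = queue.Queue()  # stand-in for a broker


def place_order(order_id: str) -> str:
    events.put({"type": "OrderPlaced", "order_id": order_id})
    return "accepted"  # returns before any consumer has run


status = place_order("o-1")


# Later, a consumer (for example a notification service) drains the queue.
def drain(pending: "queue.Queue[dict]") -> List[str]:
    handled = []
    while not pending.empty():
        message = pending.get()
        handled.append(f"notified for {message['order_id']}")
    return handled


handled = drain(events)
```

Note that `place_order` succeeds even if no consumer is running yet; the message simply waits in the queue.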
When a mobile app needs to display a user's home screen, it might need data from the User Service, the Recommendations Service, the Activity Service, and the Subscription Service. Without an API Gateway, the client makes four separate network calls.
The API Gateway is the single entry point for all clients. It handles:
Popular gateways: Kong, AWS API Gateway, Nginx, Traefik.
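To make the home-screen example concrete, here is a hedged sketch of request aggregation inside a gateway. The four `fetch_*` coroutines and `home_screen` are hypothetical, and `asyncio.sleep` stands in for network latency; a real gateway such as those listed above would be configured rather than hand-written like this.

```python
import asyncio
from typing import List


# Hypothetical backend calls; each sleep simulates a network round trip.
async def fetch_user(user_id: str) -> dict:
    await asyncio.sleep(0.01)
    return {"id": user_id, "name": "Ada"}


async def fetch_recommendations(user_id: str) -> List[str]:
    await asyncio.sleep(0.01)
    return ["Title A", "Title B"]


async def fetch_activity(user_id: str) -> List[str]:
    await asyncio.sleep(0.01)
    return ["watched Title C"]


async def fetch_subscription(user_id: str) -> dict:
    await asyncio.sleep(0.01)
    return {"plan": "standard"}


async def home_screen(user_id: str) -> dict:
    """Gateway endpoint: fan out to four services, return one response."""
    # gather() runs the four calls concurrently and keeps argument order.
    user, recommendations, activity, subscription = await asyncio.gather(
        fetch_user(user_id),
        fetch_recommendations(user_id),
        fetch_activity(user_id),
        fetch_subscription(user_id),
    )
    return {
        "user": user,
        "recommendations": recommendations,
        "activity": activity,
        "subscription": subscription,
    }


response = asyncio.run(home_screen("u-1"))
```

The client makes one call and receives one combined payload, while the fan-out to the four services happens inside the data centre, where latency is lower.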
Microservices fail. When Service A calls Service B and Service B is slow or down, Service A must not keep waiting while blocked threads pile up. If it does, Service A fails too, then Service C (which calls A) fails, and so on until the entire system collapses. This is called a cascading failure.
The Circuit Breaker pattern (named after the electrical component) prevents this.
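A minimal sketch of the pattern follows, using the commonly described closed, open, and half-open states. The class and parameter names (`CircuitBreaker`, `failure_threshold`, `reset_timeout`) are assumptions for illustration and do not mirror the API of any particular library; the clock is injectable only so the example can run without real waiting.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"  # allow one trial call through
        return "open"

    def call(self, operation: Callable[[], T]) -> T:
        if self.state == "open":
            # Fail fast instead of waiting on a broken dependency.
            raise CircuitOpenError("dependency unavailable")
        try:
            result = operation()
        except Exception:
            self._failures += 1
            # Trips on reaching the threshold; a failed half-open
            # trial re-opens the circuit and restarts the timer.
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


# Demo with a fake clock so no real time has to pass.
now = [0.0]
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0,
                         clock=lambda: now[0])


def failing() -> str:
    raise ConnectionError("Service B is down")


for _ in range(2):
    try:
        breaker.call(failing)
    except ConnectionError:
        pass

state_after_failures = breaker.state
now[0] = 10.0  # pretend the reset timeout has elapsed
state_after_timeout = breaker.state
recovered = breaker.call(lambda: "ok")
```

While the circuit is open, Service A gets an immediate `CircuitOpenError` rather than a blocked thread, which is what stops the failure from spreading upstream.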
Netflix built Hystrix as an open-source circuit breaker library. Though now in maintenance mode, it popularised the pattern. The modern replacement is Resilience4j. Netflix reported that Hystrix prevented thousands of cascading failures per day at peak usage.
When you have 700 services, managing TLS certificates, retries, timeouts, and observability for every service-to-service call becomes a full-time job. A service mesh solves this by deploying a lightweight proxy (called a sidecar) alongside every service instance.
The sidecar handles all network communication transparently: mTLS encryption, load balancing, retries, circuit breaking, and distributed tracing — without changing application code.
Istio (originally developed by Google, IBM, and Lyft) and Linkerd are the leading service mesh implementations. Uber, Lyft, and Airbnb all run service meshes at scale.
Each box in this diagram is an independently deployed service with its own team, codebase, deployment pipeline, and database. A failure in the Recommendation Service means users see slightly outdated recommendations — not a dark screen.
Microservices solve problems but create new ones:
Netflix's 2008 database crash was the inciting incident that drove one of the most important architectural evolutions in software history. But the lesson is not "always use microservices." The lesson is that architecture must match scale and team structure.
Conway's Law states that software systems mirror the communication structure of the organisations that build them. If you have 5 engineers, a monolith is correct. If you have 500 engineers across 50 teams, microservices allow each team to own their domain and deploy independently. The architecture follows the organisation, not the other way around.