AiTechWorlds
Picture the moment a major city wakes up on a Monday morning. At 8:47 AM, hundreds of thousands of people simultaneously open the Uber app and request a ride. In the span of 60 seconds, Uber receives over one million ride requests.
If Uber processed every request synchronously — meaning each request had to fully complete before the next could begin — the system would collapse immediately. You cannot answer a million things at once in a straight line.
Instead, Uber does exactly what a ticket booth does when a crowd arrives: it creates a queue. Each ride request is placed into a message queue the moment it arrives. The matching system, the pricing engine, the driver notification service — all of them read from this queue at their own pace. No request is dropped. No system is overwhelmed. The queue acts as a buffer between the world's unpredictable demand and the system's finite processing capacity.
This is the core insight of asynchronous, event-driven architecture: decouple when something is requested from when it is processed.
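A minimal sketch of that decoupling, using Python's standard-library `queue.Queue` as an in-memory stand-in for a real message broker. The function name `run_buffered`, the worker count, and the `ride-N` request strings are assumptions made for illustration. They are not taken from Uber's system.

```python
import queue
import threading

def run_buffered(requests, worker_count=3):
    """Accept every request immediately, then let workers drain the buffer."""
    buffer = queue.Queue()      # in-memory stand-in for a message queue
    processed = []
    lock = threading.Lock()

    def worker():
        while True:
            item = buffer.get()
            if item is None:    # sentinel: no more work for this worker
                buffer.task_done()
                return
            with lock:          # "processing" here is just recording the item
                processed.append(item)
            buffer.task_done()

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for w in workers:
        w.start()

    # Producer side: enqueueing is cheap and does not wait for processing.
    for request in requests:
        buffer.put(request)
    for _ in workers:
        buffer.put(None)        # one sentinel per worker

    buffer.join()               # block until every queued item is handled
    for w in workers:
        w.join()
    return processed

results = run_buffered([f"ride-{i}" for i in range(1000)])
print(len(results))  # 1000
```

The producer loop finishes as fast as it can enqueue. How quickly the work is actually done depends only on the consumers, which is the separation the paragraph above describes.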
In synchronous communication, the caller waits for the response. Service A calls Service B and blocks until B replies.
This creates three problems at scale:
Asynchronous communication through a message queue solves all three:
Before examining specific technologies, the vocabulary:
RabbitMQ, first released in 2007, implements the AMQP (Advanced Message Queuing Protocol). It is a general-purpose message broker focused on routing flexibility.
RabbitMQ's model: producers send messages to an exchange. The exchange applies routing rules and forwards messages to one or more queues. Consumers read from queues.
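Routing types (AMQP exchange types): direct (the routing key must equal the binding key), fanout (every bound queue receives the message), topic (the routing key is matched against a wildcard pattern), and headers (matching is done on message header attributes instead of the routing key).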
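The producer, exchange, and queue flow can be sketched in memory as follows. This is a toy model and not RabbitMQ's client API: the `Exchange` and `Broker` classes, the queue names, and the routing keys are all hypothetical. It covers the direct, fanout, and topic exchange types defined by AMQP 0-9-1 and leaves out the headers type. In topic patterns, `*` matches exactly one dot-separated word and `#` matches zero or more.

```python
from collections import defaultdict

def topic_matches(pattern, routing_key):
    """AMQP-style topic match: '*' is one word, '#' is zero or more words."""
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' may absorb any number of leading words, including none
            return any(match(p[1:], k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        return (p[0] == "*" or p[0] == k[0]) and match(p[1:], k[1:])
    return match(pattern.split("."), routing_key.split("."))

class Exchange:
    def __init__(self, kind):
        if kind not in ("direct", "fanout", "topic"):
            raise ValueError(f"unsupported exchange type: {kind}")
        self.kind = kind
        self.bindings = []      # list of (binding_key, queue_name)

    def bind(self, queue_name, binding_key=""):
        self.bindings.append((binding_key, queue_name))

    def route(self, routing_key):
        """Return the queues that should receive a message with this key."""
        targets = []
        for binding_key, queue_name in self.bindings:
            if self.kind == "fanout":
                hit = True
            elif self.kind == "direct":
                hit = binding_key == routing_key
            else:
                hit = topic_matches(binding_key, routing_key)
            if hit and queue_name not in targets:   # at most one copy per queue
                targets.append(queue_name)
        return targets

class Broker:
    def __init__(self):
        self.queues = defaultdict(list)

    def publish(self, exchange, routing_key, body):
        for queue_name in exchange.route(routing_key):
            self.queues[queue_name].append(body)

broker = Broker()
rides = Exchange("topic")
rides.bind("pricing", "ride.*.requested")
rides.bind("audit", "ride.#")
broker.publish(rides, "ride.nyc.requested", "r1")
broker.publish(rides, "ride.nyc.cancelled", "r2")
print(dict(broker.queues))  # {'pricing': ['r1'], 'audit': ['r1', 'r2']}
```

The producer only ever names an exchange and a routing key. Which queues exist, and which consumers read them, is decided entirely by the bindings.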
Consumer acknowledgement (ack/nack): A consumer sends an ack after successfully processing a message. Until the broker receives the ack, the message is not removed. If the consumer crashes, the message is redelivered to another consumer. A nack explicitly rejects a message — it can be requeued or dead-lettered.
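The ack/nack life cycle can be illustrated with a small in-memory model. `AckQueue` and its method names are hypothetical and only mimic the behaviour described above. A real broker tracks unacknowledged deliveries per consumer channel, while this sketch assumes a single consumer.

```python
from collections import deque

class AckQueue:
    """Toy queue that keeps a message until the consumer acknowledges it."""

    def __init__(self):
        self.ready = deque()    # messages waiting to be delivered
        self.unacked = {}       # delivery_tag -> delivered but unacknowledged
        self.dead_letters = []  # messages rejected without requeue
        self._next_tag = 1

    def publish(self, message):
        self.ready.append(message)

    def deliver(self):
        """Hand the next message to a consumer, or return None if empty."""
        if not self.ready:
            return None
        message = self.ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self.unacked[tag] = message
        return tag, message

    def ack(self, tag):
        # Only now is the message really gone from the broker.
        self.unacked.pop(tag)

    def nack(self, tag, requeue=True):
        message = self.unacked.pop(tag)
        if requeue:
            self.ready.appendleft(message)
        else:
            self.dead_letters.append(message)

    def consumer_lost(self):
        """Simulate a consumer crash: requeue everything it had not acked."""
        for tag in sorted(self.unacked, reverse=True):
            self.ready.appendleft(self.unacked.pop(tag))

q = AckQueue()
q.publish("charge-42")
tag, msg = q.deliver()
q.consumer_lost()                 # consumer crashed before sending an ack
tag2, msg2 = q.deliver()          # the same message is delivered again
q.ack(tag2)
print(msg2, len(q.ready), len(q.unacked))  # charge-42 0 0
```

Because redelivery happens whenever an ack is missing, a consumer that crashes after doing the work but before acking will cause the message to be processed twice. This is why at-least-once delivery has to be paired with idempotent consumers.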
Use cases for RabbitMQ: task queues (background job processing), complex routing workflows, and workloads where per-queue message ordering matters and volumes are moderate (tens of thousands of messages per second, not millions).
Kafka was developed at LinkedIn by Jay Kreps, Neha Narkhede, and Jun Rao, and open-sourced in 2011. It has a fundamentally different design from traditional message brokers.
Kafka's model: an immutable, append-only, distributed log.
Messages are written to the end of a log and never modified. They stay in the log for a configurable retention period — days, weeks, or forever. Consumers read from any position in the log by specifying an offset. Multiple consumer groups can read the same data independently, each tracking their own offset.
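The log model can be sketched in a few lines. This is a single-partition, in-memory toy, not the Kafka client API: the `Log` class, its method names, and the group names are assumptions for illustration, and retention and replication are left out.

```python
class Log:
    """Append-only log with independent per-group read positions."""

    def __init__(self):
        self._records = []      # records are appended and never modified
        self._offsets = {}      # consumer group -> next offset to read

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1       # offset of the new record

    def read(self, offset, max_records=10):
        return self._records[offset:offset + max_records]

    def poll(self, group, max_records=10):
        """Read from the group's current offset and advance it."""
        start = self._offsets.get(group, 0)
        batch = self.read(start, max_records)
        self._offsets[group] = start + len(batch)
        return batch

    def seek(self, group, offset):
        """Rewind (or skip ahead) a group without affecting the others."""
        self._offsets[group] = offset

log = Log()
for event in ["page_view", "job_click", "profile_edit"]:
    log.append(event)

first = log.poll("analytics")                       # all three records
partial = log.poll("notifications", max_records=2)  # this group lags behind
log.seek("analytics", 0)                            # replay from the start
replayed = log.poll("analytics")
print(first == replayed, partial)  # True ['page_view', 'job_click']
```

Reading does not remove anything. A consumer group is just a name plus an offset, which is why adding a new group or replaying old data has no effect on existing readers.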
Why the distributed log is powerful:
Kafka at LinkedIn (2023): LinkedIn processes over 7 trillion messages per day on Kafka. It is the backbone of LinkedIn's entire data infrastructure — activity tracking, notifications, job recommendations, and analytics all flow through Kafka.
Amazon SQS (Simple Queue Service) is AWS's managed message queue. No servers to provision, no clusters to manage. It provides:
SQS is the right choice when you want queue semantics without operational overhead and your workload fits within its constraints.
Message queues are infrastructure. Event-driven architecture (EDA) is the design pattern built on top of them.
In EDA, services communicate by publishing events — immutable records that something happened in the past.
Examples of events: UserRegistered, OrderPlaced, PaymentFailed, ItemShipped.
Events are facts, not commands. Publishing OrderPlaced is announcing a fact. The publisher does not know or care which services react to it.
This is the fanout pattern: one event triggers multiple independent reactions, all in parallel, without the Order Service knowing any of them exist.
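A minimal in-process sketch of fanout. The `EventBus` class and the three subscribers (inventory, email, analytics) are hypothetical examples. The handlers here run one after another inside `publish`, whereas a real broker would deliver to each subscriber asynchronously and in parallel.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    name: str
    payload: dict

class EventBus:
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_name, handler):
        self._subscribers[event_name].append(handler)

    def publish(self, event):
        # The publisher never inspects who is subscribed.
        for handler in self._subscribers.get(event.name, []):
            handler(event)

reactions = []
bus = EventBus()
bus.subscribe("OrderPlaced", lambda e: reactions.append(("inventory", e.payload["order_id"])))
bus.subscribe("OrderPlaced", lambda e: reactions.append(("email", e.payload["order_id"])))
bus.subscribe("OrderPlaced", lambda e: reactions.append(("analytics", e.payload["order_id"])))

# The order service announces a fact and moves on.
bus.publish(Event("OrderPlaced", {"order_id": 1001}))
print(reactions)  # [('inventory', 1001), ('email', 1001), ('analytics', 1001)]
```

Adding a fourth reaction means adding one more `subscribe` call. The code that publishes `OrderPlaced` does not change.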
Traditional systems store the current state: the user's account balance is $1,240.
Event sourcing stores the history of events instead: AccountOpened($500), Deposit($1,000), Withdrawal($260). The current state is computed by replaying events. This gives you a complete audit log, the ability to replay events to rebuild state, and the ability to answer questions like "what was this user's balance on March 3rd?"
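The replay idea can be shown with the three events from the example. The event dates below are invented for illustration, since the text gives only amounts. `AccountEvent` and `balance` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class AccountEvent:
    kind: str       # "AccountOpened", "Deposit", or "Withdrawal"
    amount: int     # whole dollars
    on: date        # assumed date, not given in the text

def balance(events, as_of=None):
    """Compute the balance by replaying events, optionally up to a date."""
    total = 0
    for event in events:
        if as_of is not None and event.on > as_of:
            continue
        if event.kind in ("AccountOpened", "Deposit"):
            total += event.amount
        elif event.kind == "Withdrawal":
            total -= event.amount
        else:
            raise ValueError(f"unknown event kind: {event.kind}")
    return total

history = [
    AccountEvent("AccountOpened", 500, date(2024, 3, 1)),
    AccountEvent("Deposit", 1000, date(2024, 3, 2)),
    AccountEvent("Withdrawal", 260, date(2024, 3, 10)),
]

print(balance(history))                            # 1240
print(balance(history, as_of=date(2024, 3, 3)))    # 1500
```

The current balance of $1,240 is never stored. It is derived, and the same replay with a cutoff date answers the historical question.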
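In systems with complex domains, the optimal data model for writing data is often different from the optimal model for reading it. CQRS (Command Query Responsibility Segregation) separates these into two models: a command model that handles writes and a query model that serves reads.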
Events flow from the command model to update the query model asynchronously.
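A small sketch of that flow, assuming an order domain. The class names, the `outbox` deque standing in for a message queue, and the `project` function are all hypothetical. The point is only that the read side is updated from events after the write has already completed.

```python
from collections import deque

class CommandModel:
    """Write side: validates and stores orders, then emits an event."""

    def __init__(self, outbox):
        self.orders = {}
        self.outbox = outbox

    def place_order(self, order_id, customer, total):
        if order_id in self.orders:
            raise ValueError("duplicate order id")
        self.orders[order_id] = {"customer": customer, "total": total}
        self.outbox.append(
            {"type": "OrderPlaced", "customer": customer, "total": total}
        )

class QueryModel:
    """Read side: a denormalized view shaped for one specific query."""

    def __init__(self):
        self.spend_by_customer = {}

    def apply(self, event):
        if event["type"] == "OrderPlaced":
            customer = event["customer"]
            current = self.spend_by_customer.get(customer, 0)
            self.spend_by_customer[customer] = current + event["total"]

def project(outbox, query_model):
    """Drain pending events into the read model (runs later, separately)."""
    while outbox:
        query_model.apply(outbox.popleft())

outbox = deque()
writes = CommandModel(outbox)
reads = QueryModel()

writes.place_order(1, "alice", 30)
writes.place_order(2, "alice", 12)
stale = dict(reads.spend_by_customer)    # read model has not caught up yet
project(outbox, reads)
print(stale, reads.spend_by_customer)    # {} {'alice': 42}
```

The empty `stale` snapshot is the cost of this design: the query model is eventually consistent with the command model, not instantly consistent.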
Dead Letter Queue (DLQ): When a message fails processing repeatedly (e.g., 3 retries), move it to a separate DLQ instead of infinitely retrying. Engineers can inspect the DLQ to diagnose and fix issues without blocking the main queue.
Poison Message: A message that consistently causes consumers to crash. Without a DLQ, a poison message can loop forever, blocking the entire queue.
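A sketch of bounded retries with a dead letter queue. The `consume` function, the `max_attempts` default of 3 (matching the example above), and the "corrupt" payload are assumptions for illustration. Real brokers usually track the delivery count for you.

```python
from collections import deque

def consume(messages, handler, max_attempts=3):
    """Process messages; move any that keep failing to a dead letter queue."""
    main = deque((message, 0) for message in messages)
    done = []
    dlq = []
    while main:
        message, attempts = main.popleft()
        try:
            handler(message)
            done.append(message)
        except Exception as exc:
            attempts += 1
            if attempts >= max_attempts:
                # Give up: park the message with the reason for inspection.
                dlq.append((message, repr(exc)))
            else:
                # Retry later, behind the messages already waiting.
                main.append((message, attempts))
    return done, dlq

calls = []

def handler(message):
    calls.append(message)
    if message == "corrupt":            # a poison message: fails every time
        raise ValueError("bad payload")

done, dlq = consume(["a", "corrupt", "b"], handler)
print(done)                     # ['a', 'b']
print([m for m, _ in dlq])      # ['corrupt']
print(calls.count("corrupt"))   # 3
```

Without the `max_attempts` check, the loop would never terminate for the poison message. With it, healthy messages are processed and the failure is preserved for diagnosis.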
Idempotency: Because at-least-once delivery can deliver a message more than once, consumers must handle duplicate messages correctly. Processing PaymentCharged twice should not charge the customer twice. Use unique idempotency keys to detect and ignore duplicates.
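A minimal sketch of duplicate detection with idempotency keys. The `PaymentConsumer` class and the message fields are hypothetical, and the seen-key set lives in memory only. In a real system the key and the side effect would need to be stored durably and committed together, otherwise a crash between the two could still cause a double charge.

```python
class PaymentConsumer:
    def __init__(self):
        self.seen_keys = set()  # in-memory only; an assumption for brevity
        self.charged = {}       # customer -> total amount charged

    def handle(self, message):
        """Apply the charge once per idempotency key. Returns True if applied."""
        key = message["idempotency_key"]
        if key in self.seen_keys:
            return False        # duplicate delivery: ignore it
        customer = message["customer"]
        self.charged[customer] = self.charged.get(customer, 0) + message["amount"]
        self.seen_keys.add(key)
        return True

consumer = PaymentConsumer()
charge = {"idempotency_key": "pay-7f3a", "customer": "alice", "amount": 25}

first = consumer.handle(charge)
second = consumer.handle(charge)    # redelivered by an at-least-once queue
print(first, second, consumer.charged)  # True False {'alice': 25}
```

The key must come from the producer (for example, generated when the payment is first requested), so that every redelivery of the same logical message carries the same key.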
| Technology | Delivery | Ordering | Throughput | Persistence | Replay | Use Case |
|---|---|---|---|---|---|---|
| RabbitMQ | At-least-once | Per-queue | Moderate to high (tens of K/s) | Configurable | No | Task queues, routing |
| Apache Kafka | At-least-once | Per-partition | Very high (M/s) | Yes (days/weeks) | Yes | Event streaming, audit logs |
| Amazon SQS | At-least-once | FIFO option | High (managed) | Managed | No | Serverless, AWS-native |
| Google Pub/Sub | At-least-once | No guarantee | Very high | 7 days | Limited | GCP-native, global scale |
| Redis Streams | At-least-once | Per-stream | Very high | Optional | Yes | Low-latency, simple streaming |
Uber's Monday morning rush is solvable not because their servers are infinitely fast, but because their architecture separates receiving requests from processing requests. The queue is the buffer that makes this separation possible.
Message queues and event-driven architecture are what allow modern systems to be simultaneously reliable (messages are not dropped), scalable (consumers scale independently), and resilient (failures in one service do not cascade to others). LinkedIn's 7 trillion daily messages on Kafka is the extreme end — but the principles apply at every scale, from a startup processing 100 orders per day to a platform handling a million ride requests per minute.