AiTechWorlds
In 2011, Amazon engineers published a presentation that stunned the software industry. Amazon was deploying code to production once every 11.6 seconds on average, with a reported peak of more than a thousand deployments in a single hour. This was not recklessness. It was the result of a deliberate engineering culture and infrastructure built around the principle that deploying frequently is safer than deploying rarely.
Netflix, by 2019, was performing thousands of deployments per day across hundreds of services. Not with a massive operations team manually managing each release — with automated pipelines that moved code from a developer's commit to production with minimal human intervention.
For most of software history, releasing software meant a stressful, all-hands, after-midnight event scheduled quarterly. Amazon and Netflix proved this is not a law of nature. It is a choice made by engineering culture and tooling.
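Before DevOps, most technology organisations had a sharp divide: development teams were measured on how quickly they shipped change, while operations teams were measured on how stable production stayed.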
These incentives were in direct conflict. Developers threw code "over the wall" to operations. Operations resisted deployments because deployments caused incidents. Both teams were optimising for their own goals, not the organisation's.
DevOps is the movement — and practice — of eliminating this wall. The core insight: the people who build software should also be responsible for running it in production. As Amazon's CTO Werner Vogels put it: "You build it, you run it."
DevOps principles:
Continuous Integration is the practice of every developer merging code into the shared main branch frequently — at least daily — with each merge triggering an automated suite of builds and tests.
Before CI, teams would work in isolation for weeks, then attempt to merge. The result was integration hell — conflicts, broken builds, and debugging sessions that took longer than the original development.
CI's core contract: the main branch is always in a deployable state. Automated tests run on every commit. If tests fail, the pipeline stops and alerts the developer immediately, not three weeks later, when someone else's change fails in a different way.
CI tools: GitHub Actions, GitLab CI, CircleCI, Jenkins, Buildkite.
These terms are often confused but represent different levels of automation:
Continuous Delivery (CD): Every successful CI build automatically deploys to a staging environment. The deployment to production requires a manual approval. Humans decide when to release, but the release process itself is automated.
Continuous Deployment: The most automated level. Every commit that passes all automated tests is automatically deployed to production with no human gate. Netflix and Amazon operate at this level for many services.
Most organisations practice Continuous Delivery. Continuous Deployment requires exceptional test coverage and confidence in the automated safety nets.
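The difference between the two levels comes down to a single gate in front of production. The sketch below is one way to express it, not a description of any particular CI tool; the names `ReleaseMode` and `may_deploy_to_production` are hypothetical and chosen only for illustration.

```python
from enum import Enum


class ReleaseMode(Enum):
    CONTINUOUS_DELIVERY = "delivery"      # production needs a human approval
    CONTINUOUS_DEPLOYMENT = "deployment"  # production is fully automated


def may_deploy_to_production(mode, tests_passed, approved_by=None):
    """Return True if this build is allowed to go to production."""
    if not tests_passed:
        return False  # a failing build never ships, in either mode
    if mode is ReleaseMode.CONTINUOUS_DEPLOYMENT:
        return True   # passing tests are the only gate
    return approved_by is not None  # Continuous Delivery: wait for a person
```

In both modes the release mechanics are identical; only the decision to start the production step differs.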
Stage 1 — Code Push: Developer pushes a feature branch to GitHub/GitLab. A pull request triggers the pipeline.
Stage 2 — Build: The code is compiled (if applicable), dependencies are installed, and a Docker image is built and tagged with the commit hash. The image is immutable — exactly what was tested is exactly what gets deployed.
Stage 3 — Test: Unit tests run first (fastest, fail early). Then integration tests against a real database in Docker. Then security scans (SAST, dependency audit). If any stage fails, the pipeline stops immediately.
Stage 4 — Artefact Storage: The passing Docker image is pushed to a container registry (Docker Hub, Amazon ECR, Google Artifact Registry). It is now available for any environment to pull.
Stage 5 — Staging Deployment: The image is deployed to a staging environment. Automated smoke tests (basic "is the app alive?" checks) and integration tests run against the real infrastructure.
Stage 6 — Production Deployment: Fully automated (for Continuous Deployment) or gated by a manual approval (for Continuous Delivery). The deployment strategy determines how traffic shifts to the new version.
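Two ideas from the stages above are easy to show in a few lines: stages run in order and stop at the first failure, and the image tag is derived from the commit hash. This is a minimal sketch under those assumptions. `run_pipeline` and `image_tag` are hypothetical helpers, and the lambdas stand in for real build and test commands.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def image_tag(app: str, commit_sha: str) -> str:
    """Tag the image with the commit hash so the tested artefact is the deployed one."""
    return f"{app}:{commit_sha}"


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure.

    Returns (succeeded, names of the stages that actually ran).
    """
    ran = []
    for name, step in stages:
        ran.append(name)
        if not step():
            return False, ran  # fail fast: later stages never start
    return True, ran


if __name__ == "__main__":
    stages = [
        ("build", lambda: True),
        ("unit-tests", lambda: True),
        ("integration-tests", lambda: False),  # simulated failure
        ("security-scan", lambda: True),
        ("push-image", lambda: True),
    ]
    ok, ran = run_pipeline(stages)
    print(ok, ran)  # False ['build', 'unit-tests', 'integration-tests']
```

Putting the cheapest checks first means a broken commit is rejected before the slower stages spend any time on it.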
The most famous phrase in software development: "It works on my machine."
Docker solves this by packaging an application and all its dependencies — the runtime, libraries, configuration — into a container. A container is a standardised unit that runs identically on a developer's laptop, a CI server, and a production server.
Key concepts:
Docker eliminated an entire category of environment mismatch bugs that previously consumed significant engineering time.
When you are running 700 services in Docker containers, you need software to manage them. Kubernetes (K8s), open-sourced by Google in 2014, is the standard container orchestration platform.
Core Kubernetes concepts:
What Kubernetes handles automatically:
| Strategy | Downtime | Risk | Rollback Speed | Complexity | Best For |
|---|---|---|---|---|---|
| Blue-Green | Zero | Low (instant rollback) | Instant | Medium | Database migrations, major releases |
| Canary | Zero | Very Low (limited blast radius) | Fast | High | High-traffic services, gradual confidence |
| Rolling | Zero | Medium | Slow | Low | Stateless services with no DB changes |
| Recreate | Yes | High | Slow | Very Low | Dev/staging, acceptable downtime |
Blue-Green: Maintain two identical production environments. The current version (Blue) serves 100% of traffic. Deploy the new version to Green. When ready, switch the load balancer. If problems occur, switch back in seconds.
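The blue-green switch can be modelled as a pointer that names which environment is live. The sketch below is a toy in-memory model, not a load balancer API; `BlueGreenRouter` and its methods are hypothetical names.

```python
class BlueGreenRouter:
    """Toy model: two environments, one of which receives all traffic."""

    def __init__(self, live_version):
        self.versions = {"blue": live_version, "green": None}
        self.live = "blue"

    @property
    def idle(self):
        return "green" if self.live == "blue" else "blue"

    def deploy_to_idle(self, version):
        """Install a new version on the environment that serves no traffic."""
        self.versions[self.idle] = version
        return self.idle

    def switch(self):
        """Point all traffic at the idle environment (also used to roll back)."""
        if self.versions[self.idle] is None:
            raise RuntimeError("idle environment has nothing deployed")
        self.live = self.idle
        return self.live

    def serving(self):
        return self.versions[self.live]
```

Because the previous version stays installed on the now-idle environment, calling `switch()` a second time is the rollback.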
Canary: Route 1–5% of production traffic to the new version. Monitor error rates, latency, and business metrics. If healthy, gradually increase to 10%, 25%, 50%, 100%. If problems emerge, roll back by setting canary weight to 0%.
Netflix uses canary deployments for all service updates, monitoring thousands of metrics during the rollout period before committing to 100% traffic.
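The canary procedure is a loop over traffic weights with a health check at each step. The sketch below assumes a caller-supplied `is_healthy(weight)` function that wraps whatever metrics are being watched; that function, the name `canary_rollout`, and the exact step values are illustrative assumptions, not Netflix's tooling.

```python
def canary_rollout(is_healthy, steps=(1, 5, 10, 25, 50, 100)):
    """Raise the canary's traffic weight step by step.

    `is_healthy(weight)` is called after each increase. On the first
    unhealthy result the weight is set back to 0 (rollback).
    Returns (final_weight, history_of_weights).
    """
    history = []
    for weight in steps:
        history.append(weight)
        if not is_healthy(weight):
            history.append(0)  # roll back: canary receives no traffic
            return 0, history
    return steps[-1], history
```

For example, `canary_rollout(lambda w: w < 25)` stops at the 25% step and returns `(0, [1, 5, 10, 25, 0])`, so at most a quarter of traffic ever reached the faulty version.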
Feature Flags: Deploy code to 100% of servers but keep the feature disabled. Enable it for 1% of users, then 10%, then everyone. Roll back by toggling the flag — no code deployment needed. Tools: LaunchDarkly, Unleash, Flipt.
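One common way to implement a percentage rollout is to hash each user into a stable bucket and enable the flag for buckets below the rollout percentage. The sketch below shows that idea only; it is not the API of LaunchDarkly, Unleash, or Flipt, and `bucket` and `is_enabled` are hypothetical names.

```python
import hashlib


def bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    """True if this user falls inside the current rollout percentage."""
    return bucket(flag_name, user_id) < rollout_percent
```

Because the bucket depends only on the flag name and user ID, a user who sees the feature at 10% keeps seeing it at 50%, and setting the percentage to 0 turns it off for everyone without a deployment.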
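    name: CI/CD Pipeline

    on:
      push:
        branches: [main]
      pull_request:
        branches: [main]

    jobs:
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          # Assumes a requirements.txt that includes pytest.
          - name: Install dependencies
            run: pip install -r requirements.txt
          - name: Run unit tests
            run: pytest tests/unit/
          - name: Run integration tests
            run: pytest tests/integration/

      build-and-deploy:
        needs: test
        runs-on: ubuntu-latest
        # Only deploy from main; pull requests stop after the test job.
        if: github.ref == 'refs/heads/main'
        steps:
          # Each job starts on a fresh runner, so check out the source again.
          - uses: actions/checkout@v4
          - name: Build Docker image
            run: docker build -t myapp:${{ github.sha }} .
          # Pushing needs a registry login step and a registry-qualified image name, both omitted here.
          - name: Push to registry
            run: docker push myapp:${{ github.sha }}
          - name: Deploy to production (canary)
            run: ./scripts/deploy-canary.sh ${{ github.sha }}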
Amazon's 11.6-second deployment cadence is not a party trick — it is a competitive advantage. Frequent small deployments mean smaller blast radius when something goes wrong, faster time-to-market for features, and faster recovery when bugs are found.
The tools — Docker, Kubernetes, GitHub Actions — are means to an end. The end is a culture where deploying to production is a routine, low-anxiety event rather than a quarterly crisis. The pipeline enforces quality at every step, so by the time code reaches production, it has passed unit tests, integration tests, security scans, and staging validation. What Amazon and Netflix proved is that reliability and deployment speed are not in tension. With the right pipeline, they reinforce each other.