In 1936, Alan Turing was 23 years old and working on a problem that had occupied the greatest mathematical minds of his generation: David Hilbert's Entscheidungsproblem (German for "decision problem"). Hilbert asked in 1928 whether there exists a mechanical procedure — an algorithm — that can determine the truth or falsity of any mathematical statement. Turing's answer was no. But to prove it, he first had to precisely define what an "algorithm" or "mechanical procedure" even means. His solution was a hypothetical machine so stripped-down it could be described in a paragraph, yet so powerful it can compute anything any modern computer can compute. We call it a Turing machine, and it remains the foundation of theoretical computer science nearly ninety years later.

The Intuition: A Machine with Unlimited Memory

A DFA has finitely many states and no external memory. A PDA adds a stack — but a stack is one-dimensional and accessed only at the top. A Turing machine (TM) replaces the stack with an infinite tape divided into cells, each holding one symbol. A read/write head can move left or right along the tape, reading and writing symbols at will. This seemingly small upgrade — unlimited, randomly accessible memory plus the ability to write — makes the TM infinitely more powerful than a PDA.

Formal Definition

A Turing machine is a 7-tuple M = (Q, Σ, Γ, δ, q₀, q_accept, q_reject) where:

Q — a finite set of states.
Σ — the input alphabet. Crucially, the blank symbol ⊔ is not in Σ.
Γ — the tape alphabet, with Σ ∪ {⊔} ⊆ Γ. The tape can hold any symbol in Γ; cells not yet written hold ⊔.
δ: Q × Γ → Q × Γ × {L, R} — the transition function. Given the current state and the tape symbol under the head, it produces a new state, a symbol to write (replacing the current symbol), and a direction to move (Left or Right).
q₀ ∈ Q — the initial state.
q_accept ∈ Q — the accept state. Entering this state immediately halts and accepts.
q_reject ∈ Q — the reject state. Entering this state immediately halts and rejects. q_accept ≠ q_reject.

How a Turing Machine Computes

The input string w is placed on the tape, one symbol per cell, starting at cell 1. All other cells hold ⊔.
The machine begins in state q₀ with the head at cell 1.
At each step: read the symbol under the head, apply δ, write the new symbol, move the head, enter the new state.
Halt-accept: the machine enters q_accept. The input is accepted.
Halt-reject: the machine enters q_reject. The input is rejected.
Loop: the machine never halts. The input is neither accepted nor rejected — the computation runs forever.

This third possibility — looping forever — is what makes TMs fundamentally different from DFAs and PDAs, which always halt.

TM vs. DFA vs. PDA

Model	Memory	Move Direction	Can Write?	Halting Guaranteed?	Power
DFA	States only	One direction	No	Yes (always halts)	Regular languages
PDA	Stack (LIFO)	One direction	Stack only	Yes (always halts)	Context-free languages
Turing Machine	Infinite tape (R/W)	Both L and R	Yes	No (may loop)	All decidable/RE languages

The three differences — read/write tape, bidirectional movement, infinite memory — together give the TM its extraordinary power.

Example: TM for {0ⁿ1ⁿ | n ≥ 1}

The algorithm: scan left to right. Mark the first unmarked 0 (write X). Scan right to find the first unmarked 1 (write Y). Return to the left end. Repeat. Accept if all 0s and 1s are marked when the scan finds no more unmarked 0s.

Tape evolution for input 001 1:

Step 0: [0][0][1][1][⊔]   head at pos 1, state q0
Step 1: [X][0][1][1][⊔]   write X, move R, → q1
Step 3: [X][0][Y][1][⊔]   skip 0, find 1, write Y, → q2
Step 5: [X][0][Y][1][⊔]   scan back to X
Step 6: [X][0][Y][1][⊔]   X seen, move R → q0
Step 7: [X][X][Y][1][⊔]   mark second 0 → q1
Step 9: [X][X][Y][Y][⊔]   find remaining 1, mark Y → q2
Step 11:[X][X][Y][Y][⊔]   scan back, see X → q0
         only Y's remain → q_accept  ✓

Multi-Tape Turing Machines

A k-tape Turing machine has k independent tapes and k read/write heads, with a transition function that reads all k heads simultaneously and moves each independently. This is more convenient to program.

Theorem: Every multi-tape TM has an equivalent single-tape TM. The single-tape TM simulates the multi-tape TM by encoding all k tapes on one tape, separated by delimiters. There is a polynomial time overhead, but the computational power is identical.

Nondeterministic Turing Machines

A nondeterministic TM has a transition function δ: Q × Γ → 𝒫(Q × Γ × {L, R}) — at each step, multiple transitions are possible. The machine accepts if any branch of computation accepts.

Theorem: Every nondeterministic TM has an equivalent deterministic TM. (The deterministic TM simulates all branches using a breadth-first search tree.) However, the simulation may require exponential time — this is exactly the P vs. NP question at the level of machine models.

The Church-Turing Thesis

The Church-Turing Thesis (1936): Every function that is effectively computable by any algorithm — by any reasonable model of computation — can be computed by some Turing machine.

Alonzo Church independently reached the same conclusion using lambda calculus in 1936, and proved that his system was equivalent to Turing machines. Later, Gödel's recursive functions, Post's tag systems, and modern programming languages like Python or Java were all shown to be Turing-equivalent.

The Church-Turing Thesis is not a theorem — it cannot be formally proved because "effectively computable" is an informal concept. But it is universally accepted. No one has ever produced an algorithm that can be described informally but that lies beyond TM capability.

The Universal Turing Machine: The Concept of Software

A Universal Turing Machine (UTM) is a single TM that takes as input the encoding ⟨M, w⟩ of another TM M and a string w, and simulates M on w. This is the theoretical origin of the concept of software: a program (M) is just data that another program (the UTM/computer) interprets. Every modern computer is a physical realisation of a UTM.

When John von Neumann designed the stored-program computer architecture in 1945, he was essentially building a UTM in hardware — the first machines before that were fixed-purpose; von Neumann's insight was that program instructions and data could live in the same memory, making the machine universal.

Sources: Turing, A. M. (1936). On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 2(42), 230–265. | Sipser, M. (2013). Introduction to the Theory of Computation (3rd ed.). Cengage. | Church, A. (1936). An unsolvable problem of elementary number theory. American Journal of Mathematics, 58(2), 345–363.

💬 DiscussionPowered by GitHub Discussions

📱

Get this course's notes on Telegram!

Free cheat sheets, summaries & practice exercises

Get Notes Free →

38 minLesson 13 of 16

Course Contents(16 lessons)

▾

Chapter 1: Mathematical Foundations

What Is Theory of Computation? Why It Matters20 min

Mathematical Foundations: Sets, Functions, Proofs28 min

Chapter 2: Finite Automata

Deterministic Finite Automata (DFA): States and Transitions35 min

Non-Deterministic Finite Automata (NFA)32 min

NFA to DFA: Subset Construction Algorithm30 min

Chapter 3: Regular Languages

Regular Expressions: Pattern Matching Formalized32 min

Pumping Lemma: Proving Languages Are Not Regular28 min

Closure Properties of Regular Languages25 min

Chapter 4: Context-Free Languages

Context-Free Grammars: Defining Programming Languages35 min

Pushdown Automata: Adding a Stack to Finite Automata32 min

CNF and the CYK Parsing Algorithm30 min

Pumping Lemma for Context-Free Languages25 min

Chapter 5: Computability

Turing Machines: The Universal Computer38 min

Decidability, Undecidability, and the Halting Problem32 min

Chapter 6: Computational Complexity

P vs NP: The Million-Dollar Question35 min

NP-Completeness and Polynomial Reductions38 min