AiTechWorlds
AiTechWorlds
Consider a calculator that must correctly evaluate ((3 + 4) * (2 + 1)). It needs to track nested parentheses: every open parenthesis must eventually be matched by a close parenthesis, and nesting can go arbitrarily deep. A regular finite automaton — a DFA — has no way to count or track depth. It can only remember which state it is in, and the number of states is fixed at design time. The moment a language requires unbounded counting or matching, DFAs fail. The solution is to add a stack: an unlimited last-in, first-out memory that a machine can push to and pop from. A Pushdown Automaton (PDA) is exactly a nondeterministic finite automaton augmented with a stack, and it recognises precisely the class of context-free languages.
To accept 0ⁿ1ⁿ, a machine must count how many 0s appeared, then verify that exactly that many 1s follow. A DFA has only finitely many states. By the Pigeonhole Principle, if you feed more 0s than there are states, two different counts must map to the same state — the machine has lost track of the count. More formally, the Pumping Lemma for Regular Languages proves {0ⁿ1ⁿ} is not regular. A stack solves this instantly: push a marker for every 0, then pop one marker for every 1, and accept if the stack is empty at the end.
A pushdown automaton is a 7-tuple P = (Q, Σ, Γ, δ, q₀, Z₀, F) where:
The subscript ε in Σ_ε and Γ_ε means the symbol can optionally be ε, denoting "read nothing" or "do not touch the stack top."
| Operation | Meaning | Transition Notation |
|---|---|---|
| Push symbol A | Place A on top of stack | Pop ε, push A (net: stack grows by A) |
| Pop symbol A | Remove A from top | Read A from top, push ε |
| Replace A with BC | Swap top | Read A, push BC |
| Read without popping | Peek at top | Read A, push A back |
Stack trace for input 0011:
| Step | State | Remaining Input | Stack (top → bottom) |
|---|---|---|---|
| Start | q₀ | 0011 | Z₀ |
| Read 0, push 0 | q₀ | 011 | 0 Z₀ |
| Read 0, push 0 | q₀ | 11 | 0 0 Z₀ |
| ε-transition | q₁ | 11 | 0 0 Z₀ |
| Read 1, pop 0 | q₁ | 1 | 0 Z₀ |
| Read 1, pop 0 | q₁ | `` | Z₀ |
| Stack = Z₀ only | q_acc | `` | Z₀ — accept |
A PDA can accept in two equivalent ways:
These two modes are computationally equivalent — any PDA using one mode can be converted to an equivalent PDA using the other. Most textbook definitions use accept by final state.
| Property | DPDA | NPDA |
|---|---|---|
| At most one transition per (state, input, stack-top) | Yes | No |
| Language class recognised | Deterministic CFLs (DCFL) | All CFLs |
| Closed under complement | Yes | No |
| Real-world use | LL(k), LR(k) parsers | Theoretical analysis |
This is a crucial distinction: Deterministic CFLs ⊂ All CFLs (strictly). There are context-free languages that no DPDA can recognise — for example, {ww^R | w ∈ {0,1}*} (palindromes) requires nondeterminism to guess the midpoint. However, all programming language grammars in practice are designed to be deterministic, because parsers must run efficiently in linear time without backtracking.
Theorem (Sipser, 2013, Theorem 2.20): A language is context-free if and only if some pushdown automaton recognises it.
This theorem has two directions:
The two formalisms — grammars and automata — are two faces of the same class of languages.
| Model | Memory | Language Class | Example Language | Parser Type |
|---|---|---|---|---|
| DFA | Finite states only | Regular | Regex, tokens | Lexer (e.g., flex) |
| DPDA | Stack (deterministic) | Deterministic CFL | Most programming languages | LL(k), LALR(1) |
| NPDA | Stack (nondeterministic) | All CFLs | Ambiguous grammars | Earley, GLR |
| Turing Machine | Infinite tape | All decidable | Type checking, semantics | Semantic analyser |
Top-down parsing (LL parsers) simulates a PDA that pushes grammar rules onto the stack and matches terminals against input from left to right. Python's parser (cpython/Parser/) is an LL(1) parser — at each step it looks ahead exactly one token to decide which production to use.
Bottom-up parsing (LR parsers, LALR parsers) is used by most C and Java compilers. The LR parser uses a stack of states (rather than grammar symbols) and reduces input segments to variables when a complete rule right-hand side has been consumed.
Every time your IDE underlines a mismatched bracket, misplaced keyword, or unclosed string literal in red — a PDA-equivalent parser has rejected your input. The stack in the parser literally tracks the nesting depth of braces, parentheses, function calls, and block structures. The PDA's stack is not a metaphor: in recursive-descent parsers, the call stack of the parser function is the stack of the automaton.
Understanding PDAs means understanding why certain language features are easy for compilers (balanced parentheses — trivially handled by a stack) while others are hard (matching variable names across a scope — requires something beyond a CFG, handled by a symbol table in a later compiler phase).
Sources: Sipser, M. (2013). Introduction to the Theory of Computation (3rd ed.). Cengage. | Hopcroft, J. E., Motwani, R., & Ullman, J. D. (2006). Introduction to Automata Theory, Languages, and Computation (3rd ed.). Pearson. | Aho, A. V., Lam, M. S., Sethi, R., & Ullman, J. D. (2006). Compilers: Principles, Techniques, and Tools (2nd ed.). Pearson.
Get this course's notes on Telegram!
Free cheat sheets, summaries & practice exercises