AiTechWorlds
AiTechWorlds
Imagine a nightclub bouncer with a very bad memory. He can remember his current "mood" — but only one of a fixed number of moods. He has no notepad, no phone, no way to count.
Some rules he can enforce: "No one under 5'6"." "No sneakers." These are rules about the current state of the person in front of him — they require only finite memory.
But some rules he cannot enforce: "The number of people inside must always equal the number of people who entered." To enforce that, he would need to count — and counting requires memory that grows without bound. His finite moods are not enough.
A DFA is that bouncer. It can only remember which state it is currently in. Languages that require counting — like matching equal numbers of 0s and 1s — are beyond its ability. But how do we prove that no DFA can recognize such a language?
That proof tool is the Pumping Lemma.
Recall the Pigeonhole Principle: if you have more pigeons than holes, some hole contains more than one pigeon.
Apply this to a DFA with p states processing a string w with |w| ≥ p:
When a state is revisited, the DFA has gone around a loop. The substring read during that loop is a string y. If we repeat y any number of times (pump it), the DFA still goes around the same loop that many times — and still ends in the same final state.
If the original string was accepted, then pumping y (any number of times) must also produce an accepted string. This is the pumping property.
Pumping Lemma for Regular Languages:
If L is a regular language, then there exists a pumping length p ≥ 1 such that for every string w ∈ L with |w| ≥ p:
w can be split as w = xyz satisfying all three conditions:
- |y| ≥ 1 (y is non-empty — there is an actual loop)
- |xy| ≤ p (the loop occurs within the first p characters)
- For all k ≥ 0: xy^k z ∈ L (pumping y any number of times stays in L)
Note on y^k:
The pumping length p is the number of states in the minimum DFA for L. Any string at least as long as the number of states must trigger a loop.
The Pumping Lemma is a statement about regular languages. To use it to prove L is NOT regular, we use proof by contradiction:
Step 1: Assume L is regular. Then the Pumping Lemma guarantees a pumping length p.
Step 2: You choose a specific string w ∈ L with |w| ≥ p. Choose strategically — pick a string that will be hard to pump.
Step 3: Argue about all possible splits w = xyz satisfying |y| ≥ 1 and |xy| ≤ p. Show that for every such split, there exists some k ≥ 0 where xy^k z ∉ L.
Step 4: This contradicts condition 3 of the Pumping Lemma. Therefore, L is not regular.
Common mistake: Choosing a specific split and showing it fails. You must show all possible splits fail. You get to choose w; the Pumping Lemma chooses the split (adversarially). You must defeat every possible adversarial split.
Language: L = {0ⁿ1ⁿ | n ≥ 0} = {ε, 01, 0011, 000111, ...}
Claim: L is not regular.
Proof:
Assume L is regular with pumping length p.
Choose w = 0^p 1^p (p zeros followed by p ones). Note |w| = 2p ≥ p, and w ∈ L.
By the Pumping Lemma, w = xyz where |y| ≥ 1 and |xy| ≤ p.
Since |xy| ≤ p, the string xy lies entirely within the first p characters of w. The first p characters of w are all 0s. Therefore:
Now pump: let k = 0. Then xy⁰z = xz = 0^{p-b} 1^p.
Since b ≥ 1, we have p − b < p. So xy⁰z has fewer 0s than 1s. Therefore xy⁰z ∉ L.
This contradicts condition 3 (pumping must stay in L). Therefore L is not regular. □
Language: L = {ww | w ∈ {0,1}*} = {ε, 00, 11, 0101, 1010, 001001, ...}
Claim: L is not regular.
Proof:
Assume L is regular with pumping length p.
Choose w = 0^p 1 0^p 1 (this is in L because it equals ww where w = 0^p 1).
|w| = 2p + 2 ≥ p. ✓
By the Pumping Lemma, w = xyz where |y| ≥ 1 and |xy| ≤ p.
Since |xy| ≤ p and the string starts with p zeros, xy lies entirely within the leading 0s:
Pump with k = 2: xy²z = x(yy)z = 0^{p+b} 1 0^p 1.
For this to be in L = {ww}, it must equal some w'w' — two equal halves. The total length is 2p + b + 2. Since b is odd or even, this length might be even or odd. But crucially:
The two halves must be equal, meaning the string must be symmetric around its midpoint. The midpoint of 0^{p+b} 1 0^p 1 splits the string asymmetrically because p+b ≠ p. Therefore xy²z ∉ L.
This contradicts condition 3. Therefore L is not regular. □
Language: L = {1^n | n is prime}
Claim: L is not regular.
Proof:
Assume L is regular with pumping length p.
Let m be any prime with m ≥ p + 2. Choose w = 1^m ∈ L.
By the Pumping Lemma: w = xyz, |y| = b ≥ 1, |x| + |y| ≤ p, so |x| = a and |z| = m − a − b.
Consider pumping with k = m − b (where b = |y|): xy^{m-b}z has length:
a + b(m-b) + (m-a-b)
= a + bm - b² + m - a - b
= bm - b² + m - b
= m(b+1) - b(b+1)
= (m-b)(b+1)
Since b ≥ 1 and m ≥ p + 2 > 2, we have b+1 ≥ 2 and m − b ≥ 2. Therefore (m−b)(b+1) is composite (a product of two factors, both ≥ 2). So xy^{m-b}z ∉ L.
Contradiction. L is not regular. □
| Mistake | Why It's Wrong | Correction |
|---|---|---|
| Choosing a fixed split for w = xyz | You don't get to choose the split — only the string w | Argue over ALL possible splits satisfying the lemma's conditions |
| Showing one k fails to pump | You need to find one k that fails for every split | For each split, exhibit a failing k |
| Choosing a string shorter than p | You don't know what p is — but p ≥ 1, so use p in your string | Choose w in terms of p: w = 0^p 1^p, etc. |
| Using Pumping Lemma to prove regularity | The lemma only gives a necessary condition, not sufficient | Pumping Lemma can only disprove regularity |
The Pumping Lemma is a necessary condition for regularity, not a sufficient one. There exist non-regular languages that also satisfy the pumping lemma. If a language satisfies the pumping property, you cannot conclude it is regular — only that you have not yet found a contradiction.
To prove a language regular, construct a DFA/NFA/regex for it. The Pumping Lemma is only useful for proving languages are NOT regular.
| Language | Regular? | Proof Method | Key Insight |
|---|---|---|---|
| {0ⁿ1ⁿ | n ≥ 0} | No | Pumping Lemma | Any loop is in the 0s; removing it unbalances the count |
| {ww | w ∈ {0,1}*} | No | Pumping Lemma | Pumping breaks the mid-string symmetry |
| {1^p | p is prime} | No | Pumping Lemma | Pumped length becomes composite |
| {0,1}* | Yes | DFA with 1 state (accept all) | Trivially regular |
| Binary strings ending in 01 | Yes | 3-state DFA (Lesson 3) | Finite suffix condition |
| Binary strings with even # of 1s | Yes | 2-state DFA (Lesson 3) | Parity is finite memory |
| All valid C programs | No | Pumping Lemma (bracket matching) | Requires counting braces |
The Pumping Lemma is a test. Regular languages must pass it. Many languages fail it — and that failure is your proof that no DFA, no NFA, no regex can ever recognize them.
The boundary between regular and non-regular is the boundary between finite memory and infinite memory. A DFA can remember which state it is in — but states are finite. Anything that requires counting without bound, anything that requires remembering the entire history of the input, anything that requires matching distant parts of the string — none of these are regular.
In Lesson 8, you will see that closure properties give another powerful toolset for the same purpose — sometimes making it possible to prove non-regularity without invoking the Pumping Lemma at all.
Get this course's notes on Telegram!
Free cheat sheets, summaries & practice exercises