AiTechWorlds
AiTechWorlds
Right now, hidden inside your browser's JavaScript engine, form validator, and URL parser, there are machines with no RAM, no heap, no call stack — just a finite collection of states and a rulebook for switching between them.
When you type your phone number into a sign-up form and the field turns green after exactly ten digits, a finite automaton checked it. When grep searches a million-line log file for all lines matching a pattern, it compiles your pattern into a DFA and sweeps through the file in a single linear pass — never backtracking, never revisiting a character.
The DFA is the most restricted model of computation we will study. It has no writable memory beyond a single "current state." It reads its input exactly once, left to right. Yet it is fast (O(n) for input length n), predictable, and the theoretical foundation of all regular expression engines.
Understanding the DFA is not just an academic exercise. It is understanding what every grep, every lexer, and every validation library is actually doing.
A Deterministic Finite Automaton (DFA) is a 5-tuple:
M = (Q, Σ, δ, q₀, F)
Each component has a precise meaning:
| DFA Component | Symbol | Meaning | Example |
|---|---|---|---|
| States | Q | Finite set of all possible "situations" the machine can be in | {q₀, q₁, q₂} |
| Alphabet | Σ | Finite set of input symbols the machine recognizes | {0, 1} |
| Transition function | δ | Rules: given current state + input symbol → next state | δ(q₀, 1) = q₁ |
| Start state | q₀ | The state the machine begins in | q₀ ∈ Q |
| Accept states | F | States where the machine says "yes, I accept this input" | F ⊆ Q |
The word deterministic means: for every state q ∈ Q and every symbol a ∈ Σ, there is exactly one next state δ(q, a). No ambiguity. No choices. Given the same input, the machine always follows the same path.
We extend δ to strings: δ*(q, w) means "the state reached starting from q after reading string w."
δ*(q, ε) = q (reading the empty string leaves you where you are)δ*(q, wa) = δ(δ*(q, w), a) (process all of w first, then the last symbol a)DFA M accepts string w if and only if δ*(q₀, w) ∈ F.
That is: start at q₀, follow transitions for each character of w, and end in an accept state.
The language recognized by M is: L(M) = {w ∈ Σ* | δ*(q₀, w) ∈ F} — the set of all strings M accepts.
Problem: Build a DFA over Σ = {0, 1} that accepts all binary strings containing an even number of 1s. (Zero counts as even.)
Design thinking: The machine only needs to track parity — whether the count of 1s seen so far is even or odd. That requires exactly two states.
q₀ (EVEN): we have seen an even number of 1s so farq₁ (ODD): we have seen an odd number of 1s so farTransitions:
0 does not change parity: stay in current state1 flips parity: switch statesFormal definition:
Transition table:
| State | On input 0 | On input 1 |
|---|---|---|
| q₀ (even) | q₀ | q₁ |
| q₁ (odd) | q₁ | q₀ |
Trace on "1101":
Start: q₀
Read '1' → q₁
Read '1' → q₀
Read '0' → q₀
Read '1' → q₁
End state: q₁ ∉ F → REJECT (three 1s = odd)
Trace on "110":
Start: q₀
Read '1' → q₁
Read '1' → q₀
Read '0' → q₀
End state: q₀ ∈ F → ACCEPT (two 1s = even)
Problem: Build a DFA over Σ = {0, 1} that accepts all binary strings ending in "01".
Design thinking: To know if we end in "01", we need to track the last two characters. But we have to be careful — we only need to track the relevant suffix, not the entire string.
q₀ — neither condition holds (start; or last char was 1 after wrong prefix)q₁ — last character was 0 (we might be starting the "01" ending)q₂ — last two characters were "01" ✓ (accept state)Transitions:
q₀: reading 0 → go to q₁; reading 1 → stay in q₀q₁: reading 0 → stay in q₁ (two consecutive 0s, last was 0); reading 1 → go to q₂q₂: reading 0 → go to q₁ (new possible start); reading 1 → go to q₀ (the "01" is broken by a second 1)| State | Meaning | On '0' | On '1' |
|---|---|---|---|
| q₀ | Last char was 1 (or start) | q₁ | q₀ |
| q₁ | Last char was 0 | q₁ | q₂ |
| q₂ | Last two chars were "01" | q₁ | q₀ |
Trace on "0101":
Start: q₀
Read '0' → q₁
Read '1' → q₂ ← accept state
Read '0' → q₁
Read '1' → q₂ ← accept state
End state: q₂ ∈ F → ACCEPT ✓
Sometimes the transition function requires a state with no path to acceptance. A dead state (or trap state) is a non-accepting state where every transition loops back to itself — once you enter it, you can never reach an accept state.
Dead states are necessary to make δ total (defined for every state-symbol pair). In diagrams, dead states are sometimes omitted for clarity (implied by missing transitions), but in a complete DFA they must be present.
The class of languages recognized by DFAs is called the class of regular languages.
A language L is regular if and only if some DFA M exists such that L(M) = L.
This is not just a definition — it is an equivalence. In Lesson 8, we will see that regular languages are exactly the languages described by regular expressions. They are also exactly the languages recognized by NFAs. Three completely different-looking definitions describe the same class.
Two strings x and y are Myhill-Nerode equivalent (written x ~_L y) with respect to language L if: for every string z, xz ∈ L ↔ yz ∈ L. That is, no future input can distinguish between the two strings.
The Myhill-Nerode Theorem states: the minimum number of states in any DFA for L equals the number of equivalence classes of ~_L.
In practice, this means:
Real-world application: Google's RE2 library — used in Go's regexp package, Rust's regex crate, and many other systems — uses DFA minimization to guarantee O(n) matching time with no worst-case exponential blowup. This is why RE2 deliberately excludes backreferences: they break the DFA model.
A DFA is trivially implemented in any programming language:
# DFA: accepts binary strings with even number of 1s
transitions = {
('even', '0'): 'even',
('even', '1'): 'odd',
('odd', '0'): 'odd',
('odd', '1'): 'even',
}
start = 'even'
accept = {'even'}
def dfa_accepts(string):
state = start
for ch in string:
state = transitions[(state, ch)]
return state in accept
print(dfa_accepts("110")) # True (two 1s)
print(dfa_accepts("1101")) # False (three 1s)
Every re.compile() call in Python, every Pattern.compile() in Java, every /regex/ in JavaScript is doing something similar — but automatically constructing the DFA from your regex pattern.
The DFA is defined by its totality (δ is defined for every state-symbol pair) and its determinism (exactly one next state per input). These constraints make it simple to implement and analyze.
A DFA can only "remember" which state it is currently in — nothing else. It has no stack, no heap, no variables. This finite memory is its fundamental limitation. As we will see with the Pumping Lemma, there are simple languages — like {0ⁿ1ⁿ | n ≥ 0} — that no DFA can recognize, precisely because recognizing them requires counting, and counting requires memory that grows without bound.
Understanding what DFAs cannot do is just as important as understanding what they can. That understanding begins in Lesson 4, with Non-deterministic Finite Automata, and culminates in Lesson 7, with the Pumping Lemma.
Get this course's notes on Telegram!
Free cheat sheets, summaries & practice exercises