In 1900, the mathematician David Hilbert stood before the International Congress of Mathematics in Paris and listed 23 unsolved problems he believed would define the next century of mathematics. One of them — the Entscheidungsproblem ("decision problem") — asked whether mathematics itself could be mechanized: given a formal statement, could a procedure always determine its truth?

To even ask this question precisely required a shared language — the language of sets, functions, and formal logic. Without it, "problem," "procedure," and "truth" meant different things to different people.

This lesson teaches you that shared language. Every theorem in Theory of Computation — the pumping lemma, the undecidability of the halting problem, the equivalence of NFAs and DFAs — is built on these foundations. Skip them and the later results will feel like magic tricks. Learn them and they become inevitable.

Sets: The Atoms of Mathematics

A set is a collection of distinct objects, called elements or members. "Distinct" means no element appears twice. "Collection" means order does not matter.

Notation:

Explicit listing: S = {a, b, c} means S contains exactly a, b, and c
Set-builder: S = {x | x is a positive even integer} (read: "the set of all x such that...")
a ∈ S means "a is an element of S"
a ∉ S means "a is not an element of S"

Special sets every computer scientist must know:

∅ — the empty set, containing no elements
ℕ — the natural numbers {0, 1, 2, 3, ...} (some authors start at 1)
ℤ — the integers {..., -2, -1, 0, 1, 2, ...}
ℝ — the real numbers (all points on a number line)
{0,1} — the binary alphabet, the most common alphabet in TOC

Set Operations

Given sets A and B over a universe U:

Operation	Symbol	Definition	Example (A={1,2,3}, B={2,3,4})
Union	A ∪ B	Elements in A or B (or both)	{1, 2, 3, 4}
Intersection	A ∩ B	Elements in both A and B	{2, 3}
Difference	A − B	Elements in A but not B	{1}
Complement	Ā	Elements in U but not A	Depends on U
Subset	A ⊆ B	Every element of A is in B	{2,3} ⊆ {2,3,4} ✓

The Power Set

The power set of S, written P(S) or 2^S, is the set of all subsets of S.

If S = {a, b}, then P(S) = {∅, {a}, {b}, {a,b}} — four subsets.

The key fact: if |S| = n (S has n elements), then |P(S)| = 2ⁿ.

This explodes quickly. A set with 10 elements has 1,024 subsets. A set with 30 elements has over a billion. This exponential growth is exactly why NFA-to-DFA conversion can produce exponentially many states — each DFA state represents a subset of NFA states.

Countable vs Uncountable Infinity

Not all infinite sets are the same size. Georg Cantor proved this in 1874, shocking the mathematical world.

A set is countably infinite if its elements can be listed in a sequence: a first element, a second, a third, and so on — even if the list never ends. The natural numbers ℕ, integers ℤ, and even rational numbers ℚ are countably infinite.

A set is uncountably infinite if no such listing is possible. The real numbers ℝ are uncountably infinite. Cantor's diagonal argument — a proof technique we will use again with the halting problem — shows this by constructing a real number that cannot appear in any proposed listing.

Why this matters in TOC: the set of all Turing Machines is countably infinite (each can be encoded as a string). The set of all languages (sets of strings) is uncountably infinite. Therefore, most languages cannot be recognized by any Turing Machine. Most problems are unsolvable — not because we lack cleverness, but because there are more problems than machines.

Strings, Alphabets, and Languages

These three definitions are the raw material of every model we will study.

An alphabet Σ (sigma) is a finite, non-empty set of symbols.

Σ = {0, 1} — binary alphabet
Σ = {a, b, ..., z} — lowercase English letters
Σ = {0, 1, ..., 9, +, −, ×, ÷} — arithmetic symbols

A string (or word) over Σ is a finite sequence of symbols from Σ.

"0110" is a string over {0,1}
|w| denotes the length of string w: |"0110"| = 4
ε (epsilon) is the empty string — the unique string of length 0

String operations:

Concatenation: if x = "ab" and y = "cd", then xy = "abcd"
Concatenation with ε: wε = εw = w for any string w
Power notation: a³ = "aaa", w⁰ = ε

Σ* denotes the set of all strings over Σ (including ε). It is always countably infinite.

A language L over Σ is any subset of Σ*: L ⊆ Σ*.

This definition is deliberately broad. A language is not just a natural language like English — it is any set of strings. The set of all valid Python programs is a language. The set of all prime numbers written in binary is a language. The empty set ∅ is a language (containing no strings). The set {ε} is a language (containing only the empty string).

Functions and Relations

A function f: A → B maps each element of A (the domain) to exactly one element of B (the codomain). The set of all actual output values is the range.

Three properties a function may have:

Injective (one-to-one): different inputs produce different outputs. f(a) = f(b) implies a = b.
Surjective (onto): every element of B is the output of at least one input.
Bijective (one-to-one correspondence): both injective and surjective. A bijection proves two sets have the same "size," even if infinite.

Bijections are how we compare infinite sets. Cantor showed that ℕ and ℤ have the same cardinality (size) by constructing a bijection: 0→0, 1→1, 2→−1, 3→2, 4→−2, ... But no bijection exists between ℕ and ℝ.

Four Proof Techniques You Must Master

Every theorem we prove will use one (or a combination) of these four techniques. Learning to recognize which technique a proof uses is as important as following the details.

1. Direct Proof

Assume the hypothesis. Apply definitions and previously established facts. Arrive at the conclusion through a logical chain.

Example: Prove that if n is even, then n² is even. Proof: Since n is even, n = 2k for some integer k. Then n² = (2k)² = 4k² = 2(2k²). Since 2k² is an integer, n² is even. □

2. Proof by Contradiction

Assume the negation of what you want to prove. Derive a logical contradiction (something that is both true and false). Conclude the assumption was wrong.

Example: Prove that √2 is irrational. Proof: Assume √2 = p/q in lowest terms (p, q share no common factor). Then 2 = p²/q², so p² = 2q². Thus p² is even, so p is even: p = 2m. Then 4m² = 2q², so q² = 2m², making q even. But then p and q are both even — contradicting "lowest terms." □

This is the technique used to prove the Halting Problem is undecidable.

3. Proof by Mathematical Induction

To prove a statement P(n) holds for all n ≥ base:

Base case: Prove P(base) is true.
Inductive step: Assume P(k) is true (inductive hypothesis). Prove P(k+1) is true.

Example: Prove that the sum of the first n positive integers equals n(n+1)/2.

Base case (n=1): Sum = 1. Formula gives 1(2)/2 = 1. ✓

Inductive step: Assume sum of first k integers = k(k+1)/2. Then:

Sum of first (k+1) integers
= [sum of first k] + (k+1)
= k(k+1)/2 + (k+1)
= (k+1)[k/2 + 1]
= (k+1)(k+2)/2

This is exactly the formula for n = k+1. By induction, the formula holds for all n ≥ 1. □

Strong induction assumes P(1), P(2), ..., P(k) are all true (not just P(k)) to prove P(k+1). Use this when the proof needs more than one previous case.

4. Proof by Construction (Existence Proof)

Prove something exists by actually building it. Rather than arguing abstractly, you construct a specific example or algorithm.

Example: Prove that for any NFA, an equivalent DFA exists. Proof: Construct the DFA explicitly using the subset construction algorithm (covered in Lesson 5). □

This is the most common proof technique in automata theory — we prove NFAs and DFAs are equivalent by constructing the conversion.

The Pigeonhole Principle

Statement: If more than n objects are placed into n containers, then at least one container holds more than one object.

This sounds obvious, but it is extraordinarily powerful. In automata theory, it is the engine of the Pumping Lemma (Lesson 7):

A DFA with p states cannot distinguish more than p different string histories. Any string longer than p states must cause the machine to revisit a state — creating a loop. That loop can be "pumped" (repeated any number of times). Therefore, any regular language must have this pumping property, and any language without it cannot be regular.

Every time a regex fails to match something you think it should be able to match — say, ensuring balanced parentheses — the pigeonhole principle is the reason why.

Proof Techniques at a Glance

Technique	When to Use	Structure	Example Application in TOC
Direct Proof	Straightforward logical chain	Hypothesis → steps → conclusion	Prove DFA transition function is total
Contradiction	Proving impossibility or uniqueness	Assume ¬P, derive P ∧ ¬P	Halting Problem undecidability
Induction	Statements over all integers ≥ base	Base case + inductive step	Prove DFA with n states needs ≥ n states
Construction	Proving existence	Build the object explicitly	NFA-to-DFA conversion, regex-to-NFA

Building Toward Automata

Everything introduced here is used constantly in the lessons ahead.

Sets are states, alphabets, and languages
Power sets explain why NFA-to-DFA can create exponentially many states
Functions are transition functions of automata
Strings are the inputs automata process
Languages are what automata recognize — the central concept of the field
Induction proves properties of automata that hold for inputs of all lengths
Contradiction proves that certain languages cannot be recognized by any automaton
Construction proves that two models are equivalent by converting one to the other
Pigeonhole is the intuition behind every non-regularity proof

The machinery may seem dry in isolation. But by Lesson 4, you will use every single one of these ideas to build your first DFA — and understand exactly why it works.

Mathematics is not a collection of facts. It is a set of tools. Now you have the tools.

💬 DiscussionPowered by GitHub Discussions

📱

Get this course's notes on Telegram!

Free cheat sheets, summaries & practice exercises

Get Notes Free →

28 minLesson 2 of 16

Course Contents(16 lessons)

▾

Chapter 1: Mathematical Foundations

What Is Theory of Computation? Why It Matters20 min

Mathematical Foundations: Sets, Functions, Proofs28 min

Chapter 2: Finite Automata

Deterministic Finite Automata (DFA): States and Transitions35 min

Non-Deterministic Finite Automata (NFA)32 min

NFA to DFA: Subset Construction Algorithm30 min

Chapter 3: Regular Languages

Regular Expressions: Pattern Matching Formalized32 min

Pumping Lemma: Proving Languages Are Not Regular28 min

Closure Properties of Regular Languages25 min

Chapter 4: Context-Free Languages

Context-Free Grammars: Defining Programming Languages35 min

Pushdown Automata: Adding a Stack to Finite Automata32 min

CNF and the CYK Parsing Algorithm30 min

Pumping Lemma for Context-Free Languages25 min

Chapter 5: Computability

Turing Machines: The Universal Computer38 min

Decidability, Undecidability, and the Halting Problem32 min

Chapter 6: Computational Complexity

P vs NP: The Million-Dollar Question35 min

NP-Completeness and Polynomial Reductions38 min