A chemist tells you the temperature of absolute zero: -273.15°C. A census bureau records the world population: 7,888,000,000. A game engine stores a pixel color intensity: 127.

Your computer stores all three of those numbers in the same fundamental way: as patterns of 1s and 0s in a fixed number of bits. But a negative decimal fraction, an integer in the billions, and a small whole number each need different rules for how those bits are arranged. Get the rules wrong and -273.15 becomes garbage, 7.888 billion wraps around to an unrelated smaller number in a 32-bit field, and the pixel flickers to the wrong color.

This lesson is about those rules.

Why Data Representation Rules Exist

Hardware doesn't know what a "number" means. A CPU register is just 64 switches. The meaning — is this a temperature? a memory address? a character? — comes entirely from the instructions operating on those bits and the convention the programmer agreed to follow.

"Data has no intrinsic meaning. A sequence of bits means whatever the program says it means."

This is why type systems in programming languages exist: to enforce the correct interpretation of bit patterns.
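As a rough illustration of the point above, the sketch below reads the same four bytes under four different conventions using Python's standard `struct` module. The byte values and variable names are arbitrary choices made for this example, not something the lesson prescribes.

```python
import struct

# Four arbitrary bytes; nothing about them says what they "are".
raw = bytes([0x42, 0x4F, 0x4F, 0x4D])

as_uint32 = struct.unpack(">I", raw)[0]   # big-endian unsigned 32-bit integer
as_int32 = struct.unpack(">i", raw)[0]    # big-endian signed 32-bit integer
as_float32 = struct.unpack(">f", raw)[0]  # big-endian IEEE 754 single precision
as_text = raw.decode("ascii")             # four 7-bit ASCII characters

print(as_uint32)   # 1112493901
print(as_int32)    # 1112493901 (same, because the top bit is 0)
print(as_float32)  # a value a little under 52
print(as_text)     # BOOM
```

The bits never change; only the format string (the convention) does.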

Unsigned Integers

Unsigned integers store only non-negative whole numbers (0 and above). With n bits, you can store values from 0 to 2ⁿ − 1.

Width	Maximum	Example Use
8-bit	255	Pixel color channel (R, G, B each 0–255)
16-bit	65,535	Port numbers (0–65535), Unicode BMP
32-bit	4,294,967,295 (~4.3 billion)	IPv4 addresses, array sizes
64-bit	18,446,744,073,709,551,615 (~1.8×10¹⁹)	File sizes, memory addresses

A 32-bit unsigned integer was once enough for memory addresses, until RAM exceeded 4 GB (2³² bytes). That limit is a major reason 64-bit computing became mainstream in the mid-2000s.
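The 2ⁿ − 1 limit and what happens past it can be checked with a short sketch. Python integers never overflow on their own, so the helper `wrap_unsigned` (a name invented here) imitates a fixed-width register by masking off the high bits.

```python
def unsigned_max(bits: int) -> int:
    """Largest value an unsigned integer of the given width can hold."""
    return (1 << bits) - 1


def wrap_unsigned(value: int, bits: int) -> int:
    """Keep only the low `bits` bits, as a fixed-width register would."""
    return value & unsigned_max(bits)


for bits in (8, 16, 32, 64):
    print(bits, unsigned_max(bits))

# The world population from the introduction does not fit in 32 bits.
print(wrap_unsigned(7_888_000_000, 32))  # 3593032704
# One past the 8-bit maximum wraps back to zero.
print(wrap_unsigned(255 + 1, 8))         # 0
```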

Signed Integers: How Negatives Work

Sign-Magnitude (Rejected)

The naive approach: use the leftmost bit as a plus/minus sign, the rest as magnitude. Problems:

+0 (00000000) and -0 (10000000) are both "zero" — two representations of the same value
Addition hardware must check the sign bit and use different logic for negative numbers

One's Complement (Mostly Rejected)

Negate by flipping all bits. Still has the two-zeros problem.
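To make the two-zeros problem concrete, here is a small sketch that decodes 8-bit patterns under both rejected schemes. The function names are made up for this example.

```python
MASK = 0xFF  # 8-bit patterns only


def sign_magnitude_value(pattern: int) -> int:
    """Top bit is the sign, low 7 bits are the magnitude."""
    magnitude = pattern & 0x7F
    return -magnitude if pattern & 0x80 else magnitude


def ones_complement_value(pattern: int) -> int:
    """A set top bit means: flip all bits to get the magnitude."""
    if pattern & 0x80:
        return -(~pattern & MASK)
    return pattern


# Two different bit patterns decode to zero in each scheme.
print(sign_magnitude_value(0b00000000), sign_magnitude_value(0b10000000))    # 0 0
print(ones_complement_value(0b00000000), ones_complement_value(0b11111111))  # 0 0
print(ones_complement_value(0b11111110))                                     # -1
```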

Two's Complement (The Universal Standard)

All modern CPUs use two's complement for signed integers. To negate: flip all bits, add 1. This elegantly eliminates the double-zero problem and makes addition hardware work for both positive and negative numbers without modification.
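The "flip all bits, add 1" rule can be tried on 8-bit patterns. This is only a sketch: `negate` and `add` are illustrative helpers that mask results to 8 bits to mimic fixed-width hardware.

```python
BITS = 8
MASK = (1 << BITS) - 1  # 0b11111111


def negate(pattern: int) -> int:
    """Two's complement negation: flip every bit, add 1, keep 8 bits."""
    return ((pattern ^ MASK) + 1) & MASK


def add(a: int, b: int) -> int:
    """Plain unsigned addition, discarding any carry out of bit 7."""
    return (a + b) & MASK


five = 0b00000101
minus_five = negate(five)
print(f"{minus_five:08b}")          # 11111011
print(add(minus_five, five))        # 0, because -5 + 5 = 0
print(add(negate(3), 7))            # 4, because -3 + 7 = 4
print(negate(0))                    # 0, so there is only one zero
```

Note that `add` has no idea whether its inputs are signed; the same adder works for both.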

With n bits, two's complement stores −2^(n−1) to 2^(n−1) − 1:

8-bit: −128 to +127
16-bit: −32,768 to +32,767
32-bit: −2,147,483,648 to +2,147,483,647
64-bit: −9,223,372,036,854,775,808 to +9,223,372,036,854,775,807 (about ±9.2×10¹⁸)

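Why is the negative range one larger than the positive? Because zero uses one of the patterns whose sign bit is 0, leaving only 2^(n−1) − 1 patterns for positive values, while all 2^(n−1) patterns with the sign bit set are negative. The bit pattern 10000000 in 8-bit two's complement is −128, not +128.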
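A quick way to check the ranges is to decode patterns directly: in two's complement the top bit carries a weight of −2^(n−1) and every other bit keeps its usual positive weight. The helper names below are assumptions of this sketch.

```python
def to_signed(pattern: int, bits: int) -> int:
    """Interpret an n-bit pattern as a two's complement value."""
    sign_bit = 1 << (bits - 1)
    # Low bits count positively; the top bit counts as -2^(n-1).
    return (pattern & (sign_bit - 1)) - (pattern & sign_bit)


def signed_range(bits: int) -> tuple[int, int]:
    """Smallest and largest values of an n-bit signed integer."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


print(to_signed(0b10000000, 8))  # -128
print(to_signed(0b11111111, 8))  # -1
print(to_signed(0b01111111, 8))  # 127
print(signed_range(32))          # (-2147483648, 2147483647)
```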

Character Encoding

ASCII (1963)

American Standard Code for Information Interchange maps 128 characters to 7-bit codes (0–127):

Digits: '0' = 48, '9' = 57
Uppercase: 'A' = 65, 'Z' = 90
Lowercase: 'a' = 97, 'z' = 122
Space = 32, Newline = 10

Notice: lowercase letters are exactly 32 higher than uppercase. To toggle case, flip bit 5 (value 32).
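The bit-5 trick is easy to try with Python's built-in `ord` and `chr`. `toggle_case` is a hypothetical helper written for this example; it deliberately leaves non-letters alone, since flipping bit 5 of a digit or punctuation mark would produce an unrelated character.

```python
def toggle_case(ch: str) -> str:
    """Flip bit 5 (value 32) of an ASCII letter; leave anything else as is."""
    if ch.isascii() and ch.isalpha():
        return chr(ord(ch) ^ 0b0010_0000)
    return ch


print(ord("A"), ord("a"), ord("a") - ord("A"))           # 65 97 32
print(toggle_case("A"), toggle_case("z"))                # a Z
print("".join(toggle_case(c) for c in "Hello, World"))   # hELLO, wORLD
```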

Unicode and UTF-8 (1991–Present)

ASCII covers only English. Unicode defines a code space of 1,114,112 code points (U+0000 to U+10FFFF), only part of which is assigned so far, covering the world's major writing systems along with emoji, ancient scripts, and mathematical symbols.

UTF-8 is the dominant encoding. It is variable-width, using 1 to 4 bytes per code point:

U+0000 to U+007F (basic ASCII): 1 byte — backward compatible
U+0080 to U+07FF (Latin, Greek, Arabic, Hebrew): 2 bytes
U+0800 to U+FFFF (Chinese, Japanese, Korean): 3 bytes
U+10000 to U+10FFFF (emoji, rare scripts): 4 bytes
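The four length classes can be observed with `str.encode`. The sample characters below are chosen for this sketch, one from each range, and are written as escapes so the snippet does not depend on the file's own encoding.

```python
samples = [
    "A",           # U+0041, basic ASCII
    "\u00e9",      # U+00E9, Latin small letter e with acute
    "\u4e2d",      # U+4E2D, a common CJK ideograph
    "\U0001F600",  # U+1F600, grinning face emoji
]


def utf8_length(ch: str) -> int:
    """Number of bytes UTF-8 needs for a single code point."""
    return len(ch.encode("utf-8"))


for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}: {len(encoded)} byte(s), hex {encoded.hex()}")
```

The first sample encodes to the single byte 0x41, identical to its ASCII code, which is what backward compatibility means in practice.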

As of 2024, UTF-8 is used by 98.2% of websites (W3Techs data).

Floating-Point Numbers: IEEE 754

Integers can't represent 3.14159 or 6.022×10²³. For real-number approximations, computers use floating-point, standardized by the IEEE in 1985 in IEEE 754 — one of the most influential standards in computing history.

The Concept: Scientific Notation in Binary

Just as 6.022 × 10²³ separates the significant digits from the scale, IEEE 754 separates:

Sign: positive or negative
Mantissa (significand): the significant bits
Exponent: the scale (power of 2)

32-bit Single Precision Float

Bit 31    Bits 30–23     Bits 22–0
  S    |   EEEEEEEE  |  MMMMMMMMMMMMMMMMMMMMMMM
Sign     8-bit exponent   23-bit mantissa

1 sign bit: 0 = positive, 1 = negative
8 exponent bits: stored with a bias of 127 (actual exponent = stored value − 127)
23 mantissa bits: the fractional part (there's an implicit leading 1)

Range: approximately ±1.18×10⁻³⁸ to ±3.4×10³⁸
Precision: ~7 significant decimal digits
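The three fields can be pulled out of a real single-precision value with `struct` and a few shifts. This is a sketch under stated assumptions: `float32_fields` and `rebuild_normal` are names invented here, and the rebuild formula covers only normal numbers (stored exponent 1 to 254), not zeros, subnormals, infinities, or NaN.

```python
import struct


def float32_fields(x: float) -> tuple[int, int, int]:
    """Return (sign, stored exponent, mantissa) of x rounded to float32."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                      # bit 31
    stored_exponent = (bits >> 23) & 0xFF  # bits 30-23
    mantissa = bits & 0x7FFFFF             # bits 22-0
    return sign, stored_exponent, mantissa


def rebuild_normal(sign: int, stored_exponent: int, mantissa: int) -> float:
    """Recompute the value: (-1)^sign * 1.mantissa * 2^(stored - 127)."""
    return (-1) ** sign * (1 + mantissa / 2**23) * 2.0 ** (stored_exponent - 127)


print(float32_fields(1.0))    # (0, 127, 0)
print(float32_fields(-6.5))   # (1, 129, 5242880), i.e. -1.625 * 2^2
print(rebuild_normal(*float32_fields(-6.5)))  # -6.5
```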

64-bit Double Precision Float

1 sign bit
11 exponent bits (bias 1023)
52 mantissa bits

Range: approximately ±2.2×10⁻³⁰⁸ to ±1.8×10³⁰⁸
Precision: ~15–17 significant decimal digits
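On a typical CPython build, where `float` is an IEEE 754 double (an assumption that holds on mainstream platforms), the standard `sys.float_info` structure reports these limits directly. The sketch also shows the first integer a double cannot hold.

```python
import sys

info = sys.float_info
print(info.mant_dig)  # 53 significant bits: 52 stored plus the implicit 1
print(info.dig)       # 15 decimal digits always survive a round trip
print(info.max)       # largest finite double, about 1.8e308
print(info.min)       # smallest positive normal double, about 2.2e-308

# Past 2^53, consecutive integers are no longer all representable.
big = float(2**53)
print(big + 1.0 == big)  # True: 2^53 + 1 rounds back to 2^53
```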

Most programming languages use 64-bit doubles by default (double in C/Java, float in Python).

IEEE 754 Bit Layout Diagram

Special IEEE 754 Values

IEEE 754 reserves specific bit patterns for special cases:

Value	Description	Example Trigger
+Infinity	Overflow positive	`1.0 / 0.0`
-Infinity	Overflow negative	`-1.0 / 0.0`
NaN	Not a Number	`0.0 / 0.0`, `sqrt(-1)`
+0	Positive zero	`0.0`
-0	Negative zero	`-0.0` (mathematically equal to +0)
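These values can be explored in Python, with one caveat: the division triggers in the table behave that way in languages such as C or JavaScript, whereas Python raises `ZeroDivisionError` for float division by zero. The sketch below therefore takes the special values from the `math` module or produces them by overflow.

```python
import math

overflow = 1e308 * 10          # too large for a double, becomes +Infinity
print(overflow == math.inf)    # True
print(-overflow == -math.inf)  # True

not_a_number = math.inf - math.inf   # undefined result, becomes NaN
print(math.isnan(not_a_number))      # True
print(not_a_number == not_a_number)  # False: NaN is not equal to itself

print(0.0 == -0.0)                # True: the two zeros compare equal
print(math.copysign(1.0, -0.0))   # -1.0: but the sign bit is still there
```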

Why 0.1 + 0.2 ≠ 0.3

This is perhaps the most famous floating-point surprise:

>>> 0.1 + 0.2
0.30000000000000004

Why? Because 0.1 cannot be represented exactly in binary floating-point — just as 1/3 cannot be represented exactly in decimal.

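0.1 in binary is a repeating pattern: 0.0001100110011001100110011... (infinite). A 64-bit double rounds this to 53 significant bits (52 stored mantissa bits plus the implicit leading 1), introducing a tiny rounding error. The same happens to 0.2, and the sum is rounded once more, so the result differs slightly from the double closest to 0.3.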
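The standard `decimal` and `fractions` modules can reveal the exact value a double stores when you write 0.1. This sketch assumes Python's `float` is an IEEE 754 double, as on mainstream platforms.

```python
from decimal import Decimal
from fractions import Fraction

# Converting the float (not the string "0.1") exposes its exact stored value.
exact_tenth = Decimal(0.1)
print(exact_tenth)
# 0.1000000000000000055511151231257827021181583404541015625

# The same value as a ratio of integers; the denominator is 2**55.
print(Fraction(0.1))  # 3602879701896397/36028797018963968

# Hex form shows the repeating 1001 pattern, rounded in the last digit.
print((0.1).hex())    # 0x1.999999999999ap-4

print(0.1 + 0.2 == 0.3)  # False
```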

The rule: Never rely on exact floating-point equality (==); compare with a tolerance instead. For financial or safety-critical calculations, use integer arithmetic (store prices in cents, not dollars) or dedicated decimal libraries.
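A minimal sketch of the alternatives, using only the standard library. The prices are made-up sample values, and `math.isclose` is shown with its default relative tolerance; a real application would pick a tolerance suited to its data.

```python
import math
from decimal import Decimal

# Exact comparison fails, tolerance-based comparison succeeds.
print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True

# Option 1: integer cents. Integer addition is exact.
total_cents = 10 + 20
print(total_cents == 30)             # True

# Option 2: decimal arithmetic. Build Decimals from strings, not floats,
# so the inexact binary value never enters the calculation.
total = Decimal("0.10") + Decimal("0.20")
print(total == Decimal("0.30"))      # True
```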

Complete Data Type Reference Table

Data Type	Bits	Range / Precision	IEEE 754?	Common Use
uint8	8	0 to 255	No	Pixel color, byte
int8	8	−128 to +127	No	Small signed values
uint32	32	0 to ~4.3 billion	No	Array index, IPv4 address
int32	32	−2.1B to +2.1B	No	General integers (Java `int`)
int64	64	−9.2×10¹⁸ to +9.2×10¹⁸	No	File offsets, timestamps
float32	32	±1.18×10⁻³⁸ to ±3.4×10³⁸, ~7 digits	Yes (single)	Game graphics, ML weights
float64	64	±2.2×10⁻³⁰⁸ to ±1.8×10³⁰⁸, ~15–17 digits	Yes (double)	Scientific computing
char (ASCII)	7 (stored in 8)	128 characters	No	English text
char (UTF-8)	8–32	1.1M code points	No	International text

Key Takeaways

Unsigned integers store 0 to 2ⁿ−1; signed integers (two's complement) store −2^(n−1) to 2^(n−1)−1
Two's complement is the universal standard for signed integers — negation = flip bits + add 1
ASCII maps 128 characters to 7 bits; UTF-8 extends this to 1.1 million Unicode code points with variable-width encoding
IEEE 754 (1985) standardizes floating-point: sign + biased exponent + mantissa
A 32-bit float gives ~7 decimal digits of precision; a 64-bit double gives ~15–17
Special values (NaN, ±Infinity, ±0) are encoded in reserved exponent patterns
0.1 + 0.2 ≠ 0.3 because most decimal fractions cannot be represented exactly in binary, so use integer arithmetic or a decimal library for money

