Database Design & Relationships
Database Design Fundamentals
The schema you design at the start of a project shapes every query you write, every API you build, and every scaling problem you'll face later. Good database design is about thinking through relationships, constraints, and growth before you write a line of code.
Thinking in Tables
Each table represents one thing — one entity. If you catch yourself putting two types of things in the same table, it's a sign to split it.
Common beginner mistake:
-- ❌ Bad: mixing users and their preferences in one table
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255),
name VARCHAR(255),
preferred_theme VARCHAR(50),
email_frequency VARCHAR(50),
timezone VARCHAR(100),
notification_1 BOOLEAN,
notification_2 BOOLEAN
-- ... keeps growing
);
Better:
-- ✅ Good: separate tables for separate concerns
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
name VARCHAR(255) NOT NULL
);
CREATE TABLE user_settings (
user_id INTEGER PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
theme VARCHAR(50) DEFAULT 'light',
timezone VARCHAR(100) DEFAULT 'UTC',
email_frequency VARCHAR(50) DEFAULT 'weekly'
);
CREATE TABLE notification_preferences (
id SERIAL PRIMARY KEY,
user_id INTEGER REFERENCES users(id) ON DELETE CASCADE,
type VARCHAR(100) NOT NULL,
enabled BOOLEAN DEFAULT true,
UNIQUE(user_id, type)
);
Now you can add new notification types without changing the schema.
Primary Keys
Always have a primary key. The choice of key type has long-term consequences:
-- Auto-incrementing integer (simple, fast, sequential)
id SERIAL PRIMARY KEY
-- Cons: exposes record count, predictable, doesn't work across distributed systems
-- UUID (globally unique, safe to expose)
id UUID DEFAULT gen_random_uuid() PRIMARY KEY
-- Cons: larger storage, slower indexes, random order means index fragmentation
-- CUID (collision-resistant, sortable, URL-safe)
-- Use with Prisma: @id @default(cuid())
-- Good balance: readable, non-sequential, URL-safe
-- Natural key (email, username, slug) — only when the value truly uniquely identifies the entity
slug VARCHAR(255) PRIMARY KEY
-- Cons: you can never change it without cascading updates
Use SERIAL or UUID in most cases. Reserve natural keys for lookup tables where the value is truly immutable (ISO country codes, currency codes, etc.).
Relationships
One-to-One: One user has exactly one profile.
CREATE TABLE profiles (
user_id INTEGER PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
bio TEXT,
website VARCHAR(500),
avatar VARCHAR(500)
);
The foreign key is also the primary key — enforces the one-to-one constraint.
One-to-Many: One course has many lessons.
CREATE TABLE lessons (
id SERIAL PRIMARY KEY,
course_id INTEGER NOT NULL REFERENCES courses(id) ON DELETE CASCADE,
title VARCHAR(500) NOT NULL,
order_num INTEGER NOT NULL
);
-- Index the foreign key for fast lookups
CREATE INDEX idx_lessons_course_id ON lessons(course_id);
Many-to-Many: Users can enroll in many courses; courses can have many students.
CREATE TABLE enrollments (
user_id INTEGER REFERENCES users(id) ON DELETE CASCADE,
course_id INTEGER REFERENCES courses(id) ON DELETE CASCADE,
enrolled_at TIMESTAMP DEFAULT NOW(),
PRIMARY KEY (user_id, course_id) -- composite primary key prevents duplicates
);
The junction table (enrollments) can hold additional data about the relationship — when it happened, progress, completion status.
Normalization
Normalization removes data duplication. The goal is one fact in one place:
-- ❌ Denormalized: author name stored in every course
CREATE TABLE courses (
id SERIAL PRIMARY KEY,
title VARCHAR(500),
author_name VARCHAR(255), -- what if the author changes their name?
author_email VARCHAR(255) -- duplicated across every course
);
-- ✅ Normalized: author details stored once in users, referenced by id
CREATE TABLE courses (
id SERIAL PRIMARY KEY,
title VARCHAR(500),
author_id INTEGER REFERENCES users(id)
);
Go as far as "Third Normal Form" (3NF) by default:
- No repeating groups (don't store arrays as
tag1,tag2,tag3columns) - Every non-key column depends only on the primary key
- No transitive dependencies (if city determines country, store city→country separately)
It's okay to selectively denormalize for performance — but do it consciously, not by accident.
Constraints — Let the Database Enforce Your Rules
Don't only validate in application code. The database is the last line of defense:
CREATE TABLE courses (
id SERIAL PRIMARY KEY,
slug VARCHAR(255) UNIQUE NOT NULL,
title VARCHAR(500) NOT NULL,
price DECIMAL(10, 2) NOT NULL DEFAULT 0 CHECK (price >= 0),
status VARCHAR(50) NOT NULL DEFAULT 'draft'
CHECK (status IN ('draft', 'published', 'archived')),
author_id INTEGER NOT NULL REFERENCES users(id),
created_at TIMESTAMP NOT NULL DEFAULT NOW()
);
Constraint types:
NOT NULL— field must have a valueUNIQUE— no two rows can have the same valueCHECK— value must satisfy a conditionREFERENCES(foreign key) — value must exist in another tableDEFAULT— value if not specified
Cascade Behavior
What happens to child records when the parent is deleted?
-- ON DELETE CASCADE: delete all lessons when the course is deleted
course_id INTEGER REFERENCES courses(id) ON DELETE CASCADE
-- ON DELETE SET NULL: set author_id to NULL when the user is deleted
author_id INTEGER REFERENCES users(id) ON DELETE SET NULL
-- ON DELETE RESTRICT (default): prevent deletion if child records exist
user_id INTEGER REFERENCES users(id) ON DELETE RESTRICT
Think through each relationship:
- Enrollments → delete if user or course deleted (CASCADE)
- Course author → keep course if author deleted (SET NULL)
- Order → prevent deleting a user who has orders (RESTRICT)
Timestamps
Always include timestamps:
created_at TIMESTAMP NOT NULL DEFAULT NOW()
updated_at TIMESTAMP NOT NULL DEFAULT NOW()
-- deleted_at for soft deletes (NULL = active, timestamp = deleted)
deleted_at TIMESTAMP
Soft deletes are useful when you need an audit trail — mark records as deleted rather than actually removing them.
Naming Conventions
Consistent naming saves time when writing queries:
Tables: plural snake_case (users, course_lessons, enrollment_events)
Primary keys: id
Foreign keys: {table_singular}_id (user_id, course_id)
Boolean columns: is_ or has_ prefix (is_published, has_certificate)
Timestamps: _at suffix (created_at, updated_at, enrolled_at)
Junction tables: table1_table2 (user_courses, course_tags)
Indexes: idx_{table}_{column} (idx_users_email, idx_courses_slug)
Schema Design Example: Course Platform
-- Users and auth
CREATE TABLE users (
id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
name VARCHAR(255) NOT NULL,
password VARCHAR(255) NOT NULL,
role VARCHAR(20) NOT NULL DEFAULT 'student'
CHECK (role IN ('student', 'instructor', 'admin')),
is_active BOOLEAN NOT NULL DEFAULT true,
created_at TIMESTAMP NOT NULL DEFAULT NOW()
);
-- Categories for courses
CREATE TABLE categories (
id SERIAL PRIMARY KEY,
slug VARCHAR(100) UNIQUE NOT NULL,
name VARCHAR(255) NOT NULL
);
-- Courses
CREATE TABLE courses (
id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
slug VARCHAR(255) UNIQUE NOT NULL,
title VARCHAR(500) NOT NULL,
description TEXT,
price DECIMAL(10, 2) NOT NULL DEFAULT 0 CHECK (price >= 0),
is_published BOOLEAN NOT NULL DEFAULT false,
category_id INTEGER REFERENCES categories(id) ON DELETE SET NULL,
author_id UUID NOT NULL REFERENCES users(id) ON DELETE RESTRICT,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP NOT NULL DEFAULT NOW()
);
-- Lessons within courses
CREATE TABLE lessons (
id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
course_id UUID NOT NULL REFERENCES courses(id) ON DELETE CASCADE,
slug VARCHAR(255) NOT NULL,
title VARCHAR(500) NOT NULL,
content TEXT NOT NULL,
order_num INTEGER NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
UNIQUE(course_id, slug),
UNIQUE(course_id, order_num)
);
-- Student enrollments
CREATE TABLE enrollments (
id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
course_id UUID NOT NULL REFERENCES courses(id) ON DELETE CASCADE,
enrolled_at TIMESTAMP NOT NULL DEFAULT NOW(),
UNIQUE(user_id, course_id)
);
-- Lesson completion tracking
CREATE TABLE lesson_progress (
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
lesson_id UUID NOT NULL REFERENCES lessons(id) ON DELETE CASCADE,
completed BOOLEAN NOT NULL DEFAULT false,
completed_at TIMESTAMP,
PRIMARY KEY (user_id, lesson_id)
);
-- Indexes for common queries
CREATE INDEX idx_courses_slug ON courses(slug);
CREATE INDEX idx_courses_author ON courses(author_id);
CREATE INDEX idx_lessons_course ON lessons(course_id);
CREATE INDEX idx_enrollments_user ON enrollments(user_id);
CREATE INDEX idx_enrollments_course ON enrollments(course_id);
This schema supports the key queries a course platform needs: listing courses by category, getting a course with its lessons, checking if a user is enrolled, and tracking lesson completion — all efficiently.
Next lesson: Git and GitHub for developers — version control, branching, and collaboration workflows.
Get this course's notes on Telegram!
Free cheat sheets, summaries & practice exercises