Follow AiTechWorlds on LinkedIn for professional AI content!Follow Now →
20 minLesson 35 of 40
Databases

Database Design & Relationships

Database Design Fundamentals

The schema you design at the start of a project shapes every query you write, every API you build, and every scaling problem you'll face later. Good database design is about thinking through relationships, constraints, and growth before you write a line of code.

Thinking in Tables

Each table represents one thing — one entity. If you catch yourself putting two types of things in the same table, it's a sign to split it.

Common beginner mistake:

-- ❌ Bad: mixing users and their preferences in one table
CREATE TABLE users (
    id              SERIAL PRIMARY KEY,
    email           VARCHAR(255),
    name            VARCHAR(255),
    preferred_theme VARCHAR(50),
    email_frequency VARCHAR(50),
    timezone        VARCHAR(100),
    notification_1  BOOLEAN,
    notification_2  BOOLEAN
    -- ... keeps growing
);

Better:

-- ✅ Good: separate tables for separate concerns
CREATE TABLE users (
    id    SERIAL PRIMARY KEY,
    email VARCHAR(255) UNIQUE NOT NULL,
    name  VARCHAR(255) NOT NULL
);

CREATE TABLE user_settings (
    user_id         INTEGER PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
    theme           VARCHAR(50) DEFAULT 'light',
    timezone        VARCHAR(100) DEFAULT 'UTC',
    email_frequency VARCHAR(50) DEFAULT 'weekly'
);

CREATE TABLE notification_preferences (
    id      SERIAL PRIMARY KEY,
    user_id INTEGER REFERENCES users(id) ON DELETE CASCADE,
    type    VARCHAR(100) NOT NULL,
    enabled BOOLEAN DEFAULT true,
    UNIQUE(user_id, type)
);

Now you can add new notification types without changing the schema.

Primary Keys

Always have a primary key. The choice of key type has long-term consequences:

-- Auto-incrementing integer (simple, fast, sequential)
id SERIAL PRIMARY KEY
-- Cons: exposes record count, predictable, doesn't work across distributed systems

-- UUID (globally unique, safe to expose)
id UUID DEFAULT gen_random_uuid() PRIMARY KEY
-- Cons: larger storage, slower indexes, random order means index fragmentation

-- CUID (collision-resistant, sortable, URL-safe)
-- Use with Prisma: @id @default(cuid())
-- Good balance: readable, non-sequential, URL-safe

-- Natural key (email, username, slug) — only when the value truly uniquely identifies the entity
slug VARCHAR(255) PRIMARY KEY
-- Cons: you can never change it without cascading updates

Use SERIAL or UUID in most cases. Reserve natural keys for lookup tables where the value is truly immutable (ISO country codes, currency codes, etc.).

Relationships

One-to-One: One user has exactly one profile.

CREATE TABLE profiles (
    user_id   INTEGER PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
    bio       TEXT,
    website   VARCHAR(500),
    avatar    VARCHAR(500)
);

The foreign key is also the primary key — enforces the one-to-one constraint.

One-to-Many: One course has many lessons.

CREATE TABLE lessons (
    id        SERIAL PRIMARY KEY,
    course_id INTEGER NOT NULL REFERENCES courses(id) ON DELETE CASCADE,
    title     VARCHAR(500) NOT NULL,
    order_num INTEGER NOT NULL
);
-- Index the foreign key for fast lookups
CREATE INDEX idx_lessons_course_id ON lessons(course_id);

Many-to-Many: Users can enroll in many courses; courses can have many students.

CREATE TABLE enrollments (
    user_id   INTEGER REFERENCES users(id) ON DELETE CASCADE,
    course_id INTEGER REFERENCES courses(id) ON DELETE CASCADE,
    enrolled_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (user_id, course_id)   -- composite primary key prevents duplicates
);

The junction table (enrollments) can hold additional data about the relationship — when it happened, progress, completion status.

Normalization

Normalization removes data duplication. The goal is one fact in one place:

-- ❌ Denormalized: author name stored in every course
CREATE TABLE courses (
    id          SERIAL PRIMARY KEY,
    title       VARCHAR(500),
    author_name VARCHAR(255),    -- what if the author changes their name?
    author_email VARCHAR(255)    -- duplicated across every course
);

-- ✅ Normalized: author details stored once in users, referenced by id
CREATE TABLE courses (
    id        SERIAL PRIMARY KEY,
    title     VARCHAR(500),
    author_id INTEGER REFERENCES users(id)
);

Go as far as "Third Normal Form" (3NF) by default:

  1. No repeating groups (don't store arrays as tag1, tag2, tag3 columns)
  2. Every non-key column depends only on the primary key
  3. No transitive dependencies (if city determines country, store city→country separately)

It's okay to selectively denormalize for performance — but do it consciously, not by accident.

Constraints — Let the Database Enforce Your Rules

Don't only validate in application code. The database is the last line of defense:

CREATE TABLE courses (
    id          SERIAL PRIMARY KEY,
    slug        VARCHAR(255) UNIQUE NOT NULL,
    title       VARCHAR(500) NOT NULL,
    price       DECIMAL(10, 2) NOT NULL DEFAULT 0 CHECK (price >= 0),
    status      VARCHAR(50) NOT NULL DEFAULT 'draft' 
                    CHECK (status IN ('draft', 'published', 'archived')),
    author_id   INTEGER NOT NULL REFERENCES users(id),
    created_at  TIMESTAMP NOT NULL DEFAULT NOW()
);

Constraint types:

  • NOT NULL — field must have a value
  • UNIQUE — no two rows can have the same value
  • CHECK — value must satisfy a condition
  • REFERENCES (foreign key) — value must exist in another table
  • DEFAULT — value if not specified

Cascade Behavior

What happens to child records when the parent is deleted?

-- ON DELETE CASCADE: delete all lessons when the course is deleted
course_id INTEGER REFERENCES courses(id) ON DELETE CASCADE

-- ON DELETE SET NULL: set author_id to NULL when the user is deleted
author_id INTEGER REFERENCES users(id) ON DELETE SET NULL

-- ON DELETE RESTRICT (default): prevent deletion if child records exist
user_id INTEGER REFERENCES users(id) ON DELETE RESTRICT

Think through each relationship:

  • Enrollments → delete if user or course deleted (CASCADE)
  • Course author → keep course if author deleted (SET NULL)
  • Order → prevent deleting a user who has orders (RESTRICT)

Timestamps

Always include timestamps:

created_at TIMESTAMP NOT NULL DEFAULT NOW()
updated_at TIMESTAMP NOT NULL DEFAULT NOW()
-- deleted_at for soft deletes (NULL = active, timestamp = deleted)
deleted_at TIMESTAMP

Soft deletes are useful when you need an audit trail — mark records as deleted rather than actually removing them.

Naming Conventions

Consistent naming saves time when writing queries:

Tables:          plural snake_case (users, course_lessons, enrollment_events)
Primary keys:    id
Foreign keys:    {table_singular}_id (user_id, course_id)
Boolean columns: is_ or has_ prefix (is_published, has_certificate)
Timestamps:      _at suffix (created_at, updated_at, enrolled_at)
Junction tables: table1_table2 (user_courses, course_tags)
Indexes:         idx_{table}_{column} (idx_users_email, idx_courses_slug)

Schema Design Example: Course Platform

-- Users and auth
CREATE TABLE users (
    id         UUID DEFAULT gen_random_uuid() PRIMARY KEY,
    email      VARCHAR(255) UNIQUE NOT NULL,
    name       VARCHAR(255) NOT NULL,
    password   VARCHAR(255) NOT NULL,
    role       VARCHAR(20) NOT NULL DEFAULT 'student' 
                   CHECK (role IN ('student', 'instructor', 'admin')),
    is_active  BOOLEAN NOT NULL DEFAULT true,
    created_at TIMESTAMP NOT NULL DEFAULT NOW()
);

-- Categories for courses
CREATE TABLE categories (
    id   SERIAL PRIMARY KEY,
    slug VARCHAR(100) UNIQUE NOT NULL,
    name VARCHAR(255) NOT NULL
);

-- Courses
CREATE TABLE courses (
    id           UUID DEFAULT gen_random_uuid() PRIMARY KEY,
    slug         VARCHAR(255) UNIQUE NOT NULL,
    title        VARCHAR(500) NOT NULL,
    description  TEXT,
    price        DECIMAL(10, 2) NOT NULL DEFAULT 0 CHECK (price >= 0),
    is_published BOOLEAN NOT NULL DEFAULT false,
    category_id  INTEGER REFERENCES categories(id) ON DELETE SET NULL,
    author_id    UUID NOT NULL REFERENCES users(id) ON DELETE RESTRICT,
    created_at   TIMESTAMP NOT NULL DEFAULT NOW(),
    updated_at   TIMESTAMP NOT NULL DEFAULT NOW()
);

-- Lessons within courses
CREATE TABLE lessons (
    id         UUID DEFAULT gen_random_uuid() PRIMARY KEY,
    course_id  UUID NOT NULL REFERENCES courses(id) ON DELETE CASCADE,
    slug       VARCHAR(255) NOT NULL,
    title      VARCHAR(500) NOT NULL,
    content    TEXT NOT NULL,
    order_num  INTEGER NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT NOW(),
    UNIQUE(course_id, slug),
    UNIQUE(course_id, order_num)
);

-- Student enrollments
CREATE TABLE enrollments (
    id          UUID DEFAULT gen_random_uuid() PRIMARY KEY,
    user_id     UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    course_id   UUID NOT NULL REFERENCES courses(id) ON DELETE CASCADE,
    enrolled_at TIMESTAMP NOT NULL DEFAULT NOW(),
    UNIQUE(user_id, course_id)
);

-- Lesson completion tracking
CREATE TABLE lesson_progress (
    user_id     UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    lesson_id   UUID NOT NULL REFERENCES lessons(id) ON DELETE CASCADE,
    completed   BOOLEAN NOT NULL DEFAULT false,
    completed_at TIMESTAMP,
    PRIMARY KEY (user_id, lesson_id)
);

-- Indexes for common queries
CREATE INDEX idx_courses_slug ON courses(slug);
CREATE INDEX idx_courses_author ON courses(author_id);
CREATE INDEX idx_lessons_course ON lessons(course_id);
CREATE INDEX idx_enrollments_user ON enrollments(user_id);
CREATE INDEX idx_enrollments_course ON enrollments(course_id);

This schema supports the key queries a course platform needs: listing courses by category, getting a course with its lessons, checking if a user is enrolled, and tracking lesson completion — all efficiently.

Next lesson: Git and GitHub for developers — version control, branching, and collaboration workflows.

📱

Get this course's notes on Telegram!

Free cheat sheets, summaries & practice exercises

Get Notes Free →
!