DreamBooth is a fine-tuning technique developed by Google Research that allows you to train a Stable Diffusion model on a small set of images (10–30 photos) of a specific subject — a person, object, or style. The trained model can then generate new images of that subject in any context, pose, or style you prompt. It was originally published as a research paper and has since been implemented in multiple open-source tools.

How many photos do you need to train DreamBooth?

15–25 images is the sweet spot for training a DreamBooth model on a person's face. Fewer images (under 10) tend to produce inconsistent results. More images (over 40) can cause the model to overfit and lose generalization ability. Use diverse, high-quality photos with varied backgrounds, lighting, and expressions.

Can you run DreamBooth for free?

Yes, by using Google Colab's free tier with a GPU runtime. This is the most accessible option for beginners, though training on free Colab GPUs takes longer (1–3 hours) and may be interrupted if Colab reclaims the GPU. Google Colab Pro ($10/month) provides more reliable GPU access. For local training, you need a powerful GPU (RTX 3090 or better recommended).

Is DreamBooth legal to use on photos of real people?

Training a DreamBooth model on your own photos for personal use is legal. Training on photos of others without consent raises privacy and potential legal concerns. Using DreamBooth-generated images to create misleading content (deepfakes, non-consensual intimate images) is illegal in many jurisdictions. Always use these tools responsibly and with consent.

What can you do with a DreamBooth model of yourself?

With a personalized DreamBooth model, you can generate: professional headshots in any style, yourself in historical or fantasy settings, consistent character images for personal branding or storytelling, profile photos in different visual styles, and concept art featuring yourself as a character. Professional photographers have used it for portfolio concepts; content creators use it for consistent visual identity.

DreamBooth Tutorial: Training Your Own AI Model on Your Face

When I first generated an image of myself standing on the surface of Mars, wearing an astronaut suit and looking entirely at ease, I had an experience I can only describe as mild existential surreality.

The image was accurate. It looked like me — my face, my proportions, my expressions — in a context that was obviously impossible. And the process that produced it took about two hours and cost nothing beyond my internet connection.

DreamBooth is the technology that makes this possible. It's a fine-tuning technique that trains an AI model on your specific face, allowing you to generate yourself into any scenario, style, or context. This tutorial covers the complete process from photos to working model.

What DreamBooth Actually Is

DreamBooth was published as a research paper by Google Research in 2022. The key innovation: using a small number of images to "teach" a pre-trained image generation model what a specific subject looks like, while preserving the model's general generation ability. The paper worked with as few as 3–5 images per subject; for faces, 15–25 gives more consistent results in practice.

Standard Stable Diffusion doesn't know what you look like. After DreamBooth training, the model has learned a new concept: a special token (like sks or ohwxperson) that represents your specific face. When you use that token in a prompt, the model generates images featuring you.

The technique works for:

Specific people (faces, full body)
Specific objects (products, pets, unique items)
Specific artistic styles
Specific environments or settings

This tutorial focuses on face training — the most common use case.

What You Need

Hardware Options

Option A — Google Colab (Free or $10/month):

Free tier: Works but training takes 1–3 hours and may be interrupted
Colab Pro: More reliable GPU access, faster training
No local hardware required

Option B — Local GPU (Fastest):

Minimum: NVIDIA RTX 3090 (24GB VRAM) for reliable SDXL DreamBooth
Recommended: RTX 4090 or equivalent
Local training completes in 15–45 minutes

Option C — Cloud GPU Services:

RunPod, Vast.ai, Lambda Labs — rent GPU by the hour
Cost: $0.30–$1.00/hour depending on GPU
Good balance of speed and cost for occasional use
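
To get a rough feel for what a rented GPU costs per training run, the arithmetic can be sketched as below. This is a back-of-the-envelope estimate, not a quote: the function name `cost_range` is hypothetical, and it assumes a rented GPU trains about as fast as the 15–45 minute local figure, ignoring setup time and storage fees.

```python
def cost_range(minutes, hourly_rates):
    """Return (lowest, highest) rental cost in dollars.

    minutes: (shortest, longest) expected training time in minutes.
    hourly_rates: (cheapest, priciest) GPU price in dollars per hour.
    """
    min_minutes, max_minutes = minutes
    min_rate, max_rate = hourly_rates
    return (min_minutes / 60 * min_rate, max_minutes / 60 * max_rate)


if __name__ == "__main__":
    # Assumption: training time on the rented GPU matches the local
    # 15-45 minute range; real sessions also bill for setup time.
    low, high = cost_range((15, 45), (0.30, 1.00))
    print(f"Estimated cost per training run: ${low:.3f} to ${high:.3f}")
```

Even at the top of that range, a single run stays well under the price of one month of Colab Pro, which is why hourly rental suits occasional use.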

For most people without a high-end local GPU, Google Colab or a cloud GPU service is the right starting point.

Software

Kohya_ss — the most popular DreamBooth training UI (Windows/Linux)
AUTOMATIC1111 — for running the trained model after training
Google Colab notebook — for cloud-based training without local setup

Step 1: Preparing Your Training Images

This step matters more than most tutorials acknowledge. Poor training images produce poor models, regardless of training settings.

Image Quantity

15–25 images is optimal. I've had the best results with exactly 20.

Image Quality Requirements

Resolution: Minimum 512×512, ideally 768×768 or higher
Face visibility: Face clearly visible and sharp in every image
Varied lighting: Include outdoor natural light, indoor warm light, and neutral lighting
Varied backgrounds: Don't use the same background in more than 3 images
Varied expressions: Include neutral, smiling, serious, looking away
No accessories: Avoid sunglasses, hats, or anything obscuring the face in most images
No other people: Your training subject should be the only person visible
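
A quick pre-flight check can catch the most mechanical of these problems before you spend GPU time. The sketch below is a minimal example under stated assumptions: `ImageInfo` and `check_dataset` are hypothetical names, and the width and height values are assumed to have been read beforehand with whatever image library you prefer. It only checks image count and resolution; sharpness, lighting, and expression variety still need a human eye.

```python
from dataclasses import dataclass

MIN_SIDE = 512                   # minimum resolution from the list above
MIN_IMAGES, MAX_IMAGES = 15, 25  # recommended dataset size


@dataclass
class ImageInfo:
    """Hypothetical record; fill it from your image library of choice."""
    name: str
    width: int
    height: int


def check_dataset(images):
    """Return a list of human-readable warnings (empty if none)."""
    warnings = []
    if not MIN_IMAGES <= len(images) <= MAX_IMAGES:
        warnings.append(
            f"{len(images)} images; {MIN_IMAGES}-{MAX_IMAGES} is recommended"
        )
    for img in images:
        # The shorter side decides whether a square crop can reach MIN_SIDE.
        if min(img.width, img.height) < MIN_SIDE:
            warnings.append(
                f"{img.name}: {img.width}x{img.height} is below "
                f"{MIN_SIDE}x{MIN_SIDE}"
            )
    return warnings


if __name__ == "__main__":
    sample = [ImageInfo(f"photo_{i:02d}.jpg", 1024, 1024) for i in range(19)]
    sample.append(ImageInfo("photo_19.jpg", 480, 640))
    for line in check_dataset(sample):
        print(line)
```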

What to Avoid

Blurry or low-resolution images
Heavy filters or heavy editing that changes skin tone/texture
Multiple people in the frame
Very similar images (5 photos from the same minute)
Images where your face is small or turned far away

Image Cropping

Crop all images to focus on your face with consistent spacing — roughly from collarbone to slightly above the top of your head. Consistent framing helps the model learn your specific features.
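
If you already have a face bounding box for each photo (from any face detector), consistent framing comes down to a little arithmetic. The sketch below is one way to do it, not a prescribed method: `face_crop_box` is a hypothetical helper, and the `above` and `below` margins are illustrative guesses for "slightly above the head" and "down to the collarbone" that you should tune by eye.

```python
def face_crop_box(face, image_size, above=0.5, below=0.5):
    """Return a square (left, top, right, bottom) crop around a face box.

    face: (left, top, right, bottom) of the detected face, in pixels.
    image_size: (width, height) of the full image.
    above / below: extra margin as a fraction of the face height; both
    defaults are illustrative guesses, not tuned constants.
    """
    left, top, right, bottom = face
    img_w, img_h = image_size
    face_h = bottom - top

    crop_top = top - above * face_h
    crop_bottom = bottom + below * face_h
    # The crop is square and can never be larger than the image itself.
    side = min(crop_bottom - crop_top, img_w, img_h)

    # Center the square horizontally on the face.
    crop_left = (left + right) / 2 - side / 2

    # Shift the square back inside the image if it spills over an edge.
    crop_left = min(max(crop_left, 0), img_w - side)
    crop_top = min(max(crop_top, 0), img_h - side)

    box = (crop_left, crop_top, crop_left + side, crop_top + side)
    return tuple(round(v) for v in box)


if __name__ == "__main__":
    print(face_crop_box((400, 300, 600, 500), (1000, 1000)))
```

Using the same margins for every photo is what keeps the framing consistent across the set; the returned box can be passed to the crop function of whichever image library you use.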

Step 2: Training Setup with Kohya_ss

Installation

Clone the Kohya_ss repository from GitHub
Run the installation script for your OS
Launch the web UI (python kohya_gui.py)

Training Configuration

Key settings for face DreamBooth training:

Instance prompt: A unique token plus a descriptor for your subject, for example:

a photo of [yourname] person

Replace [yourname] with a word not commonly used in other contexts: not "john" or "sarah" but something like "ohwxperson" or another identifier unique to you.

Class prompt: The general category your subject belongs to, for example:

a photo of a person

Training images: The folder containing your 20 prepared images

Regularization images: Optional but recommended — generate 200 images of "a person" with your base model to use as regularization data. This prevents the model from forgetting how to generate generic people.
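
Kohya's training scripts conventionally read the repeat count, token, and class from the folder names themselves, in the form `<repeats>_<token> <class>` for training images and `<repeats>_<class>` for regularization images. The sketch below creates that layout; treat the naming pattern as an assumption to confirm against the documentation of the version you installed. The helper name `make_dataset_dirs` and the `img` / `reg` parent folders are illustrative.

```python
from pathlib import Path


def make_dataset_dirs(root, token, class_word, repeats, reg_repeats=1):
    """Create img/ and reg/ subfolders named '<repeats>_<token> <class>'.

    The naming pattern is an assumption based on the Kohya convention;
    confirm it against the documentation of your installed version.
    """
    root = Path(root)
    img_dir = root / "img" / f"{repeats}_{token} {class_word}"
    reg_dir = root / "reg" / f"{reg_repeats}_{class_word}"
    for folder in (img_dir, reg_dir):
        folder.mkdir(parents=True, exist_ok=True)
    return img_dir, reg_dir


if __name__ == "__main__":
    img_dir, reg_dir = make_dataset_dirs(
        "dreambooth_data", "ohwxperson", "person", repeats=40
    )
    print("Put your training photos in:", img_dir)
    print("Put regularization images in:", reg_dir)
```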

Recommended training parameters:

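Learning rate: 1e-4 (UNet), 1e-5 (Text Encoder) for LoRA training; a full-checkpoint DreamBooth run typically needs a much lower rate, around 1e-6 to 5e-6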
Training steps: 1000–1500 for most models
Resolution: 512 for SD 1.5, 1024 for SDXL
Optimizer: AdamW8bit
Batch size: 1–2 (depending on VRAM)
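
The step count is usually not typed in directly; it falls out of the number of images, how often each image is repeated per epoch, the number of epochs, and the batch size. The sketch below shows that relationship under a common convention (steps per epoch = images × repeats ÷ batch size). The function names are hypothetical, and regularization images are ignored here even though some trainers count them toward the total.

```python
import math


def total_steps(num_images, repeats, epochs, batch_size=1):
    """Optimizer steps for a run, ignoring regularization images."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


def repeats_for_target(num_images, target_steps, epochs=1, batch_size=1):
    """Smallest repeat count that reaches at least target_steps."""
    return math.ceil(target_steps * batch_size / (num_images * epochs))


if __name__ == "__main__":
    images = 20
    for target in (1000, 1500):
        repeats = repeats_for_target(images, target)
        steps = total_steps(images, repeats, epochs=1)
        print(f"{target} steps -> {repeats} repeats ({steps} actual steps)")
```

With 20 images and a batch size of 1, the 1000–1500 step range corresponds to each image being seen 50–75 times. Doubling the batch size halves the step count for the same number of passes over the images.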

Step 3: Google Colab Method (No Local GPU)

If you're using Google Colab, the process is simpler:

Search "DreamBooth Colab notebook" — several maintained notebooks are available on GitHub (Linoy Tsaban's notebook and the TheLastBen notebook are commonly used)
Open the notebook in Colab and enable GPU runtime (Runtime → Change runtime type → GPU)
Upload your training images as a zip file
Configure your instance and class prompts
Run all cells in order
Training takes 1–3 hours on free Colab GPU
Download your trained model (LoRA file) when complete

Important: Save your LoRA file before your Colab session ends. Free Colab sessions don't persist files after the session closes.
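
For the upload step, bundling the photos into a single archive can be done with Python's standard library. This is a small sketch, assuming all prepared photos sit in one folder; the function name `zip_images`, the folder name, and the list of accepted extensions are illustrative choices, and the notebook you use may expect a particular archive name.

```python
import zipfile
from pathlib import Path

# Assumed set of image extensions; adjust to match your photos.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def zip_images(folder, archive_path):
    """Zip every image in folder (non-recursive); return the file count."""
    folder = Path(folder)
    files = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            # Store bare file names so the archive has no nested folders.
            zf.write(path, arcname=path.name)
    return len(files)


if __name__ == "__main__":
    source = Path("training_photos")  # hypothetical folder name
    if source.is_dir():
        count = zip_images(source, "training_images.zip")
        print(f"Archived {count} images")
```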

Step 4: Using Your Trained Model

Once training is complete, you have either a LoRA (Low-Rank Adaptation) file or a full checkpoint file.

Loading in AUTOMATIC1111

Place your LoRA file in the models/Lora folder
In the generation interface, open the "Additional Networks" panel or click the LoRA icon
Select your trained LoRA
Set the LoRA weight (0.7–0.85 is a typical starting point)

Prompt Format

Always include your instance token in prompts:

portrait of ohwxperson in a business suit, professional headshot, studio lighting
ohwxperson as a medieval knight, full armor, epic fantasy setting, dramatic lighting
ohwxperson in vintage 1960s fashion, film photography aesthetic
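
In AUTOMATIC1111, a LoRA is activated from the prompt itself with a tag of the form `<lora:filename:weight>`. If you generate many variations, a tiny helper keeps the token and the tag consistent. The sketch below is illustrative only: `build_prompt` and the `{subject}` placeholder are hypothetical conventions, and `my_face` stands in for your own LoRA file name without its extension.

```python
def build_prompt(template, token, lora_name, weight=0.8):
    """Fill the subject token into a template and append a LoRA tag.

    template must contain the placeholder "{subject}".
    lora_name is the LoRA file name without its extension.
    """
    if "{subject}" not in template:
        raise ValueError("template needs a {subject} placeholder")
    prompt = template.replace("{subject}", token)
    return f"{prompt} <lora:{lora_name}:{weight:g}>"


if __name__ == "__main__":
    templates = [
        "portrait of {subject} in a business suit, professional headshot",
        "{subject} as a medieval knight, full armor, epic fantasy setting",
    ]
    for template in templates:
        # "my_face" is a placeholder for your own LoRA file name.
        print(build_prompt(template, "ohwxperson", "my_face", weight=0.75))
```

Keeping the weight as a parameter makes it easy to rerun the same prompt at several strengths and compare the results side by side.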

Example Outputs and Use Cases

After training a DreamBooth model on 20 photos, here's what I generated:

Professional headshots: Generated 15 headshots in different professional settings — office backgrounds, outdoor settings, different clothing. Kept 4 that I'd use for LinkedIn or business cards.

Artistic portraits: Generated portraits in watercolor, oil painting, and pencil sketch styles. The subject identity (my face) was maintained across styles.

Fantasy and sci-fi scenarios: The Mars astronaut image. Also generated myself as a fantasy wizard, a 1920s detective, and a cyberpunk character.

Consistent content creator identity: A series of images in a consistent visual style for YouTube thumbnails or social media that all feature the same person (me) in different poses.

Common Problems and Solutions

Problem: Face doesn't look like me
Solution: Improve training image quality and variety. Check that all images are cropped consistently and have clear face visibility.

Problem: Generated images have artifacts or distortions
Solution: Lower the LoRA weight to 0.6–0.7. If that doesn't help, retrain with fewer steps (over-training causes this).

Problem: Model generates two faces merged together
Solution: Ensure your regularization images are included. Without regularization, the model can confuse your face with its general understanding of faces.

Problem: Face looks correct but body/clothing is wrong
Solution: DreamBooth face training doesn't train full-body consistency. For full-body training, include more full-body shots in your training set.

Ethical Use Considerations

Using DreamBooth on your own face for personal use is straightforward. Beyond that, some clear ethical boundaries apply:

Never train on someone's face without their consent. This is ethically problematic and potentially illegal in many jurisdictions.
Never generate non-consensual intimate imagery. This is illegal in many jurisdictions.
Never create content designed to deceive people about who did or said something.
Deepfake content created to harm others' reputations is both unethical and potentially illegal.

The technology is powerful. Use it for creative and personal applications; don't use it to harm.