ML Learning Paradigms: Complete Guide

Supervised, unsupervised, and reinforcement learning explained with sklearn examples, metrics, and decision framework.

#machine-learning #supervised #unsupervised #reinforcement-learning #sklearn

ML Learning Paradigms: Supervised, Unsupervised & Reinforcement Learning

The Three Core ML Paradigms

Paradigm	Has Labels?	Learns From	Goal
Supervised	Yes	(input, label) pairs	Predict label for new input
Unsupervised	No	Input data only	Find structure/patterns
Reinforcement	No (uses rewards)	Agent-environment interactions	Maximize cumulative reward

Supervised Learning

Definition

Train a model on labeled examples (X, y) so it can predict y for unseen X.

Types

Type	Output	Example Algorithms	Example Problem
Classification	Discrete class	LogReg, SVM, RF, XGBoost	Spam detection, image classification
Regression	Continuous value	Linear Regression, SVR, Gradient Boosting	House price, stock prediction
Sequence labeling	Per-token label	CRF, BiLSTM, BERT fine-tuned	NER, POS tagging

Scikit-learn Template

python

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# X: feature matrix (n_samples, n_features); y: class labels. Both are assumed to be loaded already.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit on the training split only
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Evaluate on the held-out test split
y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred))

Key Metrics

Task	Metrics
Binary classification	Accuracy, Precision, Recall, F1, AUC-ROC
Multiclass	Macro/weighted F1, Confusion Matrix
Regression	MAE, MSE, RMSE, R²
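As a rough illustration of the metrics in the table, the sketch below computes them with scikit-learn on small hand-made arrays. The arrays `y_true`, `y_pred`, `y_prob`, `y_true_r`, and `y_pred_r` are hypothetical values chosen for the example, not results from any real model.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score, roc_auc_score,
    mean_absolute_error, mean_squared_error, r2_score,
)

# Hypothetical binary labels, hard predictions, and predicted probabilities
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([0, 0, 0, 1, 1, 1, 0, 1])
y_prob = np.array([0.1, 0.2, 0.3, 0.6, 0.9, 0.8, 0.4, 0.7])

clf_metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),
    "recall": recall_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred),
    "auc_roc": roc_auc_score(y_true, y_prob),  # uses scores, not hard labels
}

# Hypothetical regression targets and predictions
y_true_r = np.array([3.0, 5.0, 2.0, 7.0])
y_pred_r = np.array([2.5, 5.0, 3.0, 8.0])

mse = mean_squared_error(y_true_r, y_pred_r)
reg_metrics = {
    "mae": mean_absolute_error(y_true_r, y_pred_r),
    "mse": mse,
    "rmse": float(np.sqrt(mse)),  # RMSE is the square root of MSE
    "r2": r2_score(y_true_r, y_pred_r),
}

print(clf_metrics)
print(reg_metrics)
```

AUC-ROC is computed from the predicted probabilities rather than the thresholded labels, which is why it can differ from accuracy on the same predictions.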

Unsupervised Learning

Definition

Find structure in unlabeled data — no ground truth labels are provided.

Types

Type	Goal	Algorithms
Clustering	Group similar data points	K-Means, DBSCAN, Agglomerative
Dimensionality Reduction	Compress features	PCA, t-SNE, UMAP
Anomaly Detection	Identify outliers	Isolation Forest, One-Class SVM
Generative Models	Learn data distribution	VAE, GAN, Diffusion
Association Rules	Find co-occurrence patterns	Apriori, FP-Growth
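To make the dimensionality-reduction row concrete, here is a minimal PCA sketch. The data is synthetic: five features generated from two hidden factors, with a hand-picked `mixing` matrix that is purely an assumption for the demo.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples, 5 observed features driven by 2 latent factors
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.5, -1.0, 2.0, 0.3],
                   [0.2, -1.0, 1.0, 0.5, 1.5]])
X_demo = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

# PCA is scale-sensitive, so standardize first
X_std = StandardScaler().fit_transform(X_demo)

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_std)

# Fraction of total variance kept by the two components
explained = pca.explained_variance_ratio_.sum()
print(X_2d.shape, round(float(explained), 3))
```

Because the five features really come from two factors plus a little noise, two components should retain nearly all of the variance here; real data rarely compresses this cleanly.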

K-Means Example

python

from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# K-Means is distance-based, so put all features on the same scale first
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# n_init='auto' requires scikit-learn 1.2 or newer; use an integer such as 10 on older versions
kmeans = KMeans(n_clusters=4, random_state=42, n_init='auto')
labels = kmeans.fit_predict(X_scaled)

# Evaluate cluster quality (no ground truth needed)
score = silhouette_score(X_scaled, labels)
print(f"Silhouette: {score:.3f}")  # range -1 to 1, higher is better

Choosing Number of Clusters

python

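# Elbow method (reuses KMeans and X_scaled from the K-Means example)
import matplotlib.pyplot as plt

ks = range(2, 11)
inertias = []
for k in ks:
    km = KMeans(n_clusters=k, random_state=42, n_init='auto')
    km.fit(X_scaled)
    inertias.append(km.inertia_)  # within-cluster sum of squared distances

# Plot inertia against k and choose k at the "elbow" (diminishing returns)
plt.plot(list(ks), inertias, marker="o")
plt.xlabel("k")
plt.ylabel("Inertia")
plt.show()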
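The elbow can be hard to read by eye. One common complement is to pick the k with the highest silhouette score. The sketch below assumes synthetic data from `make_blobs` with three hand-placed centers, so the "right" answer is known in advance; `X_blobs`, `scores`, and `best_k` are names introduced for this example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Hypothetical data: three well-separated blobs
X_blobs, _ = make_blobs(
    n_samples=300,
    centers=[[-5, -5], [0, 5], [5, -5]],
    cluster_std=0.6,
    random_state=42,
)
X_blobs = StandardScaler().fit_transform(X_blobs)

# Score each candidate k by its mean silhouette coefficient
scores = {}
for k in range(2, 8):
    km = KMeans(n_clusters=k, random_state=42, n_init=10)
    scores[k] = silhouette_score(X_blobs, km.fit_predict(X_blobs))

best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

On clean, well-separated data the two methods usually agree; when they disagree, it is worth inspecting the clusters directly rather than trusting either number.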

Reinforcement Learning

Definition

An agent takes actions in an environment, receives rewards, and learns a policy that maximizes cumulative reward over time.

Key Concepts

Term	Meaning
Agent	The learner/decision-maker
Environment	The world the agent interacts with
State (s)	Current situation
Action (a)	Possible moves the agent can take
Reward (r)	Feedback signal after action
Policy (π)	Strategy: state → action mapping
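Value function (V)	Expected cumulative (discounted) future reward from a state when following a policy
Q-function (Q)	Expected cumulative (discounted) future reward from taking a given action in a state, then following the policy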
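Both V and Q are expectations of the same quantity, the discounted return G = r_1 + γ·r_2 + γ²·r_3 + ..., where γ (the discount factor, between 0 and 1) is the same `gamma` used in the Q-learning code in these notes. A minimal sketch of that sum for a single episode, with a hypothetical reward list:

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of rewards where the k-th future reward is weighted by gamma**k."""
    g = 0.0
    # Work backwards so each step is g = r + gamma * (return from the next step)
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical episode: small reward now, large reward three steps later
episode_rewards = [1.0, 0.0, 0.0, 10.0]
print(discounted_return(episode_rewards, gamma=0.9))  # 1 + 0.9**3 * 10 = 8.29
```

A smaller `gamma` makes the agent short-sighted: with `gamma=0.5` the same late reward of 10 contributes only 1.25.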

Core Algorithms

Algorithm	Type	Use Case
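Q-Learning	Model-free, off-policy, tabular	Small discrete state and action spaces
DQN	Value-based, off-policy (deep Q-network)	Atari games, simple control with discrete actions
PPO	Policy gradient, on-policy	Continuous control, RLHF
SAC	Actor-critic, off-policy (maximum entropy)	Robotics, complex environments
A3C/A2C	Actor-critic, on-policy	Parallel training across multiple workers
AlphaZero	MCTS + self-play	Board games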

Simple Q-Learning

python

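import numpy as np
import gymnasium as gym  # assumption: Gymnasium API (reset returns 2 values, step returns 5)

# Assumed example environment with discrete states and actions
env = gym.make("FrozenLake-v1")
n_states = env.observation_space.n
n_actions = env.action_space.n

# Illustrative hyperparameters: learning rate, discount factor, exploration rate
alpha, gamma, epsilon = 0.1, 0.99, 0.1
max_steps = 100

# Q-table: states × actions
Q = np.zeros((n_states, n_actions))

# Training loop
for episode in range(1000):
    state, _ = env.reset()
    for step in range(max_steps):
        # Epsilon-greedy action selection
        if np.random.rand() < epsilon:
            action = env.action_space.sample()   # explore
        else:
            action = np.argmax(Q[state])         # exploit

        next_state, reward, terminated, truncated, _ = env.step(action)

        # Bellman update; do not bootstrap from a terminal state
        target = reward + gamma * np.max(Q[next_state]) * (not terminated)
        Q[state, action] += alpha * (target - Q[state, action])
        state = next_state
        if terminated or truncated:
            break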
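The loop above assumes a Gym-style environment. For readers who want to watch the update rule work without installing anything beyond NumPy, here is a dependency-free sketch on a made-up environment. `ChainEnv` is hypothetical (a five-state corridor with a reward at the right end), and the hyperparameter values are illustrative only.

```python
import numpy as np


class ChainEnv:
    """Hypothetical corridor: start in state 0, reward 1 for reaching state 4."""

    n_states, n_actions = 5, 2  # actions: 0 = left, 1 = right

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        # Walls at both ends: the agent cannot leave the corridor
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        return self.state, float(done), done


rng = np.random.default_rng(0)
env = ChainEnv()
Q = np.zeros((env.n_states, env.n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # illustrative values

for episode in range(500):
    state = env.reset()
    for _ in range(100):
        if rng.random() < epsilon:
            action = int(rng.integers(env.n_actions))  # explore
        else:
            # Exploit, breaking ties at random so an all-zero row is not biased to "left"
            best = np.flatnonzero(Q[state] == Q[state].max())
            action = int(rng.choice(best))

        next_state, reward, done = env.step(action)

        # Q-learning update; no bootstrapping from the terminal state
        target = reward + gamma * np.max(Q[next_state]) * (not done)
        Q[state, action] += alpha * (target - Q[state, action])
        state = next_state
        if done:
            break

# Greedy action in each non-terminal state (1 means "move right")
greedy_policy = np.argmax(Q[:-1], axis=1)
print(greedy_policy)
```

The random tie-break matters more than it looks: `np.argmax` always returns the first index on ties, so with a zero-initialized table a plain argmax would keep choosing "left" until exploration happens to find the reward.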

Semi-Supervised & Self-Supervised Learning

Type	Description	Example
Semi-supervised	Small labeled set + large unlabeled set	Label propagation, pseudo-labeling
Self-supervised	Create labels from data structure itself	BERT (masked token prediction), SimCLR
Transfer learning	Pre-train on one task, fine-tune on another	ImageNet → medical imaging
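As a small semi-supervised illustration, the sketch below hides all but ten labels and lets scikit-learn infer the rest from the data's structure. It uses `LabelSpreading`, a regularized variant of label propagation, rather than plain label propagation; the two-blob dataset and the choice of ten labeled points are assumptions made for the demo.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.semi_supervised import LabelSpreading

# Hypothetical data: two well-separated groups, true labels known only for checking
X_ss, y_full = make_blobs(
    n_samples=200, centers=[[-4, 0], [4, 0]], cluster_std=1.0, random_state=0
)

# Keep 5 labels per class; scikit-learn marks unlabeled samples with -1
rng = np.random.default_rng(0)
y_partial = np.full_like(y_full, -1)
labeled_idx = np.concatenate([
    rng.choice(np.flatnonzero(y_full == c), size=5, replace=False) for c in (0, 1)
])
y_partial[labeled_idx] = y_full[labeled_idx]

# Spread the few known labels through a k-nearest-neighbor graph
model = LabelSpreading(kernel="knn", n_neighbors=7)
model.fit(X_ss, y_partial)

inferred = model.transduction_  # labels assigned to every training point
accuracy = (inferred == y_full).mean()
print(f"Labeled: {len(labeled_idx)} of {len(y_full)}, accuracy: {accuracy:.3f}")
```

This works well here because the unlabeled points form clear groups. When classes overlap heavily, propagated labels can reinforce early mistakes.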

Choosing the Right Paradigm

text

Do you have labeled data?
  ├─ Yes → Supervised learning
  │    ├─ Output is a category? → Classification
  │    └─ Output is a number? → Regression
  │
  └─ No → What is your goal?
       ├─ Find groups → Clustering (K-Means, DBSCAN)
       ├─ Detect outliers → Anomaly detection
       ├─ Reduce features → PCA, UMAP
       ├─ Sequential decisions → Reinforcement learning
       └─ Generate new data → Generative models (GAN, VAE)
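The tree above can also be written as a small lookup function, which is handy as a checklist in a notebook. This is only a sketch: `choose_paradigm` and its argument values are names invented for the example.

```python
def choose_paradigm(has_labels, output_type=None, goal=None):
    """Return the suggested approach, following the decision tree above."""
    if has_labels:
        by_output = {"category": "classification", "number": "regression"}
        return by_output.get(output_type, "supervised learning")

    by_goal = {
        "find groups": "clustering (K-Means, DBSCAN)",
        "detect outliers": "anomaly detection",
        "reduce features": "dimensionality reduction (PCA, UMAP)",
        "sequential decisions": "reinforcement learning",
        "generate new data": "generative models (GAN, VAE)",
    }
    return by_goal.get(goal, "unclear: define the goal first")


print(choose_paradigm(True, output_type="category"))
print(choose_paradigm(False, goal="find groups"))
```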

Common Mistakes

Shuffling time-series data before the train/test split: future observations leak into the training set, so split chronologically instead
Fitting StandardScaler before splitting: test-set statistics leak into training, so fit the scaler on the training split only
Relying on accuracy for imbalanced datasets: precision, recall, and F1 show how well the minority class is handled
Choosing K-Means for non-spherical clusters: consider DBSCAN for irregular shapes
Treating RL as a first choice: it is data-hungry and unstable, so try supervised learning first if labels exist
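The first two mistakes are both forms of data leakage, and scikit-learn has tools that help avoid them. The sketch below assumes synthetic data from `make_classification`. It wraps the scaler in a pipeline so it is fit on training rows only, then checks that `TimeSeriesSplit` always trains on earlier rows than it tests on.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import TimeSeriesSplit, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical dataset for the demo
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)

# Split first, then let the pipeline fit the scaler on the training split only
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=0
)
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipe.fit(X_tr, y_tr)
test_accuracy = pipe.score(X_te, y_te)

# The scaler's learned mean comes from the training rows, not the full dataset
scaler_mean = pipe.named_steps["standardscaler"].mean_
print(np.allclose(scaler_mean, X_tr.mean(axis=0)))

# For time-ordered rows: every fold trains on the past and tests on the future
tscv = TimeSeriesSplit(n_splits=3)
ordered = all(
    train_idx.max() < test_idx.min() for train_idx, test_idx in tscv.split(X_demo)
)
print(ordered)
```

Passing the whole pipeline to cross-validation utilities gives the same protection per fold, since the scaler is refit inside each training split.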

