AiTechWorlds
AiTechWorlds
A professional chef doesn't use one knife for everything. A serrated bread knife would mangle a tomato. A paring knife would struggle through a butternut squash. A cleaver would destroy delicate herbs. Each knife is precisely the right tool for a specific job.
Sorting algorithms are the same. No single algorithm wins in every situation. The "right" sorting algorithm depends on:
This lesson answers: for any situation, which algorithm should you reach for, and why?
| Algorithm | Best Case | Average Case | Worst Case | Space | Stable? | When to Use |
|---|---|---|---|---|---|---|
| Bubble Sort | O(n) | O(n²) | O(n²) | O(1) | Yes | Nearly sorted, educational |
| Selection Sort | O(n²) | O(n²) | O(n²) | O(1) | No | Minimum writes needed |
| Insertion Sort | O(n) | O(n²) | O(n²) | O(1) | Yes | Small arrays, nearly sorted, online |
| Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes | Guaranteed performance, external sort, linked lists |
| Quick Sort | O(n log n) | O(n log n) | O(n²) | O(log n) | No | General purpose, cache-critical, in-place needed |
| Heap Sort | O(n log n) | O(n log n) | O(n log n) | O(1) | No | Guaranteed O(n log n) in-place |
All complexities verified. Space complexity refers to auxiliary space only.
Is n very small (< 20 elements)?
├── YES → Use INSERTION SORT
│ (low overhead beats asymptotic efficiency)
└── NO →
Is memory critically constrained (O(1) space required)?
├── YES → Need guaranteed O(n log n)?
│ ├── YES → Use HEAP SORT
│ └── NO → Use QUICK SORT (random pivot)
└── NO →
Is stability required?
├── YES → Use MERGE SORT
└── NO →
Is data already nearly sorted?
├── YES → Use INSERTION SORT or TIMSORT
└── NO →
Is cache efficiency critical?
├── YES → Use QUICK SORT (random pivot)
└── NO → Use MERGE SORT
In practice for most applications: use your language's built-in sort (Timsort in Python/Java, pdqsort in Rust). They're hybrids that handle all these cases automatically.
# Insertion sort on nearly sorted data — approaches O(n)
data = [1, 2, 3, 5, 4, 6, 7, 8, 9, 10] # only one swap needed
def insertion_sort(arr):
for i in range(1, len(arr)):
key = arr[i]
j = i - 1
while j >= 0 and arr[j] > key:
arr[j + 1] = arr[j]
j -= 1
arr[j + 1] = key
return arr
For this input, insertion sort does only 1 displacement — essentially O(n). Merge sort would still do O(n log n) work regardless. Insertion sort wins here.
# Merge sort on linked lists — no random access needed
# The merge step only needs sequential traversal
# This makes merge sort the ONLY practical O(n log n) sort for linked lists
# (quicksort requires random access to be efficient)
Merge sort is also the only algorithm suitable for external sorting — data too large for RAM. Its sequential access pattern minimizes expensive disk seeks.
import random
def quick_sort_random(arr, lo=0, hi=None):
if hi is None: hi = len(arr) - 1
if lo < hi:
# Random pivot prevents worst-case on sorted inputs
r = random.randint(lo, hi)
arr[r], arr[hi] = arr[hi], arr[r]
p = partition(arr, lo, hi)
quick_sort_random(arr, lo, p - 1)
quick_sort_random(arr, p + 1, hi)
def partition(arr, lo, hi):
pivot, i = arr[hi], lo - 1
for j in range(lo, hi):
if arr[j] <= pivot:
i += 1
arr[i], arr[j] = arr[j], arr[i]
arr[i+1], arr[hi] = arr[hi], arr[i+1]
return i + 1
For in-memory sorting of large random datasets, quicksort's cache efficiency typically makes it 2–3x faster than merge sort in wall-clock time, despite identical asymptotic complexity.
Heap sort achieves what seems impossible: guaranteed O(n log n) time with O(1) space. It builds a max-heap, then repeatedly extracts the maximum.
def heap_sort(arr):
n = len(arr)
# Build max heap
for i in range(n // 2 - 1, -1, -1):
heapify(arr, n, i)
# Extract elements from heap one by one
for i in range(n - 1, 0, -1):
arr[0], arr[i] = arr[i], arr[0] # move current max to end
heapify(arr, i, 0) # restore heap for remaining
def heapify(arr, n, i):
largest = i
left, right = 2*i + 1, 2*i + 2
if left < n and arr[left] > arr[largest]: largest = left
if right < n and arr[right] > arr[largest]: largest = right
if largest != i:
arr[i], arr[largest] = arr[largest], arr[i]
heapify(arr, n, largest)
Heap sort's weakness: poor cache performance due to non-sequential memory access in the heap. It's rarely faster than quicksort in practice despite better theoretical guarantees.
Python's list.sort() and sorted() use Timsort, invented by Tim Peters in 2002. It is arguably the most sophisticated general-purpose sorting algorithm in widespread use.
Timsort's strategy:
Why Timsort wins on real-world data:
# Python's sort — Timsort under the hood
students = [
("Alice", 90), ("Bob", 85), ("Charlie", 90), ("Diana", 85)
]
# Sort by grade (stable — equal grades preserve original order)
students.sort(key=lambda x: x[1])
print(students)
# Output: [('Bob', 85), ('Diana', 85), ('Alice', 90), ('Charlie', 90)]
# Bob before Diana (original order preserved), Alice before Charlie
import time
import random
def get_algorithms():
return {
"Insertion Sort": insertion_sort,
"Merge Sort": merge_sort,
"Quick Sort": lambda a: quick_sort_random(a[:]),
}
def run_benchmark(n=5000):
data = [random.randint(0, 10000) for _ in range(n)]
print(f"\nBenchmark on {n} random elements:")
print(f"{'Algorithm':<20} {'Time (ms)':>12}")
print("-" * 35)
for name, func in get_algorithms().items():
test_data = data[:]
start = time.perf_counter()
func(test_data)
elapsed = (time.perf_counter() - start) * 1000
print(f"{name:<20} {elapsed:>11.2f}ms")
# Python's built-in for comparison
test_data = data[:]
start = time.perf_counter()
test_data.sort()
elapsed = (time.perf_counter() - start) * 1000
print(f"{'Python built-in':<20} {elapsed:>11.2f}ms")
run_benchmark(5000)
Typical output (approximate):
Benchmark on 5000 random elements:
Algorithm Time (ms)
-----------------------------------
Insertion Sort 612.40ms
Merge Sort 12.80ms
Quick Sort 8.50ms
Python built-in 1.20ms
Python's built-in is fastest because Timsort is implemented in C, not Python — showing that language-level optimization matters too.
When your data is integers within a known range, you can break the O(n log n) lower bound that applies to comparison-based sorts.
Counting Sort — O(n + k) where k is the value range:
def counting_sort(arr, max_val):
count = [0] * (max_val + 1)
for x in arr: count[x] += 1 # count occurrences
result = []
for val, freq in enumerate(count):
result.extend([val] * freq) # expand counts back to sorted list
return result
arr = [4, 2, 2, 8, 3, 3, 1]
print(counting_sort(arr, 8))
# Output: [1, 2, 2, 3, 3, 4, 8]
# Time: O(n + 8) = O(n) for this input
Only practical when k (range) is not much larger than n (array size). For sorting 100 numbers between 0 and 1,000,000, counting sort uses 1MB of memory and is slower than quicksort.
Radix Sort — O(d·n) where d is number of digits: Processes integers digit by digit using counting sort. Sorts a million 32-bit integers in ~4 passes × O(n) = O(n) effectively. Used in specialized databases and graphics pipelines.
When an interviewer asks "what sort would you use here?":
| Situation | Answer |
|---|---|
| General purpose, don't care about stability | Quicksort (or built-in) |
| Stability required | Merge sort |
| Data mostly sorted | Insertion sort |
| Memory extremely constrained, need O(n log n) | Heap sort |
| Integers in bounded range | Counting sort or Radix sort |
| Data larger than RAM | External merge sort |
| Small array (< 20 elements) | Insertion sort |
| Linked list | Merge sort |
You now have a complete toolkit for sorting. The next step is applying these same analytical skills — algorithm selection, complexity analysis, and trade-off evaluation — to more complex algorithmic problems.
Get this course's notes on Telegram!
Free cheat sheets, summaries & practice exercises