Dictionaries & JSON

Dictionaries are one of Python's most important data structures, alongside lists. They map keys to values, offer average-case O(1) lookups by key, and underpin most of the structured data you will meet in practice, from API responses to database records.
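
To make the lookup claim concrete, here is a minimal sketch that indexes a list of records by key, so each lookup becomes a single hash lookup instead of a scan. The records and the "id" field are made up for illustration.

```python
# Hypothetical records, e.g. rows returned by a database query
records = [
    {"id": 101, "name": "Alice"},
    {"id": 102, "name": "Bob"},
    {"id": 103, "name": "Carol"},
]

def find_by_scan(user_id):
    """Check records one by one until a match is found (O(n))."""
    for record in records:
        if record["id"] == user_id:
            return record
    return None

# Build the index once; each later lookup is O(1) on average
by_id = {record["id"]: record for record in records}

print(find_by_scan(102)["name"])  # Bob
print(by_id[102]["name"])         # Bob
```

Both approaches return the same record; the dict index pays off once the list is large or the lookup is repeated many times.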

Creating Dictionaries

# Basic dictionary
person = {
    "name": "Alice",
    "age": 30,
    "city": "New York"
}

# Empty dict
empty = {}

# From a list of key-value pairs (tuples)
d = dict([("a", 1), ("b", 2), ("c", 3)])

# Dict comprehension
squares = {x: x**2 for x in range(6)}
# {0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

Accessing Data

person = {"name": "Alice", "age": 30, "email": "alice@example.com"}

# Direct access — raises KeyError if missing
print(person["name"])       # "Alice"

# .get() — returns None if missing (safe)
print(person.get("phone"))          # None
print(person.get("phone", "N/A"))   # "N/A" — default value

# Check if key exists
if "email" in person:
    print(person["email"])

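Prefer .get() when a key is optional and a missing value is a normal case. It returns None (or the default you pass) instead of raising KeyError. When a key is required, direct access is the better choice, because the KeyError points straight at the bad data.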
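
One caveat is worth a short sketch: .get() cannot tell a missing key apart from a key whose stored value is None. The example below uses a made-up settings dict to show the difference, plus a try/except alternative for cases where a missing key should be handled explicitly. The "UTC" fallback is an assumption chosen for the example.

```python
# Hypothetical settings: "theme" is present but set to None
settings = {"theme": None, "language": "en"}

# Both calls return None, so .get() alone can't distinguish the two cases
print(settings.get("theme"))      # None (key exists, value is None)
print(settings.get("timezone"))   # None (key is missing)

# A membership test tells them apart
print("theme" in settings)        # True
print("timezone" in settings)     # False

# When a missing key needs explicit handling, catch KeyError
try:
    timezone = settings["timezone"]
except KeyError:
    timezone = "UTC"  # assumed fallback for this example
print(timezone)                   # UTC
```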

Modifying Dictionaries

user = {"name": "Bob", "score": 0}

# Add or update
user["score"] = 100          # Update existing
user["level"] = "gold"       # Add new key

# Update multiple keys at once
user.update({"score": 150, "rank": 1, "active": True})

# Remove
del user["rank"]
value = user.pop("level")     # Removes and returns the value
user.pop("nonexistent", None) # Safe pop — returns None if missing

# Clear all
user.clear()
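
Note that update() changes the dictionary in place. As a small sketch of two related patterns, the example below merges two dicts into a new one without touching the originals, and uses setdefault() to add a key only when it is absent. The defaults and overrides dicts are invented for the example.

```python
defaults = {"theme": "light", "page_size": 20}
overrides = {"page_size": 50}

# Unpacking builds a new dict; later keys win, so overrides replace defaults
merged = {**defaults, **overrides}
print(merged)     # {'theme': 'light', 'page_size': 50}
print(defaults)   # {'theme': 'light', 'page_size': 20} (unchanged)

# setdefault() returns the existing value, or stores and returns the default
merged.setdefault("theme", "dark")    # key exists, so nothing changes
merged.setdefault("language", "en")   # key missing, so it is added
print(merged)     # {'theme': 'light', 'page_size': 50, 'language': 'en'}
```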

Iterating Over Dictionaries

product = {"name": "Laptop", "price": 999.99, "in_stock": True}

# Keys (default iteration)
for key in product:
    print(key)

# Values
for value in product.values():
    print(value)

# Both — most common pattern
for key, value in product.items():
    print(f"{key}: {value}")

# Convert to lists
keys = list(product.keys())     # ['name', 'price', 'in_stock']
values = list(product.values()) # ['Laptop', 999.99, True]
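
A common pitfall when looping is changing the dictionary's size mid-iteration. The sketch below, using a made-up stock dict, shows the error and two ways around it: iterating over a snapshot of the keys, or building a new dict with a comprehension.

```python
stock = {"apples": 0, "pears": 4, "plums": 0}

# Deleting while iterating over the dict itself raises RuntimeError
broken = dict(stock)  # work on a copy so `stock` stays intact
try:
    for name in broken:
        if broken[name] == 0:
            del broken[name]
except RuntimeError as error:
    print(f"RuntimeError: {error}")

# Option 1: build a new dict that keeps only the wanted items
in_stock = {name: count for name, count in stock.items() if count > 0}
print(in_stock)   # {'pears': 4}

# Option 2: iterate over a snapshot of the keys, then delete safely
for name in list(stock):
    if stock[name] == 0:
        del stock[name]
print(stock)      # {'pears': 4}
```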

Nested Dictionaries

Real-world data is often nested:

users = {
    "alice": {
        "email": "alice@example.com",
        "role": "admin",
        "permissions": ["read", "write", "delete"]
    },
    "bob": {
        "email": "bob@example.com",
        "role": "user",
        "permissions": ["read"]
    }
}

# Access nested data
print(users["alice"]["role"])                    # "admin"
print(users["bob"]["permissions"][0])            # "read"

# Safe nested access
alice_phone = users.get("alice", {}).get("phone", "not set")
print(alice_phone)   # "not set"
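
Chained .get() calls get unwieldy past two levels, and they fail if an intermediate value is not a dict. One possible approach is a small helper that walks a list of keys. The get_path function below is a hypothetical helper written for this lesson, not part of the standard library, and the sample data is made up.

```python
def get_path(data, keys, default=None):
    """Follow `keys` through nested dicts; return `default` if any step fails."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

# Sample data for illustration
users = {"alice": {"email": "alice@example.com", "profile": {"city": "Paris"}}}

print(get_path(users, ["alice", "profile", "city"]))        # Paris
print(get_path(users, ["alice", "phone"], "not set"))       # not set
print(get_path(users, ["carol", "profile", "city"], "?"))   # ?
```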

The defaultdict — Never Check if Key Exists

from collections import defaultdict

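# Sample input (any string works here)
text = "the quick brown fox jumps over the lazy dog the end"

# Regular dict: each key must be initialized before it can be incremented
word_count = {}
for word in text.split():
    if word not in word_count:
        word_count[word] = 0
    word_count[word] += 1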

# defaultdict — automatically creates default value
word_count = defaultdict(int)    # Default is 0
for word in text.split():
    word_count[word] += 1        # No check needed!

# defaultdict with list
groups = defaultdict(list)
data = [("fruit", "apple"), ("veggie", "carrot"), ("fruit", "banana")]
for category, item in data:
    groups[category].append(item)
# {'fruit': ['apple', 'banana'], 'veggie': ['carrot']}
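
One behavior to keep in mind: reading a missing key from a defaultdict with square brackets creates that key. The sketch below shows the side effect and how .get() and the in operator avoid it. The fruit names are placeholder data.

```python
from collections import defaultdict

counts = defaultdict(int)
counts["apple"] += 1

# Reading a missing key with [] calls the factory and stores the result
print(counts["banana"])       # 0
print("banana" in counts)     # True
print(len(counts))            # 2

# .get() and `in` do not trigger the default factory
print(counts.get("cherry"))   # None
print("cherry" in counts)     # False

# Convert to a plain dict when the automatic defaults are no longer wanted
plain = dict(counts)
print(plain)                  # {'apple': 1, 'banana': 0}
```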

Working with JSON

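JSON (JavaScript Object Notation) is the most widely used data format for web APIs. Python dicts map closely to JSON objects, with a few differences: JSON object keys are always strings, and JSON has no tuple or set type.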

import json

# Dict to JSON string (serialization)
data = {
    "user": "Alice",
    "scores": [92, 78, 85],
    "metadata": {"active": True, "level": 3}
}

json_string = json.dumps(data)
print(json_string)
# {"user": "Alice", "scores": [92, 78, 85], "metadata": {"active": true, "level": 3}}

# Pretty print for readability
print(json.dumps(data, indent=2))

# JSON string to dict (deserialization)
raw_json = '{"name": "Bob", "age": 25}'
obj = json.loads(raw_json)
print(obj["name"])   # "Bob"
print(type(obj))     # <class 'dict'>
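
The mapping between dicts and JSON objects is close but not exact. As a short sketch with made-up data, the round trip below shows what changes: non-string keys become strings, tuples become lists, and types JSON does not support (such as sets) raise TypeError.

```python
import json

original = {1: "one", "point": (3, 4), "missing": None}

encoded = json.dumps(original)
print(encoded)    # {"1": "one", "point": [3, 4], "missing": null}

restored = json.loads(encoded)
print(restored)   # {'1': 'one', 'point': [3, 4], 'missing': None}
print(restored == original)   # False: the key and the tuple changed type

# Values with no JSON equivalent cannot be serialized by default
try:
    json.dumps({"tags": {"a", "b"}})
except TypeError as error:
    print(f"TypeError: {error}")
```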

Reading and Writing JSON Files

# Save to file
with open("data.json", "w") as f:
    json.dump(data, f, indent=2)

# Load from file
with open("data.json", "r") as f:
    loaded_data = json.load(f)
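
Files can be missing or contain invalid JSON, so loading code often needs a fallback. The load_json helper below is a hypothetical wrapper written for illustration; it returns a default instead of raising. The example writes its test files to a temporary directory so it can run anywhere.

```python
import json
import os
import tempfile

def load_json(path, default=None):
    """Return parsed JSON from `path`, or `default` if the file is missing or invalid."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return default
    except json.JSONDecodeError as error:
        print(f"Invalid JSON in {path}: {error}")
        return default

with tempfile.TemporaryDirectory() as folder:
    good_path = os.path.join(folder, "good.json")
    bad_path = os.path.join(folder, "bad.json")
    with open(good_path, "w", encoding="utf-8") as f:
        json.dump({"user": "Alice"}, f)
    with open(bad_path, "w", encoding="utf-8") as f:
        f.write("{not valid json")

    good = load_json(good_path)
    bad = load_json(bad_path, default={})
    missing = load_json(os.path.join(folder, "nope.json"), default={})

print(good)      # {'user': 'Alice'}
print(bad)       # {}
print(missing)   # {}
```

Whether to swallow these errors or let them propagate depends on the program; returning a default suits optional files such as caches or user settings.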

Working with API Responses

import requests

response = requests.get("https://api.example.com/users/1", timeout=10)
response.raise_for_status()   # Raise an exception on 4xx/5xx responses
user = response.json()        # Parses the JSON body (here, into a dict)

print(f"Name: {user['name']}")
print(f"Email: {user['email']}")

# Handle nested data
for post in user.get("posts", []):
    print(f"  Post: {post['title']}")
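
API payloads often omit fields or send null where a list is expected. The sketch below handles a parsed payload defensively without making a network call. The summarize_user function and the sample payload are assumptions made for illustration; a real API defines its own fields.

```python
def summarize_user(payload):
    """Build display lines from a parsed user payload, tolerating missing fields."""
    lines = [
        f"Name: {payload.get('name', 'unknown')}",
        f"Email: {payload.get('email', 'unknown')}",
    ]
    # `or []` also covers the case where the API sends "posts": null
    for post in payload.get("posts") or []:
        lines.append(f"  Post: {post.get('title', '(untitled)')}")
    return lines

# Stand-in for what response.json() might return (assumed shape)
sample = {"name": "Alice", "posts": [{"title": "Hello"}, {}]}
for line in summarize_user(sample):
    print(line)
# Name: Alice
# Email: unknown
#   Post: Hello
#   Post: (untitled)
```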

Real-World Example: Analyzing Survey Data

import json
from collections import Counter, defaultdict

# Simulate survey responses
responses = [
    {"age": 28, "country": "US", "favorite_lang": "Python"},
    {"age": 34, "country": "UK", "favorite_lang": "JavaScript"},
    {"age": 22, "country": "US", "favorite_lang": "Python"},
    {"age": 45, "country": "DE", "favorite_lang": "Go"},
    {"age": 31, "country": "US", "favorite_lang": "Python"},
]

# Count languages
lang_counts = Counter(r["favorite_lang"] for r in responses)
print(lang_counts.most_common(3))
# [('Python', 3), ('JavaScript', 1), ('Go', 1)]

# Group by country
by_country = defaultdict(list)
for r in responses:
    by_country[r["country"]].append(r)

# Average age by country
for country, group in by_country.items():
    avg_age = sum(r["age"] for r in group) / len(group)
    print(f"{country}: avg age {avg_age:.1f}")
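
To connect the analysis back to JSON, here is one possible way to collect the per-country results into a summary dict and serialize it. The summary layout (count and avg_age per country) is an assumption chosen for this sketch.

```python
import json
from collections import defaultdict

# Same survey responses as in the example above
responses = [
    {"age": 28, "country": "US", "favorite_lang": "Python"},
    {"age": 34, "country": "UK", "favorite_lang": "JavaScript"},
    {"age": 22, "country": "US", "favorite_lang": "Python"},
    {"age": 45, "country": "DE", "favorite_lang": "Go"},
    {"age": 31, "country": "US", "favorite_lang": "Python"},
]

# Collect ages per country
ages_by_country = defaultdict(list)
for r in responses:
    ages_by_country[r["country"]].append(r["age"])

# Build a plain dict of results that json.dumps() can serialize
summary = {
    country: {"count": len(ages), "avg_age": round(sum(ages) / len(ages), 1)}
    for country, ages in ages_by_country.items()
}

print(json.dumps(summary, indent=2, sort_keys=True))
```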

Key Takeaways

Use .get(key, default) whenever a key might not exist
Dicts preserve insertion order (a language guarantee since Python 3.7), but two dicts with the same items compare equal regardless of order
defaultdict eliminates the "check before append" pattern
JSON objects and dicts convert easily with json.loads() / json.dumps(), but JSON keys are always strings and tuples come back as lists
Nested dicts represent real-world hierarchical data — get comfortable accessing them

Next lesson: Classes and Objects — organizing code into reusable blueprints.