What are AI Agents? Beyond Chatbots
What Are AI Agents? The Mental Model
AI agents are the next evolution beyond chatbots. A chatbot answers questions. An agent takes actions — it browses the web, writes and runs code, manages files, calls APIs, and iterates until a task is complete. Understanding what agents are (and aren't) is essential before building one.
The Mental Model: LLM + Tools + Loop
An AI agent combines three components (an LLM, tools, and memory) and runs them in a loop:
┌─────────────────────────────────┐
│            AI AGENT             │
│                                 │
│  1. LLM (the brain)             │
│     - Reasons about the task    │
│     - Decides what to do next   │
│                                 │
│  2. Tools (the hands)           │
│     - Web search                │
│     - Code execution            │
│     - File read/write           │
│     - API calls                 │
│     - Database queries          │
│                                 │
│  3. Memory (the context)        │
│     - Short-term: conversation  │
│     - Long-term: vector store   │
└─────────────────────────────────┘
The key addition over a chatbot is the action loop: the agent can take actions, observe their results, and reason about what to do next — repeating until the goal is achieved.
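The structure above can be sketched in a few lines. This is a minimal illustration rather than a real implementation: `fake_llm` stands in for an actual model call, and `Agent`, `calculator`, and the `"final_answer"` convention are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Callable


def calculator(expression: str) -> str:
    """Tool: add two integers written as 'a + b' (illustrative only)."""
    left, right = expression.split("+")
    return str(int(left) + int(right))


def fake_llm(memory: list[str]) -> tuple[str, str]:
    """Stand-in for a real model call. Returns (action, argument).

    It requests the calculator once, then finishes with the observed result.
    """
    observations = [m for m in memory if m.startswith("observation: ")]
    if not observations:
        return ("calculator", "2 + 3")
    return ("final_answer", observations[-1].removeprefix("observation: "))


@dataclass
class Agent:
    llm: Callable[[list[str]], tuple[str, str]]      # 1. the brain
    tools: dict[str, Callable[[str], str]]           # 2. the hands
    memory: list[str] = field(default_factory=list)  # 3. short-term context

    def run(self, task: str, max_steps: int = 5) -> str:
        self.memory.append(f"task: {task}")
        for _ in range(max_steps):                   # the action loop
            action, argument = self.llm(self.memory)
            if action == "final_answer":
                return argument
            result = self.tools[action](argument)    # act
            self.memory.append(f"observation: {result}")  # observe
        return "stopped: step limit reached"


agent = Agent(llm=fake_llm, tools={"calculator": calculator})
answer = agent.run("What is 2 + 3?")
print(answer)
```

The `max_steps` cap is a design choice worth keeping even in a toy version: without it, a model that never emits a final answer would loop forever.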
The ReAct Loop (Reason + Act)
The most important concept in agent design is the ReAct (Reasoning + Acting) loop:
Task: "Find the 3 most cited ML papers from 2024 and summarize each"
Iteration 1:
Reason: I need to search for highly cited ML papers from 2024
Act: [web_search("most cited machine learning papers 2024")]
Observe: Search results with paper titles and citation counts
Iteration 2:
Reason: I found some papers. I need to get the abstracts to summarize them.
Act: [web_fetch("https://arxiv.org/abs/2401.xxxxx")]
Observe: Paper abstract and key findings
Iteration 3:
Reason: I have enough information. I'll now write the summary.
Act: [final_answer("1. Paper X (5,234 citations): ...")]
The agent decides what to do, does it, sees the result, and decides what to do next — just like a human researcher would.
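The trace above maps onto a short loop. The sketch below assumes stub tools and a `scripted_policy` function in place of a real LLM. All names are hypothetical, and `https://example.com/paper` is a placeholder URL. What matters is the shape: the policy sees the observations so far and chooses the next step.

```python
from typing import Callable, Optional


# Stub tools: a real agent would call a search API and fetch pages here.
def web_search(query: str) -> str:
    return f"(stub) results for: {query}"


def web_fetch(url: str) -> str:
    return f"(stub) contents of: {url}"


TOOLS: dict[str, Callable[[str], str]] = {
    "web_search": web_search,
    "web_fetch": web_fetch,
}


def scripted_policy(observations: list[str]) -> tuple[str, str, str]:
    """Stand-in for the LLM: returns (thought, action, argument)."""
    if len(observations) == 0:
        return ("I need to search for highly cited papers",
                "web_search", "most cited machine learning papers 2024")
    if len(observations) == 1:
        return ("I need an abstract to summarize",
                "web_fetch", "https://example.com/paper")
    return ("I have enough information",
            "final_answer", f"Summary written from {len(observations)} observations")


def react_loop(policy, tools, max_steps: int = 5) -> tuple[Optional[str], list[str]]:
    observations: list[str] = []
    trace: list[str] = []
    for _ in range(max_steps):
        thought, action, argument = policy(observations)   # Reason
        trace.append(f"Reason: {thought}")
        trace.append(f"Act: {action}({argument!r})")       # Act
        if action == "final_answer":
            return argument, trace
        observation = tools[action](argument)
        observations.append(observation)                   # Observe
        trace.append(f"Observe: {observation}")
    return None, trace  # step budget exhausted without an answer


answer, trace = react_loop(scripted_policy, TOOLS)
print("\n".join(trace))
```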
Chatbot vs Agent: The Key Difference
| | Chatbot | Agent |
|---|---|---|
| Actions | Only outputs text | Can take real actions |
| Tools | None (or only retrieval) | Web search, code exec, APIs |
| Execution | Single turn | Multi-step loop |
| Autonomy | None — waits for human | Can run unsupervised |
| Memory | Usually none | Short and long-term |
| Error handling | Can't retry | Can detect errors and retry |
A chatbot says "here's how to fix your code." An agent fixes your code directly.
Types of Agents
Single Agent: One LLM with multiple tools. Best for focused tasks.
User: "Analyze the sales data in sales.csv and write a report"
Agent: reads file → analyzes data → writes code to generate charts → writes report
Multi-Agent System: Multiple specialized agents that collaborate.
Orchestrator Agent
├── Research Agent (searches web, reads papers)
├── Analyst Agent (processes data, runs calculations)
├── Writer Agent (produces final document)
└── Reviewer Agent (checks quality, facts)
Multi-agent systems are more powerful but harder to build and debug.
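The orchestrator pattern above can be sketched with plain functions standing in for each specialized agent. Every function name and return value here is hypothetical. In a real system each one would wrap its own LLM calls and tools.

```python
def research_agent(topic: str) -> list[str]:
    """Stub: would search the web and read papers."""
    return [f"note about {topic} #1", f"note about {topic} #2"]


def analyst_agent(notes: list[str]) -> dict:
    """Stub: would process data and run calculations."""
    return {"note_count": len(notes)}


def writer_agent(topic: str, analysis: dict) -> str:
    """Stub: would produce the final document."""
    return f"Report on {topic}: based on {analysis['note_count']} notes."


def reviewer_agent(draft: str) -> bool:
    """Stub: would check quality and facts. Here it only checks the shape."""
    return draft.startswith("Report on") and draft.endswith(".")


def orchestrator(topic: str) -> str:
    """Route work between the specialized agents and gate on review."""
    notes = research_agent(topic)
    analysis = analyst_agent(notes)
    draft = writer_agent(topic, analysis)
    if not reviewer_agent(draft):
        raise ValueError("reviewer rejected the draft")
    return draft


report = orchestrator("agents")
print(report)
```

Even in this toy form, a failure in any one stage surfaces only at the end, which hints at why multi-agent systems are harder to debug.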
Autonomous Agent: Runs continuously, monitoring and acting without human triggers.
Customer Support Agent:
- Monitors support email inbox
- Classifies issues
- Resolves simple ones automatically
- Escalates complex ones to humans
- Runs 24/7
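One pass of that support workflow might look like the sketch below. It assumes a keyword-based `classify` function in place of an LLM classifier and an in-memory list in place of a real inbox. The issue labels and canned reply are made up for illustration.

```python
# Canned replies for issues considered safe to resolve automatically.
SIMPLE_ISSUES = {
    "password_reset": "Use the 'Forgot password' link on the sign-in page.",
}


def classify(email_body: str) -> str:
    """Keyword stand-in for an LLM classifier."""
    if "password" in email_body.lower():
        return "password_reset"
    return "other"


def handle(email_body: str) -> dict:
    """Resolve simple issues automatically and escalate everything else."""
    label = classify(email_body)
    if label in SIMPLE_ISSUES:
        return {"label": label, "action": "auto_reply", "reply": SIMPLE_ISSUES[label]}
    return {"label": label, "action": "escalate", "reply": None}


inbox = ["I forgot my password", "Your invoice charged me twice"]
results = [handle(message) for message in inbox]
for message, result in zip(inbox, results):
    print(f"{message!r} -> {result['action']}")
```

A continuously running version would wrap `handle` in a polling loop or a queue consumer. Escalation is the default branch here, so anything the classifier does not recognize goes to a human.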
Real Production Examples
GitHub Copilot Workspace — Agent that takes an issue description, understands the codebase, proposes a plan, and writes the full implementation across multiple files.
Devin (Cognition AI) — Software engineering agent that takes a task, sets up its own dev environment, writes and tests code, debugs errors, and commits working code.
Perplexity AI — Research agent that breaks down complex questions, searches multiple sources, synthesizes findings, and cites its sources.
OpenAI Operator — Web browser agent that can book flights, fill forms, and complete multi-step web tasks autonomously.
Why Agents Are Hard: The Real Challenges
Building agents is deceptively difficult. Here's what goes wrong:
Hallucination in action loops: An agent that hallucinates tool arguments can take incorrect actions that are hard to reverse. A wrong API call can delete data.
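Cost and latency: Each step in an agent loop costs tokens and time, and the growing conversation history is typically sent to the model again on every step. A 20-step task with GPT-4o can cost $0.50 or more and take over two minutes.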
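A rough back-of-the-envelope model shows why multi-step runs get expensive. The sketch assumes the whole context is resent on each step and grows by a fixed amount per step. The token counts and per-million-token prices are placeholder values chosen for the arithmetic, not quoted rates for any provider.

```python
def estimate_run_cost(
    steps: int,
    base_prompt_tokens: int,
    tokens_added_per_step: int,
    output_tokens_per_step: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> tuple[int, int, float]:
    """Return (total_input_tokens, total_output_tokens, cost_in_dollars)."""
    total_input = 0
    context = base_prompt_tokens
    for _ in range(steps):
        total_input += context            # the whole context is sent again
        context += tokens_added_per_step  # model output and tool result are appended
    total_output = steps * output_tokens_per_step
    cost = (total_input * input_price_per_million
            + total_output * output_price_per_million) / 1_000_000
    return total_input, total_output, cost


total_in, total_out, dollars = estimate_run_cost(
    steps=20,
    base_prompt_tokens=1_000,
    tokens_added_per_step=1_500,
    output_tokens_per_step=300,
    input_price_per_million=2.50,    # placeholder price, not a quoted rate
    output_price_per_million=10.00,  # placeholder price, not a quoted rate
)
print(f"{total_in:,} input tokens, {total_out:,} output tokens, ${dollars:.2f}")
```

With these made-up numbers the run sends 305,000 input tokens in total, although the largest single prompt is only 29,500 tokens. Input volume grows roughly with the square of the step count, which is why long loops cost more than their step count suggests.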
Error recovery: When a tool returns an unexpected error, agents often spiral into loops or give up. Robust error handling is non-trivial.
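One common guard against spiraling is a hard cap on retries, with the collected errors returned so the agent (or a human) can decide what to do next. The sketch below is one possible shape. `call_with_retries` and `make_flaky_tool` are hypothetical helpers, and a production version would likely add backoff and distinguish retryable from fatal errors.

```python
from typing import Callable


def call_with_retries(tool: Callable[[str], str], argument: str, max_attempts: int = 3) -> dict:
    """Call a tool at most max_attempts times, recording each failure."""
    errors: list[str] = []
    for attempt in range(1, max_attempts + 1):
        try:
            result = tool(argument)
            return {"ok": True, "result": result, "attempts": attempt, "errors": errors}
        except Exception as exc:  # broad on purpose: any tool failure counts
            errors.append(f"attempt {attempt}: {exc}")
    # Give up explicitly instead of looping forever.
    return {"ok": False, "result": None, "attempts": max_attempts, "errors": errors}


def make_flaky_tool(failures_before_success: int) -> Callable[[str], str]:
    """Build a test tool that fails a fixed number of times, then succeeds."""
    calls = {"count": 0}

    def tool(argument: str) -> str:
        calls["count"] += 1
        if calls["count"] <= failures_before_success:
            raise TimeoutError("upstream timed out")
        return f"ok: {argument}"

    return tool


recovered = call_with_retries(make_flaky_tool(2), "ping")
gave_up = call_with_retries(make_flaky_tool(5), "ping")
print(recovered["ok"], recovered["attempts"])
print(gave_up["ok"], gave_up["errors"])
```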
Prompt injection: Malicious content in retrieved documents can hijack agent behavior. Web scraping agents are especially vulnerable.
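One partial mitigation is to limit what the agent may do once untrusted content has entered its context. The sketch below is an assumption-laden illustration, not a complete defense. `ToolGate` and the tool names are hypothetical, and the rule is deliberately blunt: after reading untrusted content, tools with side effects are blocked.

```python
READ_ONLY_TOOLS = {"web_search", "web_fetch"}
SIDE_EFFECT_TOOLS = {"send_email", "delete_file"}


class ToolGate:
    """Tracks whether untrusted content has been read and gates tools on it."""

    def __init__(self) -> None:
        self.tainted = False

    def record_untrusted_content(self) -> None:
        """Call this whenever retrieved web pages or documents enter the context."""
        self.tainted = True

    def is_allowed(self, tool_name: str) -> bool:
        if tool_name in READ_ONLY_TOOLS:
            return True
        if tool_name in SIDE_EFFECT_TOOLS:
            return not self.tainted  # block side effects after untrusted input
        return False                 # unknown tools are denied by default


gate = ToolGate()
before = gate.is_allowed("send_email")
gate.record_untrusted_content()      # e.g. the agent just fetched a web page
after = gate.is_allowed("send_email")
print(before, after)
```

In practice a blocked call would more likely be routed to a human for confirmation than dropped outright.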
Unreliable tool calls: LLMs sometimes call tools with wrong arguments or in the wrong order.
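Validating a proposed tool call before executing it catches many of these mistakes early. The sketch below uses a hand-written schema of argument names and types. `TOOL_SCHEMAS` and `validate_call` are hypothetical, and real projects often use JSON Schema or a validation library instead.

```python
# Expected argument names and types for each tool (illustrative).
TOOL_SCHEMAS: dict[str, dict[str, type]] = {
    "web_search": {"query": str},
    "delete_file": {"path": str, "confirm": bool},
}


def validate_call(name: str, arguments: dict) -> list[str]:
    """Return a list of problems. An empty list means the call looks valid."""
    schema = TOOL_SCHEMAS.get(name)
    if schema is None:
        return [f"unknown tool: {name}"]
    problems: list[str] = []
    for key, expected_type in schema.items():
        if key not in arguments:
            problems.append(f"missing argument: {key}")
        elif not isinstance(arguments[key], expected_type):
            problems.append(f"{key} should be {expected_type.__name__}")
    for key in arguments:
        if key not in schema:
            problems.append(f"unexpected argument: {key}")
    return problems


good = validate_call("web_search", {"query": "agent frameworks"})
bad = validate_call("delete_file", {"path": 42})
print(good)
print(bad)
```

Returning the problems as text, rather than raising, lets the agent feed them back to the model as an observation so it can correct the call on the next step.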
When Agents Are (and Aren't) the Right Choice
Use agents when:
- The task requires multiple steps that depend on each other
- You don't know in advance which steps are needed
- The task requires gathering information before acting
- Human review at each step is impractical
Don't use agents when:
- A simple pipeline (fixed sequence of steps) would work
- The task can be done in a single LLM call
- Reliability is critical and errors are costly to reverse
- Latency matters — agents are inherently slow
The golden rule: Start with the simplest solution. A well-prompted single LLM call beats a fragile 10-step agent every time.
The Agent Development Stack
The most popular tools for building agents in Python:
- LangChain — Most popular framework; large ecosystem of tools and integrations
- LangGraph — LangChain extension for multi-agent systems with state management
- AutoGen (Microsoft) — Framework for multi-agent conversations
- CrewAI — High-level framework for role-based multi-agent teams
- OpenAI Assistants API — Managed agents with built-in tools (code interpreter, file search)
- Claude Computer Use — Anthropic's tool that lets Claude operate a computer through screenshots plus mouse and keyboard actions
Next lesson: Tools & Function Calling — giving your agent the ability to actually do things in the world.