What are AI Agents? Beyond Chatbots

AI agents are the next evolution beyond chatbots. A chatbot answers questions. An agent takes actions — it browses the web, writes and runs code, manages files, calls APIs, and iterates until a task is complete. Understanding what agents are (and aren't) is essential before building one.

The Mental Model: LLM + Tools + Loop

An AI agent is built from three components, which it drives in a loop:

┌─────────────────────────────────┐
│            AI AGENT             │
│                                 │
│  1. LLM (the brain)             │
│     - Reasons about the task    │
│     - Decides what to do next   │
│                                 │
│  2. Tools (the hands)           │
│     - Web search                │
│     - Code execution            │
│     - File read/write           │
│     - API calls                 │
│     - Database queries          │
│                                 │
│  3. Memory (the context)        │
│     - Short-term: conversation  │
│     - Long-term: vector store   │
└─────────────────────────────────┘

The key addition over a chatbot is the action loop: the agent can take actions, observe their results, and reason about what to do next — repeating until the goal is achieved.
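
The components and the loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any framework's API: `Agent`, `scripted_llm`, and the `add` tool are hypothetical names, and the "LLM" is a hard-coded function standing in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable


def add(a: float, b: float) -> float:
    """A tool is any plain function the agent is allowed to call."""
    return a + b


def scripted_llm(memory: list[str]) -> dict:
    """Hypothetical stand-in for a model: picks the next step from memory."""
    if not any(item.startswith("observation:") for item in memory):
        return {"tool": "add", "args": {"a": 2, "b": 3}}
    return {"final": memory[-1].removeprefix("observation: ")}


@dataclass
class Agent:
    llm: Callable[[list[str]], dict]                  # 1. the brain
    tools: dict[str, Callable]                        # 2. the hands
    memory: list[str] = field(default_factory=list)   # 3. short-term context

    def run(self, task: str, max_steps: int = 5) -> str:
        self.memory.append(f"task: {task}")
        for _ in range(max_steps):                    # the action loop
            decision = self.llm(self.memory)
            if "final" in decision:
                return decision["final"]
            result = self.tools[decision["tool"]](**decision["args"])
            self.memory.append(f"observation: {result}")
        return "stopped: step limit reached"


agent = Agent(llm=scripted_llm, tools={"add": add})
answer = agent.run("What is 2 + 3?")
print(answer)
```

The `max_steps` cap is a design choice worth keeping even in toy code: without it, a model that never produces a final answer would loop forever.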

The ReAct Loop (Reason + Act)

The most important concept in agent design is the ReAct (Reasoning + Acting) loop:

Task: "Find the 3 most cited ML papers from 2024 and summarize each"

Iteration 1:
  Reason: I need to search for highly cited ML papers from 2024
  Act: [web_search("most cited machine learning papers 2024")]
  Observe: Search results with paper titles and citation counts

Iteration 2:
  Reason: I found some papers. I need to get the abstracts to summarize them.
  Act: [web_fetch("https://arxiv.org/abs/2401.xxxxx")]
  Observe: Paper abstract and key findings

Iteration 3:
  Reason: I have enough information. I'll now write the summary.
  Act: [final_answer("1. Paper X (5,234 citations): ...")]

The agent decides what to do, does it, sees the result, and decides what to do next — just like a human researcher would.
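
The trace above can be reproduced with a small loop that records each Reason, Act, and Observe step. This is a sketch only: the tool names `web_search`, `web_fetch`, and `final_answer` come from the example above, but their bodies are stubs that return canned strings, `scripted_policy` is a hypothetical stand-in for the LLM, and the example.com URL is a placeholder.

```python
from typing import Callable


def web_search(query: str) -> str:
    """Stub: a canned string stands in for a real search call."""
    return f"[stub results for: {query}]"


def web_fetch(url: str) -> str:
    """Stub: a canned string stands in for a real HTTP fetch."""
    return f"[stub page content from: {url}]"


TOOLS: dict[str, Callable[[str], str]] = {
    "web_search": web_search,
    "web_fetch": web_fetch,
}


def scripted_policy(task: str, observations: list[str]) -> tuple[str, str, str]:
    """Hypothetical stand-in for the LLM: returns (reason, action, argument)."""
    if len(observations) == 0:
        return ("I need to search first.", "web_search", task)
    if len(observations) == 1:
        return ("I need details from a result.", "web_fetch", "https://example.com/paper")
    summary = f"Summary based on {len(observations)} observations"
    return ("I have enough information.", "final_answer", summary)


def react(task: str, max_iterations: int = 10) -> tuple[str, list[dict]]:
    observations: list[str] = []
    trace: list[dict] = []
    for i in range(1, max_iterations + 1):
        reason, action, arg = scripted_policy(task, observations)  # Reason
        step = {"iteration": i, "reason": reason, "act": f"{action}({arg!r})"}
        if action == "final_answer":
            trace.append(step)
            return arg, trace
        observation = TOOLS[action](arg)                            # Act
        observations.append(observation)                            # Observe
        step["observe"] = observation
        trace.append(step)
    return "No answer within the iteration limit.", trace


answer, trace = react("most cited machine learning papers 2024")
for step in trace:
    print(step)
```

In a real agent the policy would be a model call that sees the task plus all earlier observations; the loop structure stays the same.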

Chatbot vs Agent: The Key Difference

Aspect            Chatbot                    Agent
----------------  -------------------------  ---------------------------
Actions           Only outputs text          Can take real actions
Tools             None (or only retrieval)   Web search, code exec, APIs
Execution         Single turn                Multi-step loop
Autonomy          None (waits for human)     Can run unsupervised
Memory            Usually none               Short and long-term
Error handling    Can't retry                Can detect errors and retry

A chatbot says "here's how to fix your code." An agent fixes your code directly.

Types of Agents

Single Agent: One LLM with multiple tools. Best for focused tasks.

User: "Analyze the sales data in sales.csv and write a report"
Agent: reads file → analyzes data → writes code to generate charts → writes report

Multi-Agent System: Multiple specialized agents that collaborate.

Orchestrator Agent
├── Research Agent (searches web, reads papers)
├── Analyst Agent (processes data, runs calculations)
├── Writer Agent (produces final document)
└── Reviewer Agent (checks quality, facts)

Multi-agent systems can handle broader tasks, but they are harder to build and debug because a failure can originate in any one agent or in the hand-offs between them.
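
A toy version of the orchestrator pattern shows where those hand-offs happen. Everything here is an assumption made for illustration: each "agent" is reduced to a plain function, and the reviewer uses an arbitrary rule (send every first draft back once) instead of real quality checks.

```python
def research_agent(topic: str) -> str:
    return f"notes on {topic}"


def analyst_agent(notes: str) -> str:
    return f"analysis of ({notes})"


def writer_agent(analysis: str, feedback: str = "") -> str:
    draft = f"draft: {analysis}"
    if feedback:
        draft += " [revised]"
    return draft


def reviewer_agent(draft: str) -> str:
    """Returns feedback, or an empty string when the draft is approved."""
    return "" if "[revised]" in draft else "please tighten the summary"


def orchestrator(topic: str, max_revisions: int = 2) -> dict:
    """Routes work between specialists and loops on reviewer feedback."""
    notes = research_agent(topic)
    analysis = analyst_agent(notes)
    draft = writer_agent(analysis)
    feedback = reviewer_agent(draft)
    revisions = 0
    while feedback and revisions < max_revisions:
        draft = writer_agent(analysis, feedback)
        revisions += 1
        feedback = reviewer_agent(draft)
    return {"draft": draft, "revisions": revisions, "approved": not feedback}


result = orchestrator("agent frameworks")
print(result)
```

Each function boundary is a place where one agent's output becomes another's input, which is why debugging usually starts by logging exactly what crossed each boundary.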

Autonomous Agent: Runs continuously, monitoring and acting without human triggers.

Customer Support Agent:
- Monitors support email inbox
- Classifies issues
- Resolves simple ones automatically
- Escalates complex ones to humans
- Runs 24/7
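
The classify, resolve, and escalate steps listed above might look like this in outline. The keyword-based `classify` function is a deliberately crude stand-in for an LLM classifier, and the issue labels and canned reply are invented for the example.

```python
# Issues the agent may resolve on its own, with a canned reply (hypothetical).
SIMPLE_ISSUES = {"password_reset": "Sent a password reset link."}


def classify(email: str) -> str:
    """Keyword rule standing in for an LLM classifier."""
    text = email.lower()
    if "password" in text:
        return "password_reset"
    if "refund" in text:
        return "refund_request"
    return "unknown"


def handle(email: str) -> dict:
    label = classify(email)
    if label in SIMPLE_ISSUES:
        return {"label": label, "action": "auto_resolved", "reply": SIMPLE_ISSUES[label]}
    # Anything not explicitly allowlisted goes to a human.
    return {"label": label, "action": "escalated", "reply": None}


def process_inbox(inbox: list[str]) -> list[dict]:
    """One pass over the inbox; a deployed agent would repeat this on a schedule."""
    return [handle(email) for email in inbox]


results = process_inbox([
    "I forgot my password",
    "I want a refund for order 17",
    "Hello?",
])
for item in results:
    print(item)
```

Escalating by default, and auto-resolving only an explicit allowlist, keeps an unsupervised agent's mistakes cheap.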

Real Production Examples

GitHub Copilot Workspace: Agent that takes an issue description, understands the codebase, proposes a plan, and writes the full implementation across multiple files.

Devin (Cognition AI): Software engineering agent that takes a task, sets up its own dev environment, writes and tests code, debugs errors, and commits working code.

Perplexity AI: Research agent that breaks down complex questions, searches multiple sources, synthesizes findings, and cites its sources.

OpenAI Operator: Web browser agent that can book flights, fill forms, and complete multi-step web tasks autonomously.

Why Agents Are Hard: The Real Challenges

Building agents is deceptively difficult. Here's what goes wrong:

Hallucination in action loops: An agent that hallucinates tool arguments can take incorrect actions that are hard to reverse. A wrong API call can delete data.

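Cost and latency: Each step in an agent loop costs tokens and time, and because each step typically re-sends the accumulated context, input tokens grow with every iteration. A 20-step task with GPT-4o can cost $0.50+ and take 2+ minutes.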
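
A back-of-the-envelope estimate shows how a figure of that size can arise. All numbers below are placeholder assumptions (token counts per step and per-million-token prices), not quoted prices for any model; check your provider's current pricing before relying on them. The sketch also assumes every step re-sends the whole context so far.

```python
def estimate_cost(
    steps: int,
    base_input_tokens: int,
    tokens_added_per_step: int,
    output_tokens_per_step: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
) -> tuple[int, int, float]:
    """Returns (total input tokens, total output tokens, cost in USD)."""
    # Step k (starting at 0) sends the base prompt plus everything added so far.
    total_input = sum(
        base_input_tokens + k * tokens_added_per_step for k in range(steps)
    )
    total_output = steps * output_tokens_per_step
    cost = (
        total_input * usd_per_million_input
        + total_output * usd_per_million_output
    ) / 1_000_000
    return total_input, total_output, cost


# Placeholder inputs: 20 steps, 2,000-token starting prompt, 1,000 tokens of
# tool output added per step, 300 output tokens per step, and assumed prices.
total_input, total_output, cost = estimate_cost(
    steps=20,
    base_input_tokens=2_000,
    tokens_added_per_step=1_000,
    output_tokens_per_step=300,
    usd_per_million_input=2.50,
    usd_per_million_output=10.00,
)
print(total_input, total_output, cost)
```

With these placeholder inputs the estimate comes to a little over $0.60, and almost all of it is input tokens: 230,000 in versus 6,000 out. Trimming or summarizing old observations is therefore usually the first cost lever to try.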

Error recovery: When a tool returns an unexpected error, agents often spiral into loops or give up. Robust error handling is non-trivial.
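
One common guard against spiraling is a hard cap on attempts, with the errors collected so the model (or a human) can see what went wrong. The helper below is an illustrative sketch; `call_with_retries` and `make_flaky_tool` are hypothetical names, and the flaky tool simulates failures instead of calling anything real.

```python
from typing import Callable


def call_with_retries(tool: Callable, args: dict, max_attempts: int = 3) -> dict:
    """Calls a tool up to max_attempts times, then gives up explicitly."""
    errors: list[str] = []
    for attempt in range(1, max_attempts + 1):
        try:
            result = tool(**args)
            return {"ok": True, "result": result, "attempts": attempt, "errors": errors}
        except Exception as exc:  # broad on purpose: any tool failure counts
            errors.append(f"attempt {attempt}: {exc}")
    return {"ok": False, "result": None, "attempts": max_attempts, "errors": errors}


def make_flaky_tool(failures_before_success: int) -> Callable[[str], str]:
    """Builds a simulated tool that fails a fixed number of times first."""
    calls = {"count": 0}

    def flaky(query: str) -> str:
        calls["count"] += 1
        if calls["count"] <= failures_before_success:
            raise TimeoutError("upstream timed out")
        return f"result for {query}"

    return flaky


recovered = call_with_retries(make_flaky_tool(2), {"query": "status"})
gave_up = call_with_retries(make_flaky_tool(5), {"query": "status"})
print(recovered)
print(gave_up)
```

Returning a structured failure instead of raising lets the agent loop decide what to do next, such as trying a different tool or escalating, without retrying the same call indefinitely.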

Prompt injection: Malicious content in retrieved documents can hijack agent behavior. Web scraping agents are especially vulnerable.

Unreliable tool calls: LLMs sometimes call tools with wrong arguments or in the wrong order.
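
Checking a proposed tool call before executing it catches many hallucinated or malformed arguments. The sketch below uses a hand-rolled schema of expected argument types; `TOOL_SCHEMAS`, `validate_call`, and the tool names are assumptions for illustration, and production systems often use JSON Schema or a validation library for this.

```python
# Expected argument names and types for each tool (hypothetical tools).
TOOL_SCHEMAS: dict[str, dict[str, type]] = {
    "web_search": {"query": str},
    "delete_file": {"path": str},
}


def validate_call(name: str, args: dict) -> list[str]:
    """Returns a list of problems; an empty list means the call looks valid."""
    schema = TOOL_SCHEMAS.get(name)
    if schema is None:
        return [f"unknown tool: {name}"]
    problems: list[str] = []
    for param, expected in schema.items():
        if param not in args:
            problems.append(f"missing argument: {param}")
        elif not isinstance(args[param], expected):
            problems.append(f"{param} should be {expected.__name__}")
    for param in args:
        if param not in schema:
            problems.append(f"unexpected argument: {param}")
    return problems


print(validate_call("web_search", {"query": "agent frameworks"}))
print(validate_call("web_search", {"q": "agent frameworks"}))
print(validate_call("delete_file", {"path": 42}))
print(validate_call("drop_database", {}))
```

Type checks only confirm the shape of a call, not that it is safe. For irreversible actions such as deletion, a confirmation step or a dry-run mode is a reasonable additional safeguard.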

When Agents Are (and Aren't) the Right Choice

Use agents when:

  • The task requires multiple steps that depend on each other
  • You don't know in advance which steps are needed
  • The task requires gathering information before acting
  • Human review at each step is impractical

Don't use agents when:

  • A simple pipeline (fixed sequence of steps) would work
  • The task can be done in a single LLM call
  • Reliability is critical and errors are costly to reverse
  • Latency matters — agents are inherently slow

The golden rule: Start with the simplest solution. A well-prompted single LLM call beats a fragile 10-step agent every time.

The Agent Development Stack

The most popular tools for building agents in Python:

  • LangChain: Widely used framework with a large ecosystem of tools and integrations
  • LangGraph: Library from the LangChain team for building stateful agent workflows as graphs, including multi-agent systems
  • AutoGen (Microsoft): Framework for multi-agent conversations
  • CrewAI: High-level framework for role-based multi-agent teams
  • OpenAI Assistants API: Managed agents with built-in tools (code interpreter, file search)
  • Claude Computer Use: Anthropic's tool that lets Claude operate a computer through screenshots and mouse and keyboard actions

Next lesson: Tools & Function Calling — giving your agent the ability to actually do things in the world.
