Tools, Memory & Planning: The Core Loop | AI Agent Development Course | AiTechWorlds

The Agent Core Loop: Implementing Reasoning, Acting, and Observing

The agent core loop is the engine that drives all AI agents. Understanding and implementing it correctly is the foundation of everything else in this course. This lesson builds the loop from scratch — starting with raw API calls and building up to a clean, reusable implementation.

The Loop in Plain English

An agent runs until one of the following happens:

It determines the task is complete and returns a final answer
It reaches a maximum step limit (preventing infinite loops)
It encounters an unrecoverable error

Each iteration of the loop:

1. The LLM looks at the task and the history, then decides what to do next
2. The agent parses the decision (which tool? what input?)
3. The agent executes the tool call
4. The result is added to the history
5. The loop returns to step 1
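
Before involving a real model, the steps above can be sketched with a scripted stand-in for the LLM. Everything in this block (`run_loop`, `scripted_decide`, the `add` tool, and the shape of the decision dict) is a hypothetical illustration, not part of any library:

```python
from typing import Callable


def run_loop(
    task: str,
    decide: Callable[[list[dict]], dict],
    tools: dict[str, Callable[..., str]],
    max_steps: int = 5,
) -> str:
    """Run the decide / parse / execute / record loop with a pluggable decision function."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        decision = decide(history)  # Step 1: decide what to do next
        if decision["type"] == "final":
            return decision["content"]  # Stop condition: task complete
        tool = tools.get(decision["tool"])  # Step 2: parse the decision
        if tool is None:
            # Stop condition: unrecoverable error
            return f"Unrecoverable error: unknown tool {decision['tool']!r}"
        result = tool(**decision["input"])  # Step 3: execute the tool
        # Step 4: record the observation, then loop back to step 1
        history.append({"role": "tool", "name": decision["tool"], "content": result})
    return "Stopped: step limit reached"  # Stop condition: max steps


def scripted_decide(history: list[dict]) -> dict:
    """Stand-in for the LLM: call `add` once, then answer from the observation."""
    observations = [m for m in history if m["role"] == "tool"]
    if not observations:
        return {"type": "tool", "tool": "add", "input": {"a": 2, "b": 3}}
    return {"type": "final", "content": f"The sum is {observations[-1]['content']}"}


def add(a: int, b: int) -> str:
    return str(a + b)


answer = run_loop("What is 2 + 3?", scripted_decide, {"add": add})
print(answer)  # The sum is 5
```

Swapping `scripted_decide` for a function that calls a model gives the real agent; the three stop conditions stay the same.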

Building the Core Loop

Basic Implementation with OpenAI

from openai import OpenAI
import json

client = OpenAI()

def run_agent(task: str, tools: list[dict], tool_functions: dict, max_steps: int = 10) -> str:
    """
    Run an agent until completion or max_steps.
    
    Args:
        task: The user's goal
        tools: OpenAI tool definitions (function schemas)
        tool_functions: dict mapping tool names to Python callables
        max_steps: Maximum iterations before giving up
    """
    messages = [
        {"role": "system", "content": "You are a helpful assistant with access to tools. Use them to complete the task."},
        {"role": "user", "content": task}
    ]
    
    for step in range(max_steps):
        print(f"\n--- Step {step + 1} ---")
        
        # Ask the LLM: what should I do next?
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools,
            tool_choice="auto"  # LLM decides whether to use a tool
        )
        
        message = response.choices[0].message
        messages.append(message)  # Add LLM response to history
        
        # Check if LLM is done (no tool calls = final answer)
        if not message.tool_calls:
            print(f"Final answer: {message.content}")
            return message.content or ""  # content may be None; keep the declared str return type
        
        # Execute each tool the LLM requested
        for tool_call in message.tool_calls:
            tool_name = tool_call.function.name
            tool_args = json.loads(tool_call.function.arguments)
            
            print(f"Calling tool: {tool_name}({tool_args})")
            
            # Look up and run the tool
            if tool_name in tool_functions:
                result = tool_functions[tool_name](**tool_args)
            else:
                result = f"Error: tool '{tool_name}' not found"
            
            print(f"Tool result: {result}")
            
            # Add tool result to message history
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": str(result)
            })
    
    # Max steps reached
    return "Agent reached maximum steps without completing the task."

Defining Tools

Tools are described to the LLM using a JSON schema. The LLM reads the description to understand what the tool does and what arguments it takes:

tools = [
    {
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web for current information on a topic. Returns a summary of search results.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query"
                    }
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "python_execute",
            "description": "Execute Python code and return the output. Useful for calculations, data processing.",
            "parameters": {
                "type": "object",
                "properties": {
                    "code": {
                        "type": "string",
                        "description": "Python code to execute"
                    }
                },
                "required": ["code"]
            }
        }
    }
]
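
Because the schema and the Python function are written separately, they can drift apart. The sketch below is one possible startup check, assuming the schema layout shown above; `check_tool_schemas` and the demo names are hypothetical helpers, not part of the OpenAI SDK:

```python
import inspect


def check_tool_schemas(tools: list[dict], tool_functions: dict) -> list[str]:
    """Return a list of mismatches between tool schemas and callables (empty means consistent)."""
    problems = []
    for tool in tools:
        spec = tool["function"]
        name = spec["name"]
        func = tool_functions.get(name)
        if func is None:
            problems.append(f"{name}: no Python function registered")
            continue
        declared = set(spec["parameters"].get("properties", {}))
        required = set(spec["parameters"].get("required", []))
        params = inspect.signature(func).parameters
        accepts_kwargs = any(
            p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
        )
        # Every property the LLM may send must be accepted by the function
        if not accepts_kwargs:
            for prop in sorted(declared - set(params)):
                problems.append(f"{name}: schema property '{prop}' is not a function parameter")
        # Every parameter without a default must be marked required in the schema
        for pname, p in params.items():
            if p.kind in (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD):
                continue
            if p.default is inspect.Parameter.empty and pname not in required:
                problems.append(f"{name}: parameter '{pname}' has no default but is not required")
    return problems


# Demo data: a cut-down copy of the web_search schema and a stub function
demo_tools = [{
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]


def demo_web_search(query: str) -> str:
    return f"results for {query}"


problems_ok = check_tool_schemas(demo_tools, {"web_search": demo_web_search})
problems_missing = check_tool_schemas(demo_tools, {})
print(problems_ok, problems_missing)
```

Running a check like this once at startup surfaces a renamed parameter before the model ever tries to call the tool.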

Implementing the Tool Functions

import os
import subprocess
import sys
import tempfile

def web_search(query: str) -> str:
    """Search the web with Tavily; any search API (SerpAPI, etc.) could be swapped in."""
    from tavily import TavilyClient  # pip install tavily-python

    # Named tavily_client so it is not confused with the module-level OpenAI `client`
    tavily_client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
    result = tavily_client.search(query, max_results=3)
    return "\n\n".join(r["content"] for r in result["results"])


def python_execute(code: str) -> str:
    """Run Python code in a separate process with a timeout.

    A subprocess isolates crashes and hangs, but it is not a sandbox:
    the code still runs with the same permissions as the agent.
    """
    with tempfile.NamedTemporaryFile(mode="w", suffix=".py", delete=False) as f:
        f.write(code)
        temp_path = f.name

    try:
        result = subprocess.run(
            [sys.executable, temp_path],  # Same interpreter that runs the agent
            capture_output=True,
            text=True,
            timeout=30,  # Raises TimeoutExpired if the script runs too long
        )
    except subprocess.TimeoutExpired:
        return "Error: execution timed out after 30 seconds"
    finally:
        os.unlink(temp_path)  # Clean up temp file
    return result.stdout if result.returncode == 0 else f"Error: {result.stderr}"


tool_functions = {
    "web_search": web_search,
    "python_execute": python_execute
}

Running the Agent

result = run_agent(
    task="What is the current weather in Tokyo, and how does it compare to the historical average for this time of year?",
    tools=tools,
    tool_functions=tool_functions,
    max_steps=5
)
print(result)

A typical run looks like this (the model chooses the exact tool calls, so they can vary):

1. Call web_search("Tokyo weather today")
2. Call web_search("Tokyo average weather [current month]"), or python_execute to compare the numbers
3. Return a synthesized answer

Handling Errors Gracefully

Agents need robust error handling — tools fail, APIs return unexpected results:

def execute_tool_safely(tool_name: str, tool_args: dict, tool_functions: dict) -> str:
    """Execute a tool and return its output or a structured error."""
    if tool_name not in tool_functions:
        return json.dumps({"error": f"Unknown tool: {tool_name}"})
    
    try:
        result = tool_functions[tool_name](**tool_args)
        return str(result)
    except Exception as e:
        error_msg = f"Tool '{tool_name}' failed: {type(e).__name__}: {str(e)}"
        print(f"Tool error: {error_msg}")
        return json.dumps({"error": error_msg})
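
Tool execution is not the only step that can fail: the arguments arrive as a JSON string written by the model, and `json.loads` raises if that string is malformed. A minimal sketch of a companion helper follows; `parse_tool_arguments` is a hypothetical name, and it reuses the same `{"error": ...}` shape as above so the model sees one consistent error format:

```python
import json
from typing import Optional


def parse_tool_arguments(raw_arguments: str) -> tuple[Optional[dict], Optional[str]]:
    """Return (args, None) on success, or (None, error_json) if the arguments are unusable."""
    try:
        # Treat an empty string as "no arguments"
        args = json.loads(raw_arguments or "{}")
    except json.JSONDecodeError as e:
        return None, json.dumps({"error": f"Invalid JSON arguments: {e.msg}"})
    if not isinstance(args, dict):
        # Tools are called with **kwargs, so anything but an object cannot be used
        return None, json.dumps({"error": "Tool arguments must be a JSON object"})
    return args, None


good_args, good_err = parse_tool_arguments('{"query": "Tokyo weather"}')
bad_args, bad_err = parse_tool_arguments('{"query": ')
print(good_args, bad_err)
```

In the loop, the error string would be sent back as the tool message content in place of a result, so the model can retry with corrected arguments.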

When a tool fails, the LLM sees the error in the next iteration and can decide to try a different approach — if you tell it to:

system_prompt = """You are a helpful assistant with access to tools.

When a tool returns an error:
1. Understand what went wrong from the error message
2. Try an alternative approach if possible
3. If you've tried multiple approaches and can't proceed, explain what you tried and why you can't complete the task.

Don't give up after the first tool failure."""
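
Note that `run_agent` as written earlier hardcodes its system message, so a prompt like this has to be passed in somehow. One way to do that, sketched under the assumption that you add a `system_prompt` parameter to `run_agent`, is a small hypothetical helper that builds the initial history:

```python
DEFAULT_SYSTEM_PROMPT = (
    "You are a helpful assistant with access to tools. Use them to complete the task."
)


def build_initial_messages(task: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> list[dict]:
    """Build the starting message history from a task and a configurable system prompt."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": task},
    ]


messages = build_initial_messages(
    "Summarize today's news about Tokyo.",
    system_prompt="You are a helpful assistant. Don't give up after the first tool failure.",
)
print(messages[0]["content"])
```

Inside `run_agent`, the hardcoded `messages = [...]` list would then become `messages = build_initial_messages(task, system_prompt)`.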

Streaming for Real-Time Feedback

For user-facing agents, stream the output so users see progress:

response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    stream=True
)

for chunk in response:
    # A chunk can arrive with an empty choices list, so guard before indexing
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Streaming requires more complex logic for tool calls (you receive them incrementally), but it dramatically improves perceived responsiveness.
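
To make the incremental part concrete, here is a sketch of merging streamed tool-call fragments. It assumes each fragment exposes `index`, `id`, `function.name`, and `function.arguments`, which is the shape the OpenAI Python SDK uses for `delta.tool_calls` entries. `accumulate_tool_calls` and `fake_delta` are hypothetical names, and the fake objects exist only so the example runs without an API call:

```python
from types import SimpleNamespace


def accumulate_tool_calls(tool_call_deltas) -> list[dict]:
    """Merge streamed tool-call fragments into complete calls, grouped by index."""
    calls: dict[int, dict] = {}
    for delta in tool_call_deltas:
        call = calls.setdefault(delta.index, {"id": None, "name": "", "arguments": ""})
        if delta.id:
            call["id"] = delta.id  # The id normally arrives only in the first fragment
        if delta.function is not None:
            if delta.function.name:
                call["name"] += delta.function.name
            if delta.function.arguments:
                # Arguments arrive as pieces of one JSON string; concatenate them in order
                call["arguments"] += delta.function.arguments
    return [calls[i] for i in sorted(calls)]


def fake_delta(index, id=None, name=None, arguments=None):
    """Build an object shaped like one streamed tool-call fragment (test data only)."""
    return SimpleNamespace(
        index=index, id=id, function=SimpleNamespace(name=name, arguments=arguments)
    )


stream = [
    fake_delta(0, id="call_1", name="web_search", arguments=""),
    fake_delta(0, arguments='{"query": "To'),
    fake_delta(0, arguments='kyo weather"}'),
]
calls = accumulate_tool_calls(stream)
print(calls)
```

With a real stream, the fragments would come from `chunk.choices[0].delta.tool_calls` on each chunk. Parse the accumulated `arguments` with `json.loads` only after the stream ends, because partial strings are not valid JSON.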

Step Limits and Stopping Conditions

Always set a maximum step limit. Without one, a confused agent loops forever and burns money:

MAX_STEPS = 10

# Better: also check if the agent is going in circles
def is_making_progress(messages: list, last_n: int = 3) -> bool:
    """Return False when the last `last_n` tool calls are identical (same tool, same args)."""
    recent_calls = []
    for msg in messages:
        # History mixes plain dicts (system/user/tool) with SDK message objects (assistant)
        for tool_call in getattr(msg, "tool_calls", None) or []:
            recent_calls.append((tool_call.function.name, tool_call.function.arguments))
    recent_calls = recent_calls[-last_n:]
    if len(recent_calls) < last_n:
        return True  # Not enough history to call it a loop
    # A single distinct (name, args) pair across the whole window means we're stuck
    return len(set(recent_calls)) > 1

What the LLM Decides

At each step, the LLM can choose to:

Call a tool (continue working)
Return a final answer (task complete)
Ask for clarification (if the task is ambiguous)

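Your system prompt shapes these decisions. If you want the agent to ask clarifying questions before starting, say so. If you want it to proceed with assumptions and note them in the answer, say that. Keep in mind that the loop in this lesson treats any reply without tool calls as the final answer, so a clarifying question is returned to the caller like any other answer. To continue the conversation, append the user's reply to the history and run the loop again.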

Next lesson: Setting up your agent development environment — tools, APIs, and local setup.