Build a LangChain Agent with Tool Calling (OpenAI Functions)
Master LangChain tool calling with OpenAI function calling. Bind tools, force execution, run parallel calls, and build production agents with structured output.
Get more content like this on Telegram!
Daily AI tips, notes & resources — free
Build a LangChain Agent with Tool Calling (OpenAI Functions)
The first time I saw a properly working tool-calling agent, I genuinely paused. Not because it was magic — the mechanism is straightforward once you understand it — but because watching a language model decide "I need to call this specific function with these exact arguments" and then act on the result felt qualitatively different from a chatbot.
Tool calling is the backbone of every serious LangChain application. It's what separates "an LLM that can answer questions" from "an agent that can actually do things."
This guide covers the full tool-calling stack: defining tools, binding them to models, handling responses, forcing specific tool use, running parallel calls, and building a complete working agent. Every code block runs against gpt-4o-mini.
How Tool Calling Actually Works
The mechanism is simpler than it sounds:
- You describe your tools to the model (name, description, parameter schema)
- The model decides whether to call a tool, and if so, returns JSON describing which one and with what arguments
- You execute the tool with those arguments
- You pass the result back to the model as a
ToolMessage - The model generates a final response using the tool's output
User: "What's the weather in Tokyo?"
↓
LLM output: ToolCall(name="get_weather", args={"city": "Tokyo"})
↓
Your code: result = get_weather("Tokyo") → "18°C, partly cloudy"
↓
LLM output: "It's currently 18°C and partly cloudy in Tokyo."
The model never actually calls your function. It just returns structured JSON describing what it wants to call. Your application code does the actual execution.
Defining Tools with @tool Decorator
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
import requests
@tool
def get_weather(city: str) -> str:
"""Get the current weather for a city. Use this when the user asks about weather."""
# In production, call a real weather API
return f"18°C, partly cloudy in {city}"
@tool
def calculate(expression: str) -> str:
"""Evaluate a mathematical expression. Input should be a valid Python math expression."""
try:
result = eval(expression, {"__builtins__": {}}, {})
return str(result)
except Exception as e:
return f"Error: {e}"
@tool
def search_web(query: str) -> str:
"""Search the web for current information about a topic."""
# Use DuckDuckGo or Tavily in production
return f"Search results for '{query}': [placeholder results]"
# Check what the tool schema looks like
print(get_weather.args_schema.schema())
# {'title': 'get_weather', 'type': 'object', 'properties': {'city': {'title': 'City', 'type': 'string'}}, 'required': ['city']}
The @tool decorator extracts the function name, docstring, and type hints to automatically build the JSON schema the model needs. The docstring is critical — it's what the model reads to decide when to use the tool.
Binding Tools to a Model
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
tools = [get_weather, calculate, search_web]
# Bind tools to the model
llm_with_tools = llm.bind_tools(tools)
# Test it
response = llm_with_tools.invoke("What's 47 * 89?")
print(response.tool_calls)
# [{'name': 'calculate', 'args': {'expression': '47 * 89'}, 'id': 'call_xyz...', 'type': 'tool_call'}]
When the model decides a tool call is needed, response.tool_calls contains the structured call. When it answers directly (no tool needed), response.content has the text and response.tool_calls is empty.
Executing Tool Calls Manually
Before using an agent framework, understanding the raw loop is valuable:
from langchain_core.messages import HumanMessage, ToolMessage
def run_tool_call_loop(user_input: str) -> str:
messages = [HumanMessage(content=user_input)]
# First LLM call — may return tool calls
response = llm_with_tools.invoke(messages)
messages.append(response)
# Keep executing tools until the model gives a final answer
while response.tool_calls:
for tool_call in response.tool_calls:
# Find and execute the right tool
tool_map = {t.name: t for t in tools}
tool_result = tool_map[tool_call["name"]].invoke(tool_call["args"])
# Add the tool result as a ToolMessage
messages.append(
ToolMessage(
content=str(tool_result),
tool_call_id=tool_call["id"]
)
)
# Call the model again with the tool results
response = llm_with_tools.invoke(messages)
messages.append(response)
return response.content
print(run_tool_call_loop("What's 256 * 18, and what's the weather in Paris?"))
Parallel Tool Calls
GPT-4o supports parallel tool calling — the model can request multiple tools in a single response:
response = llm_with_tools.invoke(
"What's the weather in Tokyo AND London, and what's 1000 / 33?"
)
print(f"Number of tool calls: {len(response.tool_calls)}")
# Number of tool calls: 3
# Execute all in parallel (or sequentially — both work)
tool_map = {t.name: t for t in tools}
tool_results = []
for tc in response.tool_calls:
result = tool_map[tc["name"]].invoke(tc["args"])
tool_results.append(
ToolMessage(content=str(result), tool_call_id=tc["id"])
)
# Get final answer with all results
final = llm_with_tools.invoke([
HumanMessage("What's the weather in Tokyo AND London, and what's 1000 / 33?"),
response,
*tool_results
])
print(final.content)
This matters for latency. If two tools are independent (weather in two cities), you can execute them concurrently:
import asyncio
async def execute_tools_parallel(tool_calls, tool_map):
async def run_one(tc):
result = await tool_map[tc["name"]].arun(tc["args"])
return ToolMessage(content=str(result), tool_call_id=tc["id"])
return await asyncio.gather(*[run_one(tc) for tc in tool_calls])
Forcing Tool Use
Sometimes you need the model to always call a tool regardless of the input:
# Force any tool call
llm_forced = llm.bind_tools(tools, tool_choice="required")
# Force a specific tool
llm_weather_only = llm.bind_tools(
tools,
tool_choice={"type": "function", "function": {"name": "get_weather"}}
)
# This will call get_weather even for unrelated questions
response = llm_weather_only.invoke("Tell me a joke")
print(response.tool_calls)
# [{'name': 'get_weather', 'args': {'city': 'somewhere'}, ...}]
Forcing a specific tool is useful for structured data extraction — you define an "extraction schema" as a tool, force the model to call it, and get clean structured output every time.
Structured Output via Tool Calling
The cleanest way to get structured data from an LLM:
from pydantic import BaseModel, Field
from typing import List
class ProductInfo(BaseModel):
name: str = Field(description="Product name")
price: float = Field(description="Price in USD")
features: List[str] = Field(description="Key product features")
rating: float = Field(description="Average customer rating 0-5")
# Use with_structured_output for clean extraction
structured_llm = llm.with_structured_output(ProductInfo)
result = structured_llm.invoke("""
The AirPods Pro 2 costs $249. They feature active noise cancellation,
spatial audio, and USB-C charging. Users rate them 4.7 out of 5.
""")
print(result.name) # "AirPods Pro 2"
print(result.price) # 249.0
print(result.features) # ['active noise cancellation', 'spatial audio', 'USB-C charging']
print(result.rating) # 4.7
with_structured_output uses tool calling internally — it binds a tool with your Pydantic schema and forces the model to call it. You get a typed Python object back.
Building a Full Tool-Calling Agent with LCEL
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnablePassthrough
from langchain.agents.format_scratchpad.openai_tools import format_to_openai_tool_messages
from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser
from langchain.agents import AgentExecutor
# Define the agent chain
prompt = ChatPromptTemplate.from_messages([
("system", "You are a helpful assistant with access to tools. Use them when needed."),
("human", "{input}"),
MessagesPlaceholder(variable_name="agent_scratchpad"),
])
agent = (
{
"input": RunnablePassthrough(),
"agent_scratchpad": lambda x: format_to_openai_tool_messages(x["intermediate_steps"]),
}
| prompt
| llm.bind_tools(tools)
| OpenAIToolsAgentOutputParser()
)
agent_executor = AgentExecutor(
agent=agent,
tools=tools,
verbose=True,
max_iterations=5,
handle_parsing_errors=True
)
# Run it
result = agent_executor.invoke({
"input": "What's 15% of 840, and what's the weather in Berlin?"
})
print(result["output"])
Pydantic-Based Tool Definitions
For more control over validation:
from langchain_core.tools import BaseTool
from pydantic import BaseModel, validator
class WeatherInput(BaseModel):
city: str
units: str = "celsius"
@validator("units")
def validate_units(cls, v):
if v not in ["celsius", "fahrenheit"]:
raise ValueError("units must be 'celsius' or 'fahrenheit'")
return v
class WeatherTool(BaseTool):
name = "get_weather"
description = "Get weather for a city. Specify units as celsius or fahrenheit."
args_schema = WeatherInput
def _run(self, city: str, units: str = "celsius") -> str:
return f"Weather in {city}: 22°{units[0].upper()}"
async def _arun(self, city: str, units: str = "celsius") -> str:
return self._run(city, units)
weather_tool = WeatherTool()
Error Handling in Tool Calls
from langchain_core.tools import ToolException
@tool
def safe_divide(numerator: float, denominator: float) -> float:
"""Divide two numbers safely."""
if denominator == 0:
raise ToolException("Cannot divide by zero.")
return numerator / denominator
# Configure error handling in AgentExecutor
agent_executor = AgentExecutor(
agent=agent,
tools=[safe_divide],
handle_parsing_errors=True, # Recover from output parsing errors
verbose=True
)
When a tool raises ToolException, LangChain catches it, formats the error message, and passes it back to the model as a tool result so the agent can try a different approach.
Comparison: Tool Calling vs Direct Chain
| Approach | When to Use | Latency | Complexity |
|---|---|---|---|
| Direct LLM call | Simple Q&A, no external data needed | Lowest | Minimal |
| Tool calling (single) | One specific action or lookup | Low | Low |
| Tool calling (parallel) | Multiple independent lookups | Low | Medium |
| Agent loop | Multi-step reasoning with unknown steps | Medium | Medium |
| LangGraph agent | Complex state, human-in-loop, branching | Medium-High | High |
For this topic's relationship to building complete agents, see Build AI agent with LangChain which covers LangGraph patterns. The OpenAI API integration guide covers the underlying API if you want lower-level control. And LangChain tutorial 2025 has the foundational concepts if any of this felt unfamiliar.
Conclusion
Tool calling is what makes LangChain agents actually useful. The core loop — bind tools, invoke model, execute tool calls, pass results back, repeat — is simple once you've seen it a few times. The tricky parts are edge cases: what happens when the model hallucinates a tool that doesn't exist, what happens when tool execution throws an exception, what happens when the model keeps looping.
Three things to remember: use handle_parsing_errors=True in AgentExecutor, set max_iterations to prevent infinite loops, and write clear tool docstrings — the model reads those to decide which tool to call.
For deeper exploration of how agents manage state across tool calls, the AI agent memory and planning guide covers the memory patterns that make long-running agents work. The AI research agent build shows a complete production example using tool calling as its core mechanism.
Frequently Asked Questions
What is the difference between tool calling and function calling in LangChain? They're essentially the same thing with different names. OpenAI introduced 'function calling' in 2023 and renamed it 'tool use' in 2024. In LangChain, bind_tools() is the modern API that works with any tool-capable model (OpenAI, Anthropic, Google). The underlying mechanism — the model returning structured JSON describing which tool to call and with what arguments — is identical.
How do I force a LangChain agent to always use a specific tool?
Pass tool_choice='required' to bind_tools() to force any tool call, or tool_choice={'type': 'function', 'function': {'name': 'your_tool_name'}} to force a specific tool. This is useful when you want guaranteed structured output from every LLM call, regardless of whether the model thinks a tool call is needed.
Can LangChain make multiple tool calls in a single LLM request? Yes — with parallel tool calling enabled (which is the default for gpt-4o), the model can return multiple ToolCall objects in a single response. You execute all of them, collect the ToolMessages, and pass everything back in one follow-up call. This can cut latency significantly when multiple independent lookups are needed.
Frequently Asked Questions
AiTechWorlds Team
✓ Verified WriterThe AiTechWorlds team is passionate about AI, technology, and education. We create high-quality, research-backed content to help you learn, grow, and succeed in the modern digital world.
Related Articles
5 AutoGen Agent Roles (Assistant, UserProxy, CodeExecutor)
Understand the 5 core AutoGen agent types — AssistantAgent, UserProxyAgent, CodeExecutorAgent, and more — with code examples and a comparison table for each role.
How to Deploy AutoGen Agents as APIs with FastAPI (2026)
Learn to serve AutoGen multi-agent systems as production REST APIs using FastAPI with async endpoints and real-time streaming responses.
How to Use AutoGen with Azure OpenAI (Enterprise Security)
Connect Microsoft AutoGen to Azure OpenAI for enterprise-grade AI agents. Step-by-step setup with private endpoints, OAI_CONFIG_LIST, and deployment config.
Build a Code Debugging Agent with AutoGen (Auto-Fix PRs)
Build an AutoGen agent that reviews code, analyzes PR diffs, suggests fixes, and automates code quality improvements with a full working implementation.