Creating Custom Tools

Building Custom Tools for LangChain Agents

Custom tools are what make agents actually useful in real applications. They connect your agent to your data, your APIs, and your business logic. This lesson shows you how to build robust, well-documented tools that agents use reliably.

The Tool Contract

A tool has three parts the agent uses:

Name: What the agent calls it (no spaces, snake_case)
Description: How the agent decides when to use it — this is the most important part
Function: What it actually does

The agent never sees your code — only the name, description, and the output it returns.
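
To make the contract concrete, here is a framework-independent sketch, using only the standard library, of the metadata an agent would be shown for a plain Python function. `tool_contract` and `SNAKE_CASE` are hypothetical names invented for illustration, not LangChain APIs; LangChain's own helpers derive similar information for you.

```python
import inspect
import re

# Hypothetical naming rule: lowercase words separated by underscores.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def tool_contract(func) -> dict:
    """Collect what an agent would see: name, description, and argument names."""
    name = func.__name__
    if not SNAKE_CASE.match(name):
        raise ValueError(f"Tool name {name!r} should be snake_case with no spaces")
    description = inspect.getdoc(func)
    if not description:
        raise ValueError(f"Tool {name!r} needs a docstring to serve as its description")
    args = list(inspect.signature(func).parameters)
    return {"name": name, "description": description, "args": args}


def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"{city}: sunny"


contract = tool_contract(get_weather)
print(contract)
```

Nothing from the function body appears in the result, which is why the name and docstring have to carry all the information the agent needs.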

Method 1: `@tool` Decorator (Simplest)

import os

import httpx
from langchain.tools import tool

@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city. Returns temperature, conditions, and humidity.

    Use this when the user asks about current weather conditions anywhere in the world.
    Input should be a city name like 'Tokyo' or 'New York'.
    """
    # Calls the OpenWeatherMap current-weather endpoint; requires WEATHER_API_KEY
    api_key = os.environ.get("WEATHER_API_KEY")
    if not api_key:
        return "Error: WEATHER_API_KEY is not set."
    response = httpx.get(
        "https://api.openweathermap.org/data/2.5/weather",
        params={"q": city, "appid": api_key, "units": "metric"},  # httpx URL-encodes these
        timeout=10.0,
    )
    if response.status_code != 200:
        return f"Error: could not fetch weather for '{city}' (HTTP {response.status_code})."
    data = response.json()

    return f"{city}: {data['weather'][0]['description']}, {data['main']['temp']:.1f}°C, humidity {data['main']['humidity']}%"

# The decorator makes it a Tool object with name="get_weather"
print(get_weather.name)         # "get_weather"
print(get_weather.description)  # The docstring

Method 2: Structured Tool with Input Schema

For tools with multiple inputs or complex types, use Pydantic for validation:

import os

from langchain.tools import StructuredTool
from pydantic import BaseModel, Field
from sqlalchemy import create_engine, text

engine = create_engine(os.environ["DATABASE_URL"])  # create once, reuse across calls

class DatabaseQueryInput(BaseModel):
    table: str = Field(description="The database table to query")
    filter_column: str = Field(description="Column to filter on")
    filter_value: str = Field(description="Value to filter for")
    limit: int = Field(default=10, ge=1, le=100, description="Maximum rows to return (1-100)")

def query_database(table: str, filter_column: str, filter_value: str, limit: int = 10) -> str:
    """Execute a filtered database query and return results."""
    # Table and column names cannot be bound as SQL parameters, so reject anything
    # that is not a plain identifier before interpolating it into the query.
    if not (table.isidentifier() and filter_column.isidentifier()):
        return "Error: table and filter column must be plain identifiers"

    with engine.connect() as conn:
        query = text(f"SELECT * FROM {table} WHERE {filter_column} = :value LIMIT :limit")
        result = conn.execute(query, {"value": filter_value, "limit": limit})
        rows = result.mappings().all()  # dict-like rows (SQLAlchemy 1.4+)

    if not rows:
        return f"No records found in {table} where {filter_column} = {filter_value}"

    # Format as readable output
    return "\n".join(str(dict(row)) for row in rows)

db_tool = StructuredTool.from_function(
    func=query_database,
    name="query_database",
    description=(
        "Query a database table with a filter. "
        "Use this to look up customer records, orders, products, or any other business data. "
        "Always specify the table, filter column, and filter value."
    ),
    args_schema=DatabaseQueryInput
)
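
A stricter option than accepting any table the model names is to check table and column names against an explicit allowlist. The sketch below illustrates that idea with the standard library's `sqlite3` module and an in-memory database so it can run anywhere. `ALLOWED_COLUMNS`, `safe_query`, and the `customers` table are assumptions made up for the example, not part of the lesson's schema.

```python
import sqlite3

# Hypothetical allowlist: table name -> columns the agent may filter on.
ALLOWED_COLUMNS = {
    "customers": {"id", "email", "plan"},
}


def safe_query(conn: sqlite3.Connection, table: str, filter_column: str,
               filter_value: str, limit: int = 10) -> str:
    """Run a filtered query and return text the agent can read."""
    # Identifiers cannot be bound as parameters, so check them against the allowlist.
    if table not in ALLOWED_COLUMNS:
        return f"Error: unknown table '{table}'. Available: {sorted(ALLOWED_COLUMNS)}"
    if filter_column not in ALLOWED_COLUMNS[table]:
        return f"Error: cannot filter '{table}' on '{filter_column}'"
    limit = max(1, min(limit, 100))  # enforce the documented 1-100 range

    # Values are passed as bound parameters, never interpolated.
    sql = f"SELECT * FROM {table} WHERE {filter_column} = ? LIMIT ?"
    cursor = conn.execute(sql, (filter_value, limit))
    columns = [col[0] for col in cursor.description]
    rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
    if not rows:
        return f"No records found in {table} where {filter_column} = {filter_value}"
    return "\n".join(str(row) for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT, plan TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'ada@example.com', 'pro')")

found = safe_query(conn, "customers", "plan", "pro")
rejected = safe_query(conn, "customers; DROP TABLE customers", "plan", "pro")
print(found)
print(rejected)
```

Returning the list of available tables in the error message gives the agent a chance to correct itself on the next call.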

Method 3: BaseTool Class (Most Control)

For tools needing complex initialization or async support:

import os
from typing import Any, Type

from langchain.tools import BaseTool
from pydantic import BaseModel, Field

class CRMSearchInput(BaseModel):
    query: str = Field(description="Search query: company name, email, or contact name")
    search_type: str = Field(default="any", description="'company', 'contact', or 'any'")

class CRMSearchTool(BaseTool):
    # Recent LangChain versions (built on Pydantic v2) require type annotations here
    name: str = "search_crm"
    description: str = (
        "Search the CRM for customer and company information. "
        "Returns contact details, deal history, and company information. "
        "Use when looking up a specific customer or company."
    )
    args_schema: Type[BaseModel] = CRMSearchInput

    # Store state that needs initialization
    api_client: Any = None

    def __init__(self, api_key: str):
        super().__init__()
        import crm_sdk  # hypothetical
        self.api_client = crm_sdk.Client(api_key=api_key)

    def _format_results(self, query: str, results) -> str:
        """Shared formatting for the sync and async paths."""
        if not results:
            return f"No CRM records found for '{query}'"
        # Limit response size to the first five matches
        return "\n".join(
            f"Name: {r.name}, Email: {r.email}, Company: {r.company}" for r in results[:5]
        )

    def _run(self, query: str, search_type: str = "any") -> str:
        """Synchronous execution."""
        results = self.api_client.search(query=query, type=search_type)
        return self._format_results(query, results)

    async def _arun(self, query: str, search_type: str = "any") -> str:
        """Async execution for concurrent tool calls."""
        results = await self.api_client.async_search(query=query, type=search_type)
        return self._format_results(query, results)

crm_tool = CRMSearchTool(api_key=os.environ["CRM_API_KEY"])
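
The async method above assumes the SDK offers an async search call. When a client is synchronous only, one possible approach is to run the blocking call in a worker thread with `asyncio.to_thread` (Python 3.9+). The sketch below uses plain classes rather than `BaseTool` so it runs without LangChain; `FakeCRMClient` and `SearchTool` are hypothetical stand-ins.

```python
import asyncio
import time


class FakeCRMClient:
    """Hypothetical stand-in for a synchronous-only SDK client."""

    def search(self, query: str) -> list:
        time.sleep(0.05)  # simulate network latency
        return [f"record for {query}"]


class SearchTool:
    """Plain class mirroring the _run/_arun pair; not a LangChain BaseTool."""

    def __init__(self, client: FakeCRMClient):
        self.client = client

    def _run(self, query: str) -> str:
        results = self.client.search(query)
        return "\n".join(results) if results else f"No CRM records found for '{query}'"

    async def _arun(self, query: str) -> str:
        # Run the blocking call in a worker thread so the event loop stays free.
        return await asyncio.to_thread(self._run, query)


async def main() -> list:
    tool = SearchTool(FakeCRMClient())
    # Both lookups run concurrently rather than one after the other.
    return await asyncio.gather(tool._arun("Acme"), tool._arun("Globex"))


outputs = asyncio.run(main())
print(outputs)
```

Delegating `_arun` to `_run` keeps the formatting logic in one place, at the cost of occupying a thread per call.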

Tool Output Best Practices

Return useful, structured text:

@tool
def get_order_status(order_id: str) -> str:
    """Look up an order's current status and estimated delivery date.
    Input: order ID (e.g., 'ORD-12345')"""
    
    order = db.get_order(order_id)
    
    if not order:
        return f"Order {order_id} not found. Please verify the order ID."
    
    # Return structured, parseable text
    return (
        f"Order {order_id}:\n"
        f"Status: {order.status}\n"
        f"Items: {', '.join(order.item_names)}\n"
        f"Estimated delivery: {order.estimated_delivery}\n"
        f"Tracking number: {order.tracking_number or 'Not yet assigned'}"
    )

Handle errors explicitly — don't let exceptions propagate:

@tool
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email. Returns a confirmation message or an error message."""
    try:
        email_client.send(to=to, subject=subject, body=body)
        return f"Email sent successfully to {to}"
    except email_client.InvalidEmailError:
        return f"Error: '{to}' is not a valid email address"
    except email_client.RateLimitError:
        return "Error: Email rate limit exceeded. Try again in 1 minute."
    except Exception as e:
        return f"Error sending email: {str(e)}"
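
If many tools need the same safety net, the try/except can be factored into a decorator. This is a minimal sketch using only the standard library; `returns_errors_as_text` and `divide` are hypothetical names, and the error wording is one choice among many. Specific exceptions that deserve a tailored message are still best handled inside the tool itself.

```python
import functools


def returns_errors_as_text(func):
    """Wrap a tool function so exceptions become readable error strings."""

    @functools.wraps(func)  # keep the name and docstring the agent relies on
    def wrapper(*args, **kwargs) -> str:
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            return f"Error in {func.__name__}: {type(exc).__name__}: {exc}"

    return wrapper


@returns_errors_as_text
def divide(a: float, b: float) -> str:
    """Divide a by b and return the result as text."""
    return str(a / b)


ok = divide(6, 3)
failed = divide(1, 0)
print(ok)
print(failed)
```

If you stack a wrapper like this under `@tool`, check that the generated name, description, and argument schema still match the original function.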

Limit response size: The tool output goes into the LLM's context. Don't return 50,000 characters from a database query:

@tool
def search_knowledge_base(query: str) -> str:
    """Search the knowledge base. Returns the 3 most relevant articles."""
    results = kb.search(query, limit=3)
    
    if not results:
        return "No relevant articles found."
    
    # Truncate each result to prevent context overflow
    formatted = []
    for r in results:
        content = (r.content[:500] + "...") if len(r.content) > 500 else r.content
        formatted.append(f"Title: {r.title}\n{content}")
    
    return "\n\n---\n\n".join(formatted)
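
Per-item truncation can be combined with a cap on the total output. The helper below is a sketch of that idea; `truncate_output` and the default of 2000 characters are assumptions for illustration, so pick a budget that suits your model's context window.

```python
def truncate_output(text: str, max_chars: int = 2000) -> str:
    """Cap a tool's total output and tell the agent that content was cut."""
    if len(text) <= max_chars:
        return text
    omitted = len(text) - max_chars
    return f"{text[:max_chars]}\n[truncated: {omitted} more characters omitted]"


short = truncate_output("hello", max_chars=10)
long = truncate_output("x" * 50, max_chars=10)
print(short)
print(long)
```

Stating how much was omitted lets the agent decide whether to narrow its query instead of assuming it saw everything.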

Tool Descriptions That Work

The description determines when and how the agent uses the tool. Write it like documentation for a developer:

description = """
Search for customer information in our database.

Use this tool when you need to:
- Look up a specific customer's account details
- Find contact information for a customer
- Check a customer's subscription status or plan

Input: Customer's name, email, or account ID
Output: Customer profile with contact info, plan, and account status

Do NOT use this tool to search for products, orders, or general information.
Use 'search_product_catalog' for product queries.
"""

Key elements:

What the tool does (one sentence)
When to use it (specific scenarios)
What the input should be
What the output will contain
When NOT to use it (if there's a potential for confusion with other tools)
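
One way to keep descriptions consistent across many tools is to assemble them from these five elements. The helper below is a hypothetical sketch (`build_description` is not a LangChain function) that produces text in the same layout as the example description above.

```python
def build_description(what: str, when: list, input_format: str,
                      output_format: str, avoid: str = "") -> str:
    """Assemble a tool description from the five key elements."""
    lines = [what, "", "Use this tool when you need to:"]
    lines += [f"- {scenario}" for scenario in when]
    lines += ["", f"Input: {input_format}", f"Output: {output_format}"]
    if avoid:
        lines += ["", avoid]
    return "\n".join(lines)


description = build_description(
    what="Search for customer information in our database.",
    when=["Look up a specific customer's account details"],
    input_format="Customer's name, email, or account ID",
    output_format="Customer profile with contact info, plan, and account status",
    avoid="Do NOT use this tool to search for products or orders.",
)
print(description)
```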

Testing Custom Tools

Always test tools before giving them to an agent:

# Direct invocation test
result = my_tool.invoke("test input")
print(result)

# Exercise edge cases with real inputs (substitute values that fit your tool)
test_cases = [
    ("Tokyo", "normal input"),
    ("", "empty input"),
    ("x" * 10_000, "very long input"),
    ("'; DROP TABLE users; --", "invalid input"),
]

for input_val, label in test_cases:
    result = my_tool.invoke(input_val)
    # A tool should always return a non-empty, bounded string and never raise
    assert isinstance(result, str) and result, f"{label}: empty or non-string output"
    assert len(result) < 5_000, f"{label}: output too long (example threshold)"
    print(f"{label}: {input_val[:30]!r} -> {result[:100]!r}")

Tool Collections (Toolkits)

Organize related tools into toolkits:

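def get_customer_service_tools(db_client, email_client, crm_client) -> list:
    """Return the full toolkit for customer service agents."""

    # A named, type-annotated function gives the tool a proper argument schema
    # and returns text rather than a raw order object.
    def get_order_details(order_id: str) -> str:
        order = db_client.get_order(order_id)
        return str(order) if order else f"Order {order_id} not found."

    return [
        get_order_status,
        CRMSearchTool(api_key=os.environ["CRM_API_KEY"]),
        StructuredTool.from_function(
            func=get_order_details,
            name="get_order_details",
            description="Get detailed information about an order including line items and shipping"
        ),
        # ... other tools
    ]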
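
The same factory idea can be sketched without LangChain: build the tool functions inside the factory so they close over the injected client, which makes the toolkit easy to test against a fake. `FakeOrderDB` and `make_order_tools` are hypothetical names, and the order records are made-up sample data.

```python
class FakeOrderDB:
    """Hypothetical in-memory stand-in for a database client."""

    def __init__(self, orders: dict):
        self._orders = orders

    def get_order(self, order_id: str):
        return self._orders.get(order_id)


def make_order_tools(db_client) -> list:
    """Build tool functions that close over the injected client."""

    def get_order_status(order_id: str) -> str:
        """Look up an order's current status. Input: order ID (e.g., 'ORD-12345')."""
        order = db_client.get_order(order_id)
        if order is None:
            return f"Order {order_id} not found. Please verify the order ID."
        return f"Order {order_id}: {order['status']}"

    return [get_order_status]


tools = make_order_tools(FakeOrderDB({"ORD-1": {"status": "shipped"}}))
status_tool = tools[0]
print(status_tool("ORD-1"))
print(status_tool("ORD-9"))
```

Because the client is passed in rather than imported as a global, swapping the fake for a real database client requires no change to the tool functions.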

Next lesson: LangChain memory — giving agents context across conversations.