AI Tips Prompting Python AI Tools Web Dev ChatGPT LLM Agent Dev Reviews Notes Free Books

AiTechWorlds

AI customer service agent handling support conversation — AutoGen multi-turn

Build a Customer Service Agent with AutoGen (Multi-Turn)

⚡ Quick Answer

Build a production-ready customer service agent with AutoGen featuring multi-turn conversations, escalation logic, FAQ tools, and handoff patterns.

AiTechWorlds Team May 31, 2026 11 min read

#AutoGen #customer service #multi-turn #support automation

📚Part of the Autogpt Autogen guide — explore all Autogpt Autogen articles→

Share:Facebook Twitter/X LinkedIn Telegram WhatsApp

📱

Get more content like this on Telegram!

Daily AI tips, notes & resources — free

Join Free →

Support teams at scale face a simple math problem: the volume of incoming tickets grows faster than you can hire agents. AI-powered customer service is not about replacing human agents — it is about making sure humans only handle the conversations that actually need them.

AutoGen is particularly well-suited for this problem because of how it handles multi-agent conversations. You can build a layered system where a first-line AI agent handles common questions, a specialist agent handles product-specific queries, and a human handoff mechanism activates when the AI reaches its limits.

This guide builds that entire system from scratch.

What You Will Build

By the end of this guide, you will have a working customer service agent that:

Answers FAQs using a structured knowledge tool
Maintains context across a multi-turn conversation
Detects when to escalate to a human
Hands off gracefully with a conversation summary

The final code is production-structured, meaning you can extend it directly rather than treating it as a toy example.

Prerequisites

pip install pyautogen openai python-dotenv

Set your API key:

export OPENAI_API_KEY=sk-proj-your-key-here

For the full context on what AutoGen is and how its agent model works, start with AI agents explained.

Setting Up the Knowledge Base

Real customer service agents need access to actual product information. We will represent this as a structured FAQ tool — in production, you would replace this with a vector search over your documentation.

# knowledge_base.py

FAQ_DATA = {
    "refund": {
        "policy": "Refunds are accepted within 30 days of purchase for unused items.",
        "process": "Submit a refund request through your account portal under Orders > Request Refund.",
        "timeline": "Refunds are processed within 5-7 business days."
    },
    "shipping": {
        "standard": "Standard shipping takes 5-7 business days.",
        "express": "Express shipping takes 1-2 business days for an additional $12.99.",
        "international": "International shipping takes 10-14 business days. Customs duties are the customer's responsibility."
    },
    "account": {
        "password_reset": "Go to login page and click 'Forgot Password'. A reset link will be emailed within 2 minutes.",
        "email_change": "Email changes require verification from both the old and new email address. Allow 24 hours.",
        "deletion": "Account deletion requests are processed within 30 days. Contact support@example.com."
    },
    "billing": {
        "payment_methods": "We accept Visa, Mastercard, Amex, PayPal, and Apple Pay.",
        "invoice": "Invoices are emailed automatically after each purchase. You can also download them from Account > Billing.",
        "subscription": "Subscriptions can be cancelled at any time. Access continues until the end of the billing period."
    }
}

def search_faq(topic: str, subtopic: str = None) -> str:
    """Search the FAQ knowledge base."""
    topic = topic.lower()
    
    if topic not in FAQ_DATA:
        available = ", ".join(FAQ_DATA.keys())
        return f"Topic '{topic}' not found. Available topics: {available}"
    
    if subtopic:
        subtopic = subtopic.lower()
        if subtopic in FAQ_DATA[topic]:
            return FAQ_DATA[topic][subtopic]
        else:
            # Return all entries for this topic
            entries = FAQ_DATA[topic]
            return "\n".join([f"{k}: {v}" for k, v in entries.items()])
    
    # Return all entries for the topic
    entries = FAQ_DATA[topic]
    return "\n".join([f"{k}: {v}" for k, v in entries.items()])


def detect_escalation_needed(message: str, turn_count: int) -> dict:
    """Determine if conversation should be escalated to a human."""
    escalation_triggers = [
        "speak to human",
        "talk to agent",
        "real person",
        "supervisor",
        "manager",
        "this is unacceptable",
        "legal action",
        "complaint",
        "frustrated",
        "angry",
        "lawsuit"
    ]
    
    message_lower = message.lower()
    triggered_keywords = [t for t in escalation_triggers if t in message_lower]
    
    # Also escalate after too many turns without resolution
    too_many_turns = turn_count > 8
    
    return {
        "should_escalate": len(triggered_keywords) > 0 or too_many_turns,
        "reason": triggered_keywords[0] if triggered_keywords else ("unresolved after multiple turns" if too_many_turns else None),
        "priority": "high" if triggered_keywords else "normal"
    }

Building the Core Agent

Now let us build the AutoGen agent setup. The key insight here is that AutoGen uses a ConversableAgent with registered tools — the agent decides when to call a tool based on the conversation context.

# customer_service_agent.py

import os
import json
from autogen import ConversableAgent, UserProxyAgent
from knowledge_base import search_faq, detect_escalation_needed

# Configuration
llm_config = {
    "config_list": [
        {
            "model": "gpt-4-turbo",
            "api_key": os.environ.get("OPENAI_API_KEY"),
        }
    ],
    "temperature": 0.3,
    "timeout": 30,
}

SYSTEM_MESSAGE = """You are a helpful customer service agent for Acme Store.

Your responsibilities:
- Answer customer questions about refunds, shipping, accounts, and billing
- Use the search_faq tool to look up accurate information before answering
- Be friendly, concise, and helpful
- If you cannot find relevant information, be honest about it
- If the customer is frustrated, upset, or explicitly requests a human, use check_escalation

Rules:
- Always use search_faq before answering product or policy questions
- Never make up policies or timelines — only use information from the tool
- Keep responses under 150 words unless the customer asks for detail
- Use the customer's name if they provide it
- End each response by asking if the customer has additional questions

You cannot help with: legal disputes, fraud investigations, or custom enterprise pricing.
For those topics, always escalate to a human immediately."""


# Create the customer service agent
service_agent = ConversableAgent(
    name="CustomerServiceAgent",
    system_message=SYSTEM_MESSAGE,
    llm_config=llm_config,
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
)

# Create a user proxy that represents the customer
customer_proxy = UserProxyAgent(
    name="Customer",
    human_input_mode="ALWAYS",
    max_consecutive_auto_reply=0,
    code_execution_config=False,
)


# Register tools with the service agent
@service_agent.register_for_llm(description="Search the FAQ knowledge base for customer questions about refunds, shipping, accounts, or billing.")
def faq_search(topic: str, subtopic: str = "") -> str:
    """Search FAQ knowledge base."""
    return search_faq(topic, subtopic if subtopic else None)


@service_agent.register_for_llm(description="Check if the current conversation should be escalated to a human agent.")
def check_escalation(customer_message: str, turn_number: int) -> str:
    """Check escalation criteria."""
    result = detect_escalation_needed(customer_message, turn_number)
    return json.dumps(result)


# Register tools for execution
@customer_proxy.register_for_execution()
def faq_search(topic: str, subtopic: str = "") -> str:
    return search_faq(topic, subtopic if subtopic else None)


@customer_proxy.register_for_execution()
def check_escalation(customer_message: str, turn_number: int) -> str:
    result = detect_escalation_needed(customer_message, turn_number)
    return json.dumps(result)

Adding Escalation and Handoff Logic

The escalation pattern is where most customer service agents fall short. When the AI cannot help, it needs to do three things: recognize the situation, stop trying, and transfer context cleanly.

# escalation_handler.py

from dataclasses import dataclass
from typing import Optional
import datetime


@dataclass
class EscalationTicket:
    ticket_id: str
    customer_id: Optional[str]
    conversation_summary: str
    escalation_reason: str
    priority: str
    timestamp: str
    conversation_history: list


def create_escalation_ticket(
    conversation_history: list,
    reason: str,
    priority: str = "normal",
    customer_id: str = None
) -> EscalationTicket:
    """Create a structured handoff ticket for human agents."""
    
    # Generate a simple ticket ID
    ticket_id = f"ESC-{datetime.datetime.now().strftime('%Y%m%d%H%M%S')}"
    
    # Summarize the conversation
    summary = summarize_conversation(conversation_history)
    
    return EscalationTicket(
        ticket_id=ticket_id,
        customer_id=customer_id,
        conversation_summary=summary,
        escalation_reason=reason,
        priority=priority,
        timestamp=datetime.datetime.now().isoformat(),
        conversation_history=conversation_history
    )


def summarize_conversation(history: list) -> str:
    """Create a concise summary for the human agent."""
    if not history:
        return "No conversation history available."
    
    # Extract key points from the conversation
    customer_messages = [
        msg["content"] for msg in history 
        if msg.get("role") == "user" or msg.get("name") == "Customer"
    ]
    
    if not customer_messages:
        return "Customer contacted support."
    
    first_message = customer_messages[0][:200]
    turn_count = len(customer_messages)
    last_message = customer_messages[-1][:200] if len(customer_messages) > 1 else ""
    
    summary = f"Customer initiated contact about: {first_message}\n"
    summary += f"Conversation length: {turn_count} customer messages\n"
    if last_message and last_message != first_message:
        summary += f"Most recent customer message: {last_message}\n"
    
    return summary


def format_handoff_message(ticket: EscalationTicket) -> str:
    """Format the message shown to the customer during handoff."""
    return f"""I understand this situation needs more attention than I can provide.

I've created a priority support ticket (ID: {ticket.ticket_id}) and a human agent will follow up shortly.

Here is what I've shared with the team:
- Summary: {ticket.conversation_summary[:200]}
- Priority: {ticket.priority.upper()}

A human agent will contact you within:
- High priority: 15 minutes
- Normal priority: 2-4 hours

Is there anything else I can note for the agent before they reach out?"""

The Multi-Turn Conversation Loop

Now let us wire everything together into a conversation manager that handles the full flow:

# conversation_manager.py

import os
from autogen import ConversableAgent, UserProxyAgent, GroupChat, GroupChatManager
from customer_service_agent import service_agent, customer_proxy
from escalation_handler import create_escalation_ticket, format_handoff_message


class CustomerServiceSession:
    def __init__(self, customer_id: str = None):
        self.customer_id = customer_id
        self.turn_count = 0
        self.escalated = False
        self.conversation_history = []
    
    def run_conversation(self, initial_message: str):
        """Start and manage a customer service conversation."""
        print(f"\n{'='*60}")
        print("Customer Service Session Started")
        print(f"{'='*60}\n")
        
        # Start the conversation
        result = customer_proxy.initiate_chat(
            service_agent,
            message=initial_message,
            max_turns=15,
        )
        
        # Store history
        self.conversation_history = result.chat_history
        return result
    
    def check_and_handle_escalation(self, message: str) -> bool:
        """Check if escalation is needed and handle it."""
        from knowledge_base import detect_escalation_needed
        
        escalation_check = detect_escalation_needed(message, self.turn_count)
        
        if escalation_check["should_escalate"]:
            ticket = create_escalation_ticket(
                conversation_history=self.conversation_history,
                reason=escalation_check["reason"],
                priority=escalation_check["priority"],
                customer_id=self.customer_id
            )
            
            handoff_message = format_handoff_message(ticket)
            print(f"\nAgent: {handoff_message}")
            
            # In production, you would:
            # - Create a ticket in your helpdesk (Zendesk, Freshdesk, etc.)
            # - Notify the human agent queue
            # - Store conversation in your CRM
            print(f"\n[SYSTEM: Escalation ticket {ticket.ticket_id} created]")
            
            self.escalated = True
            return True
        
        return False


# Specialist agent for billing issues
billing_specialist = ConversableAgent(
    name="BillingSpecialist",
    system_message="""You are a billing specialist with access to account information.
    You handle complex billing disputes, payment failures, and subscription issues.
    You have authority to issue credits up to $50 without manager approval.
    Always verify the customer's identity before discussing account details.""",
    llm_config={
        "config_list": [{"model": "gpt-4-turbo", "api_key": os.environ.get("OPENAI_API_KEY")}],
        "temperature": 0.2,
    },
    human_input_mode="NEVER",
)


def run_multi_agent_session(customer_message: str):
    """Run a session with automatic routing between agents."""
    
    # Determine initial routing based on message content
    billing_keywords = ["charge", "payment", "invoice", "subscription", "refund denied", "dispute"]
    
    is_billing = any(kw in customer_message.lower() for kw in billing_keywords)
    
    if is_billing:
        print("[Routing to billing specialist]")
        initial_agent = billing_specialist
    else:
        print("[Routing to general support]")
        initial_agent = service_agent
    
    session = CustomerServiceSession()
    result = customer_proxy.initiate_chat(
        initial_agent,
        message=customer_message,
        max_turns=12,
    )
    
    return result


if __name__ == "__main__":
    print("Customer Service Agent (AutoGen)")
    print("Type 'quit' to exit\n")
    
    initial_message = input("How can we help you today? ")
    
    if initial_message.lower() != "quit":
        run_multi_agent_session(initial_message)

Testing the Agent

Let us walk through what a real interaction looks like:

# test_agent.py

from conversation_manager import run_multi_agent_session

# Test 1: Simple FAQ question
result = run_multi_agent_session("What is your refund policy?")

# Test 2: Multi-turn with escalation trigger
result = run_multi_agent_session(
    "I need to return an item I bought 45 days ago and your system won't let me"
)

# Test 3: Angry customer requiring human
result = run_multi_agent_session(
    "This is completely unacceptable. I want to speak to a manager immediately."
)

Expected flow for Test 2:

Agent calls search_faq("refund", "policy")
Returns the 30-day policy
Agent explains the policy limitation
Customer pushes back
Agent calls check_escalation
Escalation ticket created
Handoff message delivered

Performance and Cost Considerations

Customer service at scale means understanding the economics:

Configuration	Cost per 10-turn conversation	Quality
GPT-4 Turbo, 0 tools	~$0.18	High
GPT-4 Turbo, with tools	~$0.22	Higher (accurate)
GPT-3.5 Turbo, with tools	~$0.03	Good for simple queries
GPT-4o mini, with tools	~$0.01	Good baseline

For high-volume deployments, run GPT-4o mini as the first-line agent and only escalate to GPT-4 Turbo when the mini agent flags a complex issue.

For building memory across sessions (recognizing returning customers), see AI agent memory and planning.

What to Build Next

This is a functional foundation. Production extensions worth adding:

Session persistence: Store conversation history in Redis so customers can resume across page loads
Sentiment tracking: Log sentiment scores per turn to identify struggling conversations early
Handoff to live chat: Integrate with Intercom, Zendesk, or Freshdesk via their APIs
Analytics dashboard: Track resolution rates, escalation reasons, and average turns to resolution

If you want a deeper look at how multi-agent patterns work in a research context, the AI research agent build guide shows similar orchestration patterns applied to knowledge retrieval.

For the CrewAI approach to multi-agent customer service, CrewAI tutorial covers how role-based agents handle specialization differently.

Frequently Asked Questions

How does AutoGen handle conversation state across multiple turns? AutoGen maintains conversation history in each agent's message list. The ConversableAgent class stores all prior messages and includes them in subsequent LLM calls, enabling context-aware multi-turn dialogue without manual state management.

Can I connect an AutoGen customer service agent to a live chat system? Yes. AutoGen agents expose a simple Python interface. You can wrap any AutoGen agent in a FastAPI or WebSocket server and route live chat messages through it. The agent's reply method accepts message strings and returns response strings.

How do I prevent the agent from answering questions outside its scope? Define a strict system message that limits the agent's domain, and implement a topic classifier as a tool. If the classifier returns out-of-scope, the agent can route to escalation rather than attempting to answer.

Share this article:Facebook Twitter/X LinkedIn Telegram WhatsApp

💬 DiscussionPowered by GitHub Discussions

Frequently Asked Questions

AutoGen maintains conversation history in each agent's message list. The ConversableAgent class stores all prior messages and includes them in subsequent LLM calls, enabling context-aware multi-turn dialogue without manual state management.

AiTechWorlds Team

✓ Verified Writer

The AiTechWorlds team is passionate about AI, technology, and education. We create high-quality, research-backed content to help you learn, grow, and succeed in the modern digital world.

📱 Follow on Telegram 🐦 Follow on X Learn More →

AI agent role assignment diagram — AutoGen agent types roles

Agent Development

5 AutoGen Agent Roles (Assistant, UserProxy, CodeExecutor)

Understand the 5 core AutoGen agent types — AssistantAgent, UserProxyAgent, CodeExecutorAgent, and more — with code examples and a comparison table for each role.

May 31, 2026 11 min read

AutoGen agent served as REST API endpoint — FastAPI deployment

Agent Development

How to Deploy AutoGen Agents as APIs with FastAPI (2026)

Learn to serve AutoGen multi-agent systems as production REST APIs using FastAPI with async endpoints and real-time streaming responses.

May 31, 2026 10 min read

Azure OpenAI enterprise integration with AutoGen — managed private instances

Agent Development

How to Use AutoGen with Azure OpenAI (Enterprise Security)

Connect Microsoft AutoGen to Azure OpenAI for enterprise-grade AI agents. Step-by-step setup with private endpoints, OAI_CONFIG_LIST, and deployment config.

May 31, 2026 10 min read

AI agent automatically fixing code bugs — AutoGen code debugging auto-fix

Agent Development

Build a Code Debugging Agent with AutoGen (Auto-Fix PRs)

Build an AutoGen agent that reviews code, analyzes PR diffs, suggests fixes, and automates code quality improvements with a full working implementation.

May 31, 2026 11 min read

10K+ Members Growing Daily

Get Free AI Notes Daily

Join AiTechWorlds on Telegram and get daily AI tips, prompt engineering templates, coding resources, and exclusive content — 100% free!

📚 Free Study Notes🤖 AI Tips Daily⚡ Prompt Templates💻 Coding Resources

Join Free Channel

No spam. Leave anytime.

Autogpt Autogen

Build a Customer Service Agent with AutoGen (Multi-Turn)

⚡ Quick Answer

Build a production-ready customer service agent with AutoGen featuring multi-turn conversations, escalation logic, FAQ tools, and handoff patterns.

AiTechWorlds Team May 31, 2026 11 min read

#AutoGen #customer service #multi-turn #support automation

📚Part of the Autogpt Autogen guide — explore all Autogpt Autogen articles→

Share:Facebook Twitter/X LinkedIn Telegram WhatsApp

📱

Get more content like this on Telegram!

Daily AI tips, notes & resources — free

Join Free →

This guide builds that entire system from scratch.

What You Will Build

By the end of this guide, you will have a working customer service agent that:

Answers FAQs using a structured knowledge tool
Maintains context across a multi-turn conversation
Detects when to escalate to a human
Hands off gracefully with a conversation summary

The final code is production-structured, meaning you can extend it directly rather than treating it as a toy example.

Prerequisites

pip install pyautogen openai python-dotenv

Set your API key:

export OPENAI_API_KEY=sk-proj-your-key-here

For the full context on what AutoGen is and how its agent model works, start with AI agents explained.

Setting Up the Knowledge Base

# knowledge_base.py

FAQ_DATA = {
    "refund": {
        "policy": "Refunds are accepted within 30 days of purchase for unused items.",
        "process": "Submit a refund request through your account portal under Orders > Request Refund.",
        "timeline": "Refunds are processed within 5-7 business days."
    },
    "shipping": {
        "standard": "Standard shipping takes 5-7 business days.",
        "express": "Express shipping takes 1-2 business days for an additional $12.99.",
        "international": "International shipping takes 10-14 business days. Customs duties are the customer's responsibility."
    },
    "account": {
        "password_reset": "Go to login page and click 'Forgot Password'. A reset link will be emailed within 2 minutes.",
        "email_change": "Email changes require verification from both the old and new email address. Allow 24 hours.",
        "deletion": "Account deletion requests are processed within 30 days. Contact support@example.com."
    },
    "billing": {
        "payment_methods": "We accept Visa, Mastercard, Amex, PayPal, and Apple Pay.",
        "invoice": "Invoices are emailed automatically after each purchase. You can also download them from Account > Billing.",
        "subscription": "Subscriptions can be cancelled at any time. Access continues until the end of the billing period."
    }
}

def search_faq(topic: str, subtopic: str = None) -> str:
    """Search the FAQ knowledge base."""
    topic = topic.lower()
    
    if topic not in FAQ_DATA:
        available = ", ".join(FAQ_DATA.keys())
        return f"Topic '{topic}' not found. Available topics: {available}"
    
    if subtopic:
        subtopic = subtopic.lower()
        if subtopic in FAQ_DATA[topic]:
            return FAQ_DATA[topic][subtopic]
        else:
            # Return all entries for this topic
            entries = FAQ_DATA[topic]
            return "\n".join([f"{k}: {v}" for k, v in entries.items()])
    
    # Return all entries for the topic
    entries = FAQ_DATA[topic]
    return "\n".join([f"{k}: {v}" for k, v in entries.items()])


def detect_escalation_needed(message: str, turn_count: int) -> dict:
    """Determine if conversation should be escalated to a human."""
    escalation_triggers = [
        "speak to human",
        "talk to agent",
        "real person",
        "supervisor",
        "manager",
        "this is unacceptable",
        "legal action",
        "complaint",
        "frustrated",
        "angry",
        "lawsuit"
    ]
    
    message_lower = message.lower()
    triggered_keywords = [t for t in escalation_triggers if t in message_lower]
    
    # Also escalate after too many turns without resolution
    too_many_turns = turn_count > 8
    
    return {
        "should_escalate": len(triggered_keywords) > 0 or too_many_turns,
        "reason": triggered_keywords[0] if triggered_keywords else ("unresolved after multiple turns" if too_many_turns else None),
        "priority": "high" if triggered_keywords else "normal"
    }

Building the Core Agent

# customer_service_agent.py

import os
import json
from autogen import ConversableAgent, UserProxyAgent
from knowledge_base import search_faq, detect_escalation_needed

# Configuration
llm_config = {
    "config_list": [
        {
            "model": "gpt-4-turbo",
            "api_key": os.environ.get("OPENAI_API_KEY"),
        }
    ],
    "temperature": 0.3,
    "timeout": 30,
}

SYSTEM_MESSAGE = """You are a helpful customer service agent for Acme Store.

Your responsibilities:
- Answer customer questions about refunds, shipping, accounts, and billing
- Use the search_faq tool to look up accurate information before answering
- Be friendly, concise, and helpful
- If you cannot find relevant information, be honest about it
- If the customer is frustrated, upset, or explicitly requests a human, use check_escalation

Rules:
- Always use search_faq before answering product or policy questions
- Never make up policies or timelines — only use information from the tool
- Keep responses under 150 words unless the customer asks for detail
- Use the customer's name if they provide it
- End each response by asking if the customer has additional questions

You cannot help with: legal disputes, fraud investigations, or custom enterprise pricing.
For those topics, always escalate to a human immediately."""


# Create the customer service agent
service_agent = ConversableAgent(
    name="CustomerServiceAgent",
    system_message=SYSTEM_MESSAGE,
    llm_config=llm_config,
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
)

# Create a user proxy that represents the customer
customer_proxy = UserProxyAgent(
    name="Customer",
    human_input_mode="ALWAYS",
    max_consecutive_auto_reply=0,
    code_execution_config=False,
)


# Register tools with the service agent
@service_agent.register_for_llm(description="Search the FAQ knowledge base for customer questions about refunds, shipping, accounts, or billing.")
def faq_search(topic: str, subtopic: str = "") -> str:
    """Search FAQ knowledge base."""
    return search_faq(topic, subtopic if subtopic else None)


@service_agent.register_for_llm(description="Check if the current conversation should be escalated to a human agent.")
def check_escalation(customer_message: str, turn_number: int) -> str:
    """Check escalation criteria."""
    result = detect_escalation_needed(customer_message, turn_number)
    return json.dumps(result)


# Register tools for execution
@customer_proxy.register_for_execution()
def faq_search(topic: str, subtopic: str = "") -> str:
    return search_faq(topic, subtopic if subtopic else None)


@customer_proxy.register_for_execution()
def check_escalation(customer_message: str, turn_number: int) -> str:
    result = detect_escalation_needed(customer_message, turn_number)
    return json.dumps(result)

Adding Escalation and Handoff Logic

The escalation pattern is where most customer service agents fall short. When the AI cannot help, it needs to do three things: recognize the situation, stop trying, and transfer context cleanly.

# escalation_handler.py

from dataclasses import dataclass
from typing import Optional
import datetime


@dataclass
class EscalationTicket:
    ticket_id: str
    customer_id: Optional[str]
    conversation_summary: str
    escalation_reason: str
    priority: str
    timestamp: str
    conversation_history: list


def create_escalation_ticket(
    conversation_history: list,
    reason: str,
    priority: str = "normal",
    customer_id: str = None
) -> EscalationTicket:
    """Create a structured handoff ticket for human agents."""
    
    # Generate a simple ticket ID
    ticket_id = f"ESC-{datetime.datetime.now().strftime('%Y%m%d%H%M%S')}"
    
    # Summarize the conversation
    summary = summarize_conversation(conversation_history)
    
    return EscalationTicket(
        ticket_id=ticket_id,
        customer_id=customer_id,
        conversation_summary=summary,
        escalation_reason=reason,
        priority=priority,
        timestamp=datetime.datetime.now().isoformat(),
        conversation_history=conversation_history
    )


def summarize_conversation(history: list) -> str:
    """Create a concise summary for the human agent."""
    if not history:
        return "No conversation history available."
    
    # Extract key points from the conversation
    customer_messages = [
        msg["content"] for msg in history 
        if msg.get("role") == "user" or msg.get("name") == "Customer"
    ]
    
    if not customer_messages:
        return "Customer contacted support."
    
    first_message = customer_messages[0][:200]
    turn_count = len(customer_messages)
    last_message = customer_messages[-1][:200] if len(customer_messages) > 1 else ""
    
    summary = f"Customer initiated contact about: {first_message}\n"
    summary += f"Conversation length: {turn_count} customer messages\n"
    if last_message and last_message != first_message:
        summary += f"Most recent customer message: {last_message}\n"
    
    return summary


def format_handoff_message(ticket: EscalationTicket) -> str:
    """Format the message shown to the customer during handoff."""
    return f"""I understand this situation needs more attention than I can provide.

I've created a priority support ticket (ID: {ticket.ticket_id}) and a human agent will follow up shortly.

Here is what I've shared with the team:
- Summary: {ticket.conversation_summary[:200]}
- Priority: {ticket.priority.upper()}

A human agent will contact you within:
- High priority: 15 minutes
- Normal priority: 2-4 hours

Is there anything else I can note for the agent before they reach out?"""

The Multi-Turn Conversation Loop

Now let us wire everything together into a conversation manager that handles the full flow:

# conversation_manager.py

import os
from autogen import ConversableAgent, UserProxyAgent, GroupChat, GroupChatManager
from customer_service_agent import service_agent, customer_proxy
from escalation_handler import create_escalation_ticket, format_handoff_message


class CustomerServiceSession:
    def __init__(self, customer_id: str = None):
        self.customer_id = customer_id
        self.turn_count = 0
        self.escalated = False
        self.conversation_history = []
    
    def run_conversation(self, initial_message: str):
        """Start and manage a customer service conversation."""
        print(f"\n{'='*60}")
        print("Customer Service Session Started")
        print(f"{'='*60}\n")
        
        # Start the conversation
        result = customer_proxy.initiate_chat(
            service_agent,
            message=initial_message,
            max_turns=15,
        )
        
        # Store history
        self.conversation_history = result.chat_history
        return result
    
    def check_and_handle_escalation(self, message: str) -> bool:
        """Check if escalation is needed and handle it."""
        from knowledge_base import detect_escalation_needed
        
        escalation_check = detect_escalation_needed(message, self.turn_count)
        
        if escalation_check["should_escalate"]:
            ticket = create_escalation_ticket(
                conversation_history=self.conversation_history,
                reason=escalation_check["reason"],
                priority=escalation_check["priority"],
                customer_id=self.customer_id
            )
            
            handoff_message = format_handoff_message(ticket)
            print(f"\nAgent: {handoff_message}")
            
            # In production, you would:
            # - Create a ticket in your helpdesk (Zendesk, Freshdesk, etc.)
            # - Notify the human agent queue
            # - Store conversation in your CRM
            print(f"\n[SYSTEM: Escalation ticket {ticket.ticket_id} created]")
            
            self.escalated = True
            return True
        
        return False


# Specialist agent for billing issues
billing_specialist = ConversableAgent(
    name="BillingSpecialist",
    system_message="""You are a billing specialist with access to account information.
    You handle complex billing disputes, payment failures, and subscription issues.
    You have authority to issue credits up to $50 without manager approval.
    Always verify the customer's identity before discussing account details.""",
    llm_config={
        "config_list": [{"model": "gpt-4-turbo", "api_key": os.environ.get("OPENAI_API_KEY")}],
        "temperature": 0.2,
    },
    human_input_mode="NEVER",
)


def run_multi_agent_session(customer_message: str):
    """Run a session with automatic routing between agents."""
    
    # Determine initial routing based on message content
    billing_keywords = ["charge", "payment", "invoice", "subscription", "refund denied", "dispute"]
    
    is_billing = any(kw in customer_message.lower() for kw in billing_keywords)
    
    if is_billing:
        print("[Routing to billing specialist]")
        initial_agent = billing_specialist
    else:
        print("[Routing to general support]")
        initial_agent = service_agent
    
    session = CustomerServiceSession()
    result = customer_proxy.initiate_chat(
        initial_agent,
        message=customer_message,
        max_turns=12,
    )
    
    return result


if __name__ == "__main__":
    print("Customer Service Agent (AutoGen)")
    print("Type 'quit' to exit\n")
    
    initial_message = input("How can we help you today? ")
    
    if initial_message.lower() != "quit":
        run_multi_agent_session(initial_message)

Testing the Agent

Let us walk through what a real interaction looks like:

# test_agent.py

from conversation_manager import run_multi_agent_session

# Test 1: Simple FAQ question
result = run_multi_agent_session("What is your refund policy?")

# Test 2: Multi-turn with escalation trigger
result = run_multi_agent_session(
    "I need to return an item I bought 45 days ago and your system won't let me"
)

# Test 3: Angry customer requiring human
result = run_multi_agent_session(
    "This is completely unacceptable. I want to speak to a manager immediately."
)

Expected flow for Test 2:

Agent calls search_faq("refund", "policy")
Returns the 30-day policy
Agent explains the policy limitation
Customer pushes back
Agent calls check_escalation
Escalation ticket created
Handoff message delivered

Performance and Cost Considerations

Customer service at scale means understanding the economics:

Configuration	Cost per 10-turn conversation	Quality
GPT-4 Turbo, 0 tools	~$0.18	High
GPT-4 Turbo, with tools	~$0.22	Higher (accurate)
GPT-3.5 Turbo, with tools	~$0.03	Good for simple queries
GPT-4o mini, with tools	~$0.01	Good baseline

For high-volume deployments, run GPT-4o mini as the first-line agent and only escalate to GPT-4 Turbo when the mini agent flags a complex issue.

For building memory across sessions (recognizing returning customers), see AI agent memory and planning.

What to Build Next

This is a functional foundation. Production extensions worth adding:

Session persistence: Store conversation history in Redis so customers can resume across page loads
Sentiment tracking: Log sentiment scores per turn to identify struggling conversations early
Handoff to live chat: Integrate with Intercom, Zendesk, or Freshdesk via their APIs
Analytics dashboard: Track resolution rates, escalation reasons, and average turns to resolution

If you want a deeper look at how multi-agent patterns work in a research context, the AI research agent build guide shows similar orchestration patterns applied to knowledge retrieval.

For the CrewAI approach to multi-agent customer service, CrewAI tutorial covers how role-based agents handle specialization differently.

Frequently Asked Questions

Share this article:Facebook Twitter/X LinkedIn Telegram WhatsApp

💬 DiscussionPowered by GitHub Discussions

Frequently Asked Questions

AiTechWorlds Team

✓ Verified Writer

The AiTechWorlds team is passionate about AI, technology, and education. We create high-quality, research-backed content to help you learn, grow, and succeed in the modern digital world.

📱 Follow on Telegram 🐦 Follow on X Learn More →

Agent Development

5 AutoGen Agent Roles (Assistant, UserProxy, CodeExecutor)

Understand the 5 core AutoGen agent types — AssistantAgent, UserProxyAgent, CodeExecutorAgent, and more — with code examples and a comparison table for each role.

May 31, 2026 11 min read

Agent Development

How to Deploy AutoGen Agents as APIs with FastAPI (2026)

Learn to serve AutoGen multi-agent systems as production REST APIs using FastAPI with async endpoints and real-time streaming responses.

May 31, 2026 10 min read

Agent Development

How to Use AutoGen with Azure OpenAI (Enterprise Security)

Connect Microsoft AutoGen to Azure OpenAI for enterprise-grade AI agents. Step-by-step setup with private endpoints, OAI_CONFIG_LIST, and deployment config.

May 31, 2026 10 min read

Agent Development

Build a Code Debugging Agent with AutoGen (Auto-Fix PRs)

Build an AutoGen agent that reviews code, analyzes PR diffs, suggests fixes, and automates code quality improvements with a full working implementation.

May 31, 2026 11 min read

10K+ Members Growing Daily

Get Free AI Notes Daily

Join AiTechWorlds on Telegram and get daily AI tips, prompt engineering templates, coding resources, and exclusive content — 100% free!

📚 Free Study Notes🤖 AI Tips Daily⚡ Prompt Templates💻 Coding Resources

Join Free Channel

No spam. Leave anytime.

Build a Customer Service Agent with AutoGen (Multi-Turn)

What You Will Build

Prerequisites

Setting Up the Knowledge Base

Building the Core Agent

Adding Escalation and Handoff Logic

The Multi-Turn Conversation Loop

Testing the Agent

Performance and Cost Considerations

What to Build Next

Frequently Asked Questions

💬 DiscussionPowered by GitHub Discussions

Frequently Asked Questions

AiTechWorlds Team

Related Articles

5 AutoGen Agent Roles (Assistant, UserProxy, CodeExecutor)

How to Deploy AutoGen Agents as APIs with FastAPI (2026)

How to Use AutoGen with Azure OpenAI (Enterprise Security)

Build a Code Debugging Agent with AutoGen (Auto-Fix PRs)

Get Free AI Notes Daily

Build a Customer Service Agent with AutoGen (Multi-Turn)

What You Will Build

Prerequisites

Setting Up the Knowledge Base

Building the Core Agent

Adding Escalation and Handoff Logic

The Multi-Turn Conversation Loop

Testing the Agent

Performance and Cost Considerations

What to Build Next

Frequently Asked Questions

💬 DiscussionPowered by GitHub Discussions

Frequently Asked Questions

AiTechWorlds Team

Related Articles

5 AutoGen Agent Roles (Assistant, UserProxy, CodeExecutor)

How to Deploy AutoGen Agents as APIs with FastAPI (2026)

How to Use AutoGen with Azure OpenAI (Enterprise Security)

Build a Code Debugging Agent with AutoGen (Auto-Fix PRs)

Get Free AI Notes Daily