5 AutoGen Conversational Patterns (One-Shot, Multi-Turn, Hierarchical)
Master AutoGen's 5 core agent interaction models — from one-shot requests to hierarchical orchestration — with full code examples and use case comparisons.
Get more content like this on Telegram!
Daily AI tips, notes & resources — free
When most developers first encounter AutoGen, they set up two agents, have them chat, and wonder why the conversation spirals or loops unnecessarily. The missing piece is understanding that AutoGen supports multiple distinct conversational patterns — and choosing the wrong one for your use case is the most common reason AutoGen projects underperform.
This guide covers five patterns in depth: One-Shot, Multi-Turn, Hierarchical, Round-Robin, and Selector. Each section includes complete working code, the specific use cases where the pattern shines, and the failure modes to watch for.
For broader context on agent architectures, see AI agents explained and Build AI agent with LangChain for comparison with LangChain's approach.
Setup
All examples use pyautogen (install with pip install pyautogen). Configuration:
import autogen
config_list = [
{
"model": "gpt-4o",
"api_key": "YOUR_OPENAI_API_KEY"
}
]
llm_config = {
"config_list": config_list,
"temperature": 0,
"timeout": 120
}
Pattern 1: One-Shot (Request-Response)
Best for: Single, well-defined tasks that don't require clarification or iteration.
The One-Shot pattern is the simplest: a user agent sends one message, an assistant agent responds once, and the conversation terminates. No back-and-forth, no loops.
# pattern_1_oneshot.py
import autogen
llm_config = {
"config_list": [{"model": "gpt-4o", "api_key": "YOUR_KEY"}],
"temperature": 0
}
# Assistant agent — the worker
assistant = autogen.AssistantAgent(
name="CodeWriter",
llm_config=llm_config,
system_message="""You are a Python expert.
When asked to write code, return only the code block.
Do not explain unless asked."""
)
# User proxy — the requester (human_input_mode=NEVER means fully autonomous)
user = autogen.UserProxyAgent(
name="User",
human_input_mode="NEVER",
max_consecutive_auto_reply=1, # KEY: limits to one exchange
code_execution_config={"work_dir": "output", "use_docker": False}
)
# Initiate — single request, single response
user.initiate_chat(
assistant,
message="Write a Python function that validates an email address using regex."
)
The critical setting is max_consecutive_auto_reply=1 on the user proxy. Without it, AutoGen defaults to allowing many reply cycles, and a "one-shot" task turns into an extended conversation.
When it breaks down: Tasks where the first response is incomplete or needs iteration. Don't use One-Shot for code that needs testing and fixing.
Pattern 2: Multi-Turn (Iterative Refinement)
Best for: Code debugging, document drafting, research tasks that improve through iteration.
Multi-Turn lets agents exchange messages until a termination condition is met — either a keyword in the response, a max reply count, or a custom function.
# pattern_2_multiturn.py
import autogen
llm_config = {
"config_list": [{"model": "gpt-4o", "api_key": "YOUR_KEY"}],
"temperature": 0
}
assistant = autogen.AssistantAgent(
name="Debugger",
llm_config=llm_config,
system_message="""You are a debugging expert.
Analyze errors, suggest fixes, and validate when code runs correctly.
When the code runs without errors, end your response with TASK_COMPLETE."""
)
user = autogen.UserProxyAgent(
name="Developer",
human_input_mode="NEVER",
max_consecutive_auto_reply=8, # Allow up to 8 rounds
code_execution_config={"work_dir": "debug_output", "use_docker": False},
is_termination_msg=lambda msg: "TASK_COMPLETE" in msg.get("content", "")
)
buggy_code = """
def calculate_average(numbers):
total = sum(numbers)
average = total / len(numbers # Missing closing paren
return average
result = calculate_average([1, 2, 3, 4, 5])
print(f"Average: {result}")
"""
user.initiate_chat(
assistant,
message=f"Fix and run this code:\n```python\n{buggy_code}\n```"
)
The is_termination_msg lambda is the clean exit signal. Without it, you're relying entirely on max_consecutive_auto_reply — which often ends the conversation before the task is actually complete.
Iteration count vs. task complexity: For code debugging, 5-8 rounds is usually sufficient. For complex document editing, you might need 10-15. Set the limit based on task type, not as a universal default.
Pattern 3: Hierarchical (Manager + Workers)
Best for: Complex projects decomposed into parallel or sequential subtasks.
This is where AutoGen really differentiates itself. A manager agent breaks down a high-level goal and delegates to specialized worker agents. Think of it as a software team: PM assigns tasks, developers execute them.
# pattern_3_hierarchical.py
import autogen
llm_config = {
"config_list": [{"model": "gpt-4o", "api_key": "YOUR_KEY"}],
"temperature": 0
}
# Manager — orchestrates the team
manager_config = {
**llm_config,
"functions": None
}
# Specialized workers
researcher = autogen.AssistantAgent(
name="Researcher",
llm_config=llm_config,
system_message="""You research topics and return structured summaries.
Always cite specific data points or statistics when available."""
)
writer = autogen.AssistantAgent(
name="Writer",
llm_config=llm_config,
system_message="""You write clear, engaging content based on research provided.
Match the tone to the target audience specified in your brief."""
)
editor = autogen.AssistantAgent(
name="Editor",
llm_config=llm_config,
system_message="""You review content for accuracy, clarity, and style.
Flag specific issues with line references. End with APPROVED when content is ready."""
)
# Group chat brings them together
group_chat = autogen.GroupChat(
agents=[researcher, writer, editor],
messages=[],
max_round=12,
speaker_selection_method="auto" # Manager decides who speaks next
)
# GroupChatManager acts as the orchestrator
group_manager = autogen.GroupChatManager(
groupchat=group_chat,
llm_config=llm_config
)
# Initiator — sends the top-level task
initiator = autogen.UserProxyAgent(
name="ProjectLead",
human_input_mode="NEVER",
code_execution_config=False,
is_termination_msg=lambda msg: "APPROVED" in msg.get("content", "")
)
initiator.initiate_chat(
group_manager,
message="""Create a 300-word blog post about AutoGen for Python developers.
Researcher: find 2-3 key AutoGen features with stats.
Writer: draft the post from the research.
Editor: review and approve."""
)
Speaker selection matters. "auto" lets the LLM decide who should respond next based on context — this works well. "round_robin" forces a fixed order which is more predictable but less flexible. Use "auto" for complex tasks, "round_robin" when you want deterministic flow.
Pattern 4: Round-Robin (Structured Collaboration)
Best for: Tasks where every agent must contribute in a predictable sequence — code review pipelines, structured debates, checklist-driven workflows.
# pattern_4_roundrobin.py
import autogen
llm_config = {
"config_list": [{"model": "gpt-4o", "api_key": "YOUR_KEY"}],
"temperature": 0.3
}
security_reviewer = autogen.AssistantAgent(
name="SecurityReviewer",
llm_config=llm_config,
system_message="""Review code specifically for security vulnerabilities.
Check for SQL injection, XSS, auth bypasses, and insecure dependencies.
Rate severity: LOW/MEDIUM/HIGH/CRITICAL."""
)
performance_reviewer = autogen.AssistantAgent(
name="PerformanceReviewer",
llm_config=llm_config,
system_message="""Review code for performance issues.
Look for N+1 queries, unnecessary loops, memory leaks, and blocking I/O."""
)
style_reviewer = autogen.AssistantAgent(
name="StyleReviewer",
llm_config=llm_config,
system_message="""Review code for readability and style.
Check naming conventions, docstrings, complexity, and adherence to PEP 8."""
)
summarizer = autogen.AssistantAgent(
name="Summarizer",
llm_config=llm_config,
system_message="""Synthesize all review feedback into a prioritized action list.
Group by severity. End with REVIEW_COMPLETE."""
)
group_chat = autogen.GroupChat(
agents=[security_reviewer, performance_reviewer, style_reviewer, summarizer],
messages=[],
max_round=5,
speaker_selection_method="round_robin" # Each agent speaks exactly once, in order
)
manager = autogen.GroupChatManager(
groupchat=group_chat,
llm_config=llm_config
)
user = autogen.UserProxyAgent(
name="Developer",
human_input_mode="NEVER",
code_execution_config=False,
is_termination_msg=lambda msg: "REVIEW_COMPLETE" in msg.get("content", "")
)
code_to_review = """
def get_user(user_id):
query = f"SELECT * FROM users WHERE id = {user_id}"
return db.execute(query).fetchone()
"""
user.initiate_chat(
manager,
message=f"Review this code:\n```python\n{code_to_review}\n```"
)
Round-Robin is particularly good for review pipelines because you guarantee every perspective gets heard. In "auto" mode, the LLM might skip a reviewer if it thinks the conversation is done.
Pattern 5: Selector (Dynamic Agent Routing)
Best for: Tasks that require different specialists depending on the content of each message — customer support, multi-domain research, adaptive tutoring.
# pattern_5_selector.py
import autogen
llm_config = {
"config_list": [{"model": "gpt-4o", "api_key": "YOUR_KEY"}],
"temperature": 0
}
python_expert = autogen.AssistantAgent(
name="PythonExpert",
llm_config=llm_config,
system_message="You answer questions specifically about Python programming."
)
ml_expert = autogen.AssistantAgent(
name="MLExpert",
llm_config=llm_config,
system_message="You answer questions about machine learning, model training, and evaluation."
)
devops_expert = autogen.AssistantAgent(
name="DevOpsExpert",
llm_config=llm_config,
system_message="You answer questions about deployment, CI/CD, and infrastructure."
)
# Custom selector function — routes to the right expert
def select_agent(last_speaker, groupchat):
messages = groupchat.messages
if not messages:
return groupchat.agents[0]
last_msg = messages[-1]["content"].lower()
if any(word in last_msg for word in ["deploy", "docker", "kubernetes", "ci/cd", "server"]):
return devops_expert
elif any(word in last_msg for word in ["train", "model", "accuracy", "loss", "epoch", "neural"]):
return ml_expert
else:
return python_expert
group_chat = autogen.GroupChat(
agents=[python_expert, ml_expert, devops_expert],
messages=[],
max_round=10,
speaker_selection_method=select_agent # Custom routing function
)
manager = autogen.GroupChatManager(
groupchat=group_chat,
llm_config=llm_config
)
user = autogen.UserProxyAgent(
name="User",
human_input_mode="NEVER",
code_execution_config=False,
max_consecutive_auto_reply=5
)
# This conversation will route to different experts based on content
user.initiate_chat(
manager,
message="How do I deploy a PyTorch model as a REST API on Kubernetes?"
)
The selector function receives the last_speaker and groupchat object, giving you full context to make routing decisions. You can make this as sophisticated as needed — including calling a classifier model to route complex queries.
Pattern Comparison Table
| Pattern | Use Case | Complexity | Token Cost | Predictability | Best For |
|---|---|---|---|---|---|
| One-Shot | Single Q&A, code generation | Low | Low | High | Scripted workflows, data extraction |
| Multi-Turn | Debugging, drafting, refinement | Medium | Medium | Medium | Any iterative task |
| Hierarchical | Complex projects, parallel work | High | High | Medium | Full projects, research pipelines |
| Round-Robin | Structured reviews, checklists | Medium | Medium | High | Code review, compliance checks |
| Selector | Multi-domain routing | High | Low-Medium | Low | Chatbots, adaptive assistants |
Token cost reality check: Hierarchical patterns with 5+ agents and 12 rounds can run 15,000-30,000 tokens per task. At GPT-4o pricing, budget $0.05-$0.15 per complex run. For high-frequency tasks, consider dropping to GPT-4o-mini for worker agents while keeping GPT-4o for the manager.
Combining Patterns
Real applications rarely use a single pattern in isolation. A common production setup:
# hybrid_example.py — Selector routes to specialized sub-hierarchies
def advanced_selector(last_speaker, groupchat):
"""Route to sub-teams based on task type."""
messages = groupchat.messages
last_content = messages[-1]["content"].lower() if messages else ""
# Research tasks go to a 3-agent research team
if "research" in last_content or "find information" in last_content:
return research_team_manager
# Code tasks go through the multi-turn debugger
elif "code" in last_content or "implement" in last_content:
return code_team_manager
# Default to a single expert
return general_assistant
This "meta-pattern" approach is what powers production AI research agents like the ones described in AI research agent build.
Common Mistakes
Using Hierarchical when One-Shot is enough. The overhead of spinning up a GroupChat with a manager and 3 workers for a simple code generation task is wasteful. Reserve Hierarchical for tasks that genuinely benefit from specialization.
Forgetting termination conditions. Every multi-agent conversation needs an exit. Use is_termination_msg, max_consecutive_auto_reply, or both. Without them, agents will happily talk until your API budget is gone.
Making worker agents too general. The more specific and constrained each agent's system message, the better the overall output. A "Researcher" that only does research beats a "Helper" that does everything mediocrely.
For more on agent memory and how context persists across patterns, see AI agent memory and planning.
FAQs
Which AutoGen pattern is best for production systems?
The Hierarchical pattern works best in production because it separates task management from execution, making failures isolated and easier to debug. The Manager agent can retry failed subtasks without restarting the entire workflow.
Can I mix patterns within a single AutoGen application?
Yes. A common pattern is to use a Hierarchical orchestrator that spins up One-Shot agents for simple subtasks and Multi-Turn agents for tasks requiring clarification. AutoGen's GroupChat supports mixing agent types in a single session.
How do I limit token costs in multi-turn AutoGen conversations?
Set max_consecutive_auto_reply on your agents to cap the number of back-and-forth turns. You can also pass a summary_method to GroupChats so the context is compressed at intervals rather than growing indefinitely.
Frequently Asked Questions
AiTechWorlds Team
✓ Verified WriterThe AiTechWorlds team is passionate about AI, technology, and education. We create high-quality, research-backed content to help you learn, grow, and succeed in the modern digital world.
Related Articles
5 AutoGen Agent Roles (Assistant, UserProxy, CodeExecutor)
Understand the 5 core AutoGen agent types — AssistantAgent, UserProxyAgent, CodeExecutorAgent, and more — with code examples and a comparison table for each role.
How to Deploy AutoGen Agents as APIs with FastAPI (2026)
Learn to serve AutoGen multi-agent systems as production REST APIs using FastAPI with async endpoints and real-time streaming responses.
How to Use AutoGen with Azure OpenAI (Enterprise Security)
Connect Microsoft AutoGen to Azure OpenAI for enterprise-grade AI agents. Step-by-step setup with private endpoints, OAI_CONFIG_LIST, and deployment config.
Build a Code Debugging Agent with AutoGen (Auto-Fix PRs)
Build an AutoGen agent that reviews code, analyzes PR diffs, suggests fixes, and automates code quality improvements with a full working implementation.