How to Use AutoGen with Docker (Containerized Agents)
Learn how to run AutoGen agents in Docker containers for isolated, reproducible code execution. Covers DockerCommandLineCodeExecutor, docker-compose, and custom images.
Running agent-generated code directly on your laptop is fine for learning and prototyping. Running it in production without isolation is asking for trouble. Agent-generated code can be unpredictable — and unpredictable code running on your production server with access to your filesystem is a security incident waiting to happen.
Docker solves this cleanly. AutoGen's DockerCommandLineCodeExecutor routes all code execution through a container, giving you process isolation, a controlled execution environment, and the ability to define exactly what packages and permissions are available. This guide covers the full setup from local development to a production-ready docker-compose configuration.
For context on how AutoGen handles code execution in general, AI agents explained is useful background. And for a complete look at production deployment considerations, Deploy AI model to production covers the broader infrastructure picture.
Why Container Isolation Matters
When an AutoGen agent executes code, it's running arbitrary Python generated by an LLM. That code could:
- Write to the filesystem in unexpected locations
- Make outbound network requests
- Import and use system libraries
- Consume excessive CPU or memory
- Attempt to access environment variables with credentials
Without isolation, all of this runs with your user's permissions on your host. Inside a Docker container with appropriate restrictions, the blast radius of any misbehavior is contained.
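To make the credentials bullet concrete, here is a minimal sketch (not part of AutoGen) showing that code launched as a child process inherits the parent's environment unless you scrub it first. The variable name FAKE_API_KEY, its value, and the helper run_snippet are hypothetical and exist only for illustration.

```python
import os
import subprocess
import sys

# Hypothetical snippet standing in for agent-generated code.
SNIPPET = "import os; print(os.environ.get('FAKE_API_KEY', 'not visible'))"


def run_snippet(env=None):
    """Run SNIPPET in a child Python process and return its stripped stdout."""
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,  # None means "inherit the parent's environment"
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Pretend the host process holds a credential (illustrative value only).
os.environ["FAKE_API_KEY"] = "sk-demo-123"

# Child process with the inherited environment: the secret is readable.
inherited = run_snippet()

# Child process with the secret removed from its environment.
scrubbed_env = {k: v for k, v in os.environ.items() if k != "FAKE_API_KEY"}
scrubbed = run_snippet(env=scrubbed_env)

print("inherited env:", inherited)
print("scrubbed env:", scrubbed)
```

A container goes one step further than scrubbing: it starts from an environment you define explicitly, so host variables are not visible unless you pass them in.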
The OWASP Top 10 for LLM Applications (2025) includes improper output handling and excessive agency among its risks, and both apply when model output is executed without adequate controls. Container isolation directly addresses this.
Prerequisites
You need Docker Desktop (Windows/Mac) or Docker Engine (Linux) installed and running. Verify:
docker --version
# Docker version 25.0.x or later
docker run hello-world
# Should print "Hello from Docker!"
Install AutoGen with the Docker extras:
pip install "pyautogen[docker]"
The [docker] extra includes the docker Python SDK that AutoGen uses to manage containers programmatically.
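If you would rather fail fast from Python than from a stack trace deep inside an agent run, a small pre-flight check can help. This is a sketch under the assumption that the docker CLI is on your PATH; the function name docker_available is hypothetical and not part of AutoGen.

```python
import shutil
import subprocess


def docker_available(timeout: float = 10.0) -> bool:
    """Return True if the docker CLI exists and `docker info` exits cleanly."""
    if shutil.which("docker") is None:
        return False
    try:
        result = subprocess.run(
            ["docker", "info"],
            capture_output=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("Docker reachable:", docker_available())
```

Calling this before constructing the executor lets you print a clear message ("start Docker Desktop first") instead of surfacing a connection error from the Docker SDK.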
Basic DockerCommandLineCodeExecutor Setup
The simplest containerized setup uses AutoGen's built-in Docker executor:
import os
import tempfile

import autogen
from autogen.coding import DockerCommandLineCodeExecutor

# Create a temporary directory for code files shared with the container
work_dir = tempfile.mkdtemp(prefix="autogen_docker_")

# Initialize the Docker executor
docker_executor = DockerCommandLineCodeExecutor(
    image="python:3.11-slim",  # Docker image to run code in
    timeout=60,                # Max seconds per code execution
    work_dir=work_dir,         # Directory mounted into container
)

# LLM configuration
llm_config = {
    "config_list": [
        {"model": "gpt-4o", "api_key": os.getenv("OPENAI_API_KEY")}
    ]
}

# Assistant agent: generates code
assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
    system_message=(
        "You are a Python expert. Write executable Python code to complete tasks. "
        "Always use print() to show results. Keep code self-contained. "
        # Without this instruction the termination check below never fires
        "When the task is done, reply with TASK_COMPLETE."
    ),
)

# UserProxy with Docker executor
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    code_execution_config={"executor": docker_executor},
    is_termination_msg=lambda msg: "TASK_COMPLETE" in (msg.get("content") or ""),
)

# Some releases start the container in the constructor and have no start()
# method, so call it only when the installed executor provides one
if hasattr(docker_executor, "start"):
    docker_executor.start()

try:
    user_proxy.initiate_chat(
        assistant,
        message="Write a Python script that generates the first 20 Fibonacci numbers and prints them with their index.",
    )
finally:
    # Always stop the container when done
    docker_executor.stop()
When this runs, the assistant writes Python code, the executor sends it to the Docker container, runs it, and returns the output. The python:3.11-slim image has a minimal Python installation — sufficient for standard library code.
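Before anything reaches the container, the fenced code has to be pulled out of the assistant's reply. The sketch below is a simplified illustration of that step, not AutoGen's actual parser (which handles more cases); the names FENCE, CODE_BLOCK_RE, and extract_code_blocks are hypothetical.

```python
import re

# The fence marker is built programmatically so this example is easy to embed.
FENCE = "`" * 3

# Matches: opening fence, optional language tag, newline, body, closing fence.
CODE_BLOCK_RE = re.compile(
    FENCE + r"(\w*)[ \t]*\r?\n(.*?)" + FENCE,
    re.DOTALL,
)


def extract_code_blocks(message: str):
    """Return (language, code) pairs for each fenced block in an LLM reply."""
    blocks = []
    for lang, code in CODE_BLOCK_RE.findall(message):
        blocks.append((lang or "unknown", code.strip()))
    return blocks


# A made-up assistant reply containing one Python block.
reply = (
    "Here is the script:\n"
    + FENCE + "python\n"
    + "print(sum(range(5)))\n"
    + FENCE + "\n"
    + "Run it and share the output."
)

print(extract_code_blocks(reply))
```

Each extracted block is what ends up written to the work directory and executed inside the container; the surrounding prose in the reply is never run.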
Building a Custom Docker Image
The python:3.11-slim base image only has the standard library. For real agent tasks, you need additional packages.
Create a Dockerfile for your agent environment:
# Dockerfile.autogen
FROM python:3.11-slim

# Set working directory
WORKDIR /workspace

# Install system dependencies
RUN apt-get update && apt-get install -y \
    curl \
    git \
    && rm -rf /var/lib/apt/lists/*

# Install Python packages commonly needed by agents
RUN pip install --no-cache-dir \
    requests==2.31.0 \
    beautifulsoup4==4.12.2 \
    pandas==2.1.4 \
    numpy==1.26.2 \
    matplotlib==3.8.2 \
    scikit-learn==1.3.2 \
    python-dotenv==1.0.0 \
    httpx==0.25.2

# Create a non-root user and a writable workspace while still root;
# running mkdir after USER would fail because /workspace is owned by root
RUN useradd --create-home --shell /bin/bash agentuser \
    && mkdir -p /workspace/output \
    && chown -R agentuser:agentuser /workspace

# Drop privileges for everything that runs in the container
USER agentuser
Build the image:
docker build -f Dockerfile.autogen -t autogen-agent:latest .
Use it in your executor:
docker_executor = DockerCommandLineCodeExecutor(
    image="autogen-agent:latest",  # Your custom image
    timeout=120,
    work_dir=work_dir,
)
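A quick way to confirm the image really contains what the Dockerfile pins is to run an import check inside it. This is a minimal sketch; the helper missing_modules and the script name check_imports.py mentioned below are hypothetical.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Import names for the packages pinned in Dockerfile.autogen.
# Some differ from the pip name: beautifulsoup4 -> bs4, scikit-learn -> sklearn,
# python-dotenv -> dotenv.
EXPECTED = [
    "requests", "bs4", "pandas", "numpy",
    "matplotlib", "sklearn", "dotenv", "httpx",
]

if __name__ == "__main__":
    missing = missing_modules(EXPECTED)
    print("missing:", missing if missing else "none")
```

Saved as check_imports.py in the current directory, it could be run with something like docker run --rm -v "$PWD":/workspace autogen-agent:latest python check_imports.py, which mounts the script into the image's working directory.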
Docker Compose for Multi-Agent Systems
For production setups with multiple agents or persistent services, docker-compose is the right approach.
Create docker-compose.yml:
version: "3.9"

services:
  # AutoGen orchestrator: runs the Python agent code
  autogen-orchestrator:
    build:
      context: .
      dockerfile: Dockerfile.orchestrator
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - AGENT_WORK_DIR=/workspace/agent_output
      - DOCKER_HOST=tcp://docker-proxy:2375
    volumes:
      - ./agent_output:/workspace/agent_output
      - ./scripts:/workspace/scripts:ro
    depends_on:
      - docker-proxy
      - redis-memory
    networks:
      - agent-net

  # Docker socket proxy for safe container management
  docker-proxy:
    image: tecnativa/docker-socket-proxy:latest
    environment:
      - CONTAINERS=1
      - IMAGES=1
      - NETWORKS=1
      - POST=1
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - agent-net

  # Redis for agent memory persistence
  redis-memory:
    image: redis:7-alpine
    volumes:
      - redis-data:/data
    networks:
      - agent-net
    command: redis-server --appendonly yes

  # Code execution sandbox: isolated from main network
  code-sandbox:
    build:
      context: .
      dockerfile: Dockerfile.autogen
    networks:
      - sandbox-net  # Separate network, no internet access
    volumes:
      - ./agent_output:/workspace/output
    security_opt:
      - no-new-privileges:true
    read_only: true
    tmpfs:
      - /tmp
      - /workspace/temp

volumes:
  redis-data:

networks:
  agent-net:
    driver: bridge
  sandbox-net:
    # Fixed name; without it Compose prefixes the project name, and
    # references to "sandbox-net" from outside this file would not match
    name: sandbox-net
    driver: bridge
    internal: true  # No external internet access
Create Dockerfile.orchestrator:
FROM python:3.11-slim
WORKDIR /app
RUN pip install --no-cache-dir \
    "pyautogen[docker]==0.4.0" \
    python-dotenv==1.0.0 \
    redis==5.0.1
COPY scripts/ ./scripts/
COPY main.py .
CMD ["python", "main.py"]
The key security insight in this compose file: the code-sandbox service uses internal: true on its network, which means no outbound internet access. Agent code runs in that sandbox and can't make unexpected network calls.
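One way to confirm the isolation actually holds is to run a small connectivity probe from inside the sandbox. The sketch below uses only the standard library; the function name can_connect is hypothetical, and example.com is just a well-known public host used as a probe target.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example probe to run inside the sandbox container:
#   can_connect("example.com", 443)
# On a network marked internal, this is expected to return False.
```

Running the same probe on your host, where it should return True, gives you a before-and-after comparison for the network setting.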
Main Orchestrator Script
Create main.py for the orchestrator container:
# main.py
import os
import tempfile

import autogen
from autogen.coding import DockerCommandLineCodeExecutor
from dotenv import load_dotenv

load_dotenv()


def create_executor(work_dir: str) -> DockerCommandLineCodeExecutor:
    """Create a Docker executor pointing at the sandbox container."""
    return DockerCommandLineCodeExecutor(
        image="autogen-agent:latest",
        timeout=int(os.getenv("CODE_EXECUTION_TIMEOUT", "60")),
        work_dir=work_dir,
        # Additional Docker run arguments for security.
        # Not every AutoGen release accepts docker_run_kwargs; check the
        # constructor signature of your installed version.
        docker_run_kwargs={
            "network_mode": "sandbox-net",
            "mem_limit": "512m",
            "cpu_period": 100000,
            "cpu_quota": 50000,  # 50% CPU limit
            "read_only": True,
            "tmpfs": {"/tmp": "size=100m"},
        },
    )


def run_coding_agent(task: str, output_dir: str) -> dict:
    """Run a coding agent for a given task."""
    llm_config = {
        "config_list": [
            {"model": "gpt-4o", "api_key": os.getenv("OPENAI_API_KEY")}
        ],
        "temperature": 0,
    }

    # mkdtemp fails if the parent directory is missing, so create it first
    os.makedirs(output_dir, exist_ok=True)
    work_dir = tempfile.mkdtemp(dir=output_dir)
    executor = create_executor(work_dir)

    assistant = autogen.AssistantAgent(
        name="CodingAssistant",
        llm_config=llm_config,
        system_message=(
            "You are a Python expert. Write clean, well-commented code. "
            "Always include error handling. When complete, say TASK_COMPLETE."
        ),
    )

    user_proxy = autogen.UserProxyAgent(
        name="Executor",
        human_input_mode="NEVER",
        max_consecutive_auto_reply=8,
        code_execution_config={"executor": executor},
        is_termination_msg=lambda m: "TASK_COMPLETE" in (m.get("content") or ""),
    )

    # Some releases start the container in the constructor and have no
    # start() method, so call it only when the executor provides one
    if hasattr(executor, "start"):
        executor.start()

    try:
        chat_result = user_proxy.initiate_chat(
            assistant,
            message=task,
            max_turns=10,
        )
        return {
            "status": "success",
            "work_dir": work_dir,
            "turns": len(chat_result.chat_history),
        }
    except Exception as e:
        return {"status": "error", "error": str(e)}
    finally:
        executor.stop()


if __name__ == "__main__":
    result = run_coding_agent(
        task="Analyze the CSV file at /workspace/output/data.csv and create a summary report.",
        output_dir=os.getenv("AGENT_WORK_DIR", "/workspace/agent_output"),
    )
    print(f"Agent completed: {result}")
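The termination lambda in the script is compact, so here is the same check written out as a named function. The "or" fallback matters because a message's content can be missing or None (for example, when a reply carries only a function or tool call), and a bare "in" test against None would raise a TypeError. The names TERMINATION_MARKER and is_termination_msg are used here for illustration.

```python
TERMINATION_MARKER = "TASK_COMPLETE"


def is_termination_msg(message: dict) -> bool:
    """Return True when the message content contains the termination marker."""
    content = message.get("content") or ""  # content may be missing or None
    return TERMINATION_MARKER in content


print(is_termination_msg({"content": "All done. TASK_COMPLETE"}))  # True
print(is_termination_msg({"content": None}))                       # False
print(is_termination_msg({}))                                      # False
```

The marker only works if the assistant's system message tells the model to emit it, which is why the system message in main.py ends with "When complete, say TASK_COMPLETE."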
Resource Limits for Production
Without resource limits, a runaway agent can consume all available CPU and memory. The docker_run_kwargs in the executor let you set limits:
docker_executor = DockerCommandLineCodeExecutor(
    image="autogen-agent:latest",
    timeout=120,
    work_dir=work_dir,
    docker_run_kwargs={
        # Memory limits
        "mem_limit": "1g",       # Max 1GB RAM
        "memswap_limit": "1g",   # No swap beyond RAM limit
        # CPU limits
        "cpu_period": 100000,
        "cpu_quota": 50000,      # 50% of one CPU core
        "cpuset_cpus": "0,1",    # Use only CPUs 0 and 1
        # Security
        "security_opt": ["no-new-privileges:true"],
        "cap_drop": ["ALL"],                 # Drop all Linux capabilities
        "cap_add": ["NET_BIND_SERVICE"],     # Add back only what's needed
        # Filesystem
        "read_only": True,
        "tmpfs": {
            "/tmp": "size=200m,mode=1777",
            "/workspace": "size=500m",
        },
        # Network
        "network_mode": "none",  # Complete network isolation
    },
)
Use "network_mode": "none" for maximum isolation when the agent doesn't need internet access. Use a named network (like sandbox-net from the compose file) when you need controlled access.
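The CPU and memory values above are easier to reason about with the arithmetic spelled out: cpu_quota is the number of microseconds of CPU time allowed per cpu_period, so quota divided by period gives the number of cores, and Docker's memory suffixes are binary multiples. The two helpers below are hypothetical conveniences, not part of AutoGen or the Docker SDK.

```python
def cpu_quota_for(cores: float, cpu_period: int = 100_000) -> int:
    """Return the cpu_quota (microseconds per period) for a number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return int(round(cores * cpu_period))


def parse_mem_limit(limit: str) -> int:
    """Convert a Docker-style size such as '512m' or '1g' to bytes."""
    units = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}
    text = limit.strip().lower()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)  # plain number of bytes


print(cpu_quota_for(0.5))        # 50000
print(cpu_quota_for(2))          # 200000
print(parse_mem_limit("512m"))   # 536870912
print(parse_mem_limit("1g"))     # 1073741824
```

So the 50000 / 100000 pair used in this guide caps the container at half a core, and raising the quota to 200000 with the same period would allow two full cores.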
Running the Complete Setup
# Set your OpenAI key
export OPENAI_API_KEY=sk-your-key-here
# Build images
docker-compose build
# Start services
docker-compose up -d
# View logs
docker-compose logs -f autogen-orchestrator
# Stop everything
docker-compose down
For local development without full compose:
# Build just the agent image
docker build -f Dockerfile.autogen -t autogen-agent:latest .
# Run a single agent task
python main.py
Debugging Container Execution
When things go wrong in Docker execution, these commands help:
# List running containers
docker ps --filter "name=autogen"
# Stream container logs
docker logs -f <container_id>
# Open a shell in the running sandbox container
docker exec -it <container_id> /bin/bash
# Check resource usage
docker stats <container_id>
# Inspect container configuration
docker inspect <container_id>
For code execution failures, check the work directory on your host. It is bind-mounted into the container, so you can see the exact code files the agent wrote:
ls -la ./agent_output/
cat ./agent_output/<generated_file>.py
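When the work directory fills up with nested temp folders, finding the file from the latest failed run by hand gets tedious. A small helper can list the newest scripts first. This is a sketch; the function name newest_files is hypothetical, and ./agent_output is the host directory from the compose file.

```python
from pathlib import Path


def newest_files(work_dir, pattern="*.py", limit=5):
    """Return up to `limit` files under work_dir matching pattern, newest first."""
    root = Path(work_dir)
    if not root.is_dir():
        return []
    # rglob searches subdirectories too, which covers per-run temp folders.
    files = [p for p in root.rglob(pattern) if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


if __name__ == "__main__":
    for path in newest_files("./agent_output"):
        print(path)
```

Sorting by modification time means the script the agent wrote most recently appears at the top, which is usually the one tied to the error you are chasing.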
Container vs Local Execution Comparison
| Factor | Local Execution | Docker Execution |
|---|---|---|
| Security | None | Strong isolation |
| Setup time | Zero | 5-10 minutes |
| Reproducibility | Host-dependent | Consistent |
| Resource control | None | Full (CPU, RAM, I/O) |
| Debugging | Easy | Moderate |
| Speed (cold start) | Instant | +1-3 seconds |
| Speed (warm) | Fast | Near-equal |
| Production suitability | No | Yes |
| Multi-tenant use | Unsafe | Safe |
Connecting to Larger Systems
Container isolation is one layer of a production agent system. For the memory and state management layer, AI agent memory and planning covers patterns that work well alongside Docker-isolated execution. For a complete multi-agent deployment architecture, Deploy AI model to production covers the infrastructure pieces that surround the agent runtime.
The agent patterns in Build AI agent with LangChain rely on similar isolation concepts, so the Docker setup knowledge transfers across frameworks.
FAQ
Why run AutoGen agents in Docker instead of locally? Docker provides process isolation, so agent-generated code can't affect your host system. It also ensures reproducible environments — the same container runs identically in development, staging, and production. For multi-tenant or production use, container isolation is essentially required.
Does Docker slow down AutoGen agent execution? Container startup adds roughly 1-3 seconds to the first code execution per session. After that, code runs inside the already-running container with minimal overhead. For long-running agent sessions, this startup cost is negligible.
What is DockerCommandLineCodeExecutor?
DockerCommandLineCodeExecutor is AutoGen's built-in executor that runs agent-generated code inside a Docker container instead of the host process. It starts a container, writes the code to the mounted work directory, executes it inside the container, and returns the output, all transparently to the agents.
Can I install custom Python packages in the AutoGen Docker container? Yes. Build a custom Docker image on top of a base such as python:3.11-slim and add the packages in the Dockerfile. Alternatively, the agent can run pip install inside the container at runtime, provided the container has network access, though building packages into the image is faster and more reliable.
How do I share files between my host machine and the AutoGen container?
Use Docker volume mounts. Mount a host directory to the container's work directory so files written by the agent are accessible on your host. The DockerCommandLineCodeExecutor has a work_dir parameter you can configure for this purpose.
AiTechWorlds Team