What is the difference between Assistants API and Chat Completions?

Chat Completions: stateless. You manage message history, tools, and context yourself. Simple, flexible, no vendor lock-in. Best for: custom architectures, multi-model systems, custom RAG, full control.

Assistants API: stateful. OpenAI manages threads and runs the built-in tools server-side (custom functions still execute in your own code). Best for: built-in Code Interpreter (run Python, create plots), File Search (RAG over PDFs without building it), persistent multi-session conversations, and simpler implementation of standard agent patterns. The main downside is vendor lock-in: you can't use Claude or Gemini with Assistants API threads.
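
To make "you manage message history" concrete, here is a minimal sketch of client-side state for a stateless chat API. The `Conversation` class and the `send` callable are hypothetical stand-ins, not part of any SDK; `send` represents whatever model call you use, which is also why this pattern is not tied to one provider.

```python
from typing import Callable

Message = dict[str, str]


class Conversation:
    """Client-side history for a stateless chat API (hypothetical helper)."""

    def __init__(self, system_prompt: str, send: Callable[[list[Message]], str]):
        # `send` is a placeholder for any provider call that takes the full
        # message list and returns the assistant's reply text.
        self._send = send
        self.messages: list[Message] = [{"role": "system", "content": system_prompt}]

    def ask(self, user_text: str) -> str:
        # The whole history is resent on every turn; the API keeps no state.
        self.messages.append({"role": "user", "content": user_text})
        reply = self._send(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def echo_model(messages: list[Message]) -> str:
    """Fake model for demonstration: reports how many messages it received."""
    return f"received {len(messages)} messages"


chat = Conversation("You are concise.", send=echo_model)
first = chat.ask("Hello")
second = chat.ask("And again")
print(first)
print(second)
```

With the Assistants API, the equivalent of `self.messages` lives in a Thread on OpenAI's side.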

How does Code Interpreter work in the Assistants API?

Code Interpreter runs Python in a sandboxed environment managed by OpenAI. The assistant can write and execute Python code to: analyze data (upload a CSV and ask it to 'find the top 10 customers'), create visualizations (ask for a chart and it generates and returns a PNG), solve math problems (it runs the calculation rather than guessing the answer), process files (read, transform, write), and debug its own code (if execution fails, it sees the error and retries). Files created by Code Interpreter (charts, processed CSVs) are returned as file IDs that you can download. No infrastructure is required on your side: OpenAI manages the execution environment.

How does File Search work in the Assistants API?

File Search is built-in RAG. Upload documents to a Vector Store, attach the Vector Store to an Assistant, and the assistant automatically searches the uploaded files when relevant. Supported formats include PDF, DOCX, TXT, Markdown, and code files. OpenAI handles chunking, embedding, and retrieval automatically, so you don't need a vector database or an embedding model. Limitations: you get less control over chunking strategy and retrieval quality than with a custom RAG implementation, and all files must go through OpenAI (a privacy consideration for sensitive documents). File Search is best for quick document Q&A without RAG infrastructure and for teams without ML engineering resources.
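
Chunking is the step File Search hides from you. As a rough illustration only (word-based rather than token-based, and not OpenAI's implementation), fixed-size chunking with overlap can be sketched as follows. The function name and the `chunk_size` and `overlap` parameters are illustrative assumptions.

```python
def chunk_words(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into chunks of `chunk_size` words that share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


chunks = chunk_words("a b c d e f g", chunk_size=4, overlap=2)
print(chunks)
```

Overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of storing some text twice.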

What are Threads and how does persistent memory work?

A Thread is a conversation container: it stores all Messages in order. Create one Thread per user conversation. Threads persist until you delete them. When a Thread exceeds the model's context window, OpenAI automatically truncates the oldest messages, so you don't write any context management code. Store thread_id in your database per user/session to maintain continuity. Limitation: control over which messages are kept is coarse. A run's truncation_strategy setting can restrict context to the most recent messages, but it cannot pin specific ones. For custom truncation strategies (e.g., always keep the first message as a summary), use Chat Completions with your own context management instead.
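
The custom strategy mentioned above (always keep the first message) could look like the sketch below when you manage history yourself. It counts messages rather than tokens for simplicity, which is an assumption; a production version would budget by tokens. The function name is hypothetical.

```python
def truncate_keep_first(messages: list[dict], max_messages: int) -> list[dict]:
    """Keep the first message plus the most recent ones, up to max_messages."""
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    if len(messages) <= max_messages:
        return list(messages)
    recent_count = max_messages - 1
    # Guard the zero case: messages[-0:] would return the whole list.
    recent = messages[-recent_count:] if recent_count else []
    return [messages[0]] + recent


history = [{"role": "user", "content": f"turn {i}"} for i in range(6)]
trimmed = truncate_keep_first(history, max_messages=3)
print([m["content"] for m in trimmed])
```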

OpenAI Assistants API Guide: Build Agents with Code Interpreter and File Search

⚡ Quick Answer

OpenAI Assistants API guide — build AI agents with persistent threads, Code Interpreter, File Search, and function calling. Complete Python tutorial with production patterns.

AiTechWorlds Team May 27, 2026 7 min read

When a user uploaded a 500-row CSV and asked for "a chart of the top 10 customers by revenue," I had two choices: build a complete data processing pipeline with pandas and matplotlib, or use the Assistants API's Code Interpreter.

Code Interpreter had it running in an afternoon. The custom pipeline would have taken a week, plus ongoing maintenance.

That's the proposition of the Assistants API: trade flexibility and control for managed infrastructure. Here's when it's worth it and how to use it.

Core Concepts

Assistants API Object Model:

Assistant
  → Configured once: model, instructions, tools, files
  → Reusable across many conversations

Thread
  → One per user conversation
  → Stores all messages automatically
  → Persists until deleted
  → Handles context overflow automatically

Message
  → User or Assistant turn in a thread
  → Can contain text, images, files

Run
  → One execution of the LLM against the thread
  → Has status: queued → in_progress → completed/failed (also requires_action, cancelling, cancelled, expired, incomplete)
  → Must poll or stream for completion

Run Step
  → Individual action within a run
  → Tool call, message creation, etc.
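
The Run lifecycle above implies a polling loop with a clear stopping rule. The sketch below shows one way to write it with a timeout. `wait_for_run` and `fetch_status` are hypothetical names; in real use `fetch_status` would wrap a call that retrieves the run and returns its status string. Here it is driven by a scripted sequence so the example runs without network access.

```python
import time
from typing import Callable

# Statuses after which the run will not progress on its own.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired", "incomplete"}
NEEDS_CALLER = {"requires_action"}  # the caller must submit tool outputs


def wait_for_run(
    fetch_status: Callable[[], str],
    interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until the run stops progressing or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES or status in NEEDS_CALLER:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still {status!r} after {timeout} seconds")
        sleep(interval)


# Demonstration with a scripted sequence instead of a real API call.
scripted = iter(["queued", "in_progress", "in_progress", "completed"])
final_status = wait_for_run(lambda: next(scripted), sleep=lambda _: None)
print(final_status)
```

Returning on `requires_action` instead of continuing to poll matters: the run waits for tool outputs and would otherwise sit there until it expires.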

Setup and First Assistant

from openai import OpenAI
import time

client = OpenAI()

# Create an assistant (do this once, reuse the ID)
assistant = client.beta.assistants.create(
    name="Research Assistant",
    instructions="""You are a helpful research assistant. 
    Use Code Interpreter for data analysis and calculations.
    Use File Search to answer questions from uploaded documents.
    Always provide sources when using document information.""",
    model="gpt-4o",
    tools=[
        {"type": "code_interpreter"},
        {"type": "file_search"}
    ]
)

print(f"Assistant created: {assistant.id}")
# Save this ID — reference it instead of recreating

# List existing assistants
assistants = client.beta.assistants.list()
for a in assistants.data:
    print(f"{a.id}: {a.name}")

# Update assistant
client.beta.assistants.update(
    assistant.id,
    instructions="Updated instructions here"
)
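
One way to follow the "save this ID" advice is to cache the ID in a small local file and create the assistant only when the cache is empty. This is a sketch under assumptions: the helper name, the JSON file layout, and the `create` callable are all hypothetical. In real use `create` would call the assistant-creation code above and return its `id`; a fake is used here so the example runs offline.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def get_or_create_assistant_id(cache_path: Path, create: Callable[[], str]) -> str:
    """Return the cached assistant ID, calling create() only on first use."""
    if cache_path.exists():
        return json.loads(cache_path.read_text())["assistant_id"]
    assistant_id = create()
    cache_path.write_text(json.dumps({"assistant_id": assistant_id}))
    return assistant_id


create_calls = []


def fake_create() -> str:
    """Stand-in for the real creation call; records each invocation."""
    create_calls.append(1)
    return "asst_demo123"


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "assistant.json"
    first_id = get_or_create_assistant_id(cache, fake_create)
    second_id = get_or_create_assistant_id(cache, fake_create)

print(first_id, second_id, len(create_calls))
```

An environment variable or a settings table works equally well; the point is that creation happens once, not on every process start.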

Threads and Messages

def create_and_run(
    assistant_id: str,
    user_message: str,
    thread_id: str | None = None,
    attachments: list | None = None
) -> tuple[str, str]:
    """
    Start or continue a conversation.
    Returns (response_text, thread_id)
    """
    
    # Create new thread or use existing
    if not thread_id:
        thread = client.beta.threads.create()
        thread_id = thread.id
    
    # Add user message
    message_params = {
        "thread_id": thread_id,
        "role": "user",
        "content": user_message
    }
    if attachments:
        message_params["attachments"] = attachments
    
    client.beta.threads.messages.create(**message_params)
    
    # Start a run
    run = client.beta.threads.runs.create(
        thread_id=thread_id,
        assistant_id=assistant_id,
        # Override assistant settings if needed:
        # temperature=0.5,
        # max_completion_tokens=1000
    )
    
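    # Poll until the run stops progressing
    while run.status in ["queued", "in_progress", "cancelling"]:
        time.sleep(1)
        run = client.beta.threads.runs.retrieve(
            thread_id=thread_id,
            run_id=run.id
        )
    
    if run.status == "requires_action":
        # Function calling: delegate to handle_requires_action, defined in
        # the "Function Calling in Assistants API" section
        return handle_requires_action(thread_id, run.id), thread_id
    
    if run.status != "completed":
        # Covers failed, cancelled, expired, and incomplete runs
        raise RuntimeError(f"Run ended with status {run.status}: {run.last_error}")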
    
    # Get messages
    messages = client.beta.threads.messages.list(
        thread_id=thread_id,
        order="desc",
        limit=1
    )
    
    response = messages.data[0].content[0].text.value
    return response, thread_id

# Usage
response, thread_id = create_and_run(
    assistant.id,
    "What are the main causes of inflation?"
)
print(f"Response: {response}")
print(f"Thread ID: {thread_id}")

# Continue conversation (persistent memory)
response2, _ = create_and_run(
    assistant.id,
    "How do central banks typically respond?",
    thread_id=thread_id  # Same thread
)
print(f"Follow-up: {response2}")
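
Continuing a conversation across sessions requires looking up the user's thread_id. The sketch below uses SQLite from the standard library; the table name, the helper name, and the `create_thread` callable are assumptions for illustration. In real use `create_thread` would create a Thread through the client and return its `id`; a fake generator stands in here.

```python
import sqlite3
from typing import Callable


def get_or_create_thread_id(
    conn: sqlite3.Connection, user_id: str, create_thread: Callable[[], str]
) -> str:
    """Look up the user's thread ID, creating and storing one if none exists."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_threads ("
        "user_id TEXT PRIMARY KEY, thread_id TEXT NOT NULL)"
    )
    row = conn.execute(
        "SELECT thread_id FROM user_threads WHERE user_id = ?", (user_id,)
    ).fetchone()
    if row:
        return row[0]
    thread_id = create_thread()
    conn.execute(
        "INSERT INTO user_threads (user_id, thread_id) VALUES (?, ?)",
        (user_id, thread_id),
    )
    conn.commit()
    return thread_id


conn = sqlite3.connect(":memory:")
counter = iter(range(1, 100))


def make_fake_thread() -> str:
    """Stand-in for a real thread-creation call."""
    return f"thread_demo{next(counter)}"


alice_first = get_or_create_thread_id(conn, "alice", make_fake_thread)
alice_again = get_or_create_thread_id(conn, "alice", make_fake_thread)
bob_first = get_or_create_thread_id(conn, "bob", make_fake_thread)
print(alice_first, alice_again, bob_first)
```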

Code Interpreter: Data Analysis

def analyze_data_file(assistant_id: str, file_path: str, analysis_request: str) -> dict:
    """Upload a CSV and ask the assistant to analyze it."""
    
    # Upload file
    with open(file_path, "rb") as f:
        uploaded_file = client.files.create(
            file=f,
            purpose="assistants"
        )
    
    print(f"Uploaded file: {uploaded_file.id}")
    
    # Create thread with file attachment
    thread = client.beta.threads.create()
    
    client.beta.threads.messages.create(
        thread_id=thread.id,
        role="user",
        content=analysis_request,
        attachments=[
            {
                "file_id": uploaded_file.id,
                "tools": [{"type": "code_interpreter"}]
            }
        ]
    )
    
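    # Run and wait for it to finish
    run = client.beta.threads.runs.create_and_poll(
        thread_id=thread.id,
        assistant_id=assistant_id
    )
    if run.status != "completed":
        raise RuntimeError(f"Run ended with status {run.status}: {run.last_error}")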
    
    # Get response and any generated files
    messages = client.beta.threads.messages.list(thread_id=thread.id, order="desc")
    
    result = {
        "text_response": "",
        "generated_files": []
    }
    
    for content_block in messages.data[0].content:
        if content_block.type == "text":
            result["text_response"] = content_block.text.value
        elif content_block.type == "image_file":
            # Download generated chart/image
            file_id = content_block.image_file.file_id
            file_data = client.files.content(file_id)
            chart_path = f"chart_{file_id}.png"
            with open(chart_path, "wb") as f:
                f.write(file_data.read())
            result["generated_files"].append(chart_path)
            print(f"Chart saved: {chart_path}")
    
    return result

# Example: Analyze sales data
result = analyze_data_file(
    assistant.id,
    "sales_data.csv",
    "Analyze this sales data: show top 10 products by revenue, "
    "calculate month-over-month growth, and create a bar chart."
)
print(result["text_response"])
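
Charts arrive as image_file content blocks, but other generated files (for example a processed CSV) are referenced through file_path annotations on text blocks. The sketch below collects both kinds of file ID from a message represented as a plain dict, as in the JSON form of the API response. The helper name and the sample message are made up for illustration.

```python
def collect_generated_file_ids(message: dict) -> list[str]:
    """Return file IDs for images and file_path annotations in one message."""
    file_ids = []
    for block in message.get("content", []):
        if block.get("type") == "image_file":
            file_ids.append(block["image_file"]["file_id"])
        elif block.get("type") == "text":
            for annotation in block["text"].get("annotations", []):
                if annotation.get("type") == "file_path":
                    file_ids.append(annotation["file_path"]["file_id"])
    return file_ids


# Hand-written sample shaped like an assistant message (IDs are invented).
sample_message = {
    "role": "assistant",
    "content": [
        {"type": "image_file", "image_file": {"file_id": "file-img1"}},
        {
            "type": "text",
            "text": {
                "value": "Download the cleaned data: sandbox:/mnt/data/clean.csv",
                "annotations": [
                    {
                        "type": "file_path",
                        "text": "sandbox:/mnt/data/clean.csv",
                        "file_path": {"file_id": "file-csv1"},
                    }
                ],
            },
        },
    ],
}

generated_ids = collect_generated_file_ids(sample_message)
print(generated_ids)
```

Each returned ID can then be downloaded the same way the chart is downloaded in analyze_data_file.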

File Search: Document Q&A

def setup_document_assistant(
    documents: list[str],  # File paths
    assistant_name: str = "Document Assistant"
) -> tuple[str, str]:
    """Create an assistant with RAG over documents."""
    
    # Create vector store
    # (recent openai-python releases expose this as client.vector_stores;
    # older releases used client.beta.vector_stores)
    vector_store = client.vector_stores.create(
        name=f"{assistant_name} Knowledge Base"
    )
    
    # Upload all files
    file_ids = []
    for doc_path in documents:
        with open(doc_path, "rb") as f:
            uploaded = client.files.create(file=f, purpose="assistants")
            file_ids.append(uploaded.id)
            print(f"Uploaded: {doc_path} → {uploaded.id}")
    
    # Add files to vector store and wait for indexing to finish
    client.vector_stores.file_batches.create_and_poll(
        vector_store_id=vector_store.id,
        file_ids=file_ids
    )
    
    # Create assistant with vector store
    assistant = client.beta.assistants.create(
        name=assistant_name,
        instructions="""You are a helpful assistant that answers questions based on 
        the provided documents. Always cite which document your information comes from.
        If the answer isn't in the documents, say so clearly.""",
        model="gpt-4o-mini",
        tools=[{"type": "file_search"}],
        tool_resources={
            "file_search": {
                "vector_store_ids": [vector_store.id]
            }
        }
    )
    
    return assistant.id, vector_store.id

# Setup with your documents
assistant_id, vs_id = setup_document_assistant([
    "product_manual.pdf",
    "faq.txt",
    "policies.docx"
])

# Now answer questions from documents
response, thread_id = create_and_run(
    assistant_id,
    "What is the warranty period for the product?"
)
print(response)  # Answers from your documents with citations
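
File Search answers carry their citations as file_citation annotations on the text block, with a marker embedded in the text itself. The sketch below turns those markers into numbered references. It works on the plain-dict form of a text block; the helper name and the sample data are invented for illustration, and the marker string is only an example of what the annotation's text field might hold.

```python
def render_citations(text_block: dict) -> tuple[str, list[str]]:
    """Replace citation markers with [n] and return (text, cited file IDs)."""
    text = text_block["value"]
    cited: list[str] = []
    for annotation in text_block.get("annotations", []):
        if annotation.get("type") != "file_citation":
            continue
        file_id = annotation["file_citation"]["file_id"]
        if file_id not in cited:
            cited.append(file_id)
        # Number citations by first appearance so repeated sources share a number.
        text = text.replace(annotation["text"], f" [{cited.index(file_id) + 1}]")
    return text, cited


sample_block = {
    "value": "The warranty lasts two years【4:0†source】.",
    "annotations": [
        {
            "type": "file_citation",
            "text": "【4:0†source】",
            "file_citation": {"file_id": "file-manual"},
        }
    ],
}

clean_text, sources = render_citations(sample_block)
print(clean_text)
print(sources)
```

Mapping each file ID back to a filename requires a separate lookup of the file's metadata.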

Function Calling in Assistants API

import json

# Create assistant with custom functions
assistant_with_functions = client.beta.assistants.create(
    name="Database Assistant",
    instructions="Help users query the database. Use the query_database tool.",
    model="gpt-4o-mini",
    tools=[
        {
            "type": "function",
            "function": {
                "name": "query_database",
                "description": "Query the database for customer information",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "customer_id": {"type": "string"},
                        "fields": {
                            "type": "array",
                            "items": {"type": "string"},
                            "description": "Fields to return: name, email, orders, etc."
                        }
                    },
                    "required": ["customer_id"]
                }
            }
        }
    ]
)

def handle_requires_action(thread_id: str, run_id: str) -> str:
    """Handle function calling in the Assistants API."""
    
    run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
    
    # A run can request tool calls more than once, so loop until it stops asking
    while run.status == "requires_action" and run.required_action.type == "submit_tool_outputs":
        tool_outputs = []
        
        for tool_call in run.required_action.submit_tool_outputs.tool_calls:
            func_name = tool_call.function.name
            args = json.loads(tool_call.function.arguments)
            
            # Execute your function
            if func_name == "query_database":
                # Your actual database query here, using args["customer_id"]
                # and the optional args.get("fields")
                result = {"name": "John Smith", "email": "john@example.com", "orders": 15}
                output = json.dumps(result)
            else:
                output = json.dumps({"error": f"Function not found: {func_name}"})
            
            tool_outputs.append({
                "tool_call_id": tool_call.id,
                "output": output
            })
        
        # Submit results back to the run and wait for its next status
        run = client.beta.threads.runs.submit_tool_outputs_and_poll(
            thread_id=thread_id,
            run_id=run_id,
            tool_outputs=tool_outputs
        )
    
    if run.status != "completed":
        raise RuntimeError(f"Run ended with status {run.status}: {run.last_error}")
    
    # Get final response
    messages = client.beta.threads.messages.list(thread_id=thread_id, order="desc", limit=1)
    return messages.data[0].content[0].text.value
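
As the number of functions grows, an if/else chain on the function name gets unwieldy. A registry that maps names to callables is one alternative, sketched below on plain dicts shaped like tool calls. `TOOL_REGISTRY`, `build_tool_outputs`, and the canned `query_database` body are hypothetical; only the tool_call_id/output shape of each result follows the code above.

```python
import json
from typing import Callable, Optional


def query_database(customer_id: str, fields: Optional[list[str]] = None) -> dict:
    """Stand-in for a real query; returns canned data for illustration."""
    record = {"name": "John Smith", "email": "john@example.com", "orders": 15}
    if fields:
        record = {key: record[key] for key in fields if key in record}
    return {"customer_id": customer_id, **record}


TOOL_REGISTRY: dict[str, Callable[..., dict]] = {"query_database": query_database}


def build_tool_outputs(tool_calls: list[dict]) -> list[dict]:
    """Run each requested function and package results for submission."""
    outputs = []
    for call in tool_calls:
        name = call["function"]["name"]
        handler = TOOL_REGISTRY.get(name)
        if handler is None:
            result = {"error": f"unknown function: {name}"}
        else:
            try:
                arguments = json.loads(call["function"]["arguments"])
                result = handler(**arguments)
            except (json.JSONDecodeError, TypeError) as exc:
                # Malformed JSON or arguments that don't match the signature
                result = {"error": f"bad arguments: {exc}"}
        outputs.append({"tool_call_id": call["id"], "output": json.dumps(result)})
    return outputs


sample_calls = [
    {
        "id": "call_1",
        "function": {
            "name": "query_database",
            "arguments": '{"customer_id": "c42", "fields": ["name"]}',
        },
    },
    {"id": "call_2", "function": {"name": "drop_table", "arguments": "{}"}},
]

outputs = build_tool_outputs(sample_calls)
print(outputs)
```

Returning errors as JSON output, rather than raising, lets the model see what went wrong and respond to the user instead of leaving the run stuck.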

Streaming Responses

def stream_assistant_response(assistant_id: str, message: str, thread_id: str | None = None) -> tuple[str, str]:
    """Stream response for real-time display."""
    
    if not thread_id:
        thread = client.beta.threads.create()
        thread_id = thread.id
    
    client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=message
    )
    
    full_response = ""
    
    with client.beta.threads.runs.stream(
        thread_id=thread_id,
        assistant_id=assistant_id
    ) as stream:
        for text in stream.text_deltas:
            print(text, end="", flush=True)
            full_response += text
    
    print()  # Newline
    return full_response, thread_id

Conclusion

The Assistants API is the right choice when you need Code Interpreter (data analysis without infrastructure), File Search (document Q&A without building RAG), or managed conversation state. The tradeoff is less control and vendor lock-in to OpenAI.

For production systems where you need flexibility — custom models, custom RAG, multi-model routing — use Chat Completions API with your own state management. For applications where OpenAI-managed infrastructure saves significant engineering time, the Assistants API earns its place.

For custom agents with full control, see our LangGraph agent tutorial. For the Chat Completions API, see our OpenAI API integration guide.

Frequently Asked Questions

The Assistants API is OpenAI's managed agent platform. Unlike the Chat Completions API where you manage conversation history and tool execution yourself, the Assistants API manages Threads (conversation history), Messages (individual turns), Runs (LLM executions), and Run Steps (individual tool calls) server-side. Built-in tools: Code Interpreter (executes Python, creates files/charts), File Search (RAG over uploaded files), Function Calling (custom tools). Use it when you want OpenAI to manage stateful conversations, need Code Interpreter without building it yourself, or need file upload and search out of the box.
