OpenAI API Integration: Complete Python Guide for Building AI Applications

OpenAI API integration guide — complete Python tutorial covering authentication, chat completions, function calling, assistants, embeddings, vision, and production best practices.

AiTechWorlds Team
May 27, 2026 7 min read

The OpenAI API is one of the best-designed developer APIs I've worked with: clear documentation, consistent behavior, and a Python SDK that handles the difficult parts. Getting your first API call working takes minutes.

Getting it production-ready — with proper error handling, cost management, streaming, and function calling — takes more. This guide covers everything from the first API call to the patterns I use in real applications.


Setup and Authentication

# Install the OpenAI Python SDK
# pip install openai

# Recommended: Use environment variables for API keys
# Never hardcode API keys in source code

# Option 1: Environment variable (recommended)
import os
os.environ["OPENAI_API_KEY"] = "sk-..."  # Illustration only; set the real key in your shell or a .env file

# Option 2: .env file with python-dotenv
# pip install python-dotenv
from dotenv import load_dotenv
load_dotenv()  # Loads from .env file

# Option 3: Pass directly (only for testing)
from openai import OpenAI
client = OpenAI(api_key="sk-...")

# Recommended: let SDK read from environment
client = OpenAI()  # Reads OPENAI_API_KEY automatically

# Test your connection
response = client.models.list()
print([model.id for model in response.data[:5]])
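
A missing key otherwise only surfaces when the client is constructed or the first request fails, so it can help to check for it at startup. The following is a minimal sketch of that check, not part of the SDK: `require_api_key` is a hypothetical helper, and the `env` parameter exists only so the function can be exercised without touching the real environment.

```python
import os

def require_api_key(env=os.environ) -> str:
    """Return the API key from the environment or fail with a clear message."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it or add it to your .env file"
        )
    return key

# Demonstration with a stand-in mapping instead of the real environment
print(require_api_key({"OPENAI_API_KEY": "test-key"}))
```

Calling `require_api_key()` with no arguments before `OpenAI()` turns a confusing authentication failure into an immediate, readable error.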

Chat Completions: Core API

from openai import OpenAI

client = OpenAI()

# Basic completion
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the difference between a list and a tuple in Python?"}
    ],
    temperature=0.7,       # 0 = most deterministic, 2 = most random
    max_tokens=500,        # Maximum response length
    n=1,                   # Number of responses to generate
)

# Access the response
message = response.choices[0].message.content
print(message)

# Token usage
print(f"Input tokens: {response.usage.prompt_tokens}")
print(f"Output tokens: {response.usage.completion_tokens}")
print(f"Total tokens: {response.usage.total_tokens}")

# Streaming
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Explain decorators in Python"}],
    stream=True
)

for chunk in stream:
    if not chunk.choices:  # Some chunks (e.g. usage-only) carry no choices
        continue
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
print()

# JSON mode: output parses as JSON unless it is cut off by the token limit.
# The prompt must mention "JSON" or the API rejects the request.
json_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": "Extract name, age, and email as a JSON object from: 'John Smith, 30, john@example.com'"
        }
    ],
    response_format={"type": "json_object"}
)

import json
data = json.loads(json_response.choices[0].message.content)
print(data)  # e.g. {"name": "John Smith", "age": 30, "email": "john@example.com"}
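
JSON mode constrains the reply to parse as JSON, but it does not enforce which keys or types come back. A small validation step catches a renamed field or a number returned as a string before it reaches the rest of your code. This is a sketch under assumptions: `EXPECTED_FIELDS` and `parse_extraction` are hypothetical names matching the extraction example, not SDK features.

```python
import json

# Hypothetical schema for the extraction example: field name -> expected type
EXPECTED_FIELDS = {"name": str, "age": int, "email": str}

def parse_extraction(raw: str) -> dict:
    """Parse a JSON-mode reply and check it has the fields we asked for."""
    data = json.loads(raw)  # Raises json.JSONDecodeError on truncated output
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in data:
            raise ValueError(f"Missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"Field {field!r} should be {expected_type.__name__}")
    return data

# A well-formed reply passes validation
good = parse_extraction(
    '{"name": "John Smith", "age": 30, "email": "john@example.com"}'
)
print(good["age"])
```

Because `json.JSONDecodeError` is a subclass of `ValueError`, callers can catch `ValueError` alone to handle both malformed and incomplete replies.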

Function Calling (Tool Use)

import json

# Define tools (functions the model can call)
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City and country, e.g. 'London, UK'"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["location"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_database",
            "description": "Search product database for items matching a query",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search query"},
                    "max_results": {"type": "integer", "description": "Maximum results to return"}
                },
                "required": ["query"]
            }
        }
    }
]

# Actual function implementations
def get_current_weather(location: str, unit: str = "celsius") -> str:
    # In production: call a real weather API
    return f"The weather in {location} is 22°{unit[0].upper()}, partly cloudy."

def search_database(query: str, max_results: int = 5) -> str:
    # In production: query your database
    return f"Found {max_results} products matching '{query}': [Product 1, Product 2...]"

def execute_tool(tool_call) -> str:
    """Execute the appropriate function based on tool call."""
    function_name = tool_call.function.name
    arguments = json.loads(tool_call.function.arguments)
    
    if function_name == "get_current_weather":
        return get_current_weather(**arguments)
    elif function_name == "search_database":
        return search_database(**arguments)
    else:
        return f"Unknown function: {function_name}"

def agent_with_tools(user_message: str, max_turns: int = 5) -> str:
    """Simple agent loop with tool execution."""
    
    messages = [{"role": "user", "content": user_message}]
    
    # Cap the number of round trips so a misbehaving model cannot loop forever
    for _ in range(max_turns):
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=tools,
            tool_choice="auto"  # Model decides when to use tools
        )
        
        choice = response.choices[0]
        
        if not choice.message.tool_calls:
            # No tool requested: return the model's text. This also covers
            # finish_reason values such as "length" and "content_filter".
            return choice.message.content or ""
        
        # Model wants to call one or more tools
        messages.append(choice.message)  # Add assistant message with tool calls
        
        for tool_call in choice.message.tool_calls:
            result = execute_tool(tool_call)
            
            # Add tool result to messages
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result
            })
    
    raise RuntimeError(f"No final answer after {max_turns} tool rounds")

# Example
answer = agent_with_tools("What's the weather in Tokyo, and search for umbrellas?")
print(answer)

Embeddings API

# Generate embeddings for semantic search
response = client.embeddings.create(
    model="text-embedding-3-small",  # 1536 dimensions, $0.02/1M tokens
    input=[
        "Machine learning is a subset of AI.",
        "Python is great for data science.",
        "The weather is sunny today."
    ]
)

embeddings = [item.embedding for item in response.data]
print(f"Embedding dimension: {len(embeddings[0])}")  # 1536

# Semantic similarity
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# ML and data science should be more similar than weather
sim_ml_ds = cosine_similarity(embeddings[0], embeddings[1])
sim_ml_weather = cosine_similarity(embeddings[0], embeddings[2])

print(f"ML ↔ Data Science: {sim_ml_ds:.3f}")   # Expected to be the higher score
print(f"ML ↔ Weather: {sim_ml_weather:.3f}")     # Expected to be the lower score

# Dimension reduction (Matryoshka embeddings)
response_reduced = client.embeddings.create(
    model="text-embedding-3-small",
    input="Machine learning explanation",
    dimensions=256  # Reduce from 1536 to 256 (smaller vectors to store and search, small quality loss)
)
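
Once the vectors exist, semantic search comes down to ranking stored vectors by cosine similarity against a query vector. Here is a minimal sketch of that ranking step in pure Python. The 3-dimensional vectors are made up to stand in for real embeddings, and `rank_by_similarity` is a hypothetical helper; for large collections a vector database or NumPy would be the more realistic choice.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, doc_vecs, top_k=2):
    """Return (index, score) pairs for the top_k most similar documents."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy vectors standing in for real embeddings (made up for illustration)
docs = [[1.0, 0.0, 0.0], [0.8, 0.6, 0.0], [0.0, 0.0, 1.0]]
query = [1.0, 0.1, 0.0]

top = rank_by_similarity(query, docs)
print([index for index, _ in top])
```

In a real application `docs` would hold the embeddings of your documents and `query` the embedding of the user's search text, both produced by the same model.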

Vision API

import base64

def analyze_image(image_path: str, prompt: str) -> str:
    with open(image_path, "rb") as f:
        image_data = base64.b64encode(f.read()).decode("utf-8")
    
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:image/jpeg;base64,{image_data}",
                            "detail": "high"  # or "low" for faster/cheaper
                        }
                    },
                    {"type": "text", "text": prompt}
                ]
            }
        ]
    )
    return response.choices[0].message.content

# Extract data from invoice
invoice_data = analyze_image(
    "invoice.jpg",
    "Extract all data as JSON: invoice number, date, items with prices, total."
)
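
The function above labels every file as `image/jpeg`, which is only correct for JPEG input. One way to handle PNG or WebP files as well is to derive the MIME type from the file extension. The sketch below assumes the extension is trustworthy; `to_data_url` is a hypothetical helper, not an SDK function.

```python
import base64
import mimetypes
from pathlib import Path

def to_data_url(image_path: str) -> str:
    """Build a base64 data URL whose MIME type matches the file extension."""
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognized image type: {image_path}")
    encoded = base64.b64encode(Path(image_path).read_bytes()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string can be passed as the `url` value inside the `image_url` content part in place of the hardcoded f-string.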

Error Handling and Retries

import time
from openai import OpenAI, RateLimitError, APIConnectionError, APIStatusError

client = OpenAI(max_retries=5)  # SDK handles retries automatically

def robust_completion(messages: list, max_attempts: int = 3) -> str:
    """Completion with explicit error handling."""
    
    for attempt in range(max_attempts):
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=messages,
                timeout=30.0  # 30 second timeout
            )
            return response.choices[0].message.content
        
        except RateLimitError as e:
            if attempt == max_attempts - 1:
                raise
            wait_time = 2 ** attempt  # Exponential backoff
            print(f"Rate limited. Waiting {wait_time}s...")
            time.sleep(wait_time)
        
        except APIConnectionError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(1)
        
        except APIStatusError as e:
            if 400 <= e.status_code < 500:
                raise  # Client errors (400, 401, 403, 404, 422) will not succeed on retry
            if attempt == max_attempts - 1:
                raise
            time.sleep(1)
    
    raise RuntimeError("Max attempts reached")

Conclusion

The OpenAI API covers most AI application patterns: chat completions for conversations, function calling for tool-using agents, embeddings for semantic search, vision for image analysis, and the Assistants API for managed conversations with file access.

The core pattern for production: use the standard Chat Completions API with gpt-4o-mini for most tasks, add function calling when you need structured outputs or tool use, and switch to gpt-4o only when quality genuinely matters for the task.

For building complete AI applications with LangChain on top of this API, see our LangChain tutorial. For cost optimization strategies, see our LLM token pricing guide.


Frequently Asked Questions

How do I get started with the OpenAI API?

Create an account at platform.openai.com, generate an API key, run pip install openai, set the OPENAI_API_KEY environment variable, and call client.chat.completions.create(). The minimal working example is 8 lines. Add a spending limit before going to production.

What is function calling in OpenAI API?

You describe functions as JSON schemas. When the model decides to call one, it generates structured JSON arguments; you execute the function and return the result. This enables structured extraction, tool-using agents, and connecting LLMs to external systems, and it is far more reliable than asking the model to return JSON in message content.
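
Because the model writes the arguments, the executing side should treat them as untrusted input. A minimal sketch of a dispatch step that uses a name-to-function registry and reports bad input back as text the model can read and correct; `TOOL_REGISTRY` and `run_tool` are hypothetical names, and the weather function is a stub.

```python
import json

def get_current_weather(location: str, unit: str = "celsius") -> str:
    return f"Weather for {location} in {unit}"  # Stub for illustration

# Registry mapping tool names to Python callables
TOOL_REGISTRY = {"get_current_weather": get_current_weather}

def run_tool(name: str, raw_arguments: str) -> str:
    """Execute a tool by name; return an error string instead of raising."""
    func = TOOL_REGISTRY.get(name)
    if func is None:
        return f"Error: unknown function {name}"
    try:
        arguments = json.loads(raw_arguments)
        return func(**arguments)
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON, or missing/unexpected parameters
        return f"Error: invalid arguments for {name}: {exc}"

print(run_tool("get_current_weather", '{"location": "Tokyo"}'))
```

Returning the error as the tool result, rather than raising, gives the model a chance to retry the call with corrected arguments on the next turn.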

What is the Assistants API and when should I use it?

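The Assistants API manages conversation threads, file uploads, and tool execution server-side. Use it for built-in Code Interpreter, managed file search (RAG), and to avoid building custom conversation state management. Use Chat Completions for custom RAG, multi-model systems, or when you need full control. OpenAI has announced that the Assistants API is deprecated in favor of the Responses API, so check the current deprecation timeline before starting a new project on it.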
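
With Chat Completions the conversation state is yours to manage: the full message history is resent on every call, so it grows without bound unless you trim it. The sketch below shows one simple policy, keeping system messages plus the most recent turns. `trim_history` and its default limit are hypothetical, and a production version would likely count tokens rather than messages.

```python
def trim_history(messages: list[dict], max_messages: int = 6) -> list[dict]:
    """Keep system messages plus the most recent non-system messages."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    recent = rest[-max_messages:] if max_messages > 0 else []
    return system + recent

# Build a sample history: one system message and five question/answer pairs
history = [{"role": "system", "content": "You are a helpful assistant."}]
for i in range(5):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(history, max_messages=4)
print(len(trimmed))
```

One caveat with this policy: if the history contains tool calls, cutting by message count can separate a tool result from the assistant message that requested it, so trim at turn boundaries in that case.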

How do I handle rate limits and errors?

The SDK retries automatically with exponential backoff: twice by default, or more if you raise max_retries (for example max_retries=5). For custom handling, catch RateLimitError for 429s, APIConnectionError for network issues, and APIStatusError for other HTTP errors. Never retry a 400 BadRequestError; the same request will fail again.
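
When many workers hit a rate limit at once, plain exponential backoff makes them all retry at the same moments. Adding random jitter spreads the retries out. The sketch below computes a "full jitter" schedule; it illustrates the general technique and is not a description of the SDK's internal algorithm. `backoff_delays`, its defaults, and the injectable `rng` parameter are all assumptions for illustration.

```python
import random

def backoff_delays(max_attempts: int, base: float = 1.0,
                   cap: float = 30.0, rng=random.random) -> list:
    """Return the wait before each retry, using full jitter.

    Each wait is a random fraction of an exponentially growing ceiling
    (base, 2*base, 4*base, ...) that never exceeds cap.
    """
    delays = []
    for attempt in range(max_attempts - 1):  # No wait after the final attempt
        ceiling = min(cap, base * 2 ** attempt)
        delays.append(rng() * ceiling)
    return delays

# With rng pinned to 1.0 the schedule shows the ceilings themselves
print(backoff_delays(4, rng=lambda: 1.0))
```

In a retry loop, each value would be passed to `time.sleep` before the next attempt in place of a fixed `2 ** attempt` wait.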

How do I reduce OpenAI API costs in production?

Use gpt-4o-mini instead of gpt-4o; it is more than an order of magnitude cheaper per token, though the exact ratio depends on current list prices. Control output length with explicit instructions and max_tokens. Use the Batch API for non-real-time work (50% discount). Cache responses for repeated queries. Send only relevant context via RAG instead of full documents.
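
Two of these tactics are easy to sketch: estimating spend from the token counts each response reports, and caching responses for identical requests. The prices and the model name below are placeholders, not current list prices, and `estimate_cost`, `cache_key`, and `cached_call` are hypothetical helpers. `call_fn` stands in for whatever function actually calls the API.

```python
import hashlib
import json

# Assumed prices in USD per 1M tokens; placeholders, not current list prices
ASSUMED_PRICES = {"small-model": {"input": 0.15, "output": 0.60}}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost of one request from its token usage."""
    price = ASSUMED_PRICES[model]
    total = prompt_tokens * price["input"] + completion_tokens * price["output"]
    return total / 1_000_000

def cache_key(model: str, messages: list) -> str:
    """Stable hash of the request, used as the cache key."""
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

_cache = {}

def cached_call(model: str, messages: list, call_fn):
    """Call call_fn only when this exact request has not been seen before."""
    key = cache_key(model, messages)
    if key not in _cache:
        _cache[key] = call_fn(model, messages)
    return _cache[key]
```

An in-memory dictionary only helps within one process; a shared store would be needed across workers. Caching also suits deterministic settings best, since with a high temperature users may expect varied answers to the same question.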
