How to Set Up AutoGPT with Pinecone (Persistent Memory)
Step-by-step guide to configuring AutoGPT with Pinecone for persistent long-term memory. Covers Pinecone setup, memory.json config, and memory_backend settings.
One of my early frustrations with AutoGPT was watching it re-research things it had already figured out. Every new session started from zero. The agent would spend 10 minutes rediscovering facts it had already found the day before, burning API tokens on ground it had already covered.
Pinecone fixes this. Once you configure AutoGPT to use Pinecone as its memory backend, every piece of information the agent encounters gets stored as a vector embedding. When it starts a new session, it can retrieve relevant memories and build on prior context instead of starting fresh.
This guide walks through the complete setup: creating a Pinecone index, configuring AutoGPT's environment, understanding the memory configuration files, and verifying that memory persists correctly across sessions. For background on why vector storage matters for agents, the Vector database guide is worth reading first.
Understanding AutoGPT Memory Architecture
Before diving into setup, it helps to understand what AutoGPT actually stores as memory.
When AutoGPT processes information — from a web search, a file it read, or a conclusion it reached — it can store a summary or raw text as a memory. That text gets converted to a vector embedding (a high-dimensional numerical representation) and stored in the configured backend. On retrieval, AutoGPT embeds its current query and finds the most semantically similar stored memories.
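The retrieval step described above can be illustrated with a toy example. This is a minimal sketch using hand-made 3-dimensional vectors in place of real 1536-dimensional embeddings; the `memories` dictionary and the `recall` helper are hypothetical and are not part of AutoGPT or Pinecone.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings": real ones come from an embedding model, not by hand.
memories = {
    "AutoGen 0.4 release notes": [0.9, 0.1, 0.0],
    "Pinecone free tier limits": [0.1, 0.9, 0.1],
    "Weekly funding roundup": [0.0, 0.2, 0.9],
}


def recall(query_vector: list[float], top_k: int = 2) -> list[tuple[float, str]]:
    """Score every stored memory against the query and keep the best top_k."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in memories.items()
    ]
    scored.sort(reverse=True)
    return scored[:top_k]


# A query vector that points mostly in the same direction as the first memory.
top = recall([0.8, 0.2, 0.1])
for score, text in top:
    print(f"{score:.3f}  {text}")
```

A vector database performs the same ranking, but with an index structure that avoids comparing the query against every stored vector.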
The default local memory backend writes to a JSON file. It works but has obvious limitations: it doesn't scale beyond a few thousand memories, and it's tied to a single machine. Pinecone is a hosted vector database purpose-built for exactly this kind of semantic search at scale.
Memory Backend Comparison
| Backend | Persistent | Scalable | Setup Complexity | Cost |
|---|---|---|---|---|
| Local (JSON) | No (session only) | No | None | Free |
| Redis | Yes | Medium | Medium | Self-hosted or $5+/mo |
| Pinecone | Yes | High | Low | Free tier available |
| Milvus | Yes | Very high | High | Self-hosted |
| Weaviate | Yes | High | Medium | Free tier available |
Pinecone wins on the setup-to-capability ratio. Five minutes of configuration gives you a production-grade vector store.
Step 1: Create a Pinecone Account and Index
Go to pinecone.io and create a free account. Once logged in:
- Click Create Index
- Set Index Name to autogpt-memory (you'll reference this in config)
- Set Dimensions to 1536 (matches OpenAI's text-embedding-ada-002 output)
- Set Metric to cosine
- Choose Serverless (free tier) in the us-east-1 region
- Click Create Index
Copy your API key from the API Keys section in the left sidebar. You'll need it shortly.
Your Pinecone environment string will look like us-east-1 for serverless indexes. Note this as well.
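If you prefer to script the index creation instead of clicking through the console, the same settings can be passed to the Pinecone Python SDK. This is a sketch under assumptions: it presumes the v3+ SDK is installed, that the us-east-1 serverless region runs on AWS, and that your key is exported as PINECONE_API_KEY. The function name `create_index_if_missing` is hypothetical.

```python
import os

INDEX_NAME = "autogpt-memory"
DIMENSION = 1536      # output size of text-embedding-ada-002
METRIC = "cosine"
CLOUD = "aws"         # assumption: us-east-1 serverless is hosted on AWS
REGION = "us-east-1"


def create_index_if_missing(api_key: str) -> bool:
    """Create the index when it does not exist yet. Returns True if created."""
    # Imported lazily so the constants above are usable without the SDK installed.
    from pinecone import Pinecone, ServerlessSpec

    pc = Pinecone(api_key=api_key)
    existing = [idx.name for idx in pc.list_indexes()]
    if INDEX_NAME in existing:
        return False
    pc.create_index(
        name=INDEX_NAME,
        dimension=DIMENSION,
        metric=METRIC,
        spec=ServerlessSpec(cloud=CLOUD, region=REGION),
    )
    return True


# Only talks to Pinecone when run as a script with an API key present.
if __name__ == "__main__" and os.getenv("PINECONE_API_KEY"):
    created = create_index_if_missing(os.environ["PINECONE_API_KEY"])
    print("Created index" if created else "Index already exists")
```

Checking for an existing index first makes the script safe to rerun.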
Step 2: Install AutoGPT with Dependencies
Clone the repository if you haven't already:
git clone https://github.com/Significant-Gravitas/AutoGPT.git
cd AutoGPT/classic
pip install -r requirements.txt
Install the Pinecone client:
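pip install "pinecone-client>=3.0.0"
Keep the quotes: without them the shell treats > as an output redirect and the version constraint is lost. The >=3.0.0 floor matters because Pinecone's Python SDK had a major breaking change in v3. AutoGPT's current versions expect the v3+ API, where the import is from pinecone import Pinecone rather than the old import pinecone.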
Step 3: Configure the .env File
AutoGPT uses a .env file for all configuration. Copy the template:
cp .env.template .env
Open .env and configure these settings:
################################################################################
### LLM PROVIDER
################################################################################
OPENAI_API_KEY=sk-your-openai-key-here
################################################################################
### MEMORY BACKEND
################################################################################
# Options: local, redis, pinecone, milvus, no_memory
MEMORY_BACKEND=pinecone
################################################################################
### PINECONE CONFIGURATION
################################################################################
PINECONE_API_KEY=your-pinecone-api-key-here
PINECONE_ENV=us-east-1
# The name of the index you created in Step 1
PINECONE_INDEX_NAME=autogpt-memory
################################################################################
### EMBEDDING MODEL
################################################################################
# This must match the dimensions of your Pinecone index (1536 for ada-002)
EMBEDDING_MODEL=text-embedding-ada-002
################################################################################
### AGENT SETTINGS
################################################################################
# Name of the memory index AutoGPT reads from and writes to
MEMORY_INDEX=autogpt
# Number of memories to retrieve per query
MEMORY_RECALL_COUNT=5
The MEMORY_RECALL_COUNT=5 setting controls how many memories AutoGPT retrieves when it queries its memory. More memories give the agent more context but consume more of the prompt window. Five is a reasonable starting point.
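A typo in .env is easy to miss, so a quick sanity check before launching the agent can save a failed run. The sketch below parses .env-style text with the standard library and reports missing or suspicious values. The helper names `parse_env` and `find_problems` are hypothetical, and the list of required keys simply mirrors the settings shown above.

```python
REQUIRED_KEYS = [
    "OPENAI_API_KEY",
    "MEMORY_BACKEND",
    "PINECONE_API_KEY",
    "PINECONE_ENV",
    "PINECONE_INDEX_NAME",
    "EMBEDDING_MODEL",
]


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def find_problems(settings: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the file looks sane."""
    problems = [f"missing {key}" for key in REQUIRED_KEYS if not settings.get(key)]
    if settings.get("MEMORY_BACKEND", "").lower() != "pinecone":
        problems.append("MEMORY_BACKEND is not 'pinecone'")
    recall = settings.get("MEMORY_RECALL_COUNT", "5")
    if not recall.isdigit() or int(recall) < 1:
        problems.append("MEMORY_RECALL_COUNT must be a positive integer")
    return problems


sample = """
OPENAI_API_KEY=sk-your-openai-key-here
MEMORY_BACKEND=pinecone
PINECONE_API_KEY=your-pinecone-api-key-here
PINECONE_ENV=us-east-1
PINECONE_INDEX_NAME=autogpt-memory
EMBEDDING_MODEL=text-embedding-ada-002
MEMORY_RECALL_COUNT=5
"""
print(find_problems(parse_env(sample)) or "No problems found")
```

To check a real file, pass the result of `open(".env").read()` to `parse_env` instead of the sample string.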
Step 4: Verify Pinecone Connection
Before running AutoGPT, verify the connection:
# verify_pinecone.py
import os

from dotenv import load_dotenv
from pinecone import Pinecone

load_dotenv()
pc = Pinecone(api_key=os.getenv("PINECONE_API_KEY"))

# List existing indexes
indexes = pc.list_indexes()
print("Available indexes:", [idx.name for idx in indexes])

# Check your specific index
index_name = os.getenv("PINECONE_INDEX_NAME", "autogpt-memory")
if index_name in [idx.name for idx in indexes]:
    index = pc.Index(index_name)
    stats = index.describe_index_stats()
    print(f"\nIndex '{index_name}' stats:")
    print(f"  Total vectors: {stats.total_vector_count}")
    print(f"  Dimension: {stats.dimension}")
    print("Connection successful!")
else:
    print(f"Index '{index_name}' not found. Check your index name.")
Run it:
python verify_pinecone.py
Expected output:
Available indexes: ['autogpt-memory']

Index 'autogpt-memory' stats:
  Total vectors: 0
  Dimension: 1536
Connection successful!
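The Dimension value reported by the verification script has to equal the output size of the embedding model, or upserts will be rejected. A small lookup makes that comparison explicit. This is a sketch: the table lists the default output sizes of three OpenAI embedding models, and the function name `check_dimension` is hypothetical.

```python
# Default output sizes of OpenAI embedding models.
MODEL_DIMENSIONS = {
    "text-embedding-ada-002": 1536,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def check_dimension(model: str, index_dimension: int) -> str:
    """Compare an index dimension against the model's default output size."""
    expected = MODEL_DIMENSIONS.get(model)
    if expected is None:
        return f"Unknown model '{model}': look up its output size manually."
    if expected != index_dimension:
        return (
            f"Mismatch: {model} produces {expected} dimensions "
            f"but the index has {index_dimension}. Recreate the index."
        )
    return "OK"


print(check_dimension("text-embedding-ada-002", 1536))
print(check_dimension("text-embedding-3-large", 1536))
```

Pinecone does not let you change the dimension of an existing index, which is why a mismatch means deleting and recreating it.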
Step 5: Understanding memory.json
AutoGPT also maintains a memory.json file in its working directory. This file tracks metadata about the agent's state, not the actual vector embeddings. Here's what it contains:
{
  "agent_name": "ResearchBot",
  "agent_role": "A specialized research agent focused on technology trends",
  "agent_goals": [
    "Research emerging AI frameworks",
    "Track funding news in the AI space",
    "Summarize findings weekly"
  ],
  "memories": [
    {
      "id": "mem_001",
      "timestamp": "2026-05-31T10:23:15Z",
      "type": "research",
      "summary": "Found that AutoGen 0.4 released in Q4 2025 with major API changes",
      "source": "https://microsoft.github.io/autogen/",
      "relevance_score": 0.94
    }
  ],
  "session_count": 7,
  "last_run": "2026-05-31T10:45:00Z"
}
The memory.json serves as a lightweight index and audit trail. The actual semantic content lives in Pinecone. When AutoGPT stores a memory, it:
- Writes metadata to memory.json
- Generates an embedding via the OpenAI API
- Upserts the embedding + text into Pinecone
- Uses the Pinecone vector ID to link memory.json to the vector store
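The four steps above can be sketched as a single function. This is an illustration of the flow, not AutoGPT's actual implementation: `store_memory` is a hypothetical name, and the embedder and vector store are passed in as plain callables so the example runs with in-memory stand-ins instead of the OpenAI and Pinecone clients.

```python
from typing import Callable

Vector = list[float]


def store_memory(
    text: str,
    memory_id: str,
    journal: dict,
    embed: Callable[[str], Vector],
    upsert: Callable[[list[dict]], None],
) -> dict:
    """Record metadata, embed the text, upsert it, and link both by ID."""
    entry = {"id": memory_id, "summary": text}
    journal.setdefault("memories", []).append(entry)    # 1. metadata record
    vector = embed(text)                                # 2. embedding
    upsert([{                                           # 3. vector + text
        "id": memory_id,
        "values": vector,
        "metadata": {"text": text},
    }])
    return entry                                        # 4. same ID in both stores


# In-memory stand-ins for the real clients (assumptions for this demo only).
fake_index: dict[str, dict] = {}


def fake_embed(text: str) -> Vector:
    return [float(len(text)), 1.0, 0.0]


def fake_upsert(vectors: list[dict]) -> None:
    for item in vectors:
        fake_index[item["id"]] = item


journal: dict = {"memories": []}
store_memory("AutoGen 0.4 changed its API", "mem_001", journal, fake_embed, fake_upsert)
print(journal["memories"], list(fake_index))
```

Because the same ID appears in both places, an entry in the metadata file can always be traced to its vector and back.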
Step 6: Manually Writing and Retrieving Memories
Understanding the memory API directly helps when debugging or pre-seeding memory:
# memory_operations.py
import os

import openai
from dotenv import load_dotenv
from pinecone import Pinecone

load_dotenv()

# Initialize clients
pc = Pinecone(api_key=os.getenv("PINECONE_API_KEY"))
openai_client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
index = pc.Index(os.getenv("PINECONE_INDEX_NAME", "autogpt-memory"))


def embed_text(text: str) -> list[float]:
    """Generate an embedding vector for text using OpenAI."""
    response = openai_client.embeddings.create(
        input=text,
        model="text-embedding-ada-002",
    )
    return response.data[0].embedding


def save_memory(memory_id: str, text: str, metadata: dict | None = None) -> None:
    """Store a text memory in Pinecone."""
    vector = embed_text(text)
    index.upsert(
        vectors=[{
            "id": memory_id,
            "values": vector,
            "metadata": {
                "text": text,
                **(metadata or {}),
            },
        }]
    )
    print(f"Saved memory: {memory_id}")


def recall_memories(query: str, top_k: int = 5) -> list[dict]:
    """Retrieve the most relevant memories for a query."""
    query_vector = embed_text(query)
    results = index.query(
        vector=query_vector,
        top_k=top_k,
        include_metadata=True,
    )
    return [
        {
            "id": match.id,
            "score": match.score,
            "text": match.metadata.get("text", ""),
        }
        for match in results.matches
    ]


# Pre-seed memory with context
save_memory(
    "mem_001",
    "AutoGen 0.4 introduced the AgentChat API with cleaner multi-agent patterns",
    {"type": "research", "topic": "AutoGen"},
)
save_memory(
    "mem_002",
    "Pinecone serverless is available on the free tier with up to 100K vectors",
    {"type": "research", "topic": "Pinecone"},
)

# Test recall
memories = recall_memories("What do I know about AutoGen?")
print("\nRecalled memories:")
for m in memories:
    print(f"  [{m['score']:.3f}] {m['text']}")
Running this pre-seeds your Pinecone index so AutoGPT has context from the start of its first session.
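When you pre-seed more than a handful of facts, hand-numbered IDs become easy to reuse by accident, and an upsert with an existing ID overwrites the stored vector. One option is to derive the ID from the text itself, so the same fact always maps to the same ID and rerunning the seeding script never creates duplicates. This is a sketch; `memory_id_for` and the `seed` prefix are hypothetical choices, not an AutoGPT convention.

```python
import hashlib


def memory_id_for(text: str, prefix: str = "seed") -> str:
    """Build a stable ID from the text, ignoring case and extra whitespace."""
    normalized = " ".join(text.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]
    return f"{prefix}_{digest}"


first = memory_id_for("Pinecone serverless is available on the free tier")
second = memory_id_for("  pinecone  Serverless is available on the free tier ")
print(first)
print(first == second)
```

The returned string can be passed as the first argument of a save function in place of a literal such as "mem_001".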
Step 7: Running AutoGPT with Pinecone Memory
Start AutoGPT normally. The memory backend configuration in .env handles the rest:
python -m autogpt --ai-name "ResearchBot" --ai-role "A research agent" --ai-goals "Research AI agent frameworks" --ai-goals "Track new framework releases in 2026"
During execution, you'll see log lines like:
Memory Backend: pinecone
Creating memory: "Found information about AutoGen 0.4 release notes..."
Stored memory in Pinecone (ID: mem_1748694821_001)
Recalling memories for query: "AutoGen recent updates"
Retrieved 5 relevant memories
After the session, run your verification script again:
python verify_pinecone.py
Total vectors should now be greater than 0.
Step 8: Memory Across Sessions
To test persistence, stop AutoGPT after the first session. Then run a new session with a related goal:
python -m autogpt --ai-name "ResearchBot" --ai-role "A research agent" --ai-goals "Continue researching AI frameworks based on previous findings"
The agent should retrieve memories from the previous session and reference prior findings. If you see log lines like Retrieved 5 relevant memories early in the session, persistence is working correctly.
Troubleshooting Common Issues
"Pinecone index dimension mismatch"
Your index dimension (1536) must match the embedding model output. If you changed the embedding model, delete and recreate the index with the correct dimensions.
"Connection refused" or authentication errors
Double-check PINECONE_API_KEY in your .env. Pinecone API keys are scoped to a project, so make sure you copied the key from the project that contains your index.
Memories aren't being recalled
Check MEMORY_RECALL_COUNT — if set to 0, no memories are retrieved. Also verify that MEMORY_BACKEND=pinecone is set correctly and not overridden by a local override file.
High embedding costs
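Each memory save and each recall generates an embedding API call. Every agent action triggers one embedding call plus one vector query regardless of MEMORY_RECALL_COUNT, and on long runs this adds up. Lowering MEMORY_RECALL_COUNT to 3 does not reduce the number of embedding calls, but it does shrink the recalled text added to each prompt, which lowers LLM token usage. To cut embedding calls themselves, store fewer or shorter memories.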
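A back-of-the-envelope estimate helps decide whether embedding costs are worth worrying about. The sketch below multiplies calls by tokens by price. Every number in the example call is a placeholder assumption, including the price per million tokens, so substitute your own measurements and the current rate from your provider's pricing page.

```python
def estimate_embedding_cost(
    actions: int,
    saves_per_action: float,
    recalls_per_action: float,
    avg_tokens_per_text: int,
    price_per_million_tokens: float,
) -> tuple[float, float]:
    """Return (embedding_calls, estimated_cost) for one agent run."""
    calls = actions * (saves_per_action + recalls_per_action)
    tokens = calls * avg_tokens_per_text
    cost = tokens * price_per_million_tokens / 1_000_000
    return calls, cost


# Placeholder inputs: 500 actions, one save and one recall per action,
# 200 tokens per embedded text, and an assumed price of 0.10 per million tokens.
calls, cost = estimate_embedding_cost(500, 1, 1, 200, 0.10)
print(f"{calls:.0f} embedding calls, estimated cost {cost:.4f}")
```

Note that the retrieval count does not appear in the formula: a recall embeds one query no matter how many matches it returns.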
Integration with Agent Frameworks
The Pinecone memory pattern translates well beyond AutoGPT. If you're building an AI agent with LangChain or applying the patterns from the AI agent memory and planning guide, the same Pinecone index can serve as the vector store for any framework. The embedding + retrieval pattern is universal.
For the full picture of how memory integrates into agentic systems, the AI research agent build guide shows a complete implementation with similar memory patterns.
FAQ
Why does AutoGPT need persistent memory at all?
By default, AutoGPT only holds memory within a single session. When you stop and restart the agent, it forgets everything it learned. Pinecone stores memory as vector embeddings that survive restarts, letting the agent build on previous research, decisions, and context across multiple sessions.
How much does Pinecone cost for AutoGPT memory?
Pinecone's free tier (Starter plan) provides 1 index with 100K vectors and 5GB storage, which is more than enough for personal AutoGPT use. A typical AutoGPT session generates 50-200 memory embeddings, so 100K vectors covers roughly 500 to 2,000 sessions before you hit the free tier limit.
Can I migrate my AutoGPT memory from local JSON to Pinecone?
There is no official migration tool, but the JSON memory files store text snippets you can re-embed and upsert into Pinecone manually. The memory format is simple enough that a short Python script can handle the migration.
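A migration script along those lines might look like the sketch below. It assumes the JSON file has the layout shown in Step 5 (a "memories" list whose items carry "id", "summary", and optionally "source"); your file may differ, so adjust the field names. The helper names are hypothetical, and the embedder is passed in as a callable so the example runs with a stand-in instead of the OpenAI client.

```python
import json
from typing import Callable, Iterator


def load_local_memories(json_text: str) -> list[dict]:
    """Keep only entries that have text worth embedding."""
    data = json.loads(json_text)
    return [m for m in data.get("memories", []) if m.get("summary")]


def to_vectors(memories: list[dict], embed: Callable[[str], list[float]]) -> list[dict]:
    """Re-embed each summary and shape it like a Pinecone upsert record."""
    return [
        {
            "id": m["id"],
            "values": embed(m["summary"]),
            "metadata": {"text": m["summary"], "source": m.get("source", "")},
        }
        for m in memories
    ]


def batched(items: list, size: int = 100) -> Iterator[list]:
    """Yield fixed-size chunks so each upsert request stays small."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


sample_json = json.dumps({
    "memories": [
        {"id": "mem_001", "summary": "AutoGen 0.4 has major API changes"},
        {"id": "mem_002", "summary": ""},
    ]
})


def fake_embed(text: str) -> list[float]:
    # Stand-in for a real embedding call; returns a fixed-size dummy vector.
    return [float(len(text)), 0.0, 0.0]


vectors = to_vectors(load_local_memories(sample_json), fake_embed)
for batch in batched(vectors):
    # With a real index: index.upsert(vectors=batch)
    print(f"Would upsert {len(batch)} vector(s)")
```

Entries without text are skipped because there is nothing to embed for them.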
What is the memory_backend setting in AutoGPT?
The memory_backend setting in AutoGPT's .env file determines where memories are stored. Options include local (JSON file), redis, pinecone, and milvus. Setting it to pinecone tells AutoGPT to use Pinecone for all vector storage and retrieval operations.
Does persistent memory make AutoGPT smarter?
It makes AutoGPT more consistent and context-aware across sessions, not fundamentally smarter. The agent can recall what it researched in previous sessions, avoid repeating work, and build on prior decisions. For long-running or recurring tasks, this is a meaningful improvement.
AiTechWorlds Team