How to Use LangChain with ChromaDB (Local Vector Store)
Use LangChain with ChromaDB for persistent local embeddings — setup, metadata filtering, similarity search, and collection management with full Python code.
When you are building a RAG system locally, ChromaDB is the fastest path from documents to searchable embeddings. No API keys for the vector store, no cloud configuration, no billing — just install the package and start indexing.
ChromaDB integrates tightly with LangChain, which means you can swap it for Pinecone or Weaviate later with minimal code changes. But for development, prototyping, and small-to-medium production deployments, ChromaDB with persistent storage is excellent.
This guide covers everything: creating collections, adding documents with metadata, performing filtered similarity search, updating and deleting documents, and building a complete RAG pipeline on top of ChromaDB.
Why ChromaDB for Local Development
The standard vector database comparison for RAG development:
- ChromaDB: local or server mode, zero configuration, excellent LangChain integration
- FAISS: in-memory index, extremely fast, widely used in research; persistence means saving and reloading the index yourself
- Pinecone: fully managed cloud, production-grade, costs money
- Weaviate: Docker-based, hybrid search, more complex setup
ChromaDB wins for local work because it persists to disk automatically (FAISS needs an explicit save and load), requires no external server (unlike Weaviate by default), and costs nothing. When your project grows beyond a single machine, migrating to Pinecone is mostly a matter of swapping the vector store class, typically around 20 lines of code changes.
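According to ChromaDB's own benchmarks, local ChromaDB handles similarity search across 1 million 1536-dimensional vectors in under 100ms on modern hardware. Treat that as a rough guide and measure on your own data and machine, since latency depends on index settings, filters, and disk speed.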
Setup
pip install langchain langchain-openai langchain-community chromadb
import os
os.environ["OPENAI_API_KEY"] = "your-key-here"
Step 1: Basic ChromaDB Setup
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document
# Initialize embeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
# Create documents
documents = [
Document(
page_content="LangChain is a framework for building LLM applications.",
metadata={"source": "overview.txt", "category": "framework", "year": 2023}
),
Document(
page_content="ChromaDB is an open-source vector database for AI applications.",
metadata={"source": "chroma_docs.txt", "category": "database", "year": 2023}
),
Document(
page_content="RAG combines retrieval with generation for accurate LLM responses.",
metadata={"source": "rag_guide.txt", "category": "technique", "year": 2023}
),
Document(
page_content="OpenAI text-embedding-3-small produces 1536-dimensional vectors.",
metadata={"source": "openai_docs.txt", "category": "embedding", "year": 2024}
),
Document(
page_content="Vector similarity search finds documents closest to a query embedding.",
metadata={"source": "search_guide.txt", "category": "technique", "year": 2024}
),
]
# Create vector store with persistence
vectorstore = Chroma.from_documents(
documents=documents,
embedding=embeddings,
persist_directory="./chroma_db", # saves to disk
collection_name="langchain_docs", # named collection
)
print(f"Created collection with {vectorstore._collection.count()} documents")
Step 2: Loading a Persisted Collection
After the first run, load the existing collection instead of rebuilding it:
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
# Load existing collection from disk
vectorstore = Chroma(
collection_name="langchain_docs",
embedding_function=embeddings,
persist_directory="./chroma_db",
)
doc_count = vectorstore._collection.count()
print(f"Loaded collection with {doc_count} documents")
In ChromaDB 0.4+, persistence is automatic — data is saved after every write operation. You no longer need to call vectorstore.persist() explicitly.
Step 3: Adding Documents Incrementally
You can add documents to an existing collection without rebuilding it from scratch:
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document
import hashlib
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma(
collection_name="langchain_docs",
embedding_function=embeddings,
persist_directory="./chroma_db",
)
# New documents to add
new_docs = [
Document(
page_content="LCEL (LangChain Expression Language) uses pipe operators to compose chains.",
metadata={"source": "lcel_guide.txt", "category": "framework", "year": 2024}
),
Document(
page_content="LangChain agents can use tools like web search and code execution.",
metadata={"source": "agents_guide.txt", "category": "agents", "year": 2024}
),
]
# Generate stable IDs based on content hash to prevent duplicates
def content_hash(doc: Document) -> str:
    content = doc.page_content + str(doc.metadata.get("source", ""))
    return hashlib.md5(content.encode()).hexdigest()
ids = [content_hash(doc) for doc in new_docs]
# Add with explicit IDs: LangChain's Chroma wrapper upserts, so re-adding an
# existing ID overwrites that entry instead of creating a duplicate
vectorstore.add_documents(documents=new_docs, ids=ids)
print(f"Collection now has {vectorstore._collection.count()} documents")
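To see why content-derived IDs prevent duplicates, here is a minimal standard-library sketch. It does not call Chroma: the `store` dict and the `upsert` helper are hypothetical stand-ins for a collection with insert-or-overwrite behavior, and this `content_hash` takes plain strings rather than a `Document`.

```python
import hashlib

def content_hash(page_content: str, source: str) -> str:
    """Stable ID derived from the text plus its source."""
    return hashlib.md5((page_content + source).encode()).hexdigest()

# Hypothetical stand-in for a collection: a dict keyed by document ID.
store: dict[str, str] = {}

def upsert(page_content: str, source: str) -> str:
    """Insert or overwrite the entry for this ID; never adds a second copy."""
    doc_id = content_hash(page_content, source)
    store[doc_id] = page_content
    return doc_id

first = upsert("LCEL uses pipe operators to compose chains.", "lcel_guide.txt")
second = upsert("LCEL uses pipe operators to compose chains.", "lcel_guide.txt")
third = upsert("LCEL uses pipe operators to compose chains!", "lcel_guide.txt")

print(first == second)  # True: same content and source give the same ID
print(len(store))       # 2: the edited text hashes to a new ID
```

Because the ID depends on the content, an edited document gets a new ID and is stored alongside the old one. If edits should replace the previous version, derive the ID from something stable such as the source path alone.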
Step 4: Similarity Search
Basic similarity search and its variants:
# Standard similarity search — returns k most similar documents
results = vectorstore.similarity_search(
query="How do I build LLM applications?",
k=3,
)
for doc in results:
    print(f"Source: {doc.metadata['source']}")
    print(f"Content: {doc.page_content[:100]}")
    print("---")
# Search with relevance scores: intended to fall in the 0-1 range, higher = more similar
# (use similarity_search_with_score for raw distances, where lower = more similar)
results_with_scores = vectorstore.similarity_search_with_relevance_scores(
    query="What is RAG?",
    k=3,
)
for doc, score in results_with_scores:
    print(f"Score: {score:.4f} | {doc.page_content[:80]}")
# Maximal Marginal Relevance (MMR) — balances relevance and diversity
# Useful when you want varied results instead of near-duplicates
mmr_results = vectorstore.max_marginal_relevance_search(
query="LangChain framework components",
k=4,
fetch_k=20, # fetch 20 candidates, return 4 most diverse
lambda_mult=0.5, # 0=max diversity, 1=max relevance
)
print("\nMMR results (diverse):")
for doc in mmr_results:
    print(f" {doc.page_content[:80]}")
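The effect of `lambda_mult` is easier to see on a toy example. The sketch below is a plain-Python illustration of greedy MMR selection, not LangChain's implementation; the 2-dimensional vectors and the `cosine` and `mmr` helpers are made up for the demonstration. The candidate list plays the role of the `fetch_k` pool.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mmr(query, candidates, k, lambda_mult):
    """Greedily pick k candidate indices, trading relevance against redundancy."""
    selected: list[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i: int) -> float:
            relevance = cosine(query, candidates[i])
            # Similarity to the closest already-selected item (0 if none yet).
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

query = [1.0, 0.0]
candidates = [
    [1.0, 0.1],   # 0: very close to the query
    [1.0, 0.12],  # 1: near-duplicate of candidate 0
    [0.6, 0.8],   # 2: less relevant, but different
]

print(mmr(query, candidates, k=2, lambda_mult=1.0))  # [0, 1]
print(mmr(query, candidates, k=2, lambda_mult=0.3))  # [0, 2]
```

With `lambda_mult=1.0` the selection is pure relevance and returns both near-duplicates. Lowering it penalizes similarity to what is already selected, so the second slot goes to the different document.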
Step 5: Metadata Filtering
This is where ChromaDB becomes much more powerful than a basic similarity search:
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma(
collection_name="langchain_docs",
embedding_function=embeddings,
persist_directory="./chroma_db",
)
# Filter by exact metadata value
framework_docs = vectorstore.similarity_search(
query="building applications",
k=5,
filter={"category": "framework"}, # only return docs where category == "framework"
)
print("Framework docs only:")
for doc in framework_docs:
    print(f" {doc.metadata['source']}: {doc.page_content[:60]}")
# Filter with $in operator (multiple valid values)
tech_docs = vectorstore.similarity_search(
query="LLM tools",
k=5,
filter={"category": {"$in": ["framework", "agents", "technique"]}},
)
# Filter with comparison operators
recent_docs = vectorstore.similarity_search(
query="vector search",
k=5,
filter={"year": {"$gte": 2024}}, # only 2024 and newer
)
# Complex compound filter using $and
specific_docs = vectorstore.similarity_search(
query="embeddings",
k=5,
filter={
"$and": [
{"year": {"$gte": 2024}},
{"category": {"$ne": "database"}},
]
},
)
print("\nAvailable filter operators:")
operators = ["$eq", "$ne", "$gt", "$gte", "$lt", "$lte", "$in", "$nin", "$and", "$or"]
for op in operators:
    print(f" {op}")
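If the operator semantics are unclear, it can help to evaluate a filter by hand. The following is a small pure-Python sketch of how such a filter could be matched against a metadata dict. The `matches` function and `COMPARATORS` table are hypothetical helpers written for this illustration, not part of ChromaDB, and the rule that a missing key never matches is a simplification of this sketch.

```python
from typing import Any

# Comparison operators, mirroring the names used in the filters above.
COMPARATORS = {
    "$eq": lambda v, x: v == x,
    "$ne": lambda v, x: v != x,
    "$gt": lambda v, x: v > x,
    "$gte": lambda v, x: v >= x,
    "$lt": lambda v, x: v < x,
    "$lte": lambda v, x: v <= x,
    "$in": lambda v, x: v in x,
    "$nin": lambda v, x: v not in x,
}

def matches(metadata: dict[str, Any], where: dict[str, Any]) -> bool:
    """Return True if the metadata satisfies every condition in the filter."""
    for key, condition in where.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        elif isinstance(condition, dict):
            value = metadata.get(key)
            for op, operand in condition.items():
                # Simplification: a missing key fails every comparison.
                if value is None or not COMPARATORS[op](value, operand):
                    return False
        elif metadata.get(key) != condition:
            # A bare value is shorthand for equality.
            return False
    return True

docs = [
    {"source": "overview.txt", "category": "framework", "year": 2023},
    {"source": "chroma_docs.txt", "category": "database", "year": 2023},
    {"source": "rag_guide.txt", "category": "technique", "year": 2023},
    {"source": "openai_docs.txt", "category": "embedding", "year": 2024},
    {"source": "search_guide.txt", "category": "technique", "year": 2024},
]

where = {"$and": [{"year": {"$gte": 2024}}, {"category": {"$ne": "database"}}]}
print([d["source"] for d in docs if matches(d, where)])
# ['openai_docs.txt', 'search_guide.txt']
```

Running a filter through a helper like this against a handful of metadata dicts is a cheap way to check that a compound `$and` or `$or` expression selects what you expect before using it in a query.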
Step 6: Updating and Deleting Documents
import hashlib
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma(
collection_name="langchain_docs",
embedding_function=embeddings,
persist_directory="./chroma_db",
)
# Access the underlying ChromaDB collection
collection = vectorstore._collection
# List all document IDs
all_docs = collection.get()
doc_ids = all_docs["ids"]
print(f"Total documents: {len(doc_ids)}")
print(f"Sample IDs: {doc_ids[:3]}")
# Delete a specific document by ID
if doc_ids:
    collection.delete(ids=[doc_ids[0]])
    print(f"Deleted document {doc_ids[0]}")
print(f"Remaining: {collection.count()} documents")
# Delete documents matching a filter
collection.delete(
where={"category": "embedding"} # delete all embedding category docs
)
print(f"After category deletion: {collection.count()} documents")
# Update document content (delete + re-add)
doc_to_update = Document(
page_content="LangChain LCEL is the recommended way to compose chains in 2026.",
metadata={"source": "lcel_guide.txt", "category": "framework", "year": 2026}
)
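# Use a deterministic ID based on the source. This only replaces an earlier
# entry that was stored under the same source-based ID; the content-hash IDs
# from Step 3 will not match it.
update_id = hashlib.md5("lcel_guide.txt".encode()).hexdigest()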
collection.delete(ids=[update_id])
vectorstore.add_documents(documents=[doc_to_update], ids=[update_id])
print("Document updated successfully")
Step 7: Collection Management
Working with multiple collections for different use cases:
import chromadb
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
# Access the ChromaDB client directly
client = chromadb.PersistentClient(path="./chroma_db")
# List all collections
collections = client.list_collections()
print("Existing collections:")
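for col in collections:
    # Assumes list_collections() returns Collection objects. Some chromadb
    # releases (0.6.x) return plain names instead; in that case use
    # client.get_collection(col).count()
    print(f" {col.name}: {col.count()} documents")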
# Create separate collections for different data sources
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
technical_store = Chroma(
collection_name="technical_docs",
embedding_function=embeddings,
persist_directory="./chroma_db",
)
product_store = Chroma(
collection_name="product_catalog",
embedding_function=embeddings,
persist_directory="./chroma_db",
)
faq_store = Chroma(
collection_name="faq",
embedding_function=embeddings,
persist_directory="./chroma_db",
)
# Delete a collection entirely (raises an error if no collection has this name)
client.delete_collection("old_collection")
# Get collection stats
for store, name in [(technical_store, "technical"), (product_store, "product")]:
    count = store._collection.count()
    print(f"{name}: {count} documents")
Step 8: Full RAG Pipeline with ChromaDB
Putting it all together into a production-ready RAG chain:
from langchain_community.vectorstores import Chroma
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_core.documents import Document
import os
os.environ["OPENAI_API_KEY"] = "your-key-here"
# Configuration
PERSIST_DIR = "./chroma_db"
COLLECTION_NAME = "knowledge_base"
EMBED_MODEL = "text-embedding-3-small"
CHAT_MODEL = "gpt-4o-mini"
CHUNK_SIZE = 512
CHUNK_OVERLAP = 64
def build_knowledge_base(documents: list[Document]) -> Chroma:
    """Index documents into ChromaDB with chunking."""
    embeddings = OpenAIEmbeddings(model=EMBED_MODEL)
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=CHUNK_SIZE,
        chunk_overlap=CHUNK_OVERLAP,
    )
    chunks = splitter.transform_documents(documents)
    print(f"Chunked {len(documents)} documents into {len(chunks)} chunks")
    vectorstore = Chroma.from_documents(
        documents=chunks,
        embedding=embeddings,
        persist_directory=PERSIST_DIR,
        collection_name=COLLECTION_NAME,
    )
    print(f"Indexed {vectorstore._collection.count()} chunks")
    return vectorstore
def build_rag_chain(vectorstore: Chroma, k: int = 4):
    """Build a complete RAG chain on top of ChromaDB."""
    retriever = vectorstore.as_retriever(
        search_type="mmr",  # use MMR for diversity
        search_kwargs={
            "k": k,
            "fetch_k": k * 5,
            "lambda_mult": 0.6,
        },
    )
    llm = ChatOpenAI(model=CHAT_MODEL, temperature=0)
    prompt = ChatPromptTemplate.from_messages([
        ("system", """You are a helpful assistant. Answer based only on the provided context.
If the context does not contain enough information, clearly state that.
Context:
{context}"""),
        ("human", "{question}"),
    ])
    def format_docs_with_sources(docs: list[Document]) -> str:
        formatted = []
        for doc in docs:
            source = doc.metadata.get("source", "Unknown")
            formatted.append(f"[Source: {source}]\n{doc.page_content}")
        return "\n\n---\n\n".join(formatted)
    chain = (
        {
            "context": retriever | format_docs_with_sources,
            "question": RunnablePassthrough(),
        }
        | prompt
        | llm
        | StrOutputParser()
    )
    return chain, retriever
# Build the pipeline
sample_docs = [
Document(
page_content="ChromaDB is an AI-native open-source embedding database. "
"It enables you to store embeddings and query them with near-neighbor search. "
"ChromaDB supports persistent storage, metadata filtering, and multiple collections.",
metadata={"source": "chromadb_overview.txt", "category": "database"}
),
Document(
page_content="LangChain's Chroma integration provides a VectorStore interface. "
"You can use similarity_search(), max_marginal_relevance_search(), "
"and as_retriever() for various retrieval strategies.",
metadata={"source": "langchain_chroma.txt", "category": "integration"}
),
Document(
page_content="For production deployments, ChromaDB supports a client-server mode "
"where the database runs as a separate process. "
"Use chromadb.HttpClient() to connect to a running ChromaDB server.",
metadata={"source": "chromadb_deployment.txt", "category": "deployment"}
),
]
vectorstore = build_knowledge_base(sample_docs)
rag_chain, retriever = build_rag_chain(vectorstore, k=3)
# Query the pipeline
questions = [
"What retrieval strategies does LangChain's Chroma integration support?",
"How do I deploy ChromaDB in production?",
"What is ChromaDB and what does it store?",
]
for question in questions:
    print(f"\nQ: {question}")
    answer = rag_chain.invoke(question)
    print(f"A: {answer}")
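The `CHUNK_SIZE` and `CHUNK_OVERLAP` settings in the pipeline above are easier to reason about with the arithmetic spelled out. This is a simplified character-window sketch, assuming fixed-size slices; `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs and spaces, so its real chunks will differ. The `chunk_text` helper is hypothetical.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks

# 1000 characters of repeating digits, so overlaps can be checked exactly.
text = "".join(str(i % 10) for i in range(1000))
chunks = chunk_text(text, chunk_size=512, chunk_overlap=64)

print([len(c) for c in chunks])            # [512, 512, 104]
print(chunks[0][-64:] == chunks[1][:64])   # True: neighbors share 64 characters
```

Each window advances by `chunk_size - chunk_overlap` characters, so a larger overlap means more chunks and more embedding calls for the same text. The overlap exists so that a sentence cut at a boundary still appears whole in at least one chunk.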
Step 9: ChromaDB with Custom Embeddings
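You are not limited to OpenAI; any LangChain-compatible embedding model works. This example needs the langchain-huggingface and sentence-transformers packages, which the Setup command above does not install, and it reuses sample_docs from Step 8: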
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document
# Free local embeddings — no API cost
hf_embeddings = HuggingFaceEmbeddings(
model_name="sentence-transformers/all-MiniLM-L6-v2",
model_kwargs={"device": "cpu"},
encode_kwargs={"normalize_embeddings": True},
)
vectorstore = Chroma.from_documents(
documents=sample_docs,
embedding=hf_embeddings,
persist_directory="./chroma_db_local",
collection_name="local_embeddings",
)
# Works exactly the same way
results = vectorstore.similarity_search("vector database deployment", k=2)
for doc in results:
    print(f"{doc.metadata['source']}: {doc.page_content[:80]}")
Comparison Table: ChromaDB vs Other Vector Stores in LangChain
| Feature | ChromaDB | FAISS | Pinecone | Weaviate | Qdrant |
|---|---|---|---|---|---|
| Local/free | Yes | Yes | No | Partial | Partial |
| Persistence | Yes | Manual (save/load) | Managed | Yes | Yes |
| Metadata filtering | Yes | No | Yes | Yes | Yes |
| Hybrid search | No | No | Yes | Yes | Yes |
| Scale (vectors) | ~1M local | Limited by RAM | Unlimited | Unlimited | Unlimited |
| Setup complexity | Very easy | Easy | Easy | Medium | Medium |
| LangChain support | Excellent | Good | Excellent | Good | Good |
Working with the Retriever Interface
The LangChain retriever interface abstracts over the underlying vector store:
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma(
collection_name="langchain_docs",
embedding_function=embeddings,
persist_directory="./chroma_db",
)
# Basic retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
# Retriever with metadata filter: only returns docs from a specific category
filtered_retriever = vectorstore.as_retriever(
search_type="similarity",
search_kwargs={
"k": 4,
"filter": {"category": "framework"},
},
)
# MMR retriever for diverse results
mmr_retriever = vectorstore.as_retriever(
search_type="mmr",
search_kwargs={
"k": 4,
"fetch_k": 20,
"lambda_mult": 0.5,
},
)
# Use the retriever in any LangChain chain
docs = retriever.invoke("How do I build a RAG pipeline?")
print(f"Retrieved {len(docs)} documents")
Performance Tips
Use MMR when your knowledge base has redundant content. If you have scraped multiple articles on the same topic, standard similarity search returns near-duplicates. MMR ensures the retrieved documents cover different aspects.
Add relevant metadata at ingestion time. The more metadata you have, the more precisely you can filter. At minimum, store the source file path, document category, and date. This turns broad semantic search into targeted retrieval.
Filter by metadata when you know a query's scope. If a query only applies to a specific document category, pass a metadata filter with it. Chroma applies metadata filters before the similarity scoring, so out-of-scope documents never take one of the k result slots, which improves precision and usually speed.
Use text-embedding-3-small over ada-002. It is cheaper, faster, and performs similarly for most RAG use cases. Switch to text-embedding-3-large only if you have measured a quality gap on your specific documents.
For a complete production RAG system that uses these ChromaDB patterns, see RAG system tutorial. For vector database comparisons beyond ChromaDB, see vector database guide. For building the agent layer on top of your ChromaDB retriever, Build AI agent with LangChain shows the full stack.
Frequently Asked Questions
Does ChromaDB persist data between Python sessions?
Yes. When you initialize Chroma with persist_directory='./chroma_db', all embeddings and metadata are saved to disk automatically. In ChromaDB 0.4 and later, persistence is automatic — data is written after every operation. You no longer need to call vectorstore.persist() explicitly. Restart your Python session, create a new Chroma object pointing to the same directory, and all your documents will be there.
How do I filter ChromaDB results by metadata?
Pass a filter parameter to similarity_search(). For a simple equality filter: vectorstore.similarity_search(query, filter={'category': 'engineering'}). For complex conditions, use ChromaDB's query operators: $eq, $ne, $gt, $gte, $lt, $lte, $in, $nin, combined with $and and $or for compound conditions. Metadata filters are applied before similarity scoring, so they improve both precision and speed.
Can ChromaDB handle millions of vectors?
ChromaDB is optimized for hundreds of thousands to low millions of vectors in local mode using its HNSW index. For tens of millions of vectors or multi-server deployments, ChromaDB supports a client-server mode where the database runs as a separate process. For truly large-scale production workloads — hundreds of millions of vectors across multiple regions — consider Pinecone, Weaviate, or Qdrant, which are purpose-built for horizontal scaling.
AiTechWorlds Team