How to Use LangChain with ChromaDB (Local Vector Store)

⚡ Quick Answer

Use LangChain with ChromaDB for persistent local embeddings — setup, metadata filtering, similarity search, and collection management with full Python code.

AiTechWorlds Team May 31, 2026 11 min read

#LangChain #ChromaDB #vector store #embeddings #RAG

When you are building a RAG system locally, ChromaDB is the fastest path from documents to searchable embeddings. No API keys for the vector store, no cloud configuration, no billing — just install the package and start indexing.

ChromaDB integrates tightly with LangChain, which means you can swap it for Pinecone or Weaviate later with minimal code changes. But for development, prototyping, and small-to-medium production deployments, ChromaDB with persistent storage is excellent.

This guide covers everything: creating collections, adding documents with metadata, performing filtered similarity search, updating and deleting documents, and building a complete RAG pipeline on top of ChromaDB.

Why ChromaDB for Local Development

The standard vector database comparison for RAG development:

ChromaDB: local or server mode, zero configuration, excellent LangChain integration
FAISS: in-memory index with no automatic persistence (you save and reload the index yourself), extremely fast, used in research
Pinecone: fully managed cloud, production-grade, costs money
Weaviate: Docker-based, hybrid search, more complex setup

ChromaDB wins for local work because it persists to disk automatically (FAISS needs a manual save and reload), requires no external server (unlike Weaviate by default), and costs nothing. When your project grows beyond a single machine, migrating to Pinecone means changing roughly 20 lines of code.

According to ChromaDB's own benchmarks, local ChromaDB handles similarity search across 1 million 1536-dimensional vectors in under 100ms on modern hardware.
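
To put the 1 million by 1536 figure in perspective, the raw vectors alone take several gigabytes. The following is a back-of-the-envelope estimate, assuming each dimension is stored as a 4-byte float32 (an assumption made here, not a figure from ChromaDB); the helper name raw_vector_bytes is hypothetical, and index and metadata overhead are not counted.

```python
# Rough memory estimate for raw embedding storage.
# Assumption: each dimension is a 4-byte float32; HNSW graph links
# and metadata add overhead that is NOT included here.
def raw_vector_bytes(num_vectors: int, dims: int, bytes_per_value: int = 4) -> int:
    return num_vectors * dims * bytes_per_value


total = raw_vector_bytes(1_000_000, 1536)
print(f"{total / 1e9:.2f} GB")     # 6.14 GB
print(f"{total / 2**30:.2f} GiB")  # 5.72 GiB
```

In practice this means a collection of that size needs a machine with comfortably more RAM than the raw estimate.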

Setup

pip install langchain langchain-openai langchain-community chromadb

import os
os.environ["OPENAI_API_KEY"] = "your-key-here"

Step 1: Basic ChromaDB Setup

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document

# Initialize embeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Create documents
documents = [
    Document(
        page_content="LangChain is a framework for building LLM applications.",
        metadata={"source": "overview.txt", "category": "framework", "year": 2023}
    ),
    Document(
        page_content="ChromaDB is an open-source vector database for AI applications.",
        metadata={"source": "chroma_docs.txt", "category": "database", "year": 2023}
    ),
    Document(
        page_content="RAG combines retrieval with generation for accurate LLM responses.",
        metadata={"source": "rag_guide.txt", "category": "technique", "year": 2023}
    ),
    Document(
        page_content="OpenAI text-embedding-3-small produces 1536-dimensional vectors.",
        metadata={"source": "openai_docs.txt", "category": "embedding", "year": 2024}
    ),
    Document(
        page_content="Vector similarity search finds documents closest to a query embedding.",
        metadata={"source": "search_guide.txt", "category": "technique", "year": 2024}
    ),
]

# Create vector store with persistence
vectorstore = Chroma.from_documents(
    documents=documents,
    embedding=embeddings,
    persist_directory="./chroma_db",       # saves to disk
    collection_name="langchain_docs",      # named collection
)

print(f"Created collection with {vectorstore._collection.count()} documents")

Step 2: Loading a Persisted Collection

After the first run, load the existing collection instead of rebuilding it:

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Load existing collection from disk
vectorstore = Chroma(
    collection_name="langchain_docs",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

doc_count = vectorstore._collection.count()
print(f"Loaded collection with {doc_count} documents")

In ChromaDB 0.4+, persistence is automatic — data is saved after every write operation. You no longer need to call vectorstore.persist() explicitly.

Step 3: Adding Documents Incrementally

You can add documents to an existing collection without rebuilding it from scratch:

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document
import hashlib

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vectorstore = Chroma(
    collection_name="langchain_docs",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

# New documents to add
new_docs = [
    Document(
        page_content="LCEL (LangChain Expression Language) uses pipe operators to compose chains.",
        metadata={"source": "lcel_guide.txt", "category": "framework", "year": 2024}
    ),
    Document(
        page_content="LangChain agents can use tools like web search and code execution.",
        metadata={"source": "agents_guide.txt", "category": "agents", "year": 2024}
    ),
]

# Generate stable IDs based on content hash to prevent duplicates
def content_hash(doc: Document) -> str:
    content = doc.page_content + str(doc.metadata.get("source", ""))
    return hashlib.md5(content.encode()).hexdigest()

ids = [content_hash(doc) for doc in new_docs]

# Add with explicit IDs. LangChain's Chroma wrapper upserts, so re-adding an
# existing ID overwrites that entry instead of creating a duplicate
vectorstore.add_documents(documents=new_docs, ids=ids)

print(f"Collection now has {vectorstore._collection.count()} documents")
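
The effect of content-based IDs can be seen without a database. The sketch below uses a plain dict as a hypothetical stand-in for a collection keyed by ID (the names store and upsert are illustrative, not part of LangChain or ChromaDB). It shows that re-ingesting identical content maps to the same ID, while edited content gets a new one.

```python
import hashlib


def content_hash(text: str, source: str) -> str:
    """Stable ID: identical text + source always produce the same hex digest."""
    return hashlib.md5((text + source).encode()).hexdigest()


# Hypothetical in-memory stand-in for a collection keyed by document ID.
store: dict[str, str] = {}


def upsert(text: str, source: str) -> str:
    doc_id = content_hash(text, source)
    store[doc_id] = text  # same ID overwrites; it never adds a second copy
    return doc_id


first = upsert("LCEL uses pipe operators.", "lcel_guide.txt")
again = upsert("LCEL uses pipe operators.", "lcel_guide.txt")   # re-ingest, same ID
edited = upsert("LCEL uses pipe operators!", "lcel_guide.txt")  # changed text, new ID

print(first == again)   # True
print(first == edited)  # False
print(len(store))       # 2
```

One consequence of hashing the content: when a document's text changes, the old entry stays in the store under its old ID until it is deleted explicitly.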

Step 4: Similarity Search

Basic similarity search and its variants:

# Standard similarity search — returns k most similar documents
results = vectorstore.similarity_search(
    query="How do I build LLM applications?",
    k=3,
)

for doc in results:
    print(f"Source: {doc.metadata['source']}")
    print(f"Content: {doc.page_content[:100]}")
    print("---")

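# Search with relevance scores: normalized to [0, 1], higher = more similar
# (similarity_search_with_score returns raw distances instead, where lower = more similar)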
results_with_scores = vectorstore.similarity_search_with_relevance_scores(
    query="What is RAG?",
    k=3,
)

for doc, score in results_with_scores:
    print(f"Score: {score:.4f} | {doc.page_content[:80]}")

# Maximal Marginal Relevance (MMR) — balances relevance and diversity
# Useful when you want varied results instead of near-duplicates
mmr_results = vectorstore.max_marginal_relevance_search(
    query="LangChain framework components",
    k=4,
    fetch_k=20,    # fetch 20 candidates, return 4 most diverse
    lambda_mult=0.5,  # 0=max diversity, 1=max relevance
)

print("\nMMR results (diverse):")
for doc in mmr_results:
    print(f"  {doc.page_content[:80]}")
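
To make the lambda_mult trade-off concrete, here is a minimal pure-Python sketch of greedy MMR selection over toy 2-D vectors. It illustrates the general technique under simple assumptions (cosine similarity, hand-picked vectors) and is not LangChain's implementation; the function names are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr(query: list[float], candidates: list[list[float]], k: int, lambda_mult: float = 0.5) -> list[int]:
    """Greedily pick k candidate indices, trading relevance against redundancy."""
    selected: list[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i: int) -> float:
            relevance = cosine(query, candidates[i])
            # Redundancy = similarity to the closest already-selected candidate
            redundancy = max((cosine(candidates[i], candidates[j]) for j in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 1.0]
candidates = [
    [1.0, 0.2],   # 0: most relevant
    [1.0, 0.15],  # 1: near-duplicate of 0
    [0.1, 1.0],   # 2: slightly less relevant, but very different from 0
]

print(mmr(query, candidates, k=2, lambda_mult=0.5))  # [0, 2]
print(mmr(query, candidates, k=2, lambda_mult=1.0))  # [0, 1]
```

With lambda_mult=1.0 the redundancy term vanishes and the near-duplicate is returned; at 0.5 the penalty pushes the selection toward the dissimilar candidate.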

Step 5: Metadata Filtering

This is where ChromaDB becomes much more powerful than a basic similarity search:

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma(
    collection_name="langchain_docs",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

# Filter by exact metadata value
framework_docs = vectorstore.similarity_search(
    query="building applications",
    k=5,
    filter={"category": "framework"},  # only return docs where category == "framework"
)

print("Framework docs only:")
for doc in framework_docs:
    print(f"  {doc.metadata['source']}: {doc.page_content[:60]}")

# Filter with $in operator (multiple valid values)
tech_docs = vectorstore.similarity_search(
    query="LLM tools",
    k=5,
    filter={"category": {"$in": ["framework", "agents", "technique"]}},
)

# Filter with comparison operators
recent_docs = vectorstore.similarity_search(
    query="vector search",
    k=5,
    filter={"year": {"$gte": 2024}},  # only 2024 and newer
)

# Complex compound filter using $and
specific_docs = vectorstore.similarity_search(
    query="embeddings",
    k=5,
    filter={
        "$and": [
            {"year": {"$gte": 2024}},
            {"category": {"$ne": "database"}},
        ]
    },
)

print("\nAvailable filter operators:")
operators = ["$eq", "$ne", "$gt", "$gte", "$lt", "$lte", "$in", "$nin", "$and", "$or"]
for op in operators:
    print(f"  {op}")
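
The operator semantics can be checked offline with a small evaluator. This is an illustrative sketch of how such filters read, applied to the Step 1 metadata; it is not ChromaDB's implementation, and treating a missing key as a non-match is an assumption of this sketch. The names OPS, matches, and select are hypothetical.

```python
from typing import Any

# Comparison operators expressed as plain Python predicates.
OPS = {
    "$eq": lambda value, target: value == target,
    "$ne": lambda value, target: value != target,
    "$gt": lambda value, target: value > target,
    "$gte": lambda value, target: value >= target,
    "$lt": lambda value, target: value < target,
    "$lte": lambda value, target: value <= target,
    "$in": lambda value, target: value in target,
    "$nin": lambda value, target: value not in target,
}


def matches(metadata: dict[str, Any], where: dict[str, Any]) -> bool:
    """Return True if one metadata dict satisfies the filter (illustration only)."""
    for key, cond in where.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in cond):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in cond):
                return False
        else:
            if not isinstance(cond, dict):
                cond = {"$eq": cond}  # {"k": v} is shorthand for {"k": {"$eq": v}}
            if key not in metadata:   # assumption: missing keys never match
                return False
            if not all(OPS[op](metadata[key], target) for op, target in cond.items()):
                return False
    return True


docs = [
    {"source": "overview.txt", "category": "framework", "year": 2023},
    {"source": "chroma_docs.txt", "category": "database", "year": 2023},
    {"source": "rag_guide.txt", "category": "technique", "year": 2023},
    {"source": "openai_docs.txt", "category": "embedding", "year": 2024},
    {"source": "search_guide.txt", "category": "technique", "year": 2024},
]


def select(where: dict[str, Any]) -> list[str]:
    return [d["source"] for d in docs if matches(d, where)]


print(select({"category": "framework"}))
# ['overview.txt']
print(select({"category": {"$in": ["framework", "agents", "technique"]}}))
# ['overview.txt', 'rag_guide.txt', 'search_guide.txt']
print(select({"$and": [{"year": {"$gte": 2024}}, {"category": {"$ne": "database"}}]}))
# ['openai_docs.txt', 'search_guide.txt']
```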

Step 6: Updating and Deleting Documents

import hashlib

from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma(
    collection_name="langchain_docs",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

# Access the underlying ChromaDB collection
collection = vectorstore._collection

# List all document IDs
all_docs = collection.get()
doc_ids = all_docs["ids"]
print(f"Total documents: {len(doc_ids)}")
print(f"Sample IDs: {doc_ids[:3]}")

# Delete a specific document by ID
if doc_ids:
    collection.delete(ids=[doc_ids[0]])
    print(f"Deleted document {doc_ids[0]}")
    print(f"Remaining: {collection.count()} documents")

# Delete documents matching a filter
collection.delete(
    where={"category": "embedding"}  # delete all embedding category docs
)
print(f"After category deletion: {collection.count()} documents")

# Update document content (delete + re-add)
doc_to_update = Document(
    page_content="LangChain LCEL is the recommended way to compose chains in 2026.",
    metadata={"source": "lcel_guide.txt", "category": "framework", "year": 2026}
)

# Remove the old version by source. Step 3 derived its IDs from content + source,
# so an ID derived from the source alone would not match the stored entry.
collection.delete(where={"source": "lcel_guide.txt"})

# Store the new version under a deterministic, source-based ID
update_id = hashlib.md5("lcel_guide.txt".encode()).hexdigest()
vectorstore.add_documents(documents=[doc_to_update], ids=[update_id])
print("Document updated successfully")

Step 7: Collection Management

Working with multiple collections for different use cases:

import chromadb
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

# Access the ChromaDB client directly
client = chromadb.PersistentClient(path="./chroma_db")

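# List all collections. Depending on the chromadb version, list_collections()
# returns Collection objects or plain names, so handle both.
collection_names = [c if isinstance(c, str) else c.name for c in client.list_collections()]
print("Existing collections:")
for name in collection_names:
    print(f"  {name}: {client.get_collection(name).count()} documents")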

# Create separate collections for different data sources
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

technical_store = Chroma(
    collection_name="technical_docs",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

product_store = Chroma(
    collection_name="product_catalog",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

faq_store = Chroma(
    collection_name="faq",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

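# Delete a collection entirely (deleting a name that does not exist raises an error)
existing = [c if isinstance(c, str) else c.name for c in client.list_collections()]
if "old_collection" in existing:
    client.delete_collection("old_collection")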

# Get collection stats
for store, name in [(technical_store, "technical"), (product_store, "product")]:
    count = store._collection.count()
    print(f"{name}: {count} documents")

Step 8: Full RAG Pipeline with ChromaDB

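Putting it all together into an end-to-end RAG chain. Note that Chroma.from_documents() adds to the collection on every call, so rerunning this script against the same persist directory indexes the sample documents again; pass stable IDs as in Step 3, or load the existing collection as in Step 2, to avoid that: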

from langchain_community.vectorstores import Chroma
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.documents import Document
import os

os.environ["OPENAI_API_KEY"] = "your-key-here"

# Configuration
PERSIST_DIR = "./chroma_db"
COLLECTION_NAME = "knowledge_base"
EMBED_MODEL = "text-embedding-3-small"
CHAT_MODEL = "gpt-4o-mini"
CHUNK_SIZE = 512
CHUNK_OVERLAP = 64


def build_knowledge_base(documents: list[Document]) -> Chroma:
    """Index documents into ChromaDB with chunking."""
    embeddings = OpenAIEmbeddings(model=EMBED_MODEL)

    splitter = RecursiveCharacterTextSplitter(
        chunk_size=CHUNK_SIZE,
        chunk_overlap=CHUNK_OVERLAP,
    )
    chunks = splitter.transform_documents(documents)
    print(f"Chunked {len(documents)} documents into {len(chunks)} chunks")

    vectorstore = Chroma.from_documents(
        documents=chunks,
        embedding=embeddings,
        persist_directory=PERSIST_DIR,
        collection_name=COLLECTION_NAME,
    )
    print(f"Indexed {vectorstore._collection.count()} chunks")
    return vectorstore


def build_rag_chain(vectorstore: Chroma, k: int = 4):
    """Build a complete RAG chain on top of ChromaDB."""
    retriever = vectorstore.as_retriever(
        search_type="mmr",         # use MMR for diversity
        search_kwargs={
            "k": k,
            "fetch_k": k * 5,
            "lambda_mult": 0.6,
        },
    )

    llm = ChatOpenAI(model=CHAT_MODEL, temperature=0)

    prompt = ChatPromptTemplate.from_messages([
        ("system", """You are a helpful assistant. Answer based only on the provided context.
If the context does not contain enough information, clearly state that.

Context:
{context}"""),
        ("human", "{question}"),
    ])

    def format_docs_with_sources(docs: list[Document]) -> str:
        formatted = []
        for doc in docs:
            source = doc.metadata.get("source", "Unknown")
            formatted.append(f"[Source: {source}]\n{doc.page_content}")
        return "\n\n---\n\n".join(formatted)

    chain = (
        {
            "context": retriever | format_docs_with_sources,
            "question": RunnablePassthrough(),
        }
        | prompt
        | llm
        | StrOutputParser()
    )

    return chain, retriever


# Build the pipeline
sample_docs = [
    Document(
        page_content="ChromaDB is an AI-native open-source embedding database. "
                     "It enables you to store embeddings and query them with near-neighbor search. "
                     "ChromaDB supports persistent storage, metadata filtering, and multiple collections.",
        metadata={"source": "chromadb_overview.txt", "category": "database"}
    ),
    Document(
        page_content="LangChain's Chroma integration provides a VectorStore interface. "
                     "You can use similarity_search(), max_marginal_relevance_search(), "
                     "and as_retriever() for various retrieval strategies.",
        metadata={"source": "langchain_chroma.txt", "category": "integration"}
    ),
    Document(
        page_content="For production deployments, ChromaDB supports a client-server mode "
                     "where the database runs as a separate process. "
                     "Use chromadb.HttpClient() to connect to a running ChromaDB server.",
        metadata={"source": "chromadb_deployment.txt", "category": "deployment"}
    ),
]

vectorstore = build_knowledge_base(sample_docs)
rag_chain, retriever = build_rag_chain(vectorstore, k=3)

# Query the pipeline
questions = [
    "What retrieval strategies does LangChain's Chroma integration support?",
    "How do I deploy ChromaDB in production?",
    "What is ChromaDB and what does it store?",
]

for question in questions:
    print(f"\nQ: {question}")
    answer = rag_chain.invoke(question)
    print(f"A: {answer}")
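
The roles of CHUNK_SIZE and CHUNK_OVERLAP are easier to see on a toy string. The sketch below is a fixed-width sliding window, which is a deliberate simplification: RecursiveCharacterTextSplitter prefers to break on separators such as paragraphs and sentences, so its chunk boundaries will differ. The function name chunk_text is hypothetical.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-width windows that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


chunks = chunk_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
print(chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']

# With the settings used in this step, a 2000-character document yields 5 chunks
print(len(chunk_text("x" * 2000, chunk_size=512, chunk_overlap=64)))  # 5
```

Each chunk repeats the last three characters of the previous one, which is what keeps a sentence that straddles a boundary retrievable from either side.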

Step 9: ChromaDB with Custom Embeddings

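You are not limited to OpenAI; any LangChain-compatible embedding model works. This example needs two extra packages (pip install langchain-huggingface sentence-transformers) and reuses sample_docs from Step 8: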

from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document

# Free local embeddings — no API cost
hf_embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={"device": "cpu"},
    encode_kwargs={"normalize_embeddings": True},
)

vectorstore = Chroma.from_documents(
    documents=sample_docs,
    embedding=hf_embeddings,
    persist_directory="./chroma_db_local",
    collection_name="local_embeddings",
)

# Works exactly the same way
results = vectorstore.similarity_search("vector database deployment", k=2)
for doc in results:
    print(f"{doc.metadata['source']}: {doc.page_content[:80]}")
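
The normalize_embeddings setting above scales every vector to unit length. A small sketch of why that matters: for unit-length vectors, the dot product and cosine similarity are the same number. The vectors and helper names here are illustrative only.

```python
import math


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(dot(vec, vec))
    return [x / norm for x in vec]


def cosine(a: list[float], b: list[float]) -> float:
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a, b = [3.0, 4.0], [4.0, 3.0]

cos_raw = cosine(a, b)                         # cosine on the raw vectors
dot_unit = dot(normalize(a), normalize(b))     # plain dot product after normalizing

print(round(cos_raw, 6), round(dot_unit, 6))   # 0.96 0.96
print(math.isclose(cos_raw, dot_unit))         # True
```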

Comparison Table: ChromaDB vs Other Vector Stores in LangChain

Feature	ChromaDB	FAISS	Pinecone	Weaviate	Qdrant
Local/free	Yes	Yes	No	Partial	Partial
Persistence	Yes	Manual save/load	Managed	Yes	Yes
Metadata filtering	Yes	Post-filter only	Yes	Yes	Yes
Hybrid search	No	No	Yes	Yes	Yes
Scale (vectors)	~1M local	Unlimited (RAM)	Unlimited	Unlimited	Unlimited
Setup complexity	Very easy	Easy	Easy	Medium	Medium
LangChain support	Excellent	Good	Excellent	Good	Good

Working with the Retriever Interface

The LangChain retriever interface abstracts over the underlying vector store:

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma(
    collection_name="langchain_docs",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

# Basic retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# Retriever with metadata filter: only returns docs in a specific category
filtered_retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={
        "k": 4,
        "filter": {"category": "framework"},
    },
)

# MMR retriever for diverse results
mmr_retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={
        "k": 4,
        "fetch_k": 20,
        "lambda_mult": 0.5,
    },
)

# Use the retriever in any LangChain chain
docs = retriever.invoke("How do I build a RAG pipeline?")
print(f"Retrieved {len(docs)} documents")

Performance Tips

Use MMR when your knowledge base has redundant content. If you have scraped multiple articles on the same topic, standard similarity search returns near-duplicates. MMR ensures the retrieved documents cover different aspects.

Add relevant metadata at ingestion time. The more metadata you have, the more precisely you can filter. At minimum, store the source file path, document category, and date. This turns broad semantic search into targeted retrieval.

Filter by metadata whenever the query's scope is known. If a query only applies to a specific document category, pass a metadata filter along with it. Chroma applies metadata filters before the similarity scoring, which improves both accuracy and speed.
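
The filter-then-rank order can be sketched with a toy corpus. This is a conceptual illustration, not ChromaDB's internals: the corpus, vectors, and the filtered_search helper are all hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Hypothetical toy corpus: (embedding, metadata) pairs
corpus = [
    ([0.9, 0.1], {"source": "a.txt", "category": "framework"}),
    ([0.8, 0.2], {"source": "b.txt", "category": "database"}),
    ([0.1, 0.9], {"source": "c.txt", "category": "framework"}),
]


def filtered_search(query_vec: list[float], category: str, k: int) -> list[str]:
    # 1) Narrow the candidate set using metadata only
    candidates = [(vec, meta) for vec, meta in corpus if meta["category"] == category]
    # 2) Rank only the surviving candidates by similarity to the query
    ranked = sorted(candidates, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [meta["source"] for _, meta in ranked[:k]]


print(filtered_search([1.0, 0.0], "framework", k=2))  # ['a.txt', 'c.txt']
```

Note that b.txt is the second-closest vector to the query, yet it never appears: the filter removes it before any similarity is computed.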

Use text-embedding-3-small over ada-002. It is cheaper, faster, and performs similarly for most RAG use cases. Switch to text-embedding-3-large only if you have measured a quality gap on your specific documents.

For a complete production RAG system that uses these ChromaDB patterns, see RAG system tutorial. For vector database comparisons beyond ChromaDB, see vector database guide. For building the agent layer on top of your ChromaDB retriever, Build AI agent with LangChain shows the full stack.

Frequently Asked Questions

Does ChromaDB persist data between Python sessions?

Yes. When you initialize Chroma with persist_directory='./chroma_db', all embeddings and metadata are saved to disk automatically. In ChromaDB 0.4 and later, persistence is automatic — data is written after every operation. You no longer need to call vectorstore.persist() explicitly. Restart your Python session, create a new Chroma object pointing to the same directory, and all your documents will be there.

How do I filter ChromaDB results by metadata?

Pass a filter parameter to similarity_search(). For a simple equality filter: vectorstore.similarity_search(query, filter={'category': 'engineering'}). For complex conditions, use ChromaDB's query operators: $eq, $ne, $gt, $gte, $lt, $lte, $in, $nin, combined with $and and $or for compound conditions. Metadata filters are applied before similarity scoring, so they improve both precision and speed.

Can ChromaDB handle millions of vectors?

ChromaDB is optimized for hundreds of thousands to low millions of vectors in local mode using its HNSW index. For tens of millions of vectors or multi-server deployments, ChromaDB supports a client-server mode where the database runs as a separate process. For truly large-scale production workloads — hundreds of millions of vectors across multiple regions — consider Pinecone, Weaviate, or Qdrant, which are purpose-built for horizontal scaling.

Share this article:Facebook Twitter/X LinkedIn Telegram WhatsApp

💬 DiscussionPowered by GitHub Discussions

Frequently Asked Questions

Yes, when you initialize Chroma with persist_directory='./chroma_db', all embeddings and metadata are saved to disk automatically. In newer versions of ChromaDB (0.4+), persistence is automatic — you no longer need to call vectorstore.persist() explicitly.

AiTechWorlds Team

✓ Verified Writer

The AiTechWorlds team is passionate about AI, technology, and education. We create high-quality, research-backed content to help you learn, grow, and succeed in the modern digital world.

📱 Follow on Telegram 🐦 Follow on X Learn More →

search relevance ranking showing scores — LangChain advanced RAG retrieval strategies

Agent Development

10 LangChain Retrieval Strategies for Better RAG Results

Go beyond basic similarity search with ParentDocumentRetriever, MultiQueryRetriever, EnsembleRetriever, HyDE, and 6 more LangChain retrieval strategies — with code for each.

May 31, 2026 13 min read

AI agent architecture with memory and tool connections — LangChain agent memory tools

Agent Development

Build a LangChain Agent with Memory and Tools (Full Example)

Build a complete LangChain conversational agent with persistent memory, multiple tools, and step-by-step trace — from setup to a production-ready implementation with code.

May 31, 2026 14 min read

developer coding AI agent decision loop — LangChain agent types ZeroShot ReAct Conversational

Agent Development

5 LangChain Agent Types Explained (ZeroShot, ReAct, and More)

Understand every major LangChain agent type — ZeroShotAgent, ReAct, ConversationalAgent, and more — with Python code and agent trace walkthroughs.

May 31, 2026 13 min read

FastAPI server running LangChain endpoint — deploy LangChain FastAPI REST streaming

Agent Development

How to Deploy a LangChain App as a FastAPI REST Endpoint

Serve a LangChain app as a production FastAPI REST endpoint with streaming, async chains, error handling, and Docker deployment — full Python code included.

May 31, 2026 11 min read

Go deeper on this topic

NotesRAG: Retrieval-Augmented Generation Guide NotesAI Agent Development Notes NotesEmbeddings & Vector Databases Reference BookAI Agent Development Guide BookBuilding AI Apps: Developer's Guide CourseAI Agent Development Course

10K+ Members Growing Daily

Get Free AI Notes Daily

Join AiTechWorlds on Telegram and get daily AI tips, prompt engineering templates, coding resources, and exclusive content — 100% free!

📚 Free Study Notes🤖 AI Tips Daily⚡ Prompt Templates💻 Coding Resources

Join Free Channel

No spam. Leave anytime.

Langchain

How to Use LangChain with ChromaDB (Local Vector Store)

⚡ Quick Answer

Use LangChain with ChromaDB for persistent local embeddings — setup, metadata filtering, similarity search, and collection management with full Python code.

AiTechWorlds Team May 31, 2026 11 min read

#LangChain #ChromaDB #vector store #embeddings #RAG

📚Part of the Langchain guide — explore all Langchain articles→

Share:Facebook Twitter/X LinkedIn Telegram WhatsApp

📱

Get more content like this on Telegram!

Daily AI tips, notes & resources — free

Join Free →

Why ChromaDB for Local Development

The standard vector database comparison for RAG development:

ChromaDB — local or server mode, zero configuration, excellent LangChain integration
FAISS — pure in-memory, no persistence, extremely fast, used in research
Pinecone — fully managed cloud, production-grade, costs money
Weaviate — Docker-based, hybrid search, more complex setup

According to ChromaDB's own benchmarks, local ChromaDB handles similarity search across 1 million 1536-dimensional vectors in under 100ms on modern hardware.

Setup

pip install langchain langchain-openai langchain-community chromadb

import os
os.environ["OPENAI_API_KEY"] = "your-key-here"

Step 1: Basic ChromaDB Setup

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document

# Initialize embeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Create documents
documents = [
    Document(
        page_content="LangChain is a framework for building LLM applications.",
        metadata={"source": "overview.txt", "category": "framework", "year": 2023}
    ),
    Document(
        page_content="ChromaDB is an open-source vector database for AI applications.",
        metadata={"source": "chroma_docs.txt", "category": "database", "year": 2023}
    ),
    Document(
        page_content="RAG combines retrieval with generation for accurate LLM responses.",
        metadata={"source": "rag_guide.txt", "category": "technique", "year": 2023}
    ),
    Document(
        page_content="OpenAI text-embedding-3-small produces 1536-dimensional vectors.",
        metadata={"source": "openai_docs.txt", "category": "embedding", "year": 2024}
    ),
    Document(
        page_content="Vector similarity search finds documents closest to a query embedding.",
        metadata={"source": "search_guide.txt", "category": "technique", "year": 2024}
    ),
]

# Create vector store with persistence
vectorstore = Chroma.from_documents(
    documents=documents,
    embedding=embeddings,
    persist_directory="./chroma_db",       # saves to disk
    collection_name="langchain_docs",      # named collection
)

print(f"Created collection with {vectorstore._collection.count()} documents")

Step 2: Loading a Persisted Collection

After the first run, load the existing collection instead of rebuilding it:

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Load existing collection from disk
vectorstore = Chroma(
    collection_name="langchain_docs",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

doc_count = vectorstore._collection.count()
print(f"Loaded collection with {doc_count} documents")

In ChromaDB 0.4+, persistence is automatic — data is saved after every write operation. You no longer need to call vectorstore.persist() explicitly.

Step 3: Adding Documents Incrementally

You can add documents to an existing collection without rebuilding it from scratch:

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document
import hashlib

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vectorstore = Chroma(
    collection_name="langchain_docs",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

# New documents to add
new_docs = [
    Document(
        page_content="LCEL (LangChain Expression Language) uses pipe operators to compose chains.",
        metadata={"source": "lcel_guide.txt", "category": "framework", "year": 2024}
    ),
    Document(
        page_content="LangChain agents can use tools like web search and code execution.",
        metadata={"source": "agents_guide.txt", "category": "agents", "year": 2024}
    ),
]

# Generate stable IDs based on content hash to prevent duplicates
def content_hash(doc: Document) -> str:
    content = doc.page_content + str(doc.metadata.get("source", ""))
    return hashlib.md5(content.encode()).hexdigest()

ids = [content_hash(doc) for doc in new_docs]

# Add with explicit IDs — Chroma ignores duplicates with same ID
vectorstore.add_documents(documents=new_docs, ids=ids)

print(f"Collection now has {vectorstore._collection.count()} documents")

Step 4: Similarity Search

Basic similarity search and its variants:

# Standard similarity search — returns k most similar documents
results = vectorstore.similarity_search(
    query="How do I build LLM applications?",
    k=3,
)

for doc in results:
    print(f"Source: {doc.metadata['source']}")
    print(f"Content: {doc.page_content[:100]}")
    print("---")

# Search with relevance scores — lower distance = more similar
results_with_scores = vectorstore.similarity_search_with_relevance_scores(
    query="What is RAG?",
    k=3,
)

for doc, score in results_with_scores:
    print(f"Score: {score:.4f} | {doc.page_content[:80]}")

# Maximal Marginal Relevance (MMR) — balances relevance and diversity
# Useful when you want varied results instead of near-duplicates
mmr_results = vectorstore.max_marginal_relevance_search(
    query="LangChain framework components",
    k=4,
    fetch_k=20,    # fetch 20 candidates, return 4 most diverse
    lambda_mult=0.5,  # 0=max diversity, 1=max relevance
)

print("\nMMR results (diverse):")
for doc in mmr_results:
    print(f"  {doc.page_content[:80]}")

Step 5: Metadata Filtering

This is where ChromaDB becomes much more powerful than a basic similarity search:

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma(
    collection_name="langchain_docs",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

# Filter by exact metadata value
framework_docs = vectorstore.similarity_search(
    query="building applications",
    k=5,
    filter={"category": "framework"},  # only return docs where category == "framework"
)

print("Framework docs only:")
for doc in framework_docs:
    print(f"  {doc.metadata['source']}: {doc.page_content[:60]}")

# Filter with $in operator (multiple valid values)
tech_docs = vectorstore.similarity_search(
    query="LLM tools",
    k=5,
    filter={"category": {"$in": ["framework", "agents", "technique"]}},
)

# Filter with comparison operators
recent_docs = vectorstore.similarity_search(
    query="vector search",
    k=5,
    filter={"year": {"$gte": 2024}},  # only 2024 and newer
)

# Complex compound filter using $and
specific_docs = vectorstore.similarity_search(
    query="embeddings",
    k=5,
    filter={
        "$and": [
            {"year": {"$gte": 2024}},
            {"category": {"$ne": "database"}},
        ]
    },
)

print("\nAvailable filter operators:")
operators = ["$eq", "$ne", "$gt", "$gte", "$lt", "$lte", "$in", "$nin", "$and", "$or"]
for op in operators:
    print(f"  {op}")

Step 6: Updating and Deleting Documents

import chromadb
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma(
    collection_name="langchain_docs",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

# Access the underlying ChromaDB collection
collection = vectorstore._collection

# List all document IDs
all_docs = collection.get()
doc_ids = all_docs["ids"]
print(f"Total documents: {len(doc_ids)}")
print(f"Sample IDs: {doc_ids[:3]}")

# Delete a specific document by ID
if doc_ids:
    collection.delete(ids=[doc_ids[0]])
    print(f"Deleted document {doc_ids[0]}")
    print(f"Remaining: {collection.count()} documents")

# Delete documents matching a filter
collection.delete(
    where={"category": "embedding"}  # delete all embedding category docs
)
print(f"After category deletion: {collection.count()} documents")

# Update document content (delete + re-add)
doc_to_update = Document(
    page_content="LangChain LCEL is the recommended way to compose chains in 2026.",
    metadata={"source": "lcel_guide.txt", "category": "framework", "year": 2026}
)

# Use a deterministic ID based on the source
update_id = hashlib.md5("lcel_guide.txt".encode()).hexdigest()
collection.delete(ids=[update_id])
vectorstore.add_documents(documents=[doc_to_update], ids=[update_id])
print("Document updated successfully")
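
Hashing only the source name works while each source maps to exactly one record. Once documents are chunked, several chunks share the same source and would collide on one ID. A minimal sketch that extends the idea with a chunk index follows; `make_chunk_id` and the `source::index` key format are assumptions for illustration, not a LangChain or ChromaDB convention.

```python
import hashlib


def make_chunk_id(source: str, chunk_index: int = 0) -> str:
    """Return a stable ID for one chunk of a given source file.

    Hypothetical helper: the "source::index" key format is an assumption.
    MD5 is used only as a fingerprint, matching the snippet above,
    not for any security purpose.
    """
    key = f"{source}::{chunk_index}"
    return hashlib.md5(key.encode("utf-8")).hexdigest()


ids = [make_chunk_id("lcel_guide.txt", i) for i in range(3)]
print(ids)

# Re-running produces the same IDs, so a delete + re-add cycle targets
# the same records instead of creating duplicates.
assert ids == [make_chunk_id("lcel_guide.txt", i) for i in range(3)]
```

These IDs can be passed through the `ids` argument of `add_documents()`, as in the update example above.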

Step 7: Collection Management

Working with multiple collections for different use cases:

import chromadb
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

# Access the ChromaDB client directly
client = chromadb.PersistentClient(path="./chroma_db")

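# List all collections
# Depending on the chromadb version, list_collections() returns either
# Collection objects or plain name strings, so normalize to names first.
collections = client.list_collections()
print("Existing collections:")
for col in collections:
    name = col if isinstance(col, str) else col.name
    print(f"  {name}: {client.get_collection(name).count()} documents")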

# Create separate collections for different data sources
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

technical_store = Chroma(
    collection_name="technical_docs",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

product_store = Chroma(
    collection_name="product_catalog",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

faq_store = Chroma(
    collection_name="faq",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

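# Delete a collection entirely (raises an error if it does not exist;
# the exact exception type depends on the chromadb version)
try:
    client.delete_collection("old_collection")
except Exception as exc:
    print(f"Could not delete old_collection: {exc}")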

# Get collection stats
for store, name in [(technical_store, "technical"), (product_store, "product")]:
    count = store._collection.count()
    print(f"{name}: {count} documents")

Step 8: Full RAG Pipeline with ChromaDB

Putting it all together into a production-ready RAG chain:

from langchain_community.vectorstores import Chroma
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.documents import Document
import os

# Keep an already exported key; otherwise replace the placeholder with a real one
os.environ.setdefault("OPENAI_API_KEY", "your-key-here")

# Configuration
PERSIST_DIR = "./chroma_db"
COLLECTION_NAME = "knowledge_base"
EMBED_MODEL = "text-embedding-3-small"
CHAT_MODEL = "gpt-4o-mini"
CHUNK_SIZE = 512
CHUNK_OVERLAP = 64


def build_knowledge_base(documents: list[Document]) -> Chroma:
    """Index documents into ChromaDB with chunking."""
    embeddings = OpenAIEmbeddings(model=EMBED_MODEL)

    splitter = RecursiveCharacterTextSplitter(
        chunk_size=CHUNK_SIZE,
        chunk_overlap=CHUNK_OVERLAP,
    )
    chunks = splitter.split_documents(documents)  # split each Document, keeping its metadata
    print(f"Chunked {len(documents)} documents into {len(chunks)} chunks")

    vectorstore = Chroma.from_documents(
        documents=chunks,
        embedding=embeddings,
        persist_directory=PERSIST_DIR,
        collection_name=COLLECTION_NAME,
    )
    print(f"Indexed {vectorstore._collection.count()} chunks")
    return vectorstore


def build_rag_chain(vectorstore: Chroma, k: int = 4):
    """Build a complete RAG chain on top of ChromaDB."""
    retriever = vectorstore.as_retriever(
        search_type="mmr",         # use MMR for diversity
        search_kwargs={
            "k": k,
            "fetch_k": k * 5,
            "lambda_mult": 0.6,
        },
    )

    llm = ChatOpenAI(model=CHAT_MODEL, temperature=0)

    prompt = ChatPromptTemplate.from_messages([
        ("system", """You are a helpful assistant. Answer based only on the provided context.
If the context does not contain enough information, clearly state that.

Context:
{context}"""),
        ("human", "{question}"),
    ])

    def format_docs_with_sources(docs: list[Document]) -> str:
        formatted = []
        for doc in docs:
            source = doc.metadata.get("source", "Unknown")
            formatted.append(f"[Source: {source}]\n{doc.page_content}")
        return "\n\n---\n\n".join(formatted)

    chain = (
        {
            "context": retriever | format_docs_with_sources,
            "question": RunnablePassthrough(),
        }
        | prompt
        | llm
        | StrOutputParser()
    )

    return chain, retriever


# Build the pipeline
sample_docs = [
    Document(
        page_content="ChromaDB is an AI-native open-source embedding database. "
                     "It enables you to store embeddings and query them with near-neighbor search. "
                     "ChromaDB supports persistent storage, metadata filtering, and multiple collections.",
        metadata={"source": "chromadb_overview.txt", "category": "database"}
    ),
    Document(
        page_content="LangChain's Chroma integration provides a VectorStore interface. "
                     "You can use similarity_search(), max_marginal_relevance_search(), "
                     "and as_retriever() for various retrieval strategies.",
        metadata={"source": "langchain_chroma.txt", "category": "integration"}
    ),
    Document(
        page_content="For production deployments, ChromaDB supports a client-server mode "
                     "where the database runs as a separate process. "
                     "Use chromadb.HttpClient() to connect to a running ChromaDB server.",
        metadata={"source": "chromadb_deployment.txt", "category": "deployment"}
    ),
]

vectorstore = build_knowledge_base(sample_docs)
rag_chain, retriever = build_rag_chain(vectorstore, k=3)

# Query the pipeline
questions = [
    "What retrieval strategies does LangChain's Chroma integration support?",
    "How do I deploy ChromaDB in production?",
    "What is ChromaDB and what does it store?",
]

for question in questions:
    print(f"\nQ: {question}")
    answer = rag_chain.invoke(question)
    print(f"A: {answer}")
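
To build intuition for how `CHUNK_SIZE` and `CHUNK_OVERLAP` interact, here is a simplified fixed-width sliding window. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and sentences); the sketch only shows how the window advances by `chunk_size - chunk_overlap` characters. `sliding_chunks` is a hypothetical name.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-width windows that overlap by chunk_overlap characters.

    Simplified illustration only, not the RecursiveCharacterTextSplitter algorithm.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


text = "x" * 1200
chunks = sliding_chunks(text, chunk_size=512, chunk_overlap=64)
print([len(c) for c in chunks])  # [512, 512, 304]
```

A larger overlap repeats more text across neighboring chunks, which helps when an answer straddles a boundary but increases the number of embeddings to compute and store.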

Step 9: ChromaDB with Custom Embeddings

You are not limited to OpenAI; any embedding model with a LangChain integration works the same way:

from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document

# Free local embeddings — no API cost
hf_embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={"device": "cpu"},
    encode_kwargs={"normalize_embeddings": True},
)

vectorstore = Chroma.from_documents(
    documents=sample_docs,  # the sample_docs list defined in Step 8
    embedding=hf_embeddings,
    persist_directory="./chroma_db_local",
    collection_name="local_embeddings",
)

# Works exactly the same way
results = vectorstore.similarity_search("vector database deployment", k=2)
for doc in results:
    print(f"{doc.metadata['source']}: {doc.page_content[:80]}")
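
The `normalize_embeddings=True` setting scales every vector to unit length. For unit vectors, squared Euclidean distance and cosine similarity are tied by d² = 2 − 2·cos, so ranking by either measure gives the same order. This matters because Chroma collections, as far as the current documentation describes, use squared L2 distance unless configured otherwise. The stdlib-only check below uses made-up 3-dimensional vectors rather than real embeddings.

```python
import math


def normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def squared_l2(a: list[float], b: list[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Toy vectors standing in for embeddings (illustrative values only)
query = normalize([1.0, 2.0, 3.0])
doc = normalize([2.0, 1.0, 0.5])

cosine = dot(query, doc)  # for unit vectors, the dot product is the cosine similarity
distance = squared_l2(query, doc)

print(f"cosine similarity: {cosine:.4f}")
print(f"squared L2:        {distance:.4f}")
print(f"2 - 2 * cosine:    {2 - 2 * cosine:.4f}")
```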

Comparison Table: ChromaDB vs Other Vector Stores in LangChain

Feature	ChromaDB	FAISS	Pinecone	Weaviate	Qdrant
Local/free	Yes	Yes	No	Partial	Partial
Persistence	Yes	No	Managed	Yes	Yes
Metadata filtering	Yes	No	Yes	Yes	Yes
Hybrid search	No	No	Yes	Yes	Yes
Scale (vectors)	~1M local	Unlimited (RAM)	Unlimited	Unlimited	Unlimited
Setup complexity	Very easy	Easy	Easy	Medium	Medium
LangChain support	Excellent	Good	Excellent	Good	Good

Working with the Retriever Interface

The LangChain retriever interface abstracts over the underlying vector store:

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma(
    collection_name="langchain_docs",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

# Basic retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# Retriever with metadata filter: only returns docs whose category is "framework"
filtered_retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={
        "k": 4,
        "filter": {"category": "framework"},
    },
)

# MMR retriever for diverse results
mmr_retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={
        "k": 4,
        "fetch_k": 20,
        "lambda_mult": 0.5,
    },
)

# Use the retriever in any LangChain chain
docs = retriever.invoke("How do I build a RAG pipeline?")
print(f"Retrieved {len(docs)} documents")
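
To make the `lambda_mult` setting of the MMR retriever more concrete, here is a simplified pure-Python sketch of maximal marginal relevance. LangChain's own implementation is NumPy-based and runs over the `fetch_k` candidates returned by the store; the `mmr_select` and `cosine` names and the toy 2-dimensional vectors are assumptions for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def mmr_select(query: list[float], candidates: list[list[float]],
               k: int, lambda_mult: float = 0.5) -> list[int]:
    """Return indices of up to k candidates chosen by maximal marginal relevance.

    Each round picks the candidate that maximizes
        lambda_mult * sim(query, doc)
        - (1 - lambda_mult) * max(sim(doc, s) for s already selected)
    """
    selected: list[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best_idx, best_score = remaining[0], -math.inf
        for idx in remaining:
            relevance = cosine(query, candidates[idx])
            redundancy = max(
                (cosine(candidates[idx], candidates[s]) for s in selected),
                default=0.0,  # nothing selected yet, so no redundancy penalty
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = idx, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected


query = [1.0, 0.0]
candidates = [
    [1.0, 0.1],   # 0: very relevant
    [1.0, 0.12],  # 1: near-duplicate of 0
    [0.6, 0.8],   # 2: less relevant, but different from 0
]

print(mmr_select(query, candidates, k=2, lambda_mult=1.0))  # [0, 1]: pure relevance
print(mmr_select(query, candidates, k=2, lambda_mult=0.3))  # [0, 2]: favors diversity
```

With `lambda_mult=1.0` the redundancy term vanishes and the result matches plain similarity search; lower values penalize candidates that resemble documents already selected.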

Performance Tips

Frequently Asked Questions

Does ChromaDB persist data between Python sessions?

How do I filter ChromaDB results by metadata?

Can ChromaDB handle millions of vectors?


