7 Vector Stores Compatible with LangChain (Ranked 2026)

⚡ Quick Answer

Compare Pinecone, Weaviate, FAISS, Chroma, Milvus, Qdrant, and PGVector for LangChain RAG — with code snippets, cost breakdown, and honest recommendations.

AiTechWorlds Team May 31, 2026 12 min read

#LangChain #Vector Database #Pinecone #FAISS #Chroma #Qdrant

Choosing the wrong vector database for your RAG system is a decision you'll regret at scale. I've seen teams build on FAISS because it was the first example in the tutorial, then spend weeks migrating when they hit production requirements. I've also seen teams choose Pinecone for a weekend project and spend more on the database than they spent on the LLM API.

The choice matters. And it's genuinely non-trivial because the options vary significantly in hosting model, cost structure, query performance, metadata filtering capabilities, and operational complexity.

This guide covers all seven major vector stores that work with LangChain in 2026: Pinecone, Weaviate, FAISS, Chroma, Milvus, Qdrant, and PGVector. Each section includes LangChain integration code and my honest take on when that store makes sense, followed by a head-to-head comparison table.

For context on how vector stores fit into a complete RAG pipeline, see our RAG system tutorial. For LangChain fundamentals, the LangChain tutorial 2025 is the right starting point.

What Makes a Vector Store Good for LangChain

Before the comparisons, here's what actually matters when picking a vector store for LangChain RAG:

Query performance: How fast is approximate nearest neighbor (ANN) search? Most stores use HNSW (Hierarchical Navigable Small World) graphs. Query latency under 100ms on collections of up to a million vectors is the bar.

Metadata filtering: Can you filter by document type, date, user ID, or other attributes alongside vector search? This matters enormously for multi-tenant apps or when you only want to search a subset of documents.

Hybrid search: Does it combine dense vector search with BM25 keyword search? Hybrid search often outperforms pure vector search on RAG benchmarks.

Update flexibility: Can you add and delete documents without rebuilding the index? Some stores (FAISS) work best with periodic batch rebuilds. Others (Chroma, Qdrant) support real-time updates.

Operational simplicity: Do you want managed hosting (no ops) or self-hosted control? This is usually the deciding factor for small teams.

According to the ANN Benchmarks project, HNSW-based indexes are consistently among the best for recall versus query speed on high-dimensional embeddings, which is why most modern vector stores use them.
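To make "approximate" concrete, here is a minimal sketch of the exact search that ANN indexes trade accuracy against: score every vector, apply a metadata filter, keep the top k. It is pure Python for illustration only; the record layout, the `tenant` field, and the function names are assumptions, not any store's API.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def exact_search(query, records, k=2, metadata_filter=None):
    """Brute-force top-k: score every record that passes the filter."""
    metadata_filter = metadata_filter or {}
    # Keep only records whose metadata matches every filter key exactly
    candidates = [
        r for r in records
        if all(r["metadata"].get(key) == value
               for key, value in metadata_filter.items())
    ]
    scored = [(cosine_similarity(query, r["vector"]), r["id"]) for r in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Toy 2-dimensional "embeddings" (hypothetical data)
records = [
    {"id": "a", "vector": [1.0, 0.0], "metadata": {"tenant": "t1"}},
    {"id": "b", "vector": [0.9, 0.1], "metadata": {"tenant": "t2"}},
    {"id": "c", "vector": [0.0, 1.0], "metadata": {"tenant": "t1"}},
]

top_all = exact_search([1.0, 0.0], records, k=2)
top_t1 = exact_search([1.0, 0.0], records, k=2, metadata_filter={"tenant": "t1"})
print(top_all, top_t1)
```

This costs one similarity computation per stored vector on every query, which is why real stores build an index instead. A brute-force pass like this is still handy as ground truth when you want to measure the recall of an ANN index on your own data.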

1. Chroma: The Developer's First Choice

Chroma is where most LangChain RAG projects start. It's local-first, zero-config, and has first-class LangChain support.

from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Create or load a persistent collection
vectorstore = Chroma(
    collection_name="my_documents",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
    collection_metadata={"hnsw:space": "cosine"}  # Use cosine similarity
)

# Add documents with metadata
documents = [
    Document(
        page_content="LangChain is a framework for LLM applications.",
        metadata={"source": "docs", "category": "framework", "year": 2026}
    ),
    Document(
        page_content="Vector databases store embeddings for similarity search.",
        metadata={"source": "tutorial", "category": "database", "year": 2025}
    ),
]

ids = vectorstore.add_documents(documents)
print(f"Added documents with IDs: {ids}")

# Similarity search
results = vectorstore.similarity_search("LLM frameworks", k=2)

# Filtered search — only documents from 2026
filtered_results = vectorstore.similarity_search(
    "AI frameworks",
    k=5,
    filter={"year": 2026}
)

# Search with scores
results_with_scores = vectorstore.similarity_search_with_score("LLM tools", k=3)
for doc, score in results_with_scores:
    print(f"Score: {score:.4f} | {doc.page_content[:80]}")

# Use as retriever
retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 4, "fetch_k": 20}
)

Verdict: Best for development and projects under ~500K documents. The hosted Chroma Cloud service extends it to production. The SQLite backend it uses by default doesn't scale to millions of documents.
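The retriever above uses search_type="mmr" with k and fetch_k. As a rough sketch of the idea behind maximal marginal relevance (not LangChain's or Chroma's actual implementation): fetch a larger candidate pool, then pick results one at a time, rewarding similarity to the query and penalizing similarity to what is already picked. The vectors and the `mmr_select` helper below are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mmr_select(query, candidates, k=2, lambda_mult=0.5):
    """Return indexes of k candidates, trading relevance against redundancy.

    lambda_mult=1.0 means pure relevance; lower values favor diversity.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for idx in remaining:
            relevance = cosine(query, candidates[idx])
            # Highest similarity to anything already selected (0 if none yet)
            redundancy = max(
                (cosine(candidates[idx], candidates[s]) for s in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = idx, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected

query = [1.0, 0.0]
candidates = [        # plays the role of the fetch_k pool
    [1.0, 0.0],       # 0: most relevant
    [0.99, 0.05],     # 1: near-duplicate of 0
    [0.6, 0.8],       # 2: less relevant, but different
]

diverse = mmr_select(query, candidates, k=2, lambda_mult=0.3)
relevant_only = mmr_select(query, candidates, k=2, lambda_mult=1.0)
print(diverse, relevant_only)
```

With the diversity-leaning setting, the near-duplicate is skipped in favor of the different document; with lambda_mult=1.0 the result collapses to plain similarity ranking. For RAG, that usually means fewer repeated chunks in the context window.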

2. FAISS: Maximum Speed, Zero Dependencies

FAISS (Facebook AI Similarity Search) is a C++ library with Python bindings. No server, no network latency, just raw performance.

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Create from texts directly
texts = [
    "Python is a programming language",
    "LangChain builds LLM applications",
    "Embeddings convert text to vectors",
    "RAG retrieves context for LLMs",
]
metadatas = [{"id": i, "source": "test"} for i in range(len(texts))]

vectorstore = FAISS.from_texts(texts=texts, embedding=embeddings, metadatas=metadatas)

# Save to disk
vectorstore.save_local("./faiss_index")

# Load from disk (fast — no re-embedding needed)
loaded_store = FAISS.load_local(
    "./faiss_index",
    embeddings=embeddings,
    allow_dangerous_deserialization=True  # Required: the docstore is pickled, so only load files you trust
)

# Similarity search
results = loaded_store.similarity_search("vector databases", k=3)

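# Merge two FAISS stores (useful for incremental builds)
# Pass explicit IDs so documents can be deleted later; otherwise IDs are auto-generated
store1 = FAISS.from_texts(["doc1", "doc2"], embedding=embeddings, ids=["doc_id_1", "doc_id_2"])
store2 = FAISS.from_texts(["doc3", "doc4"], embedding=embeddings, ids=["doc_id_3", "doc_id_4"])
store1.merge_from(store2)

# Delete specific documents by ID (the IDs must exist in the store)
store1.delete(["doc_id_1", "doc_id_2"])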

Verdict: Best for local deployments where all data fits in RAM. Exceptional query speed. Not suited for real-time updates at scale or distributed deployments.
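"Fits in RAM" is easy to estimate before you commit. The sketch below assumes an uncompressed index that stores each vector as float32 (4 bytes per value) and uses the 1536 dimensions from the examples in this guide; quantized or graph-based indexes change the numbers, and document text and metadata come on top. The helper names are mine.

```python
def flat_index_bytes(num_vectors, dimension, bytes_per_value=4):
    """Raw storage for uncompressed vectors (float32 by default)."""
    return num_vectors * dimension * bytes_per_value

def to_gib(num_bytes):
    """Convert bytes to gibibytes."""
    return num_bytes / (1024 ** 3)

# 1 million vectors at 1536 dimensions
size_1m = flat_index_bytes(1_000_000, 1536)
print(f"{size_1m:,} bytes = {to_gib(size_1m):.2f} GiB")
```

At one million vectors that is a little under 6 GiB for the raw vectors alone, so ten million already needs a large-memory machine. Run the numbers for your own corpus before assuming a single-process store will do.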

3. Pinecone: Managed Scale

Pinecone is the managed vector database leader. Zero infrastructure, global distribution, and consistent performance at any scale.

from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings
from pinecone import Pinecone, ServerlessSpec
import os

# Initialize Pinecone
pc = Pinecone(api_key=os.getenv("PINECONE_API_KEY"))

# Create a serverless index (usage-based billing rather than a fixed hourly rate)
index_name = "langchain-rag"
if index_name not in [i.name for i in pc.list_indexes()]:
    pc.create_index(
        name=index_name,
        dimension=1536,  # text-embedding-3-small dimension
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1")
    )

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Connect LangChain to Pinecone
vectorstore = PineconeVectorStore(
    index=pc.Index(index_name),
    embedding=embeddings,
    text_key="text"
)

# Add documents
from langchain_core.documents import Document

docs = [
    Document(
        page_content="Pinecone provides managed vector search at scale.",
        metadata={"source": "pinecone_docs", "category": "database", "tenant_id": "user_001"}
    ),
]
vectorstore.add_documents(docs)

# Multi-tenant filtered search
results = vectorstore.similarity_search(
    "managed databases",
    k=5,
    filter={"tenant_id": {"$eq": "user_001"}}
)

# Namespace-based multi-tenancy
tenant_store = PineconeVectorStore(
    index=pc.Index(index_name),
    embedding=embeddings,
    namespace="tenant_user_001"  # Namespace isolates data
)

Verdict: Best for production SaaS applications, multi-tenant RAG, and teams that don't want to manage infrastructure. The cost adds up at scale — budget carefully.

4. Qdrant: Best Self-Hosted Option

Qdrant is my personal recommendation for production self-hosted deployments. It has excellent metadata filtering, hybrid search, and a clean REST/gRPC API.

from langchain_qdrant import QdrantVectorStore
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams
from langchain_openai import OpenAIEmbeddings
import os

# Connect to Qdrant (local, Docker, or cloud)
# Local: QdrantClient(path="./qdrant_db")
# Docker: QdrantClient(host="localhost", port=6333)
# Cloud: QdrantClient(url="https://...", api_key="...")

client = QdrantClient(path="./qdrant_local")

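# Create the collection only if it doesn't exist yet
# (recreate_collection would drop existing data on every run)
collection_name = "langchain_docs"
if not client.collection_exists(collection_name):
    client.create_collection(
        collection_name=collection_name,
        vectors_config=VectorParams(
            size=1536,  # text-embedding-3-small dimension
            distance=Distance.COSINE
        )
    )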

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vectorstore = QdrantVectorStore(
    client=client,
    collection_name=collection_name,
    embedding=embeddings,
)

# Add documents
from langchain_core.documents import Document
docs = [
    Document(
        page_content="Qdrant is a high-performance vector search engine.",
        metadata={
            "source": "qdrant_docs",
            "category": "database",
            "language": "en",
            "published_year": 2024
        }
    ),
]
vectorstore.add_documents(docs)

# Advanced filtered search using Qdrant's filter syntax
# LangChain stores Document.metadata under the "metadata" payload key,
# so filter keys take a "metadata." prefix
from qdrant_client.models import Filter, FieldCondition, MatchValue, Range

results = vectorstore.similarity_search(
    "vector database performance",
    k=5,
    filter=Filter(
        must=[
            FieldCondition(key="metadata.language", match=MatchValue(value="en")),
            FieldCondition(key="metadata.published_year", range=Range(gte=2023))
        ]
    )
)

# Hybrid search (dense + sparse)
from langchain_qdrant import FastEmbedSparse, RetrievalMode

sparse_embeddings = FastEmbedSparse(model_name="Qdrant/bm25")

hybrid_store = QdrantVectorStore.from_documents(
    docs,
    embedding=embeddings,
    sparse_embedding=sparse_embeddings,
    location=":memory:",
    collection_name="hybrid_collection",
    retrieval_mode=RetrievalMode.HYBRID,
)

Verdict: Best for teams that need production performance, excellent filtering, and hybrid search without the Pinecone price tag. Docker deployment is straightforward.
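Hybrid retrieval has to merge two ranked lists, one from dense vectors and one from sparse keyword scores. One common way to do that is reciprocal rank fusion (RRF); the sketch below illustrates the general technique and makes no claim about which fusion method a given store runs internally. The document IDs are made up, and k=60 is a conventional default.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one.

    Each appearance of a document contributes 1 / (k + rank),
    with rank starting at 1, so items ranked high in several
    lists rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

dense_ranking = ["d1", "d2", "d3"]    # hypothetical vector-search order
sparse_ranking = ["d3", "d1", "d4"]   # hypothetical BM25 order

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused)
```

RRF only looks at ranks, not raw scores, so it sidesteps the problem that cosine similarities and BM25 scores live on different scales.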

5. Weaviate: Multi-Modal and Hybrid Search

Weaviate stands out for multi-modal search (text + images + audio) and built-in BM25 hybrid search. The schema-based approach is more structured than other stores.

from langchain_weaviate import WeaviateVectorStore
import weaviate
from langchain_openai import OpenAIEmbeddings

# Connect to Weaviate
client = weaviate.connect_to_local()  # Docker local
# Or: weaviate.connect_to_weaviate_cloud(cluster_url="...", auth_credentials=...)

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vectorstore = WeaviateVectorStore(
    client=client,
    index_name="LangchainDocs",
    text_key="text",
    embedding=embeddings,
)

from langchain_core.documents import Document
docs = [
    Document(
        page_content="Weaviate supports hybrid search combining vectors and BM25.",
        metadata={"source": "weaviate_docs", "author": "AiTechWorlds"}
    ),
]
vectorstore.add_documents(docs)

# Hybrid search
results = vectorstore.similarity_search(
    "hybrid vector search BM25",
    k=3,
    alpha=0.5  # 0=pure BM25, 1=pure vector, 0.5=equal blend
)

client.close()

Verdict: Best for multi-modal RAG or when you need strong hybrid search out of the box. The schema system has a steeper learning curve than other options.
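The alpha parameter in the snippet above is easiest to understand as a weight between two score lists. Here is a simplified sketch of that idea, assuming min-max normalization followed by a weighted sum; it is not Weaviate's internal fusion code, and the scores are invented for illustration.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the 0..1 range."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}

def blend(vector_scores, keyword_scores, alpha=0.5):
    """Rank documents by alpha * vector + (1 - alpha) * keyword.

    alpha=1 is pure vector search, alpha=0 is pure keyword search.
    """
    v = min_max(vector_scores)
    kw = min_max(keyword_scores)
    doc_ids = set(v) | set(kw)
    combined = {
        doc_id: alpha * v.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    # Sort by score descending, then by ID for a stable order
    return sorted(combined, key=lambda doc_id: (-combined[doc_id], doc_id))

vector_scores = {"a": 0.9, "b": 0.8, "c": 0.1}   # hypothetical cosine scores
keyword_scores = {"a": 1.0, "b": 5.0, "c": 3.0}  # hypothetical BM25 scores

pure_vector = blend(vector_scores, keyword_scores, alpha=1.0)
pure_keyword = blend(vector_scores, keyword_scores, alpha=0.0)
balanced = blend(vector_scores, keyword_scores, alpha=0.5)
print(pure_vector, pure_keyword, balanced)
```

Document "b" is second on vectors and first on keywords, so it wins once both signals count. In practice, tune alpha against a small set of real queries rather than leaving it at 0.5 by default.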

6. Milvus: High-Throughput Enterprise

Milvus is designed for high-throughput, large-scale deployments. It's more complex to operate than Qdrant but scales to billions of vectors.

from langchain_milvus import Milvus
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Create the vector store without inserting anything yet
# (the URI assumes a local Milvus server on the default port)
vectorstore = Milvus(
    embedding_function=embeddings,
    collection_name="langchain_rag",
    connection_args={"uri": "http://localhost:19530"},
    index_params={
        "metric_type": "COSINE",
        "index_type": "HNSW",
        "params": {"M": 8, "efConstruction": 64}
    }
)

from langchain_core.documents import Document
docs = [
    Document(
        page_content="Milvus handles billions of vectors with high throughput.",
        metadata={"source": "milvus_docs", "type": "technical"}
    ),
]
vectorstore.add_documents(docs)

# Similarity search (partitioning for large-scale multi-tenancy is not shown here)
results = vectorstore.similarity_search("high performance vector search", k=5)

# Also available as managed: Zilliz Cloud (Milvus-compatible SaaS)

Verdict: Best for enterprise deployments requiring billions of vectors and multi-hundred QPS throughput. Overkill for most projects under 10 million documents.

7. PGVector: If You Already Have PostgreSQL

PGVector adds vector search as a Postgres extension. If your application already runs on Postgres, this is the path of least resistance.

from langchain_postgres import PGVector
from langchain_openai import OpenAIEmbeddings
import os

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Connection string format
connection_string = "postgresql+psycopg://user:password@localhost:5432/mydb"

vectorstore = PGVector(
    embeddings=embeddings,
    collection_name="langchain_vectors",
    connection=connection_string,
    use_jsonb=True,  # Better metadata filtering with JSONB
)

from langchain_core.documents import Document
docs = [
    Document(
        page_content="PGVector adds vector search to PostgreSQL databases.",
        metadata={
            "source": "pgvector_docs",
            "user_id": "u001",
            "document_type": "tutorial"
        }
    ),
]
vectorstore.add_documents(docs)

# Metadata filtering using PostgreSQL operators
results = vectorstore.similarity_search(
    "PostgreSQL vector extension",
    k=5,
    filter={"user_id": "u001"}
)

# Full-text hybrid search combining pgvector and tsvector
# (Requires custom SQL — PGVector supports this via raw queries)

Verdict: Best when you already run PostgreSQL and want to avoid adding a new database to your stack. Performance is solid for small to medium scale. Not competitive with dedicated vector stores at large scale.

Head-to-Head Comparison Table

Database	Hosting	Monthly Cost (est.)	ANN Algorithm	Metadata Filtering	Hybrid Search	Best For
Chroma	Local / Cloud	Free (self-host)	HNSW	Good	No	Dev, small prod
FAISS	Local only	Free	IVF + HNSW	Limited	No	Speed-critical local
Pinecone	Cloud only	$0–$70+	Proprietary	Excellent	Yes (managed)	SaaS, managed prod
Qdrant	Local + Cloud	Free (self-host)	HNSW	Excellent	Yes	Self-hosted prod
Weaviate	Local + Cloud	Free (self-host)	HNSW	Good	Yes (built-in)	Multi-modal, hybrid
Milvus	Local + Cloud	Free (self-host)	HNSW / IVF	Good	Limited	Enterprise scale
PGVector	Local (Postgres)	Free (self-host)	HNSW / IVF	Excellent	Manual	Postgres-first teams

Switching Between Vector Stores

One of LangChain's advantages is that the vector store interface is largely consistent. Switching stores is mostly about the initialization code, not the rest of your application:

from langchain_core.vectorstores import VectorStore
from langchain_openai import OpenAIEmbeddings

def build_rag_chain(vectorstore: VectorStore):
    """This function works with ANY LangChain vector store."""
    from langchain_openai import ChatOpenAI
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.runnables import RunnablePassthrough
    
    retriever = vectorstore.as_retriever(
        search_type="mmr",
        search_kwargs={"k": 4, "fetch_k": 20}
    )
    
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    
    prompt = ChatPromptTemplate.from_messages([
        ("system", "Answer using context:\n{context}"),
        ("human", "{question}")
    ])
    
    return (
        {"context": retriever | (lambda docs: "\n\n".join(d.page_content for d in docs)),
         "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )

# Use with any store
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Swap these out without changing build_rag_chain
from langchain_chroma import Chroma
chroma_store = Chroma(
    collection_name="my_documents",  # must match the collection you indexed into
    persist_directory="./chroma_db",
    embedding_function=embeddings,
)
chain = build_rag_chain(chroma_store)

This portability is one of the best reasons to use LangChain's vector store abstractions rather than vendor SDKs directly.
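The other half of a migration is moving the documents themselves. A minimal sketch, assuming only that the target exposes add_documents() as LangChain vector stores do: stream documents into the new store in batches. `copy_in_batches` and `InMemoryStore` are hypothetical names, and the in-memory class is a stand-in so the example runs without a database.

```python
from typing import Iterable, List, Protocol

class SupportsAddDocuments(Protocol):
    """Anything with an add_documents method, e.g. a LangChain vector store."""
    def add_documents(self, documents: list) -> List[str]: ...

def copy_in_batches(documents: Iterable, target: SupportsAddDocuments,
                    batch_size: int = 100) -> int:
    """Send documents to the target store in fixed-size batches.

    Returns the number of documents written.
    """
    batch, total = [], 0
    for doc in documents:
        batch.append(doc)
        if len(batch) == batch_size:
            target.add_documents(batch)
            total += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        target.add_documents(batch)
        total += len(batch)
    return total

class InMemoryStore:
    """Stand-in for a real vector store, for demonstration only."""
    def __init__(self):
        self.batches = []

    def add_documents(self, documents):
        self.batches.append(list(documents))
        return [str(i) for i in range(len(documents))]

store = InMemoryStore()
copied = copy_in_batches((f"doc {i}" for i in range(250)), store, batch_size=100)
print(copied, [len(b) for b in store.batches])
```

With a real target store, each add_documents call embeds the batch again, so the embedding API bill is part of the migration cost. Batching keeps memory flat and makes it easier to resume after a failure.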

My Recommendations by Use Case

You're building a prototype or internal tool (< 100K documents): Chroma. Zero setup, good filtering, works on your laptop.

You need production RAG and don't want to manage infrastructure: Pinecone. It's more expensive than self-hosted but you get consistent performance and no ops burden.

You need production RAG and have DevOps resources: Qdrant. Best combination of performance, filtering, hybrid search, and cost for self-hosted deployments.

You already run PostgreSQL and your data is under 1M documents: PGVector. The operational simplicity of staying in one database is worth a lot.

You need multi-modal search (images, audio, text): Weaviate. It's the only option here with native multi-modal support.

You need billions of vectors at high throughput: Milvus or a managed Pinecone enterprise tier.

For more context on the semantic search fundamentals that underpin all of these stores, the semantic search tutorial covers embedding models, distance metrics, and ANN algorithms. The LangChain RAG pipeline guide shows how any of these stores fits into a complete retrieval system.

Conclusion

There's no universally best vector database — just the right one for your specific requirements. The table and recommendations above should narrow it down. If you're unsure, start with Chroma, get your RAG pipeline working end-to-end, then evaluate whether you need to migrate based on actual performance data from your use case.

The good news: switching stores with LangChain is a day of work, not a week. Build your chain against the interface, test with Chroma first, and upgrade when you have real reasons to. Don't over-engineer your database choice before you even know what your query patterns look like.

Pair this guide with the LangChain advanced RAG strategies post to get the most out of whichever store you choose.

Frequently Asked Questions

Which vector store is best for a beginner LangChain RAG project?

Start with Chroma. It requires zero infrastructure setup, runs locally, persists to disk, and integrates with LangChain in three lines. Switch to Qdrant or Pinecone when you outgrow it.

Can I switch vector stores without rewriting my LangChain code?

Mostly yes. LangChain's vector store interface is standardized: similarity_search(), as_retriever(), and add_documents() work the same across providers. Filter syntax is the main exception, as the Chroma, Pinecone, and Qdrant examples above show. You'll need to load your documents into the new store (which usually means re-embedding them), but your chain code changes minimally.

Does FAISS work for production RAG systems?

FAISS works well for production if your data fits in RAM and you don't need real-time updates or multi-server deployments. Many companies run FAISS in production for read-heavy workloads with periodic batch rebuilds. It's not ideal for dynamic datasets that update frequently.

Langchain

7 Vector Stores Compatible with LangChain (Ranked 2026)

⚡ Quick Answer

Compare Pinecone, Weaviate, FAISS, Chroma, Milvus, Qdrant, and PGVector for LangChain RAG — with code snippets, cost breakdown, and honest recommendations.

AiTechWorlds Team May 31, 2026 12 min read

#LangChain #Vector Database #Pinecone #FAISS #Chroma #Qdrant

📚Part of the Langchain guide — explore all Langchain articles→

Share:Facebook Twitter/X LinkedIn Telegram WhatsApp

📱

Get more content like this on Telegram!

Daily AI tips, notes & resources — free

Join Free →

For context on how vector stores fit into a complete RAG pipeline, see our RAG system tutorial. For LangChain fundamentals, the LangChain tutorial 2025 is the right starting point.

What Makes a Vector Store Good for LangChain

Before the comparisons, here's what actually matters when picking a vector store for LangChain RAG:

Hybrid search — Does it combine dense vector search with BM25 keyword search? Hybrid search consistently outperforms pure vector search on most RAG benchmarks.

Update flexibility — Can you add and delete documents without rebuilding the index? Some stores (FAISS) require rebuilds. Others (Chroma, Qdrant) support real-time updates.

Operational simplicity — Do you want managed hosting (no ops) or self-hosted control? This is usually the deciding factor for small teams.

According to the ANN Benchmarks project, HNSW-based indexes consistently deliver the best recall-to-speed ratio for high-dimensional embeddings, which is why most modern vector stores use it.

1. Chroma: The Developer's First Choice

Chroma is where most LangChain RAG projects start. It's local-first, zero-config, and has first-class LangChain support.

from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Create or load a persistent collection
vectorstore = Chroma(
    collection_name="my_documents",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
    collection_metadata={"hnsw:space": "cosine"}  # Use cosine similarity
)

# Add documents with metadata
documents = [
    Document(
        page_content="LangChain is a framework for LLM applications.",
        metadata={"source": "docs", "category": "framework", "year": 2026}
    ),
    Document(
        page_content="Vector databases store embeddings for similarity search.",
        metadata={"source": "tutorial", "category": "database", "year": 2025}
    ),
]

ids = vectorstore.add_documents(documents)
print(f"Added documents with IDs: {ids}")

# Similarity search
results = vectorstore.similarity_search("LLM frameworks", k=2)

# Filtered search — only documents from 2026
filtered_results = vectorstore.similarity_search(
    "AI frameworks",
    k=5,
    filter={"year": 2026}
)

# Search with scores
results_with_scores = vectorstore.similarity_search_with_score("LLM tools", k=3)
for doc, score in results_with_scores:
    print(f"Score: {score:.4f} | {doc.page_content[:80]}")

# Use as retriever
retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 4, "fetch_k": 20}
)

2. FAISS: Maximum Speed, Zero Dependencies

FAISS (Facebook AI Similarity Search) is a C++ library with Python bindings. No server, no network latency, just raw performance.

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Create from texts directly
texts = [
    "Python is a programming language",
    "LangChain builds LLM applications",
    "Embeddings convert text to vectors",
    "RAG retrieves context for LLMs",
]
metadatas = [{"id": i, "source": "test"} for i in range(len(texts))]

vectorstore = FAISS.from_texts(texts=texts, embedding=embeddings, metadatas=metadatas)

# Save to disk
vectorstore.save_local("./faiss_index")

# Load from disk (fast — no re-embedding needed)
loaded_store = FAISS.load_local(
    "./faiss_index",
    embeddings=embeddings,
    allow_dangerous_deserialization=True  # Required flag
)

# Similarity search
results = loaded_store.similarity_search("vector databases", k=3)

# Merge two FAISS stores (useful for incremental builds)
store1 = FAISS.from_texts(["doc1", "doc2"], embedding=embeddings)
store2 = FAISS.from_texts(["doc3", "doc4"], embedding=embeddings)
store1.merge_from(store2)

# Delete specific documents by ID
store1.delete(["doc_id_1", "doc_id_2"])

Verdict: Best for local deployments where all data fits in RAM. Exceptional query speed. Not suited for real-time updates at scale or distributed deployments.

3. Pinecone: Managed Scale

Pinecone is the managed vector database leader. Zero infrastructure, global distribution, and consistent performance at any scale.

from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings
from pinecone import Pinecone, ServerlessSpec
import os

# Initialize Pinecone
pc = Pinecone(api_key=os.getenv("PINECONE_API_KEY"))

# Create a serverless index (pay per query, not per hour)
index_name = "langchain-rag"
if index_name not in [i.name for i in pc.list_indexes()]:
    pc.create_index(
        name=index_name,
        dimension=1536,  # text-embedding-3-small dimension
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1")
    )

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Connect LangChain to Pinecone
vectorstore = PineconeVectorStore(
    index=pc.Index(index_name),
    embedding=embeddings,
    text_key="text"
)

# Add documents
from langchain_core.documents import Document

docs = [
    Document(
        page_content="Pinecone provides managed vector search at scale.",
        metadata={"source": "pinecone_docs", "category": "database", "tenant_id": "user_001"}
    ),
]
vectorstore.add_documents(docs)

# Multi-tenant filtered search
results = vectorstore.similarity_search(
    "managed databases",
    k=5,
    filter={"tenant_id": {"$eq": "user_001"}}
)

# Namespace-based multi-tenancy
tenant_store = PineconeVectorStore(
    index=pc.Index(index_name),
    embedding=embeddings,
    namespace="tenant_user_001"  # Namespace isolates data
)

Verdict: Best for production SaaS applications, multi-tenant RAG, and teams that don't want to manage infrastructure. The cost adds up at scale — budget carefully.

4. Qdrant: Best Self-Hosted Option

Qdrant is my personal recommendation for production self-hosted deployments. It has excellent metadata filtering, hybrid search, and a clean REST/gRPC API.

from langchain_qdrant import QdrantVectorStore
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams
from langchain_openai import OpenAIEmbeddings
import os

# Connect to Qdrant (local, Docker, or cloud)
# Local: QdrantClient(path="./qdrant_db")
# Docker: QdrantClient(host="localhost", port=6333)
# Cloud: QdrantClient(url="https://...", api_key="...")

client = QdrantClient(path="./qdrant_local")

# Create collection
collection_name = "langchain_docs"
client.recreate_collection(
    collection_name=collection_name,
    vectors_config=VectorParams(
        size=1536,
        distance=Distance.COSINE
    )
)

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vectorstore = QdrantVectorStore(
    client=client,
    collection_name=collection_name,
    embedding=embeddings,
)

# Add documents
from langchain_core.documents import Document
docs = [
    Document(
        page_content="Qdrant is a high-performance vector search engine.",
        metadata={
            "source": "qdrant_docs",
            "category": "database",
            "language": "en",
            "published_year": 2024
        }
    ),
]
vectorstore.add_documents(docs)

# Advanced filtered search using Qdrant's filter syntax
from qdrant_client.models import Filter, FieldCondition, MatchValue, Range

results = vectorstore.similarity_search(
    "vector database performance",
    k=5,
    filter=Filter(
        must=[
            FieldCondition(key="language", match=MatchValue(value="en")),
            FieldCondition(key="published_year", range=Range(gte=2023))
        ]
    )
)

# Hybrid search (dense + sparse)
from langchain_qdrant import FastEmbedSparse, RetrievalMode

sparse_embeddings = FastEmbedSparse(model_name="Qdrant/bm25")

hybrid_store = QdrantVectorStore.from_documents(
    docs,
    embedding=embeddings,
    sparse_embedding=sparse_embeddings,
    location=":memory:",
    collection_name="hybrid_collection",
    retrieval_mode=RetrievalMode.HYBRID,
)

Verdict: Best for teams that need production performance, excellent filtering, and hybrid search without the Pinecone price tag. Docker deployment is straightforward.

5. Weaviate: Multi-Modal and Hybrid

Weaviate stands out for multi-modal search (text + images + audio) and built-in BM25 hybrid search. Its schema-based approach is more structured than the other stores covered here.

from langchain_weaviate import WeaviateVectorStore
import weaviate
from langchain_openai import OpenAIEmbeddings

# Connect to Weaviate
client = weaviate.connect_to_local()  # Docker local
# Or: weaviate.connect_to_weaviate_cloud(cluster_url="...", auth_credentials=...)

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vectorstore = WeaviateVectorStore(
    client=client,
    index_name="LangchainDocs",
    text_key="text",
    embedding=embeddings,
)

from langchain_core.documents import Document
docs = [
    Document(
        page_content="Weaviate supports hybrid search combining vectors and BM25.",
        metadata={"source": "weaviate_docs", "author": "AiTechWorlds"}
    ),
]
vectorstore.add_documents(docs)

# Hybrid search
results = vectorstore.similarity_search(
    "hybrid vector search BM25",
    k=3,
    alpha=0.5  # 0=pure BM25, 1=pure vector, 0.5=equal blend
)

client.close()

Verdict: Best for multi-modal RAG or when you need strong hybrid search out of the box. The schema system has a steeper learning curve than other options.
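
To make the alpha parameter concrete, here is a minimal pure-Python sketch of a weighted blend between a keyword score and a vector score. It assumes both scores are already normalized to the 0 to 1 range. The names blend_scores and rank, and the sample scores, are hypothetical, and Weaviate's own fusion algorithms are more involved than this linear mix.

```python
def blend_scores(bm25_score: float, vector_score: float, alpha: float) -> float:
    """Linear blend: alpha=0 -> pure BM25, alpha=1 -> pure vector."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return (1.0 - alpha) * bm25_score + alpha * vector_score


# Hypothetical, already-normalized scores for two documents.
candidates = {
    "doc_keyword_match": {"bm25": 0.9, "vector": 0.2},
    "doc_semantic_match": {"bm25": 0.1, "vector": 0.8},
}


def rank(alpha: float) -> list[str]:
    """Return document ids sorted by blended score, best first."""
    return sorted(
        candidates,
        key=lambda doc_id: blend_scores(
            candidates[doc_id]["bm25"], candidates[doc_id]["vector"], alpha
        ),
        reverse=True,
    )
```

With these sample scores, rank(0.0) puts the keyword match first and rank(1.0) puts the semantic match first, which is the behavior the alpha comment in the Weaviate snippet describes.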

6. Milvus: High-Throughput Enterprise

Milvus is designed for high-throughput, large-scale deployments. It's more complex to operate than Qdrant but scales to billions of vectors.

from langchain_milvus import Milvus
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Create an empty store with the constructor; the collection is created
# when documents are first added. (from_documents with an empty list has
# no vectors to build the collection from.)
vectorstore = Milvus(
    embedding_function=embeddings,
    collection_name="langchain_rag",
    connection_args={"uri": "http://localhost:19530"},
    index_params={
        "metric_type": "COSINE",
        "index_type": "HNSW",
        "params": {"M": 8, "efConstruction": 64}
    }
)

from langchain_core.documents import Document
docs = [
    Document(
        page_content="Milvus handles billions of vectors with high throughput.",
        metadata={"source": "milvus_docs", "type": "technical"}
    ),
]
vectorstore.add_documents(docs)

# Basic similarity search (partition-based multi-tenancy needs extra setup, not shown here)
results = vectorstore.similarity_search("high performance vector search", k=5)

# Also available as managed: Zilliz Cloud (Milvus-compatible SaaS)

Verdict: Best for enterprise deployments that need billions of vectors and sustained throughput in the hundreds of queries per second. Overkill for most projects under 10 million documents.

7. PGVector: If You Already Have PostgreSQL

PGVector adds vector search as a Postgres extension. If your application already runs on Postgres, this is the path of least resistance.

from langchain_postgres import PGVector
from langchain_openai import OpenAIEmbeddings
import os

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Connection string format
connection_string = "postgresql+psycopg://user:password@localhost:5432/mydb"

vectorstore = PGVector(
    embeddings=embeddings,
    collection_name="langchain_vectors",
    connection=connection_string,
    use_jsonb=True,  # Better metadata filtering with JSONB
)

from langchain_core.documents import Document
docs = [
    Document(
        page_content="PGVector adds vector search to PostgreSQL databases.",
        metadata={
            "source": "pgvector_docs",
            "user_id": "u001",
            "document_type": "tutorial"
        }
    ),
]
vectorstore.add_documents(docs)

# Metadata filtering (exact match on a metadata field stored as JSONB)
results = vectorstore.similarity_search(
    "PostgreSQL vector extension",
    k=5,
    filter={"user_id": "u001"}
)

# Full-text hybrid search combining pgvector and tsvector
# (Requires custom SQL — PGVector supports this via raw queries)
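
If you do run separate pgvector and tsvector queries, you still have to merge the two result lists. One common way to do that is reciprocal rank fusion (RRF), sketched below in plain Python. This is an illustration and not something PGVector provides: the function name, the document ids, and the constant k=60 are assumptions, and the SQL queries that would produce the two lists are not shown.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Higher total score ranks first.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by id for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical ids returned by a vector query and a full-text query.
vector_hits = ["d3", "d1", "d7"]
keyword_hits = ["d1", "d9", "d3"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that appear in both lists (d1 and d3 here) rise above documents that appear in only one, without needing the two score scales to be comparable.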

Head-to-Head Comparison Table

Database	Hosting	Monthly Cost (est.)	ANN Algorithm	Metadata Filtering	Hybrid Search	Best For
Chroma	Local / Cloud	Free (self-host)	HNSW	Good	No	Dev, small prod
FAISS	Local only	Free	IVF + HNSW	Limited	No	Speed-critical local
Pinecone	Cloud only	$0–$70+	Proprietary	Excellent	Yes (managed)	SaaS, managed prod
Qdrant	Local + Cloud	Free (self-host)	HNSW	Excellent	Yes	Self-hosted prod
Weaviate	Local + Cloud	Free (self-host)	HNSW	Good	Yes (built-in)	Multi-modal, hybrid
Milvus	Local + Cloud	Free (self-host)	HNSW / IVF	Good	Limited	Enterprise scale
PGVector	Local (Postgres)	Free (self-host)	HNSW / IVF	Excellent	Manual	Postgres-first teams
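
The table does not cover memory, which matters for in-memory options such as FAISS. As a rough sizing aid, the sketch below estimates raw vector storage, assuming float32 values (4 bytes each) and the 1536 dimensions used with text-embedding-3-small in the examples above. The function name is hypothetical, and the estimate excludes index overhead such as HNSW graph links, metadata, and replicas.

```python
def raw_vector_memory_gb(num_vectors: int, dimensions: int = 1536,
                         bytes_per_value: int = 4) -> float:
    """Raw storage for the vectors alone, in gigabytes (1 GB = 10**9 bytes).

    Assumes float32 values; index structures and metadata are not included.
    """
    return num_vectors * dimensions * bytes_per_value / 1e9


for n in (100_000, 1_000_000, 10_000_000):
    print(f"{n:>10,} vectors -> {raw_vector_memory_gb(n):.2f} GB raw")
```

At these settings one million vectors need roughly 6 GB before any index overhead, so treat the result as a lower bound when sizing a machine.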

Switching Between Vector Stores

One of LangChain's advantages is that the vector store interface is largely consistent. Switching stores is mostly about the initialization code, not the rest of your application:

from langchain_core.vectorstores import VectorStore
from langchain_openai import OpenAIEmbeddings

def build_rag_chain(vectorstore: VectorStore):
    """Works with any LangChain vector store that supports MMR search."""
    from langchain_openai import ChatOpenAI
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.runnables import RunnablePassthrough
    
    retriever = vectorstore.as_retriever(
        search_type="mmr",
        search_kwargs={"k": 4, "fetch_k": 20}
    )
    
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    
    prompt = ChatPromptTemplate.from_messages([
        ("system", "Answer using context:\n{context}"),
        ("human", "{question}")
    ])
    
    return (
        {"context": retriever | (lambda docs: "\n\n".join(d.page_content for d in docs)),
         "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )

# Use with any store
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Swap these out without changing build_rag_chain
from langchain_chroma import Chroma
chroma_store = Chroma(persist_directory="./chroma_db", embedding_function=embeddings)
chain = build_rag_chain(chroma_store)

This portability is one of the best reasons to use LangChain's vector store abstractions rather than vendor SDKs directly.
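
The retriever in build_rag_chain uses search_type="mmr" with k=4 and fetch_k=20: the store fetches 20 candidates, then maximal marginal relevance (MMR) picks 4 that are relevant but not redundant. The sketch below shows the idea in plain Python. It is a simplified illustration, not LangChain's implementation; the function names, the toy 2-D vectors, and lambda_mult=0.5 are assumptions.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query: list[float], candidates: list[list[float]],
               k: int, lambda_mult: float = 0.5) -> list[int]:
    """Pick up to k candidate indices, trading relevance against redundancy."""
    selected: list[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i: int) -> float:
            relevance = cosine(query, candidates[i])
            # Redundancy: highest similarity to anything already selected.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy vectors: indices 0 and 1 are near-duplicates, index 2 points elsewhere.
query = [1.0, 0.0]
candidates = [[1.0, 0.1], [1.0, 0.12], [0.8, -0.6]]
picked = mmr_select(query, candidates, k=2)
print(picked)
```

With lambda_mult=1.0 the function falls back to plain relevance ranking and returns the two near-duplicates; with 0.5 it skips the duplicate in favor of the more distinct candidate.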

My Recommendations by Use Case

You're building a prototype or internal tool (< 100K documents): Chroma. Zero setup, good filtering, works on your laptop.

You need production RAG and don't want to manage infrastructure: Pinecone. It's more expensive than self-hosted but you get consistent performance and no ops burden.

You need production RAG and have DevOps resources: Qdrant. Best combination of performance, filtering, hybrid search, and cost for self-hosted deployments.

You already run PostgreSQL and your data is under 1M documents: PGVector. The operational simplicity of staying in one database is worth a lot.

You need multi-modal search (images, audio, text): Weaviate. It's the only option here with native multi-modal support.

You need billions of vectors at high throughput: Milvus or a managed Pinecone enterprise tier.
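
The recommendations above can be condensed into a small helper. This is only a sketch: the function name, its parameters, the order in which the checks run, and the fallback to Chroma are assumptions made for illustration, and real decisions will involve more factors than these.

```python
def recommend_store(num_documents: int, *, managed: bool = False,
                    has_devops: bool = False, uses_postgres: bool = False,
                    multimodal: bool = False) -> str:
    """Map the use cases in this section to a store name.

    Checks run in an assumed priority order; adjust to your own constraints.
    """
    if multimodal:
        return "Weaviate"
    if num_documents >= 1_000_000_000:
        return "Milvus or Pinecone (enterprise tier)"
    if uses_postgres and num_documents < 1_000_000:
        return "PGVector"
    if num_documents < 100_000 and not managed and not has_devops:
        return "Chroma"
    if managed:
        return "Pinecone"
    if has_devops:
        return "Qdrant"
    # No strong constraint given: start simple.
    return "Chroma"


print(recommend_store(50_000))                      # prototype
print(recommend_store(5_000_000, has_devops=True))  # self-hosted production
```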

Conclusion

Pair this guide with the LangChain advanced RAG strategies post to get the most out of whichever store you choose.

Frequently Asked Questions

Which vector store is best for a beginner LangChain RAG project?

Start with Chroma. It requires zero infrastructure setup, runs locally, persists to disk, and integrates with LangChain in three lines. Switch to Qdrant or Pinecone when you outgrow it.

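Can I switch vector stores without rewriting my LangChain code?

Mostly yes. LangChain's vector store interface is standardized: the similarity_search(), as_retriever(), and add_documents() methods work the same across providers. You'll need to re-embed your documents for the new store, but your chain code changes minimally.

Does FAISS work for production RAG systems?

FAISS works well for production if your data fits in RAM and you don't need real-time updates or multi-server deployments. Many companies run FAISS in production for read-heavy workloads with periodic batch rebuilds. It's not ideal for dynamic datasets that update frequently.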
