Vector Database Guide: Pinecone, Weaviate, Chroma, and pgvector Compared

Vector database guide 2025 — compare Pinecone, Weaviate, Chroma, pgvector and Qdrant by features, performance, cost, and use cases for production AI applications.

AiTechWorlds Team
May 27, 2026 8 min read

The first production RAG system I shipped used Chroma. By month three, it was too slow for our growing dataset. I migrated to Pinecone. Six months later, we were paying $400/month for what PostgreSQL with pgvector could handle for $20.

Every vector database has a sweet spot. Understanding the tradeoffs before you commit to one saves significant pain. This guide covers the major options with honest assessments of where each wins and loses.


The Vector Database Landscape

Category 1: Managed Cloud (zero ops)
  - Pinecone: easiest, most mature managed offering
  - Weaviate Cloud: managed Weaviate

Category 2: Self-Hosted Open-Source
  - Qdrant: Rust-based, excellent performance, great docs
  - Weaviate: full-featured, multimodal, active community
  - Milvus: enterprise-grade, complex but scalable

Category 3: Embedded (no separate server)
  - Chroma: Python-first, great for development
  - LanceDB: columnar storage, good for local use

Category 4: PostgreSQL Extensions
  - pgvector: vector search in existing PostgreSQL
  - pg_embedding: Neon's HNSW extension, since deprecated in favor of pgvector

Category 5: Search Platforms with Vector Support
  - Elasticsearch: full-text + vector hybrid
  - OpenSearch: AWS's Elasticsearch fork

Comparison Matrix

Database              Type         Scale      Hybrid Search   Setup      Cost
Pinecone              Managed      10M-1B+    Native          Minutes    $70+/mo
Weaviate Cloud        Managed      1M-100M+   Native          Minutes    $0-$25+/mo
Qdrant Cloud          Managed      1M-100M    Native          Minutes    $0-$50+/mo
Chroma                Embedded     <1M                        Seconds    Free
pgvector              Extension    <10M       Via PostgreSQL  Minutes    Existing DB
Qdrant (self-host)    Self-hosted  1M-1B      Native          1 hour     Infra cost
Weaviate (self-host)  Self-hosted  1M-1B      Native          1-2 hours  Infra cost

Chroma: Development Default

# pip install chromadb

import chromadb
from chromadb.utils import embedding_functions

# In-memory client (lost when process ends)
client = chromadb.Client()

# Persistent client (replaces the in-memory client above; pick one or the other)
client = chromadb.PersistentClient(path="./chroma_db")

# Create collection with embedding function
openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key="your-key",
    model_name="text-embedding-3-small"
)

collection = client.get_or_create_collection(
    name="documents",
    embedding_function=openai_ef,
    metadata={"hnsw:space": "cosine"}  # Distance metric
)

# Add documents (Chroma handles embedding)
collection.add(
    documents=[
        "The quick brown fox jumps over the lazy dog.",
        "Machine learning is a subset of artificial intelligence.",
        "Python is a versatile programming language.",
    ],
    metadatas=[
        {"source": "sample.txt", "category": "text"},
        {"source": "ml_intro.txt", "category": "ai"},
        {"source": "python_intro.txt", "category": "programming"},
    ],
    ids=["doc1", "doc2", "doc3"]
)

# Query
results = collection.query(
    query_texts=["What is AI?"],
    n_results=2,
    where={"category": "ai"}  # Metadata filtering
)

for doc, meta, distance in zip(
    results["documents"][0],
    results["metadatas"][0],
    results["distances"][0]
):
    print(f"Distance: {distance:.3f} | Source: {meta['source']}")
    print(f"Content: {doc[:100]}\n")

# Update and delete
collection.update(ids=["doc1"], documents=["Updated text content"])
collection.delete(ids=["doc1"])
print(f"Collection count: {collection.count()}")

Pinecone: Managed Production

# pip install pinecone openai
# (the Pinecone SDK was formerly published as pinecone-client)

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-pinecone-api-key")

# Create index
if "my-index" not in [idx.name for idx in pc.list_indexes()]:
    pc.create_index(
        name="my-index",
        dimension=1536,  # Must match your embedding model
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1")
    )

index = pc.Index("my-index")

# Upsert vectors (with metadata)
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> list[list[float]]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts
    )
    return [item.embedding for item in response.data]

documents = [
    {"id": "doc1", "text": "Introduction to machine learning concepts."},
    {"id": "doc2", "text": "Python data structures and algorithms."},
    {"id": "doc3", "text": "Deep learning with PyTorch tutorial."},
]

texts = [d["text"] for d in documents]
embeddings = embed(texts)

# Batch upsert
vectors = [
    {
        "id": doc["id"],
        "values": emb,
        "metadata": {"text": doc["text"], "category": "tech"}
    }
    for doc, emb in zip(documents, embeddings)
]

index.upsert(vectors=vectors, namespace="production")

# Query
query_embedding = embed(["how does machine learning work?"])[0]

results = index.query(
    namespace="production",
    vector=query_embedding,
    top_k=3,
    include_values=False,
    include_metadata=True,
    filter={"category": {"$eq": "tech"}}  # Metadata filter
)

for match in results.matches:
    print(f"Score: {match.score:.3f} | ID: {match.id}")
    print(f"Text: {match.metadata['text']}\n")

# Index statistics
stats = index.describe_index_stats()
print(f"Total vectors: {stats.total_vector_count}")
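
The upsert above sends every vector in a single request, which is fine for three documents but tends to hit request size limits on real datasets. A minimal sketch of a batching helper follows; the name `chunked` and the batch size of 100 are assumptions for illustration, so check the limits that apply to your plan.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[list[T]]:
    """Yield successive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    # islice pulls up to `size` items each pass; an empty list ends the loop
    while batch := list(islice(it, size)):
        yield batch


# Hypothetical usage with the `index` and `vectors` objects from the example above:
# for batch in chunked(vectors, 100):
#     index.upsert(vectors=batch, namespace="production")
```

Because the helper accepts any iterable, it also works when the vectors are produced lazily by a generator rather than held in a list.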

pgvector: PostgreSQL Integration

# pip install psycopg2-binary pgvector numpy

import psycopg2
from pgvector.psycopg2 import register_vector
import numpy as np

# Connect to PostgreSQL
conn = psycopg2.connect("postgresql://user:password@localhost/mydb")
cur = conn.cursor()

# Enable the extension first: register_vector() looks up the vector type
# and fails if the extension has not been created yet
cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
conn.commit()
register_vector(conn)

# Create table

cur.execute("""
    CREATE TABLE IF NOT EXISTS documents (
        id SERIAL PRIMARY KEY,
        content TEXT NOT NULL,
        category VARCHAR(50),
        embedding vector(1536)  -- Match your model's dimensions
    )
""")

# Create HNSW index for fast approximate search
cur.execute("""
    CREATE INDEX IF NOT EXISTS documents_embedding_idx
    ON documents USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64)
""")

conn.commit()

# Insert with embeddings
def insert_document(content: str, category: str, embedding: list[float]):
    cur.execute(
        "INSERT INTO documents (content, category, embedding) VALUES (%s, %s, %s)",
        (content, category, np.array(embedding))
    )
    conn.commit()

# Similarity search
def search(query_embedding: list[float], top_k: int = 5, category: str | None = None):
    if category:
        cur.execute("""
            SELECT content, category, 1 - (embedding <=> %s) AS similarity
            FROM documents
            WHERE category = %s
            ORDER BY embedding <=> %s
            LIMIT %s
        """, (np.array(query_embedding), category, np.array(query_embedding), top_k))
    else:
        cur.execute("""
            SELECT content, category, 1 - (embedding <=> %s) AS similarity
            FROM documents
            ORDER BY embedding <=> %s
            LIMIT %s
        """, (np.array(query_embedding), np.array(query_embedding), top_k))
    
    return cur.fetchall()

# Hybrid search (semantic + keyword)
def hybrid_search(query: str, query_embedding: list[float], top_k: int = 5):
    cur.execute("""
        SELECT content, category,
               ts_rank(to_tsvector(content), plainto_tsquery(%s)) AS keyword_score,
               1 - (embedding <=> %s) AS semantic_score,
               -- Combine with equal weights (a heuristic: ts_rank is not normalized to 0-1)
               (ts_rank(to_tsvector(content), plainto_tsquery(%s)) * 0.5 +
                (1 - (embedding <=> %s)) * 0.5) AS combined_score
        FROM documents
        WHERE to_tsvector(content) @@ plainto_tsquery(%s)
           OR (1 - (embedding <=> %s)) > 0.7
        ORDER BY combined_score DESC
        LIMIT %s
    """, (query, np.array(query_embedding), query, np.array(query_embedding), 
          query, np.array(query_embedding), top_k))
    
    return cur.fetchall()
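
The queries above rely on `<=>`, pgvector's cosine distance operator, and convert it to a similarity with `1 - distance`. The pure-Python sketch below shows the same arithmetic outside the database; the function name is an assumption for illustration and it expects non-zero vectors.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance: 0 for the same direction, 1 for orthogonal, 2 for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query = [1.0, 0.0]
for label, vec in [("same", [2.0, 0.0]), ("orthogonal", [0.0, 3.0]), ("opposite", [-1.0, 0.0])]:
    distance = cosine_distance(query, vec)
    similarity = 1.0 - distance  # what the SELECT lists above call "similarity"
    print(f"{label}: distance={distance:.1f} similarity={similarity:.1f}")
```

Since distance ranges from 0 to 2, the derived similarity ranges from 1 down to -1, which is worth keeping in mind when picking a cutoff such as the 0.7 used in hybrid_search.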

Qdrant: High-Performance Self-Hosted

# pip install qdrant-client

from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, VectorParams, PointStruct,
    Filter, FieldCondition, MatchValue
)

# Connect to local Qdrant (docker run -p 6333:6333 qdrant/qdrant)
client = QdrantClient(host="localhost", port=6333)

# Or cloud
# client = QdrantClient(url="https://your-cluster.qdrant.io", api_key="your-key")

# Create collection (recreate_collection drops any existing collection with this name)
client.recreate_collection(
    collection_name="articles",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)

# Upsert points
points = [
    PointStruct(
        id=1,
        vector=[0.1] * 1536,  # Your actual embedding
        payload={"text": "Article about ML", "category": "ai", "views": 1500}
    ),
    PointStruct(
        id=2,
        vector=[0.2] * 1536,
        payload={"text": "Python tutorial", "category": "programming", "views": 3000}
    ),
]

client.upsert(collection_name="articles", points=points)

# Search with filters
results = client.search(
    collection_name="articles",
    query_vector=[0.1] * 1536,
    limit=5,
    query_filter=Filter(
        must=[
            FieldCondition(key="category", match=MatchValue(value="ai")),
            FieldCondition(key="views", range={"gte": 1000})
        ]
    ),
    with_payload=True
)

for r in results:
    print(f"Score: {r.score:.3f} | Text: {r.payload['text']}")

Choosing the Right Database

Decision Tree:

Q: Is this for development/prototyping?
  YES → Chroma (zero setup, Python-native)

Q: Do you already use PostgreSQL?
  YES → pgvector (no new infrastructure)

Q: Do you need zero ops managed cloud?
  YES → Pinecone (easiest) or Weaviate Cloud

Q: Do you need maximum performance + self-hosted?
  YES → Qdrant (best performance/cost ratio for self-hosted)

Q: Do you need multimodal (images + text) vectors?
  YES → Weaviate (native multimodal support)

Q: Do you need full-text + vector hybrid (existing Elasticsearch)?
  YES → Stay in Elasticsearch with kNN

Scale thresholds:
  < 100K vectors: any option works, choose by ops preference
  100K-10M: Qdrant, Pinecone, Weaviate all scale fine
  10M+: Pinecone or Qdrant with proper instance sizing
  1B+: Pinecone, Milvus, or Elasticsearch
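
The decision tree above reads as a first-match rule list, which can be sketched as a small function. The field names and the final fallback (taken from the conclusion's "Qdrant or pgvector in production") are assumptions for illustration, not part of any library.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical flags mirroring the questions in the decision tree."""
    prototyping: bool = False
    uses_postgres: bool = False
    wants_managed: bool = False
    needs_self_hosted_performance: bool = False
    needs_multimodal: bool = False
    uses_elasticsearch: bool = False


def choose_database(req: Requirements) -> str:
    """Return the first recommendation whose question is answered YES."""
    if req.prototyping:
        return "Chroma"
    if req.uses_postgres:
        return "pgvector"
    if req.wants_managed:
        return "Pinecone or Weaviate Cloud"
    if req.needs_self_hosted_performance:
        return "Qdrant"
    if req.needs_multimodal:
        return "Weaviate"
    if req.uses_elasticsearch:
        return "Elasticsearch with kNN"
    # Assumed fallback when no question matches
    return "Qdrant or pgvector"


print(choose_database(Requirements(uses_postgres=True)))
```

Order matters here: a team that is prototyping and also runs PostgreSQL gets Chroma first, matching the order in which the questions are asked.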

Conclusion

Vector databases have matured rapidly. For most applications, Chroma during development and Qdrant or pgvector in production cover 90% of use cases at a fraction of managed service costs.

Pinecone wins on operational simplicity — no servers, no maintenance, scales automatically. If your team's time is worth more than the $70+/month difference versus self-hosting, it's often the right choice.

For the retrieval application layer that uses these databases, see our RAG system tutorial. For understanding the embeddings stored in these databases, see our embeddings explained guide.


Frequently Asked Questions

What is a vector database?

A vector database stores embedding vectors and supports fast similarity search using ANN (approximate nearest neighbor) algorithms like HNSW. Unlike SQL databases that find exact matches, vector databases find the "most similar" embeddings in milliseconds across millions of vectors. They are essential for RAG, semantic search, and recommendation systems.
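
To make "most similar" concrete, here is a minimal sketch of the exact search that ANN indexes approximate: score every stored vector against the query and keep the best k. The three-dimensional vectors and document ids are made up for illustration; real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def exact_top_k(query: list[float], vectors: dict[str, list[float]], k: int = 2):
    """Brute-force search: compare the query with every vector, O(n) per query."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(doc_id, score) for score, doc_id in scored[:k]]


# Toy data, invented for this example
vectors = {
    "doc1": [0.9, 0.1, 0.0],
    "doc2": [0.1, 0.9, 0.0],
    "doc3": [0.0, 0.2, 0.9],
}
for doc_id, score in exact_top_k([1.0, 0.0, 0.0], vectors, k=2):
    print(f"{doc_id}: {score:.3f}")
```

The linear scan is exact but its cost grows with the collection size, which is the gap that HNSW and IVF indexes close by trading a little recall for speed.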

Which vector database should I use?

Development: Chroma. Existing PostgreSQL: pgvector. Managed cloud: Pinecone (easiest) or Qdrant Cloud. Self-hosted high-performance: Qdrant. Full-featured with multimodal: Weaviate. Start simple — don't over-engineer for scale you don't have yet.

What is the difference between HNSW and IVF indexing?

HNSW: graph-based, fastest queries (~1ms), memory-intensive, modern default. IVF: cluster-based, more memory-efficient, slightly slower, better for very large datasets on limited RAM. HNSW is the right choice for most use cases under 50M vectors.
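
The cluster-based idea behind IVF can be sketched in a few lines of plain Python: assign each vector to its nearest centroid, then search only the lists closest to the query. This is a toy illustration under stated assumptions (the centroids are given rather than learned, and `nprobe` follows common IVF terminology); it is not how any particular database implements it, and HNSW's graph traversal is not shown.

```python
import math


def l2(a: list[float], b: list[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_ivf(vectors: dict[str, list[float]], centroids: list[list[float]]):
    """Put each vector id into the inverted list of its nearest centroid."""
    lists: dict[int, list[str]] = {i: [] for i in range(len(centroids))}
    for doc_id, vec in vectors.items():
        nearest = min(range(len(centroids)), key=lambda i: l2(vec, centroids[i]))
        lists[nearest].append(doc_id)
    return lists


def ivf_search(query, vectors, centroids, lists, k=1, nprobe=1):
    """Scan only the `nprobe` lists whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda i: l2(query, centroids[i]))[:nprobe]
    candidates = [doc_id for i in probe for doc_id in lists[i]]
    candidates.sort(key=lambda doc_id: l2(query, vectors[doc_id]))
    return candidates[:k]


# Toy data: two well-separated clusters
vectors = {
    "a1": [0.0, 0.1], "a2": [0.2, 0.0],
    "b1": [5.0, 5.1], "b2": [5.2, 4.9],
}
centroids = [[0.0, 0.0], [5.0, 5.0]]  # assumed precomputed, e.g. by k-means
lists = build_ivf(vectors, centroids)
print(ivf_search([0.1, 0.1], vectors, centroids, lists, k=1, nprobe=1))
```

Raising `nprobe` scans more lists, which improves recall at the cost of speed; setting it to the number of centroids degenerates into an exact scan.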

What is hybrid search in vector databases?

Hybrid search combines dense vector search (semantic similarity) with sparse BM25 keyword search. Dense finds semantically related content; sparse finds exact keywords and proper nouns. Hybrid consistently outperforms either alone by 5-20%. Use it for production RAG; it is supported natively by Weaviate, Pinecone, and Qdrant.
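
One common way to merge the dense and sparse result lists is reciprocal rank fusion (RRF), which uses only each document's rank and so avoids mixing scores on different scales. The sketch below is an illustration of RRF, not the weighted-sum approach used in the pgvector example; the document ids are invented and `k=60` is the constant conventionally used with this technique.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked id lists: each appearance adds 1 / (k + rank) to a document's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical result lists, best match first
keyword_hits = ["doc3", "doc1", "doc7"]
vector_hits = ["doc1", "doc5", "doc3"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that appear in both lists accumulate two contributions, so they rise above documents that only one retriever found.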

How do I choose the right vector dimension?

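Your embedding model determines the dimension; you generally can't choose it independently. OpenAI text-embedding-3-small = 1536; text-embedding-3-large = 3072; all-MiniLM = 384 (the text-embedding-3 models also accept a dimensions parameter that shortens the output). Higher dimensions generally mean better quality but higher storage and query cost. 768-1536 is sufficient for most RAG applications.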
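
The cost side of that tradeoff is easy to estimate: raw storage is the number of vectors times the dimension times the bytes per value. The sketch below assumes 4-byte float32 values and ignores index and metadata overhead, which varies by database.

```python
def raw_vector_bytes(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Raw storage for the vectors alone (float32 by default), excluding index overhead."""
    return num_vectors * dimensions * bytes_per_value


models = [
    ("all-MiniLM", 384),
    ("text-embedding-3-small", 1536),
    ("text-embedding-3-large", 3072),
]
for name, dims in models:
    gib = raw_vector_bytes(1_000_000, dims) / 2**30
    print(f"{name}: {gib:.2f} GiB per million vectors")
```

A million 1536-dimensional vectors come to roughly 6 GB before any index is built, and doubling the dimension doubles that figure.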
