How to Use LangChain with Weaviate (Hybrid Search 2026)

⚡ Quick Answer

Connect LangChain to Weaviate for hybrid vector and keyword search. Covers local and cloud setup, nearText, BM25, metadata filtering, and a comparison table.

AiTechWorlds Team May 31, 2026 11 min read

#LangChain #Weaviate #hybrid search #BM25 #vector database

Most vector database tutorials stop at cosine similarity. You embed your query, find the nearest neighbors, and call it done. That works reasonably well until you hit a common failure mode: a user types an exact product code, a legal citation, or a technical acronym that the embedding model has never seen. The semantic search returns loosely related results when what the user wanted was an exact keyword match.

Weaviate's hybrid search solves this by running both a vector search and a BM25 keyword search simultaneously, then blending the scores. The result is a retrieval system that is both semantically aware and keyword-precise. This guide shows you how to wire it up with LangChain, starting from a local Docker instance and ending with a production-ready pattern.
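
To make the keyword side concrete, here is a minimal, self-contained sketch of BM25 scoring over a toy corpus. The tokenizer, the `k1` and `b` defaults, and the three sample documents are assumptions chosen for illustration; Weaviate's own BM25 implementation differs in tokenization and indexing details.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Keep hyphenated identifiers such as "e-4021-b" together as one token
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.2, b: float = 0.75) -> list[float]:
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = 1 - b + b * len(toks) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


toy_docs = [
    "Error E-4021-B occurs when the network interface times out.",
    "General advice for diagnosing slow network connections.",
    "Resetting the admin password from the console.",
]
toy_scores = bm25_scores("E-4021-B", toy_docs)
best = max(range(len(toy_docs)), key=toy_scores.__getitem__)
print(best, [round(s, 3) for s in toy_scores])
```

Only the document that literally contains the code receives a non-zero score, which is the behavior an embedding model cannot guarantee for an identifier it has never seen.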

What Makes Weaviate Different

Before diving into code, it is worth understanding what Weaviate brings to the table compared to simpler vector stores. The vector database guide covers the landscape, but Weaviate specifically offers:

Native hybrid search: BM25 + vector in a single query, not two separate requests merged in Python
Multi-tenancy: isolate collections per user or tenant without running separate instances
Generative search: pipe retrieval results directly to an LLM in a single Weaviate query
HNSW + Product Quantization: memory-efficient indexing for large collections
Schema flexibility: optional strict schema or auto-schema creation

Industry data from the 2025 Weaviate benchmark report shows hybrid search achieving 18-23% higher NDCG@10 scores compared to pure vector search on enterprise document collections with mixed content types.
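
For readers unfamiliar with the metric, NDCG@10 rewards rankings that place relevant documents near the top. The sketch below computes it for two hypothetical rankings; the relevance labels are made up for illustration and are not taken from any benchmark. As a simplification, the ideal ordering is derived from the retrieved list itself rather than from a full set of judged documents.

```python
import math


def dcg_at_k(relevances: list[float], k: int) -> float:
    # Each relevance grade is discounted by the log of its (1-based) rank + 1
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: list[float], k: int = 10) -> float:
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance labels (1 = relevant, 0 = not) in ranked order
vector_only_ranking = [0, 1, 0, 1, 0]
hybrid_ranking = [1, 1, 0, 0, 0]

print(f"vector only: {ndcg_at_k(vector_only_ranking):.3f}")
print(f"hybrid:      {ndcg_at_k(hybrid_ranking):.3f}")
```

Both lists contain the same two relevant documents; the second scores higher only because it ranks them first.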

Environment Setup

Start a local Weaviate instance with Docker:

docker run -d \
  -p 8080:8080 \
  -p 50051:50051 \
  --name weaviate \
  -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
  -e PERSISTENCE_DATA_PATH=/var/lib/weaviate \
  -e ENABLE_MODULES="text2vec-openai,generative-openai" \
  -e OPENAI_APIKEY=$OPENAI_API_KEY \
  cr.weaviate.io/semitechnologies/weaviate:1.25.0

Install Python dependencies:

pip install langchain-weaviate weaviate-client langchain-openai langchain-community langchain-text-splitters pypdf

Connecting LangChain to Weaviate

import weaviate
from weaviate.classes.init import Auth
from langchain_weaviate.vectorstores import WeaviateVectorStore
from langchain_openai import OpenAIEmbeddings

# Local connection
client = weaviate.connect_to_local(
    host="localhost",
    port=8080,
    grpc_port=50051
)

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Create or connect to a collection
vector_store = WeaviateVectorStore(
    client=client,
    index_name="DocumentChunk",
    text_key="content",
    embedding=embeddings,
    attributes=["source", "doc_type", "created_at", "chunk_id"]
)

print("Connected to Weaviate:", client.is_ready())

For Weaviate Cloud Services (WCS):

import weaviate
from weaviate.classes.init import Auth

# Cloud connection
client = weaviate.connect_to_weaviate_cloud(
    cluster_url="https://your-cluster.weaviate.network",
    auth_credentials=Auth.api_key("your-weaviate-api-key"),
    headers={"X-OpenAI-Api-Key": "your-openai-api-key"}
)

vector_store = WeaviateVectorStore(
    client=client,
    index_name="DocumentChunk",
    text_key="content",
    embedding=embeddings
)

The connection swap is the only change between local and cloud — the rest of your code stays identical.

Ingesting Documents

from langchain_community.document_loaders import PyPDFLoader  # requires the pypdf package
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load and split documents
loader = PyPDFLoader("technical_manual.pdf")
documents = loader.load()

splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=100,
    separators=["\n\n", "\n", ". ", " ", ""]
)
chunks = splitter.split_documents(documents)

# Add metadata to each chunk
for i, chunk in enumerate(chunks):
    chunk.metadata.update({
        "chunk_id": f"chunk_{i:04d}",
        "doc_type": "technical_manual",
        "created_at": "2026-05-31"
    })

# Ingest with auto-generated embeddings
ids = vector_store.add_documents(chunks)
print(f"Ingested {len(ids)} chunks into Weaviate")

Pure Vector Search with nearText

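# similarity_search runs a Weaviate hybrid query under the hood and forwards
# extra keyword arguments to it, so alpha=1.0 restricts it to the vector side
query = "How do I configure the network interface?"
results = vector_store.similarity_search(query, k=5, alpha=1.0)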

for doc in results:
    print(f"Source: {doc.metadata.get('source', 'unknown')}")
    print(f"Content: {doc.page_content[:200]}")
    print("---")

With scores:

results_with_scores = vector_store.similarity_search_with_score(query, k=5, alpha=1.0)

for doc, score in results_with_scores:
    print(f"Score: {score:.4f} | {doc.page_content[:150]}")

The score here is the relevance score that Weaviate's hybrid query returns, so higher is better. It is relative to the current result set, which makes it useful for comparing results within one query but not as an absolute relevance threshold. For semantic concepts this works well, but notice what happens with exact product codes:

# This often fails with pure vector search (alpha=1.0)
exact_query = "Error code E-4021-B network timeout"
results = vector_store.similarity_search(exact_query, k=3, alpha=1.0)
# Results may be thematically related but miss the exact code

Hybrid Search: Combining nearText and BM25

This is where Weaviate's hybrid search shines. The alpha parameter controls the blend:

alpha=1.0 → pure vector search
alpha=0.0 → pure BM25 keyword search
alpha=0.5 → equal weight to both (recommended starting point)
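
The blend can be pictured with a small pure-Python sketch. It normalizes each score list to the 0 to 1 range and then takes an alpha-weighted sum, which is a simplified view of relative score fusion. The function names and sample scores are assumptions for illustration, not Weaviate's internal code.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    # Rescale scores to [0, 1] so vector and BM25 values become comparable
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def blend(vector_scores: dict[str, float], keyword_scores: dict[str, float], alpha: float):
    v, kw = min_max(vector_scores), min_max(keyword_scores)
    # A document missing from one list contributes 0 from that side
    fused = {
        doc_id: alpha * v.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(v) | set(kw)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Made-up scores: "e4021b" is the chunk that contains the exact error code
sample_vector = {"overview": 0.82, "timeouts": 0.80, "e4021b": 0.70, "password": 0.40}
sample_bm25 = {"e4021b": 7.1, "timeouts": 2.3, "overview": 0.4}

for alpha in (1.0, 0.5, 0.0):
    top_id, top_score = blend(sample_vector, sample_bm25, alpha)[0]
    print(f"alpha={alpha}: top result = {top_id} ({top_score:.3f})")
```

With these sample numbers the exact-match chunk only reaches the top once the keyword side carries some weight.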

# Hybrid search via LangChain retriever
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={
        "k": 10,
        "alpha": 0.5  # 50/50 blend of vector and BM25
    }
)

# Query that benefits from hybrid: has both semantic and keyword components
query = "Error code E-4021-B network timeout troubleshooting steps"
hybrid_results = retriever.invoke(query)

for doc in hybrid_results:
    print(doc.page_content[:200])
    print("---")

For direct Weaviate hybrid queries with full control:

from weaviate.classes.query import HybridFusion

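# Using the Weaviate client directly for maximum control
collection = client.collections.get("DocumentChunk")

response = collection.query.hybrid(
    query=query,
    # Embed client-side: a collection created through LangChain may have no
    # server-side vectorizer, in which case a text-only hybrid query fails
    vector=embeddings.embed_query(query),
    alpha=0.5,
    fusion_type=HybridFusion.RELATIVE_SCORE,  # or RANKED
    limit=10,
    return_properties=["content", "source", "doc_type"],
    return_metadata=["score", "explain_score"]
)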

for obj in response.objects:
    print(f"Score: {obj.metadata.score:.4f}")
    print(f"Content: {obj.properties['content'][:200]}")
    print(f"Explanation: {obj.metadata.explain_score}")
    print("---")

The explain_score field is invaluable for debugging — it tells you which component (vector or BM25) contributed how much to each result's final score.
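
The RANKED option mentioned in the query above fuses results by position rather than by score magnitude. Here is a simplified sketch in the reciprocal-rank style; the constant 60, the function name, and the sample rankings are assumptions for illustration and not Weaviate's exact implementation.

```python
def ranked_fusion(vector_ranking: list[str], keyword_ranking: list[str],
                  alpha: float, k: int = 60) -> list[tuple[str, float]]:
    fused: dict[str, float] = {}
    # Each list contributes weight / (k + rank); only positions matter
    for weight, ranking in ((alpha, vector_ranking), (1 - alpha, keyword_ranking)):
        for rank, doc_id in enumerate(ranking):
            fused[doc_id] = fused.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Made-up rankings, best result first
vector_ranking = ["overview", "timeouts", "e4021b"]
keyword_ranking = ["e4021b", "timeouts"]

fused = ranked_fusion(vector_ranking, keyword_ranking, alpha=0.5)
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Because rank-based fusion ignores how far apart the raw scores are, a runaway BM25 score cannot dominate the result the way it can with score-based fusion.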

Metadata Filtering

One of Weaviate's strengths is combining hybrid search with metadata filters. You get semantic + keyword matching constrained to a specific subset of your data.

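from weaviate.classes.query import Filter, MetadataQuery

# Filter by doc_type then hybrid search within that subset
filter_query = "network configuration error"
response = collection.query.hybrid(
    query=filter_query,
    # Client-side embedding, in case the collection has no server-side vectorizer
    vector=embeddings.embed_query(filter_query),
    alpha=0.6,
    limit=5,
    filters=Filter.by_property("doc_type").equal("technical_manual"),
    # Scores are only returned when requested; otherwise obj.metadata.score is None
    return_metadata=MetadataQuery(score=True)
)

for obj in response.objects:
    print(f"DocType: {obj.properties.get('doc_type')} | Score: {obj.metadata.score:.4f}")
    print(obj.properties["content"][:150])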

Complex filter combinations:

from weaviate.classes.query import Filter

# Multiple filter conditions
compound_filter = (
    Filter.by_property("doc_type").equal("technical_manual") &
    Filter.by_property("created_at").greater_than("2026-01-01")
)

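compound_query = "installation requirements"
response = collection.query.hybrid(
    query=compound_query,
    # Client-side embedding, in case the collection has no server-side vectorizer
    vector=embeddings.embed_query(compound_query),
    alpha=0.5,
    filters=compound_filter,
    limit=8
)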

Via LangChain's interface:

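from weaviate.classes.query import Filter

results = vector_store.similarity_search(
    query="installation requirements",
    k=8,
    # Extra keyword arguments are forwarded to the client's hybrid() call,
    # so filters take the same Filter objects used with the native client
    filters=Filter.by_property("doc_type").equal("technical_manual")
)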

Building a Hybrid RAG Chain

Now let's integrate hybrid search into a full RAG pipeline:

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Hybrid retriever with tuned alpha
hybrid_retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={
        "k": 6,
        "alpha": 0.55  # Slightly favor vector for most use cases
    }
)

prompt = ChatPromptTemplate.from_template("""
You are a technical support specialist. Use the following document excerpts to answer the question accurately.

Context:
{context}

Question: {question}

Provide a clear, step-by-step answer. If the information is not in the context, say so explicitly.
""")

def format_docs(docs):
    return "\n\n".join(
        f"[Source: {doc.metadata.get('source', 'unknown')}]\n{doc.page_content}"
        for doc in docs
    )

hybrid_rag_chain = (
    {"context": hybrid_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Test with both semantic and keyword-heavy queries
response = hybrid_rag_chain.invoke("What are the steps to resolve error E-4021-B?")
print(response)

response = hybrid_rag_chain.invoke("Explain the general approach to network troubleshooting")
print(response)

This pairs naturally with the patterns in the RAG system tutorial and builds on what you learn in the LangChain tutorial 2025.

Multi-Tenant Search

Weaviate's multi-tenancy feature is useful when you host a document search service for multiple customers and need strict data isolation.

from weaviate.classes.config import Configure, Property, DataType
from weaviate.classes.tenants import Tenant

# Create a multi-tenant collection
client.collections.create(
    name="CustomerDocuments",
    multi_tenancy_config=Configure.multi_tenancy(enabled=True),
    properties=[
        Property(name="content", data_type=DataType.TEXT),
        Property(name="doc_type", data_type=DataType.TEXT),
        Property(name="source", data_type=DataType.TEXT),
    ]
)

# Add tenants
collection = client.collections.get("CustomerDocuments")
collection.tenants.create([
    Tenant(name="acme_corp"),
    Tenant(name="globex_inc"),
])

# Ingest data for a specific tenant
acme_collection = collection.with_tenant("acme_corp")
policy_text = "ACME Corp internal network policy document content here..."
acme_collection.data.insert(
    properties={
        "content": policy_text,
        "doc_type": "policy",
        "source": "acme_intranet"
    },
    # No vectorizer is configured on this collection, so supply the embedding
    vector=embeddings.embed_query(policy_text)
)

# Query is automatically isolated to the tenant
tenant_query = "network policy"
response = acme_collection.query.hybrid(
    query=tenant_query,
    vector=embeddings.embed_query(tenant_query),
    alpha=0.5,
    limit=5
)

With LangChain, you can create a tenant-scoped vector store per user request:

def get_tenant_retriever(tenant_id: str) -> object:
    """Return a retriever scoped to a specific tenant."""
    tenant_store = WeaviateVectorStore(
        client=client,
        index_name="CustomerDocuments",
        text_key="content",
        embedding=embeddings,
        use_multi_tenancy=True
    )
    # The tenant is passed with each search call, not to the constructor
    return tenant_store.as_retriever(
        search_kwargs={"k": 5, "alpha": 0.5, "tenant": tenant_id}
    )

# Per-request retriever creation
acme_retriever = get_tenant_retriever("acme_corp")
globex_retriever = get_tenant_retriever("globex_inc")

This isolation model is critical for enterprise deployments. The Build AI agent with LangChain guide covers how to route requests to tenant-specific components.
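
One way to avoid rebuilding a retriever on every request is a small per-tenant cache with an allow-list. The sketch below is a hypothetical helper, not part of LangChain or Weaviate; it uses a stand-in factory so it runs without a database, and in the setup above you would pass get_tenant_retriever as the factory instead.

```python
from typing import Callable, Dict, Generic, Set, TypeVar

R = TypeVar("R")


class TenantRetrieverCache(Generic[R]):
    """Builds one retriever per tenant on first use and reuses it afterwards."""

    def __init__(self, factory: Callable[[str], R], allowed_tenants: Set[str]) -> None:
        self._factory = factory
        self._allowed = set(allowed_tenants)
        self._cache: Dict[str, R] = {}

    def get(self, tenant_id: str) -> R:
        # Reject unknown tenants before touching the vector store
        if tenant_id not in self._allowed:
            raise PermissionError(f"Unknown tenant: {tenant_id}")
        if tenant_id not in self._cache:
            self._cache[tenant_id] = self._factory(tenant_id)
        return self._cache[tenant_id]


# Stand-in factory that records how often it is called
calls = []


def fake_factory(tenant_id: str) -> str:
    calls.append(tenant_id)
    return f"retriever-for-{tenant_id}"


cache = TenantRetrieverCache(fake_factory, {"acme_corp", "globex_inc"})
first = cache.get("acme_corp")
second = cache.get("acme_corp")
print(first, len(calls))
```

Validating the tenant ID against an allow-list before building anything keeps a mistyped or forged ID from ever reaching the vector store.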

Alpha Tuning: Finding the Right Blend

The right alpha value depends on your content and query patterns. Here is a systematic approach to tuning:

import statistics

def evaluate_alpha(alpha: float, test_queries: list, ground_truth: list) -> dict:
    """Evaluate retrieval quality at a given alpha value."""
    retriever = vector_store.as_retriever(
        search_kwargs={"k": 5, "alpha": alpha}
    )
    
    hit_rates = []
    for query, relevant_chunks in zip(test_queries, ground_truth):
        results = retriever.invoke(query)
        retrieved_contents = [r.page_content for r in results]
        
        hits = sum(1 for gt in relevant_chunks if any(gt in r for r in retrieved_contents))
        hit_rate = hits / len(relevant_chunks) if relevant_chunks else 0
        hit_rates.append(hit_rate)
    
    return {
        "alpha": alpha,
        "avg_hit_rate": statistics.mean(hit_rates),
        "min_hit_rate": min(hit_rates),
        "max_hit_rate": max(hit_rates)
    }

test_queries = [
    "Error code E-4021-B",
    "How does the authentication system work?",
    "network timeout configuration"
]

ground_truth = [
    ["Error E-4021-B occurs when", "E-4021-B network timeout"],
    ["authentication uses JWT tokens", "the auth flow validates"],
    ["timeout_seconds parameter", "network interface timeout setting"]
]

# Test a range of alpha values
for alpha in [0.0, 0.25, 0.5, 0.75, 1.0]:
    result = evaluate_alpha(alpha, test_queries, ground_truth)
    print(f"Alpha {alpha:.2f}: Hit Rate = {result['avg_hit_rate']:.3f}")
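
Once the sweep has run, you still need a rule for picking a winner. A minimal sketch follows, assuming result dictionaries shaped like the ones evaluate_alpha returns; the numbers are placeholders for illustration, not measurements. Ties on average hit rate are broken by the worst-case query, which is a design choice rather than a requirement.

```python
def best_alpha(results: list[dict]) -> dict:
    # Prefer the highest average hit rate; break ties on the weakest query
    return max(results, key=lambda r: (r["avg_hit_rate"], r["min_hit_rate"]))


# Placeholder numbers, not real measurements
sweep = [
    {"alpha": 0.0, "avg_hit_rate": 0.50, "min_hit_rate": 0.0},
    {"alpha": 0.5, "avg_hit_rate": 0.83, "min_hit_rate": 0.5},
    {"alpha": 0.75, "avg_hit_rate": 0.83, "min_hit_rate": 0.0},
    {"alpha": 1.0, "avg_hit_rate": 0.67, "min_hit_rate": 0.0},
]

winner = best_alpha(sweep)
print(f"Selected alpha: {winner['alpha']}")
```

With only three test queries the differences between alpha values are noisy, so treat the selected value as a starting point and re-run the sweep as your evaluation set grows.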

Weaviate Search Mode Comparison

Search Mode	Semantic Understanding	Exact Keyword Match	Metadata Filter	Speed
nearText (vector only)	Excellent	Poor	Yes	Fast
BM25 (keyword only)	Poor	Excellent	Yes	Very fast
Hybrid (alpha=0.5)	Good	Good	Yes	Fast
Hybrid (alpha=0.75)	Very good	Moderate	Yes	Fast
Hybrid (alpha=0.25)	Moderate	Very good	Yes	Fast
Generative search	Excellent	Good	Yes	Slowest

This mirrors findings in the semantic search tutorial — there is no single best configuration, only best configurations for specific query distributions.
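
If your traffic mixes both kinds of queries, one option is to choose alpha per query. The heuristic below is a hypothetical sketch: it lowers alpha when the query contains something shaped like an identifier (letters, a hyphen, then digits, as in E-4021-B). The regular expression and the two alpha values are assumptions to tune against your own data.

```python
import re

# Matches identifier-like tokens such as "E-4021-B" or "KB-1234"
IDENTIFIER = re.compile(r"\b[A-Za-z]+-\d+(?:-[A-Za-z0-9]+)*\b")


def choose_alpha(query: str, keyword_alpha: float = 0.25, default_alpha: float = 0.55) -> float:
    # Lean on BM25 when the query carries an exact code, otherwise favor vectors
    return keyword_alpha if IDENTIFIER.search(query) else default_alpha


for q in (
    "What are the steps to resolve error E-4021-B?",
    "Explain the general approach to network troubleshooting",
):
    print(f"{choose_alpha(q):.2f}  {q}")
```

The returned value can be passed as the alpha entry in search_kwargs when building a retriever for that request.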

Streaming Results with LangChain

For a better user experience in chat interfaces, stream the RAG response:

from langchain_core.callbacks import StreamingStdOutCallbackHandler

streaming_llm = ChatOpenAI(
    model="gpt-4o-mini",
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()]
)

streaming_chain = (
    {"context": hybrid_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | streaming_llm
    | StrOutputParser()
)

# Tokens stream to stdout as they arrive
for chunk in streaming_chain.stream("How do I reset the admin password?"):
    pass  # StreamingStdOutCallbackHandler handles printing

Weaviate Backend Cleanup

Always close the Weaviate client connection when your application shuts down:

import atexit

@atexit.register
def cleanup():
    client.close()
    print("Weaviate connection closed.")

# Or use as a context manager in scripts:
with weaviate.connect_to_local() as client:
    store = WeaviateVectorStore(
        client=client,
        index_name="DocumentChunk",
        text_key="content",
        embedding=embeddings
    )
    results = store.similarity_search("test query", k=3)
    # Connection closes automatically when the block exits

Combining Hybrid Search with Agent Tools

Wrapping the hybrid retriever as a LangChain tool makes it accessible to agents built with the patterns in Build AI agent with LangChain:

from langchain.tools.retriever import create_retriever_tool

hybrid_search_tool = create_retriever_tool(
    retriever=hybrid_retriever,
    name="hybrid_document_search",
    description=(
        "Search the technical documentation using hybrid vector and keyword search. "
        "Use this for both broad conceptual questions and specific error codes or product names."
    )
)

# Add to an agent
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

agent_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful technical support assistant."),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

agent = create_openai_tools_agent(llm, [hybrid_search_tool], agent_prompt)
agent_executor = AgentExecutor(agent=agent, tools=[hybrid_search_tool], verbose=True)

result = agent_executor.invoke({
    "input": "I'm seeing error E-4021-B, what should I do?",
    "chat_history": []
})
print(result["output"])

This combination of hybrid retrieval and agent orchestration connects directly to what you see in the AI research agent build for more complex retrieval workflows.

Key Takeaways

Weaviate's hybrid search is not a gimmick — it addresses a real failure mode in production RAG systems where users type exact identifiers, codes, and acronyms that embedding models handle poorly. The alpha parameter gives you fine-grained control over the blend, and the metadata filtering capabilities let you constrain searches to relevant subsets of your data.

The LangChain integration wraps Weaviate's native client for common operations while still letting you drop down to that client when you need advanced features like explain_score, RELATIVE_SCORE fusion, or multi-tenancy configuration.

For agents that need reliable document retrieval, the OpenAI API integration guide shows how to manage the embedding costs that come with large Weaviate collections, and the Deploy AI model to production guide covers the infrastructure patterns for running Weaviate at scale.

Frequently Asked Questions

What is hybrid search in Weaviate? Hybrid search combines vector similarity search (nearText or nearVector) with keyword-based BM25 search. Weaviate scores both results and blends them using a configurable alpha parameter, giving you the best of semantic and lexical retrieval.

Do I need a Weaviate Cloud account to follow this tutorial? No. All core examples run against a local Weaviate instance started with Docker. The cloud section shows how to swap the connection string for Weaviate Cloud Services when you are ready to deploy.

How does LangChain's WeaviateVectorStore compare to other vector store integrations? WeaviateVectorStore is one of the richer integrations: it exposes hybrid search, metadata filters, and multi-tenancy through the standard LangChain vector store interface, so you get most of Weaviate's retrieval features without writing raw queries. Index-level settings such as HNSW configuration are still made through the native client.

Langchain

How to Use LangChain with Weaviate (Hybrid Search 2026)

⚡ Quick Answer

Connect LangChain to Weaviate for hybrid vector and keyword search. Covers local and cloud setup, nearText, BM25, metadata filtering, and a comparison table.

AiTechWorlds Team May 31, 2026 11 min read

#LangChain #Weaviate #hybrid search #BM25 #vector database

📚Part of the Langchain guide — explore all Langchain articles→

Share:Facebook Twitter/X LinkedIn Telegram WhatsApp

📱

Get more content like this on Telegram!

Daily AI tips, notes & resources — free

Join Free →

What Makes Weaviate Different

Native hybrid search: BM25 + vector in a single query, not two separate requests merged in Python
Multi-tenancy: isolate collections per user or tenant without running separate instances
Generative search: pipe retrieval results directly to an LLM in a single Weaviate query
HNSW + Product Quantization: memory-efficient indexing for large collections
Schema flexibility: optional strict schema or auto-schema creation

Environment Setup

Start a local Weaviate instance with Docker:

docker run -d \
  -p 8080:8080 \
  -p 50051:50051 \
  --name weaviate \
  -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
  -e PERSISTENCE_DATA_PATH=/var/lib/weaviate \
  -e ENABLE_MODULES="text2vec-openai,generative-openai" \
  -e OPENAI_APIKEY=$OPENAI_API_KEY \
  cr.weaviate.io/semitechnologies/weaviate:1.25.0

Install Python dependencies:

pip install langchain-weaviate weaviate-client langchain-openai langchain-community

Connecting LangChain to Weaviate

import weaviate
from weaviate.classes.init import Auth
from langchain_weaviate.vectorstores import WeaviateVectorStore
from langchain_openai import OpenAIEmbeddings

# Local connection
client = weaviate.connect_to_local(
    host="localhost",
    port=8080,
    grpc_port=50051
)

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Create or connect to a collection
vector_store = WeaviateVectorStore(
    client=client,
    index_name="DocumentChunk",
    text_key="content",
    embedding=embeddings,
    attributes=["source", "doc_type", "created_at", "chunk_id"]
)

print("Connected to Weaviate:", client.is_ready())

For Weaviate Cloud Services (WCS):

import weaviate
from weaviate.classes.init import Auth

# Cloud connection
client = weaviate.connect_to_weaviate_cloud(
    cluster_url="https://your-cluster.weaviate.network",
    auth_credentials=Auth.api_key("your-weaviate-api-key"),
    headers={"X-OpenAI-Api-Key": "your-openai-api-key"}
)

vector_store = WeaviateVectorStore(
    client=client,
    index_name="DocumentChunk",
    text_key="content",
    embedding=embeddings
)

The connection swap is the only change between local and cloud — the rest of your code stays identical.

Ingesting Documents

from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_weaviate.vectorstores import WeaviateVectorStore

# Load and split documents
loader = PyPDFLoader("technical_manual.pdf")
documents = loader.load()

splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=100,
    separators=["\n\n", "\n", ". ", " ", ""]
)
chunks = splitter.split_documents(documents)

# Add metadata to each chunk
for i, chunk in enumerate(chunks):
    chunk.metadata.update({
        "chunk_id": f"chunk_{i:04d}",
        "doc_type": "technical_manual",
        "created_at": "2026-05-31"
    })

# Ingest with auto-generated embeddings
ids = vector_store.add_documents(chunks)
print(f"Ingested {len(ids)} chunks into Weaviate")

Pure Vector Search with nearText

# Simple similarity search
query = "How do I configure the network interface?"
results = vector_store.similarity_search(query, k=5)

for doc in results:
    print(f"Source: {doc.metadata.get('source', 'unknown')}")
    print(f"Content: {doc.page_content[:200]}")
    print("---")

With scores:

results_with_scores = vector_store.similarity_search_with_score(query, k=5)

for doc, score in results_with_scores:
    print(f"Score: {score:.4f} | {doc.page_content[:150]}")

The score here is cosine distance — lower is closer. For semantic concepts this works well, but notice what happens with exact product codes:

# This often fails with pure vector search
exact_query = "Error code E-4021-B network timeout"
results = vector_store.similarity_search(exact_query, k=3)
# Results may be thematically related but miss the exact code

Hybrid Search: Combining nearText and BM25

This is where Weaviate's hybrid search shines. The alpha parameter controls the blend:

alpha=1.0 → pure vector search
alpha=0.0 → pure BM25 keyword search
alpha=0.5 → equal weight to both (recommended starting point)

# Hybrid search via LangChain retriever
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={
        "k": 10,
        "alpha": 0.5  # 50/50 blend of vector and BM25
    }
)

# Query that benefits from hybrid: has both semantic and keyword components
query = "Error code E-4021-B network timeout troubleshooting steps"
hybrid_results = retriever.invoke(query)

for doc in hybrid_results:
    print(doc.page_content[:200])
    print("---")

For direct Weaviate hybrid queries with full control:

from weaviate.classes.query import HybridFusion

# Using the Weaviate client directly for maximum control
collection = client.collections.get("DocumentChunk")

response = collection.query.hybrid(
    query=query,
    alpha=0.5,
    fusion_type=HybridFusion.RELATIVE_SCORE,  # or RANKED
    limit=10,
    return_properties=["content", "source", "doc_type"],
    return_metadata=["score", "explain_score"]
)

for obj in response.objects:
    print(f"Score: {obj.metadata.score:.4f}")
    print(f"Content: {obj.properties['content'][:200]}")
    print(f"Explanation: {obj.metadata.explain_score}")
    print("---")

The explain_score field is invaluable for debugging — it tells you which component (vector or BM25) contributed how much to each result's final score.

Metadata Filtering

One of Weaviate's strengths is combining hybrid search with metadata filters. You get semantic + keyword matching constrained to a specific subset of your data.

from weaviate.classes.query import Filter

# Filter by doc_type then hybrid search within that subset
response = collection.query.hybrid(
    query="network configuration error",
    alpha=0.6,
    limit=5,
    filters=Filter.by_property("doc_type").equal("technical_manual")
)

for obj in response.objects:
    print(f"DocType: {obj.properties.get('doc_type')} | Score: {obj.metadata.score:.4f}")
    print(obj.properties["content"][:150])

Complex filter combinations:

from weaviate.classes.query import Filter

# Multiple filter conditions
compound_filter = (
    Filter.by_property("doc_type").equal("technical_manual") &
    Filter.by_property("created_at").greater_than("2026-01-01")
)

response = collection.query.hybrid(
    query="installation requirements",
    alpha=0.5,
    filters=compound_filter,
    limit=8
)

Via LangChain's interface:

results = vector_store.similarity_search(
    query="installation requirements",
    k=8,
    where_filter={
        "path": ["doc_type"],
        "operator": "Equal",
        "valueText": "technical_manual"
    }
)

Building a Hybrid RAG Chain

Now let's integrate hybrid search into a full RAG pipeline:

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Hybrid retriever with tuned alpha
hybrid_retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={
        "k": 6,
        "alpha": 0.55  # Slightly favor vector for most use cases
    }
)

prompt = ChatPromptTemplate.from_template("""
You are a technical support specialist. Use the following document excerpts to answer the question accurately.

Context:
{context}

Question: {question}

Provide a clear, step-by-step answer. If the information is not in the context, say so explicitly.
""")

def format_docs(docs):
    return "\n\n".join(
        f"[Source: {doc.metadata.get('source', 'unknown')}]\n{doc.page_content}"
        for doc in docs
    )

hybrid_rag_chain = (
    {"context": hybrid_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Test with both semantic and keyword-heavy queries
response = hybrid_rag_chain.invoke("What are the steps to resolve error E-4021-B?")
print(response)

response = hybrid_rag_chain.invoke("Explain the general approach to network troubleshooting")
print(response)

This pairs naturally with the patterns in the RAG system tutorial and builds on what you learn in the LangChain tutorial 2025.

Multi-Tenant Search

Weaviate's multi-tenancy feature is useful when you host a document search service for multiple customers and need strict data isolation.

import weaviate
from weaviate.classes.config import Configure, Property, DataType

# Create a multi-tenant collection
client.collections.create(
    name="CustomerDocuments",
    multi_tenancy_config=Configure.multi_tenancy(enabled=True),
    properties=[
        Property(name="content", data_type=DataType.TEXT),
        Property(name="doc_type", data_type=DataType.TEXT),
        Property(name="source", data_type=DataType.TEXT),
    ]
)

# Add tenants
collection = client.collections.get("CustomerDocuments")
collection.tenants.create([
    weaviate.classes.tenants.Tenant(name="acme_corp"),
    weaviate.classes.tenants.Tenant(name="globex_inc"),
])

# Ingest data for a specific tenant
acme_collection = collection.with_tenant("acme_corp")
acme_collection.data.insert({
    "content": "ACME Corp internal network policy document content here...",
    "doc_type": "policy",
    "source": "acme_intranet"
})

# Query is automatically isolated to the tenant
response = acme_collection.query.hybrid(
    query="network policy",
    alpha=0.5,
    limit=5
)

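With LangChain, you can create a tenant-scoped retriever per user request:

from langchain_core.vectorstores import VectorStoreRetriever

def get_tenant_retriever(tenant_id: str) -> VectorStoreRetriever:
    """Return a retriever scoped to a specific tenant."""
    tenant_store = WeaviateVectorStore(
        client=client,
        index_name="CustomerDocuments",
        text_key="content",
        embedding=embeddings,
        use_multi_tenancy=True
    )
    # langchain-weaviate takes the tenant at query time rather than in the
    # constructor, so it is passed with every search via search_kwargs.
    # Verify this against the version of langchain-weaviate you have installed.
    return tenant_store.as_retriever(
        search_kwargs={"k": 5, "alpha": 0.5, "tenant": tenant_id}
    )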

# Per-request retriever creation
acme_retriever = get_tenant_retriever("acme_corp")
globex_retriever = get_tenant_retriever("globex_inc")

This isolation model is critical for enterprise deployments. The Build AI agent with LangChain guide covers how to route requests to tenant-specific components.
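
One way to harden the per-request creation above is to validate the tenant ID and cache retrievers, so an unknown tenant fails loudly instead of reaching the database. The sketch below is illustrative only: `TenantRetrieverRegistry` and `fake_factory` are hypothetical names that do not come from LangChain or Weaviate, and in a real application the factory would be the `get_tenant_retriever` function shown earlier.

```python
from typing import Callable, Dict, Iterable


class TenantRetrieverRegistry:
    """Cache one retriever per known tenant and reject unknown tenant IDs."""

    def __init__(self, factory: Callable[[str], object], allowed: Iterable[str]):
        self._factory = factory        # e.g. get_tenant_retriever
        self._allowed = set(allowed)   # tenants that exist in the collection
        self._cache: Dict[str, object] = {}

    def get(self, tenant_id: str) -> object:
        if tenant_id not in self._allowed:
            raise PermissionError(f"Unknown tenant: {tenant_id!r}")
        if tenant_id not in self._cache:
            # Build the retriever once, then reuse it on later requests
            self._cache[tenant_id] = self._factory(tenant_id)
        return self._cache[tenant_id]


# Stand-in factory so the sketch runs without a Weaviate instance.
def fake_factory(tenant_id: str) -> str:
    return f"retriever-for-{tenant_id}"


registry = TenantRetrieverRegistry(fake_factory, allowed=["acme_corp", "globex_inc"])
acme = registry.get("acme_corp")
```

The allow list here is hard-coded for brevity; it could just as well be loaded from wherever your application keeps its customer records.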

Alpha Tuning: Finding the Right Blend

The right alpha value depends on your content and query patterns. Here is a systematic approach to tuning:

import statistics

def evaluate_alpha(alpha: float, test_queries: list, ground_truth: list) -> dict:
    """Evaluate retrieval quality at a given alpha value."""
    retriever = vector_store.as_retriever(
        search_kwargs={"k": 5, "alpha": alpha}
    )
    
    hit_rates = []
    for query, relevant_chunks in zip(test_queries, ground_truth):
        results = retriever.invoke(query)
        retrieved_contents = [r.page_content for r in results]
        
        hits = sum(1 for gt in relevant_chunks if any(gt in r for r in retrieved_contents))
        hit_rate = hits / len(relevant_chunks) if relevant_chunks else 0
        hit_rates.append(hit_rate)
    
    return {
        "alpha": alpha,
        "avg_hit_rate": statistics.mean(hit_rates),
        "min_hit_rate": min(hit_rates),
        "max_hit_rate": max(hit_rates)
    }

test_queries = [
    "Error code E-4021-B",
    "How does the authentication system work?",
    "network timeout configuration"
]

ground_truth = [
    ["Error E-4021-B occurs when", "E-4021-B network timeout"],
    ["authentication uses JWT tokens", "the auth flow validates"],
    ["timeout_seconds parameter", "network interface timeout setting"]
]

# Test a range of alpha values
for alpha in [0.0, 0.25, 0.5, 0.75, 1.0]:
    result = evaluate_alpha(alpha, test_queries, ground_truth)
    print(f"Alpha {alpha:.2f}: Hit Rate = {result['avg_hit_rate']:.3f}")
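
For intuition about what this sweep is varying: alpha=0 is pure keyword ranking, alpha=1 is pure vector ranking, and values in between weight the two. The sketch below is a simplified stand-in, not Weaviate's implementation. It min-max normalizes each score list and blends them as alpha * vector + (1 - alpha) * keyword, which is the general idea behind score-based fusion. The function names, document IDs, and raw scores are all made up for illustration.

```python
from typing import Dict


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Min-max scale scores to [0, 1]; a constant list maps to 1.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def blend(vector: Dict[str, float], keyword: Dict[str, float], alpha: float) -> Dict[str, float]:
    """Weighted sum of normalized scores; a doc missing from one list scores 0 there."""
    v, k = normalize(vector), normalize(keyword)
    return {
        doc: alpha * v.get(doc, 0.0) + (1 - alpha) * k.get(doc, 0.0)
        for doc in set(v) | set(k)
    }


# Hypothetical raw scores: doc_a contains the exact error code,
# doc_b is only semantically close to the query.
vector_scores = {"doc_a": 0.70, "doc_b": 0.90, "doc_c": 0.60}
keyword_scores = {"doc_a": 8.0, "doc_c": 2.0}

for alpha in (0.0, 0.5, 1.0):
    fused = blend(vector_scores, keyword_scores, alpha)
    best = max(fused, key=fused.get)
    print(f"alpha={alpha:.1f} -> top result: {best}")
```

With these numbers, doc_a ranks first at alpha 0.0 and 0.5, and doc_b takes over at alpha 1.0. That is the kind of ranking flip the hit-rate evaluation is meant to surface on real data.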

Weaviate Search Mode Comparison

Search Mode	Semantic Understanding	Exact Keyword Match	Metadata Filter	Speed
nearText (vector only)	Excellent	Poor	Yes	Fast
BM25 (keyword only)	Poor	Excellent	Yes	Very fast
Hybrid (alpha=0.5)	Good	Good	Yes	Fast
Hybrid (alpha=0.75)	Very good	Moderate	Yes	Fast
Hybrid (alpha=0.25)	Moderate	Very good	Yes	Fast
Generative search	Excellent	Good	Yes	Slowest

This mirrors findings in the semantic search tutorial — there is no single best configuration, only best configurations for specific query distributions.
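
Because the best setting depends on the query mix, one option is to choose alpha per query rather than globally. The heuristic below is an assumption for illustration, not something Weaviate or LangChain provides: `looks_like_identifier` and `choose_alpha` are hypothetical helpers, and the 0.25 and 0.75 defaults are simply the two off-center rows from the table above.

```python
def looks_like_identifier(token: str) -> bool:
    """True for tokens that mix letters and digits, e.g. 'E-4021-B' or 'v2'."""
    has_digit = any(ch.isdigit() for ch in token)
    has_alpha = any(ch.isalpha() for ch in token)
    return has_digit and has_alpha


def choose_alpha(query: str, keyword_alpha: float = 0.25, semantic_alpha: float = 0.75) -> float:
    """Lean toward BM25 when the query contains an identifier-like token."""
    tokens = query.replace("?", " ").replace(",", " ").split()
    if any(looks_like_identifier(t) for t in tokens):
        return keyword_alpha
    return semantic_alpha


for q in [
    "What are the steps to resolve error E-4021-B?",
    "Explain the general approach to network troubleshooting",
]:
    print(f"{choose_alpha(q):.2f}  {q}")
```

The returned value can be passed as `alpha` in `search_kwargs` when building a retriever, the same way `evaluate_alpha` does. Treat the rule as a starting point and validate it against your own test queries.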

Streaming Results with LangChain

For a better user experience in chat interfaces, stream the RAG response:

streaming_llm = ChatOpenAI(
    model="gpt-4o-mini",
    streaming=True
)

streaming_chain = (
    {"context": hybrid_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | streaming_llm
    | StrOutputParser()
)

# StrOutputParser yields plain string chunks; print each one as it arrives
for chunk in streaming_chain.stream("How do I reset the admin password?"):
    print(chunk, end="", flush=True)
print()
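
In a chat interface you usually need the full answer as well as the live tokens, for example to store it in chat history. Here is a minimal sketch, assuming the chain yields plain string chunks, which is what a chain ending in `StrOutputParser` produces. `stream_and_collect` is a hypothetical helper, and `fake_stream` stands in for `streaming_chain.stream(...)` so the example runs without an API key.

```python
from typing import Callable, Iterable, Iterator


def stream_and_collect(chunks: Iterable[str], emit: Callable[[str], None]) -> str:
    """Forward each chunk to `emit` as it arrives and return the full text."""
    parts = []
    for chunk in chunks:
        emit(chunk)
        parts.append(chunk)
    return "".join(parts)


def fake_stream() -> Iterator[str]:
    """Stand-in for streaming_chain.stream(question)."""
    yield from ["Open ", "the admin ", "console."]


# Print tokens live, then keep the assembled answer for later use
full_answer = stream_and_collect(fake_stream(), lambda c: print(c, end="", flush=True))
print()
```

In a web application, `emit` would typically write to a websocket or server-sent event stream instead of calling `print`.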

Weaviate Backend Cleanup

Always close the Weaviate client connection when your application shuts down:

import atexit

@atexit.register
def cleanup():
    client.close()
    print("Weaviate connection closed.")

# Or use as a context manager in scripts:
with weaviate.connect_to_local() as client:
    store = WeaviateVectorStore(client=client, ...)
    results = store.similarity_search("test query", k=3)
    # Connection closes automatically when the block exits

Combining Hybrid Search with Agent Tools

Wrapping the hybrid retriever as a LangChain tool makes it accessible to agents built with the patterns in Build AI agent with LangChain:

from langchain.tools.retriever import create_retriever_tool

hybrid_search_tool = create_retriever_tool(
    retriever=hybrid_retriever,
    name="hybrid_document_search",
    description=(
        "Search the technical documentation using hybrid vector and keyword search. "
        "Use this for both broad conceptual questions and specific error codes or product names."
    )
)

# Add to an agent
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

agent_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful technical support assistant."),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

agent = create_openai_tools_agent(llm, [hybrid_search_tool], agent_prompt)
agent_executor = AgentExecutor(agent=agent, tools=[hybrid_search_tool], verbose=True)

result = agent_executor.invoke({
    "input": "I'm seeing error E-4021-B, what should I do?",
    "chat_history": []
})
print(result["output"])

The AI research agent build takes this combination of hybrid retrieval and agent orchestration further, applying it to more complex retrieval workflows.
