LangChain Mastery

LangChain Architecture & Core Concepts

LangChain Architecture: The Building Blocks of AI Agents

LangChain is one of the most widely adopted frameworks for building LLM-powered applications and agents. Understanding its architecture helps you build more effectively and debug faster when things go wrong.

The LangChain Philosophy

LangChain is built around composability — combining small, focused components into larger, more capable systems. Each component does one thing; chains combine them; agents orchestrate them dynamically.

The core insight: most LLM applications follow similar patterns (retrieve context → format prompt → call LLM → parse output). LangChain provides standardized components for each step, so you spend time on your application logic, not boilerplate.
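To make the four-step pattern concrete, here is a minimal plain-Python sketch of it. Every function and the tiny `KNOWLEDGE` store are hypothetical stand-ins invented for illustration (the "LLM" just echoes its context), so treat it as a picture of the data flow rather than anything LangChain ships.

```python
# Toy stand-ins for the four stages; none of these names are LangChain APIs.
KNOWLEDGE = {
    "paris": "Paris is the capital of France.",
    "berlin": "Berlin is the capital of Germany.",
}

def retrieve_context(question: str) -> list[str]:
    """Step 1: return every stored fact whose key appears in the question."""
    lowered = question.lower()
    return [fact for key, fact in KNOWLEDGE.items() if key in lowered]

def format_prompt(question: str, context: list[str]) -> str:
    """Step 2: combine the retrieved facts and the question into one prompt."""
    joined = "\n".join(context) or "(no context found)"
    return f"Context:\n{joined}\n\nQuestion: {question}"

def call_llm(prompt: str) -> str:
    """Step 3: hypothetical model call that echoes the first context line."""
    first_context_line = prompt.splitlines()[1]
    return f"ANSWER: {first_context_line}"

def parse_output(raw: str) -> str:
    """Step 4: strip the 'ANSWER:' prefix added by the fake model."""
    return raw.removeprefix("ANSWER:").strip()

def answer(question: str) -> str:
    context = retrieve_context(question)
    prompt = format_prompt(question, context)
    return parse_output(call_llm(prompt))

print(answer("Is Paris in France?"))
```

Each stage is an ordinary function with one input and one output, which is what makes the stages easy to swap or test in isolation.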

Core Components

LLMs and Chat Models

The foundation — a wrapper around LLM APIs that provides a consistent interface:

from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic

# OpenAI
gpt4o = ChatOpenAI(model="gpt-4o", temperature=0)
gpt4o_mini = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)

# Anthropic
claude = ChatAnthropic(model="claude-sonnet-4-6", max_tokens=4096)

# Usage — .invoke() for single calls
response = gpt4o.invoke("What is the capital of France?")
print(response.content)  # e.g. "The capital of France is Paris."

# Streaming
for chunk in gpt4o.stream("Tell me a story"):
    print(chunk.content, end="", flush=True)

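Every chat model exposes the same .invoke() and .stream() methods, so you can swap GPT-4o for Claude by changing only the line that constructs the model. Constructor options and some provider-specific features can still differ between providers.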
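The value of a shared interface can be sketched without any provider at all. In the example below, `ChatModel`, `FakeProviderA`, and `FakeProviderB` are hypothetical names made up for this illustration; the point is only that `summarize` depends on the two method names, not on which class is behind them.

```python
from typing import Iterator, Protocol

class ChatModel(Protocol):
    """The only two methods the application code relies on."""
    def invoke(self, prompt: str) -> str: ...
    def stream(self, prompt: str) -> Iterator[str]: ...

class FakeProviderA:
    def invoke(self, prompt: str) -> str:
        return f"[A] {prompt}"

    def stream(self, prompt: str) -> Iterator[str]:
        # Yield the reply one word at a time, like token streaming.
        for word in self.invoke(prompt).split(" "):
            yield word + " "

class FakeProviderB:
    def invoke(self, prompt: str) -> str:
        return f"[B] {prompt}"

    def stream(self, prompt: str) -> Iterator[str]:
        for word in self.invoke(prompt).split(" "):
            yield word + " "

def summarize(model: ChatModel, text: str) -> str:
    """Application code: identical no matter which provider is passed in."""
    return model.invoke(f"Summarize: {text}")

print(summarize(FakeProviderA(), "a long report"))
print(summarize(FakeProviderB(), "a long report"))
```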

Messages

LangChain uses structured message types that map to the chat API format:

from langchain_core.messages import SystemMessage, HumanMessage, AIMessage

messages = [
    SystemMessage(content="You are a helpful Python tutor."),
    HumanMessage(content="What is a list comprehension?"),
]

response = gpt4o.invoke(messages)
print(response)  # AIMessage with .content

# The response is an AIMessage — append it for multi-turn conversation
messages.append(response)
messages.append(HumanMessage(content="Can you show me an example?"))
follow_up = gpt4o.invoke(messages)
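As a rough picture of what "map to the chat API format" means, the sketch below converts typed messages into role/content dictionaries. The `Message` dataclass and `to_chat_payload` helper are hypothetical, and the role names assume an OpenAI-style payload; the real conversion lives inside each provider integration.

```python
from dataclasses import dataclass

# Assumed mapping from message type to an OpenAI-style role name.
ROLE_BY_TYPE = {"system": "system", "human": "user", "ai": "assistant"}

@dataclass
class Message:
    type: str      # "system", "human", or "ai"
    content: str

def to_chat_payload(messages: list[Message]) -> list[dict[str, str]]:
    """Turn typed messages into the list-of-dicts shape chat APIs accept."""
    return [
        {"role": ROLE_BY_TYPE[m.type], "content": m.content}
        for m in messages
    ]

conversation = [
    Message("system", "You are a helpful Python tutor."),
    Message("human", "What is a list comprehension?"),
]
print(to_chat_payload(conversation))
```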

Prompt Templates

Reusable templates that inject dynamic values:

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# Basic template
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an expert in {domain}. Answer in {language}."),
    ("human", "{question}")
])

# Render the template with values
formatted = prompt.invoke({
    "domain": "machine learning",
    "language": "simple English",
    "question": "What is gradient descent?"
})

# Chain template with LLM
chain = prompt | gpt4o  # The LCEL pipe syntax
response = chain.invoke({
    "domain": "Python",
    "language": "English",
    "question": "What are decorators?"
})
print(response.content)
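Conceptually, a chat prompt template is a list of (role, text) pairs whose placeholders are filled in at call time. The `render` helper below is a hypothetical stand-in built on `str.format`, not the library's implementation, but it shows the substitution step in isolation.

```python
def render(template: list[tuple[str, str]], values: dict[str, str]) -> list[dict[str, str]]:
    """Fill every {placeholder} in each message text with the supplied values."""
    return [
        {"role": role, "content": text.format(**values)}
        for role, text in template
    ]

template = [
    ("system", "You are an expert in {domain}. Answer in {language}."),
    ("human", "{question}"),
]

messages = render(template, {
    "domain": "Python",
    "language": "English",
    "question": "What are decorators?",
})
for message in messages:
    print(message)
```

A missing value raises `KeyError` here, which mirrors the useful property of templates: a forgotten variable fails loudly instead of silently sending a half-filled prompt.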

MessagesPlaceholder for Conversation History

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="chat_history"),  # Insert history here
    ("human", "{input}")
])

chain = prompt | gpt4o

# First turn
history = []
response = chain.invoke({"input": "My name is Alice", "chat_history": history})

# Second turn — pass history
history.append(HumanMessage(content="My name is Alice"))
history.append(response)
response = chain.invoke({"input": "What's my name?", "chat_history": history})
print(response.content)  # e.g. "Your name is Alice."
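The placeholder idea can be sketched in plain Python: a marker in the template is replaced by however many history messages exist at call time. `HISTORY_SLOT` and `build_messages` are hypothetical names used only for this illustration, and the assistant reply in the history is made up.

```python
HISTORY_SLOT = object()  # hypothetical marker for "insert the history here"

def build_messages(template, history, user_input):
    """Expand the history slot and fill {input} in the remaining messages."""
    messages = []
    for item in template:
        if item is HISTORY_SLOT:
            messages.extend(history)          # zero or more prior turns
        else:
            role, text = item
            messages.append((role, text.format(input=user_input)))
    return messages

template = [
    ("system", "You are a helpful assistant."),
    HISTORY_SLOT,
    ("human", "{input}"),
]

history = []
first_turn = build_messages(template, history, "My name is Alice")

# Record the first exchange (the reply text is invented for the example).
history.append(("human", "My name is Alice"))
history.append(("ai", "Nice to meet you, Alice!"))
second_turn = build_messages(template, history, "What's my name?")

print(len(first_turn), len(second_turn))
```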

Output Parsers

Transform LLM text output into structured data:

from langchain_core.output_parsers import StrOutputParser, JsonOutputParser
from pydantic import BaseModel, Field  # langchain_core.pydantic_v1 is deprecated in recent releases

# String parser: extracts the .content text from the model's AIMessage
str_chain = prompt | gpt4o | StrOutputParser()

# JSON parser with schema
class ProductReview(BaseModel):
    sentiment: str = Field(description="positive, negative, or neutral")
    rating: int = Field(description="1-5 star rating")
    summary: str = Field(description="One-sentence summary")
    key_points: list[str] = Field(description="Main points mentioned")

parser = JsonOutputParser(pydantic_object=ProductReview)

structured_chain = (
    ChatPromptTemplate.from_messages([
        ("system", "Analyze the following product review. {format_instructions}"),
        ("human", "{review}")
    ])
    | gpt4o
    | parser
)

result = structured_chain.invoke({
    "review": "Amazing product! Works exactly as described. Fast shipping. Will buy again.",
    "format_instructions": parser.get_format_instructions()
})

print(result)  # A plain dict, e.g. {'sentiment': 'positive', 'rating': 5, ...}; use PydanticOutputParser to get a ProductReview instance
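What a structured parser does can be approximated with the standard library: clean up the raw model text, decode the JSON, and check it against the expected fields. The sketch below is a simplified illustration, not the library's parser; `parse_review` is a hypothetical helper and the raw reply is invented.

```python
import json
from dataclasses import dataclass

FENCE = "`" * 3  # Markdown code fence that models often wrap JSON in

@dataclass
class ProductReview:
    sentiment: str
    rating: int
    summary: str
    key_points: list[str]

def parse_review(raw: str) -> ProductReview:
    """Decode model text into a ProductReview, tolerating a code fence."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the surrounding backticks and an optional "json" language tag.
        text = text.strip("`").removeprefix("json").strip()
    data = json.loads(text)
    review = ProductReview(**data)  # TypeError on missing or extra fields
    if not 1 <= review.rating <= 5:
        raise ValueError(f"rating out of range: {review.rating}")
    return review

raw_reply = (
    FENCE + "json\n"
    '{"sentiment": "positive", "rating": 5, '
    '"summary": "Works as described.", "key_points": ["fast shipping"]}\n'
    + FENCE
)
print(parse_review(raw_reply))
```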

LCEL: LangChain Expression Language

The | (pipe) operator chains components together:

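# Each component's output becomes the next component's input.
# Here llm, output_parser, and inputs are placeholders for any chat model,
# any output parser, and the dict of template variables.
chain = prompt | llm | output_parser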

# This is equivalent to:
formatted = prompt.invoke(inputs)
response = llm.invoke(formatted)
result = output_parser.invoke(response)

# Multi-step chains with RunnableLambda
from langchain_core.runnables import RunnableLambda

def format_for_email(text: str) -> str:
    return f"Subject: AI Response\n\n{text}"

chain = prompt | llm | StrOutputParser() | RunnableLambda(format_for_email)
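The pipe syntax relies on ordinary Python operator overloading. The toy `Step` class below is a hypothetical, stripped-down illustration of that mechanism (real runnables also handle streaming, batching, and async), with lambdas standing in for the prompt, model, and parser.

```python
from typing import Any, Callable

class Step:
    """Toy runnable: wraps a function and supports `|` composition."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new Step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))

# Stand-ins for a prompt template, a model, and a post-processing step.
fill_template = Step(lambda inputs: f"Explain {inputs['topic']} briefly.")
fake_llm = Step(lambda prompt: f"MODEL OUTPUT for: {prompt}")
to_upper = Step(str.upper)

chain = fill_template | fake_llm | to_upper
print(chain.invoke({"topic": "decorators"}))
```

Because `a | b` returns another `Step`, chains of any length compose the same way two steps do.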

Parallel Execution with RunnableParallel

from langchain_core.runnables import RunnableParallel

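# Run multiple chains on the same input and combine the results.
# pros_prompt, cons_prompt, and summary_prompt are prompt templates you define,
# each expecting a {product} variable.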
parallel_chain = RunnableParallel({
    "pros": pros_prompt | llm | StrOutputParser(),
    "cons": cons_prompt | llm | StrOutputParser(),
    "summary": summary_prompt | llm | StrOutputParser()
})

# All three branches receive the same input and run concurrently
result = parallel_chain.invoke({"product": "MacBook Pro M4"})
# result = {"pros": "...", "cons": "...", "summary": "..."}
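The fan-out pattern itself can be sketched with a thread pool from the standard library: every branch gets the same input, and the results are collected into a dict under the branch names. `run_parallel` and the three lambda branches are hypothetical stand-ins for illustration, not how the library schedules work internally.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable

def run_parallel(branches: dict[str, Callable[[Any], Any]], inputs: Any) -> dict[str, Any]:
    """Call every branch with the same inputs and gather results by name."""
    with ThreadPoolExecutor(max_workers=max(1, len(branches))) as pool:
        futures = {name: pool.submit(func, inputs) for name, func in branches.items()}
        # .result() blocks until that branch finishes and re-raises its errors.
        return {name: future.result() for name, future in futures.items()}

branches = {
    "pros": lambda inputs: f"pros of {inputs['product']}",
    "cons": lambda inputs: f"cons of {inputs['product']}",
    "summary": lambda inputs: f"summary of {inputs['product']}",
}

result = run_parallel(branches, {"product": "MacBook Pro M4"})
print(result)
```

Threads suit this workload because each branch spends most of its time waiting on a network call rather than computing.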

Document Loaders and Text Splitters

For processing external documents:

from langchain_community.document_loaders import (
    PyPDFLoader, WebBaseLoader, TextLoader, CSVLoader
)
from langchain_text_splitters import RecursiveCharacterTextSplitter  # replaces the older langchain.text_splitter path

# Load a PDF
loader = PyPDFLoader("report.pdf")
documents = loader.load()

# Load a web page
web_loader = WebBaseLoader("https://docs.langchain.com")
docs = web_loader.load()

# Split into chunks for vector embedding
splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,       # Maximum characters per chunk
    chunk_overlap=200,     # Characters shared by adjacent chunks, so context at boundaries is not lost
    separators=["\n\n", "\n", ".", " ", ""]  # Tried in order; "" is the last-resort character-level split
)

chunks = splitter.split_documents(documents)
print(f"Split {len(documents)} documents into {len(chunks)} chunks")
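To see how `chunk_size` and `chunk_overlap` interact, here is a deliberately simplified fixed-width splitter. It is a hypothetical helper that ignores separators entirely, so it is not the recursive strategy used above; it only shows the sliding window those two numbers define.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks

print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2))
```

With a 4-character window and 2 characters of overlap, each chunk repeats the last two characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk when the overlap is large enough.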

Embeddings and Vector Stores

Convert text to vectors for semantic search:

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma  # newer releases also offer this class via the langchain-chroma package

# Create embeddings model
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Create vector store from documents
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="./chroma_db"  # Save to disk
)

# Load existing vector store
vectorstore = Chroma(
    persist_directory="./chroma_db",
    embedding_function=embeddings
)

# Semantic search
results = vectorstore.similarity_search("What is the company's revenue?", k=3)
for doc in results:
    print(doc.page_content)
    print(doc.metadata)
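Under the hood, semantic search ranks stored vectors by how close they are to the query vector. The sketch below uses cosine similarity over hand-written three-dimensional vectors; the texts and numbers are made up for illustration, whereas real embeddings have hundreds or thousands of dimensions and come from the embedding model.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query: list[float], store: list[tuple[str, list[float]]], k: int) -> list[str]:
    """Return the k stored texts whose vectors are most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine_similarity(query, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy "vector store": (text, embedding) pairs with invented values.
store = [
    ("revenue report", [0.9, 0.1, 0.0]),
    ("office relocation", [0.0, 0.2, 0.9]),
    ("revenue forecast", [0.8, 0.3, 0.1]),
]

query_vector = [1.0, 0.0, 0.0]  # pretend embedding of a revenue question
print(top_k(query_vector, store, k=2))
```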

Retrieval Chains (RAG)

Combine retrieval with generation — the foundation of RAG:

from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain

# Retriever from vector store
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# Chain to combine retrieved docs with the question
combine_docs_chain = create_stuff_documents_chain(
    llm=gpt4o,
    prompt=ChatPromptTemplate.from_messages([
        ("system", "Answer based on the provided context:\n\n{context}"),
        ("human", "{input}")
    ])
)

# Full RAG chain
rag_chain = create_retrieval_chain(retriever, combine_docs_chain)

response = rag_chain.invoke({"input": "What were the Q3 revenue numbers?"})
print(response["answer"])
print(response["context"])  # The retrieved documents used

LangSmith Observability

With tracing enabled and your LangSmith API key available in the environment (for example, loaded from .env), chain executions are traced automatically. Each trace shows:

  • The exact prompt sent to the LLM
  • Token counts and costs
  • Latency at each step
  • Tool calls and their results
  • Any errors

This is invaluable for debugging and optimizing agents.
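Tracing silently does nothing when the configuration is incomplete, so a quick startup check can save debugging time. The sketch below assumes the variable names LANGSMITH_TRACING and LANGSMITH_API_KEY, which are not stated in this lesson (older setups use LANGCHAIN_TRACING_V2 and LANGCHAIN_API_KEY); confirm the names against your LangSmith setup. `missing_tracing_vars` is a hypothetical helper.

```python
import os
from typing import Mapping

# Assumed variable names; adjust to match your LangSmith configuration.
REQUIRED_VARS = ("LANGSMITH_TRACING", "LANGSMITH_API_KEY")

def missing_tracing_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the tracing variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

missing = missing_tracing_vars()
if missing:
    print("Tracing is not fully configured; missing:", ", ".join(missing))
else:
    print("Tracing configuration found.")
```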

Next lesson: LangChain chains — building multi-step workflows for complex tasks.
