LangChain Architecture & Core Concepts | AI Agent Development Course | AiTechWorlds

LangChain Architecture: The Building Blocks of AI Agents

LangChain is one of the most widely adopted frameworks for building LLM-powered applications and agents. Understanding its architecture helps you build more effectively and debug faster when things go wrong.

The LangChain Philosophy

LangChain is built around composability — combining small, focused components into larger, more capable systems. Each component does one thing; chains combine them; agents orchestrate them dynamically.

The core insight: most LLM applications follow similar patterns (retrieve context → format prompt → call LLM → parse output). LangChain provides standardized components for each step, so you spend time on your application logic, not boilerplate.
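To make the four-step pattern concrete, here is a minimal sketch in plain Python. It uses no LangChain at all: every function name, the keyword-match retrieval, and the stub model are assumptions made up for illustration, standing in for the real components covered in the rest of this lesson.

```python
# Toy stand-ins for the four steps; none of these names are LangChain APIs.
def retrieve(question: str, notes: dict[str, str]) -> str:
    """Return the notes whose key appears in the question (naive keyword match)."""
    hits = [text for key, text in notes.items() if key in question.lower()]
    return "\n".join(hits)

def format_prompt(context: str, question: str) -> str:
    """Build the text that would be sent to the model."""
    return f"Context:\n{context}\n\nQuestion: {question}"

def fake_llm(prompt: str) -> str:
    """Stub model: echoes the first context line instead of calling an API."""
    return "ANSWER: " + prompt.splitlines()[1]

def parse(raw: str) -> str:
    """Strip the stub model's prefix to get the clean answer."""
    return raw.removeprefix("ANSWER: ").strip()

notes = {"paris": "Paris is the capital of France."}
question = "Tell me about Paris"

# retrieve context -> format prompt -> call LLM -> parse output
answer = parse(fake_llm(format_prompt(retrieve(question, notes), question)))
print(answer)
```

Each function can be replaced on its own, which is the property LangChain's standardized components are meant to give you.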

Core Components

LLMs and Chat Models

The foundation — a wrapper around LLM APIs that provides a consistent interface:

from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic

# OpenAI
gpt4o = ChatOpenAI(model="gpt-4o", temperature=0)
gpt4o_mini = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)

# Anthropic
claude = ChatAnthropic(model="claude-sonnet-4-6", max_tokens=4096)

# Usage — .invoke() for single calls
response = gpt4o.invoke("What is the capital of France?")
print(response.content)  # e.g. "The capital of France is Paris."

# Streaming
for chunk in gpt4o.stream("Tell me a story"):
    print(chunk.content, end="", flush=True)

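The same .invoke() and .stream() interface works across the supported providers, so you can usually swap GPT-4o for Claude without changing application code. Only the constructor arguments (model name, max_tokens, and so on) are provider-specific.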
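The value of a shared interface is easiest to see with stand-in models. The sketch below is not LangChain code: ChatModel, FakeProviderA, and FakeProviderB are hypothetical names, and the fake invoke() returns a plain string, whereas a real chat model returns an AIMessage.

```python
from typing import Iterator, Protocol

class ChatModel(Protocol):
    """Hypothetical minimal interface: anything with invoke() and stream()."""
    def invoke(self, prompt: str) -> str: ...
    def stream(self, prompt: str) -> Iterator[str]: ...

class FakeProviderA:
    def invoke(self, prompt: str) -> str:
        return f"[A] {prompt}"
    def stream(self, prompt: str) -> Iterator[str]:
        # Yield the reply word by word to mimic streamed chunks
        yield from self.invoke(prompt).split()

class FakeProviderB:
    def invoke(self, prompt: str) -> str:
        return f"[B] {prompt}"
    def stream(self, prompt: str) -> Iterator[str]:
        yield from self.invoke(prompt).split()

def ask(model: ChatModel, question: str) -> str:
    """Application code depends only on the interface, not on the provider."""
    return model.invoke(question)

answers = [ask(model, "ping") for model in (FakeProviderA(), FakeProviderB())]
print(answers)
```

Because ask() only relies on invoke(), switching providers is a one-line change at the call site.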

Messages

LangChain uses structured message types that map to the chat API format:

from langchain_core.messages import SystemMessage, HumanMessage, AIMessage

messages = [
    SystemMessage(content="You are a helpful Python tutor."),
    HumanMessage(content="What is a list comprehension?"),
]

response = gpt4o.invoke(messages)
print(response)  # AIMessage with .content

# The response is an AIMessage — append it for multi-turn conversation
messages.append(response)
messages.append(HumanMessage(content="Can you show me an example?"))
follow_up = gpt4o.invoke(messages)
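As a rough picture of what "map to the chat API format" means, the sketch below converts toy message objects into the role/content dictionaries used by OpenAI-style chat APIs. ToyMessage and to_chat_payload are made-up names for illustration; the real conversion happens inside each provider integration.

```python
from dataclasses import dataclass

@dataclass
class ToyMessage:
    kind: str      # "system", "human", or "ai"
    content: str

# Assumed mapping from message kind to OpenAI-style role names
ROLE_FOR_KIND = {"system": "system", "human": "user", "ai": "assistant"}

def to_chat_payload(messages: list[ToyMessage]) -> list[dict[str, str]]:
    """Turn message objects into the list of dicts a chat API expects."""
    return [{"role": ROLE_FOR_KIND[m.kind], "content": m.content} for m in messages]

history = [
    ToyMessage("system", "You are a helpful Python tutor."),
    ToyMessage("human", "What is a list comprehension?"),
]
# Multi-turn: append the model's reply, then the next user message
history.append(ToyMessage("ai", "A compact way to build a list."))
history.append(ToyMessage("human", "Can you show me an example?"))

payload = to_chat_payload(history)
print([item["role"] for item in payload])
```

The full history is sent on every call, which is why the lesson's example appends both the response and the follow-up question before invoking again.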

Prompt Templates

Reusable templates that inject dynamic values:

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# Basic template
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an expert in {domain}. Answer in {language}."),
    ("human", "{question}")
])

# Render the template with values
formatted = prompt.invoke({
    "domain": "machine learning",
    "language": "simple English",
    "question": "What is gradient descent?"
})

# Chain template with LLM
chain = prompt | gpt4o  # The LCEL pipe syntax
response = chain.invoke({
    "domain": "Python",
    "language": "English",
    "question": "What are decorators?"
})
print(response.content)
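Conceptually, rendering a template is placeholder substitution per message. The sketch below imitates that with str.format on (role, text) pairs. It is a simplification under that assumption, not how ChatPromptTemplate is implemented, and render is a made-up helper.

```python
toy_template = [
    ("system", "You are an expert in {domain}. Answer in {language}."),
    ("human", "{question}"),
]

def render(template: list[tuple[str, str]], values: dict[str, str]) -> list[tuple[str, str]]:
    """Fill each (role, text) pair with str.format; a missing key raises KeyError."""
    return [(role, text.format(**values)) for role, text in template]

rendered = render(toy_template, {
    "domain": "Python",
    "language": "English",
    "question": "What are decorators?",
})
for role, text in rendered:
    print(f"{role}: {text}")
```

Keeping the template separate from the values is what makes it reusable: the same two-message structure serves any domain, language, and question.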

MessagesPlaceholder for Conversation History

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="chat_history"),  # Insert history here
    ("human", "{input}")
])

chain = prompt | gpt4o

# First turn
history = []
response = chain.invoke({"input": "My name is Alice", "chat_history": history})

# Second turn — pass history
history.append(HumanMessage(content="My name is Alice"))
history.append(response)
response = chain.invoke({"input": "What's my name?", "chat_history": history})
print(response.content)  # e.g. "Your name is Alice."

Output Parsers

Transform LLM text output into structured data:

from langchain_core.output_parsers import StrOutputParser, JsonOutputParser
from pydantic import BaseModel, Field  # langchain_core.pydantic_v1 is deprecated in recent releases

# String parser: extracts the message's text content as a plain str
str_chain = prompt | gpt4o | StrOutputParser()

# JSON parser with schema
class ProductReview(BaseModel):
    sentiment: str = Field(description="positive, negative, or neutral")
    rating: int = Field(description="1-5 star rating")
    summary: str = Field(description="One-sentence summary")
    key_points: list[str] = Field(description="Main points mentioned")

parser = JsonOutputParser(pydantic_object=ProductReview)

structured_chain = (
    ChatPromptTemplate.from_messages([
        ("system", "Analyze the following product review. {format_instructions}"),
        ("human", "{review}")
    ])
    | gpt4o
    | parser
)

result = structured_chain.invoke({
    "review": "Amazing product! Works exactly as described. Fast shipping. Will buy again.",
    "format_instructions": parser.get_format_instructions()
})

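print(result)  # A dict, e.g. {'sentiment': 'positive', 'rating': 5, ...}
# JsonOutputParser returns a plain dict; use PydanticOutputParser to get a ProductReview instance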
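To see what a structured-output parser has to do, here is a small standard-library sketch: strip an optional Markdown fence, parse the JSON, and check the fields. Review, parse_review, and the sample model output are all invented for illustration, and LangChain's own parsers handle more cases than this.

```python
import json
from dataclasses import dataclass

@dataclass
class Review:
    sentiment: str
    rating: int
    summary: str

FENCE = "`" * 3  # Markdown code-fence marker that models often wrap JSON in

def parse_review(raw: str) -> Review:
    """Strip an optional Markdown fence, parse JSON, and validate the rating."""
    text = raw.strip()
    if text.startswith(FENCE):
        text = text.strip("`").removeprefix("json").strip()
    data = json.loads(text)          # raises json.JSONDecodeError on malformed output
    review = Review(**data)          # raises TypeError on missing or extra fields
    if not isinstance(review.rating, int) or not 1 <= review.rating <= 5:
        raise ValueError(f"rating out of range: {review.rating!r}")
    return review

# Made-up model output, wrapped in a fence the way chat models often respond
raw_output = (
    FENCE + "json\n"
    '{"sentiment": "positive", "rating": 5, "summary": "Works as described."}\n'
    + FENCE
)
review = parse_review(raw_output)
print(review)
```

Failing loudly on malformed or out-of-range output is a deliberate choice here: it lets the calling code retry or fall back instead of passing bad data downstream.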

LCEL: LangChain Expression Language

The | (pipe) operator chains components together:

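from langchain_core.runnables import RunnableLambda

# Placeholder names used in this section: any prompt, chat model, and parser will do
llm = gpt4o
output_parser = StrOutputParser()
inputs = {"input": "Hello!", "chat_history": []}  # keys must match the prompt's variables

# Each component's output becomes the next component's input
chain = prompt | llm | output_parser

# chain.invoke(inputs) is equivalent to:
formatted = prompt.invoke(inputs)
response = llm.invoke(formatted)
result = output_parser.invoke(response)

# Multi-step chains: RunnableLambda wraps a plain function as a chain step
def format_for_email(text: str) -> str:
    return f"Subject: AI Response\n\n{text}"

chain = prompt | llm | StrOutputParser() | RunnableLambda(format_for_email)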
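If the pipe syntax looks like magic, it helps to see that Python lets any class define `|` through the `__or__` method. The toy class below shows the idea in a few lines. Step is a made-up name and this is only a sketch of the composition mechanism, not LangChain's Runnable implementation.

```python
class Step:
    """Toy runnable: wraps a function and supports `|` composition."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new Step that feeds a's output into b
        return Step(lambda value: other.invoke(self.invoke(value)))

prompt_step = Step(lambda d: f"Question: {d['question']}")
llm_step = Step(lambda text: text.upper())                  # stub "model"
parse_step = Step(lambda text: text.removeprefix("QUESTION: "))

toy_chain = prompt_step | llm_step | parse_step
result = toy_chain.invoke({"question": "what is lcel?"})
print(result)
```

The composed object has the same invoke() method as its parts, so a chain can itself be used as a step in a larger chain.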

Parallel Execution with RunnableParallel

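from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel

# Three prompts that share the same {product} input
pros_prompt = ChatPromptTemplate.from_template("List the pros of {product}.")
cons_prompt = ChatPromptTemplate.from_template("List the cons of {product}.")
summary_prompt = ChatPromptTemplate.from_template("Summarize {product} in one sentence.")

# Run multiple chains in parallel, combine results
parallel_chain = RunnableParallel({
    "pros": pros_prompt | gpt4o | StrOutputParser(),
    "cons": cons_prompt | gpt4o | StrOutputParser(),
    "summary": summary_prompt | gpt4o | StrOutputParser()
})

# All three branches receive the same input and run concurrently
result = parallel_chain.invoke({"product": "MacBook Pro M4"})
# result = {"pros": "...", "cons": "...", "summary": "..."}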
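The benefit of running branches in parallel comes from overlapping the time spent waiting on each LLM call. The sketch below reproduces the pattern with a thread pool and sleeping stub branches. run_parallel and slow_branch are hypothetical helpers, and the 0.2 second delay stands in for network latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_parallel(branches: dict, inputs: dict) -> dict:
    """Give every branch the same inputs, run them in threads, collect results by key."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = {name: pool.submit(fn, inputs) for name, fn in branches.items()}
        return {name: future.result() for name, future in futures.items()}

def slow_branch(label: str):
    """Build a stub branch that waits 0.2 s, like a slow API call, then answers."""
    def branch(inputs: dict) -> str:
        time.sleep(0.2)
        return f"{label} of {inputs['product']}"
    return branch

branches = {
    "pros": slow_branch("pros"),
    "cons": slow_branch("cons"),
    "summary": slow_branch("summary"),
}

start = time.perf_counter()
result = run_parallel(branches, {"product": "laptop"})
elapsed = time.perf_counter() - start
print(result)
print(f"elapsed: {elapsed:.2f}s (three sequential calls would take about 0.6s)")
```

Threads work well here because the branches are I/O-bound: they spend their time waiting on a response rather than computing.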

Document Loaders and Text Splitters

For processing external documents:

from langchain_community.document_loaders import (
    PyPDFLoader, WebBaseLoader, TextLoader, CSVLoader
)
from langchain_text_splitters import RecursiveCharacterTextSplitter  # pip install langchain-text-splitters

# Load a PDF
loader = PyPDFLoader("report.pdf")
documents = loader.load()

# Load a web page
web_loader = WebBaseLoader("https://docs.langchain.com")
docs = web_loader.load()

# Split into chunks for vector embedding
splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,       # Maximum characters per chunk
    chunk_overlap=200,     # Overlap reduces context loss at chunk boundaries
    separators=["\n\n", "\n", ".", " ", ""]  # Tried in order; "" is the last-resort character split
)

chunks = splitter.split_documents(documents)
print(f"Split {len(documents)} documents into {len(chunks)} chunks")
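To see what chunk_size and chunk_overlap do, here is a simplified fixed-window splitter. It is only a sketch of the overlap idea: split_with_overlap is a made-up function, and the real RecursiveCharacterTextSplitter first tries to break on the separators you give it rather than cutting at fixed offsets.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    # Stop once the remaining text is already covered by the previous window
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]

chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)
```

Each chunk starts with the last two characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.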

Embeddings and Vector Stores

Convert text to vectors for semantic search:

from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma  # pip install langchain-chroma (replaces the deprecated langchain_community import)

# Create embeddings model
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Create vector store from documents
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="./chroma_db"  # Save to disk
)

# Load existing vector store
vectorstore = Chroma(
    persist_directory="./chroma_db",
    embedding_function=embeddings
)

# Semantic search
results = vectorstore.similarity_search("What is the company's revenue?", k=3)
for doc in results:
    print(doc.page_content)
    print(doc.metadata)
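Under the hood, semantic search ranks documents by how close their vectors are to the query vector. The sketch below shows that ranking step with cosine similarity. The word-count "embedding" is a deliberately crude stand-in for a real embedding model, and embed, cosine, and similarity_search are illustrative names rather than the Chroma API.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (real models capture meaning, not just words)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def similarity_search(query: str, docs: list[str], k: int) -> list[str]:
    """Return the k documents most similar to the query, best first."""
    query_vec = embed(query)
    return sorted(docs, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)[:k]

docs = [
    "revenue grew in the third quarter",
    "the office moved to a new building",
    "quarterly revenue and profit figures",
]
top = similarity_search("company revenue this quarter", docs, k=2)
print(top)
```

A real embedding model would also match "quarterly" with "quarter", which is exactly the gap between keyword overlap and semantic search.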

Retrieval Chains (RAG)

Combine retrieval with generation — the foundation of RAG:

from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain

# Retriever from vector store
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# Chain to combine retrieved docs with the question
combine_docs_chain = create_stuff_documents_chain(
    llm=gpt4o,
    prompt=ChatPromptTemplate.from_messages([
        ("system", "Answer based on the provided context:\n\n{context}"),
        ("human", "{input}")
    ])
)

# Full RAG chain
rag_chain = create_retrieval_chain(retriever, combine_docs_chain)

response = rag_chain.invoke({"input": "What were the Q3 revenue numbers?"})
print(response["answer"])
print(response["context"])  # The retrieved documents used
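The "stuff" strategy simply places all retrieved documents into the prompt's {context} slot. The sketch below mimics the data flow and the shape of the result dictionary with stub functions. toy_retrieval_chain, the stub retriever, the stub model, and the sample figures are all invented for illustration.

```python
def stuff_documents(docs: list[str], question: str, llm) -> str:
    """Join all retrieved documents into one context block and ask the model."""
    context = "\n\n".join(docs)
    prompt = f"Answer based on the provided context:\n\n{context}\n\nQuestion: {question}"
    return llm(prompt)

def toy_retrieval_chain(retriever, llm, inputs: dict) -> dict:
    """Retrieve, then generate; return the input, the documents used, and the answer."""
    docs = retriever(inputs["input"])
    answer = stuff_documents(docs, inputs["input"], llm)
    return {"input": inputs["input"], "context": docs, "answer": answer}

# Stubs with made-up data: a retriever that ignores the query, a "model" that quotes the first document
stub_retriever = lambda query: ["Q3 revenue was 10 units.", "Q2 revenue was 8 units."]
stub_llm = lambda prompt: prompt.split("\n\n")[1]

response = toy_retrieval_chain(stub_retriever, stub_llm, {"input": "What was Q3 revenue?"})
print(response["answer"])
print(response["context"])
```

Returning the context alongside the answer is what lets you show sources to the user or check whether a wrong answer came from bad retrieval or bad generation.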

LangSmith Observability

With tracing enabled (LANGCHAIN_TRACING_V2=true) and your LangSmith API key (LANGCHAIN_API_KEY) set in .env and loaded into the environment, chain executions are traced automatically. You see:

The exact prompt sent to the LLM
Token counts and costs
Latency at each step
Tool calls and their results
Any errors

This is invaluable for debugging and optimizing agents.
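Missing traces are usually a configuration problem, so a quick check at startup can save time. The helper below is a sketch that assumes the LANGCHAIN_TRACING_V2 and LANGCHAIN_API_KEY variable names; newer releases also document LANGSMITH_-prefixed equivalents, so adjust the names to match your setup. tracing_problems is a hypothetical function, not part of LangSmith.

```python
import os

def tracing_problems(env) -> list[str]:
    """Return human-readable reasons why LangSmith tracing would not be active."""
    problems = []
    if env.get("LANGCHAIN_TRACING_V2", "").lower() != "true":
        problems.append("LANGCHAIN_TRACING_V2 is not set to 'true'")
    if not env.get("LANGCHAIN_API_KEY"):
        problems.append("LANGCHAIN_API_KEY is missing")
    return problems

# Check the real environment (after your .env file has been loaded)
for problem in tracing_problems(os.environ):
    print(f"Tracing disabled: {problem}")
```

Passing the environment in as an argument keeps the check easy to test with a plain dictionary.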

Next lesson: LangChain chains — building multi-step workflows for complex tasks.