
7 LangChain Output Fixer and Retry Strategies (2026)

⚡ Quick Answer

Stop losing data to malformed LLM outputs. Learn 7 LangChain error recovery strategies including OutputFixingParser, RetryOutputParser, fallbacks, and exponential backoff.

AiTechWorlds Team May 31, 2026 13 min read

#LangChain #output parser #error recovery #retry #OutputFixingParser


LLMs generate text. Structured applications need structured data. That gap — between raw model output and the JSON, Pydantic models, or lists your code expects — is where production AI systems fail silently or crash loudly.

This guide covers seven strategies for handling malformed output, from LangChain's built-in auto-fix parsers to fallback chains, Pydantic validators, and hand-rolled backoff. Each comes with code and guidance on when to use it.

Before diving in, the LangChain tutorial 2025 covers the basic output parser types if you need a refresher on JsonOutputParser and PydanticOutputParser.

Why Output Parsing Fails

Before fixing failures, understand what causes them:

# The model was asked for JSON but returned this:
broken_output_1 = """
Sure! Here's the JSON you requested:
# [json block start]
{
  "name": "Alice",
  "age": 30,
  "skills": ["Python", "ML",]  // trailing comma
}
# [block end]
"""

broken_output_2 = '{"name": "Bob", "age": 25, "skills": ["Python"'  # truncated

broken_output_3 = """
{
  "name": "Charlie"
  "age": 28,           // missing comma after "Charlie"
  "skills": ["Go"]
}
"""

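The three root causes are:

Markdown wrapping: the model adds ```json fences or explanatory text around the output
Syntax errors: trailing commas, missing commas or quotes, JavaScript-style comments
Truncation: the response is cut off when it hits the output token limit

All three are recoverable with the right strategy, although a truncated response can only be repaired syntactically. Data that was never generated has to be requested again.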
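Knowing which of the three causes you are hitting helps you pick a strategy. The sketch below is a rough, standard-library-only heuristic for labelling a failed output; `classify_failure` and its labels are hypothetical names introduced here, and the bracket counting ignores brackets inside string values, so treat the result as a hint rather than a diagnosis.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the snippet fence-safe


def classify_failure(text: str) -> str:
    """Return a rough label for why `text` is not directly parseable as JSON."""
    stripped = text.strip()
    try:
        json.loads(stripped)
        return "ok"
    except json.JSONDecodeError:
        pass
    if FENCE in stripped:
        return "markdown_wrapping"
    # More opening than closing brackets suggests the output was cut off.
    if stripped.count("{") > stripped.count("}") or stripped.count("[") > stripped.count("]"):
        return "truncation"
    return "syntax_error"


print(classify_failure('{"name": "Bob", "age": 25, "skills": ["Python"'))
```

Logging this label next to each failure shows whether the prompt (wrapping), the token limit (truncation), or the model's JSON discipline (syntax) needs attention.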

Strategy 1: OutputFixingParser

OutputFixingParser wraps any existing parser. When the inner parser fails, it sends the bad output to an LLM with a request to fix it:

from langchain.output_parsers import OutputFixingParser
from langchain_core.output_parsers import JsonOutputParser
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field
from typing import List

class UserProfile(BaseModel):
    name: str = Field(description="User's full name")
    age: int = Field(description="User's age")
    skills: List[str] = Field(description="List of technical skills")

# Base parser
json_parser = JsonOutputParser(pydantic_object=UserProfile)

# Wrapping parser with auto-fix capability
fixing_parser = OutputFixingParser.from_llm(
    parser=json_parser,
    llm=ChatOpenAI(model="gpt-4o-mini", temperature=0)
)

# Test with broken JSON
broken_json = """
Sure! Here's the data:
# [json block start]
{
  "name": "Alice Johnson",
  "age": 30,
  "skills": ["Python", "Machine Learning",]
}
# [block end]
"""

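try:
    result = fixing_parser.parse(broken_json)
    print(f"Fixed successfully: {result}")
    # JsonOutputParser returns a plain dict, not a UserProfile instance, e.g.
    # {'name': 'Alice Johnson', 'age': 30, 'skills': ['Python', 'Machine Learning']}
    # Use PydanticOutputParser if you need a validated model object.
except Exception as e:
    print(f"Still failed after fixing: {e}")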

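How it works: When the inner parser raises, the fixing parser sends the repair LLM the parser's format instructions, the failed output, and the error message, and asks for an answer that satisfies the instructions. The repaired text is then parsed again. By default it makes one repair attempt (max_retries=1), which is usually enough for syntax-level problems.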

Cost consideration: Each fix attempt makes one extra LLM call. With gpt-4o-mini, this costs roughly $0.0001 per fix — negligible for most applications.
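The repair loop behind this strategy is small. Here is a minimal standard-library sketch of the idea, not LangChain's implementation: `parse_with_fix` and `fake_repair` are hypothetical names, and `fake_repair` stands in for the repair LLM call so the example runs offline.

```python
import json
import re
from typing import Callable


def parse_with_fix(
    completion: str,
    repair: Callable[[str, str], str],
    max_retries: int = 1,
) -> dict:
    """Parse JSON; on failure, ask `repair(bad_output, error)` for a corrected string."""
    for attempt in range(max_retries + 1):
        try:
            return json.loads(completion)
        except json.JSONDecodeError as exc:
            if attempt == max_retries:
                raise  # out of repair attempts, surface the parse error
            completion = repair(completion, str(exc))
    raise ValueError("max_retries must be >= 0")


def fake_repair(bad_output: str, error: str) -> str:
    """Hypothetical stand-in for the repair LLM: it only strips trailing commas."""
    return re.sub(r",\s*([}\]])", r"\1", bad_output)


print(parse_with_fix('{"skills": ["Python", "ML",]}', fake_repair))
```

Each pass through the loop corresponds to one extra LLM call in the real parser, which is where the per-fix cost comes from.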

Strategy 2: RetryOutputParser

RetryOutputParser is different — it includes the original prompt in the retry, not just the failed output. This is more powerful for cases where the model needs context to generate the right structure:

from typing import List

from langchain.output_parsers import RetryOutputParser
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field

# Define expected output
class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(ge=1, le=5, description="Rating from 1 to 5")
    pros: List[str] = Field(min_length=1, description="List of pros")  # min_length replaces min_items in Pydantic v2
    cons: List[str] = Field(description="List of cons")
    summary: str

# JsonOutputParser uses the model for format instructions only and returns a dict.
# Swap in PydanticOutputParser if the constraints above must be enforced.
parser = JsonOutputParser(pydantic_object=ProductReview)

# Build the prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a product reviewer. Extract structured review data."),
    ("human", "{review_text}\n\n{format_instructions}")
])

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Retry parser wraps the base parser and re-asks the LLM on failure
retry_parser = RetryOutputParser.from_llm(
    parser=parser,
    llm=llm,
    max_retries=3
)

def parse_with_retry(review_text: str) -> dict:
    # Keep the rendered prompt so it can be passed to the retry parser
    prompt_value = prompt.invoke({
        "review_text": review_text,
        "format_instructions": parser.get_format_instructions()
    })

    output = llm.invoke(prompt_value)

    try:
        return parser.parse(output.content)
    except Exception:
        # Retry with full context: original prompt plus the failed completion
        return retry_parser.parse_with_prompt(
            completion=output.content,
            prompt_value=prompt_value
        )

result = parse_with_retry("This laptop is amazing! Great battery life and speed, but a bit heavy.")
print(result)
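The practical difference between a fix-style and a retry-style repair is what the repair model gets to see. The sketch below makes that concrete with the standard library only. The request wording, `build_fix_request`, `build_retry_request`, and `fake_llm` are all hypothetical illustrations, not LangChain's actual prompts or API.

```python
import json


def build_fix_request(completion: str, error: str) -> str:
    """Fix-style request: the bad output and the error, but not the original task."""
    return f"Completion:\n{completion}\nError:\n{error}\nReturn corrected output."


def build_retry_request(prompt: str, completion: str) -> str:
    """Retry-style request: the original task plus the bad output."""
    return f"Prompt:\n{prompt}\nCompletion:\n{completion}\nTry again."


def fake_llm(request: str) -> str:
    """Hypothetical model stand-in: it can only infer a rating if it sees the review."""
    rating = 5 if "amazing" in request else None
    return json.dumps({"product_name": "laptop", "rating": rating})


prompt = "Extract a review as JSON. Review: This laptop is amazing!"
truncated = '{"product_name": "laptop", "rating": '

fixed = json.loads(fake_llm(build_fix_request(truncated, "Expecting value")))
retried = json.loads(fake_llm(build_retry_request(prompt, truncated)))
print(fixed["rating"], retried["rating"])
```

When the failed output is missing information rather than merely malformed, only the retry-style request contains enough context to regenerate it.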

Strategy 3: with_retry in LCEL

For LCEL chains, use the built-in .with_retry() method on any runnable:

from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
parser = JsonOutputParser()

# Simple retry on the whole chain: any exception triggers another attempt
chain_with_retry = (
    ChatPromptTemplate.from_template(
        "Extract a JSON object with 'title' and 'year' from: {text}\n\nReturn only JSON."
    )
    | llm
    | parser
).with_retry(
    retry_if_exception_type=(Exception,),
    stop_after_attempt=3,
    wait_exponential_jitter=True  # adds random jitter to exponential backoff
)

# Or retry only on parse failures (OutputParserException subclasses ValueError)
chain_with_advanced_retry = (
    ChatPromptTemplate.from_template("Extract JSON: {text}")
    | llm
    | parser
).with_retry(
    retry_if_exception_type=(ValueError,),
    stop_after_attempt=3,
)

try:
    result = chain_with_retry.invoke({"text": "The Matrix was released in 1999"})
    print(result)
except Exception as e:
    print(f"Failed after 3 attempts: {e}")

.with_retry() is the simplest way to add retry logic to any LCEL chain. It wraps the entire chain, not just the parser, so transient API errors and rate limits are also handled.
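The choice of exception types decides what gets retried and what fails fast. The standard-library sketch below shows that filtering behaviour in isolation. `retry_call` is a hypothetical helper written for this illustration, and it omits the waiting between attempts to keep the example short.

```python
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_call(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    attempts: int = 3,
) -> T:
    """Call fn up to `attempts` times, retrying only on the `retry_on` exceptions."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # retries exhausted
    raise ValueError("attempts must be >= 1")


calls = {"flaky": 0, "fatal": 0}


def flaky() -> dict:
    """Fails twice with a parse-style error, then succeeds."""
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise ValueError("malformed output")
    return {"title": "The Matrix", "year": 1999}


def fatal() -> dict:
    """Fails with an error that retrying cannot fix."""
    calls["fatal"] += 1
    raise KeyError("missing configuration")


result = retry_call(flaky, retry_on=(ValueError,), attempts=3)

try:
    retry_call(fatal, retry_on=(ValueError,), attempts=3)
except KeyError:
    pass  # not in retry_on, so it propagated after a single call
```

Narrowing the retried types this way avoids spending three attempts on errors such as bad credentials that will never succeed.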

Strategy 4: Fallback Chains

Use .with_fallbacks() to define backup chains when the primary chain fails:

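from langchain.output_parsers import OutputFixingParser
from langchain_core.output_parsers import StrOutputParser, JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# UserProfile is the Pydantic model defined in Strategy 1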

# Primary chain: structured JSON output
primary_llm = ChatOpenAI(model="gpt-4o", temperature=0)
primary_parser = JsonOutputParser(pydantic_object=UserProfile)

primary_chain = (
    ChatPromptTemplate.from_template(
        "Extract user profile JSON from: {text}\n\n{format_instructions}"
    ).partial(format_instructions=primary_parser.get_format_instructions())
    | primary_llm
    | primary_parser
)

# Fallback 1: cheaper model with the same parser
fallback_llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
fallback_chain_1 = (
    ChatPromptTemplate.from_template(
        "Extract user profile JSON from: {text}\n\n{format_instructions}"
    ).partial(format_instructions=primary_parser.get_format_instructions())
    | fallback_llm
    | OutputFixingParser.from_llm(parser=primary_parser, llm=fallback_llm)
)

# Fallback 2: unstructured text extraction (no parsing step, so it cannot raise a parse error)
fallback_chain_2 = (
    ChatPromptTemplate.from_template(
        "Extract the person's name, age, and skills from this text as plain text: {text}"
    )
    | fallback_llm
    | StrOutputParser()
)

# Chain them together
resilient_chain = primary_chain.with_fallbacks(
    fallbacks=[fallback_chain_1, fallback_chain_2],
    exceptions_to_handle=(ValueError,)  # only fall back on parsing errors
)

# The chain tries primary, then fallback_1, then fallback_2
result = resilient_chain.invoke({
    "text": "Sarah, 28, works as a backend engineer specializing in Rust and distributed systems"
})
print(result)

This pattern is valuable when you need graceful degradation. A document processing pipeline might fall back from structured extraction to unstructured text rather than failing the entire job.
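One consequence of degrading to plain text is that callers can receive either a dict or a string. A simple way to handle that, sketched here with the standard library only, is to return a label alongside the result so downstream code knows which tier answered. `run_with_fallbacks` is a hypothetical helper for illustration, not a LangChain function.

```python
import json


def run_with_fallbacks(steps, value, exceptions_to_handle=(Exception,)):
    """Try each (label, fn) in order; return (label, result) from the first success."""
    last_error = None
    for label, fn in steps:
        try:
            return label, fn(value)
        except exceptions_to_handle as exc:
            last_error = exc  # remember the failure and move to the next tier
    if last_error is None:
        raise ValueError("steps must not be empty")
    raise last_error


steps = [
    ("structured", json.loads),  # strict tier: must be valid JSON
    ("plain_text", str.strip),   # last resort: keep the raw text
]

# json.JSONDecodeError subclasses ValueError, so only parse errors trigger a fallback.
print(run_with_fallbacks(steps, '{"name": "Sarah"}', (ValueError,)))
print(run_with_fallbacks(steps, "  Sarah, 28, Rust  ", (ValueError,)))
```

Storing the label with each record also makes it easy to measure how often the pipeline degrades.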

Strategy 5: Custom Error Handler with LCEL

Build a custom error-handling wrapper that logs failures and attempts recovery:

import json
import logging
import re

from langchain.output_parsers import OutputFixingParser
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.runnables import RunnableLambda
from langchain_openai import ChatOpenAI

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def create_resilient_parser(schema_class, llm=None, max_attempts=3):
    """Create a parser with logging, fixing, and fallback logic."""
    
    base_parser = JsonOutputParser(pydantic_object=schema_class)
    fix_llm = llm or ChatOpenAI(model="gpt-4o-mini", temperature=0)
    fixing_parser = OutputFixingParser.from_llm(parser=base_parser, llm=fix_llm)
    
    def parse_with_recovery(text: str):
        attempt = 0
        last_error = None
        
        while attempt < max_attempts:
            try:
                if attempt == 0:
                    # First attempt: direct parse
                    result = base_parser.parse(text)
                    logger.info(f"Parsed successfully on attempt {attempt + 1}")
                    return result
                elif attempt == 1:
                    # Second attempt: auto-fix
                    result = fixing_parser.parse(text)
                    logger.info(f"Fixed and parsed on attempt {attempt + 1}")
                    return result
                else:
                    # Third attempt: aggressive JSON extraction
                    extracted = extract_json_aggressive(text)
                    result = base_parser.parse(extracted)
                    logger.info(f"Aggressively extracted on attempt {attempt + 1}")
                    return result
            except Exception as e:
                last_error = e
                logger.warning(f"Attempt {attempt + 1} failed: {e}")
                attempt += 1
        
        logger.error(f"All {max_attempts} attempts failed. Last error: {last_error}")
        raise last_error
    
    return RunnableLambda(parse_with_recovery)

def extract_json_aggressive(text: str) -> str:
    """Try multiple strategies to extract valid JSON from messy text."""
    import re

    # Strategy 1: take the contents of a Markdown code block, if present
    code_block_match = re.search(r'```(?:json)?\s*([\s\S]*?)```', text)
    candidate = code_block_match.group(1).strip() if code_block_match else text.strip()

    # Strategy 2: narrow to the outermost {...} span (greedy: first "{" to last "}")
    brace_match = re.search(r'\{[\s\S]*\}', candidate)
    if brace_match:
        candidate = brace_match.group(0)
        try:
            json.loads(candidate)
            return candidate
        except json.JSONDecodeError:
            pass

    # Strategy 3: fix common issues on the narrowed candidate.
    # Caveat: these regexes are not string-aware, so "//" inside a value
    # (for example a URL) is stripped as well.
    cleaned = re.sub(r'/\*.*?\*/', '', candidate, flags=re.DOTALL)  # Remove block comments
    cleaned = re.sub(r'//.*$', '', cleaned, flags=re.MULTILINE)  # Remove JS comments
    cleaned = re.sub(r',\s*([}\]])', r'\1', cleaned)  # Remove trailing commas

    return cleaned

# Usage
resilient = create_resilient_parser(UserProfile, max_attempts=3)
result = resilient.invoke('{"name": "Dave", "age": 35, "skills": ["Python", "Go",]}')
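A greedy brace regex spans from the first "{" to the last "}", which breaks when the model emits prose containing braces or more than one object. As an alternative sketch for the extraction step, the standard library's `json.JSONDecoder.raw_decode` can scan for the first span that is actually valid JSON. `first_json_object` is a hypothetical helper name, and it only finds objects that are already syntactically valid.

```python
import json
from typing import Any, Optional


def first_json_object(text: str) -> Optional[Any]:
    """Return the first valid JSON object embedded in `text`, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores the rest
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            start = text.find("{", start + 1)  # try the next opening brace
    return None


print(first_json_object('Sure! {"name": "Dave"} Let me know if {this} helps.'))
```

This works well as a cheap first pass before the regex cleanup or an LLM repair call.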

Strategy 6: Pydantic Validators for Pre-Parse Correction

Add Pydantic validators that clean data before validation, reducing parse failures:

import json
import re
from typing import List, Optional, Union

from pydantic import BaseModel, Field, field_validator, model_validator

class RobustUserProfile(BaseModel):
    name: str
    age: Union[int, str]  # accept string age and coerce
    skills: Union[List[str], str]  # accept comma-separated string
    email: Optional[str] = None

    @field_validator("age", mode="before")
    @classmethod
    def coerce_age(cls, v):
        """Accept '30 years old', '30', or 30."""
        if isinstance(v, int):
            return v
        # Extract digits
        digits = re.findall(r'\d+', str(v))
        if digits:
            return int(digits[0])
        raise ValueError(f"Cannot parse age from: {v}")

    @field_validator("skills", mode="before")
    @classmethod
    def coerce_skills(cls, v):
        """Accept list, comma-separated string, or JSON string."""
        if isinstance(v, list):
            return v
        if isinstance(v, str):
            # Try JSON parsing first
            try:
                parsed = json.loads(v)
                if isinstance(parsed, list):
                    return parsed
            except json.JSONDecodeError:
                pass
            # Fall back to comma splitting
            return [s.strip() for s in v.split(",") if s.strip()]
        return v

    @field_validator("email", mode="before")
    @classmethod
    def validate_email(cls, v):
        """Accept None, empty string, or valid email."""
        if not v:
            return None
        # Basic email validation — don't reject, just normalize
        v = str(v).strip().lower()
        return v if "@" in v else None

    @model_validator(mode="after")
    def check_age_range(self):
        """Post-parse validation with auto-correction."""
        if self.age < 0:
            self.age = 0
        if self.age > 150:
            self.age = 150
        return self

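# Test with messy inputs.
# JsonOutputParser returns a dict and never runs these validators, so use
# PydanticOutputParser (or call RobustUserProfile.model_validate_json directly).
from langchain_core.output_parsers import PydanticOutputParser

parser = PydanticOutputParser(pydantic_object=RobustUserProfile)

messy_json = '{"name": "Eve", "age": "28 years old", "skills": "Python, ML, LangChain", "email": ""}'
result = parser.parse(messy_json)
print(result)
# → RobustUserProfile(name='Eve', age=28, skills=['Python', 'ML', 'LangChain'], email=None)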

Pydantic validators that coerce types reduce the surface area for parsing failures significantly. The model can output "age": "28 years old" and your application still gets age=28.

Strategy 7: Exponential Backoff for Rate Limits

API rate limits cause a different category of failures. Handle them with proper exponential backoff:

import logging
import random
import time
from functools import wraps
from typing import Callable, TypeVar

import openai

logger = logging.getLogger(__name__)

T = TypeVar("T")

def exponential_backoff(
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    jitter: bool = True,
    retryable_exceptions=(openai.RateLimitError, openai.APIConnectionError, openai.APITimeoutError)
):
    """Decorator for exponential backoff with jitter."""
    def decorator(func: Callable[..., T]) -> Callable[..., T]:
        @wraps(func)
        def wrapper(*args, **kwargs) -> T:
            delay = base_delay
            last_exception = None
            
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except retryable_exceptions as e:
                    last_exception = e
                    if attempt == max_retries:
                        break
                    
                    actual_delay = delay
                    if jitter:
                        actual_delay *= (0.5 + random.random())
                    actual_delay = min(actual_delay, max_delay)  # cap after jitter

                    logger.warning(
                        f"Attempt {attempt + 1}/{max_retries + 1} failed: {e}. "
                        f"Retrying in {actual_delay:.1f}s..."
                    )
                    time.sleep(actual_delay)
                    delay *= 2
                # Any other exception is non-retryable and propagates immediately
            
            raise last_exception
        return wrapper
    return decorator

# Apply to LangChain chain invocation
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import JsonOutputParser

@exponential_backoff(max_retries=5, base_delay=1.0)
def invoke_with_backoff(chain, input_data):
    return chain.invoke(input_data)

# Usage
llm = ChatOpenAI(model="gpt-4o")
parser = JsonOutputParser(pydantic_object=UserProfile)
chain = ChatPromptTemplate.from_template("Extract JSON: {text}") | llm | parser

result = invoke_with_backoff(chain, {"text": "Alice is 30 and knows Python"})
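Before choosing max_retries and max_delay, it helps to see how long a fully failing call would block. The sketch below computes the un-jittered wait schedule for a doubling backoff like the one above. `backoff_schedule` is a hypothetical helper, and real waits will differ because jitter scales each one randomly.

```python
from typing import List


def backoff_schedule(
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
) -> List[float]:
    """Un-jittered waits, in seconds, before each retry of a doubling backoff."""
    return [min(base_delay * 2 ** i, max_delay) for i in range(max_retries)]


waits = backoff_schedule()
print(waits)       # [1.0, 2.0, 4.0, 8.0, 16.0]
print(sum(waits))  # 31.0 seconds of waiting if every attempt fails

# With more retries the cap takes over and later waits stop growing.
print(backoff_schedule(max_retries=8))
```

Five retries already add about half a minute of worst-case latency per call, which is worth weighing against any request timeout upstream.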

LCEL Async Retry with Rate Limit Handling

For async production code:

import asyncio
import logging
import random

from langchain.output_parsers import OutputFixingParser
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

logger = logging.getLogger(__name__)

async def robust_chain_invoke(chain, inputs: list[dict], max_concurrent: int = 5) -> list:
    """Run multiple chain invocations with concurrency limiting and retry."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def invoke_one(input_data: dict):
        async with semaphore:
            for attempt in range(3):
                try:
                    return await chain.ainvoke(input_data)
                except Exception as e:
                    if attempt == 2:
                        # Return an error record instead of raising so one bad
                        # document does not cancel the whole batch
                        logger.error(f"Failed after 3 attempts: {e}")
                        return {"error": str(e), "input": input_data}
                    wait = 2 ** attempt + random.uniform(0, 1)
                    await asyncio.sleep(wait)

    tasks = [invoke_one(inp) for inp in inputs]
    return await asyncio.gather(*tasks)

# Chain with an auto-fixing parser (UserProfile is defined in Strategy 1)
chain_with_retry = (
    ChatPromptTemplate.from_template("Extract JSON from: {text}")
    | ChatOpenAI(model="gpt-4o-mini")
    | OutputFixingParser.from_llm(
        parser=JsonOutputParser(pydantic_object=UserProfile),
        llm=ChatOpenAI(model="gpt-4o-mini")
    )
)

# Process 50 documents concurrently (max 5 at a time)
documents = [{"text": f"User {i}: age {20+i}, skills Python"} for i in range(50)]
results = asyncio.run(robust_chain_invoke(chain_with_retry, documents, max_concurrent=5))
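The semaphore is what keeps the batch from hitting the API with 50 simultaneous requests. The sketch below isolates that mechanism with the standard library and measures the peak number of overlapping calls. `gather_bounded` and `PeakTracker` are hypothetical names, and the short sleep stands in for a network call.

```python
import asyncio


async def gather_bounded(worker, inputs, max_concurrent: int = 5) -> list:
    """Run worker(item) for every item, with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item):
        async with semaphore:
            return await worker(item)

    # gather returns results in input order, regardless of completion order
    return await asyncio.gather(*(guarded(item) for item in inputs))


class PeakTracker:
    """Hypothetical worker that records how many calls overlap."""

    def __init__(self) -> None:
        self.active = 0
        self.peak = 0

    async def __call__(self, item: int) -> int:
        self.active += 1
        self.peak = max(self.peak, self.active)
        await asyncio.sleep(0.01)  # stands in for a network call
        self.active -= 1
        return item * 2


tracker = PeakTracker()
results = asyncio.run(gather_bounded(tracker, range(20), max_concurrent=5))
print(len(results), tracker.peak)
```

Because each task holds its slot while it runs, the overlap never exceeds the limit even though all 20 tasks are scheduled at once.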

Comparison of All 7 Strategies

Strategy	Best For	Extra Cost	Complexity
OutputFixingParser	JSON syntax errors, markdown wrapping	1 LLM call per fix attempt	Low
RetryOutputParser	Schema mismatch, wrong field names	1 LLM call per retry	Low
with_retry LCEL	Transient errors, rate limits	Re-runs the whole chain per retry	Low
Fallback chains	Graceful degradation	Varies by fallback	Medium
Custom error handler	Logging + staged recovery	Up to 1 LLM call (fix stage only)	Medium
Pydantic validators	Type coercion, data normalization	Zero	Low
Exponential backoff	Rate limits, API errors	Zero (timing only)	Low

Recommended production stack:

Pydantic validators — zero cost, handles type coercion automatically
OutputFixingParser — catches the majority of LLM formatting mistakes
with_retry — handles transient API failures
Exponential backoff — handles rate limits gracefully

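Layer these from cheapest to most expensive. Type and format quirks resolve at the Pydantic validator level at no cost. Transient API errors and rate limits resolve at the with_retry and backoff level. Only output that still fails to parse, such as broken JSON syntax, needs the extra LLM call that OutputFixingParser makes.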

Monitoring Parse Failures in Production

Track your failure rate to know when prompts need improvement:

from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class ParseMetrics:
    total: int = 0
    successes: int = 0
    fixed: int = 0
    failed: int = 0
    errors: list = field(default_factory=list)

    @property
    def success_rate(self):
        return self.successes / self.total if self.total > 0 else 0

    @property
    def fix_rate(self):
        return self.fixed / self.total if self.total > 0 else 0

metrics = ParseMetrics()

def tracked_parse(text: str, parser, fixing_parser):
    metrics.total += 1
    try:
        result = parser.parse(text)
        metrics.successes += 1
        return result
    except Exception:
        # Direct parse failed: try the LLM-backed fixer and count the outcome
        try:
            result = fixing_parser.parse(text)
            metrics.fixed += 1
            return result
        except Exception as e2:
            metrics.failed += 1
            metrics.errors.append({"text": text[:100], "error": str(e2), "time": datetime.now().isoformat()})
            raise  # re-raise with the original traceback

# After processing 1000 documents:
print(f"Success rate: {metrics.success_rate:.1%}")
print(f"Fix rate: {metrics.fix_rate:.1%}")
if metrics.total:  # avoid dividing by zero before anything has been parsed
    print(f"Failure rate: {metrics.failed/metrics.total:.1%}")

As a rule of thumb, if your fix rate exceeds 20%, your prompts need work: every fix is an extra LLM call that a clearer prompt would have avoided. The most common remedy is adding explicit instructions like "Return ONLY the JSON object, no explanation, no markdown" to the prompt.

For a complete view of output handling in LangChain, see the LangChain tutorial 2025 and Build AI agent with LangChain. For production deployment patterns, the Deploy AI model to production guide covers monitoring and observability.

Frequently Asked Questions

What causes LLM output parsing failures? The most common causes are: the model wrapping JSON in markdown code blocks, missing or extra commas in JSON, truncated responses when output is near the context limit, and the model generating explanatory text before or after the structured output. OutputFixingParser handles most of these cases.

How many retry attempts should I configure? For most production applications, 3 attempts with exponential backoff (waits of roughly 1s and 2s between them) is the right balance. More than 5 attempts adds significant latency. If failures persist beyond 3 attempts, the issue is usually a systematic prompt problem rather than a transient parsing error.

Does RetryWithErrorOutputParser send my original prompt again? Yes. Like RetryOutputParser, it includes the original prompt and the failed output in the retry request, and it adds the parse error as well. This gives the model full context to fix its mistake. Because the whole prompt is resent, it costs more tokens than OutputFixingParser, but it is more effective for complex schema failures.


Langchain

7 LangChain Output Fixer and Retry Strategies (2026)

⚡ Quick Answer

Stop losing data to malformed LLM outputs. Learn 7 LangChain error recovery strategies including OutputFixingParser, RetryOutputParser, fallbacks, and exponential backoff.

AiTechWorlds Team May 31, 2026 13 min read

#LangChain #output parser #error recovery #retry #OutputFixingParser

📚Part of the Langchain guide — explore all Langchain articles→

Share:Facebook Twitter/X LinkedIn Telegram WhatsApp

📱

Get more content like this on Telegram!

Daily AI tips, notes & resources — free

Join Free →

Before diving in, the LangChain tutorial 2025 covers the basic output parser types if you need a refresher on JsonOutputParser and PydanticOutputParser.

Why Output Parsing Fails

Before fixing failures, understand what causes them:

# The model was asked for JSON but returned this:
broken_output_1 = """
Sure! Here's the JSON you requested:
# [json block start]
{
  "name": "Alice",
  "age": 30,
  "skills": ["Python", "ML",]  // trailing comma
}
# [block end]
"""

broken_output_2 = '{"name": "Bob", "age": 25, "skills": ["Python"'  # truncated

broken_output_3 = """
{
  "name": "Charlie"
  "age": 28,           // missing comma after "Charlie"
  "skills": ["Go"]
}
"""

The three root causes are:

Markdown wrapping — model adds ```json blocks around output
Syntax errors — trailing commas, missing quotes, JavaScript-style comments
Truncation — output cut off near the context limit

All three are fixable with the right strategy.

Strategy 1: OutputFixingParser

OutputFixingParser wraps any existing parser. When the inner parser fails, it sends the bad output to an LLM with a request to fix it:

from langchain.output_parsers import OutputFixingParser
from langchain_core.output_parsers import JsonOutputParser
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field
from typing import List

class UserProfile(BaseModel):
    name: str = Field(description="User's full name")
    age: int = Field(description="User's age")
    skills: List[str] = Field(description="List of technical skills")

# Base parser
json_parser = JsonOutputParser(pydantic_object=UserProfile)

# Wrapping parser with auto-fix capability
fixing_parser = OutputFixingParser.from_llm(
    parser=json_parser,
    llm=ChatOpenAI(model="gpt-4o-mini", temperature=0)
)

# Test with broken JSON
broken_json = """
Sure! Here's the data:
# [json block start]
{
  "name": "Alice Johnson",
  "age": 30,
  "skills": ["Python", "Machine Learning",]
}
# [block end]
"""

try:
    result = fixing_parser.parse(broken_json)
    print(f"Fixed successfully: {result}")
    # → UserProfile(name='Alice Johnson', age=30, skills=['Python', 'Machine Learning'])
except Exception as e:
    print(f"Still failed after fixing: {e}")

Cost consideration: Each fix attempt makes one extra LLM call. With gpt-4o-mini, this costs roughly $0.0001 per fix — negligible for most applications.

Strategy 2: RetryOutputParser

from langchain.output_parsers import RetryOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Define expected output
class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(ge=1, le=5, description="Rating from 1 to 5")
    pros: List[str] = Field(min_items=1, description="List of pros")
    cons: List[str] = Field(description="List of cons")
    summary: str

parser = JsonOutputParser(pydantic_object=ProductReview)

# Build the chain
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a product reviewer. Extract structured review data."),
    ("human", "{review_text}\n\n{format_instructions}")
])

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Retry parser wraps the original chain
retry_parser = RetryOutputParser.from_llm(
    parser=parser,
    llm=llm,
    max_retries=3
)

# Manual retry usage
chain = prompt | llm

def parse_with_retry(review_text: str) -> ProductReview:
    prompt_value = prompt.invoke({
        "review_text": review_text,
        "format_instructions": parser.get_format_instructions()
    })
    
    output = llm.invoke(prompt_value)
    
    try:
        return parser.parse(output.content)
    except Exception:
        # Retry with full context
        return retry_parser.parse_with_prompt(
            completion=output.content,
            prompt_value=prompt_value
        )

result = parse_with_retry("This laptop is amazing! Great battery life and speed, but a bit heavy.")
print(result)

Strategy 3: with_retry in LCEL

For LCEL chains, use the built-in .with_retry() method on any runnable:

from langchain_core.runnables import RunnableRetry
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import JsonOutputParser

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
parser = JsonOutputParser()

# Simple retry on the chain
chain_with_retry = (
    ChatPromptTemplate.from_template(
        "Extract a JSON object with 'title' and 'year' from: {text}\n\nReturn only JSON."
    )
    | llm
    | parser
).with_retry(
    retry_if_exception_type=(ValueError, Exception),
    stop_after_attempt=3,
    wait_exponential_jitter=True  # adds random jitter to exponential backoff
)

# Or configure more precisely
from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type

chain_with_advanced_retry = (
    ChatPromptTemplate.from_template("Extract JSON: {text}")
    | llm
    | parser
).with_retry(
    retry_if_exception_type=(ValueError,),
    stop_after_attempt=3,
)

try:
    result = chain_with_retry.invoke({"text": "The Matrix was released in 1999"})
    print(result)
except Exception as e:
    print(f"Failed after 3 attempts: {e}")

.with_retry() is the simplest way to add retry logic to any LCEL chain. It wraps the entire chain, not just the parser, so transient API errors and rate limits are also handled.

Strategy 4: Fallback Chains

Use .with_fallbacks() to define backup chains when the primary chain fails:

from langchain_core.runnables import RunnableWithFallbacks
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser, JsonOutputParser

# Primary chain: structured JSON output
primary_llm = ChatOpenAI(model="gpt-4o", temperature=0)
primary_parser = JsonOutputParser(pydantic_object=UserProfile)

primary_chain = (
    ChatPromptTemplate.from_template(
        "Extract user profile JSON from: {text}\n\n{format_instructions}"
    ).partial(format_instructions=primary_parser.get_format_instructions())
    | primary_llm
    | primary_parser
)

# Fallback 1: cheaper model with the same parser
fallback_llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
fallback_chain_1 = (
    ChatPromptTemplate.from_template(
        "Extract user profile JSON from: {text}\n\n{format_instructions}"
    ).partial(format_instructions=primary_parser.get_format_instructions())
    | fallback_llm
    | OutputFixingParser.from_llm(parser=primary_parser, llm=fallback_llm)
)

# Fallback 2: unstructured text extraction (no parsing step that can reject the output)
fallback_chain_2 = (
    ChatPromptTemplate.from_template(
        "Extract the person's name, age, and skills from this text as plain text: {text}"
    )
    | fallback_llm
    | StrOutputParser()
)

# Chain them together
resilient_chain = primary_chain.with_fallbacks(
    fallbacks=[fallback_chain_1, fallback_chain_2],
    exceptions_to_handle=(ValueError,)  # only fall back on parsing errors
)

# The chain tries primary, then fallback_1, then fallback_2
result = resilient_chain.invoke({
    "text": "Sarah, 28, works as a backend engineer specializing in Rust and distributed systems"
})
print(result)

This pattern is valuable when you need graceful degradation. A document processing pipeline might fall back from structured extraction to unstructured text rather than failing the entire job.
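The fallback order and the exception filter can be sketched in plain Python. This illustrates the control flow only, not LangChain's code: run_with_fallbacks, structured, and plain_text are hypothetical names. Runners are tried in order, and only the listed exception types trigger the next one.

```python
import json

def run_with_fallbacks(runners, value, handle=(ValueError,)):
    """Try each runner in order; move on only for exception types in `handle`."""
    last_error = None
    for run in runners:
        try:
            return run(value)
        except handle as e:
            last_error = e
    raise last_error

def structured(text: str) -> dict:
    """Hypothetical primary step: strict JSON, raises ValueError when malformed."""
    return json.loads(text)

def plain_text(text: str) -> str:
    """Hypothetical last resort: no parsing, so nothing to reject."""
    return text.strip()

good = run_with_fallbacks([structured, plain_text], '{"name": "Sarah"}')
degraded = run_with_fallbacks([structured, plain_text], " Sarah, 28, backend engineer ")
print(type(good).__name__, type(degraded).__name__)
```

Note that the degraded result is a string, not a dict. Downstream code has to check which shape it received before using it.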

Strategy 5: Custom Error Handler with LCEL

Build a custom error-handling wrapper that logs failures and attempts recovery:

from langchain_core.runnables import RunnableLambda
from langchain_core.output_parsers import JsonOutputParser
from langchain_openai import ChatOpenAI
import logging
import json
# OutputFixingParser and UserProfile are reused from the earlier strategies

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def create_resilient_parser(schema_class, llm=None, max_attempts=3):
    """Create a parser with logging, fixing, and fallback logic."""
    
    base_parser = JsonOutputParser(pydantic_object=schema_class)
    fix_llm = llm or ChatOpenAI(model="gpt-4o-mini", temperature=0)
    fixing_parser = OutputFixingParser.from_llm(parser=base_parser, llm=fix_llm)
    
    def parse_with_recovery(text: str):
        attempt = 0
        last_error = None
        
        while attempt < max_attempts:
            try:
                if attempt == 0:
                    # First attempt: direct parse
                    result = base_parser.parse(text)
                    logger.info(f"Parsed successfully on attempt {attempt + 1}")
                    return result
                elif attempt == 1:
                    # Second attempt: auto-fix
                    result = fixing_parser.parse(text)
                    logger.info(f"Fixed and parsed on attempt {attempt + 1}")
                    return result
                else:
                    # Third attempt: aggressive JSON extraction
                    extracted = extract_json_aggressive(text)
                    result = base_parser.parse(extracted)
                    logger.info(f"Aggressively extracted on attempt {attempt + 1}")
                    return result
            except Exception as e:
                last_error = e
                logger.warning(f"Attempt {attempt + 1} failed: {e}")
                attempt += 1
        
        logger.error(f"All {max_attempts} attempts failed. Last error: {last_error}")
        raise last_error
    
    return RunnableLambda(parse_with_recovery)

def extract_json_aggressive(text: str) -> str:
    """Try multiple strategies to extract valid JSON from messy text."""
    import re

    # Strategy 1: Prefer the contents of a fenced code block, if there is one
    code_block_match = re.search(r'```(?:json)?\s*([\s\S]*?)```', text)
    candidate = code_block_match.group(1).strip() if code_block_match else text.strip()

    # Strategy 2: Narrow to the span from the first "{" to the last "}"
    brace_match = re.search(r'\{[\s\S]*\}', candidate)
    if brace_match:
        candidate = brace_match.group(0)
        try:
            json.loads(candidate)
            return candidate
        except json.JSONDecodeError:
            pass  # still malformed: fall through to the cleanup pass

    # Strategy 3: Fix common issues on the extracted candidate
    cleaned = re.sub(r'/\*.*?\*/', '', candidate, flags=re.DOTALL)  # Remove block comments
    cleaned = re.sub(r'(?<!:)//.*$', '', cleaned, flags=re.MULTILINE)  # Remove JS comments, but keep "http://" URLs
    cleaned = re.sub(r',\s*([}\]])', r'\1', cleaned)  # Remove trailing commas (after comments are gone)

    return cleaned

# Usage
resilient = create_resilient_parser(UserProfile, max_attempts=3)
result = resilient.invoke('{"name": "Dave", "age": 35, "skills": ["Python", "Go",]}')
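Regex cleanup only repairs some failure classes, which is why the handler above needs more than one stage. The sketch below uses two hypothetical helpers, cleanup and can_parse, and three inputs modeled on the failure types from the start of this article. It prints whether each input parses before and after cleanup.

```python
import json
import re

def cleanup(text: str) -> str:
    """Hypothetical helper: strip // comments, then trailing commas."""
    text = re.sub(r'//.*$', '', text, flags=re.MULTILINE)
    return re.sub(r',\s*([}\]])', r'\1', text)

def can_parse(text: str) -> bool:
    """True when json.loads accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

samples = {
    "trailing comma": '{"name": "Dave", "skills": ["Python", "Go",]}  // note',
    "truncated": '{"name": "Bob", "age": 25, "skills": ["Python"',
    "missing comma": '{"name": "Charlie" "age": 28}',
}

# Maps each label to (parses as-is, parses after cleanup)
report = {label: (can_parse(text), can_parse(cleanup(text))) for label, text in samples.items()}
for label, outcome in report.items():
    print(label, outcome)
```

Only the trailing-comma sample becomes parseable. Truncated output and missing commas need the LLM-based fixer or a fresh model call. Because the regex pass is free, it is reasonable to try it before the paid fixing stage rather than after.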

Strategy 6: Pydantic Validators for Pre-Parse Correction

Add Pydantic validators that clean data before validation, reducing parse failures:

from pydantic import BaseModel, Field, field_validator, model_validator
from typing import List, Optional, Union
import json  # used by coerce_skills below
import re

class RobustUserProfile(BaseModel):
    name: str
    age: Union[int, str]  # accept string age and coerce
    skills: Union[List[str], str]  # accept comma-separated string
    email: Optional[str] = None

    @field_validator("age", mode="before")
    @classmethod
    def coerce_age(cls, v):
        """Accept '30 years old', '30', or 30."""
        if isinstance(v, int):
            return v
        # Extract digits
        digits = re.findall(r'\d+', str(v))
        if digits:
            return int(digits[0])
        raise ValueError(f"Cannot parse age from: {v}")

    @field_validator("skills", mode="before")
    @classmethod
    def coerce_skills(cls, v):
        """Accept list, comma-separated string, or JSON string."""
        if isinstance(v, list):
            return v
        if isinstance(v, str):
            # Try JSON parsing first
            try:
                parsed = json.loads(v)
                if isinstance(parsed, list):
                    return parsed
            except json.JSONDecodeError:
                pass
            # Fall back to comma splitting
            return [s.strip() for s in v.split(",") if s.strip()]
        return v

    @field_validator("email", mode="before")
    @classmethod
    def validate_email(cls, v):
        """Accept None, empty string, or valid email."""
        if not v:
            return None
        # Basic email validation — don't reject, just normalize
        v = str(v).strip().lower()
        return v if "@" in v else None

    @model_validator(mode="after")
    def check_age_range(self):
        """Post-parse validation with auto-correction."""
        if self.age < 0:
            self.age = 0
        if self.age > 150:
            self.age = 150
        return self

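# Test with messy inputs
# PydanticOutputParser validates against the model, so the validators above run.
# JsonOutputParser returns a plain dict and would skip them.
from langchain_core.output_parsers import PydanticOutputParser

parser = PydanticOutputParser(pydantic_object=RobustUserProfile)

messy_json = '{"name": "Eve", "age": "28 years old", "skills": "Python, ML, LangChain", "email": ""}'
result = parser.parse(messy_json)
print(repr(result))
# → RobustUserProfile(name='Eve', age=28, skills=['Python', 'ML', 'LangChain'], email=None)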

Pydantic validators that coerce types reduce the surface area for parsing failures significantly. The model can output "age": "28 years old" and your application still gets age=28.

Strategy 7: Exponential Backoff for Rate Limits

API rate limits cause a different category of failures. Handle them with proper exponential backoff:

import logging
import random
import time
from functools import wraps
from typing import Callable, TypeVar
import openai

logger = logging.getLogger(__name__)

T = TypeVar("T")

def exponential_backoff(
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    jitter: bool = True,
    retryable_exceptions=(openai.RateLimitError, openai.APIConnectionError, openai.APITimeoutError)
):
    """Decorator for exponential backoff with jitter."""
    def decorator(func: Callable[..., T]) -> Callable[..., T]:
        @wraps(func)
        def wrapper(*args, **kwargs) -> T:
            delay = base_delay
            last_exception = None
            
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except retryable_exceptions as e:
                    last_exception = e
                    if attempt == max_retries:
                        break
                    
                    actual_delay = min(delay, max_delay)
                    if jitter:
                        actual_delay *= (0.5 + random.random())
                    
                    logger.warning(
                        f"Attempt {attempt + 1}/{max_retries + 1} failed: {e}. "
                        f"Retrying in {actual_delay:.1f}s..."
                    )
                    time.sleep(actual_delay)
                    delay *= 2
                # Any other exception type is not caught here and propagates immediately
            
            raise last_exception
        return wrapper
    return decorator

# Apply to LangChain chain invocation
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import JsonOutputParser

@exponential_backoff(max_retries=5, base_delay=1.0)
def invoke_with_backoff(chain, input_data):
    return chain.invoke(input_data)

# Usage
llm = ChatOpenAI(model="gpt-4o")
parser = JsonOutputParser(pydantic_object=UserProfile)
chain = ChatPromptTemplate.from_template("Extract JSON: {text}") | llm | parser

result = invoke_with_backoff(chain, {"text": "Alice is 30 and knows Python"})
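To see what the decorator's defaults mean in wall-clock time, the sketch below recomputes the wait before each retry. backoff_bounds is a hypothetical helper that mirrors the arithmetic above: double the delay each time, cap it at max_delay, then apply the 0.5x to 1.5x jitter factor.

```python
def backoff_bounds(max_retries=5, base_delay=1.0, max_delay=60.0):
    """Return (nominal, low, high) wait times per retry under 0.5x-1.5x jitter."""
    schedule = []
    delay = base_delay
    for _ in range(max_retries):
        nominal = min(delay, max_delay)  # cap is applied before jitter
        schedule.append((nominal, nominal * 0.5, nominal * 1.5))
        delay *= 2
    return schedule

for retry, (nominal, low, high) in enumerate(backoff_bounds(), start=1):
    print(f"retry {retry}: {nominal:.1f}s nominal, {low:.1f}s to {high:.1f}s with jitter")
```

With the defaults, the nominal waits are 1, 2, 4, 8, and 16 seconds, about 31 seconds in total before the final attempt. Because jitter is applied after the cap, a capped delay can still reach up to 1.5 times max_delay (90 seconds for a 60-second cap). Apply the cap after jitter if you need a hard upper bound.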

LCEL Async Retry with Rate Limit Handling

For async production code:

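import asyncio
import logging
import random
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
# OutputFixingParser and UserProfile are reused from the earlier strategies

logger = logging.getLogger(__name__)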

async def robust_chain_invoke(chain, inputs: list[dict], max_concurrent: int = 5) -> list:
    """Run multiple chain invocations with concurrency limiting and retry."""
    semaphore = asyncio.Semaphore(max_concurrent)
    
    async def invoke_one(input_data: dict):
        async with semaphore:
            for attempt in range(3):
                try:
                    return await chain.ainvoke(input_data)
                except Exception as e:
                    if attempt == 2:
                        logger.error(f"Failed after 3 attempts: {e}")
                        return {"error": str(e), "input": input_data}
                    wait = 2 ** attempt + random.uniform(0, 1)
                    await asyncio.sleep(wait)
    
    tasks = [invoke_one(inp) for inp in inputs]
    return await asyncio.gather(*tasks)

# Process 50 documents concurrently (max 5 at a time)
chain_with_retry = (
    ChatPromptTemplate.from_template("Extract JSON from: {text}")
    | ChatOpenAI(model="gpt-4o-mini")
    | OutputFixingParser.from_llm(
        parser=JsonOutputParser(pydantic_object=UserProfile),
        llm=ChatOpenAI(model="gpt-4o-mini")
    )
)

documents = [{"text": f"User {i}: age {20+i}, skills Python"} for i in range(50)]
results = asyncio.run(robust_chain_invoke(chain_with_retry, documents, max_concurrent=5))
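You can check the concurrency limit without calling an API. In this sketch, fake_call is a hypothetical stand-in for chain.ainvoke that sleeps briefly, and run_limited records how many calls are in flight at once.

```python
import asyncio

async def run_limited(n_tasks: int, max_concurrent: int) -> int:
    """Run n_tasks fake calls and return the peak number running at once."""
    semaphore = asyncio.Semaphore(max_concurrent)
    running = 0
    peak = 0

    async def fake_call(i: int) -> int:
        nonlocal running, peak
        async with semaphore:
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stands in for chain.ainvoke(...)
            running -= 1
            return i

    results = await asyncio.gather(*(fake_call(i) for i in range(n_tasks)))
    assert results == list(range(n_tasks))  # gather preserves input order
    return peak

peak = asyncio.run(run_limited(n_tasks=20, max_concurrent=5))
print(peak)
```

The peak never exceeds max_concurrent, and asyncio.gather returns results in input order. That ordering is what lets robust_chain_invoke match each result, or each error dict, back to its document.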

Comparison of All 7 Strategies

Strategy	Best For	Extra Cost	Complexity
OutputFixingParser	JSON syntax errors, markdown wrapping	1 LLM call per failure	Low
RetryOutputParser	Schema mismatch, wrong field names	1 LLM call per failure	Low
with_retry LCEL	Transient errors, rate limits	1 full chain re-run per retry	Low
Fallback chains	Graceful degradation	Varies by fallback	Medium
Custom error handler	Logging + staged recovery	1 LLM call when the fixing stage runs	Medium
Pydantic validators	Type coercion, data normalization	Zero	Low
Exponential backoff	Rate limits, API errors	Zero (timing only)	Low

Recommended production stack:

Pydantic validators — zero cost, handles type coercion automatically
OutputFixingParser — catches the majority of LLM formatting mistakes
with_retry — handles transient API failures
Exponential backoff — handles rate limits gracefully

Monitoring Parse Failures in Production

Track your failure rate to know when prompts need improvement:

from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class ParseMetrics:
    total: int = 0
    successes: int = 0
    fixed: int = 0
    failed: int = 0
    errors: list = field(default_factory=list)

    @property
    def success_rate(self):
        return self.successes / self.total if self.total > 0 else 0

    @property
    def fix_rate(self):
        return self.fixed / self.total if self.total > 0 else 0

metrics = ParseMetrics()

def tracked_parse(text: str, parser, fixing_parser):
    metrics.total += 1
    try:
        result = parser.parse(text)
        metrics.successes += 1
        return result
    except Exception:
        # Direct parse failed: try the LLM-based fixer before giving up
        try:
            result = fixing_parser.parse(text)
            metrics.fixed += 1
            return result
        except Exception as e2:
            metrics.failed += 1
            metrics.errors.append({"text": text[:100], "error": str(e2), "time": datetime.now().isoformat()})
            raise  # re-raise with the original traceback

# After processing 1000 documents:
print(f"Success rate: {metrics.success_rate:.1%}")
print(f"Fix rate: {metrics.fix_rate:.1%}")
print(f"Failure rate: {metrics.failed/metrics.total:.1%}")

As a rule of thumb, a fix rate above 20% means your prompts need work. The most common fix is adding explicit instructions like "Return ONLY the JSON object, no explanation, no markdown" to the prompt.
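The threshold check is easy to automate. The sketch below uses a hypothetical helper, review_needed, that takes raw counts and applies the 20% figure from this article as its default. Tune the threshold to your own workload.

```python
def review_needed(successes: int, fixed: int, failed: int, fix_threshold: float = 0.20) -> dict:
    """Summarize parse outcomes and flag prompts whose fix rate is above the threshold."""
    total = successes + fixed + failed
    if total == 0:
        # No data yet: report zero rates instead of dividing by zero
        return {"fix_rate": 0.0, "failure_rate": 0.0, "review_prompt": False}
    fix_rate = fixed / total
    return {
        "fix_rate": fix_rate,
        "failure_rate": failed / total,
        "review_prompt": fix_rate > fix_threshold,
    }

summary = review_needed(successes=700, fixed=250, failed=50)
print(summary)
```

Here 250 of 1,000 outputs needed fixing, a 25% fix rate, so the prompt is flagged for review even though only 5% of documents failed outright.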
