Build a Content Research Agent with AutoGPT (Trends, Outlines)
Build an AutoGPT content research agent that finds trending topics, analyzes SERPs, and generates SEO-ready outlines automatically — full workflow inside.
Content planning is one of the most time-consuming parts of a content marketing workflow — and one of the most automatable. Trend research, competitor gap analysis, keyword clustering, outline generation — these are all pattern-recognition tasks that an autonomous agent handles well when set up correctly.
This guide builds a complete AutoGPT content research agent from scratch. By the end, you will have an agent that finds trending topics in a niche, analyzes what ranks on page one, identifies gaps competitors are missing, and produces a structured, SEO-ready content outline — all from a single goal prompt.
What the Agent Actually Does
Before touching any code, it helps to map out the full workflow. The agent will execute these steps autonomously:
- Trend identification — Search for rising topics in your niche using Google Trends signals and recent high-traffic content
- SERP analysis — Pull the top 10 results for target keywords and summarize the angle each piece takes
- Competitor gap analysis — Compare what ranks versus what questions remain unanswered
- Keyword clustering — Group related terms by intent (informational, commercial, navigational)
- Outline generation — Build a structured H2/H3 outline with word count recommendations per section
- Export — Save everything to organized markdown files in the workspace
This is content planning automation at a level that previously required either a team or a stack of separate tools. AutoGPT handles the orchestration.
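The keyword clustering step can be approximated without an LLM when you want a quick sanity check on the agent's groupings. The sketch below is a rule-based illustration, not what AutoGPT does internally; the marker word lists and the `classify_intent` and `cluster_by_intent` names are assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical marker words; tune these for your niche.
COMMERCIAL_MARKERS = {"best", "vs", "review", "pricing", "alternative", "alternatives", "comparison"}
NAVIGATIONAL_MARKERS = {"login", "docs", "documentation", "download", "github", "official"}


def classify_intent(keyword: str) -> str:
    """Label a keyword as commercial, navigational, or informational."""
    words = set(keyword.lower().split())
    if words & COMMERCIAL_MARKERS:
        return "commercial"
    if words & NAVIGATIONAL_MARKERS:
        return "navigational"
    # Default bucket: how-to, what-is, and tutorial style queries.
    return "informational"


def cluster_by_intent(keywords: list[str]) -> dict[str, list[str]]:
    """Group keywords by their classified intent, preserving input order."""
    clusters: dict[str, list[str]] = defaultdict(list)
    for kw in keywords:
        clusters[classify_intent(kw)].append(kw)
    return dict(clusters)
```

Word-list matching like this misreads plenty of queries, so treat it as a cross-check on the agent's clusters rather than a replacement for them.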
Prerequisites and Setup
You need:
- AutoGPT installed (Docker or pip)
- OpenAI API key with GPT-4o access
- Serper.dev API key (free tier: 2,500 searches/month)
- Optional: Reddit API credentials for community trend data
- A workspace directory with write access
Install dependencies:
pip install autogpt-core serper-python requests python-dotenv
Configure your environment:
# .env configuration for content research agent
OPENAI_API_KEY=your_openai_key
SERPER_API_KEY=your_serper_key
SMART_LLM_MODEL=gpt-4o
FAST_LLM_MODEL=gpt-4o-mini
MEMORY_BACKEND=local_json
WORKSPACE_DIRECTORY=./content-research-workspace
RESTRICT_TO_WORKSPACE=True
CONTINUOUS_LIMIT=30
EXECUTE_LOCAL_COMMANDS=False
Setting CONTINUOUS_LIMIT=30 gives the agent enough steps to complete a full research run, from trend research through outline generation, while preventing runaway behavior.
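Before starting a run, it can help to confirm that the required settings are actually present. This is a minimal sketch that assumes the variable names from the .env file above; `missing_settings` is a hypothetical helper, not part of AutoGPT.

```python
import os

REQUIRED_KEYS = ["OPENAI_API_KEY", "SERPER_API_KEY", "WORKSPACE_DIRECTORY"]


def missing_settings(env=None) -> list[str]:
    """Return required keys that are absent, empty, or still placeholders."""
    env = os.environ if env is None else env
    missing = []
    for key in REQUIRED_KEYS:
        value = env.get(key, "").strip()
        # Template values such as "your_openai_key" count as not configured.
        if not value or value.startswith("your_"):
            missing.append(key)
    return missing
```

Calling `missing_settings()` after `load_dotenv()` and aborting when the list is non-empty avoids spending agent steps on a run that will fail at the first search.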
Goal Configuration for SEO Research
The quality of your agent's output depends heavily on how you write the goal. Content research goals need to be specific about the niche, the intended audience, and the output format.
Here is a goal template that consistently produces useful results:
# content_research_goal.py
agent_name = "ContentResearchAgent"
agent_role = """
A specialized SEO content research assistant that identifies high-opportunity
content topics, analyzes search intent, and produces structured content outlines.
You work methodically, saving intermediate research before building on it.
"""
agent_goals = [
"""
Research trending topics in the niche: [YOUR NICHE - e.g., 'Python machine learning tutorials']
Target audience: [AUDIENCE - e.g., 'intermediate Python developers learning ML']
Step 1: Search for 'trending [niche] topics 2026' and 'most searched [niche] questions'.
Identify 10 candidate topics with apparent search interest. Save to 'candidate_topics.md'.
""",
"""
For each of the top 5 candidate topics from candidate_topics.md:
- Search Google for the exact topic phrase
- Note the titles and angles of the top 5 results
- Identify what type of content ranks (tutorials, comparisons, lists, case studies)
Save SERP analysis to 'serp_analysis.md' with one section per topic.
""",
"""
Analyze serp_analysis.md to identify content gaps:
- What questions do the top results NOT answer?
- What angles are missing (e.g., all guides are beginner-level, no intermediate content)
- Which topics have weak competition (thin content, outdated articles)
Save gap analysis to 'content_gaps.md' with opportunity scores (High/Medium/Low).
""",
"""
Select the single highest-opportunity topic from content_gaps.md.
Generate a complete content outline including:
- Title (SEO-optimized, under 60 characters)
- Target keyword and 5 secondary keywords
- Meta description (155-160 characters)
- H2 and H3 headings with 2-3 sentence descriptions of each section
- Recommended word count per section
- Internal linking opportunities
- FAQ section with 5 questions
Save to 'content_outline_[topic-slug].md'.
""",
"""
Create a summary file 'research_summary.md' that includes:
- The selected topic and why it was chosen
- Estimated content length and time to rank
- Top 3 competitor URLs to analyze before writing
- Recommended publish date based on trend timing
Stop after saving this file.
"""
]
agent_constraints = [
"Save intermediate results to files before proceeding to the next step",
"Use only web search and file write tools",
"Do not generate the full article content — outlines and research only",
"Cite sources for trend data",
"Stop after completing all 5 goal steps"
]
The key design decision here is breaking the research into explicit, sequential steps with file saves between each one. This does two things: it prevents context loss as the agent runs longer, and it gives you checkpoints to review and redirect if the research is heading in the wrong direction.
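The checkpoint files also make it easy to see where a run stopped. The sketch below assumes the file names from the goal template; `STEP_FILES` and `next_pending_step` are hypothetical names, and the outline step is matched with a glob pattern because its file name includes a topic slug.

```python
from pathlib import Path
from typing import Optional

# (step number, glob pattern) pairs, in the order the goals produce them.
STEP_FILES = [
    (1, "candidate_topics.md"),
    (2, "serp_analysis.md"),
    (3, "content_gaps.md"),
    (4, "content_outline_*.md"),
    (5, "research_summary.md"),
]


def next_pending_step(workspace: Path) -> Optional[int]:
    """Return the first step whose output file is missing or empty, or None if all exist."""
    for step, pattern in STEP_FILES:
        # An empty file usually means the agent was interrupted mid-write.
        matches = [p for p in workspace.glob(pattern) if p.stat().st_size > 0]
        if not matches:
            return step
    return None
```

If the function returns 3, for example, the first two files are ready for review and the gap analysis is the place to redirect the agent.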
Tool Integrations for Richer Research
AutoGPT's built-in web search covers most content research needs, but a few additional integrations significantly improve output quality.
Google Search via Serper
Serper returns structured Google results including featured snippets, "people also ask" boxes, and related searches — exactly what you need for content gap analysis.
# tools/serper_search.py
import os

import requests


def serper_search(query: str, num_results: int = 10) -> dict:
    """
    Perform a Google search via Serper API and return structured results.
    Useful for SERP analysis and competitor research.
    """
    url = "https://google.serper.dev/search"
    payload = {
        "q": query,
        "num": num_results,
        "gl": "us",  # Country for results
        "hl": "en",  # Language
    }
    headers = {
        # Fails fast with KeyError if the key is not configured
        "X-API-KEY": os.environ["SERPER_API_KEY"],
        "Content-Type": "application/json",
    }
    response = requests.post(url, headers=headers, json=payload, timeout=30)
    response.raise_for_status()  # Surface auth and quota errors immediately
    data = response.json()

    # Extract the most useful fields for content research
    results = {
        "organic": [
            {
                "position": r.get("position"),
                "title": r.get("title"),
                "url": r.get("link"),
                "snippet": r.get("snippet"),
                "date": r.get("date", "Unknown"),
            }
            for r in data.get("organic", [])
        ],
        "people_also_ask": [
            q.get("question") for q in data.get("peopleAlsoAsk", [])
        ],
        "related_searches": [
            s.get("query") for s in data.get("relatedSearches", [])
        ],
    }
    return results


def format_serp_for_agent(query: str) -> str:
    """Format SERP results as readable text for agent context."""
    results = serper_search(query)
    output = [f"## SERP Results for: {query}\n"]

    output.append("### Top Results:")
    for r in results["organic"][:5]:
        output.append(f"{r['position']}. **{r['title']}**")
        output.append(f"   URL: {r['url']}")
        output.append(f"   {r['snippet']}\n")

    if results["people_also_ask"]:
        output.append("\n### People Also Ask:")
        for q in results["people_also_ask"]:
            output.append(f"- {q}")

    if results["related_searches"]:
        output.append("\n### Related Searches:")
        for s in results["related_searches"]:
            output.append(f"- {s}")

    return "\n".join(output)
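The "people also ask" questions become gap candidates once you compare them with what the ranking pages already say. The sketch below is one crude heuristic, assuming the dict shape returned by `serper_search`: a question counts as unanswered when few of its content words appear in any title or snippet. The `unanswered_questions` name, the stop-word list, and the 0.5 threshold are assumptions for illustration.

```python
import re

STOP_WORDS = {"what", "is", "the", "a", "an", "how", "do", "i", "can", "to", "for",
              "of", "and", "in", "on", "between", "than", "much", "need"}


def _content_words(text: str) -> set[str]:
    """Lowercase word set with common stop words removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}


def unanswered_questions(results: dict, threshold: float = 0.5) -> list[str]:
    """Return PAA questions whose content words are mostly absent from top results."""
    corpus: set[str] = set()
    for r in results.get("organic", []):
        corpus |= _content_words(f"{r.get('title') or ''} {r.get('snippet') or ''}")

    gaps = []
    for question in results.get("people_also_ask", []):
        words = _content_words(question or "")
        if not words:
            continue
        # Share of the question's content words that the top results mention.
        coverage = len(words & corpus) / len(words)
        if coverage < threshold:
            gaps.append(question)
    return gaps
```

Titles and snippets are only a thin proxy for page content, so a flagged question is a prompt to open the ranking pages, not proof of a gap.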
Reddit Trend Research
Reddit's "people also ask" equivalent is its upvoted posts. High-upvote posts in niche subreddits are a reliable signal for what content resonates with real audiences.
# tools/reddit_trends.py
import requests


def get_reddit_trending(subreddit: str, limit: int = 25) -> list:
    """
    Fetch trending posts from a subreddit to identify content opportunities.
    Uses the public JSON listing, which needs no OAuth for read-only access
    but is rate-limited.
    """
    url = f"https://www.reddit.com/r/{subreddit}/top.json"
    params = {
        "t": "month",  # Top posts from last month
        "limit": limit,
    }
    headers = {"User-Agent": "ContentResearchBot/1.0"}
    response = requests.get(url, params=params, headers=headers, timeout=30)
    response.raise_for_status()  # Fail clearly on rate limits or blocked requests
    data = response.json()

    posts = []
    for post in data["data"]["children"]:
        p = post["data"]
        posts.append({
            "title": p["title"],
            "score": p["score"],
            "comments": p["num_comments"],
            "url": p["url"],
            "self_text": p.get("selftext", "")[:200],  # First 200 chars
        })
    return sorted(posts, key=lambda x: x["score"], reverse=True)


def identify_content_signals(subreddit: str) -> str:
    """Convert Reddit trending data into content planning signals."""
    posts = get_reddit_trending(subreddit)
    output = [f"## Reddit Content Signals: r/{subreddit}\n"]
    output.append("### High-Engagement Topics (last month):\n")
    for i, post in enumerate(posts[:10], 1):
        output.append(f"{i}. **{post['title']}**")
        output.append(f"   Score: {post['score']} | Comments: {post['comments']}")
        if post["self_text"]:
            output.append(f"   Context: {post['self_text'][:150]}...")
        output.append("")
    return "\n".join(output)
Output Format Examples
When the agent runs successfully, here is what the output files look like.
candidate_topics.md (Step 1 Output)
# Candidate Topics Research
Generated: 2026-05-31
## Trending Topics in: Python Machine Learning Tutorials
### High-Interest Candidates
1. **LLM fine-tuning with LoRA** — Multiple recent posts, rapidly growing search volume
2. **Vector database comparison 2026** — High search intent, commercial + informational
3. **AutoML vs manual feature engineering** — Long-tail opportunity, low competition
4. **Python agents with LangGraph** — New topic, early mover advantage
5. **ML model deployment on AWS Lambda** — Practical tutorial gap identified
### Medium-Interest Candidates
6. Scikit-learn pipeline optimization
7. Time series forecasting with Prophet
8. Hugging Face model quantization
9. MLflow experiment tracking tutorial
10. FastAPI ML model serving
### Trend Signals
- "LLM fine-tuning" searches: +340% YoY
- "AI agents Python" queries: +210% YoY
- "AutoML" declining slightly from 2025 peak
serp_analysis.md (Step 2 Output)
# SERP Analysis
Date: 2026-05-31
## Topic: LLM fine-tuning with LoRA
### Top 5 Results Analysis
**Position 1: Hugging Face Blog — "Fine-Tuning LLMs with LoRA"**
- Type: Technical tutorial
- Angle: Official documentation style, high technical depth
- Published: 2025-03-15
- Gap: No cost comparison, assumes cloud compute
**Position 2: Towards Data Science — "LoRA Explained for Practitioners"**
- Type: Conceptual explanation + code
- Angle: Beginner-friendly, good visualizations
- Published: 2025-08-20
- Gap: No comparison with QLoRA, no local GPU instructions
**Position 3: YouTube transcript — freeCodeCamp**
- Type: Video-turned-article
- Angle: Walkthrough with Colab notebook
- Gap: Text content is thin, no troubleshooting section
### People Also Ask (from SERP)
- "What is the difference between LoRA and full fine-tuning?"
- "Can I fine-tune an LLM on consumer hardware?"
- "How much training data do I need for LoRA fine-tuning?"
- "Is QLoRA better than LoRA for small datasets?"
### Content Gap Identified
No comprehensive comparison of LoRA vs QLoRA vs full fine-tuning with cost/quality trade-off table. Most guides assume cloud GPUs — no content targets consumer hardware specifically.
content_outline_llm-finetuning-lora-qlora-comparison.md (Step 4 Output)
# Content Outline
## Target Topic: LoRA vs QLoRA for LLM Fine-Tuning
**SEO Title:** "LoRA vs QLoRA: Which Fine-Tuning Method is Right for You? (2026)"
**Target Keyword:** LLM fine-tuning LoRA comparison
**Secondary Keywords:** QLoRA tutorial, fine-tune LLM consumer GPU, LoRA vs full fine-tuning, PEFT methods 2026
**Meta Description:** Compare LoRA and QLoRA for LLM fine-tuning: memory requirements, training speed, output quality, and when to choose each method. Includes code examples. (158 chars)
---
## Recommended Structure (Target: 3,500 words)
### Introduction (~300 words)
Hook with the core trade-off: LoRA trains fast but needs VRAM, QLoRA compresses the model but adds complexity. State what the reader will know by the end.
### What is LoRA? (~400 words)
- Low-rank adaptation explained visually
- What "rank" means and how it controls capacity vs efficiency
- When LoRA was introduced and which papers matter
- Code: basic LoRA config with PEFT library
### What is QLoRA? (~400 words)
- 4-bit quantization + LoRA explained
- bitsandbytes library role
- Memory savings in practice: 65B model on single GPU
- Code: QLoRA config example
### LoRA vs QLoRA Comparison Table (~200 words)
| Factor | LoRA | QLoRA |
|---|---|---|
| VRAM requirement | High | Low (4x reduction) |
| Training speed | Faster | Slightly slower |
| Output quality | Slightly higher | Near-identical |
| Hardware requirement | A100/H100 recommended | Consumer 24GB GPU viable |
| Complexity | Lower | Higher (quantization bugs) |
### Consumer Hardware Deep-Dive (~500 words)
- RTX 4090 benchmark: 7B and 13B models
- Minimum viable hardware table
- Common OOM errors and fixes
- Code: memory-efficient training config
### Step-by-Step QLoRA Tutorial (~800 words)
- Environment setup
- Loading base model with 4-bit config
- Training loop
- Merging adapter weights
- Evaluation
### When to Choose Each Method (~300 words)
- Use LoRA when: cloud compute, speed priority, larger budgets
- Use QLoRA when: consumer GPU, cost sensitivity, 7B-13B models
### FAQ Section (~300 words)
5 questions from "People Also Ask" research
### Internal Links
- [Hugging Face transformers tutorial](/post/hugging-face-transformers-tutorial)
- [Vector database guide](/post/vector-database-guide)
- [Deploy AI model to production](/post/deploy-ai-model-production)
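The goal asks for a title under 60 characters and a meta description of 155 to 160 characters, and character counts reported by a language model are worth re-checking. The sketch below measures both fields directly. It assumes the `**SEO Title:**` and `**Meta Description:**` labels used in the sample outline; `check_outline_metadata` is a hypothetical helper.

```python
import re


def _field(markdown: str, label: str) -> str:
    """Extract the text after a bold '**Label:**' marker on a single line."""
    match = re.search(rf"^\*\*{re.escape(label)}:\*\*\s*(.+)$", markdown, flags=re.MULTILINE)
    return match.group(1).strip() if match else ""


def check_outline_metadata(markdown: str) -> list[str]:
    """Return human-readable problems with the title and meta description lengths."""
    problems = []
    title = _field(markdown, "SEO Title").strip('"')
    # Drop a trailing self-reported count such as "(158 chars)" before measuring.
    meta = re.sub(r"\s*\(\d+ chars\)$", "", _field(markdown, "Meta Description"))

    if not title:
        problems.append("missing SEO title")
    elif len(title) >= 60:
        problems.append(f"title is {len(title)} characters (goal: under 60)")

    if not meta:
        problems.append("missing meta description")
    elif not 155 <= len(meta) <= 160:
        problems.append(f"meta description is {len(meta)} characters (goal: 155-160)")
    return problems
```

An empty list means both fields meet the limits stated in the goal; anything else is a short list of fixes to hand back to the agent or the editor.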
Running the Complete Workflow
With the goal configuration and tools defined, here is the complete execution script:
# run_content_research.py
import os
import sys
from datetime import datetime
from pathlib import Path

from dotenv import load_dotenv

load_dotenv()


def setup_workspace() -> Path:
    """Create workspace directories for this research run."""
    workspace = Path(os.environ.get("WORKSPACE_DIRECTORY", "./content-research-workspace"))
    workspace.mkdir(parents=True, exist_ok=True)
    # Timestamp via datetime rather than shelling out to `date`, so it also works on Windows
    run_id = datetime.now().strftime("%Y%m%d_%H%M%S")
    run_dir = workspace / f"research_run_{run_id}"
    run_dir.mkdir(exist_ok=True)
    os.environ["WORKSPACE_DIRECTORY"] = str(run_dir)
    return run_dir


def configure_agent(niche: str, audience: str, workspace: Path) -> dict:
    """Build agent configuration for a content research run."""
    goals = [
        f"Search for trending topics in '{niche}' targeting '{audience}'. "
        f"Find 10 high-interest candidates with apparent search demand. "
        f"Save to '{workspace}/candidate_topics.md'.",

        f"For the top 5 topics from candidate_topics.md, search Google for each. "
        f"Document the title, URL, angle, and content gaps in top 5 results. "
        f"Save SERP analysis to '{workspace}/serp_analysis.md'.",

        f"Analyze serp_analysis.md. Score each topic High/Medium/Low opportunity "
        f"based on content quality of existing results and unanswered questions. "
        f"Save to '{workspace}/content_gaps.md'.",

        f"Select the highest opportunity topic. Generate a complete H2/H3 content "
        f"outline with section descriptions, word counts, and internal link suggestions. "
        f"Save to '{workspace}/content_outline.md'.",

        f"Write a one-page research summary to '{workspace}/research_summary.md' "
        f"covering the chosen topic, rationale, competitor URLs, and next steps. "
        f"Stop after saving.",
    ]
    return {
        "name": "ContentResearcher",
        "role": "SEO content research specialist",
        "goals": goals,
        "constraints": [
            "Save files between each step",
            "Research only, no full article writing",
            "Maximum 30 agent steps",
            "Use web search and file tools only",
        ],
    }


def display_results(workspace: Path) -> None:
    """Print a summary of generated files."""
    files = list(workspace.glob("*.md"))
    print(f"\nGenerated {len(files)} research files:")
    for f in sorted(files):
        size_kb = f.stat().st_size / 1024
        print(f"  {f.name} ({size_kb:.1f} KB)")

    # Show outline preview if it exists
    outline = workspace / "content_outline.md"
    if outline.exists():
        print("\n--- Content Outline Preview ---")
        print(outline.read_text()[:1500])


if __name__ == "__main__":
    niche = sys.argv[1] if len(sys.argv) > 1 else "Python machine learning"
    audience = sys.argv[2] if len(sys.argv) > 2 else "intermediate developers"

    workspace = setup_workspace()
    config = configure_agent(niche, audience, workspace)

    print(f"Starting content research for: {niche}")
    print(f"Target audience: {audience}")
    print(f"Workspace: {workspace}")
    print("Running agent...\n")

    # Agent execution
    # In production: pass config to your AutoGPT runner

    display_results(workspace)
Run it with:
python run_content_research.py "Python machine learning" "intermediate developers"
python run_content_research.py "home automation with Home Assistant" "DIY smart home enthusiasts"
python run_content_research.py "e-commerce email marketing" "Shopify store owners"
Comparing Manual vs Agent-Assisted Research
| Task | Manual Time | Agent Time | Quality Difference |
|---|---|---|---|
| Initial topic ideation (20 ideas) | 45 min | 8 min | Comparable |
| SERP analysis (5 topics) | 90 min | 12 min | Agent misses visual/UX signals |
| Content gap identification | 60 min | 15 min | Human catches nuanced intent better |
| Keyword clustering | 30 min | 5 min | Similar quality |
| H2/H3 outline generation | 45 min | 10 min | Agent produces good first drafts |
| Total | 270 min | 50 min | Human review adds 30 min |
The agent saves roughly three hours per content research cycle: 270 minutes of manual work against 50 minutes of agent time plus 30 minutes of human review. The remaining gap is quality review, where a human content strategist reads the outputs and makes judgment calls the agent cannot. That combination (agent for speed, human for judgment) is the correct model for content planning automation.
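The totals in the table can be reproduced with a few lines of arithmetic, using the per-task minutes listed above and the 30-minute review estimate. The variable names are illustrative.

```python
# Minutes per task, copied from the comparison table: (manual, agent)
TASKS = {
    "Initial topic ideation": (45, 8),
    "SERP analysis": (90, 12),
    "Content gap identification": (60, 15),
    "Keyword clustering": (30, 5),
    "H2/H3 outline generation": (45, 10),
}
HUMAN_REVIEW_MIN = 30  # review time added on top of the agent run

manual_total = sum(manual for manual, _ in TASKS.values())
agent_total = sum(agent for _, agent in TASKS.values())
net_saved = manual_total - (agent_total + HUMAN_REVIEW_MIN)

print(f"Manual: {manual_total} min, agent: {agent_total} min")
print(f"Net saving after review: {net_saved} min ({net_saved / 60:.1f} h)")
```

Swapping in your own team's timings shows quickly whether the agent is worth running for a given content cluster.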
For teams managing large content operations, the CrewAI tutorial shows how to extend this into multi-agent workflows where one agent researches while another drafts and a third edits, all coordinated automatically.
Connecting to Your Content Workflow
The agent's output files integrate naturally with downstream tools:
CMS Integration: The outline file structure maps directly to WordPress or Contentful post drafts. A simple script parses the outline and creates draft posts via REST API.
SEO Tool Validation: Export the keyword list to Ahrefs or SEMrush for volume verification before committing to a content calendar.
Team Handoff: The research summary and outline files are the complete brief a writer needs. No context-transfer meetings required.
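As a rough sketch of the CMS step, the functions below turn an outline file into a WordPress draft through the REST API's posts endpoint. They assume authentication with a WordPress application password and the `**SEO Title:**` label from the sample outline; `build_draft_payload` and `create_wordpress_draft` are hypothetical names. The example uses only the standard library to stay dependency-free, and it sends the outline as raw markdown, so convert it to HTML first if your site does not render markdown.

```python
import base64
import json
import re
import urllib.request


def build_draft_payload(outline_markdown: str) -> dict:
    """Build the JSON body for a draft post from an outline file's text."""
    match = re.search(r"^\*\*SEO Title:\*\*\s*(.+)$", outline_markdown, flags=re.MULTILINE)
    title = match.group(1).strip().strip('"') if match else "Untitled outline"
    return {
        "title": title,
        "content": outline_markdown,
        "status": "draft",  # keep a human approval step before publishing
    }


def create_wordpress_draft(site_url: str, user: str, app_password: str, outline_markdown: str) -> int:
    """POST the draft to /wp-json/wp/v2/posts and return the new post ID."""
    # HTTP Basic auth header built from the user name and application password.
    token = base64.b64encode(f"{user}:{app_password}".encode()).decode()
    request = urllib.request.Request(
        f"{site_url.rstrip('/')}/wp-json/wp/v2/posts",
        data=json.dumps(build_draft_payload(outline_markdown)).encode("utf-8"),
        headers={"Authorization": f"Basic {token}", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())["id"]
```

Keeping the payload builder separate from the network call makes it possible to inspect or test the draft body before anything is sent to the site.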
For teams using LangChain for more complex workflows, the guide Build AI agent with LangChain covers the orchestration patterns that let you chain this research agent with a writing agent and an SEO-check agent in a single pipeline.
The AI research agent build guide covers persistent agent architecture for organizations running daily automated research cycles rather than on-demand runs.
Getting the Most Out of Your Agent
A few patterns that consistently improve content research agent output:
Be specific about the niche. "Python machine learning for intermediate developers" produces better-targeted results than "technology." The agent uses your niche description in search queries directly.
Review candidate_topics.md before continuing. The 10 candidate topics are worth a 5-minute human review. If three of them are off-brand or already covered in your existing content, note that in the goal before the agent proceeds to SERP analysis.
Calibrate the opportunity score criteria. Tell the agent what "high opportunity" means for your site specifically. A new site should target low-competition long-tail terms. An established site can target competitive terms. Add this context to the constraint list.
Run weekly, not daily. Content trends change at a weekly cadence, not daily. Running this agent weekly for each content cluster gives you a consistent pipeline without unnecessary API spend.
Content planning automation is one of the highest-ROI applications of autonomous agents for content teams. The research is mechanical enough to delegate but time-consuming enough that automating it genuinely changes what a small team can produce. This agent handles the mechanical part — you handle the strategy.
Frequently Asked Questions
How accurate is AutoGPT's keyword research compared to dedicated SEO tools? AutoGPT aggregates data from web search APIs and can identify trending topics, but it does not have access to actual search volume data like Ahrefs or SEMrush. It works best as an ideation and trend-spotting layer that feeds into a proper SEO tool for volume validation.
Can this agent post content directly to WordPress or a CMS? Yes, if you configure the WordPress REST API integration. AutoGPT can call the API to create draft posts. However, publishing directly without human review is not recommended — use it to create drafts that a human editor approves before publishing.
How long does a full content research run take? A typical run covering one topic cluster — trend identification, SERP analysis, competitor gaps, and outline generation — takes 15 to 30 minutes and uses roughly 20,000 to 35,000 tokens. Cost varies by model choice but averages $0.50 to $1.50 per full run with GPT-4o.
Can AutoGPT research multiple topics in one run? Yes, but it is more reliable to run separate sessions per topic cluster rather than asking the agent to handle five unrelated topics in one goal. Single-focus runs produce better-structured outputs and are easier to debug when something goes wrong.
What search APIs work best with AutoGPT for content research? Serper.dev offers the best cost-to-quality ratio for Google search results. Bing Search API is reliable and has a free tier. For Reddit and forum research, use the Reddit API directly. Combining Serper for Google results with Reddit API for community signals gives the most well-rounded content intelligence.
AiTechWorlds Team