Architecture & Design

The 5-Agent Pipeline

ResearchCrew executes research as a sequential pipeline of 5 specialized agents. Each agent handles a specific aspect of the research process.

Pipeline Flow

Research Topic
    ↓
[1] Research Planner
    ├─ Analyzes the research topic
    ├─ Identifies gaps based on prior research
    └─ Generates 2-3 concrete search queries
    ↓
[2] Web Crawler
    ├─ Searches for authoritative URLs (EXASearch)
    ├─ Applies quality filters
    └─ Deduplicates against previously crawled content
    ↓
[3] Content Extractor
    ├─ Fetches full page content
    ├─ Extracts verifiable claims with quotes
    └─ Records confidence levels per claim
    ↓
[4] Synthesis Researcher
    ├─ Groups claims by theme/subtopic
    ├─ Identifies patterns and contradictions
    └─ Flags conflicting information
    ↓
[5] Reporting Analyst
    ├─ Writes narrative prose
    ├─ Integrates claims into coherent findings
    └─ Generates [claim](URL) citations
    ↓
Publication-Ready Report
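
The flow above can be sketched as a plain chain of functions. This is only an illustration of the sequential hand-off, not ResearchCrew's actual implementation: every stage function and return value below is a hypothetical stand-in for the corresponding agent.

```python
from typing import Any, Callable

# Hypothetical stand-ins for the five agents. Each one consumes the
# previous stage's output and returns its own.
def plan(topic: str) -> list[str]:
    return [f"{topic} implementation challenges", f"{topic} cost analysis"]

def crawl(queries: list[str]) -> list[str]:
    return [f"https://example.com/{i}" for i, _ in enumerate(queries)]

def extract(urls: list[str]) -> list[dict]:
    return [{"url": u, "claim": "placeholder claim", "confidence": "LOW"} for u in urls]

def synthesize(claims: list[dict]) -> dict[str, list[dict]]:
    return {"general": claims}

def report(themes: dict[str, list[dict]]) -> str:
    lines = []
    for theme, claims in themes.items():
        lines.append(f"## {theme}")
        lines.extend(f"[{c['claim']}]({c['url']})" for c in claims)
    return "\n".join(lines)

STAGES: list[Callable[[Any], Any]] = [plan, crawl, extract, synthesize, report]

def run_pipeline(topic: str) -> tuple[str, list[Any]]:
    """Run the stages in order, keeping every intermediate output for inspection."""
    intermediate = []
    data: Any = topic
    for stage in STAGES:
        data = stage(data)
        intermediate.append(data)
    return data, intermediate
```

Keeping the intermediate outputs is what makes it possible to see which stage introduced a quality problem.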

Each Agent's Role

1. Research Planner

Input: Research topic + previous research context (if iterating)

What it does:

Analyzes what has already been researched (by reading prior outputs)
Identifies gaps and new angles to explore
Creates 2-3 concrete, specific search queries
Considers user feedback if this is an iteration

Output: Structured research plan with search queries

Example:

Topic: AI in Healthcare
Prior research: Diagnostic tools, regulatory compliance
Gaps to explore: Implementation in rural hospitals, cost analysis, AI-human collaboration

Search queries:
1. "AI diagnostic tools implementation challenges rural hospitals 2024"
2. "Cost-benefit analysis AI medical imaging systems"
3. "Human-AI collaboration in clinical workflows"
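
One way to represent such a plan in code is a small data class. The sketch below is an assumption about shape only: the class name `ResearchPlan` and its field names are hypothetical, and the 2-3 query limit is taken from the description above.

```python
from dataclasses import dataclass, field

@dataclass
class ResearchPlan:
    """Hypothetical container for the planner's output."""
    topic: str
    search_queries: list[str]
    prior_research: list[str] = field(default_factory=list)
    gaps: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The planner is described as producing 2-3 queries per run.
        if not 2 <= len(self.search_queries) <= 3:
            raise ValueError("expected 2-3 search queries")

plan = ResearchPlan(
    topic="AI in Healthcare",
    prior_research=["Diagnostic tools", "Regulatory compliance"],
    gaps=["Implementation in rural hospitals", "Cost analysis", "AI-human collaboration"],
    search_queries=[
        "AI diagnostic tools implementation challenges rural hospitals 2024",
        "Cost-benefit analysis AI medical imaging systems",
        "Human-AI collaboration in clinical workflows",
    ],
)
```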

2. Web Crawler

Input: Search queries from Research Planner

What it does:

Uses EXASearch to semantically search for authoritative URLs
Applies quality filters (excludes Medium, Reddit, low-signal domains)
Deduplicates against previously crawled URLs (via memory)
Ranks results by relevance and authority
Returns 3-5 URLs per topic

Output: List of high-confidence URLs with relevance scores

Reliability Features:

Source filtering (skips unreliable domains)
Deduplication (avoids redundant crawling)
Authority ranking
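
The filtering, deduplication, and ranking steps could look roughly like the sketch below. It does not call EXASearch. It assumes the search results are already available as (URL, relevance score) pairs, and the blocklist contents, function names, and `limit` default are illustrative assumptions.

```python
from urllib.parse import urlparse

# Assumed blocklist; the text names Medium and Reddit as examples.
BLOCKED_DOMAINS = {"medium.com", "reddit.com"}

def is_blocked(url: str) -> bool:
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)

def select_urls(
    results: list[tuple[str, float]],
    seen: set[str],
    limit: int = 5,
) -> list[tuple[str, float]]:
    """Keep the highest-scoring results that are not blocked and not already crawled."""
    kept: list[tuple[str, float]] = []
    picked: set[str] = set()
    for url, score in sorted(results, key=lambda r: r[1], reverse=True):
        if is_blocked(url) or url in seen or url in picked:
            continue
        picked.add(url)
        kept.append((url, score))
        if len(kept) == limit:
            break
    return kept
```

Here `seen` stands in for the crawl history that the real system reads from memory.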

3. Content Extractor

Input: URLs from Web Crawler

What it does:

Fetches the full page content
Extracts specific claims and supporting quotes
Records confidence levels: HIGH / MEDIUM / LOW
Structures output as JSON with metadata

Output: JSON file with extracted claims

Example:

{
  "url": "https://example.com/ai-healthcare",
  "claims": [
    {
      "claim": "AI diagnostic tools can achieve 95% accuracy in early cancer detection",
      "quote": "Our AI system achieved 95% accuracy...",
      "confidence": "HIGH",
      "source_context": "Published medical journal study"
    }
  ]
}

Reliability Features:

Verbatim quotes (not paraphrases)
Confidence scoring
Source context preservation
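
A check along the following lines could enforce the verbatim-quote and confidence rules on an extraction record. It is a sketch under assumptions: the function name is hypothetical, the record is assumed to follow the JSON shape shown above, and quotes are compared after collapsing whitespace.

```python
import json

ALLOWED_CONFIDENCE = {"HIGH", "MEDIUM", "LOW"}

def _normalize(text: str) -> str:
    """Collapse runs of whitespace so line breaks in the page do not break matching."""
    return " ".join(text.split())

def validate_extraction(raw_json: str, page_text: str) -> list[str]:
    """Return a list of problems; an empty list means the record passed."""
    record = json.loads(raw_json)
    page = _normalize(page_text)
    problems = []
    for i, claim in enumerate(record.get("claims", [])):
        if claim.get("confidence") not in ALLOWED_CONFIDENCE:
            problems.append(f"claim {i}: invalid confidence")
        quote = _normalize(claim.get("quote", ""))
        # A verbatim quote must appear in the fetched page text.
        if not quote or quote not in page:
            problems.append(f"claim {i}: quote not found verbatim in page")
    return problems
```

A quote abbreviated with a trailing ellipsis would fail this check, so the sketch assumes quotes are stored in full.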

4. Synthesis Researcher

Input: Extracted claims from Content Extractor

What it does:

Groups claims by theme (e.g., diagnostic tools, implementation barriers, costs)
Identifies patterns and trends across sources
Flags contradictions between sources
Notes areas where data is insufficient
Preserves URL attribution for traceability

Output: Markdown document with synthesized findings

Example:

## Diagnostic Accuracy

Multiple sources report 88-95% accuracy for AI tools in early detection:
- Study A: 95% accuracy in cancer detection (https://...)
- Study B: 92% accuracy in radiology (https://...)
- Study C: 88% accuracy in pathology (https://...)

Synthesis: Accuracy varies by domain and dataset, but the cited studies fall between 88% and 95%.

Reliability Features:

Cross-source pattern identification
Contradiction flagging
Gap identification
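
In ResearchCrew this step is performed by an LLM agent. Purely to illustrate the grouping and flagging logic, the deterministic sketch below handles claims that carry a numeric value. The `theme` and `value` fields, the function names, and the `tolerance` threshold are all assumptions, not part of the documented claim format.

```python
from collections import defaultdict

def group_by_theme(claims: list[dict]) -> dict[str, list[dict]]:
    """Bucket claims by their (assumed) 'theme' field."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for claim in claims:
        groups[claim["theme"]].append(claim)
    return dict(groups)

def summarize(theme_claims: list[dict], tolerance: float = 10.0) -> dict:
    """Report the observed range and flag themes whose sources disagree widely."""
    values = [c["value"] for c in theme_claims]
    low, high = min(values), max(values)
    return {
        "range": (low, high),
        # URLs are carried through so every figure stays traceable.
        "sources": [c["url"] for c in theme_claims],
        "needs_review": high - low > tolerance,
    }
```

Deriving the stated range from the claims themselves keeps the synthesis sentence consistent with the figures listed beneath it.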

5. Reporting Analyst

Input: Synthesized findings from Synthesis Researcher

What it does:

Writes narrative prose integrating findings
Creates direct citations using the [claim](URL) format
Preserves source context from synthesis
Notes "Insufficient data" where gaps exist
Ensures publication-ready markdown

Output: Final research report (report.md)

Example:

## Current State of AI in Diagnostics

AI diagnostic tools have achieved significant accuracy in recent years. 
[Multiple studies report 90-95% accuracy in early cancer detection](https://...), 
with [pathology being slightly less accurate at 88%](https://...).

However, [implementation challenges remain in rural hospitals](https://...),
particularly around infrastructure and training requirements.

Reliability Features:

Direct URL citations (every claim is traceable)
No invented quotes or sources
Explicit gap flagging
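
The citation format and the "no invented sources" rule can be checked mechanically. The sketch below formats a citation and lists any cited URL that is not among the URLs collected earlier in the pipeline. The helper names are hypothetical, and the regular expression covers only simple inline markdown links.

```python
import re

# Matches simple inline links of the form [text](url).
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")

def cite(claim: str, url: str) -> str:
    """Format a claim as a [claim](URL) citation."""
    return f"[{claim}]({url})"

def unknown_citations(report_md: str, known_urls: set[str]) -> list[str]:
    """Return cited URLs that were never produced by the crawler or extractor."""
    return [url for _text, url in LINK_RE.findall(report_md) if url not in known_urls]
```

An empty result from `unknown_citations` means every link in the report traces back to a source the pipeline actually fetched.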

Key Design Decisions

Why Sequential Pipeline?

Each agent builds on the output of the previous agent. This allows for:

Specialization: Each agent focuses on one aspect
Quality gates: Poor output from one stage visibly degrades the next, so problems surface early
Debugging: Easy to isolate where quality issues arise
Transparency: You can inspect intermediate outputs

Why Human-in-the-Loop?

Research is iterative and guided. After each run:

Human reviews findings for accuracy and completeness
Human identifies gaps or misconceptions
Human provides guidance for next research round
System incorporates the feedback and refines the next research round

This prevents:

Narrow or biased research (humans catch tunnel vision)
Hallucinations (humans validate against domain knowledge)
Wasted research (humans redirect effort)

Why Citations First?

Every claim must be traceable to a source:

Anti-hallucination: Forces facts to come from real sources
Credibility: Readers can verify claims themselves
Accountability: Clear lineage from source to report

This means:

No invented quotes
No made-up statistics
No synthesis without source attribution

Why Multi-Day Memory?

Research questions are complex and often require multiple rounds:

Context preservation: Remember what was already researched
Deduplication: Avoid re-crawling the same sources
Iteration: Build on prior findings, not restart each time

The system uses LanceDB to store:

Prior research outputs
Extracted claims and sources
Crawler history (to avoid redundant searches)
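
To make the crawler-history idea concrete, the sketch below persists crawled URLs across runs. It deliberately uses Python's built-in `sqlite3` as a stand-in for LanceDB so that it runs without extra dependencies. The class name, table name, and columns are hypothetical and say nothing about how ResearchCrew lays out its LanceDB tables.

```python
import sqlite3

class CrawlHistory:
    """Stand-in store for crawled URLs (sqlite3 here, LanceDB in the real system)."""

    def __init__(self, path: str = ":memory:") -> None:
        # Pass a file path instead of ":memory:" to keep history between runs.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS crawled (url TEXT PRIMARY KEY, topic TEXT)"
        )

    def record(self, url: str, topic: str) -> None:
        """Remember a crawled URL; recording the same URL twice is a no-op."""
        with self.conn:  # commits on success
            self.conn.execute(
                "INSERT OR IGNORE INTO crawled (url, topic) VALUES (?, ?)",
                (url, topic),
            )

    def seen(self, url: str) -> bool:
        row = self.conn.execute(
            "SELECT 1 FROM crawled WHERE url = ?", (url,)
        ).fetchone()
        return row is not None
```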

LLM Configuration

All agents use a single LLM (configurable via .env):

Model: Set OPENROUTER_MODEL_NAME
Provider: OpenRouter supports most major LLM providers
Trade-off: Smaller, faster models (e.g., Claude Haiku) favor speed; larger models (e.g., GPT-4 or a larger Claude model) favor accuracy

Future versions may use different LLMs for different agents based on complexity.
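
Reading the model setting might look like the sketch below. It assumes the values from .env have already been loaded into the process environment (for example by the shell or a dotenv loader). The function name and the returned dictionary shape are hypothetical; only OPENROUTER_MODEL_NAME comes from this document.

```python
import os
from typing import Mapping, Optional

def load_llm_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build a minimal LLM config from environment variables."""
    env = os.environ if env is None else env
    model = env.get("OPENROUTER_MODEL_NAME")
    if not model:
        raise RuntimeError("OPENROUTER_MODEL_NAME is not set")
    return {
        "model": model,
        # OpenRouter's OpenAI-compatible API endpoint.
        "base_url": "https://openrouter.ai/api/v1",
    }
```

Failing fast on a missing variable avoids a run that silently falls back to an unintended model.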

Reliability Pipeline Summary

Research Topic
    ↓
[Planning] Identify gaps and search strategy
    ↓
[Crawling] Quality-filtered source discovery
    ↓
[Extraction] Verbatim claim extraction with confidence
    ↓
[Synthesis] Cross-source pattern identification + contradiction flagging
    ↓
[Reporting] Direct URL citations, gap flagging
    ↓
Publication-Ready, Verified, Cited Report

Four of the five stages add an explicit reliability check:

Source filtering (web crawler)
Confidence scoring (extractor)
Pattern & contradiction detection (synthesis)
Explicit gap identification (reporter)

The result is a report in which every claim can be traced back to its source.