Configuration
Customize LLMs, tools, memory, and output behavior.
Environment Variables
Environment variables are managed in a .env file (copy it from .env.example).
Core Configuration
# LLM Selection (required)
OPENROUTER_API_KEY=sk-your-api-key
OPENROUTER_MODEL_NAME=openrouter/openai/gpt-4-turbo
# Web Search (required)
EXA_API_KEY=your-exa-api-key
# Optional: Embeddings for memory
GOOGLE_API_KEY=your-google-api-key # For Google Gemini embeddings
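A quick pre-flight check can catch a missing key before a run starts. The sketch below is only an illustration: `missing_required` is a hypothetical helper, not part of ResearchCrew, and it assumes the variables from .env have already been loaded into the process environment.

```python
import os
from typing import Mapping

# Variables marked "required" in the Core Configuration block above.
REQUIRED_VARS = ("OPENROUTER_API_KEY", "OPENROUTER_MODEL_NAME", "EXA_API_KEY")


def missing_required(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# Example with an explicit mapping instead of the real environment:
example_env = {"OPENROUTER_API_KEY": "sk-your-api-key", "EXA_API_KEY": " "}
print(missing_required(example_env))  # ['OPENROUTER_MODEL_NAME', 'EXA_API_KEY']
```

Calling `missing_required()` with no argument checks the real environment. An empty list means all three required settings are present.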
LLM Selection
Choose your LLM based on the trade-off between quality and cost.
Fast Research (Cost-Optimized)
- Speed: Fast pipeline execution
- Cost: Lowest
- Quality: Good for initial exploration, may have more hallucinations
- Best for: Quick research, exploratory passes
Balanced Research
OPENROUTER_MODEL_NAME=openrouter/openai/gpt-4o
- Speed: Moderate
- Cost: Medium
- Quality: Reliable, good citation accuracy
- Best for: Most production research
High-Accuracy Research
- Speed: Slower
- Cost: Higher
- Quality: Excellent, minimal hallucinations
- Best for: Critical research, high-stakes decisions
Cutting-Edge Models
OPENROUTER_MODEL_NAME=openrouter/openai/gpt-4o # Latest GPT-4 Omni
OPENROUTER_MODEL_NAME=openrouter/anthropic/claude-3.5-sonnet # Latest Claude
- Speed: Varies
- Cost: Varies
- Quality: State-of-the-art
- Best for: Maximum quality requirements
Find more models: OpenRouter models list
Free models: OpenRouter free models
Memory Configuration
Memory is handled automatically, but you can customize:
Enable/Disable Memory
In crew.py, modify the Crew initialization:
@crew
def crew(self) -> Crew:
    return Crew(
        agents=self.agents,
        tasks=self.tasks,
        memory=True,  # Enable long-term memory; set to False to disable
        verbose=True,
        embedder={  # Embeddings for memory
            "provider": "google",
            "config": {"model": "models/embedding-001"},
        },
    )
Memory stores:
- Prior research outputs
- Extracted claims and URLs
- Deduplication history
Memory benefits:
- Avoids re-crawling same sources
- Provides context for iterative research
- Improves subsequent run quality
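If you switch memory on and off often, one option is to build the memory-related arguments in one place instead of editing the Crew call each time. This is a minimal sketch, not ResearchCrew's own code: `crew_options` and `GOOGLE_EMBEDDER` are hypothetical names, and the embedder values are copied from the crew.py example above.

```python
from typing import Any

# Embedder settings copied from the crew.py example above.
GOOGLE_EMBEDDER = {
    "provider": "google",
    "config": {"model": "models/embedding-001"},
}


def crew_options(memory_enabled: bool, verbose: bool = True) -> dict[str, Any]:
    """Build the memory-related keyword arguments for Crew(...)."""
    options: dict[str, Any] = {"memory": memory_enabled, "verbose": verbose}
    if memory_enabled:
        # This sketch passes the embedder only when memory is switched on.
        options["embedder"] = GOOGLE_EMBEDDER
    return options
```

Inside crew.py the result could then be unpacked into the call, for example `Crew(agents=self.agents, tasks=self.tasks, **crew_options(memory_enabled=False))`.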
Web Search Configuration
ResearchCrew uses EXASearch for semantic web search.
Enable Quality Filters
In tools/ai_tools.py, quality filters are already enabled:
# Excluded domains (unreliable sources)
EXCLUDED_DOMAINS = [
    "medium.com",
    "reddit.com",
    "stackoverflow.com",  # For research, not coding Q&A
    "twitter.com",
    "linkedin.com",
    "youtube.com",
]
Modify this list to add/remove excluded domains.
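To see what a change to the list would affect, you can test URLs against it locally. The sketch below is an assumption about how such a filter could work, not a copy of tools/ai_tools.py, which may instead pass the list straight to the search tool. `is_excluded` and `filter_urls` are hypothetical helpers.

```python
from urllib.parse import urlparse

# Same entries as the EXCLUDED_DOMAINS list above.
EXCLUDED_DOMAINS = [
    "medium.com", "reddit.com", "stackoverflow.com",
    "twitter.com", "linkedin.com", "youtube.com",
]


def is_excluded(url: str, excluded=EXCLUDED_DOMAINS) -> bool:
    """True if the URL's host is an excluded domain or one of its subdomains.

    URLs must include a scheme (https://...); otherwise no host is parsed
    and the URL is treated as not excluded.
    """
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in excluded)


def filter_urls(urls, excluded=EXCLUDED_DOMAINS):
    """Keep only the URLs whose host is not excluded."""
    return [u for u in urls if not is_excluded(u, excluded)]
```

Matching on the parsed host rather than on a substring avoids false positives such as notreddit.com, while still catching subdomains like www.reddit.com.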
Search Results
By default, the web crawler returns 3-5 URLs per search query. To change this, edit the web crawler task description in config/tasks.yaml:
web_crawler_task:
  description: |
    Find 3-5 high-quality URLs using semantic search.
    (Change this number as needed)
  # ...
Output Configuration
Report Location
By default, reports are saved to:
researchcrew/
├── outputs/
│   ├── 20250516.md  # initial report
│   ├── 20250517.md  # subsequent report (if running multiple rounds)
│   └── ...
To change output directory, modify in .env:
Report Format
Reports are generated in Markdown and saved to outputs/[yyyymmdd].md. The default format is meant to be publication-ready; to change it, edit the task descriptions in config/tasks.yaml.
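The naming scheme above can be reproduced in a few lines, which is handy when a script needs to locate the report for a given day. This is a sketch under stated assumptions: `report_path` is a hypothetical helper, and how ResearchCrew names a second report written on the same day is not covered here.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def report_path(run_date: Optional[date] = None, output_dir: str = "outputs") -> Path:
    """Return <output_dir>/yyyymmdd.md for the given date (today if omitted)."""
    run_date = run_date or date.today()
    return Path(output_dir) / f"{run_date:%Y%m%d}.md"


print(report_path(date(2025, 5, 16)).as_posix())  # outputs/20250516.md
```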
Agent Customization
All agent behavior is defined in config/agents.yaml and config/tasks.yaml.
Modify Agent Instructions
Edit config/agents.yaml:
research_planner:
  role: Research Planner
  goal: Create comprehensive research plans  # Change goal
  backstory: You are an expert research strategist...  # Change backstory
Modify Task Instructions
Edit config/tasks.yaml:
research_planner_task:
  description: |
    Create a detailed research plan for: {{ topic }}
    Consider: (add custom considerations here)
    - Existing research
    - Knowledge gaps
    - Search strategy
  expected_output: |
    Structured research plan with:
    - Summary of prior research
    - Identified gaps
    - 2-3 search queries (customize this)
  agent: research_planner
Performance Tuning
Faster Runs
- Use faster LLM:
- Reduce search scope (in tasks.yaml):
Better Quality
- Use better LLM:
- Increase search depth: find more URLs per query
- Use iterative refinement: run multiple rounds with feedback
Debugging Configuration
Verbose Output
Enable detailed logging in crew.py:
@crew
def crew(self) -> Crew:
    return Crew(
        # ...
        verbose=True,  # Shows agent thinking and reasoning
    )
Common Configuration Issues
- "API key is invalid"
  - Verify that the .env file exists and is readable
  - Check the API key format (OpenRouter keys start with sk-)
  - Verify that the key is for the correct provider
- "Import error: No module named 'crewai'"
  - Reinstall: crewai install
  - Or: pip install crewai
- "Rate limited by API"
  - Too many requests were sent too quickly
  - Add delays between runs
  - Check your API quota/credits
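One way to add delays between runs is to retry with exponential backoff. The sketch below is a generic illustration, not part of ResearchCrew: `run_with_backoff` is a hypothetical helper, and `run` stands for whatever callable launches a research run in your setup.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_backoff(
    run: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call run(); after a failure wait base_delay * 2**attempt seconds, then retry."""
    for attempt in range(max_attempts):
        try:
            return run()
        except Exception:
            # In practice, catch only the rate-limit error your client raises.
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

With the defaults, the waits are 2, 4, and 8 seconds before the fourth and final attempt. The `sleep` parameter exists so the delay can be replaced in tests.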
Best Practices
- Do:
  - Start with GPT-4o-mini (cheaper initial testing)
  - Switch to Claude 3 Opus for production research
  - Use iterative feedback for complex topics
  - Add .env to .gitignore (don't commit API keys!)
- Don't:
  - Use free/unknown LLM APIs (quality/reliability unknown)
  - Keep API keys in code or version control
  - Run unlimited research runs without monitoring costs
  - Disable memory for multi-round research
Next Steps
- Usage Guide — Single and multi-round workflows
- Examples — See real research examples