🔧 Having trouble? Don’t worry! This comprehensive guide will help you diagnose and fix common issues quickly. We’ve got your back!
Quick diagnostics
Before diving into specific issues, let’s run a quick health check:
- System Check
- Validate Config
- Test Connection
Common error messages and solutions
🔑 Authentication errors
Error: Invalid API key or authentication failed
Solutions:
- Check environment variables
- Use secrets file
- Validate configuration
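The environment-variable check can be done from Python without ever printing the key itself. This is only a sketch: the variable names `ANTHROPIC_API_KEY` and `OPENAI_API_KEY` are assumptions about which providers you use, and `check_api_keys` is a hypothetical helper, not part of mcp-eval.

```python
import os

# Assumed variable names; replace them with the ones your provider expects.
EXPECTED_KEYS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")


def check_api_keys(names=EXPECTED_KEYS, env=None):
    """Return a dict mapping each variable name to 'missing', 'empty', or 'set'."""
    env = os.environ if env is None else env
    report = {}
    for name in names:
        value = env.get(name)
        if value is None:
            report[name] = "missing"
        elif not value.strip():
            report[name] = "empty"
        else:
            report[name] = "set"
    return report


if __name__ == "__main__":
    for name, status in check_api_keys().items():
        # Report only the status so the secret never reaches logs.
        print(f"{name}: {status}")
```

An "empty" result usually means the variable was exported with no value, which tends to produce the same authentication error as a missing key.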
Use .gitignore for secrets files.
Error: Rate limit exceeded
Solutions:
- Reduce concurrency
- Add retry logic
- Use different models for testing vs judging
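Retry logic for rate limits is commonly written as exponential backoff with jitter. The sketch below shows that pattern in plain Python. `RateLimitError` is a placeholder for whatever exception your client library raises, and `call_with_backoff` is a hypothetical helper, not an mcp-eval API.

```python
import random
import time


class RateLimitError(Exception):
    """Placeholder for the exception your client raises when rate limited."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call func(), retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the original error.
            # The delay doubles on each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Up to 10% jitter keeps parallel workers from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter exists only so the delay can be replaced in tests. In normal use, leave it at the default.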
🔌 Server connection issues
Error: MCP server not found or failed to start
Solutions:
- Verify server configuration
- Test server manually
- Debug with verbose output
- Common fixes:
  - Install server dependencies: pip install -r requirements.txt
  - Use absolute paths: /full/path/to/server.py
  - Check file permissions: chmod +x server.py
  - Verify Python version compatibility
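Several of the common fixes above can be checked in one pass before launching the server. This is a minimal sketch: `check_server_script` is a hypothetical helper, and the default minimum Python version of 3.10 is an assumption, so set it to whatever your server actually requires.

```python
import os
import sys
from pathlib import Path


def check_server_script(path, min_python=(3, 10), require_executable=False):
    """Return a list of human-readable problems found with a server script."""
    problems = []
    script = Path(path)
    if not script.is_absolute():
        problems.append(f"{path} is a relative path; prefer an absolute path")
    if not script.is_file():
        problems.append(f"{script} does not exist or is not a file")
    elif not os.access(script, os.R_OK):
        problems.append(f"{script} is not readable by the current user")
    elif require_executable and not os.access(script, os.X_OK):
        # Only matters when the script is launched directly, not via `python`.
        problems.append(f"{script} is not executable; try chmod +x")
    if sys.version_info < min_python:
        current = f"{sys.version_info.major}.{sys.version_info.minor}"
        wanted = f"{min_python[0]}.{min_python[1]}"
        problems.append(f"Python {current} is older than the assumed minimum {wanted}")
    return problems


if __name__ == "__main__":
    for problem in check_server_script("/full/path/to/server.py"):
        print("-", problem)
```

An empty list means none of these basic checks failed. It does not prove the server starts, so testing it manually is still worthwhile.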
Error: No tools detected from server
Solutions:
- Check server is listed in agent
- Verify tool discovery
- Check MCP protocol implementation:
  - Server must implement tools/list method
  - Tools must have proper schemas
  - Server must be running when agent connects
- Enable debug logging
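If you can capture the server's `tools/list` result as JSON, a small validator helps spot schema problems. The sketch below assumes the payload has already been parsed into a dict shaped like the MCP result (`tools`, each with `name` and `inputSchema`). `validate_tools_result` is a hypothetical helper, and it checks only the basics, not the full specification.

```python
def validate_tools_result(result):
    """Return a list of problems found in a parsed tools/list result dict."""
    tools = result.get("tools")
    if not isinstance(tools, list) or not tools:
        return ["result has no non-empty 'tools' list"]

    problems = []
    for index, tool in enumerate(tools):
        if not isinstance(tool, dict):
            problems.append(f"tool #{index}: entry is not an object")
            continue
        label = tool.get("name") or f"tool #{index}"
        if not tool.get("name"):
            problems.append(f"{label}: missing 'name'")
        schema = tool.get("inputSchema")
        if not isinstance(schema, dict):
            problems.append(f"{label}: missing 'inputSchema' object")
        elif schema.get("type") != "object":
            problems.append(f"{label}: inputSchema 'type' should be 'object'")
    return problems


if __name__ == "__main__":
    sample = {"tools": [{"name": "fetch", "inputSchema": {"type": "object"}}]}
    print(validate_tools_result(sample) or "no problems found")
```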
⏱️ Timeout and performance issues
Error: Test execution timed out
Solutions:
- Increase timeout globally
- Set per-test timeout
- Optimize test prompts
- Add performance assertions
- Profile slow tests
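To see how per-test timeouts and profiling fit together, here is a generic asyncio sketch. It is not how mcp-eval implements timeouts. `run_with_timeout` is a hypothetical helper that enforces a limit and records the elapsed time, which you could then assert against.

```python
import asyncio
import time


async def run_with_timeout(coro_factory, timeout_seconds):
    """Await coro_factory() under a time limit.

    Returns a (status, result, elapsed_seconds) tuple, where status is
    "passed" or "timed_out".
    """
    start = time.perf_counter()
    try:
        result = await asyncio.wait_for(coro_factory(), timeout=timeout_seconds)
        status = "passed"
    except asyncio.TimeoutError:
        # wait_for cancels the underlying task before raising.
        result, status = None, "timed_out"
    return status, result, time.perf_counter() - start


async def _demo():
    async def slow_step():
        await asyncio.sleep(0.2)
        return "done"

    print(await run_with_timeout(slow_step, timeout_seconds=1.0))
    print(await run_with_timeout(slow_step, timeout_seconds=0.05))


if __name__ == "__main__":
    asyncio.run(_demo())
```

Recording the elapsed time even for passing runs makes it easier to notice tests that are drifting toward the limit.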
Error: High token usage or costs
Solutions:
- Use cheaper models for testing
- Limit response length
- Cache responses during development
- Monitor token usage
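One simple way to cache responses during development is a small on-disk store keyed by model and prompt. The sketch below is an illustration under assumptions: `ResponseCache` is a hypothetical class, the `.llm_cache` directory name is arbitrary, and responses must be JSON-serializable.

```python
import hashlib
import json
from pathlib import Path


class ResponseCache:
    """Tiny on-disk cache keyed by a hash of (model, prompt)."""

    def __init__(self, directory=".llm_cache"):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, model, prompt):
        raw = json.dumps([model, prompt]).encode("utf-8")
        return self.directory / f"{hashlib.sha256(raw).hexdigest()}.json"

    def get_or_call(self, model, prompt, call):
        """Return the cached response, or run call(model, prompt) and store it."""
        path = self._path(model, prompt)
        if path.exists():
            return json.loads(path.read_text(encoding="utf-8"))
        response = call(model, prompt)
        path.write_text(json.dumps(response), encoding="utf-8")
        return response
```

Delete the cache directory whenever prompts or models change in ways the key does not capture, and avoid using a cache like this for final evaluation runs.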
🧪 Test execution problems
Error: Assertion failed but seems correct
Solutions:
- Check case sensitivity
- Use regex for flexible matching
- Debug actual output
- Use partial matching for tools
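The matching ideas above can be tried on a captured output string before changing any test. These helpers are hypothetical, written with the standard library only, and do not reflect mcp-eval's assertion API.

```python
import re


def contains_ignore_case(actual, expected):
    """Substring check that ignores case."""
    return expected.casefold() in actual.casefold()


def matches_pattern(actual, pattern):
    """Regex search anywhere in the text, ignoring case."""
    return re.search(pattern, actual, flags=re.IGNORECASE) is not None


def tool_was_called(called_tools, name_fragment):
    """True if any called tool name contains name_fragment (partial match)."""
    return any(name_fragment in name for name in called_tools)


if __name__ == "__main__":
    output = "The Temperature in Paris is 21°C."
    # repr() exposes stray whitespace or invisible characters in the output.
    print(repr(output))
    print(contains_ignore_case(output, "temperature"))
    print(matches_pattern(output, r"\d+\s*°C"))
```

Printing the `repr()` of the actual output is often the fastest way to find the trailing newline or different capitalization behind a failure that "seems correct".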
Error: Flaky or inconsistent test results
Solutions:
- Set deterministic model parameters
- Use objective assertions
- Add retry logic for network calls
- Isolate test environment
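Before and after applying these fixes, it helps to measure how flaky a test really is. A rough sketch, assuming your test can be wrapped in a plain function that raises `AssertionError` on failure (`pass_rate` is a hypothetical helper):

```python
def pass_rate(test_fn, runs=10):
    """Run test_fn repeatedly and return the fraction of runs that passed.

    A run counts as failed only if it raises AssertionError; any other
    exception propagates, since it probably signals a setup problem.
    """
    passed = 0
    for _ in range(runs):
        try:
            test_fn()
            passed += 1
        except AssertionError:
            pass
    return passed / runs
```

A rate strictly between 0 and 1 points to nondeterminism rather than a plain bug. Repeated runs against a real model cost tokens, so keep `runs` small.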
Debug mode walkthrough
When tests fail mysteriously, enable debug mode for detailed insights:
Step 1: Enable debug output
Step 2: Examine the debug output
Look for these key sections:
Step 3: Inspect OTEL traces
- Tool call sequences
- Error spans
- Timing information
- Token usage per call
Network and connectivity debugging
Testing behind a proxy
Debugging SSL/TLS issues
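For both proxy and TLS problems, a useful first step is to see what Python's standard library detects on your machine. This sketch uses only `urllib.request` and `ssl`. Keep in mind that third-party HTTP clients may read proxy variables differently or ship their own CA bundle, so treat the output as a starting point.

```python
import ssl
import urllib.request


def network_settings():
    """Collect proxy and CA-bundle settings as seen by the standard library."""
    paths = ssl.get_default_verify_paths()
    return {
        # Proxies picked up from the environment (and OS settings on some platforms).
        "proxies": urllib.request.getproxies(),
        # Default locations OpenSSL searches for trusted certificates.
        "ca_file": paths.cafile,
        "ca_path": paths.capath,
        "openssl_version": ssl.OPENSSL_VERSION,
    }


if __name__ == "__main__":
    for key, value in network_settings().items():
        print(f"{key}: {value}")
```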
Testing with local servers
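When a local server appears unreachable, first confirm that something is listening on the expected port. A minimal sketch, with a hypothetical helper and an example port number that you should replace with your own:

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Port 8000 is only an example value.
    print(port_is_open("127.0.0.1", 8000))
```

This applies only to servers reached over a network transport. A server launched over stdio has no port to probe.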
Performance troubleshooting
Identifying bottlenecks
- llm_time_ms: Time spent in LLM calls
- tool_time_ms: Time in tool execution
- idle_time_ms: Wasted time between operations
- max_concurrent_operations: Parallelism level
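To turn these metrics into a quick read on where time goes, you can compute each bucket's share of the total. The sketch assumes the metrics are available as a plain dict keyed by the names above, and that the three timing buckets roughly add up to the run time, which may not hold when operations overlap. `time_breakdown` is a hypothetical helper.

```python
def time_breakdown(metrics):
    """Return each timing bucket as a percentage of their combined total."""
    buckets = ("llm_time_ms", "tool_time_ms", "idle_time_ms")
    total = sum(metrics.get(name, 0) for name in buckets)
    if total == 0:
        return {name: 0.0 for name in buckets}
    return {name: round(100 * metrics.get(name, 0) / total, 1) for name in buckets}


if __name__ == "__main__":
    sample = {"llm_time_ms": 6000, "tool_time_ms": 3000, "idle_time_ms": 1000}
    print(time_breakdown(sample))
    # {'llm_time_ms': 60.0, 'tool_time_ms': 30.0, 'idle_time_ms': 10.0}
```

A large idle share combined with a max_concurrent_operations of 1 suggests the run would benefit from more parallelism.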
Optimization strategies
Reduce LLM calls
Parallel execution
Cache results
Optimize prompts
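Parallel execution works best with a cap on concurrency, so that a faster test run does not simply trade timeouts for rate-limit errors. The sketch below shows the usual asyncio pattern with a semaphore. `run_limited` is a hypothetical helper, not an mcp-eval function.

```python
import asyncio


async def run_limited(tasks, max_concurrency=4):
    """Run zero-argument coroutine functions in parallel.

    At most max_concurrency of them are in flight at once. Results are
    returned in the same order as tasks.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(task):
        async with semaphore:
            return await task()

    return await asyncio.gather(*(guarded(task) for task in tasks))


async def _demo():
    async def job():
        await asyncio.sleep(0.1)
        return "ok"

    print(await run_limited([job] * 8, max_concurrency=2))


if __name__ == "__main__":
    asyncio.run(_demo())
```

Lowering `max_concurrency` is also the simplest lever to pull when you start seeing rate-limit errors.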
Platform-specific issues
macOS
Command not found errors
Windows
Path and encoding issues
Linux/Docker
Permission and container issues
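On any platform, a short report of PATH lookups and encoding settings narrows down "command not found" and encoding problems quickly. This is a sketch with assumptions: the default command names are only examples of launchers an MCP server might need, and `platform_report` is a hypothetical helper.

```python
import locale
import platform
import shutil
import sys


def platform_report(commands=("python", "node", "npx", "uvx")):
    """Summarize PATH lookups and encoding settings for the current platform."""
    return {
        "system": platform.system(),
        "python": sys.version.split()[0],
        "stdout_encoding": sys.stdout.encoding,
        "preferred_encoding": locale.getpreferredencoding(False),
        # shutil.which returns None when a command is not on PATH.
        "commands": {name: shutil.which(name) for name in commands},
    }


if __name__ == "__main__":
    for key, value in platform_report().items():
        print(f"{key}: {value}")
```

Run it in the same shell, container, or service account that runs your tests, since PATH and locale often differ between them.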
Getting help
Self-service debugging
- Run diagnostics
- Check logs
- Validate everything
Prepare an issue report
If you’re still stuck, let’s gather information for a bug report:
Community support
- 💬 Discord: Join our community
- 🐛 GitHub Issues: Report bugs
- 💡 Discussions: Ask questions
- 📚 FAQ: Check our frequently asked questions
Quick reference: Error codes
| Code | Meaning | Quick Fix |
|---|---|---|
| AUTH001 | Invalid API key | Check environment variables |
| SRV001 | Server not found | Verify server name in config |
| SRV002 | Server failed to start | Check command and dependencies |
| TOOL001 | Tool not found | Verify server implements tool |
| TIMEOUT001 | Test timeout | Increase timeout_seconds |
| ASSERT001 | Assertion failed | Check expected vs actual values |
| NET001 | Network error | Check connectivity and proxy |
| RATE001 | Rate limited | Reduce concurrency or add delays |
Still stuck? Don’t hesitate to reach out! We’re here to help you succeed with mcp-eval. Remember, every great developer has faced these issues - you’re in good company! 🚀