Core Debugging Knowledge
Understanding Test Results
- OTEL traces are the single source of truth
- Metrics derived from span attributes
- Test reports in JSON, HTML, and Markdown formats
- Assertion results with detailed failure reasons
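As a rough illustration of how metrics can be derived from span attributes, the sketch below aggregates tool-call counts and token totals from a list of span dictionaries. The span shape and the attribute names (`tool.name`, `llm.input_tokens`, `llm.output_tokens`) are assumptions made for this example, not the schema mcp-eval actually emits.

```python
from collections import Counter


def summarize_spans(spans):
    """Aggregate simple metrics from a list of span dicts.

    Each span is assumed to carry an "attributes" mapping; the attribute
    names used below are hypothetical and should be adapted to the real
    trace schema.
    """
    tool_calls = Counter()
    total_tokens = 0
    for span in spans:
        attrs = span.get("attributes", {})
        # Count one call per span that names a tool.
        if "tool.name" in attrs:
            tool_calls[attrs["tool.name"]] += 1
        # Sum input and output tokens where the span reports them.
        total_tokens += attrs.get("llm.input_tokens", 0)
        total_tokens += attrs.get("llm.output_tokens", 0)
    return {"tool_calls": dict(tool_calls), "total_tokens": total_tokens}
```

Because the traces are the source of truth, a summary like this can be recomputed from a saved trace file whenever a reported metric looks suspicious.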
Common Failure Patterns
1. Tool Not Found Errors
- Ensure server is configured in mcp.servers section
- Verify tool name matches exactly (case-sensitive)
- Check agent has correct server_names list
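The first and third checks above can be sketched as a small helper that compares an agent's server_names against the configured servers. This assumes the configuration has already been loaded into a dictionary and that mcp.servers is a mapping keyed by server name; the helper name `find_unconfigured_servers` is hypothetical.

```python
def find_unconfigured_servers(config, agent_server_names):
    """Return the agent's server names that are absent from config["mcp"]["servers"].

    Assumes `config` is the parsed configuration as a dict and that
    mcp.servers is a mapping keyed by server name.
    """
    configured = set(config.get("mcp", {}).get("servers", {}))
    # Plain string comparison, so a name differing only in case counts as missing.
    return [name for name in agent_server_names if name not in configured]


# Illustrative config; the "fetch" server entry is made up for this example.
config = {"mcp": {"servers": {"fetch": {"command": "uvx"}}}}
print(find_unconfigured_servers(config, ["fetch", "Fetch"]))
```

An empty result means every name the agent lists is configured; any name returned is a likely cause of a tool-not-found error.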
2. Assertion Failures
Content Assertion Failures
Tool Sequence Failures
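When a tool sequence assertion fails, it usually helps to find the first position where the observed calls diverge from the expected ones. The following is a minimal sketch of that comparison on plain lists of tool names; it is an illustration only and does not use any mcp-eval API.

```python
def first_sequence_mismatch(expected, actual):
    """Return (index, expected_tool, actual_tool) at the first difference.

    Returns None when both sequences are identical. A missing entry on
    either side is reported as None.
    """
    for i in range(max(len(expected), len(actual))):
        exp = expected[i] if i < len(expected) else None
        act = actual[i] if i < len(actual) else None
        if exp != act:
            return i, exp, act
    return None
```

A result such as `(1, "summarize", None)` would indicate that the agent stopped after the first tool call, whereas two different names at the same index point to a wrong tool choice.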
3. Performance Issues
Timeout Errors
High Token Usage
4. Configuration Issues
API Key Errors
Model Not Found
5. LLM Judge Failures
Low Judge Scores
Debugging Tools and Commands
CLI Debugging Commands
Analyzing Test Reports
JSON Report Analysis
OTEL Trace Analysis
Span Tree Analysis
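One way to inspect a span tree is to group spans by their parent and print them with indentation. The sketch below assumes each span is a dictionary with `span_id`, `parent_span_id` (None for a root span), `name`, and `start_time` keys; these key names are assumptions for illustration and may differ in the actual trace files.

```python
from collections import defaultdict


def render_span_tree(spans):
    """Return an indented text rendering of spans, children under parents."""
    children = defaultdict(list)
    for span in spans:
        children[span.get("parent_span_id")].append(span)

    lines = []

    def walk(parent_id, depth):
        # Visit siblings in start order so the output reads chronologically.
        for span in sorted(children.get(parent_id, []), key=lambda s: s["start_time"]):
            lines.append("  " * depth + span["name"])
            walk(span["span_id"], depth + 1)

    walk(None, 0)
    return "\n".join(lines)
```

Repeated sibling spans with the same name under one parent are a quick visual hint of loops or redundant tool calls.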
Debugging Patterns
Pattern 1: Binary Search for Failures
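One reading of this pattern is to bisect an ordered list of steps (prompts, assertions, or configuration changes) to find the first one that introduces a failure. The sketch below assumes that the empty prefix passes and that once a prefix fails every longer prefix also fails; `passes` is a hypothetical callback that runs the test with the given steps.

```python
def first_failing_prefix(steps, passes):
    """Return the smallest n for which passes(steps[:n]) is False.

    Returns None if the full list passes. Assumes the empty prefix passes
    and that failure is monotonic: once a prefix fails, longer ones fail too.
    """
    if passes(steps):
        return None
    lo, hi = 0, len(steps)  # invariant: steps[:lo] passes, steps[:hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(steps[:mid]):
            lo = mid
        else:
            hi = mid
    return hi
```

The culprit is then `steps[n - 1]`, found in roughly log2(len(steps)) test runs instead of one run per step.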
Pattern 2: Progressive Relaxation
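Progressive relaxation can be sketched as trying a strict check first and loosening it step by step until one passes, which shows how far the actual output is from the expectation. The three levels below (exact, case-insensitive, substring) are illustrative choices, not mcp-eval's built-in assertions.

```python
def strictest_passing_check(expected, actual):
    """Return the name of the strictest check that passes, or None."""
    checks = [
        ("exact", lambda e, a: a == e),
        ("case_insensitive", lambda e, a: a.lower() == e.lower()),
        ("contains", lambda e, a: e.lower() in a.lower()),
    ]
    # Checks are ordered from strictest to most relaxed.
    for name, check in checks:
        if check(expected, actual):
            return name
    return None
```

If only the most relaxed level passes, the assertion is probably too strict for the model's phrasing; if none passes, the problem lies in the behavior rather than the assertion.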
Pattern 3: Metric-Based Debugging
Configuration Debugging
Debug mcpeval.yaml Issues
Debug Agent Configuration
Error Recovery Strategies
Strategy 1: Retry with Backoff
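A generic version of retry with exponential backoff is sketched below. It is plain Python rather than an mcp-eval feature; the function name, the defaults, and the injectable `sleep` parameter are assumptions chosen to keep the example testable.

```python
import random
import time


def retry_with_backoff(func, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached.

    The wait doubles after each failure (base_delay, 2 * base_delay, ...)
    with up to 10% random jitter added. The last exception is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))
```

Retrying suits transient failures such as rate limits or network hiccups; a deterministic failure such as a wrong tool name will fail on every attempt, so the retry count should stay small.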
Strategy 2: Fallback Assertions
Debug Checklist
When debugging test failures:
1. Check configuration:
- API keys set correctly
- Servers configured in mcpeval.yaml
- Agent has correct server_names
2. Verify tool usage:
- Tool names match exactly
- Tools are being called
- Tool outputs are as expected
3. Review assertions:
- Assertions match actual behavior
- Case sensitivity is appropriate
- Judge rubrics are clear
4. Analyze metrics:
- Performance within limits
- Token usage reasonable
- No timeout issues
5. Check traces:
- No rephrasing loops
- Efficient tool paths
- Error recovery working
6. Environment:
- Correct Python version
- Dependencies installed
- Network connectivity
Run mcp-eval doctor --full for comprehensive diagnostics.