Step-by-step guides for typical mcp-eval tasks, from writing your first test to CI/CD integration.
Learn proven patterns for testing MCP servers and agents. Each workflow includes practical examples and tips from real-world usage.
Choose a test style
Add meaningful assertions
Run and iterate
Identify server capabilities
Create test scenarios for each tool
Test error handling
Create a comprehensive dataset
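The steps above amount to building a table of scenarios: at least one normal case per tool, plus cases that are expected to fail. The following is a minimal sketch in plain Python, independent of mcp-eval's own dataset API. The `Case` type, the helper names, and the tool names `fetch` and `search` are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """One test scenario for a single tool (hypothetical structure)."""
    name: str
    tool: str
    prompt: str
    expect_error: bool = False


def build_dataset(tools: list[str]) -> list[Case]:
    """Create one happy-path and one error-handling case per tool."""
    cases = []
    for tool in tools:
        cases.append(Case(f"{tool}_basic", tool, f"Use {tool} with valid input"))
        cases.append(
            Case(f"{tool}_error", tool, f"Use {tool} with invalid input", expect_error=True)
        )
    return cases


def uncovered_tools(tools: list[str], cases: list[Case]) -> set[str]:
    """Return the tools that have no scenario at all."""
    return set(tools) - {case.tool for case in cases}
```

Running `uncovered_tools` against the server's advertised tool list is a cheap way to notice when a newly added tool has no test coverage yet.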
Define the ideal tool sequence
Add path efficiency assertion
Debug path violations
Refine agent instructions
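To make the idea of a path check concrete, here is a rough sketch of how an observed tool-call sequence could be compared against an ideal one. It is not mcp-eval's implementation. The function name `follows_path` is hypothetical, and the `allow_extra_steps` parameter here is a local integer budget whose semantics are an assumption, named after the option the guide mentions.

```python
def follows_path(actual, ideal, allow_extra_steps=0):
    """Check that `ideal` appears in order within `actual`.

    Returns (ok, extra), where `extra` is the number of calls in
    `actual` beyond the length of the ideal sequence.
    """
    remaining = iter(actual)
    # `step in remaining` advances the iterator until it finds `step`,
    # so this tests whether `ideal` is an ordered subsequence of `actual`.
    in_order = all(step in remaining for step in ideal)
    extra = len(actual) - len(ideal)
    return in_order and extra <= allow_extra_steps, extra
```

When a check like this fails, the `extra` count helps separate two kinds of path violation: the agent took the right steps with detours (in order, but `extra` too high) versus the agent took the steps in the wrong order or skipped one.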
allow_extra_steps or checking only critical waypoints.
Start with a simple rubric
Combine with structural checks
Use multi-criteria for complex evaluation
Calibrate thresholds
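One way to calibrate a judge threshold is to score a handful of responses you have already labelled by hand, then pick the cutoff that agrees with your labels most often. The sketch below assumes judge scores in the range 0 to 1 and uses made-up sample data. `pick_threshold` is a hypothetical helper, not part of mcp-eval.

```python
def pick_threshold(scored):
    """Choose the score cutoff that best matches human pass/fail labels.

    `scored` is a list of (judge_score, human_says_pass) pairs.
    Candidate thresholds are the observed scores; a response passes
    when its score is >= the threshold. Ties keep the lowest threshold.
    """
    best_threshold, best_correct = None, -1
    for threshold in sorted({score for score, _ in scored}):
        correct = sum((score >= threshold) == label for score, label in scored)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold, best_correct / len(scored)


# Invented scores and labels, for illustration only.
samples = [(0.9, True), (0.8, True), (0.7, True), (0.6, False), (0.4, False)]
threshold, agreement = pick_threshold(samples)
```

If the best achievable agreement is low, that usually points at the rubric rather than the threshold, which is a reason to combine the judge with structural checks.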
Add GitHub Actions workflow: .github/workflows/mcp-eval.yml
Add test badges
Configure failure conditions
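A failure condition in CI ultimately comes down to a process exit code. As a sketch of the idea, the helper below turns per-test outcomes into an exit code with a configurable pass-rate floor. How the outcomes are read from an mcp-eval report is deliberately left out, since the report schema is not assumed here. The `gate` function and its `min_pass_rate` parameter are hypothetical.

```python
def gate(results, min_pass_rate=1.0):
    """Return a process exit code: 0 if enough tests passed, 1 otherwise.

    `results` maps test name -> bool (True means the test passed).
    An empty result set is treated as a failure so that a misconfigured
    run cannot silently pass.
    """
    if not results:
        return 1
    pass_rate = sum(results.values()) / len(results)
    return 0 if pass_rate >= min_pass_rate else 1
```

In a CI step this would typically be wrapped as `sys.exit(gate(results))`, so that the job fails whenever the pass rate drops below the floor.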
Use the generate command
Review and customize
Update existing tests
Enable verbose output
Examine OTEL traces: test-reports/test_name_*/trace.jsonl
Use doctor and validate commands
Add debug assertions
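Because the trace files mentioned above are JSONL (one JSON object per line), they are easy to inspect with a few lines of Python. The sketch below counts spans by name. It assumes each record carries a `"name"` key, which is an assumption about the trace format rather than something this guide confirms, and `span_name_counts` is a hypothetical helper.

```python
import json
from collections import Counter


def span_name_counts(lines):
    """Count trace records by their "name" field.

    `lines` is any iterable of JSONL text lines, such as an open file.
    Blank lines are skipped; records without a "name" key (an assumed
    field) are grouped under "<unnamed>".
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        counts[record.get("name", "<unnamed>")] += 1
    return counts
```

To use it on a real run, open one of the `trace.jsonl` files and pass the file object directly, for example `span_name_counts(open(path, encoding="utf-8"))`. An unexpectedly high count for one span name is often the quickest hint that the agent is looping on a tool.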