Core Generation Knowledge
mcp-eval provides two generation approaches:
- Structured scenario generation: Agent-driven generation with assertion specs
- Simple dataset generation: Backward-compatible basic test cases
CLI Generation Commands
Basic Generation
Advanced Generation Options
Generated Test Patterns
Scenario Structure
Assertion Types for Generation
Generation Templates
Pytest Template Structure
Decorator Template Structure
Generation Best Practices
1. Tool Discovery First
2. Iterative Refinement
3. Custom Instructions
Scenario Categories
When generating, create diverse test scenarios across:
Basic Functionality
- Simple tool usage
- Expected outputs
- Success paths
Error Handling
- Invalid inputs
- Network failures
- Tool errors
- Recovery patterns
Edge Cases
- Empty inputs
- Large payloads
- Special characters
- Boundary values
Performance
- Response times
- Token usage
- Iteration counts
- Concurrent operations
Integration
- Multi-tool workflows
- Tool sequencing
- State management
- Complex operations
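One way to check that a generated suite spans the categories above is to tag each scenario and count per category. The sketch below is only an illustration: the `Scenario` class, the category labels, and the scenario names are hypothetical and are not mcp-eval types.

```python
from collections import Counter
from dataclasses import dataclass

# Category labels mirror the list above; the identifiers are assumptions.
CATEGORIES = ("basic", "error_handling", "edge_case", "performance", "integration")


@dataclass(frozen=True)
class Scenario:
    """Hypothetical stand-in for a generated scenario; not an mcp-eval type."""
    name: str
    category: str


def coverage_by_category(scenarios):
    """Count scenarios per category, including categories with zero scenarios."""
    counts = Counter(s.category for s in scenarios)
    return {category: counts.get(category, 0) for category in CATEGORIES}


def missing_categories(scenarios):
    """Return the categories that no scenario covers, in list order."""
    coverage = coverage_by_category(scenarios)
    return [category for category, count in coverage.items() if count == 0]


scenarios = [
    Scenario("fetch_returns_page", "basic"),
    Scenario("fetch_rejects_invalid_url", "error_handling"),
    Scenario("fetch_handles_empty_input", "edge_case"),
]

print(missing_categories(scenarios))  # ['performance', 'integration']
```

A non-empty result is a prompt to regenerate with instructions that target the missing categories.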
Generation Examples
Example 1: Generate for Fetch Server
Example 2: Generate for Calculator Server
Example 3: Generate Dataset Tests
Customizing Generated Tests
After generation, enhance tests by:
1. Adding Setup/Teardown
2. Adding Custom Assertions
3. Adding Parametrization
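The three enhancements can be sketched in one small test class. This example uses the standard-library `unittest` module so it runs without extra dependencies; the same ideas map onto pytest fixtures and `pytest.mark.parametrize`. `FakeCalculator` and `assertToolCalled` are hypothetical helpers invented for illustration, not part of mcp-eval.

```python
import unittest


class FakeCalculator:
    """Hypothetical stand-in for a tool under test; not part of mcp-eval."""

    def __init__(self):
        self.calls = []

    def add(self, a, b):
        self.calls.append(("add", a, b))
        return a + b


class CalculatorTests(unittest.TestCase):
    def setUp(self):
        # Setup: give every test a fresh tool instance.
        self.calc = FakeCalculator()

    def tearDown(self):
        # Teardown: drop recorded calls so no state leaks between tests.
        self.calc.calls.clear()

    def assertToolCalled(self, tool_name):
        # Custom assertion: fail with a readable message if the tool was never called.
        called = [call[0] for call in self.calc.calls]
        self.assertIn(tool_name, called, f"expected a call to {tool_name!r}, got {called}")

    def test_add_cases(self):
        # Parametrization: one test body run against several input/output cases.
        cases = [(1, 2, 3), (0, 0, 0), (-5, 5, 0)]
        for a, b, expected in cases:
            with self.subTest(a=a, b=b):
                self.assertEqual(self.calc.add(a, b), expected)
        self.assertToolCalled("add")


# Run the suite explicitly so the example also works outside a test runner.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CalculatorTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping the custom assertion as a method on the test class gives failures a readable message without repeating the lookup logic in each test.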
Quality Checks for Generated Tests
After generation, verify:
- Tool names are correct: Match actual MCP server tools
- Assertions are appropriate: Mix of deterministic and judge-based
- Coverage is complete: All tools and major scenarios covered
- Error handling included: Negative test cases present
- Performance checks added: Response time and efficiency tests
- Documentation clear: Test purposes are documented
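The first check, that tool names match the server, is easy to automate. The following is a minimal sketch that assumes each generated scenario can be reduced to a name plus the tool names it references; the dictionary shape, the `fetch` tool name, and the `find_unknown_tools` helper are all hypothetical.

```python
def find_unknown_tools(scenarios, known_tools):
    """Map each scenario name to the sorted tool names it references that the server lacks."""
    known = set(known_tools)
    problems = {}
    for scenario in scenarios:
        unknown = sorted(set(scenario["tools"]) - known)
        if unknown:
            problems[scenario["name"]] = unknown
    return problems


# Assumed tool list, e.g. as reported by the server during discovery.
known_tools = ["fetch"]

# Assumed shape for generated scenarios: a name and the tools each one expects.
scenarios = [
    {"name": "fetch_basic", "tools": ["fetch"]},
    {"name": "fetch_typo", "tools": ["fetch_url"]},
]

print(find_unknown_tools(scenarios, known_tools))  # {'fetch_typo': ['fetch_url']}
```

An empty result means every referenced tool exists on the server; any entry points at a scenario to correct or regenerate.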
Generation Workflow
1. Discover server tools:
2. Generate initial tests:
3. Review and refine:
   - Check generated scenarios
   - Add missing test cases
   - Enhance assertions
4. Run and validate:
5. Iterate based on results:
   - Add tests for uncovered paths
   - Improve failing assertions
   - Optimize performance tests
Common Generation Issues and Fixes
Issue: Generated tests reference wrong tool names
Fix: Use the --discover-tools flag or specify the correct names in extra instructions
Issue: Tests are too simple
Fix: Use the --refine flag and provide detailed --extra-instructions