Learn about mcp-eval, the comprehensive framework for testing MCP servers and tool-using agents in production-like environments.
mcp-eval is your “flight simulator” for tool-using LLMs. Connect agents to real MCP servers, run realistic scenarios, and get production-grade insights into behavior and performance.
Test styles: pytest, datasets, or @task decorators.
@task decorators for quick tests
Pytest integration: Use familiar pytest fixtures and markers (run with uv run pytest)
Dataset driven: Systematic evaluation with test matrices
AI generation: Let Claude/GPT generate test scenarios
Parameterization: Test variations with minimal code

mcp-eval init sets up everything
Smart CLI: Discover servers, generate tests, validate config
Rich reports: Console, JSON, Markdown, interactive HTML
CI/CD ready: GitHub Actions, exit codes, artifact uploads
Helpful diagnostics: doctor and validate commands

Configure your environment
Write test scenarios
Execute tests
Process traces into metrics
Apply assertions
Generate reports
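
The six steps above can be pictured as a small pipeline. The sketch below is illustrative only: every name in it (Span, Scenario, configure, execute, to_metrics, run, report) is a hypothetical stand-in and not the mcp-eval API, and the execute step returns a fixed trace instead of driving a real agent against an MCP server.

```python
from dataclasses import dataclass, field
from typing import Callable

# All names below are hypothetical stand-ins, not the mcp-eval API.


@dataclass
class Span:
    """One recorded tool call from a scenario run."""
    tool: str
    duration_ms: float
    ok: bool


@dataclass
class Scenario:
    """A named test scenario with checks that run against its metrics."""
    name: str
    prompt: str
    assertions: list[Callable[[dict], bool]] = field(default_factory=list)


def configure() -> dict:
    # Step 1: environment settings (hard-coded here for illustration).
    return {"server": "fetch", "max_tool_calls": 3}


def execute(scenario: Scenario, config: dict) -> list[Span]:
    # Step 3: a real run would drive an agent against an MCP server.
    # This stub returns a fixed trace so the sketch runs offline.
    return [
        Span(config["server"], 120.0, True),
        Span(config["server"], 80.0, True),
    ]


def to_metrics(trace: list[Span]) -> dict:
    # Step 4: reduce raw spans to aggregate metrics.
    return {
        "tool_calls": len(trace),
        "total_ms": sum(s.duration_ms for s in trace),
        "errors": sum(1 for s in trace if not s.ok),
    }


def run(scenarios: list[Scenario], config: dict) -> list[dict]:
    results = []
    for sc in scenarios:
        metrics = to_metrics(execute(sc, config))
        # Step 5: a scenario passes only if every assertion holds.
        passed = all(check(metrics) for check in sc.assertions)
        results.append({"scenario": sc.name, "passed": passed, **metrics})
    return results


def report(results: list[dict]) -> str:
    # Step 6: render a plain-text report, one line per scenario.
    return "\n".join(
        f"{'PASS' if r['passed'] else 'FAIL'} {r['scenario']} "
        f"({r['tool_calls']} calls, {r['total_ms']:.0f} ms)"
        for r in results
    )


config = configure()
# Step 2: write scenarios, each carrying its own assertions.
scenarios = [
    Scenario(
        name="fetch_homepage",
        prompt="Fetch example.com and summarize it",
        assertions=[
            lambda m: m["errors"] == 0,
            lambda m: m["tool_calls"] <= config["max_tool_calls"],
        ],
    )
]
results = run(scenarios, config)
exit_code = 0 if all(r["passed"] for r in results) else 1  # usable in CI
print(report(results))
```

Keeping metrics as a plain dictionary means each assertion is just a predicate over that dictionary, so new checks can be added without touching the execution or reporting steps.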