The comprehensive testing framework for MCP servers and tool-using agents.
Your flight simulator for MCP servers and agents: connect agents to real MCP servers, run realistic scenarios, and calculate metrics for tool calls and more.
The Model Context Protocol (MCP) standardizes how applications provide context to large language models (LLMs). Think of MCP as a USB-C port for AI applications. mcp-eval ensures that your MCP servers, and the agents built with them, work reliably in production.
```bash
# Install mcp-eval globally (for the CLI)
uv tool install mcpevals

# Add mcp-eval as a dependency to your project
uv add mcpevals

# Initialize your project (interactive setup)
mcp-eval init

# Add the MCP server you want to test
mcp-eval server add

# Auto-generate tests with an LLM
mcp-eval generate

# Run decorator/dataset tests
mcp-eval run tests/

# Run pytest-style tests
uv run pytest -q tests
```
Test any MCP server: it doesn't matter what language your MCP server is written in (Python, TypeScript, Go, Rust, Java, or any other). As long as it implements the MCP protocol, mcp-eval can test it!
```python
from mcp_eval import task, Expect

@task("Verify fetch server works correctly")
async def test_fetch(agent, session):
    # Ask the agent to fetch a webpage
    response = await agent.generate_str("Fetch https://example.com and summarize it")

    # Assert the right tool was called
    await session.assert_that(Expect.tools.was_called("fetch"))

    # Verify the content is correct
    await session.assert_that(Expect.content.contains("Example Domain"), response=response)

    # Check performance
    await session.assert_that(Expect.performance.response_time_under(5000))
```
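Because mcp-eval exercises your server over the MCP protocol, the same test style applies whatever language the server is written in. Below is a minimal sketch against a hypothetical TypeScript filesystem server; the test name, prompt, tool name (`list_directory`), and expected content are illustrative placeholders, not part of mcp-eval's API.

```python
from mcp_eval import task, Expect

# Hypothetical example: the server under test is written in TypeScript and
# exposes a "list_directory" tool. Only the MCP protocol matters here,
# not the server's implementation language.
@task("List files via a TypeScript filesystem server")
async def test_list_files(agent, session):
    # Ask the agent to use the server's tools
    response = await agent.generate_str("List the files in the project root")

    # The tool name is illustrative; assert against whatever your server exposes
    await session.assert_that(Expect.tools.was_called("list_directory"))

    # Check that the listing mentions an expected file
    await session.assert_that(Expect.content.contains("README.md"), response=response)
```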