> ## Documentation Index
>
> Fetch the complete documentation index at: https://mcp-eval.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# mcp-eval Documentation

> The comprehensive testing framework for MCP servers and tool-using agents.

**Your flight simulator for MCP servers and agents.** Connect agents to real MCP servers, run realistic scenarios, and calculate metrics for tool calls and more.

[Model Context Protocol](https://modelcontextprotocol.io/docs/getting-started/intro) standardizes how applications provide context to large language models (LLMs). Think of MCP like a USB-C port for AI applications.

**`mcp-eval`** ensures your MCP servers, and agents built with them, work reliably in production.

## What `mcp-eval` Does for You

* Ensure your MCP servers respond correctly to agent requests and handle edge cases gracefully
* Measure how effectively agents use tools, follow instructions, and recover from errors
* Monitor latency, token usage, cost, and success rates with OpenTelemetry-backed metrics
* Use structural checks, LLM judges, and path efficiency validators to ensure high quality

## Get Started in 30 Seconds

We recommend using [uv](https://docs.astral.sh/uv/):

```bash uv (recommended)
# Install mcp-eval globally (for CLI)
uv tool install mcpevals

# Add mcp-eval dependency to your project
uv add mcpevals

# Initialize your project (interactive setup)
mcp-eval init

# Add your MCP server to test
mcp-eval server add

# Auto-generate tests with an LLM
mcp-eval generate

# Run decorator/dataset tests
mcp-eval run tests/

# Run pytest tests (use pytest)
uv run pytest -q tests
```

```bash pip
# Install mcp-eval
pip install mcpevals

# Initialize your project
mcp-eval init

# Add your MCP server
mcp-eval server add

# Run decorator/dataset tests
mcp-eval run tests/

# Run pytest tests (use pytest)
pytest -q tests
```

**Test any MCP server:** It doesn't matter what language your MCP server is written in: Python,
TypeScript, Go, Rust, Java, or any other. As long as it implements the MCP protocol, `mcp-eval` can test it!

You're ready to start testing! [Continue with the Quickstart →](./quickstart)

## 🎮 Choose Your Testing Adventure

What are you evaluating today?

**You built an MCP server** (in any language!) and want to ensure it handles agent requests correctly.

`mcp-eval` will spin up an AI agent to test your server with realistic requests, edge cases, and error scenarios.

**Your server could be:**

* A streamable HTTP database connector
* An SSE API wrapper
* A stdio file system server
* Any server that speaks MCP!

See also: MCP Server Testing Guide, Testing the Fetch Server

**You built an AI agent** that uses MCP servers and want to ensure it uses tools effectively.

`mcp-eval` will connect your agent to MCP servers and verify it uses tools correctly, handles errors, and meets performance targets.

**Your agent could be:**

* A customer service bot
* A coding assistant
* A deep research agent
* Any MCP agent!

See also: Agent Evaluation Guide, Common Testing Patterns

**You're building a complete system** with both MCP servers and agents.

`mcp-eval` can test your entire integration, ensuring servers handle requests correctly AND agents use tools effectively.
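
Whichever scenario applies, the checks described above (was the right tool called, were errors handled, was the performance target met) come down to comparing a record of the agent's tool calls against expectations. The sketch below illustrates that idea in plain Python only. It does not use or reproduce `mcp-eval`'s API: `ToolCall`, `was_called`, `total_time_under`, `success_rate`, and the trace data are all hypothetical names and values invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    """One recorded tool invocation (hypothetical shape, for illustration only)."""
    name: str
    duration_ms: float
    is_error: bool = False


def was_called(calls, tool_name, min_times=1):
    """True if `tool_name` appears at least `min_times` times in the trace."""
    return sum(1 for c in calls if c.name == tool_name) >= min_times


def total_time_under(calls, limit_ms):
    """True if the summed tool time stays below `limit_ms`."""
    return sum(c.duration_ms for c in calls) < limit_ms


def success_rate(calls):
    """Fraction of calls that did not error; 1.0 for an empty trace."""
    if not calls:
        return 1.0
    return sum(1 for c in calls if not c.is_error) / len(calls)


# A made-up trace: the agent retried `fetch` once after an error.
trace = [
    ToolCall("fetch", duration_ms=800.0, is_error=True),
    ToolCall("fetch", duration_ms=650.0),
]

# Each entry pairs a human-readable label with the result of one check.
checks = {
    "fetch was called": was_called(trace, "fetch"),
    "total tool time under 5000 ms": total_time_under(trace, 5000),
    "success rate at least 50%": success_rate(trace) >= 0.5,
}

for label, passed in checks.items():
    print(f"{'PASS' if passed else 'FAIL'}  {label}")
```

A real framework would collect the trace automatically rather than from a hand-written list. The point here is only that tool verification, error recovery, and performance gates can all be expressed as simple predicates over the same recorded data.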

See also: 5-minute setup, Core concepts, Browse all examples

## Why Teams Choose `mcp-eval`

* **Production-readiness**: Built on OpenTelemetry for enterprise-grade observability
* **Multiple test styles**: Choose between decorators, pytest, or dataset-driven testing
* **Rich assertions**: Content checks, tool verification, performance gates, and LLM judges
* **CI/CD friendly**: GitHub Actions support, JSON/HTML reports, and regression detection
* **Language agnostic**: Test MCP servers written in any language

## Quick Navigation

* Get up and running in 5 minutes
* Step-by-step guides for typical tasks
* Complete assertion catalog and APIs

## Learning Path

* Understand `mcp-eval`'s architecture and philosophy
* Your first test in 5 minutes
* Core concepts and terminology
* The unified Expect API for all assertions
* Practical testing patterns
* AI-powered test creation
* Testing MCP server implementations
* Measuring agent effectiveness
* Systematic evaluation suites
* Settings and customization
* GitHub Actions and automation
* Understanding test outputs
* Complete command documentation
* Detailed API documentation
* Common issues and solutions
* Frequently asked questions

## Example: Your First Test

```python test_fetch.py
from mcp_eval import task, Expect


@task("Verify fetch server works correctly")
async def test_fetch(agent, session):
    # Ask the agent to fetch a webpage
    response = await agent.generate_str("Fetch https://example.com and summarize it")

    # Assert the right tool was called
    await session.assert_that(Expect.tools.was_called("fetch"))

    # Verify the content is correct
    await session.assert_that(Expect.content.contains("Example Domain"), response=response)

    # Check performance
    await session.assert_that(Expect.performance.response_time_under(5000))
```

```python pytest_style.py
import pytest
from mcp_eval import create_agent, Expect


@pytest.mark.asyncio
async def test_fetch_with_pytest():
    # Create an agent and ask it to fetch a webpage
    agent = await create_agent("claude-3-5-sonnet")
    response = await agent.generate_str("Fetch https://example.com")

    # Verify the content with plain assert statements
    assert "Example Domain" in response

    # Verify that only the fetch tool was called
    assert agent.tools_called == ["fetch"]
```

[See more examples →](./examples)

## Join the Community

* Report issues and contribute
* Get help and share experiences