Traces → metrics

The session writes OpenTelemetry (OTEL) spans to a JSONL file; mcp-eval then derives the following metrics from them:
  • Tool calls (names, arguments, timing, errors)
  • Iteration count, response latency
  • Token and cost estimates
  • Tool coverage by server (available vs used)
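
As a rough illustration of this conversion, the sketch below aggregates tool-call counts, errors, total tool time, and coverage from JSONL span lines. It is not mcp-eval's implementation: the span fields (`attributes`, `tool.name`, `status`, `start_time_ns`, `end_time_ns`) and the function names are assumptions made for this example, and the real trace schema may differ.

```python
import json
from collections import Counter


def summarize_spans(jsonl_lines):
    """Aggregate tool-call metrics from an iterable of JSONL span lines.

    The field names used here are assumed for illustration only.
    """
    calls = Counter()
    errors = Counter()
    tool_time_ms = 0.0
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        span = json.loads(line)
        tool = span.get("attributes", {}).get("tool.name")  # assumed key
        if tool is None:
            continue  # not a tool-call span
        calls[tool] += 1
        if span.get("status") == "ERROR":
            errors[tool] += 1
        # Span duration in milliseconds from nanosecond timestamps.
        tool_time_ms += (span["end_time_ns"] - span["start_time_ns"]) / 1e6
    return {"calls": dict(calls), "errors": dict(errors), "tool_time_ms": tool_time_ms}


def tool_coverage(available, used):
    """Fraction of available tools that were called at least once."""
    available = set(available)
    if not available:
        return 0.0
    return len(available & set(used)) / len(available)


sample = [
    '{"name": "call_tool", "attributes": {"tool.name": "fetch"}, "status": "OK", "start_time_ns": 0, "end_time_ns": 2000000}',
    '{"name": "call_tool", "attributes": {"tool.name": "fetch"}, "status": "ERROR", "start_time_ns": 2000000, "end_time_ns": 5000000}',
    '{"name": "llm_generate", "attributes": {}, "status": "OK", "start_time_ns": 5000000, "end_time_ns": 9000000}',
]
summary = summarize_spans(sample)
coverage = tool_coverage(["fetch", "search"], summary["calls"])
```

Here `summary` counts two `fetch` calls, one of them an error, and `coverage` compares the tools used against a hypothetical list of two available tools.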

Span tree analysis

SpanTree enables analysis of:
  • LLM rephrasing loops
  • Inefficient tool paths
  • Error recovery sequences
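
To make the idea of loop detection concrete, here is a minimal sketch that groups spans by parent and flags runs of consecutive calls to the same tool. This is a simple heuristic standing in for loop detection, not the SpanTree API: the span fields (`id`, `parent_id`, `start`, `tool`) and both function names are hypothetical.

```python
from collections import defaultdict


def build_children(spans):
    """Map each parent id to its child spans, ordered by start time."""
    children = defaultdict(list)
    for span in spans:
        children[span.get("parent_id")].append(span)
    for siblings in children.values():
        siblings.sort(key=lambda s: s["start"])
    return children


def find_repeated_calls(spans, min_repeats=3):
    """Return (tool, run_length) for each run of at least min_repeats
    consecutive calls to the same tool under one parent span."""
    loops = []
    for siblings in build_children(spans).values():
        run_tool, run_len = None, 0
        for span in siblings:
            tool = span.get("tool")  # None for non-tool spans (assumed field)
            if tool is not None and tool == run_tool:
                run_len += 1
                continue
            # The run ended: record it if it was long enough, then restart.
            if run_tool is not None and run_len >= min_repeats:
                loops.append((run_tool, run_len))
            run_tool = tool
            run_len = 1 if tool is not None else 0
        if run_tool is not None and run_len >= min_repeats:
            loops.append((run_tool, run_len))
    return loops


spans = [
    {"id": "root", "parent_id": None, "start": 0, "tool": None},
    {"id": "a", "parent_id": "root", "start": 1, "tool": "search"},
    {"id": "b", "parent_id": "root", "start": 2, "tool": "search"},
    {"id": "c", "parent_id": "root", "start": 3, "tool": "search"},
    {"id": "d", "parent_id": "root", "start": 4, "tool": "fetch"},
]
loops = find_repeated_calls(spans)
```

In this sample, the three back-to-back `search` calls form one flagged run, while the single `fetch` call does not. A real detector would likely also compare arguments, since repeated calls with different inputs can be legitimate.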

Artifacts

  • Traces: ./test-reports/*.jsonl
  • Per‑test JSON results: ./test-reports/*_results.json
  • Combined JSON/Markdown/HTML reports (via runner options)
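
For post-processing, the artifacts can be located by the glob patterns listed above. The helper below is a small sketch; `collect_artifacts` is a hypothetical name, and the default directory simply mirrors the paths shown in this section.

```python
from pathlib import Path


def collect_artifacts(report_dir="./test-reports"):
    """List trace and per-test result files found in report_dir."""
    root = Path(report_dir)
    return {
        # Raw OTEL traces, one JSONL file each.
        "traces": sorted(p.name for p in root.glob("*.jsonl")),
        # Per-test JSON results.
        "results": sorted(p.name for p in root.glob("*_results.json")),
    }
```

Calling `collect_artifacts()` after a run returns the file names grouped by kind; a missing directory yields two empty lists.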