API
Case[Input, Output, Metadata]
Dataset[Input, Output, Metadata]
Programmatic
YAML/JSON
Save and load datasets via Dataset.to_file and Dataset.from_file. Schema: mcpeval.config.schema.json.
YAML example (from basic_fetch_dataset.yaml):
Concurrency
Dataset.evaluate(..., max_concurrency=N) runs cases in parallel, with at most N cases in flight at once.
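To illustrate what a concurrency cap of this kind usually means, here is a minimal sketch using only the Python standard library. It does not use or reproduce the library's implementation: `run_case` and `evaluate_all` are hypothetical stand-ins, and the assumption is that `max_concurrency` bounds the number of cases running at the same time.

```python
import asyncio


async def run_case(name: str, delay: float) -> str:
    # Hypothetical stand-in for evaluating a single case.
    await asyncio.sleep(delay)
    return f"{name}: done"


async def evaluate_all(cases, max_concurrency: int):
    # The semaphore admits at most `max_concurrency` cases at a time.
    semaphore = asyncio.Semaphore(max_concurrency)
    running = 0  # cases currently inside the semaphore
    peak = 0     # highest value `running` ever reached

    async def bounded(name: str, delay: float) -> str:
        nonlocal running, peak
        async with semaphore:
            running += 1
            peak = max(peak, running)
            try:
                return await run_case(name, delay)
            finally:
                running -= 1

    # gather() returns results in the order the cases were submitted.
    results = await asyncio.gather(*(bounded(n, d) for n, d in cases))
    return results, peak


if __name__ == "__main__":
    cases = [(f"case_{i}", 0.01) for i in range(6)]
    results, peak = asyncio.run(evaluate_all(cases, max_concurrency=2))
    print(results)
    print("peak concurrency:", peak)
```

Tracking the peak makes the cap observable: however many cases are submitted, the number running together never exceeds the limit, while results still come back in submission order.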