Decision Workspace

llm-test-bench-core vs llm-test-bench vs serdes-ai-evals

Side-by-side comparison of Rust crates

llm-test-bench-core

experimental v0.1.0

Core library for LLM Test Bench - comprehensive testing framework for Large Language Models with 65+ supported models across 14+ providers

llm-test-bench

experimental v0.1.0

A production-grade CLI for testing and benchmarking LLM applications with support for GPT-5, Claude Opus 4, Gemini 2.5, and 65+ models

serdes-ai-evals

experimental v0.2.6

Evaluation framework for testing and benchmarking serdes-ai agents

Core Metrics

	llm-test-bench-core	llm-test-bench	serdes-ai-evals
Health Score	42	40	48
Total Downloads	127	40	399
30d Downloads	26	5	109
Dependents	1	0	8
Releases	1	1	10
Last Updated	143d ago	143d ago	35d ago
Age	4m	4m	2m

Health Breakdown

	llm-test-bench-core	llm-test-bench	serdes-ai-evals
Maintenance	6	6	11
Quality	11	11	14
Community	7	6	8
Popularity	3	2	3
Documentation	15	15	12
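
The five category values listed for each crate add up to that crate's Health Score in Core Metrics (42, 40, and 48). The page does not state the scoring rule, so the sketch below assumes a plain unweighted sum and checks it against the listed numbers. The `Breakdown` struct and `health_score` function are hypothetical names for this illustration, not part of any of the three crates.

```rust
/// Per-category sub-scores as listed in the Health Breakdown.
/// Struct and field names are hypothetical, chosen for this sketch.
#[derive(Debug, Clone, Copy)]
struct Breakdown {
    maintenance: u32,
    quality: u32,
    community: u32,
    popularity: u32,
    documentation: u32,
}

/// Assumed rule: the health score is the unweighted sum of the five categories.
fn health_score(b: &Breakdown) -> u32 {
    b.maintenance + b.quality + b.community + b.popularity + b.documentation
}

fn main() {
    // (crate name, breakdown from the page, Health Score listed in Core Metrics)
    let crates = [
        (
            "llm-test-bench-core",
            Breakdown { maintenance: 6, quality: 11, community: 7, popularity: 3, documentation: 15 },
            42,
        ),
        (
            "llm-test-bench",
            Breakdown { maintenance: 6, quality: 11, community: 6, popularity: 2, documentation: 15 },
            40,
        ),
        (
            "serdes-ai-evals",
            Breakdown { maintenance: 11, quality: 14, community: 8, popularity: 3, documentation: 12 },
            48,
        ),
    ];

    for (name, breakdown, listed) in crates {
        let total = health_score(&breakdown);
        // Fails loudly if the assumed sum rule disagrees with the listed score.
        assert_eq!(total, listed, "{name}: breakdown does not match listed score");
        println!("{name}: {total}/100");
    }
}
```

Under this assumption, the 6-point gap between serdes-ai-evals and llm-test-bench-core comes from Maintenance (+5), Quality (+3), and Community (+1), partly offset by Documentation (-3).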

Technical Details

	llm-test-bench-core	llm-test-bench	serdes-ai-evals
Version	0.1.0	0.1.0	0.2.6
Stable (≥1.0)	✗ No	✗ No	✗ No
License	MIT OR Apache-2.0	MIT OR Apache-2.0	MIT
Dependencies	55	22	15
Crate Size	355KB	79KB	29KB
Features	2	0	2
Yanked %	0.0%	0.0%	0.0%
Edition	2021	2021	2021
MSRV	1.75.0	1.75.0	1.75.0
Owners	1	1	1

Links

llm-test-bench-core: detail → · GitHub ↗ · Docs ↗ · crates.io ↗

llm-test-bench: detail → · GitHub ↗ · Docs ↗ · crates.io ↗

serdes-ai-evals: detail → · GitHub ↗ · crates.io ↗

Quick Verdict

• serdes-ai-evals leads with a health score of 48/100, ahead of llm-test-bench-core (42) and llm-test-bench (40); none of the three scores above 80.