GenAI Bench

Unified, accurate, and beautiful LLM Benchmarking

What is GenAI Bench?

GenAI Bench (`genai-bench`) is a benchmarking tool for token-level performance evaluation of large language model (LLM) serving systems.

It provides detailed insights into model serving performance, offering both a user-friendly CLI and a live UI for real-time progress monitoring.

Live UI Dashboard

GenAI Bench includes a real-time dashboard that provides live monitoring of your benchmarks:

[Image: GenAI Bench UI Dashboard]

Key Features

🛠️ CLI Tool: Validates user inputs and initiates benchmarks seamlessly.
📊 Live UI Dashboard: Displays current progress, logs, and real-time metrics.
📝 Rich Logs: Automatically flushed to both terminal and file upon experiment completion.
📈 Experiment Analyzer: Generates comprehensive Excel reports with pricing and raw metrics data, plus flexible plot configurations (default 2x4 grid) that visualize key performance metrics including throughput, latency (TTFT, E2E, TPOT), error rates, and RPS across different traffic scenarios and concurrency levels. Supports custom plot layouts and multi-line comparisons.
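The latency metrics named above (TTFT, E2E, TPOT) can be illustrated with a minimal Python sketch. It uses commonly cited definitions: time to first token, end-to-end latency, and time per output token after the first. The `RequestTiming` class and the helper functions are hypothetical names introduced for illustration only. They are not part of the GenAI Bench API, and the exact formulas GenAI Bench uses are the ones given in its Metrics Definition page.

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    """Hypothetical per-request timestamps, in seconds (not a GenAI Bench type)."""
    start: float         # when the request was sent
    first_token: float   # when the first output token arrived
    end: float           # when the last output token arrived
    output_tokens: int   # number of output tokens generated


def ttft(t: RequestTiming) -> float:
    """Time to first token: delay between sending the request and the first token."""
    return t.first_token - t.start


def e2e(t: RequestTiming) -> float:
    """End-to-end latency: delay between sending the request and the last token."""
    return t.end - t.start


def tpot(t: RequestTiming) -> float:
    """Time per output token, averaged over the tokens after the first one."""
    if t.output_tokens <= 1:
        return 0.0  # assumption: no decode phase to average over
    return (t.end - t.first_token) / (t.output_tokens - 1)


sample = RequestTiming(start=0.0, first_token=0.25, end=2.25, output_tokens=101)
print(f"TTFT={ttft(sample):.3f}s  E2E={e2e(sample):.3f}s  TPOT={tpot(sample):.3f}s")
```

Separating TTFT from TPOT is useful because the first reflects prompt processing and queueing, while the second reflects steady-state decoding speed.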

Quick Start

Get started with GenAI Bench in minutes:

# Install from PyPI
pip install genai-bench

# Run your first benchmark
genai-bench benchmark --help

For detailed installation and usage instructions, see our Installation Guide.

Supported Tasks

GenAI Bench supports multiple benchmark types:

Task	Description	Use Case
`text-to-text`	Benchmarks generating text output from text input	Chat, QA
`text-to-embeddings`	Benchmarks generating embeddings from text input	Semantic search
`image-text-to-text`	Benchmarks generating text from images and text prompts	Visual question answering
`image-to-embeddings`	Benchmarks generating embeddings from images	Image similarity
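The task names in the table follow an `<input>-to-<output>` pattern, where the input side may list more than one modality. The sketch below shows how such a name can be read. `split_task` is a hypothetical helper written for this illustration, not a GenAI Bench function, and it assumes every task name contains exactly this `-to-` separator.

```python
def split_task(task: str) -> tuple[list[str], str]:
    """Split a task name such as 'image-text-to-text' into inputs and output.

    Hypothetical helper for illustration; not part of GenAI Bench.
    """
    # Split on the last '-to-' so everything before it is the input side.
    inputs, sep, output = task.rpartition("-to-")
    if not sep or not inputs or not output:
        raise ValueError(f"not an '<input>-to-<output>' task name: {task!r}")
    return inputs.split("-"), output


for name in ("text-to-text", "text-to-embeddings",
             "image-text-to-text", "image-to-embeddings"):
    modalities, output = split_task(name)
    print(f"{name}: inputs={modalities}, output={output}")
```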

Documentation Sections

🚀 Getting Started

Installation - Detailed installation guide
Task Definition - Understanding different benchmark tasks
Command Guidelines - Command usage guidelines
Metrics Definition - Understanding benchmark metrics

📖 User Guide

Run Benchmark - How to run benchmarks
Multi-Cloud Authentication & Storage - Comprehensive guide for cloud provider authentication
Multi-Cloud Quick Reference - Quick examples for common scenarios
Run Benchmark with Docker - Docker-based benchmarking
Generate Excel Sheet - Creating Excel reports
Generate Plot - Creating visualizations
Upload Benchmark Results - Uploading results

🔧 Development

Contributing - How to contribute to GenAI Bench

Support

If you encounter any issues or have questions, please:

- Check our documentation for detailed guides
- Report issues on our GitHub repository
- Join our community discussions

License

GenAI Bench is open source and available under the MIT License.