GenAI Bench¶
Unified, accurate, and beautiful LLM Benchmarking
What is GenAI Bench?¶
GenAI Bench is a benchmarking tool for comprehensive, token-level performance evaluation of large language model (LLM) serving systems.
It provides detailed insights into model serving performance, offering both a user-friendly CLI and a live UI for real-time progress monitoring.
Live UI Dashboard¶
GenAI Bench includes a real-time dashboard for monitoring your benchmarks as they run.
Key Features¶
- 🛠️ CLI Tool: Validates user inputs and initiates benchmarks seamlessly.
- 📊 Live UI Dashboard: Displays current progress, logs, and real-time metrics.
- 📝 Rich Logs: Automatically flushed to both terminal and file upon experiment completion.
- 📈 Experiment Analyzer: Generates Excel reports with pricing and raw metrics data, plus configurable plots (default 2x4 grid) of key performance metrics across traffic scenarios and concurrency levels: throughput, latency (time to first token (TTFT), end-to-end (E2E), time per output token (TPOT)), error rates, and requests per second (RPS). Supports custom plot layouts and multi-line comparisons.
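The latency metrics named in the feature list can be illustrated with a small calculation. The sketch below uses the common definitions of TTFT, E2E latency, and TPOT; the `RequestTiming` record and the helper functions are hypothetical names introduced for illustration and are not part of the GenAI Bench API. The exact formulas GenAI Bench uses are given in its Metrics Definition page and may differ in detail.

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    """Hypothetical per-request record (not a GenAI Bench data structure)."""
    start: float         # time the request was sent, in seconds
    first_token: float   # time the first output token arrived, in seconds
    end: float           # time the last output token arrived, in seconds
    output_tokens: int   # number of output tokens generated


def ttft(r: RequestTiming) -> float:
    """Time to first token: delay between sending and the first token."""
    return r.first_token - r.start


def e2e_latency(r: RequestTiming) -> float:
    """End-to-end latency: delay between sending and the last token."""
    return r.end - r.start


def tpot(r: RequestTiming) -> float:
    """Time per output token, averaged over tokens after the first.

    Assumed definition: (E2E - TTFT) / (output_tokens - 1).
    Returns 0.0 when fewer than two tokens were generated.
    """
    if r.output_tokens < 2:
        return 0.0
    return (e2e_latency(r) - ttft(r)) / (r.output_tokens - 1)


# Example request: first token after 0.5 s, 101 tokens finished at 2.5 s.
timing = RequestTiming(start=0.0, first_token=0.5, end=2.5, output_tokens=101)
print(f"TTFT={ttft(timing):.3f}s  E2E={e2e_latency(timing):.3f}s  "
      f"TPOT={tpot(timing):.4f}s")
```

Separating TTFT from TPOT distinguishes the time a server spends before it emits its first token from its steady-state decoding speed, which is why the two are usually reported alongside E2E latency.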
Quick Start¶
Get started with GenAI Bench in minutes.
For detailed installation and usage instructions, see our Installation Guide.
Supported Tasks¶
GenAI Bench supports multiple benchmark types:
Task | Description | Use Case |
---|---|---|
text-to-text | Benchmarks generating text output from text input | Chat, QA |
text-to-embeddings | Benchmarks generating embeddings from text input | Semantic search |
image-text-to-text | Benchmarks generating text from images and text prompts | Visual question answering |
image-to-embeddings | Benchmarks generating embeddings from images | Image similarity |
Documentation Sections¶
🚀 Getting Started¶
- Installation - Detailed installation guide
- Task Definition - Understanding different benchmark tasks
- Command Guidelines - Command usage guidelines
- Metrics Definition - Understanding benchmark metrics
📖 User Guide¶
- Run Benchmark - How to run benchmarks
- Multi-Cloud Authentication & Storage - Comprehensive guide for cloud provider authentication
- Multi-Cloud Quick Reference - Quick examples for common scenarios
- Run Benchmark with Docker - Docker-based benchmarking
- Generate Excel Sheet - Creating Excel reports
- Generate Plot - Creating visualizations
- Upload Benchmark Results - Uploading results
🔧 Development¶
- Contributing - How to contribute to GenAI Bench
Support¶
If you encounter any issues or have questions, please:
- Check our documentation for detailed guides
- Report issues on our GitHub repository
- Join our community discussions
License¶
GenAI Bench is open source and available under the MIT License.