
User Guide

This guide covers everything you need to use GenAI Bench effectively for benchmarking LLM endpoints.

What You'll Learn

  • Running Benchmarks - how to run benchmarks against various LLM endpoints
  • Multi-Cloud Setup - how to configure authentication for AWS, Azure, GCP, OCI, and more
  • Docker Deployment - how to run GenAI Bench in containerized environments
  • Excel Reports - how to generate comprehensive Excel reports from benchmark results

Common Workflows

Basic Benchmarking

  1. Choose your model provider - OpenAI, AWS Bedrock, Azure OpenAI, etc.
  2. Configure authentication - API keys, IAM roles, or service accounts
  3. Run the benchmark - Specify task type and parameters
  4. Analyze results - View real-time dashboard or generate reports
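The steps above can be sketched as a single command. This is a minimal sketch, not a complete reference: the `--api-backend`, `--api-key`, and task names appear elsewhere in this guide, but the `--task` flag name is an assumption here.

```
# Sketch of a basic benchmark run (--task flag name is assumed)
genai-bench benchmark \
  --api-backend openai \
  --api-key $OPENAI_KEY \
  --task text-to-text
```

See the Run Benchmark guide for the full set of supported options.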

Cross-Cloud Benchmarking

Benchmark models from one provider while storing results in another:

# Benchmark OpenAI, store in AWS S3
genai-bench benchmark \
  --api-backend openai \
  --api-key $OPENAI_KEY \
  --upload-results \
  --storage-provider aws \
  --storage-bucket my-results

Multi-Modal Tasks

Support for text, embeddings, and vision tasks:

  • text-to-text - Chat and completion tasks
  • text-to-embeddings - Embedding generation
  • image-text-to-text - Vision-language tasks
  • text-to-rerank - Document reranking
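As a hedged sketch, selecting a task should only require swapping the task name; the `--task` flag name and the `--dataset-path` option shown here are assumptions for illustration, not confirmed flags.

```
# Sketch of a vision-language benchmark (flag names assumed)
genai-bench benchmark \
  --api-backend openai \
  --api-key $OPENAI_KEY \
  --task image-text-to-text \
  --dataset-path ./my-image-dataset
```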

Need Help?