Multi-Cloud Authentication and Storage Guide¶

genai-bench supports authentication against multiple cloud providers, for both model endpoints and storage services. This guide shows how to configure authentication for each supported provider.

Table of Contents¶

- Overview
- Model Provider Authentication
  - OpenAI
  - OCI Cohere
  - AWS Bedrock
  - Azure OpenAI
  - GCP Vertex AI
  - SGLang or vLLM
- Storage Provider Authentication
  - OCI Object Storage
  - AWS S3
  - Azure Blob Storage
  - GCP Cloud Storage
  - GitHub Releases
- Command Examples
- Environment Variables
- Best Practices

Overview¶

genai-bench separates authentication into two categories:

1. Model Authentication: For accessing LLM endpoints
2. Storage Authentication: For uploading benchmark results

This separation allows you to benchmark models from one provider while storing results in another provider's storage service.

Model Provider Authentication¶

OpenAI¶

OpenAI uses API key authentication.

Required parameters:

- --api-backend openai
- --api-key or --model-api-key: Your OpenAI API key

Example:

genai-bench benchmark \
  --api-backend openai \
  --api-base https://api.openai.com/v1 \
  --api-key sk-... \
  --api-model-name gpt-4 \
  --model-tokenizer gpt2 \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10

Environment variable alternative:

export MODEL_API_KEY=sk-...
genai-bench benchmark --api-backend openai ...
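Typing the key into an export command leaves it in shell history. As one possible alternative, the sketch below reads the key from a file and exports it as MODEL_API_KEY. The helper name load_model_api_key and the key file location are hypothetical and not part of genai-bench; only the MODEL_API_KEY variable comes from this guide.

```shell
# Hypothetical helper (not part of genai-bench): export MODEL_API_KEY from
# the first line of a key file, so the key never appears on the command line.
load_model_api_key() {
  key_file=$1
  if [ ! -r "$key_file" ]; then
    echo "cannot read key file: $key_file" >&2
    return 1
  fi
  # read returns non-zero when the file lacks a trailing newline, but it
  # still fills the variable, so ignore its exit status and check the value.
  IFS= read -r MODEL_API_KEY < "$key_file" || true
  if [ -z "$MODEL_API_KEY" ]; then
    echo "first line of key file is empty: $key_file" >&2
    return 1
  fi
  export MODEL_API_KEY
}
```

Assuming the key was saved to a file readable only by you, calling load_model_api_key with that file's path before genai-bench benchmark --api-backend openai ... should have the same effect as the export shown above.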

OCI Cohere¶

OCI supports multiple authentication methods.

Authentication types:

- user_principal: Default, uses OCI config file
- instance_principal: For compute instances
- security_token: For delegation tokens
- instance_obo_user: Instance principal with user delegation

Required parameters:

- --api-backend oci-cohere or --api-backend cohere
- --auth: Authentication type (default: user_principal)

User Principal example:

genai-bench benchmark \
  --api-backend oci-cohere \
  --api-base https://inference.generativeai.us-chicago-1.oci.oraclecloud.com \
  --auth user_principal \
  --config-file ~/.oci/config \
  --profile DEFAULT \
  --api-model-name cohere.command-r-plus \
  --model-tokenizer Cohere/command-r-plus \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10

Instance Principal example:

genai-bench benchmark \
  --api-backend oci-cohere \
  --api-base https://inference.generativeai.us-chicago-1.oci.oraclecloud.com \
  --auth instance_principal \
  --region us-chicago-1 \
  --api-model-name cohere.command-r-plus \
  --model-tokenizer Cohere/command-r-plus \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10

AWS Bedrock¶

AWS Bedrock supports explicit IAM credentials, named profiles, and the AWS SDK's default environment variables.

Authentication methods:

1. IAM Credentials: Access key ID and secret access key
2. AWS Profile: Named profile from credentials file
3. Environment variables: AWS SDK default behavior

Required parameters:

- --api-backend aws-bedrock
- --aws-region: AWS region for Bedrock

IAM Credentials example:

genai-bench benchmark \
  --api-backend aws-bedrock \
  --api-base https://bedrock-runtime.us-east-1.amazonaws.com \
  --aws-access-key-id AKIAIOSFODNN7EXAMPLE \
  --aws-secret-access-key wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY \
  --aws-region us-east-1 \
  --api-model-name anthropic.claude-3-sonnet-20240229-v1:0 \
  --model-tokenizer Anthropic/claude-3-sonnet \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10

AWS Profile example:

genai-bench benchmark \
  --api-backend aws-bedrock \
  --api-base https://bedrock-runtime.us-west-2.amazonaws.com \
  --aws-profile production \
  --aws-region us-west-2 \
  --api-model-name amazon.titan-text-express-v1 \
  --model-tokenizer amazon/titan \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10

Environment variables:

export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export AWS_DEFAULT_REGION=us-east-1

genai-bench benchmark --api-backend aws-bedrock ...
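When relying on environment variables, a missing variable usually surfaces only as an authentication failure at request time. The sketch below is a small preflight check that reports which variables are unset, printing names but never values. The function name require_env is hypothetical and not part of genai-bench or the AWS tooling.

```shell
# Hypothetical preflight helper: succeed only if every named environment
# variable is set and non-empty. Prints variable names, never their values.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name (names must be trusted).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For the AWS example above, one way to use it is require_env AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION && genai-bench benchmark --api-backend aws-bedrock ... The check applies only to the environment variable method; it says nothing about profiles or other credential sources the AWS SDK may pick up.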

Azure OpenAI¶

Azure OpenAI supports API key and Azure AD authentication.

Authentication methods:

1. API Key: Traditional API key authentication
2. Azure AD: Azure Active Directory token

Required parameters:

- --api-backend azure-openai
- --azure-endpoint: Your Azure OpenAI endpoint
- --azure-deployment: Your deployment name
- --azure-api-version: API version (default: 2024-02-01)

API Key example:

genai-bench benchmark \
  --api-backend azure-openai \
  --api-base https://myresource.openai.azure.com \
  --azure-endpoint https://myresource.openai.azure.com \
  --azure-deployment my-gpt-4-deployment \
  --model-api-key YOUR_AZURE_API_KEY \
  --api-model-name gpt-4 \
  --model-tokenizer gpt2 \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10

Azure AD example:

genai-bench benchmark \
  --api-backend azure-openai \
  --api-base https://myresource.openai.azure.com \
  --azure-endpoint https://myresource.openai.azure.com \
  --azure-deployment my-gpt-4-deployment \
  --azure-ad-token YOUR_AAD_TOKEN \
  --api-model-name gpt-4 \
  --model-tokenizer gpt2 \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10

GCP Vertex AI¶

GCP Vertex AI supports service account, API key, and Application Default Credentials authentication.

Authentication methods:

1. Service Account: JSON key file
2. API Key: For certain Vertex AI services
3. Application Default Credentials: GCP SDK default

Required parameters:

- --api-backend gcp-vertex
- --gcp-project-id: Your GCP project ID
- --gcp-location: GCP region (default: us-central1)

Service Account example:

genai-bench benchmark \
  --api-backend gcp-vertex \
  --api-base https://us-central1-aiplatform.googleapis.com \
  --gcp-project-id my-project-123 \
  --gcp-location us-central1 \
  --gcp-credentials-path /path/to/service-account.json \
  --api-model-name gemini-1.5-pro \
  --model-tokenizer google/gemini \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10

Environment variable:

export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
export GCP_PROJECT_ID=my-project-123

genai-bench benchmark --api-backend gcp-vertex ...
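A wrong path in GOOGLE_APPLICATION_CREDENTIALS is an easy mistake to make. The sketch below checks that the variable is set, that the file is readable, and that it looks like a service account key. The function name check_gcp_credentials is hypothetical, and the content check assumes the service account method only; other credential file types that Google's libraries accept would be rejected by it.

```shell
# Hypothetical preflight helper for the service account method.
check_gcp_credentials() {
  cred_file=${GOOGLE_APPLICATION_CREDENTIALS:-}
  if [ -z "$cred_file" ]; then
    echo "GOOGLE_APPLICATION_CREDENTIALS is not set" >&2
    return 1
  fi
  if [ ! -r "$cred_file" ]; then
    echo "cannot read credentials file: $cred_file" >&2
    return 1
  fi
  # Service account key files are JSON with a "type": "service_account" entry.
  if ! grep -q '"type"[[:space:]]*:[[:space:]]*"service_account"' "$cred_file"; then
    echo "not a service account key file: $cred_file" >&2
    return 1
  fi
}
```

Running check_gcp_credentials && genai-bench benchmark --api-backend gcp-vertex ... stops early with a readable message instead of an authentication error from the endpoint.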

SGLang or vLLM¶

vLLM and SGLang use OpenAI-compatible APIs with optional authentication.

Required parameters:

- --api-backend sglang or --api-backend vllm
- --api-base: Your server endpoint

Optional parameters:

- --api-key or --model-api-key: API key, needed only if authentication is enabled on the server

Example:

genai-bench benchmark \
  --api-backend vllm \
  --api-base http://localhost:8000 \
  --api-key optional-key \
  --api-model-name meta-llama/Llama-2-7b-hf \
  --model-tokenizer meta-llama/Llama-2-7b-hf \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10

Storage Provider Authentication¶

Storage authentication is configured separately from model authentication, allowing you to store results in any supported storage service.

Common Storage Parameters¶

All storage providers share these common parameters:

- --upload-results: Flag to enable result upload
- --storage-provider: Storage provider type (oci, aws, azure, gcp, github)
- --storage-bucket: Bucket/container name
- --storage-prefix: Optional prefix for uploaded objects

OCI Object Storage¶

Authentication types: Same as OCI model authentication (user_principal, instance_principal, etc.)

Example:

genai-bench benchmark \
  ... \
  --upload-results \
  --storage-provider oci \
  --storage-bucket my-benchmark-results \
  --storage-prefix experiments/2024 \
  --storage-auth-type user_principal \
  --namespace my-namespace

AWS S3¶

Authentication methods:

1. IAM Credentials
2. AWS Profile
3. Environment variables

IAM Credentials example:

genai-bench benchmark \
  ... \
  --upload-results \
  --storage-provider aws \
  --storage-bucket my-benchmark-results \
  --storage-prefix experiments/2024 \
  --storage-aws-access-key-id AKIAIOSFODNN7EXAMPLE \
  --storage-aws-secret-access-key wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY \
  --storage-aws-region us-east-1

AWS Profile example:

genai-bench benchmark \
  ... \
  --upload-results \
  --storage-provider aws \
  --storage-bucket my-benchmark-results \
  --storage-prefix experiments/2024 \
  --storage-aws-profile production \
  --storage-aws-region us-west-2

Azure Blob Storage¶

Authentication methods:

1. Storage Account Key
2. Connection String
3. SAS Token
4. Azure AD

Account Key example:

genai-bench benchmark \
  ... \
  --upload-results \
  --storage-provider azure \
  --storage-bucket my-container \
  --storage-prefix experiments/2024 \
  --storage-azure-account-name mystorageaccount \
  --storage-azure-account-key YOUR_ACCOUNT_KEY

Connection String example:

genai-bench benchmark \
  ... \
  --upload-results \
  --storage-provider azure \
  --storage-bucket my-container \
  --storage-azure-connection-string "DefaultEndpointsProtocol=https;AccountName=..."

GCP Cloud Storage¶

Authentication methods:

1. Service Account
2. Application Default Credentials

Example:

genai-bench benchmark \
  ... \
  --upload-results \
  --storage-provider gcp \
  --storage-bucket my-benchmark-results \
  --storage-prefix experiments/2024 \
  --storage-gcp-project-id my-project-123 \
  --storage-gcp-credentials-path /path/to/service-account.json

GitHub Releases¶

GitHub storage uploads results as release artifacts.

Required parameters:

- --github-token: Personal access token with repo permissions
- --github-owner: Repository owner (user or organization)
- --github-repo: Repository name

Example:

genai-bench benchmark \
  ... \
  --upload-results \
  --storage-provider github \
  --github-token ghp_xxxxxxxxxxxxxxxxxxxx \
  --github-owner myorg \
  --github-repo benchmark-results

Command Examples¶

Cross-Cloud Benchmarking¶

Benchmark OpenAI and store the results in AWS S3:

genai-bench benchmark \
  --api-backend openai \
  --api-base https://api.openai.com/v1 \
  --api-key sk-... \
  --api-model-name gpt-4 \
  --model-tokenizer gpt2 \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10 \
  --upload-results \
  --storage-provider aws \
  --storage-bucket my-benchmarks \
  --storage-aws-profile default \
  --storage-aws-region us-east-1

Benchmark AWS Bedrock and store the results in Azure Blob Storage:

genai-bench benchmark \
  --api-backend aws-bedrock \
  --api-base https://bedrock-runtime.us-east-1.amazonaws.com \
  --aws-profile bedrock-user \
  --aws-region us-east-1 \
  --api-model-name anthropic.claude-3-sonnet-20240229-v1:0 \
  --model-tokenizer Anthropic/claude-3-sonnet \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10 \
  --upload-results \
  --storage-provider azure \
  --storage-bucket benchmarks \
  --storage-azure-connection-string "DefaultEndpointsProtocol=..."

Multi-Modal Tasks¶

Image-to-text with GCP Vertex AI:

genai-bench benchmark \
  --api-backend gcp-vertex \
  --api-base https://us-central1-aiplatform.googleapis.com \
  --gcp-project-id my-project \
  --gcp-location us-central1 \
  --gcp-credentials-path /path/to/service-account.json \
  --api-model-name gemini-1.5-pro-vision \
  --model-tokenizer google/gemini \
  --task image-text-to-text \
  --dataset-path /path/to/image/dataset \
  --max-requests-per-run 50 \
  --max-time-per-run 10

Environment Variables¶

genai-bench supports environment variables for sensitive credentials:

Model Authentication¶

- MODEL_API_KEY: API key for OpenAI, Azure OpenAI, or GCP
- AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN: AWS credentials
- AWS_PROFILE, AWS_DEFAULT_REGION: AWS configuration
- AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_DEPLOYMENT, AZURE_OPENAI_API_VERSION: Azure configuration
- AZURE_AD_TOKEN: Azure AD authentication token
- GCP_PROJECT_ID, GCP_LOCATION: GCP configuration
- GOOGLE_APPLICATION_CREDENTIALS: Path to GCP service account JSON

Storage Authentication¶

- AZURE_STORAGE_ACCOUNT_NAME, AZURE_STORAGE_ACCOUNT_KEY: Azure storage credentials
- AZURE_STORAGE_CONNECTION_STRING, AZURE_STORAGE_SAS_TOKEN: Azure storage alternatives
- GITHUB_TOKEN, GITHUB_OWNER, GITHUB_REPO: GitHub configuration

General¶

HF_TOKEN: HuggingFace token for downloading tokenizers

Best Practices¶

Security¶

- Never commit credentials: Use environment variables or secure credential stores
- Use least privilege: Grant only necessary permissions for benchmarking
- Rotate credentials regularly: Update API keys and tokens periodically
- Use service accounts: Prefer service accounts over personal credentials for automation

Performance¶

- Choose nearby regions: Select cloud regions close to your location for lower latency
- Batch operations: Use appropriate batch sizes for embedding tasks
- Monitor costs: Be aware of API pricing and set appropriate limits

Organization¶

- Use consistent naming: Adopt a naming convention for storage prefixes
- Separate environments: Use different buckets/prefixes for dev/test/prod
- Tag resources: Use cloud provider tags for cost tracking and organization
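As an illustration of the naming and environment advice above, the sketch below builds a value for --storage-prefix from a project name, an environment, and a date. The layout project/environment/date and the function name make_storage_prefix are assumptions chosen for this example; genai-bench does not prescribe any prefix layout.

```shell
# Hypothetical naming convention: <project>/<environment>/<YYYY-MM-DD>.
# The date defaults to today's UTC date when the third argument is omitted.
make_storage_prefix() {
  project=$1
  environment=$2
  run_date=${3:-$(date -u +%Y-%m-%d)}
  case $environment in
    dev|test|prod) ;;
    *)
      echo "unknown environment: $environment" >&2
      return 1
      ;;
  esac
  printf '%s/%s/%s\n' "$project" "$environment" "$run_date"
}
```

The result can be passed straight to the CLI, for example --storage-prefix "$(make_storage_prefix experiments dev)". Rejecting unknown environment names keeps typos from silently creating a new top-level prefix.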

Important Notes¶

- Task-specific behavior:
  - For text-to-embeddings and text-to-rerank tasks, the iteration type automatically switches to batch_size
  - For other tasks, num_concurrency iteration is used
  - This is handled automatically by the CLI
- Image format requirements:
  - Image inputs are expected to be in JPEG format for multi-modal tasks
  - Base64 encoding is handled automatically
- Token counting:
  - Different providers may use different tokenization methods
  - Token estimates for embeddings tasks may vary by provider
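Because image inputs are expected to be JPEG, it can help to check files before starting a multi-modal run. The sketch below tests for the three-byte JPEG signature (FF D8 FF) rather than trusting file extensions. Both function names are hypothetical, and check_images assumes a flat directory of image files, which may not match how your dataset is laid out.

```shell
# Succeeds if the file starts with the JPEG signature bytes FF D8 FF.
is_jpeg() {
  magic=$(od -An -tx1 -N3 "$1" 2>/dev/null | tr -d ' \t\n')
  [ "$magic" = "ffd8ff" ]
}

# Hypothetical helper: report every regular file in a directory that is
# not a JPEG. Returns non-zero if at least one such file is found.
check_images() {
  image_dir=$1
  bad=0
  for image in "$image_dir"/*; do
    [ -f "$image" ] || continue
    if ! is_jpeg "$image"; then
      echo "not a JPEG: $image" >&2
      bad=1
    fi
  done
  return "$bad"
}
```

Files flagged by the check would need converting to JPEG with an image tool of your choice before the run.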

Troubleshooting¶

- Check credentials: Verify authentication credentials are correct
- Verify permissions: Ensure accounts have necessary permissions
- Check regions: Confirm services are available in selected regions
- Review quotas: Check API quotas and rate limits
- Enable logging: Use verbose logging for debugging authentication issues

Migration from Legacy CLI¶

If you're migrating from the legacy OCI-only CLI:

Old command:

genai-bench benchmark \
  --api-backend oci-cohere \
  --bucket my-bucket \
  --prefix my-prefix \
  ...

New command:

genai-bench benchmark \
  --api-backend oci-cohere \
  --storage-bucket my-bucket \
  --storage-prefix my-prefix \
  --storage-provider oci \
  ...

The main changes are:

- --bucket → --storage-bucket
- --prefix → --storage-prefix
- Add --storage-provider oci (though OCI is the default for backward compatibility)
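For scripts that still carry many legacy invocations, the two renames listed above can be applied mechanically. The sketch below is a hypothetical helper, not something shipped with genai-bench: it rewrites --bucket and --prefix when they appear as separate arguments and passes everything else through unchanged. It does not add --storage-provider oci, since the text above says OCI remains the default.

```shell
# Hypothetical migration helper: print the given arguments one per line,
# renaming the legacy storage flags to their new names.
translate_legacy_flags() {
  for arg in "$@"; do
    case $arg in
      --bucket) printf '%s\n' --storage-bucket ;;
      --prefix) printf '%s\n' --storage-prefix ;;
      *) printf '%s\n' "$arg" ;;
    esac
  done
}
```

Calling translate_legacy_flags --api-backend oci-cohere --bucket my-bucket --prefix my-prefix prints the new flag names with their values untouched, one argument per line, so the output can be reviewed before a script is updated.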
