Multi-Cloud Authentication and Storage Guide¶
genai-bench now supports comprehensive multi-cloud authentication for both model endpoints and storage services. This guide covers how to configure and use authentication for various cloud providers.
Table of Contents¶
- Overview
- Model Provider Authentication
- OpenAI
- OCI Cohere
- AWS Bedrock
- Azure OpenAI
- GCP Vertex AI
- SGLang / vLLM
- Storage Provider Authentication
- OCI Object Storage
- AWS S3
- Azure Blob Storage
- GCP Cloud Storage
- GitHub Releases
- Command Examples
- Environment Variables
- Best Practices
Overview¶
genai-bench separates authentication into two categories:

1. Model Authentication: For accessing LLM endpoints
2. Storage Authentication: For uploading benchmark results
This separation allows you to benchmark models from one provider while storing results in another provider's storage service.
Model Provider Authentication¶
OpenAI¶
OpenAI uses API key authentication.
Required parameters:

- `--api-backend openai`
- `--api-key` or `--model-api-key`: Your OpenAI API key
Example:
```bash
genai-bench benchmark \
  --api-backend openai \
  --api-base https://api.openai.com/v1 \
  --api-key sk-... \
  --api-model-name gpt-4 \
  --model-tokenizer gpt2 \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10
```
Environment variable alternative:
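As a minimal sketch, assuming the `MODEL_API_KEY` environment variable listed under Environment Variables below is honored in place of `--api-key`:

```bash
# Supply the key via the environment instead of the command line
export MODEL_API_KEY=sk-...
genai-bench benchmark --api-backend openai --api-base https://api.openai.com/v1 ...
```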
OCI Cohere¶
OCI supports multiple authentication methods.
Authentication types:

- `user_principal`: Default, uses OCI config file
- `instance_principal`: For compute instances
- `security_token`: For delegation tokens
- `instance_obo_user`: Instance principal with user delegation

Required parameters:

- `--api-backend oci-cohere` or `--api-backend cohere`
- `--auth`: Authentication type (default: `user_principal`)
User Principal example:
```bash
genai-bench benchmark \
  --api-backend oci-cohere \
  --api-base https://inference.generativeai.us-chicago-1.oci.oraclecloud.com \
  --auth user_principal \
  --config-file ~/.oci/config \
  --profile DEFAULT \
  --api-model-name cohere.command-r-plus \
  --model-tokenizer Cohere/command-r-plus \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10
```
Instance Principal example:
```bash
genai-bench benchmark \
  --api-backend oci-cohere \
  --api-base https://inference.generativeai.us-chicago-1.oci.oraclecloud.com \
  --auth instance_principal \
  --region us-chicago-1 \
  --api-model-name cohere.command-r-plus \
  --model-tokenizer Cohere/command-r-plus \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10
```
AWS Bedrock¶
AWS Bedrock supports IAM credentials and profiles.
Authentication methods:

1. IAM Credentials: Access key ID and secret access key
2. AWS Profile: Named profile from credentials file
3. Environment variables: AWS SDK default behavior

Required parameters:

- `--api-backend aws-bedrock`
- `--aws-region`: AWS region for Bedrock
IAM Credentials example:
```bash
genai-bench benchmark \
  --api-backend aws-bedrock \
  --api-base https://bedrock-runtime.us-east-1.amazonaws.com \
  --aws-access-key-id AKIAIOSFODNN7EXAMPLE \
  --aws-secret-access-key wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY \
  --aws-region us-east-1 \
  --api-model-name anthropic.claude-3-sonnet-20240229-v1:0 \
  --model-tokenizer Anthropic/claude-3-sonnet \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10
```
AWS Profile example:
```bash
genai-bench benchmark \
  --api-backend aws-bedrock \
  --api-base https://bedrock-runtime.us-west-2.amazonaws.com \
  --aws-profile production \
  --aws-region us-west-2 \
  --api-model-name amazon.titan-text-express-v1 \
  --model-tokenizer amazon/titan \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10
```
Environment variables:
```bash
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export AWS_DEFAULT_REGION=us-east-1
genai-bench benchmark --api-backend aws-bedrock ...
```
Azure OpenAI¶
Azure OpenAI supports API key and Azure AD authentication.
Authentication methods:

1. API Key: Traditional API key authentication
2. Azure AD: Azure Active Directory token

Required parameters:

- `--api-backend azure-openai`
- `--azure-endpoint`: Your Azure OpenAI endpoint
- `--azure-deployment`: Your deployment name
- `--azure-api-version`: API version (default: 2024-02-01)
API Key example:
```bash
genai-bench benchmark \
  --api-backend azure-openai \
  --api-base https://myresource.openai.azure.com \
  --azure-endpoint https://myresource.openai.azure.com \
  --azure-deployment my-gpt-4-deployment \
  --model-api-key YOUR_AZURE_API_KEY \
  --api-model-name gpt-4 \
  --model-tokenizer gpt2 \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10
```
Azure AD example:
```bash
genai-bench benchmark \
  --api-backend azure-openai \
  --api-base https://myresource.openai.azure.com \
  --azure-endpoint https://myresource.openai.azure.com \
  --azure-deployment my-gpt-4-deployment \
  --azure-ad-token YOUR_AAD_TOKEN \
  --api-model-name gpt-4 \
  --model-tokenizer gpt2 \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10
```
GCP Vertex AI¶
GCP Vertex AI supports service account and API key authentication.
Authentication methods:

1. Service Account: JSON key file
2. API Key: For certain Vertex AI services
3. Application Default Credentials: GCP SDK default

Required parameters:

- `--api-backend gcp-vertex`
- `--gcp-project-id`: Your GCP project ID
- `--gcp-location`: GCP region (default: us-central1)
Service Account example:
```bash
genai-bench benchmark \
  --api-backend gcp-vertex \
  --api-base https://us-central1-aiplatform.googleapis.com \
  --gcp-project-id my-project-123 \
  --gcp-location us-central1 \
  --gcp-credentials-path /path/to/service-account.json \
  --api-model-name gemini-1.5-pro \
  --model-tokenizer google/gemini \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10
```
Environment variable:
```bash
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
export GCP_PROJECT_ID=my-project-123
genai-bench benchmark --api-backend gcp-vertex ...
```
SGLang or vLLM¶
vLLM and SGLang use OpenAI-compatible APIs with optional authentication.
Required parameters:

- `--api-backend sglang` or `--api-backend vllm`
- `--api-base`: Your server endpoint
- `--api-key` or `--model-api-key`: Optional API key if authentication is enabled
Example:
```bash
genai-bench benchmark \
  --api-backend vllm \
  --api-base http://localhost:8000 \
  --api-key optional-key \
  --api-model-name meta-llama/Llama-2-7b-hf \
  --model-tokenizer meta-llama/Llama-2-7b-hf \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10
```
Storage Provider Authentication¶
Storage authentication is configured separately from model authentication, allowing you to store results in any supported storage service.
Common Storage Parameters¶
All storage providers share these common parameters:
- `--upload-results`: Flag to enable result upload
- `--storage-provider`: Storage provider type (`oci`, `aws`, `azure`, `gcp`, `github`)
- `--storage-bucket`: Bucket/container name
- `--storage-prefix`: Optional prefix for uploaded objects
OCI Object Storage¶
Authentication types: Same as OCI model authentication (`user_principal`, `instance_principal`, etc.)
Example:
```bash
genai-bench benchmark \
  ... \
  --upload-results \
  --storage-provider oci \
  --storage-bucket my-benchmark-results \
  --storage-prefix experiments/2024 \
  --storage-auth-type user_principal \
  --namespace my-namespace
```
AWS S3¶
Authentication methods:

1. IAM Credentials
2. AWS Profile
3. Environment variables
IAM Credentials example:
```bash
genai-bench benchmark \
  ... \
  --upload-results \
  --storage-provider aws \
  --storage-bucket my-benchmark-results \
  --storage-prefix experiments/2024 \
  --storage-aws-access-key-id AKIAIOSFODNN7EXAMPLE \
  --storage-aws-secret-access-key wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY \
  --storage-aws-region us-east-1
```
AWS Profile example:
```bash
genai-bench benchmark \
  ... \
  --upload-results \
  --storage-provider aws \
  --storage-bucket my-benchmark-results \
  --storage-prefix experiments/2024 \
  --storage-aws-profile production \
  --storage-aws-region us-west-2
```
Azure Blob Storage¶
Authentication methods:

1. Storage Account Key
2. Connection String
3. SAS Token
4. Azure AD
Account Key example:
```bash
genai-bench benchmark \
  ... \
  --upload-results \
  --storage-provider azure \
  --storage-bucket my-container \
  --storage-prefix experiments/2024 \
  --storage-azure-account-name mystorageaccount \
  --storage-azure-account-key YOUR_ACCOUNT_KEY
```
Connection String example:
```bash
genai-bench benchmark \
  ... \
  --upload-results \
  --storage-provider azure \
  --storage-bucket my-container \
  --storage-azure-connection-string "DefaultEndpointsProtocol=https;AccountName=..."
```
GCP Cloud Storage¶
Authentication methods:

1. Service Account
2. Application Default Credentials
Example:
```bash
genai-bench benchmark \
  ... \
  --upload-results \
  --storage-provider gcp \
  --storage-bucket my-benchmark-results \
  --storage-prefix experiments/2024 \
  --storage-gcp-project-id my-project-123 \
  --storage-gcp-credentials-path /path/to/service-account.json
```
GitHub Releases¶
GitHub storage uploads results as release artifacts.
Required parameters:
- `--github-token`: Personal access token with repo permissions
- `--github-owner`: Repository owner (user or organization)
- `--github-repo`: Repository name
Example:
```bash
genai-bench benchmark \
  ... \
  --upload-results \
  --storage-provider github \
  --github-token ghp_xxxxxxxxxxxxxxxxxxxx \
  --github-owner myorg \
  --github-repo benchmark-results
```
Command Examples¶
Cross-Cloud Benchmarking¶
Benchmark OpenAI and store in AWS S3:
```bash
genai-bench benchmark \
  --api-backend openai \
  --api-base https://api.openai.com/v1 \
  --api-key sk-... \
  --api-model-name gpt-4 \
  --model-tokenizer gpt2 \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10 \
  --upload-results \
  --storage-provider aws \
  --storage-bucket my-benchmarks \
  --storage-aws-profile default \
  --storage-aws-region us-east-1
```
Benchmark AWS Bedrock and store in Azure Blob:
```bash
genai-bench benchmark \
  --api-backend aws-bedrock \
  --api-base https://bedrock-runtime.us-east-1.amazonaws.com \
  --aws-profile bedrock-user \
  --aws-region us-east-1 \
  --api-model-name anthropic.claude-3-sonnet-20240229-v1:0 \
  --model-tokenizer Anthropic/claude-3-sonnet \
  --task text-to-text \
  --max-requests-per-run 100 \
  --max-time-per-run 10 \
  --upload-results \
  --storage-provider azure \
  --storage-bucket benchmarks \
  --storage-azure-connection-string "DefaultEndpointsProtocol=..."
```
Multi-Modal Tasks¶
Image-to-text with GCP Vertex AI:
```bash
genai-bench benchmark \
  --api-backend gcp-vertex \
  --api-base https://us-central1-aiplatform.googleapis.com \
  --gcp-project-id my-project \
  --gcp-location us-central1 \
  --gcp-credentials-path /path/to/service-account.json \
  --api-model-name gemini-1.5-pro-vision \
  --model-tokenizer google/gemini \
  --task image-text-to-text \
  --dataset-path /path/to/image/dataset \
  --max-requests-per-run 50 \
  --max-time-per-run 10
```
Environment Variables¶
genai-bench supports environment variables for sensitive credentials:
Model Authentication¶
- `MODEL_API_KEY`: API key for OpenAI, Azure OpenAI, or GCP
- `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`: AWS credentials
- `AWS_PROFILE`, `AWS_DEFAULT_REGION`: AWS configuration
- `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_DEPLOYMENT`, `AZURE_OPENAI_API_VERSION`: Azure configuration
- `AZURE_AD_TOKEN`: Azure AD authentication token
- `GCP_PROJECT_ID`, `GCP_LOCATION`: GCP configuration
- `GOOGLE_APPLICATION_CREDENTIALS`: Path to GCP service account JSON
Storage Authentication¶
- `AZURE_STORAGE_ACCOUNT_NAME`, `AZURE_STORAGE_ACCOUNT_KEY`: Azure storage credentials
- `AZURE_STORAGE_CONNECTION_STRING`, `AZURE_STORAGE_SAS_TOKEN`: Azure storage alternatives
- `GITHUB_TOKEN`, `GITHUB_OWNER`, `GITHUB_REPO`: GitHub configuration
General¶
- `HF_TOKEN`: HuggingFace token for downloading tokenizers
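Putting these together, a CI job might export credentials once before invoking the benchmark. A sketch, with placeholder values; the elided flags follow the earlier examples:

```bash
# Placeholder values; substitute real credentials
export HF_TOKEN=hf_xxxxxxxxxxxxxxxx        # tokenizer downloads
export GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxx   # storage upload
export GITHUB_OWNER=myorg
export GITHUB_REPO=benchmark-results
genai-bench benchmark --upload-results --storage-provider github ...
```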
Best Practices¶
Security¶
- Never commit credentials: Use environment variables or secure credential stores
- Use least privilege: Grant only necessary permissions for benchmarking
- Rotate credentials regularly: Update API keys and tokens periodically
- Use service accounts: Prefer service accounts over personal credentials for automation
Performance¶
- Choose nearby regions: Select cloud regions close to your location for lower latency
- Batch operations: Use appropriate batch sizes for embedding tasks
- Monitor costs: Be aware of API pricing and set appropriate limits
Organization¶
- Use consistent naming: Adopt a naming convention for storage prefixes
- Separate environments: Use different buckets/prefixes for dev/test/prod
- Tag resources: Use cloud provider tags for cost tracking and organization
Important Notes¶
- Task-specific behavior:
    - For `text-to-embeddings` and `text-to-rerank` tasks, the iteration type automatically switches to `batch_size`
    - For other tasks, `num_concurrency` iteration is used
    - This is handled automatically by the CLI
- Image format requirements:
    - Image inputs are expected to be in JPEG format for multi-modal tasks
    - Base64 encoding is handled automatically
- Token counting:
    - Different providers may use different tokenization methods
    - Token estimates for embeddings tasks may vary by provider
Troubleshooting¶
- Check credentials: Verify authentication credentials are correct
- Verify permissions: Ensure accounts have necessary permissions
- Check regions: Confirm services are available in selected regions
- Review quotas: Check API quotas and rate limits
- Enable logging: Use verbose logging for debugging authentication issues
Migration from Legacy CLI¶
If you're migrating from the legacy OCI-only CLI:
Old command:
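An illustrative sketch, inferred from the flag mapping at the end of this section; the exact legacy flags may have differed:

```bash
# Hypothetical legacy form: --bucket/--prefix instead of the --storage-* flags
genai-bench benchmark \
  --api-backend oci-cohere \
  --bucket my-bucket \
  --prefix my-prefix \
  ...
```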
New command:
```bash
genai-bench benchmark \
  --api-backend oci-cohere \
  --storage-bucket my-bucket \
  --storage-prefix my-prefix \
  --storage-provider oci \
  ...
```
The main changes are:
- `--bucket` → `--storage-bucket`
- `--prefix` → `--storage-prefix`
- Add `--storage-provider oci` (though OCI is the default for backward compatibility)