Configuration Reference
This guide covers all configuration options for MĀRGA, including server settings, provider configuration, routing rules, and monitoring setup.
Configuration File
MĀRGA uses a YAML configuration file (default: config.yaml). You can specify a custom location with the CONFIG_FILE environment variable.
# Full configuration example
server:
  port: 8080
  host: 0.0.0.0
  timeout: 30s
  max_request_size: 10MB
logging:
  level: info
  format: json
metrics:
  enabled: true
  path: /v1/metrics
  datadog:
    enabled: true
    service_name: marga
    environment: production
providers:
  - name: openai
    type: openai
    enabled: true
    # ... provider config
routing:
  strategy: failover
  # ... routing config
rate_limit:
  enabled: true
  # ... rate limit config
security:
  api_key_required: true
  # ... security config
health:
  enabled: true
  # ... health config
Server Configuration
Controls the HTTP server behavior.
server:
  port: 8080              # Port to listen on
  host: 0.0.0.0           # Host to bind to (0.0.0.0 for all interfaces)
  timeout: 30s            # Request timeout
  max_request_size: 10MB  # Maximum request body size
Environment Variable Overrides
| Config | Environment Variable | Default |
|---|---|---|
| port | PORT | 8080 |
| host | HOST | 0.0.0.0 |
| timeout | SERVER_TIMEOUT | 30s |
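The override table above can be read as a precedence rule: an environment variable, when set, takes priority over the value in config.yaml, which in turn takes priority over the default. The following Python sketch illustrates that reading only. The `resolve_server_settings` helper and the exact precedence order are assumptions made for illustration, not MĀRGA's actual loader.

```python
import os

# Mapping copied from the override table: config key -> (env var, default).
SERVER_OVERRIDES = {
    "port": ("PORT", "8080"),
    "host": ("HOST", "0.0.0.0"),
    "timeout": ("SERVER_TIMEOUT", "30s"),
}


def resolve_server_settings(file_values, environ=None):
    """Hypothetical helper: environment variable > config file > default."""
    environ = os.environ if environ is None else environ
    resolved = {}
    for key, (env_name, default) in SERVER_OVERRIDES.items():
        if env_name in environ:
            resolved[key] = environ[env_name]
        elif key in file_values:
            resolved[key] = str(file_values[key])
        else:
            resolved[key] = default
    return resolved


# The file sets the port, the environment sets the host, timeout falls back.
settings = resolve_server_settings({"port": 9090}, environ={"HOST": "127.0.0.1"})
print(settings)
```

Passing `environ` explicitly keeps the sketch deterministic; a real process would read `os.environ`.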
Logging Configuration
Controls logging behavior and output format.
logging:
  level: info   # Log level: debug, info, warn, error
  format: json  # Output format: json, text
Log Levels
| Level | Description | When to Use |
|---|---|---|
| debug | Detailed debugging info | Development, troubleshooting |
| info | General operational info | Production default |
| warn | Warning conditions | Production monitoring |
| error | Error conditions only | Minimal production logging |
Environment Variable Overrides
| Config | Environment Variable | Default |
|---|---|---|
| level | LOG_LEVEL | info |
| format | LOG_FORMAT | json |
Metrics Configuration
Controls Prometheus metrics collection and Datadog integration.
metrics:
  enabled: true              # Enable metrics collection
  path: /v1/metrics          # Metrics endpoint path
  datadog:
    enabled: true            # Enable Datadog integration
    service_name: marga      # Service name in Datadog
    environment: production  # Environment tag
Available Metrics
| Metric | Type | Description |
|---|---|---|
| marga_requests_total | Counter | Total requests processed |
| marga_request_duration_seconds | Histogram | Request latency distribution |
| marga_requests_in_flight | Gauge | Current concurrent requests |
| marga_provider_requests_total | Counter | Requests per provider |
| marga_provider_errors_total | Counter | Errors per provider |
| marga_provider_health | Gauge | Provider health status |
| marga_model_requests_total | Counter | Requests per model |
Provider Configuration
Configure LLM providers and their settings.
OpenAI Provider
providers:
  - name: openai
    type: openai
    enabled: true
    endpoint: https://api.openai.com/v1
    api_key_env: OPENAI_API_KEY
    models:
      - gpt-4o
      - gpt-4o-mini
      - gpt-3.5-turbo
    priority: 1
    rate_limit:
      requests_per_minute: 3000
      tokens_per_minute: 150000
    timeout: 30s
Anthropic Provider
providers:
  - name: anthropic
    type: anthropic
    enabled: true
    endpoint: https://api.anthropic.com/v1
    api_key_env: ANTHROPIC_API_KEY
    models:
      - claude-3-5-sonnet-20241022
      - claude-3-5-haiku-20241022
      - claude-3-opus-20240229
    priority: 2
    rate_limit:
      requests_per_minute: 4000
      tokens_per_minute: 400000
    timeout: 60s
Ollama Provider (Local Models)
providers:
  - name: ollama
    type: ollama
    enabled: false
    endpoint: http://localhost:11434
    models:
      - llama3.1:8b
      - llama3.1:70b
      - mistral:7b
    priority: 3
    rate_limit:
      requests_per_minute: 100
    timeout: 120s
Together AI Provider
providers:
  - name: together
    type: openai  # OpenAI-compatible
    enabled: false
    endpoint: https://api.together.xyz/v1
    api_key_env: TOGETHER_API_KEY
    models:
      - meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo
      - meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
    priority: 4
    rate_limit:
      requests_per_minute: 600
      tokens_per_minute: 1000000
    timeout: 60s
Provider Fields Reference
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | ✅ | Unique provider identifier |
| type | string | ✅ | Provider type: openai, anthropic, ollama |
| enabled | boolean | ✅ | Whether provider is active |
| endpoint | string | ✅ | API endpoint URL |
| api_key_env | string | ❌ | Environment variable for API key |
| models | array | ✅ | List of supported models |
| priority | integer | ✅ | Lower = higher priority |
| rate_limit | object | ❌ | Provider-specific rate limits |
| timeout | duration | ❌ | Request timeout |
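The Required column above can be turned into a quick pre-flight check on a parsed provider list. This is a minimal sketch, not MĀRGA's validator: `validate_providers` is a hypothetical helper, and its messages only imitate the wording listed under Common Validation Errors.

```python
# Fields marked as required in the Provider Fields Reference table.
REQUIRED_FIELDS = ("name", "type", "enabled", "endpoint", "models", "priority")
# Provider types named in the table.
KNOWN_TYPES = {"openai", "anthropic", "ollama"}


def validate_providers(providers):
    """Hypothetical check: return a list of human-readable problems."""
    problems = []
    seen = set()
    for index, provider in enumerate(providers):
        label = provider.get("name", f"#{index}")
        for field in REQUIRED_FIELDS:
            if field not in provider:
                problems.append(f"Provider '{label}' missing required field '{field}'")
        ptype = provider.get("type")
        if ptype is not None and ptype not in KNOWN_TYPES:
            problems.append(f"Provider '{label}' has unknown type '{ptype}'")
        if label in seen:
            problems.append(f"Duplicate provider name '{label}'")
        seen.add(label)
    return problems


example = [
    {"name": "openai", "type": "openai", "enabled": True,
     "endpoint": "https://api.openai.com/v1", "models": ["gpt-4o"], "priority": 1},
    # Second entry reuses the name and omits "type" on purpose.
    {"name": "openai", "enabled": True,
     "endpoint": "https://api.openai.com/v1", "models": ["gpt-4o"], "priority": 2},
]
problems = validate_providers(example)
for problem in problems:
    print(problem)
```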
Routing Configuration
Controls how requests are routed to providers.
routing:
  strategy: failover  # Routing strategy
  # Model mappings for transparent routing
  model_mappings:
    gpt-4: openai/gpt-4o
    claude-3-sonnet: anthropic/claude-3-5-sonnet-20241022
    llama-8b: ollama/llama3.1:8b
  # Failover configuration
  failover:
    max_retries: 3
    retry_delay: 1s
    health_check_interval: 30s
  # Load balancing (when strategy is load_balance)
  load_balance:
    algorithm: round_robin  # round_robin, weighted, least_connections
    health_aware: true
  # Cost optimization (when strategy is cost_optimize)
  cost_optimize:
    prefer_cheaper: true
    cost_threshold: 0.01  # USD per 1K tokens
Routing Strategies
| Strategy | Description | Use Case |
|---|---|---|
| failover | Try providers in priority order | High availability |
| load_balance | Distribute requests across providers | High throughput |
| cost_optimize | Route to cheapest available provider | Cost efficiency |
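To make the failover strategy concrete, the sketch below tries providers in ascending priority order and moves on when one fails. It is an illustration under stated assumptions: `route_with_failover` and the `send` callable are hypothetical, `max_retries` is treated as a cap on total attempts, and `retry_delay` is left out. MĀRGA's real retry semantics may differ.

```python
def failover_order(providers, model, healthy):
    """Enabled, healthy providers serving `model`, lowest priority number first."""
    candidates = [
        p for p in providers
        if p["enabled"] and model in p["models"] and healthy.get(p["name"], True)
    ]
    return sorted(candidates, key=lambda p: p["priority"])


def route_with_failover(providers, model, healthy, send, max_retries=3):
    """Try candidates in order; `send` is a hypothetical callable that raises on failure."""
    attempts = 0
    last_error = None
    for provider in failover_order(providers, model, healthy):
        if attempts >= max_retries:  # assumption: cap on total attempts
            break
        attempts += 1
        try:
            return provider["name"], send(provider, model)
        except Exception as exc:
            last_error = exc
    raise RuntimeError(f"all providers failed for {model}") from last_error


providers = [
    {"name": "openai-primary", "enabled": True, "models": ["gpt-4o"], "priority": 1},
    {"name": "openai-secondary", "enabled": True, "models": ["gpt-4o"], "priority": 2},
]


def flaky_send(provider, model):
    # Simulate an outage on the primary provider only.
    if provider["name"] == "openai-primary":
        raise ConnectionError("simulated outage")
    return f"{model} answered by {provider['name']}"


chosen, reply = route_with_failover(providers, "gpt-4o", {}, flaky_send)
print(chosen, reply)
```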
Model Mappings
Model mappings provide a unified interface by mapping generic model names to specific provider models:
model_mappings:
  # Generic name: provider/specific-model
  gpt-4: openai/gpt-4o
  gpt-3.5: openai/gpt-3.5-turbo
  claude-3: anthropic/claude-3-5-sonnet-20241022
  llama-8b: ollama/llama3.1:8b
  llama-70b: together/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
This lets clients request gpt-4 and be routed automatically to the provider model mapped to that name (here, openai/gpt-4o).
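One detail of the provider/specific-model format is worth spelling out: Together model IDs contain a slash of their own, so a mapping value has to be split on the first slash only. The sketch below shows one way to do that. `resolve_model` is a hypothetical helper, and passing unmapped names through unchanged is an assumption.

```python
# A subset of the mappings shown above.
MODEL_MAPPINGS = {
    "gpt-4": "openai/gpt-4o",
    "llama-8b": "ollama/llama3.1:8b",
    "llama-70b": "together/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
}


def resolve_model(requested, mappings=MODEL_MAPPINGS):
    """Return (provider, model) for a mapped name, or (None, requested) if unmapped."""
    target = mappings.get(requested)
    if target is None:
        return None, requested  # assumption: unmapped names pass through unchanged
    # Split on the first "/" only, so "meta-llama/..." stays intact.
    provider, model = target.split("/", 1)
    return provider, model


print(resolve_model("gpt-4"))
print(resolve_model("llama-70b"))
```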
Rate Limiting Configuration
Control request rates globally and per client.
rate_limit:
  enabled: true  # Enable rate limiting
  # Global limits across all clients
  global:
    requests_per_minute: 10000
    burst: 100
  # Per-client limits (by API key or IP)
  per_client:
    requests_per_minute: 1000
    burst: 50
Rate Limit Fields
| Field | Description | Example |
|---|---|---|
| requests_per_minute | Maximum requests per minute | 1000 |
| tokens_per_minute | Maximum tokens per minute | 100000 |
| burst | Burst capacity for short spikes | 50 |
Rate Limiting Algorithms
MĀRGA uses a token bucket algorithm for smooth rate limiting:
- Bucket Size: Set by the burst parameter
- Refill Rate: Set by requests_per_minute
- Overflow Handling: Requests are rejected with HTTP 429
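The three properties above map directly onto a small token bucket. The following is a generic sketch of the algorithm, not MĀRGA's implementation. The `TokenBucket` class, the `status_for` helper, and the choice to start with a full bucket are assumptions.

```python
import time


class TokenBucket:
    """Generic token bucket: capacity = burst, refill = requests_per_minute / 60 per second."""

    def __init__(self, requests_per_minute, burst, clock=time.monotonic):
        self.capacity = float(burst)
        self.refill_per_second = requests_per_minute / 60.0
        self.tokens = float(burst)  # assumption: the bucket starts full
        self.clock = clock
        self.last = clock()

    def allow(self):
        """Refill for the elapsed time, then spend one token if available."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def status_for(bucket):
    """HTTP status a request would receive: 200 if admitted, 429 if rejected."""
    return 200 if bucket.allow() else 429


# A fake clock makes the demonstration deterministic.
fake_now = [0.0]
bucket = TokenBucket(requests_per_minute=60, burst=2, clock=lambda: fake_now[0])
first_three = [status_for(bucket) for _ in range(3)]  # third request exceeds the burst
fake_now[0] += 1.0  # one second later, one token has been refilled
after_wait = status_for(bucket)
print(first_three, after_wait)
```

With burst set to 2, two back-to-back requests are admitted and the third is rejected until the refill catches up.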
Security Configuration
Configure authentication, CORS, and access controls.
security:
  api_key_required: true     # Require API key for access
  api_key_header: X-API-Key  # Header name for API key
  # CORS configuration
  allowed_origins:
    - "https://myapp.com"
    - "https://admin.myapp.com"
  cors:
    enabled: true
    credentials: false
API Key Authentication
When api_key_required is true, requests must include an API key in either:
- The Authorization: Bearer YOUR_KEY header, or
- The custom header specified by api_key_header
# Using Authorization header
curl -H "Authorization: Bearer your-api-key" \
  https://marga.example.com/v1/chat/completions
# Using custom header
curl -H "X-API-Key: your-api-key" \
  https://marga.example.com/v1/chat/completions
CORS Configuration
| Field | Description | Example |
|---|---|---|
| allowed_origins | Allowed origin domains | ["https://myapp.com", "*"] |
| cors.enabled | Enable CORS middleware | true |
| cors.credentials | Allow credentials in CORS | false |
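As a rough illustration of how these three fields could combine, the sketch below computes CORS response headers for a request origin. The `cors_headers` helper and the exact-match comparison of origins are assumptions for illustration. The header names themselves are the standard CORS ones.

```python
def cors_headers(origin, allowed_origins, credentials=False):
    """Hypothetical helper: CORS response headers for `origin`, or {} if not allowed."""
    if origin not in allowed_origins and "*" not in allowed_origins:
        return {}
    # Echo the concrete origin rather than "*" so the credentials header stays valid.
    headers = {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    if credentials:
        headers["Access-Control-Allow-Credentials"] = "true"
    return headers


allowed = ["https://myapp.com", "https://admin.myapp.com"]
print(cors_headers("https://myapp.com", allowed))
print(cors_headers("https://other.example", allowed))
```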
Health Check Configuration
Configure provider health monitoring.
health:
  enabled: true          # Enable health checks
  path: /health          # Health endpoint path
  check_providers: true  # Check individual providers
  timeout: 10s           # Health check timeout
Health Check Behavior
When enabled, MĀRGA:
- Exposes the /health endpoint for load balancer checks
- Periodically checks provider health (if check_providers: true)
- Removes unhealthy providers from routing
- Automatically re-adds providers when they recover
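The remove-and-re-add behaviour in the list above can be pictured as a small registry that routing consults before choosing a provider. This is only a sketch: the `ProviderHealth` class is hypothetical, and treating a single failed check as unhealthy is an assumption, since the source does not state a failure threshold.

```python
class ProviderHealth:
    """Hypothetical registry of provider health, updated by periodic checks."""

    def __init__(self, names):
        # Assumption: providers start out healthy until a check says otherwise.
        self.healthy = {name: True for name in names}

    def record_check(self, name, ok):
        """Store the result of the latest health check for one provider."""
        self.healthy[name] = ok

    def routable(self):
        """Providers currently eligible for routing, in registration order."""
        return [name for name, ok in self.healthy.items() if ok]


registry = ProviderHealth(["openai", "anthropic"])
registry.record_check("openai", False)  # failed check: removed from routing
during_outage = registry.routable()
registry.record_check("openai", True)   # recovered: re-added automatically
after_recovery = registry.routable()
print(during_outage, after_recovery)
```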
Environment Variables
All configuration can be overridden with environment variables:
Server Variables
PORT=8080
HOST=0.0.0.0
SERVER_TIMEOUT=30s
MAX_REQUEST_SIZE=10MB
Logging Variables
LOG_LEVEL=info
LOG_FORMAT=json
Provider API Keys
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
TOGETHER_API_KEY=your-together-key
OLLAMA_ENDPOINT=http://localhost:11434
Security Variables
MARGA_API_KEY=your-secure-api-key
API_KEY_REQUIRED=true
ALLOWED_ORIGINS=https://myapp.com,https://admin.myapp.com
Monitoring Variables
METRICS_ENABLED=true
DD_API_KEY=your-datadog-key
DD_ENV=production
DD_SERVICE=marga
DD_VERSION=0.1.0
Configuration Examples
Development Configuration
# config-dev.yaml - Development setup
server:
  port: 8080
  timeout: 30s
logging:
  level: debug
  format: text
metrics:
  enabled: true
  datadog:
    enabled: false
providers:
  - name: openai
    type: openai
    enabled: true
    endpoint: https://api.openai.com/v1
    api_key_env: OPENAI_API_KEY
    models: [gpt-3.5-turbo]
    priority: 1
routing:
  strategy: failover
security:
  api_key_required: false
  cors:
    enabled: true
Production Configuration
# config-prod.yaml - Production setup
server:
  port: 8080
  host: 0.0.0.0
  timeout: 30s
  max_request_size: 10MB
logging:
  level: info
  format: json
metrics:
  enabled: true
  datadog:
    enabled: true
    service_name: marga
    environment: production
providers:
  - name: openai
    type: openai
    enabled: true
    endpoint: https://api.openai.com/v1
    api_key_env: OPENAI_API_KEY
    models: [gpt-4o, gpt-4o-mini]
    priority: 1
    rate_limit:
      requests_per_minute: 3000
  - name: anthropic
    type: anthropic
    enabled: true
    endpoint: https://api.anthropic.com/v1
    api_key_env: ANTHROPIC_API_KEY
    models: [claude-3-5-sonnet-20241022]
    priority: 2
    rate_limit:
      requests_per_minute: 4000
routing:
  strategy: failover
  model_mappings:
    gpt-4: openai/gpt-4o
    claude-3-sonnet: anthropic/claude-3-5-sonnet-20241022
  failover:
    max_retries: 3
    retry_delay: 2s
rate_limit:
  enabled: true
  global:
    requests_per_minute: 10000
  per_client:
    requests_per_minute: 1000
security:
  api_key_required: true
  allowed_origins:
    - https://myapp.com
  cors:
    enabled: true
health:
  enabled: true
  check_providers: true
  timeout: 10s
High-Availability Configuration
# config-ha.yaml - High availability setup
routing:
  strategy: failover
  model_mappings:
    gpt-4: openai/gpt-4o
    claude-3-sonnet: anthropic/claude-3-5-sonnet-20241022
    llama-70b: together/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
  failover:
    max_retries: 5
    retry_delay: 1s
    health_check_interval: 15s
providers:
  - name: openai-primary
    type: openai
    enabled: true
    endpoint: https://api.openai.com/v1
    api_key_env: OPENAI_API_KEY_PRIMARY
    models: [gpt-4o, gpt-4o-mini]
    priority: 1
  - name: openai-secondary
    type: openai
    enabled: true
    endpoint: https://api.openai.com/v1
    api_key_env: OPENAI_API_KEY_SECONDARY
    models: [gpt-4o, gpt-4o-mini]
    priority: 2
  - name: anthropic
    type: anthropic
    enabled: true
    endpoint: https://api.anthropic.com/v1
    api_key_env: ANTHROPIC_API_KEY
    models: [claude-3-5-sonnet-20241022]
    priority: 3
  - name: together
    type: openai
    enabled: true
    endpoint: https://api.together.xyz/v1
    api_key_env: TOGETHER_API_KEY
    models: [meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo]
    priority: 4
health:
  enabled: true
  check_providers: true
  timeout: 5s
Cost-Optimized Configuration
# config-cost.yaml - Cost optimization setup
routing:
  strategy: cost_optimize
  cost_optimize:
    prefer_cheaper: true
    cost_threshold: 0.01
  model_mappings:
    gpt-4: openai/gpt-4o-mini                      # Use mini for cost savings
    claude-3: anthropic/claude-3-5-haiku-20241022  # Use Haiku
providers:
  - name: together
    type: openai
    enabled: true
    endpoint: https://api.together.xyz/v1
    api_key_env: TOGETHER_API_KEY
    models: [meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo]
    priority: 1  # Cheapest first
  - name: openai
    type: openai
    enabled: true
    endpoint: https://api.openai.com/v1
    api_key_env: OPENAI_API_KEY
    models: [gpt-4o-mini, gpt-3.5-turbo]
    priority: 2
  - name: anthropic
    type: anthropic
    enabled: true
    endpoint: https://api.anthropic.com/v1
    api_key_env: ANTHROPIC_API_KEY
    models: [claude-3-5-haiku-20241022]
    priority: 3
Configuration Validation
MĀRGA validates configuration on startup and reports errors:
Common Validation Errors
# Missing required fields
Error: Provider 'openai' missing required field 'type'
# Invalid values
Error: Invalid log level 'invalid', must be: debug, info, warn, error
# Duplicate names
Error: Duplicate provider name 'openai'
# Invalid endpoints
Error: Provider 'openai' has invalid endpoint URL
Configuration Testing
Test your configuration before deploying:
# Dry run to validate config
./marga --config config.yaml --validate-only
# Check specific provider
./marga --config config.yaml --test-provider openai