Enterprise Use Cases
MĀRGA enables sophisticated LLM routing strategies for enterprise environments. Here are four key use cases with practical implementation examples.
1. Multi-Provider Failover Strategy
Problem Statement
Enterprise applications need high availability for LLM services. Single-provider dependency creates risk of service interruptions, affecting critical business workflows.
Solution with MĀRGA
Implement cascading failover across multiple providers with automatic retry and fallback logic.
Configuration Example
providers:
  openai:
    api_key: "${OPENAI_API_KEY}"
    base_url: "https://api.openai.com/v1"
    timeout: 30s
    retry_attempts: 2
  anthropic:
    api_key: "${ANTHROPIC_API_KEY}"
    base_url: "https://api.anthropic.com"
    timeout: 30s
    retry_attempts: 2
  ollama:
    base_url: "http://localhost:11434"
    timeout: 60s
    retry_attempts: 1
routing:
  strategies:
    - name: "high_availability"
      rules:
        - condition: "always"
          actions:
            - provider: "openai"
              model: "gpt-4"
              fallback: "anthropic"
            - provider: "anthropic"
              model: "claude-3-sonnet-20240229"
              fallback: "ollama"
            - provider: "ollama"
              model: "llama2:13b"
              fallback: null
  default_strategy: "high_availability"
Expected Outcome
- 99.9% uptime through multi-provider resilience
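- Automatic failover within 2-3 seconds for fast failures such as connection errors; a request that times out first waits up to the configured timeout (30s per attempt above)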
- Zero manual intervention during provider outages
- Consistent API interface regardless of active provider
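The cascading behavior described by the `fallback` chain can be sketched in a few lines of Python. This is an illustration of the idea only, not MĀRGA's implementation: the `Route` type, the `complete` function, and the `call(provider, model, prompt)` client callback are hypothetical names, and the sketch assumes `retry_attempts` counts retries after the first attempt.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


class ProviderError(Exception):
    """Raised by a provider call that failed (timeout, 5xx, and so on)."""


@dataclass
class Route:
    provider: str
    model: str
    fallback: Optional[str]  # name of the next provider, or None to stop
    retry_attempts: int


# Mirrors the "high_availability" strategy from the configuration example.
ROUTES: Dict[str, Route] = {
    "openai": Route("openai", "gpt-4", "anthropic", 2),
    "anthropic": Route("anthropic", "claude-3-sonnet-20240229", "ollama", 2),
    "ollama": Route("ollama", "llama2:13b", None, 1),
}


def complete(
    prompt: str,
    call: Callable[[str, str, str], str],
    start: str = "openai",
) -> Tuple[str, str]:
    """Walk the fallback chain and return (provider, response).

    `call` is a hypothetical client function that returns the response
    text or raises ProviderError.
    """
    name: Optional[str] = start
    errors: List[str] = []
    while name is not None:
        route = ROUTES[name]
        # Assumption: retry_attempts counts retries after the first try.
        for attempt in range(1 + route.retry_attempts):
            try:
                return route.provider, call(route.provider, route.model, prompt)
            except ProviderError as exc:
                errors.append(f"{name}#{attempt}: {exc}")
        name = route.fallback  # retries exhausted, move down the chain
    raise ProviderError("all providers failed: " + "; ".join(errors))
```

Because the caller only sees `(provider, response)`, the interface stays the same whichever provider ends up serving the request.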
2. Cost Optimization Through Smart Routing
Problem Statement
Enterprise LLM costs can spiral quickly. Simple queries don’t need expensive frontier models, but determining complexity programmatically is challenging.
Solution with MĀRGA
Route requests based on complexity heuristics and user-defined cost tiers.
Configuration Example
cost_optimization:
  enable: true
  budgets:
    daily_limit: 100.00 # USD
    per_request_limit: 2.00 # USD
  tier_mapping:
    cheap:
      providers: ["openai"]
      models: ["gpt-3.5-turbo"]
      max_cost_per_1k_tokens: 0.002
    balanced:
      providers: ["anthropic", "openai"]
      models: ["claude-3-haiku-20240307", "gpt-4o-mini"]
      max_cost_per_1k_tokens: 0.010
    premium:
      providers: ["openai", "anthropic"]
      models: ["gpt-4", "claude-3-sonnet-20240229"]
      max_cost_per_1k_tokens: 0.030
routing:
  strategies:
    - name: "cost_aware"
      rules:
        - condition: "input_tokens < 100 AND simple_query"
          actions:
            - tier: "cheap"
        - condition: "input_tokens < 500"
          actions:
            - tier: "balanced"
        - condition: "input_tokens >= 500 OR complex_reasoning"
          actions:
            - tier: "premium"
complexity_detection:
  simple_keywords: ["hello", "hi", "thanks", "yes", "no"]
  complex_indicators: ["analyze", "explain", "compare", "generate", "code"]
Expected Outcome
- 60-80% cost reduction on simple queries
- Automatic tier selection based on complexity
- Budget enforcement with daily/per-request limits
- Performance maintained for complex tasks
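To make the tier rules concrete, here is a minimal Python sketch of keyword-based complexity detection and tier selection. It is an illustration under stated assumptions rather than MĀRGA's own logic: `select_tier`, `estimate_tokens`, and the four-characters-per-token estimate are hypothetical, and the sketch checks the complex-reasoning rule first so that a short "analyze ..." prompt is not captured by the token-count rules.

```python
import re
from typing import Set

# Keyword lists copied from the complexity_detection configuration.
SIMPLE_KEYWORDS = {"hello", "hi", "thanks", "yes", "no"}
COMPLEX_INDICATORS = {"analyze", "explain", "compare", "generate", "code"}


def words(text: str) -> Set[str]:
    """Lower-cased word set, so 'decode' does not match 'code'."""
    return set(re.findall(r"[a-z']+", text.lower()))


def estimate_tokens(text: str) -> int:
    # Rough assumption: about 4 characters per token. A real router
    # would use the target model's tokenizer instead.
    return max(1, len(text) // 4)


def select_tier(prompt: str) -> str:
    """Return 'cheap', 'balanced', or 'premium' for a prompt."""
    found = words(prompt)
    tokens = estimate_tokens(prompt)
    complex_reasoning = bool(found & COMPLEX_INDICATORS)
    simple_query = bool(found & SIMPLE_KEYWORDS) and not complex_reasoning
    # Checked first: complexity outranks the token-count rules here.
    if complex_reasoning or tokens >= 500:
        return "premium"
    if tokens < 100 and simple_query:
        return "cheap"
    return "balanced"
```

Keyword matching is a coarse heuristic, so a deployment would likely want to log tier decisions and review misroutes before trusting the projected savings.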
3. A/B Testing Models in Production
Problem Statement
Evaluating new models in production requires careful testing with real workloads while minimizing risk to user experience.
Solution with MĀRGA
Implement traffic splitting with comprehensive metrics collection and automatic rollback capabilities.
Configuration Example
ab_testing:
  enable: true
  experiments:
    - name: "claude_vs_gpt4"
      description: "Testing Claude Sonnet against GPT-4"
      start_time: "2024-01-15T00:00:00Z"
      end_time: "2024-01-22T00:00:00Z"
      control_group:
        percentage: 85
        provider: "openai"
        model: "gpt-4"
      test_groups:
        - name: "claude_test"
          percentage: 15
          provider: "anthropic"
          model: "claude-3-sonnet-20240229"
      success_criteria:
        min_response_quality: 0.85
        max_latency_p95: 5000 # ms
        min_requests: 1000
      rollback_triggers:
        error_rate_threshold: 0.05
        latency_degradation: 2.0 # 2x slower
        quality_degradation: 0.1
monitoring:
  datadog:
    enable: true
    custom_metrics:
      - "marga.ab_test.response_time"
      - "marga.ab_test.quality_score"
      - "marga.ab_test.error_rate"
  quality_assessment:
    enable: true
    sample_rate: 0.1 # Assess 10% of responses
    evaluator_model: "gpt-4"
Expected Outcome
- Lower-risk model evaluation: only the test share of traffic (15% above) is exposed, with automated rollback
- Statistical significance through controlled traffic splitting
- Real-world performance data under production load
- Continuous quality monitoring with instant alerts
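The traffic split and rollback triggers above can be sketched as follows. This is a hedged illustration, not MĀRGA's implementation: `assign_group`, `GroupStats`, and `should_rollback` are hypothetical names, and the sketch assumes `latency_degradation` is a multiple of the control group's p95 latency and `quality_degradation` is an absolute drop relative to the control group's quality score.

```python
import hashlib
from dataclasses import dataclass


def assign_group(
    user_id: str,
    experiment: str = "claude_vs_gpt4",
    test_percentage: int = 15,
) -> str:
    """Deterministically map a user to 'claude_test' or 'control'.

    Hashing the experiment name together with the user id keeps each
    user in the same group for the whole experiment.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # 0..99
    return "claude_test" if bucket < test_percentage else "control"


@dataclass
class GroupStats:
    requests: int
    errors: int
    latency_p95_ms: float
    quality: float  # mean evaluator score in [0, 1]


def should_rollback(control: GroupStats, test: GroupStats) -> bool:
    """Apply the rollback_triggers thresholds from the configuration."""
    if test.requests == 0:
        return False  # nothing observed yet
    error_rate = test.errors / test.requests
    return (
        error_rate > 0.05  # error_rate_threshold
        or test.latency_p95_ms > 2.0 * control.latency_p95_ms  # 2x slower
        or control.quality - test.quality > 0.1  # quality_degradation
    )
```

Hash-based assignment avoids storing per-user state, at the cost of the split being only approximately 85/15 for small user counts.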
4. Data Compliance and PII Protection
Problem Statement
Enterprise data contains sensitive information that cannot be sent to external LLM providers due to regulatory requirements (GDPR, HIPAA, SOX).
Solution with MĀRGA
Route sensitive data to on-premises models while maintaining performance for non-sensitive workloads.
Configuration Example
data_protection:
  enable: true
  pii_detection:
    enable: true
    confidence_threshold: 0.8
    patterns:
      email: '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
      ssn: '\b\d{3}-\d{2}-\d{4}\b'
      credit_card: '\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b'
      phone: '\b\d{3}[\s.-]?\d{3}[\s.-]?\d{4}\b'
  classification_rules:
    - name: "contains_pii"
      condition: "pii_detected"
      action: "route_local"
    - name: "internal_only"
      condition: "classification:internal OR domain:company.com"
      action: "route_private_cloud"
    - name: "public_safe"
      condition: "classification:public"
      action: "route_external"
routing:
  data_classification_strategy:
    rules:
      - condition: "contains_pii"
        actions:
          - provider: "ollama"
            model: "codellama:13b"
            location: "on_premises"
      - condition: "classification:internal"
        actions:
          - provider: "azure_openai"
            model: "gpt-4"
            location: "private_cloud"
      - condition: "classification:public"
        actions:
          - provider: "openai"
            model: "gpt-4"
            location: "external"
compliance:
  audit_logging:
    enable: true
    log_level: "detailed"
    retention_days: 2555 # 7 years
    destinations:
      - type: "datadog"
        service: "marga-audit"
      - type: "file"
        path: "/var/log/marga/audit.log"
  data_residency:
    enforce: true
    allowed_regions: ["us-east-1", "eu-west-1"]
  encryption:
    in_transit: true
    at_rest: true
    key_rotation_days: 90
Expected Outcome
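- Requests with detected PII are processed locally on on-premises models
- Automated compliance enforcement with audit trails
- No external exposure of data flagged as sensitive (coverage depends on the detection patterns and classification labels)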
- Regulatory compliance (GDPR, HIPAA, SOX ready)
- Performance optimization for non-sensitive workloads
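A minimal Python sketch of the pattern-based detection and routing decision is shown below. It is an illustration only: `detect_pii` and `choose_route` are hypothetical names, the `domain:company.com` condition is left out for brevity, and routing unknown classification labels locally is an assumption. The email pattern here uses `[A-Za-z]` for the top-level domain, since a `|` inside a character class matches a literal pipe.

```python
import re
from typing import List

# Patterns follow the pii_detection configuration (see note on email).
PII_PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[\s.-]?\d{3}[\s.-]?\d{4}\b"),
}


def detect_pii(text: str) -> List[str]:
    """Return the sorted names of every pattern found in the text."""
    return sorted(
        name for name, pattern in PII_PATTERNS.items() if pattern.search(text)
    )


def choose_route(text: str, classification: str = "public") -> str:
    """Pick a routing action; PII detection outranks the label."""
    if detect_pii(text):
        return "route_local"
    if classification == "internal":
        return "route_private_cloud"
    if classification == "public":
        return "route_external"
    # Assumption: unknown labels fail closed and stay on premises.
    return "route_local"
```

Regular expressions catch only well-formed identifiers, so they are best treated as a first filter rather than a guarantee that no sensitive data leaves the premises.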
Implementation Considerations
Monitoring and Observability
All use cases require comprehensive monitoring:
- Request routing decisions and performance
- Model response quality and latency
- Cost tracking and budget alerts
- Security and compliance metrics
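As one illustration of the cost tracking and budget alert item, the sketch below enforces a per-request and a daily limit like those in the cost optimization configuration. The `BudgetTracker` class, its `alert_ratio` parameter, and the string return values are hypothetical and exist only for this example.

```python
from collections import defaultdict


def estimate_cost(tokens: int, price_per_1k_tokens: float) -> float:
    """Cost in USD for a token count at a per-1k-token price."""
    return tokens / 1000 * price_per_1k_tokens


class BudgetTracker:
    def __init__(self, daily_limit=100.00, per_request_limit=2.00, alert_ratio=0.8):
        self.daily_limit = daily_limit
        self.per_request_limit = per_request_limit
        # Hypothetical setting: warn once this share of the day is spent.
        self.alert_ratio = alert_ratio
        self.spent = defaultdict(float)  # ISO date string -> USD spent

    def try_spend(self, cost: float, day: str) -> str:
        """Return 'rejected', 'alert', or 'ok'; record accepted spend."""
        if cost > self.per_request_limit:
            return "rejected"
        if self.spent[day] + cost > self.daily_limit:
            return "rejected"
        self.spent[day] += cost
        if self.spent[day] >= self.alert_ratio * self.daily_limit:
            return "alert"
        return "ok"
```

A caller would estimate the cost before dispatching a request, for example `tracker.try_spend(estimate_cost(2000, 0.030), "2024-01-15")`, and route to a cheaper tier or refuse when the result is `"rejected"`.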
Testing and Validation
- Canary deployments for routing rule changes
- Synthetic testing for all providers
- Quality regression testing
- Performance benchmarking
Operational Excellence
- Automated rollback capabilities
- Circuit breaker patterns
- Rate limiting and throttling
- Comprehensive alerting
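The circuit breaker item can be illustrated with a small, generic sketch. Nothing here is taken from MĀRGA itself: the `CircuitBreaker` class, its thresholds, and the injectable `clock` are assumptions chosen to show the pattern, in which a provider that keeps failing is skipped for a cool-down period before a probe request is allowed.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # failures before opening
        self.reset_after = reset_after  # cool-down in seconds
        self.clock = clock  # injectable for testing
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self) -> bool:
        """Return True if a request may be sent to the provider."""
        if self.opened_at is None:
            return True
        # Half-open: permit a probe once the cool-down has elapsed.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self) -> None:
        """Close the circuit and clear the failure count."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        """Count a failure; open (or re-open) the circuit at the threshold."""
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

In a failover chain, one breaker per provider lets the router skip a provider that is known to be down instead of paying its timeout on every request.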
MĀRGA provides the foundation for these enterprise patterns while maintaining simplicity and reliability.