Enterprise Use Cases

MĀRGA supports LLM routing strategies for enterprise environments. Each of the four use cases below includes a problem statement, an example configuration, and the expected outcome.

1. Multi-Provider Failover Strategy

Problem Statement

Enterprise applications need high availability for LLM services. Single-provider dependency creates risk of service interruptions, affecting critical business workflows.

Solution with MĀRGA

Implement cascading failover across multiple providers with automatic retry and fallback logic.

Configuration Example

providers:
  openai:
    api_key: "${OPENAI_API_KEY}"
    base_url: "https://api.openai.com/v1"
    timeout: 30s
    retry_attempts: 2
  anthropic:
    api_key: "${ANTHROPIC_API_KEY}"
    base_url: "https://api.anthropic.com"
    timeout: 30s
    retry_attempts: 2
  ollama:
    base_url: "http://localhost:11434"
    timeout: 60s
    retry_attempts: 1
 
routing:
  strategies:
    - name: "high_availability"
      rules:
        - condition: "always"
          actions:
            - provider: "openai"
              model: "gpt-4"
              fallback: "anthropic"
            - provider: "anthropic"
              model: "claude-3-sonnet-20240229"
              fallback: "ollama"
            - provider: "ollama"
              model: "llama2:13b"
              fallback: null
 
  default_strategy: "high_availability"
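
The fallback chain in this configuration can be illustrated with a short Python sketch. It is not MĀRGA's implementation: `CHAIN`, `route_with_failover`, and the `call` callback are hypothetical names, and `retry_attempts` is assumed to mean the total number of attempts per provider before moving to its fallback.

```python
from typing import Callable, Optional

# Hypothetical mirror of the YAML chain: provider -> (model, fallback, attempts).
CHAIN = {
    "openai": ("gpt-4", "anthropic", 2),
    "anthropic": ("claude-3-sonnet-20240229", "ollama", 2),
    "ollama": ("llama2:13b", None, 1),
}


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has been exhausted."""


def route_with_failover(
    prompt: str,
    call: Callable[[str, str, str], str],
    start: str = "openai",
) -> tuple[str, str]:
    """Walk the fallback chain and return (provider, response) on first success."""
    errors = []
    provider: Optional[str] = start
    while provider is not None:
        model, fallback, attempts = CHAIN[provider]
        for attempt in range(1, attempts + 1):
            try:
                return provider, call(provider, model, prompt)
            except Exception as exc:  # a real client would catch narrower errors
                errors.append(f"{provider} attempt {attempt}: {exc}")
        provider = fallback  # all attempts failed, move down the chain
    raise AllProvidersFailed("; ".join(errors))


def flaky_call(provider: str, model: str, prompt: str) -> str:
    """Stand-in for a provider client; simulates an OpenAI outage."""
    if provider == "openai":
        raise TimeoutError("simulated outage")
    return f"[{provider}/{model}] ok"


print(route_with_failover("ping", flaky_call))
```

Because the caller only sees the returned response, the interface stays the same regardless of which provider answered.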

Expected Outcome

  • 99.9% uptime through multi-provider resilience
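  • Automatic failover once a provider's retries are exhausted (within 2-3 seconds for fast-failing errors; hung requests wait for the configured timeout on each attempt)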
  • Zero manual intervention during provider outages
  • Consistent API interface regardless of active provider

2. Cost Optimization Through Smart Routing

Problem Statement

Enterprise LLM costs can spiral quickly. Simple queries don’t need expensive frontier models, but determining complexity programmatically is challenging.

Solution with MĀRGA

Route requests based on complexity heuristics and user-defined cost tiers.

Configuration Example

cost_optimization:
  enable: true
  budgets:
    daily_limit: 100.00  # USD
    per_request_limit: 2.00  # USD
  
  tier_mapping:
    cheap:
      providers: ["openai"]
      models: ["gpt-3.5-turbo"]
      max_cost_per_1k_tokens: 0.002
    
    balanced:
      providers: ["anthropic", "openai"]
      models: ["claude-3-haiku-20240307", "gpt-4o-mini"]
      max_cost_per_1k_tokens: 0.010
    
    premium:
      providers: ["openai", "anthropic"]
      models: ["gpt-4", "claude-3-sonnet-20240229"]
      max_cost_per_1k_tokens: 0.030
 
routing:
  strategies:
    - name: "cost_aware"
      rules:
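        # Assumes rules are evaluated top to bottom and the first match wins,
        # so the complex_reasoning check must come before the generic < 500 rule.
        - condition: "input_tokens < 100 AND simple_query"
          actions:
            - tier: "cheap"
        - condition: "input_tokens >= 500 OR complex_reasoning"
          actions:
            - tier: "premium"
        - condition: "input_tokens < 500"
          actions:
            - tier: "balanced"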
 
  complexity_detection:
    simple_keywords: ["hello", "hi", "thanks", "yes", "no"]
    complex_indicators: ["analyze", "explain", "compare", "generate", "code"]
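
As a rough illustration of how the keyword heuristics and token thresholds could combine, here is a minimal Python sketch. The function name `select_tier` is hypothetical, and the sketch assumes that a complex indicator always takes precedence over the token thresholds and over a simple keyword.

```python
import re

# Keyword lists copied from the complexity_detection section.
SIMPLE_KEYWORDS = {"hello", "hi", "thanks", "yes", "no"}
COMPLEX_INDICATORS = {"analyze", "explain", "compare", "generate", "code"}


def select_tier(prompt: str, input_tokens: int) -> str:
    """Pick a cost tier from token count and keyword heuristics."""
    words = set(re.findall(r"[a-z']+", prompt.lower()))
    complex_reasoning = bool(words & COMPLEX_INDICATORS)
    # Assumption: a prompt with a complex indicator is never treated as simple.
    simple_query = bool(words & SIMPLE_KEYWORDS) and not complex_reasoning

    if input_tokens >= 500 or complex_reasoning:
        return "premium"
    if input_tokens < 100 and simple_query:
        return "cheap"
    return "balanced"


print(select_tier("hi", 3))
print(select_tier("Please analyze this report", 20))
```

Whole-word keyword matching is crude (any prompt containing "no" counts as simple), so a production deployment would likely want a stronger classifier behind the same interface.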

Expected Outcome

  • 60-80% cost reduction on simple queries
  • Automatic tier selection based on complexity
  • Budget enforcement with daily/per-request limits
  • Performance maintained for complex tasks
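
The daily and per-request limits from the configuration could be enforced by a small guard that runs before each request is dispatched. The sketch below is an assumption about how such a check might look; `BudgetGuard` and its methods are hypothetical names, and costs are estimated from a flat per-1k-token price.

```python
from dataclasses import dataclass


@dataclass
class BudgetGuard:
    """Tracks spend against the limits in the budgets section (USD)."""

    daily_limit: float = 100.00
    per_request_limit: float = 2.00
    spent_today: float = 0.0

    def estimate(self, tokens: int, cost_per_1k_tokens: float) -> float:
        """Estimate the request cost from a flat per-1k-token price."""
        return tokens / 1000 * cost_per_1k_tokens

    def authorize(self, tokens: int, cost_per_1k_tokens: float) -> bool:
        """Return True and record the spend if the request fits both limits."""
        cost = self.estimate(tokens, cost_per_1k_tokens)
        if cost > self.per_request_limit:
            return False
        if self.spent_today + cost > self.daily_limit:
            return False
        self.spent_today += cost
        return True


guard = BudgetGuard()
print(guard.authorize(10_000, 0.030))   # small request on the premium tier
print(guard.authorize(100_000, 0.030))  # exceeds the per-request limit
```

A real deployment would also reset `spent_today` at the day boundary and would probably use `decimal.Decimal` rather than floats for currency.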

3. A/B Testing Models in Production

Problem Statement

Evaluating new models in production requires careful testing with real workloads while minimizing risk to user experience.

Solution with MĀRGA

Implement traffic splitting with comprehensive metrics collection and automatic rollback capabilities.

Configuration Example

ab_testing:
  enable: true
  experiments:
    - name: "claude_vs_gpt4"
      description: "Testing Claude Sonnet against GPT-4"
      start_time: "2024-01-15T00:00:00Z"
      end_time: "2024-01-22T00:00:00Z"
      
      control_group:
        percentage: 85
        provider: "openai"
        model: "gpt-4"
        
      test_groups:
        - name: "claude_test"
          percentage: 15
          provider: "anthropic"
          model: "claude-3-sonnet-20240229"
      
      success_criteria:
        min_response_quality: 0.85
        max_latency_p95: 5000  # ms
        min_requests: 1000
      
      rollback_triggers:
        error_rate_threshold: 0.05
        latency_degradation: 2.0  # 2x slower
        quality_degradation: 0.1
 
monitoring:
  datadog:
    enable: true
    custom_metrics:
      - "marga.ab_test.response_time"
      - "marga.ab_test.quality_score"
      - "marga.ab_test.error_rate"
    
  quality_assessment:
    enable: true
    sample_rate: 0.1  # Assess 10% of responses
    evaluator_model: "gpt-4"
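
One common way to implement an 85/15 split like the one configured here is deterministic hash bucketing, so that the same user or session always lands in the same group. The sketch below shows that technique; it is not confirmed to be how MĀRGA assigns traffic, and `assign_group` and `request_key` are hypothetical names.

```python
import hashlib


def assign_group(request_key: str, test_percentage: int = 15) -> str:
    """Map a stable key (for example a user ID) to an experiment group.

    The key is hashed into one of 100 buckets; buckets below
    test_percentage go to the test group, the rest to control.
    """
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "claude_test" if bucket < test_percentage else "control"


groups = [assign_group(f"user-{i}") for i in range(1000)]
print(groups.count("claude_test"), "of 1000 keys assigned to the test group")
```

Hashing rather than random sampling keeps each user's experience consistent for the duration of the experiment and makes assignments reproducible during analysis.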

Expected Outcome

  • Lower-risk model evaluation with automated rollback
  • Statistically meaningful comparisons through controlled traffic splitting and a minimum sample size
  • Real-world performance data under production load
  • Continuous quality monitoring with instant alerts
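
The rollback triggers in the configuration can be read as three comparisons between the test group and the control group. The following sketch makes that reading explicit. `GroupStats` and `rollback_reasons` are hypothetical names, and `quality_degradation` is assumed to be an absolute drop in quality score rather than a relative one.

```python
from dataclasses import dataclass


@dataclass
class GroupStats:
    """Aggregated metrics for one experiment group."""

    error_rate: float      # fraction of failed requests, 0.0 to 1.0
    latency_p95_ms: float  # 95th percentile latency in milliseconds
    quality_score: float   # evaluator score, 0.0 to 1.0


def rollback_reasons(
    control: GroupStats,
    test: GroupStats,
    error_rate_threshold: float = 0.05,
    latency_degradation: float = 2.0,
    quality_degradation: float = 0.1,
) -> list[str]:
    """Return the names of every trigger the test group has tripped."""
    reasons = []
    if test.error_rate > error_rate_threshold:
        reasons.append("error_rate")
    if test.latency_p95_ms > control.latency_p95_ms * latency_degradation:
        reasons.append("latency")
    # Assumption: degradation is measured as an absolute score difference.
    if control.quality_score - test.quality_score > quality_degradation:
        reasons.append("quality")
    return reasons


control = GroupStats(error_rate=0.01, latency_p95_ms=2000, quality_score=0.90)
test = GroupStats(error_rate=0.08, latency_p95_ms=4500, quality_score=0.75)
print(rollback_reasons(control, test))
```

An empty list means the experiment can continue; any non-empty result would route all traffic back to the control model.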

4. Data Compliance and PII Protection

Problem Statement

Enterprise data often contains sensitive information that regulatory requirements (GDPR, HIPAA, SOX) prohibit sending to external LLM providers.

Solution with MĀRGA

Route sensitive data to on-premises models while maintaining performance for non-sensitive workloads.

Configuration Example

data_protection:
  enable: true
  
  pii_detection:
    enable: true
    confidence_threshold: 0.8
    patterns:
      email: '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
      ssn: '\b\d{3}-\d{2}-\d{4}\b'
      credit_card: '\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b'
      phone: '\b\d{3}[\s.-]?\d{3}[\s.-]?\d{4}\b'
  
  classification_rules:
    - name: "contains_pii"
      condition: "pii_detected"
      action: "route_local"
      
    - name: "internal_only"
      condition: "classification:internal OR domain:company.com"
      action: "route_private_cloud"
      
    - name: "public_safe"
      condition: "classification:public"
      action: "route_external"
 
routing:
  data_classification_strategy:
    rules:
      - condition: "contains_pii"
        actions:
          - provider: "ollama"
            model: "codellama:13b"
            location: "on_premises"
            
      - condition: "classification:internal"
        actions:
          - provider: "azure_openai"
            model: "gpt-4"
            location: "private_cloud"
            
      - condition: "classification:public"
        actions:
          - provider: "openai"
            model: "gpt-4"
            location: "external"
 
compliance:
  audit_logging:
    enable: true
    log_level: "detailed"
    retention_days: 2555  # 7 years
    destinations:
      - type: "datadog"
        service: "marga-audit"
      - type: "file"
        path: "/var/log/marga/audit.log"
  
  data_residency:
    enforce: true
    allowed_regions: ["us-east-1", "eu-west-1"]
    
  encryption:
    in_transit: true
    at_rest: true
    key_rotation_days: 90
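
To show how the detection patterns and routing rules fit together, here is a minimal Python sketch. The names `detect_pii` and `choose_route` are hypothetical, and the choice to send unrecognized classifications to the on-premises route (failing closed) is an assumption, not documented MĀRGA behavior. The email pattern writes the top-level-domain class as `[A-Za-z]`, since a `|` inside a character class matches a literal pipe.

```python
import re

# Patterns taken from the pii_detection section.
PII_PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[\s.-]?\d{3}[\s.-]?\d{4}\b"),
}

# (provider, model, location) per rule, mirroring the routing section.
ROUTES = {
    "contains_pii": ("ollama", "codellama:13b", "on_premises"),
    "internal": ("azure_openai", "gpt-4", "private_cloud"),
    "public": ("openai", "gpt-4", "external"),
}


def detect_pii(text: str) -> list[str]:
    """Return the sorted names of every PII pattern found in the text."""
    return sorted(name for name, rx in PII_PATTERNS.items() if rx.search(text))


def choose_route(text: str, classification: str) -> tuple[str, str, str]:
    """PII always wins; unknown classifications fail closed to on-premises."""
    if detect_pii(text):
        return ROUTES["contains_pii"]
    return ROUTES.get(classification, ROUTES["contains_pii"])


print(choose_route("Email bob@corp.io the notes", "public"))
print(choose_route("Summarize the Q3 roadmap", "internal"))
```

Regex matching only catches PII in these specific formats, so names, addresses, and free-text identifiers would need an additional detector.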

Expected Outcome

  • Local processing for every request flagged as containing PII
  • Automated compliance enforcement with audit trails
  • No external exposure of data the detector classifies as sensitive
  • Support for regulatory requirements (GDPR, HIPAA, SOX)
  • Performance optimization for non-sensitive workloads

Implementation Considerations

Monitoring and Observability

All four use cases depend on monitoring of:

  • Request routing decisions and performance
  • Model response quality and latency
  • Cost tracking and budget alerts
  • Security and compliance metrics

Testing and Validation

  • Canary deployments for routing rule changes
  • Synthetic testing for all providers
  • Quality regression testing
  • Performance benchmarking

Operational Excellence

  • Automated rollback capabilities
  • Circuit breaker patterns
  • Rate limiting and throttling
  • Comprehensive alerting
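
Of the patterns listed above, the circuit breaker is the one most directly tied to provider failover: it stops sending traffic to a provider that keeps failing and retries it after a cool-down. The sketch below is a generic, minimal version of the pattern, not MĀRGA's implementation; the class name, thresholds, and injectable `clock` parameter are all illustrative assumptions.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Opens after consecutive failures, then allows a trial after a cool-down."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow(self) -> bool:
        """Return True if a request may be sent to the provider."""
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        # Half-open: after the cool-down, let trial requests through.
        return self.clock() - self.opened_at >= self.reset_after_s

    def record_success(self) -> None:
        """Close the breaker and clear the failure count."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        """Count a failure and open (or re-open) the breaker at the threshold."""
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


breaker = CircuitBreaker(failure_threshold=2)
breaker.record_failure()
breaker.record_failure()
print(breaker.allow())  # breaker is open, so the provider is skipped
```

A router would keep one breaker per provider and consult `allow()` before each attempt, falling through to the next provider while the breaker is open.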

MĀRGA provides the foundation for these enterprise patterns while maintaining simplicity and reliability.