DevOps RAG Overview

DevOps RAG Knowledge Base - Enterprise Documentation

Avyay DevOps RAG is an enterprise-grade Retrieval-Augmented Generation system that transforms your DevOps runbooks and documentation into an intelligent, queryable knowledge base. Get instant answers to operational questions with cited sources and context.

🏗️ RAG Architecture

┌─────────────────┐    Ingest     ┌─────────────────┐    Query      ┌─────────────────┐
│   Runbooks      │ ────────────► │  Vector Index   │ ◄──────────── │   User Query    │
│   (.md files)   │               │  (embeddings)   │               │                 │
└─────────────────┘               └─────────────────┘               └─────────────────┘
         │                                 │                                 │
         │ 1. Chunk documents              │ 3. Cosine similarity            │
         │    (512 tokens + 64 overlap)    │    search (top-k)               │
         │                                 │                                 │
         ▼                                 ▼                                 ▼
┌─────────────────┐               ┌─────────────────┐               ┌─────────────────┐
│  OpenAI API     │               │  Retrieve       │               │  GPT-4o-mini    │
│  Embeddings     │               │  Context        │               │  Generation     │
│  (1536-dim)     │               │  Chunks         │               │  + Citations    │
└─────────────────┘               └─────────────────┘               └─────────────────┘

Flow:

  1. Ingest: Runbooks → Chunked → Embedded → Indexed
  2. Retrieve: Query → Embedded → Cosine Similarity → Top-k Chunks
  3. Generate: Context + Query → GPT-4o-mini → Cited Answer
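
The Retrieve step above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function names, the chunk ids, and the toy 3-dimensional vectors are assumptions made for the example, and a real index would hold 1536-dimensional embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def top_k(query_vec, index, k=3):
    """Return the k (score, chunk_id) pairs most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), chunk_id)
              for chunk_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Hypothetical chunk ids; toy vectors stand in for real embeddings.
toy_index = {
    "k8s-crashloop#0": [1.0, 0.0, 0.0],
    "terraform-state#2": [0.0, 1.0, 0.0],
    "k8s-networking#1": [0.8, 0.6, 0.0],
}
results = top_k([1.0, 0.1, 0.0], toy_index, k=2)
for score, chunk_id in results:
    print(f"{chunk_id}: {score:.3f}")
```

The chunks returned here would then be passed, together with the question, to the generation step.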

✨ Key Features

🎯 Intelligent Retrieval

  • Vector Search: OpenAI text-embedding-3-small (1536 dimensions)
  • Optimized Chunking: 512-token chunks with 64-token overlap, so context at chunk boundaries is preserved
  • Source Citations: Every answer includes relevance scores and source excerpts
  • 100% Hit Rate: Tuned for perfect source document retrieval
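
To make the chunking parameters concrete, here is a rough sketch of a sliding window with 512-token chunks and 64-token overlap. It assumes the document has already been tokenized into a list; the function name chunk_tokens and the placeholder tokens are hypothetical, and the project's own tokenizer and chunker may behave differently.

```python
def chunk_tokens(tokens, chunk_size=512, overlap=64):
    """Split a token list into windows that share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # each window starts 448 tokens after the last
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the document
    return chunks

# Placeholder tokens; a real pipeline would use the embedding model's tokenizer.
tokens = [f"tok{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])  # [512, 512, 104]
```

With these defaults, the last 64 tokens of each chunk are repeated at the start of the next one, which is what keeps a sentence near a boundary attached to its context.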

🚀 Production Ready

  • Docker Distribution: Available at ghcr.io/gaurav21/devops-rag:latest
  • Cloud Native: Deployed on Google Cloud Run with auto-scaling
  • No Database Dependency: JSON-based vector index, no external databases
  • Fast Inference: Sub-500ms average query latency
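
As an illustration of what a JSON-based vector index can look like, the sketch below round-trips a list of chunk records through a JSON file. The record layout (id, source, text, embedding) and the function names are assumptions for the example; the actual on-disk format of the project is not described here.

```python
import json
import tempfile
from pathlib import Path

def save_index(path, entries):
    """Write chunk records to a JSON file."""
    Path(path).write_text(json.dumps({"entries": entries}), encoding="utf-8")

def load_index(path):
    """Read the chunk records back into a list of dicts."""
    return json.loads(Path(path).read_text(encoding="utf-8"))["entries"]

# Hypothetical record layout; the 3-dim embedding stands in for 1536 dims.
entries = [
    {"id": "kubernetes.md#0", "source": "kubernetes.md",
     "text": "Check pod logs with kubectl logs.",
     "embedding": [0.1, 0.2, 0.3]},
]
with tempfile.TemporaryDirectory() as tmp:
    index_path = Path(tmp) / "index.json"
    save_index(index_path, entries)
    restored = load_index(index_path)
```

A flat file like this is easy to ship inside a container image, at the cost of loading the whole index into memory at startup.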

🔧 Developer Experience

  • REST API: Full FastAPI server with OpenAPI docs
  • React UI: Modern chat interface with markdown rendering
  • CLI Interface: Ingest, query, and test from command line
  • Auto-Ingest: Watch for runbook changes and re-index
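
One simple way to detect runbook changes for re-indexing is to compare content hashes between two scans of the runbook directory. The sketch below shows that idea only; the function names are hypothetical, and the project may rely on filesystem events or another mechanism instead.

```python
import hashlib
import tempfile
from pathlib import Path

def snapshot(runbook_dir):
    """Map each .md file path to a SHA-256 digest of its contents."""
    return {
        str(p): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(Path(runbook_dir).rglob("*.md"))
    }

def changed_paths(old, new):
    """Paths added, modified, or removed between two snapshots."""
    return sorted(p for p in old.keys() | new.keys()
                  if old.get(p) != new.get(p))

with tempfile.TemporaryDirectory() as tmp:
    runbook = Path(tmp) / "kubernetes.md"
    runbook.write_text("# CrashLoopBackOff\n", encoding="utf-8")
    before = snapshot(tmp)
    runbook.write_text("# CrashLoopBackOff\nCheck pod logs first.\n",
                       encoding="utf-8")
    after = snapshot(tmp)
    to_reindex = changed_paths(before, after)  # only the edited runbook
```

A watcher loop would call snapshot on a timer and re-embed only the files listed in to_reindex.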

📊 Enterprise Features

  • Datadog Integration: APM traces, metrics, and monitoring
  • Deployment Flexibility: Docker, docker-compose, Kubernetes, Cloud Run
  • Configuration Tuning: Chunk size, overlap, top-k, embedding models
  • Quality Metrics: Retrieval accuracy, latency tracking, relevance scoring

🏃‍♂️ Quick Start

# Pull from GitHub Container Registry
docker pull ghcr.io/gaurav21/devops-rag:latest
 
# Run with sample runbooks
docker run -p 8080:8080 \
  -e OPENAI_API_KEY=your_key_here \
  ghcr.io/gaurav21/devops-rag:latest
 
# Query the API
curl -X POST http://localhost:8080/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "How do I fix CrashLoopBackOff in Kubernetes?"}'
 
# Access React UI
open http://localhost:8080
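
The same query can be sent from Python using only the standard library. This sketch mirrors the curl call above (POST to /ask with a JSON body containing "question"); the helper names are hypothetical, and because the response schema is not documented here, the parsed JSON is returned as-is.

```python
import json
import urllib.request

def build_ask_request(question, base_url="http://localhost:8080"):
    """Build the POST request for the /ask endpoint shown in the curl example."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(question, base_url="http://localhost:8080", timeout=30):
    """Send the question and return the decoded JSON response unchanged."""
    request = build_ask_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))

if __name__ == "__main__":
    # Requires the container from the steps above to be running locally.
    print(ask("How do I fix CrashLoopBackOff in Kubernetes?"))
```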

📚 Knowledge Base

18 Pre-loaded Runbooks:

  • Kubernetes Troubleshooting & Networking
  • AWS Incident Response & EKS Operations
  • Docker & Container Operations
  • Database Operations & Migrations
  • Monitoring (Datadog, Prometheus, Grafana)
  • CI/CD Pipeline Management
  • Linux Server Maintenance
  • Terraform Infrastructure as Code
  • Elasticsearch & Logging Systems
  • Security (RBAC, SSL, Compliance)
  • Incident Management & Post-mortems

45 Indexed Chunks covering operational scenarios across cloud-native and traditional infrastructure.

Built with ❤️ by Avyay for DevOps teams who value speed, accuracy, and intelligence in their operations.