DevOps RAG Knowledge Base - Enterprise Documentation
Avyay DevOps RAG is an enterprise-grade Retrieval-Augmented Generation system that transforms your DevOps runbooks and documentation into an intelligent, queryable knowledge base. Get instant answers to operational questions with cited sources and context.
🏗️ RAG Architecture
┌─────────────────┐    Ingest     ┌─────────────────┐     Query     ┌─────────────────┐
│    Runbooks     │ ────────────► │  Vector Index   │ ◄──────────── │   User Query    │
│   (.md files)   │               │  (embeddings)   │               │                 │
└─────────────────┘               └─────────────────┘               └─────────────────┘
         │                                 │                                 │
         │ 1. Chunk documents              │ 3. Cosine similarity            │
         │    (512 tokens + 64 overlap)    │    search (top-k)               │
         │                                 │                                 │
         ▼                                 ▼                                 ▼
┌─────────────────┐               ┌─────────────────┐               ┌─────────────────┐
│   OpenAI API    │               │    Retrieve     │               │   GPT-4o-mini   │
│   Embeddings    │               │     Context     │               │   Generation    │
│   (1536-dim)    │               │     Chunks      │               │   + Citations   │
└─────────────────┘               └─────────────────┘               └─────────────────┘
Flow:
- Ingest: Runbooks → Chunked → Embedded → Indexed
- Retrieve: Query → Embedded → Cosine Similarity → Top-k Chunks
- Generate: Context + Query → GPT-4o-mini → Cited Answer
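The Retrieve step above can be illustrated with a minimal sketch of cosine-similarity top-k search. This is not the project's actual code: the function names, the chunk IDs, and the toy 3-dimensional vectors (standing in for the 1536-dimensional embeddings) are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float],
          index: Sequence[Tuple[str, Sequence[float]]],
          k: int = 3) -> List[Tuple[str, float]]:
    """Return the k (chunk_id, score) pairs with the highest similarity."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in index]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical chunk IDs with toy vectors in place of real embeddings.
toy_index = [
    ("k8s-crashloop", [1.0, 0.0, 0.0]),
    ("aws-incident", [0.0, 1.0, 0.0]),
    ("docker-ops", [1.0, 1.0, 0.0]),
]
results = top_k([1.0, 0.0, 0.0], toy_index, k=2)
```

A linear scan like this is usually adequate for an index of a few dozen chunks; larger corpora would typically call for an approximate nearest-neighbour structure instead.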
✨ Key Features
🎯 Intelligent Retrieval
- Vector Search: OpenAI text-embedding-3-small (1536 dimensions)
- Optimized Chunking: 512-token chunks with 64-token overlap to preserve context across chunk boundaries
- Source Citations: Every answer includes relevance scores and source excerpts
- 100% Hit Rate: Tuned for perfect source document retrieval
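The chunking scheme listed above (512-token chunks with a 64-token overlap) can be sketched as a sliding window over a token list. The function name `chunk_tokens` is hypothetical, and the synthetic tokens below stand in for the output of whatever tokenizer the project actually uses, which is not specified here.

```python
from typing import List


def chunk_tokens(tokens: List[str], chunk_size: int = 512, overlap: int = 64) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # each window starts this many tokens after the last
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


# Assumption: synthetic tokens stand in for real tokenizer output.
tokens = [f"t{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)
```

With these defaults each window begins 448 tokens after the previous one, so the last 64 tokens of one chunk are repeated as the first 64 tokens of the next.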
🚀 Production Ready
- Docker Distribution: Available at ghcr.io/gaurav21/devops-rag:latest
- Cloud Native: Deployed on Google Cloud Run with auto-scaling
- Zero Dependencies: JSON-based vector index, no external databases
- Fast Inference: Sub-500ms average query latency
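To illustrate what a JSON-based vector index with no external database might look like, here is a minimal sketch that writes chunk records to a single file and reads them back. The file name, the `entries` key, the record fields (`source`, `text`, `embedding`), and the sample record are all assumptions; the project's real on-disk schema is not documented in this section.

```python
import json
import os
import tempfile
from typing import Dict, List


def save_index(path: str, entries: List[Dict]) -> None:
    """Write all chunk records to one JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"entries": entries}, fh)


def load_index(path: str) -> List[Dict]:
    """Read the chunk records back into memory."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)["entries"]


# Hypothetical record layout: source file, chunk text, embedding vector.
entries = [
    {
        "source": "kubernetes-troubleshooting.md",
        "text": "Check pod logs first.",
        "embedding": [0.1, 0.2, 0.3],
    },
]
with tempfile.TemporaryDirectory() as tmp:
    index_path = os.path.join(tmp, "index.json")
    save_index(index_path, entries)
    loaded = load_index(index_path)
```

Keeping the whole index in one file means it can be loaded into memory at startup and shipped inside the container image, at the cost of rewriting the file on every re-index.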
🔧 Developer Experience
- REST API: Full FastAPI server with OpenAPI docs
- React UI: Modern chat interface with markdown rendering
- CLI Interface: Ingest, query, and test from command line
- Auto-Ingest: Watch for runbook changes and re-index
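The REST API listed above can also be called from Python. The sketch below uses only the standard library and the `/ask` endpoint and `question` field shown in the Quick Start; the helper names `build_ask_request` and `ask` are hypothetical, and because the response schema is not documented here, the parsed JSON is returned unchanged.

```python
import json
import urllib.request


def build_ask_request(question: str,
                      base_url: str = "http://localhost:8080") -> urllib.request.Request:
    """Build the POST request for /ask without sending it."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str,
        base_url: str = "http://localhost:8080",
        timeout: float = 30.0) -> dict:
    """Send the question and return the parsed JSON response as-is."""
    request = build_ask_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request needs no running server; ask() does.
request = build_ask_request("How do I fix CrashLoopBackOff in Kubernetes?")
```

Calling `ask("How do I fix CrashLoopBackOff in Kubernetes?")` against a running container should return the same payload as the curl command in the Quick Start.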
📊 Enterprise Features
- Datadog Integration: APM traces, metrics, and monitoring
- Deployment Flexibility: Docker, docker-compose, Kubernetes, Cloud Run
- Configuration Tuning: Chunk size, overlap, top-k, embedding models
- Quality Metrics: Retrieval accuracy, latency tracking, relevance scoring
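One common way to quantify retrieval accuracy is a hit rate: the fraction of test queries whose expected source document appears among the retrieved results. The sketch below is an illustration under that assumption, not a description of how this project computes its metrics; the query IDs and file names are made up.

```python
from typing import Dict, List


def hit_rate(retrieved: Dict[str, List[str]], expected: Dict[str, str]) -> float:
    """Fraction of queries whose expected source appears in the retrieved list."""
    if not expected:
        raise ValueError("expected must contain at least one query")
    hits = sum(
        1
        for query, source in expected.items()
        if source in retrieved.get(query, [])
    )
    return hits / len(expected)


# Hypothetical evaluation set: query ID -> the source that should be retrieved.
expected = {
    "q1": "kubernetes.md",
    "q2": "terraform.md",
    "q3": "docker.md",
    "q4": "linux.md",
}
# Hypothetical retrieval output: query ID -> sources of the top-k chunks.
retrieved = {
    "q1": ["kubernetes.md", "docker.md"],
    "q2": ["aws.md", "terraform.md"],
    "q3": ["docker.md"],
    "q4": ["kubernetes.md"],
}
score = hit_rate(retrieved, expected)  # 3 of 4 queries hit
```

Rerunning such a check after changing chunk size, overlap, or top-k gives a quick signal of whether a tuning change helped or hurt retrieval.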
🏃‍♂️ Quick Start
# Pull from GitHub Container Registry
docker pull ghcr.io/gaurav21/devops-rag:latest
# Run with sample runbooks
docker run -p 8080:8080 \
  -e OPENAI_API_KEY=your_key_here \
  ghcr.io/gaurav21/devops-rag:latest
# Query the API
curl -X POST http://localhost:8080/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "How do I fix CrashLoopBackOff in Kubernetes?"}'
# Access React UI (macOS; use xdg-open on Linux)
open http://localhost:8080
📚 Knowledge Base
18 Pre-loaded Runbooks:
- Kubernetes Troubleshooting & Networking
- AWS Incident Response & EKS Operations
- Docker & Container Operations
- Database Operations & Migrations
- Monitoring (Datadog, Prometheus, Grafana)
- CI/CD Pipeline Management
- Linux Server Maintenance
- Terraform Infrastructure as Code
- Elasticsearch & Logging Systems
- Security (RBAC, SSL, Compliance)
- Incident Management & Post-mortems
45 Indexed Chunks covering operational scenarios across cloud-native and traditional infrastructure.
📖 Documentation
- 📋 Quick Start - 5-minute setup guide
- 🔌 API Reference - Complete REST API docs
- 🚀 Deployment - Docker, K8s, Cloud Run
- ⚙️ Configuration - Model tuning and settings
- 📚 Knowledge Base - Runbook authoring best practices
- 🎯 Use Cases - Real-world applications
- 📊 Monitoring - Datadog dashboards and alerts
- 🔧 Tuning - Performance optimization guide
🔗 Links
- Live Demo: https://devops-rag-449012790678.asia-southeast1.run.app
- GitHub: gaurav21/devops-rag
- Docker Image: ghcr.io/gaurav21/devops-rag:latest
- Company: Avyay AI - Building intelligent systems for operational excellence
Built with ❤️ by Avyay for DevOps teams who value speed, accuracy, and intelligence in their operations.