API Reference - DevOps RAG
Complete REST API documentation for the DevOps RAG system. All examples use the live Cloud Run deployment at https://devops-rag-449012790678.asia-southeast1.run.app.
Base URL
- Production: https://devops-rag-449012790678.asia-southeast1.run.app
- Local: http://localhost:8080
Authentication
No authentication required for the current version. The API is stateless and uses the server’s OpenAI API key for embeddings and generation.
Endpoints
GET /health
Health check endpoint with index status.
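As a minimal sketch of how a deploy script might use this endpoint, the helper below polls `/health` until the service reports `status: "ok"` and `index_ready: true`, the fields listed in the response schema below. The function names, the retry count, and the delay are illustrative assumptions, not part of the API.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://devops-rag-449012790678.asia-southeast1.run.app"


def fetch_health(base_url: str = BASE_URL, timeout: float = 10.0) -> dict:
    """GET /health and return the parsed JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return json.load(resp)


def is_ready(health: dict) -> bool:
    """True when the service reports status "ok" and a ready index."""
    return health.get("status") == "ok" and health.get("index_ready") is True


def wait_until_ready(
    fetch: Callable[[], dict] = fetch_health,
    attempts: int = 5,          # hypothetical default, tune as needed
    delay_s: float = 2.0,       # hypothetical default, tune as needed
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the health endpoint until it is ready or the attempts run out."""
    for attempt in range(attempts):
        if is_ready(fetch()):
            return True
        if attempt < attempts - 1:
            sleep(delay_s)  # wait before the next poll, but not after the last one
    return False
```

Calling `wait_until_ready()` with no arguments polls the production URL; passing a custom `fetch` makes the helper easy to exercise without network access.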
Response Schema
{
"status": "ok" | "error",
"index_ready": boolean,
"total_chunks": integer,
"total_sources": integer
}
Example
curl https://devops-rag-449012790678.asia-southeast1.run.app/health
Response:
{
"status": "ok",
"index_ready": true,
"total_chunks": 45,
"total_sources": 18
}
POST /ask
Main query endpoint. Retrieves relevant context from runbooks and generates an answer with citations.
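The sketch below shows one way a client might build the request body and pick out the most relevant citations, using the field names from the request and response schemas that follow. The client-side checks (non-empty question, `top_k >= 1`) and the `min_relevance` threshold of 0.8 are assumptions for illustration; the server may apply different validation rules.

```python
import json
import urllib.request

ASK_URL = "https://devops-rag-449012790678.asia-southeast1.run.app/ask"


def build_ask_payload(question: str, top_k: int = 5, verbose: bool = False) -> dict:
    """Build the /ask request body, rejecting obviously invalid input early."""
    if not question.strip():
        raise ValueError("question must be a non-empty string")
    if top_k < 1:  # assumed lower bound, not confirmed by the API
        raise ValueError("top_k must be at least 1")
    return {"question": question, "top_k": top_k, "verbose": verbose}


def ask(question: str, top_k: int = 5, timeout: float = 30.0) -> dict:
    """POST the question to /ask and return the parsed JSON response."""
    body = json.dumps(build_ask_payload(question, top_k)).encode("utf-8")
    request = urllib.request.Request(
        ASK_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


def strong_citations(response: dict, min_relevance: float = 0.8) -> list:
    """Return citations at or above min_relevance, highest relevance first."""
    kept = [c for c in response.get("citations", []) if c["relevance"] >= min_relevance]
    return sorted(kept, key=lambda c: c["relevance"], reverse=True)
```

A typical call would be `strong_citations(ask("How do I fix CrashLoopBackOff in Kubernetes?"))`, which keeps only the excerpts the retriever scored highly.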
Request Schema
{
"question": string, // Required: Your question
"top_k": integer, // Optional: Number of chunks to retrieve (default: 5)
"verbose": boolean // Optional: Include debug info (default: false)
}
Response Schema
{
"answer": string, // Generated answer with markdown
"sources": [string], // List of source document names
"citations": [ // Detailed citation info
{
"source": string, // Source document name
"relevance": float, // Cosine similarity score (0-1)
"excerpt": string, // Relevant text excerpt
"chunk_id": string // Unique chunk identifier
}
],
"context_chunks": integer, // Number of chunks used
"top_score": float, // Highest relevance score
"avg_score": float, // Average relevance score
"latency_ms": float // Query processing time
}
Examples
Basic Query:
curl -X POST https://devops-rag-449012790678.asia-southeast1.run.app/ask \
-H "Content-Type: application/json" \
-d '{
"question": "How do I fix CrashLoopBackOff in Kubernetes?"
}'
Advanced Query with Options:
curl -X POST https://devops-rag-449012790678.asia-southeast1.run.app/ask \
-H "Content-Type: application/json" \
-d '{
"question": "What are the best practices for database backup?",
"top_k": 3,
"verbose": true
}'
Response Example:
{
"answer": "## Database Backup Best Practices\n\n1. **Automated Scheduled Backups**:\n - Set up cron jobs for regular backups\n - Use database-specific tools (pg_dump for PostgreSQL, mysqldump for MySQL)\n - Store backups in multiple locations\n\n2. **Test Backup Integrity**:\n ```bash\n # Test PostgreSQL backup\n pg_restore --list backup.sql\n ```\n\n3. **Retention Policy**:\n - Keep daily backups for 30 days\n - Keep weekly backups for 3 months\n - Keep monthly backups for 1 year",
"sources": ["05-database-operations.md"],
"citations": [
{
"source": "05-database-operations.md",
"relevance": 0.8756,
"excerpt": "Database backup is critical for disaster recovery. Always test your backups by restoring to a test environment...",
"chunk_id": "05-database-operations.md:2"
},
{
"source": "05-database-operations.md",
"relevance": 0.8234,
"excerpt": "Backup retention should follow the 3-2-1 rule: 3 copies of data, 2 different media types, 1 offsite...",
"chunk_id": "05-database-operations.md:4"
}
],
"context_chunks": 3,
"top_score": 0.8756,
"avg_score": 0.8495,
"latency_ms": 1247.8
}
GET /stats
Returns vector index statistics and metadata.
Response Schema
{
"total_chunks": integer, // Total indexed chunks
"total_sources": integer, // Number of source documents
"sources": [string], // List of all source files
"embedding_model": string, // Current embedding model
"chunk_config": { // Chunking configuration
"chunk_size": integer,
"chunk_overlap": integer
}
}
Example
curl https://devops-rag-449012790678.asia-southeast1.run.app/stats
Response:
{
"total_chunks": 45,
"total_sources": 18,
"sources": [
"01-kubernetes-troubleshooting.md",
"02-deployment-rollback.md",
"03-linux-server-maintenance.md",
"04-monitoring-alerting.md",
"05-database-operations.md"
],
"embedding_model": "text-embedding-3-small",
"chunk_config": {
"chunk_size": 512,
"chunk_overlap": 64
}
}
GET /sources
Lists all ingested runbook sources.
Response Schema
{
"sources": [string], // Array of source document names
"total": integer // Total count
}
Example
curl https://devops-rag-449012790678.asia-southeast1.run.app/sources
Response:
{
"sources": [
"01-kubernetes-troubleshooting.md",
"02-deployment-rollback.md",
"03-linux-server-maintenance.md",
"04-monitoring-alerting.md",
"05-database-operations.md",
"06-datadog-apm-troubleshooting.md",
"07-aws-incident-response.md",
"08-docker-container-ops.md",
"09-terraform-infrastructure.md",
"10-networking-dns-troubleshooting.md",
"11-kubernetes-networking.md",
"12-cicd-pipeline.md",
"13-prometheus-grafana.md",
"14-kubernetes-rbac-security.md",
"15-datadog-infrastructure-monitoring.md",
"16-elasticsearch-logging.md",
"17-aws-eks-operations.md",
"18-incident-management.md"
],
"total": 18
}
GET /graph
Returns knowledge graph visualization data showing relationships between runbooks and concepts.
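To illustrate how the `nodes` and `edges` arrays in the schema below might be consumed, this sketch turns a `/graph` response into an adjacency map with the strongest relationships first. Treating edges as undirected is an assumption; the API does not state whether `source` and `target` imply a direction.

```python
from collections import defaultdict


def build_adjacency(graph: dict) -> dict:
    """Map each node id to (neighbour_id, weight) pairs, strongest first.

    Edges are treated as undirected (an assumption, not confirmed by the API).
    """
    adjacency = defaultdict(list)
    for node in graph.get("nodes", []):
        adjacency[node["id"]]  # touch the key so isolated nodes still appear
    for edge in graph.get("edges", []):
        adjacency[edge["source"]].append((edge["target"], edge["weight"]))
        adjacency[edge["target"]].append((edge["source"], edge["weight"]))
    return {
        node_id: sorted(neighbours, key=lambda pair: pair[1], reverse=True)
        for node_id, neighbours in adjacency.items()
    }
```

With this map, `build_adjacency(graph)[node_id][:3]` gives the three concepts or documents most strongly linked to a node.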
Response Schema
{
"nodes": [
{
"id": string, // Unique node identifier
"label": string, // Display name
"type": "document" | "concept",
"size": integer // Relative importance
}
],
"edges": [
{
"source": string, // Source node ID
"target": string, // Target node ID
"weight": float // Relationship strength (0-1)
}
]
}
Example
curl https://devops-rag-449012790678.asia-southeast1.run.app/graph
GET /clusters
Returns topic clusters discovered from the knowledge base.
Response Schema
{
"clusters": [
{
"id": integer,
"name": string, // Cluster theme/topic
"documents": [string], // Documents in this cluster
"keywords": [string], // Key terms
"coherence_score": float // Cluster quality (0-1)
}
]
}
Example
curl https://devops-rag-449012790678.asia-southeast1.run.app/clusters
GET /patterns
Returns common operational patterns detected across runbooks.
Response Schema
{
"patterns": [
{
"name": string, // Pattern name
"frequency": integer, // How often it appears
"documents": [string], // Where it appears
"description": string // What it represents
}
]
}
Example
curl https://devops-rag-449012790678.asia-southeast1.run.app/patterns
Error Responses
All endpoints return errors in this format:
{
"detail": string | array // Error description; validation errors return a list of error objects
}
Common HTTP Status Codes
- 200: Success
- 400: Bad Request (invalid JSON, missing required fields)
- 422: Validation Error (invalid parameter types)
- 500: Internal Server Error (OpenAI API issues, index corruption)
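Because the example error in this section carries `detail` as a list of objects rather than a plain string, a client may want to handle both shapes. The helper below is a minimal sketch of that; the function name and the output format are hypothetical.

```python
def describe_error(status_code: int, payload: dict) -> str:
    """Render an error body as one line, whether detail is a string or a list."""
    detail = payload.get("detail", "unknown error")
    if isinstance(detail, str):
        return f"HTTP {status_code}: {detail}"
    parts = []
    for item in detail:
        # "loc" is a path such as ["body", "question"]; join it for display
        location = ".".join(str(part) for part in item.get("loc", []))
        parts.append(f"{location}: {item.get('msg', '')}")
    return f"HTTP {status_code}: " + "; ".join(parts)
```

Passing the status code in explicitly keeps the helper independent of any particular HTTP library.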
Example Error
curl -X POST https://devops-rag-449012790678.asia-southeast1.run.app/ask \
-H "Content-Type: application/json" \
-d '{}'
Response (400):
{
"detail": [
{
"type": "missing",
"loc": ["body", "question"],
"msg": "Field required"
}
]
}
Rate Limits
No explicit rate limits are enforced, but throughput is bounded by:
- OpenAI API rate limits (varies by plan)
- Cloud Run concurrent request limits (configurable up to 1000 per instance; the platform default is 80)
Average query latency is roughly 500 ms.
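Since upstream limits can cause transient failures, a client may want to retry with exponential backoff. The sketch below is one way to do that under stated assumptions: the set of retryable status codes (429, 500, 503), the attempt count, and the base delay are illustrative choices, not values documented by this API.

```python
import random
import time
from typing import Callable, Iterable


def call_with_backoff(
    request: Callable[[], tuple],
    retry_statuses: Iterable[int] = (429, 500, 503),  # assumed retryable codes
    max_attempts: int = 4,                            # hypothetical default
    base_delay_s: float = 0.5,                        # hypothetical default
    sleep: Callable[[float], None] = time.sleep,
) -> tuple:
    """Call request() until it returns a non-retryable status or attempts run out.

    request() must return a (status_code, body) tuple.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    retry_on = set(retry_statuses)
    for attempt in range(max_attempts):
        status, body = request()
        last_attempt = attempt == max_attempts - 1
        if status not in retry_on or last_attempt:
            return status, body
        delay = base_delay_s * (2 ** attempt)        # 0.5 s, 1 s, 2 s, ...
        sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out retries
    raise RuntimeError("unreachable")
```

The `request` callable can wrap any HTTP client; injecting `sleep` keeps the retry logic testable without real delays.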
OpenAPI Specification
Interactive API documentation is available at:
- Swagger UI: https://devops-rag-449012790678.asia-southeast1.run.app/docs
- ReDoc: https://devops-rag-449012790678.asia-southeast1.run.app/redoc
Client Examples
Python
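import requests

def ask_devops_rag(question, top_k=5):
    """Send a question to /ask and return the parsed JSON response."""
    response = requests.post(
        "https://devops-rag-449012790678.asia-southeast1.run.app/ask",
        json={"question": question, "top_k": top_k},
        timeout=30,  # avoid hanging indefinitely on a slow response
    )
    response.raise_for_status()  # raise on 4xx/5xx instead of returning an error body
    return response.json()

# Usage
result = ask_devops_rag("How do I scale a Kubernetes deployment?")
print(result["answer"])
JavaScript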
async function askDevOpsRAG(question, topK = 5) {
  const response = await fetch(
    'https://devops-rag-449012790678.asia-southeast1.run.app/ask',
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ question, top_k: topK })
    }
  );
  // Surface HTTP errors instead of treating an error body as a result
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
// Usage (top-level await requires an ES module)
const result = await askDevOpsRAG('How do I troubleshoot high CPU usage?');
console.log(result.answer);
cURL
#!/bin/bash
ask_rag() {
  # Build the JSON body with jq so quotes and backslashes in the question are escaped
  jq -n --arg q "$1" '{question: $q}' |
    curl -s -X POST https://devops-rag-449012790678.asia-southeast1.run.app/ask \
      -H "Content-Type: application/json" \
      -d @- | jq -r '.answer'
}
# Usage
ask_rag "How do I check disk usage on Linux?"