Configuration Guide
Comprehensive guide for configuring RAKṢĀ scanner settings, custom rules, and runtime behavior.
Environment Variables
Core Settings
| Variable | Default | Description |
|---|---|---|
| PORT | 8080 | HTTP server port |
| SCAN_UPLOAD_DIR | /tmp/raksha-scans | Temporary scan directory |
| MAX_UPLOAD_MB | 50 | Maximum upload file size (MB) |
| WORKERS | 1 | Number of uvicorn workers |
Scanner Configuration
| Variable | Default | Description |
|---|---|---|
| ENABLE_SEMGREP | auto | Enable Semgrep scanner (auto, true, false) |
| ENABLE_BANDIT | auto | Enable Bandit scanner (auto, true, false) |
| ENABLE_PATTERNS | true | Enable built-in pattern scanner |
| SCAN_TIMEOUT | 300 | Maximum scan duration (seconds) |
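
The table does not spell out what `auto` means for `ENABLE_SEMGREP` and `ENABLE_BANDIT`. One plausible reading, and the assumption made in this sketch, is that `auto` enables a scanner only when its executable is found on `PATH`. The helper names `resolve_scanner_flag` and `enabled_scanners` are hypothetical and not part of RAKṢĀ.

```python
import os
import shutil
from typing import Callable, Dict, Mapping, Optional


def resolve_scanner_flag(
    value: str,
    binary: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> bool:
    """Turn an auto/true/false setting into a boolean.

    Assumption: "auto" means "enable when the executable is on PATH".
    """
    normalized = value.strip().lower()
    if normalized == "true":
        return True
    if normalized == "false":
        return False
    if normalized == "auto":
        return which(binary) is not None
    raise ValueError(f"expected auto, true, or false, got {value!r}")


def enabled_scanners(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, bool]:
    """Read the ENABLE_* variables, falling back to the defaults in the table."""
    return {
        "semgrep": resolve_scanner_flag(env.get("ENABLE_SEMGREP", "auto"), "semgrep", which),
        "bandit": resolve_scanner_flag(env.get("ENABLE_BANDIT", "auto"), "bandit", which),
        # ENABLE_PATTERNS is a plain boolean in the table, so no PATH lookup is needed.
        "patterns": env.get("ENABLE_PATTERNS", "true").strip().lower() == "true",
    }
```

Passing `which` as a parameter keeps the lookup replaceable in unit tests, so the result does not depend on what is installed on the test machine.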
Datadog APM
| Variable | Default | Description |
|---|---|---|
| DD_API_KEY | - | Datadog API key for APM |
| DD_SERVICE | avyay-raksha | Service name in Datadog |
| DD_ENV | production | Environment tag |
| DD_VERSION | 1.0.0 | Version tag |
| DD_AGENT_HOST | localhost | Datadog Agent hostname |
Advanced Settings
| Variable | Default | Description |
|---|---|---|
| LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARN, ERROR) |
| CORS_ORIGINS | ["*"] | Allowed CORS origins (JSON array) |
| RESULT_TTL_HOURS | 24 | How long to keep scan results |
| GITHUB_TOKEN | - | GitHub token for private repository access |
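
Every value in these tables reaches the process as a string, so numeric settings and the JSON-encoded `CORS_ORIGINS` need parsing before use. The following is a minimal sketch of that step using the defaults listed above; the `Settings` class and `load_settings` function are hypothetical names for illustration, not RAKṢĀ's actual loader.

```python
import json
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Settings:
    port: int
    max_upload_bytes: int
    scan_timeout: int
    log_level: str
    cors_origins: List[str]
    result_ttl_hours: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Parse environment strings into typed values, using the documented defaults."""
    # CORS_ORIGINS is documented as a JSON array, e.g. '["https://example.com"]'.
    origins = json.loads(env.get("CORS_ORIGINS", '["*"]'))
    if not isinstance(origins, list) or not all(isinstance(o, str) for o in origins):
        raise ValueError("CORS_ORIGINS must be a JSON array of strings")
    return Settings(
        port=int(env.get("PORT", "8080")),
        # MAX_UPLOAD_MB is given in megabytes; convert once to bytes.
        max_upload_bytes=int(env.get("MAX_UPLOAD_MB", "50")) * 1024 * 1024,
        scan_timeout=int(env.get("SCAN_TIMEOUT", "300")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        cors_origins=origins,
        result_ttl_hours=int(env.get("RESULT_TTL_HOURS", "24")),
    )
```

A malformed value such as `CORS_ORIGINS=*` (not valid JSON) raises `ValueError` at startup instead of surfacing later as a confusing CORS failure.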
Configuration File
Create /app/config.yaml for complex configurations:
# RAKṢĀ Configuration
scanner:
  engines:
    patterns:
      enabled: true
      severity_threshold: "medium"
    semgrep:
      enabled: true
      config: "auto"  # or path to custom config
      timeout: 120
    bandit:
      enabled: true
      config_file: "/app/rules/bandit.yaml"
      severity_threshold: "low"
  file_filters:
    max_size_mb: 10
    include_patterns:
      - "*.py"
      - "*.js"
      - "*.ts"
      - "*.java"
      - "*.go"
      - "*.php"
      - "*.rb"
      - "*.cpp"
      - "*.c"
    exclude_patterns:
      - "*/node_modules/*"
      - "*/venv/*"
      - "*/vendor/*"
      - "*.min.js"
      - "*.test.*"
      - "*/tests/*"
  custom_rules:
    directory: "/app/rules"
    auto_reload: true
server:
  max_concurrent_scans: 10
  result_cache_size: 1000
  temp_cleanup_interval: 3600  # seconds
logging:
  level: "INFO"
  format: "json"
  datadog: true
security:
  rate_limit:
    requests_per_minute: 60
    burst_size: 10
  upload:
    allowed_extensions: [".zip", ".tar", ".tar.gz", ".tgz"]
    virus_scanning: false  # Enable with ClamAV
Load configuration:
docker run -v /path/to/config.yaml:/app/config.yaml ghcr.io/gaurav21/raksha:latest
Custom Security Rules
Pattern-Based Rules
Create custom YAML rule files in /app/rules/:
/app/rules/custom-patterns.yaml
# Custom Security Patterns for RAKṢĀ
patterns:
  - id: "hardcoded-aws-keys"
    title: "Hardcoded AWS Credentials"
    description: "AWS access keys or secret keys found in source code"
    severity: "critical"
    pattern: "(AKIA[0-9A-Z]{16}|aws_secret_access_key\\s*=\\s*['\"][0-9a-zA-Z/+=]{40}['\"])"
    confidence: "high"
    cwe: "CWE-798"
    languages: ["*"]
    remediation: |
      Move AWS credentials to environment variables or AWS IAM roles:
      - Use AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables
      - For EC2/Lambda, use IAM roles instead of hardcoded keys
      - Consider AWS Secrets Manager for secure credential storage
    references:
      - "https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html"
  - id: "sql-injection-concat"
    title: "SQL Injection via String Concatenation"
    description: "SQL query constructed using string concatenation"
    severity: "high"
    pattern: "(query|sql)\\s*[+]=?\\s*['\"].+['\"]\\s*\\+\\s*(\\w+|\\$\\{\\w+\\})"
    confidence: "medium"
    cwe: "CWE-89"
    languages: ["java", "javascript", "python", "php"]
    remediation: |
      Use parameterized queries instead of string concatenation:
      - Java: PreparedStatement with placeholders
      - Python: Use SQLAlchemy or parameterized queries
      - PHP: PDO prepared statements
      - JavaScript: Use query builders like Knex.js
  - id: "weak-crypto-md5"
    title: "Weak Cryptographic Hash (MD5)"
    description: "MD5 hash function usage detected"
    severity: "medium"
    pattern: "(md5|MD5)\\s*\\("
    confidence: "medium"
    cwe: "CWE-327"
    languages: ["*"]
    remediation: |
      Replace MD5 with stronger hash functions:
      - Use SHA-256 or SHA-3 for data integrity
      - Use bcrypt, scrypt, or Argon2 for password hashing
      - Consider HMAC for message authentication
  - id: "debug-mode-enabled"
    title: "Debug Mode Enabled in Production"
    description: "Application debug mode appears to be enabled"
    severity: "low"
    pattern: "(DEBUG\\s*=\\s*[Tt]rue|debug\\s*:\\s*true|--debug|development\\s*mode)"
    confidence: "low"
    cwe: "CWE-489"
    languages: ["*"]
    remediation: |
      Disable debug mode in production:
      - Set DEBUG=False in Django settings
      - Use NODE_ENV=production for Node.js
      - Remove --debug flags from production commands
  - id: "jwt-weak-secret"
    title: "Weak JWT Secret"
    description: "JWT signed with weak or default secret"
    severity: "high"
    pattern: "(jwt|JWT).*(secret|key)\\s*[=:]\\s*['\"]?(secret|key|password|123|abc)['\"]?"
    confidence: "medium"
    cwe: "CWE-798"
    languages: ["*"]
    remediation: |
      Use strong, unique JWT secrets:
      - Generate cryptographically secure random secrets (256+ bits)
      - Store secrets in environment variables or secure vaults
      - Rotate secrets regularly
      - Consider using RS256 (public key crypto) instead of HS256
file_types:
  python: [".py"]
  javascript: [".js", ".ts", ".jsx", ".tsx"]
  java: [".java"]
  php: [".php", ".phtml"]
  go: [".go"]
  ruby: [".rb"]
  cpp: [".cpp", ".cc", ".cxx", ".hpp", ".h"]
  csharp: [".cs"]
  all: ["*"]
Organization-Specific Rules
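
Before deploying a pattern rule, whether general or organization-specific, it helps to check its regex against strings that should and should not match. The sketch below does this with Python's `re` module for two of the patterns from `custom-patterns.yaml` above. It assumes RAKṢĀ evaluates patterns with a regex engine compatible with Python's, which this guide does not confirm; `RULES` and `matching_rules` are hypothetical names.

```python
import re

# Patterns as they read after YAML parsing: inside a double-quoted YAML string
# "\\s" becomes \s and \" becomes a plain double quote.
RULES = {
    "hardcoded-aws-keys": r"""(AKIA[0-9A-Z]{16}|aws_secret_access_key\s*=\s*['"][0-9a-zA-Z/+=]{40}['"])""",
    "weak-crypto-md5": r"(md5|MD5)\s*\(",
}


def matching_rules(line: str) -> list:
    """Return the ids of every rule whose regex matches somewhere in the line."""
    return [rule_id for rule_id, pattern in RULES.items() if re.search(pattern, line)]


if __name__ == "__main__":
    samples = [
        'key = "AKIAIOSFODNN7EXAMPLE"',   # AWS's documented example key id
        "digest = hashlib.md5(data)",
        "digest = hashlib.sha256(data)",  # should match nothing
    ]
    for sample in samples:
        print(sample, "->", matching_rules(sample))
```

Keeping a small file of positive and negative samples per rule makes it easier to notice when a later regex edit starts producing false positives.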
# /app/rules/company-security.yaml
patterns:
  - id: "company-api-key-leak"
    title: "Company API Key Exposed"
    description: "Internal API key pattern detected"
    severity: "critical"
    pattern: "(COMPANY_API_[A-Z0-9_]{20,}|cp_[a-f0-9]{32})"
    confidence: "high"
    cwe: "CWE-798"
  - id: "deprecated-crypto-lib"
    title: "Deprecated Cryptography Library"
    description: "Usage of deprecated crypto library detected"
    severity: "medium"
    pattern: "(import\\s+pycrypto|from\\s+Crypto|require\\(['\"]crypto['\"])"
    confidence: "medium"
    cwe: "CWE-327"
    remediation: |
      Replace deprecated crypto libraries:
      - Python: Use cryptography library instead of pycrypto
      - Node.js: Use built-in crypto module carefully
      - Consider higher-level libraries like libsodium
  - id: "admin-interface-exposed"
    title: "Admin Interface Potentially Exposed"
    description: "Admin URLs or interfaces found in client-side code"
    severity: "medium"
    pattern: "(/admin|/administrator|/wp-admin|\\.admin\\.|admin_panel)"
    confidence: "low"
    languages: ["javascript", "html"]
    remediation: |
      Secure admin interfaces:
      - Use separate subdomains for admin interfaces
      - Implement IP whitelisting
      - Add additional authentication layers
      - Avoid exposing admin URLs in client-side code
Semgrep Custom Rules
Create /app/rules/semgrep-custom.yaml:
rules:
  - id: company-specific-sql-injection
    pattern-either:
      - pattern: |
          $QUERY = "..." + $VAR + "..."
          $DB.execute($QUERY)
      - pattern: |
          $DB.query(f"...{$VAR}...")
    message: "SQL injection vulnerability via string formatting"
    severity: ERROR
    languages: [python]
    metadata:
      cwe: "CWE-89"
      owasp: "A03:2021 - Injection"
      confidence: MEDIUM
  - id: hardcoded-encryption-key
    patterns:
      - pattern-inside: |
          class $CLASS:
              ...
      - pattern: $KEY = "..."
      - pattern-not: $KEY = ""
      - metavariable-pattern:
          metavariable: $KEY
          patterns:
            - pattern-regex: ".*(key|secret|password|token).*"
    message: "Hardcoded encryption key detected"
    severity: ERROR
    languages: [python, java, javascript]
  - id: unsafe-redirect
    pattern-either:
      - pattern: redirect($URL)
      - pattern: window.location = $URL
      - pattern: res.redirect($URL)
    message: "Potential open redirect vulnerability"
    severity: WARNING
    languages: [python, javascript]
    metadata:
      cwe: "CWE-601"
Scanner-Specific Configuration
Semgrep Configuration
Create /app/rules/semgrep.yml:
# Semgrep configuration
rules:
  - security
  - owasp-top-ten
  - cwe-top-25
exclude:
  - "*.test.*"
  - "test_*"
  - "tests/"
  - "node_modules/"
  - ".git/"
max_target_bytes: 1000000  # 1MB per file
timeout: 30  # seconds per rule
# Custom rule paths
include:
  - /app/rules/semgrep-custom.yaml
Bandit Configuration
Create /app/rules/bandit.yaml:
# Bandit configuration for Python security scanning
tests:
# Tests to run (only the IDs listed here are enabled)
- B101 # assert_used
- B102 # exec_used
- B103 # set_bad_file_permissions
- B104 # hardcoded_bind_all_interfaces
- B105 # hardcoded_password_string
- B106 # hardcoded_password_funcarg
- B107 # hardcoded_password_default
- B108 # hardcoded_tmp_directory
- B110 # try_except_pass
- B112 # try_except_continue
- B201 # flask_debug_true
- B301 # pickle
- B302 # marshal
- B303 # md5
- B304 # des
- B305 # cipher
- B306 # mktemp_q
- B307 # eval
- B308 # mark_safe
- B309 # httpsconnection
- B310 # urllib_urlopen
- B311 # random
- B312 # telnetlib
- B313 # xml_bad_cElementTree
- B314 # xml_bad_ElementTree
- B315 # xml_bad_expatreader
- B316 # xml_bad_expatbuilder
- B317 # xml_bad_sax
- B318 # xml_bad_minidom
- B319 # xml_bad_pulldom
- B320 # xml_bad_etree
- B321 # ftplib
- B322 # input
- B323 # unverified_context
- B324 # hashlib_new_insecure_functions
- B325 # tempnam
- B401 # import_telnetlib
- B402 # import_ftplib
- B403 # import_pickle
- B404 # import_subprocess
- B405 # import_xml_etree
- B406 # import_xml_sax
- B407 # import_xml_expat
- B408 # import_xml_minidom
- B409 # import_xml_pulldom
- B410 # import_lxml
- B411 # import_xmlrpclib
- B412 # import_httpoxy
- B413 # import_pycrypto
- B501 # request_with_no_cert_validation
- B502 # ssl_with_bad_version
- B503 # ssl_with_bad_defaults
- B504 # ssl_with_no_version
- B505 # weak_cryptographic_key
- B506 # yaml_load
- B507 # ssh_no_host_key_verification
- B601 # paramiko_calls
- B602 # subprocess_popen_with_shell_equals_true
- B603 # subprocess_without_shell_equals_false
- B604 # any_other_function_with_shell_equals_true
- B605 # start_process_with_a_shell
- B606 # start_process_with_no_shell
- B607 # start_process_with_partial_path
- B608 # hardcoded_sql_expressions
- B609 # linux_commands_wildcard_injection
- B610 # django_extra_used
- B611 # django_rawsql_used
skips:
- "*/tests/*"
- "*/test_*.py"
- "*_test.py"
- "*/conftest.py"
# Exclude certain directories
exclude_dirs:
- '/tests'
- '/test'
- '/.venv'
- '/venv'
- '/env'
- '/.git'
- '/node_modules'
- '/__pycache__'
- '/.pytest_cache'
# Severity levels
severity:
  LOW: 0
  MEDIUM: 1
  HIGH: 2
  CRITICAL: 3
# Report format
formatter: json
Runtime Configuration
Docker Environment
# Development configuration
docker run -d \
  --name raksha-dev \
  -p 8430:8080 \
  -e LOG_LEVEL=DEBUG \
  -e ENABLE_SEMGREP=true \
  -e ENABLE_BANDIT=true \
  -e MAX_UPLOAD_MB=20 \
  -v "$(pwd)/rules:/app/rules:ro" \
  -v "$(pwd)/config.yaml:/app/config.yaml:ro" \
  ghcr.io/gaurav21/raksha:latest
# Production configuration
docker run -d \
  --name raksha-prod \
  -p 80:8080 \
  --restart=unless-stopped \
  -e LOG_LEVEL=INFO \
  -e SCAN_TIMEOUT=600 \
  -e RESULT_TTL_HOURS=48 \
  -e DD_API_KEY="${DD_API_KEY}" \
  -e DD_ENV=production \
  -v /opt/raksha/rules:/app/rules:ro \
  -v /var/tmp/raksha:/tmp/raksha-scans \
  ghcr.io/gaurav21/raksha:latest
Kubernetes ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: raksha-config
data:
  config.yaml: |
    scanner:
      engines:
        patterns:
          enabled: true
          severity_threshold: "medium"
        semgrep:
          enabled: true
          timeout: 120
        bandit:
          enabled: true
          severity_threshold: "low"
      file_filters:
        max_size_mb: 10
        exclude_patterns:
          - "*/node_modules/*"
          - "*/venv/*"
          - "*/vendor/*"
    server:
      max_concurrent_scans: 5
      result_cache_size: 500
    logging:
      level: "INFO"
      format: "json"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: raksha-rules
data:
  custom-patterns.yaml: |
    patterns:
      - id: "org-api-key"
        title: "Organization API Key"
        severity: "critical"
        pattern: "ORG_API_[A-Z0-9]{32}"
        # ... rest of rule
Mount in deployment:
spec:
  containers:
    - name: raksha
      volumeMounts:
        - name: config
          mountPath: /app/config.yaml
          subPath: config.yaml
        - name: rules
          mountPath: /app/rules
  volumes:
    - name: config
      configMap:
        name: raksha-config
    - name: rules
      configMap:
        name: raksha-rules
Severity Mapping
Custom Severity Levels
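
The `fail_build` thresholds in the severity-mapping file in this section read as a simple gate: the build fails as soon as the count for any listed severity, or the total count, reaches its limit. The sketch below expresses that reading in Python. The function name `should_fail_build` and the idea of passing findings as a list of severity strings are assumptions for illustration, not RAKṢĀ's actual interface.

```python
from collections import Counter
from typing import Iterable, Mapping

# Limits copied from the fail_build block: 1 critical, 3 high, or 10 findings overall.
FAIL_BUILD = {"critical": 1, "high": 3, "total": 10}


def should_fail_build(
    severities: Iterable[str],
    thresholds: Mapping[str, int] = FAIL_BUILD,
) -> bool:
    """Return True when any per-severity or total count reaches its limit."""
    counts = Counter(s.lower() for s in severities)
    total = sum(counts.values())
    for key, limit in thresholds.items():
        observed = total if key == "total" else counts[key]
        if observed >= limit:
            return True
    return False


if __name__ == "__main__":
    print(should_fail_build(["high", "high", "medium"]))  # two highs stay under the limit of 3
    print(should_fail_build(["critical"]))                # a single critical trips the gate
```

In a CI job, the return value would typically be turned into a non-zero exit code so the pipeline step fails.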
# /app/config/severity-mapping.yaml
severity_mapping:
  patterns:
    # Override default severities
    "hardcoded-password": "critical"
    "sql-injection": "critical"
    "xss-vulnerability": "high"
    "weak-crypto": "medium"
    "code-quality": "low"
  cwe_mapping:
    # Map CWE codes to severities
    "CWE-89": "critical"   # SQL Injection
    "CWE-79": "high"       # XSS
    "CWE-798": "critical"  # Hardcoded Credentials
    "CWE-327": "medium"    # Weak Crypto
    "CWE-22": "high"       # Path Traversal
  scanner_weights:
    # Weight findings by scanner confidence
    semgrep: 1.0
    bandit: 0.9
    patterns: 0.8
thresholds:
  fail_build:
    critical: 1  # Fail if any critical issues
    high: 3      # Fail if 3+ high issues
    total: 10    # Fail if 10+ total issues
  notification:
    critical: 1  # Alert on any critical
    high: 1      # Alert on any high
Performance Tuning
Scan Optimization
# /app/config/performance.yaml
scanning:
  parallel_files: 4      # Files to scan concurrently
  max_file_size_mb: 50   # Skip files larger than this
  timeout_per_file: 30   # Seconds
  memory_limit_mb: 2048  # Scanner memory limit
  file_type_limits:
    javascript: 100      # Max JS files to scan
    python: 200          # Max Python files
    "*": 500             # Max total files
caching:
  enable_file_hash_cache: true
  cache_duration_hours: 24
  max_cache_size_mb: 1024
threading:
  scanner_workers: 2     # Parallel scanner processes
  io_workers: 4          # File I/O workers
Resource Limits
# Docker Compose with resource limits
version: '3.8'
services:
  raksha:
    image: ghcr.io/gaurav21/raksha:latest
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 4G
        reservations:
          cpus: '0.5'
          memory: 1G
    environment:
      - SCAN_TIMEOUT=300
      - MAX_CONCURRENT_SCANS=3
    ulimits:
      nofile: 4096
      nproc: 2048
Monitoring Configuration
Logging Configuration
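
The keys used in the logging file in this section (`version`, `formatters`, `handlers`, `loggers`, `root`) follow the schema of Python's standard `logging.config.dictConfig`. The sketch below applies a reduced version of that structure directly from a Python dict, keeping only the console handler and the `detailed` formatter so that it runs with the standard library alone. It assumes the application passes the contents of the `logging:` key to `dictConfig`, which this guide does not state explicitly.

```python
import logging
import logging.config

# Reduced mirror of the YAML structure: one formatter, one handler, one named logger.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "detailed": {
            "format": "%(asctime)s - %(name)s - %(levelname)s - "
                      "%(filename)s:%(lineno)d - %(message)s",
        },
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "INFO",
            "formatter": "detailed",
            "stream": "ext://sys.stdout",
        },
    },
    "loggers": {
        "raksha": {"level": "DEBUG", "handlers": ["console"], "propagate": False},
    },
    "root": {"level": "INFO", "handlers": ["console"]},
}

logging.config.dictConfig(LOGGING)
log = logging.getLogger("raksha")
log.info("scanner configured")    # emitted: INFO passes the handler's level
log.debug("rule cache warmed")    # dropped: the console handler filters below INFO
```

Note the two-stage filtering: the `raksha` logger accepts `DEBUG` records, but each handler applies its own level, so only a handler set to `DEBUG` (the file handler in the full configuration) would record them.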
# /app/config/logging.yaml
logging:
  version: 1
  disable_existing_loggers: false
  formatters:
    json:
      class: pythonjsonlogger.jsonlogger.JsonFormatter
      format: '%(asctime)s %(name)s %(levelname)s %(message)s'
    detailed:
      format: '%(asctime)s - %(name)s - %(levelname)s - %(filename)s:%(lineno)d - %(message)s'
  handlers:
    console:
      class: logging.StreamHandler
      level: INFO
      formatter: json
      stream: ext://sys.stdout
    file:
      class: logging.handlers.RotatingFileHandler
      level: DEBUG
      formatter: detailed
      filename: /var/log/raksha/app.log
      maxBytes: 10485760  # 10MB
      backupCount: 3
    datadog:
      class: datadog.DogStatsdLogHandler
      level: WARNING
  loggers:
    raksha:
      level: DEBUG
      handlers: [console, file, datadog]
      propagate: false
    uvicorn:
      level: INFO
      handlers: [console]
      propagate: false
  root:
    level: INFO
    handlers: [console]
Metrics Configuration
# Custom metrics for monitoring
from datadog import statsd

# Track scan metrics. The @statsd.timed decorator already reports
# raksha.scan.duration, so the duration is not recorded a second time by hand.
@statsd.timed('raksha.scan.duration')
def run_scan(directory):
    try:
        # perform_scan is assumed to be defined elsewhere in the application
        result = perform_scan(directory)
        statsd.increment('raksha.scan.success')
        statsd.histogram('raksha.scan.files', result.total_files)
        statsd.histogram('raksha.scan.findings', result.total_findings)
        return result
    except Exception:
        statsd.increment('raksha.scan.error')
        raise
Next: CI/CD Integration Guide for automating security scanning in your development pipeline.