ADR-045: nginx Reverse Proxy
Status
Implemented
Date
2025-01-16 (Retrospective)
Decision Makers
- DevOps Team - Production architecture
- Security Team - TLS termination
Layer
Infrastructure
Related ADRs
- ADR-010: Docker Compose Development
Supersedes
None
Depends On
None
Context
Production deployment needs a reverse proxy for:
- TLS Termination: Handle HTTPS
- Load Balancing: Distribute traffic
- Static Files: Serve frontend efficiently
- API Routing: Route to backend
- Security: Rate limiting, headers
Requirements:
- HTTPS with modern TLS
- Efficient static file serving
- API proxying to backend
- Security headers
- Access logging
Decision
We use nginx as the production reverse proxy.
Key Design Decisions
- nginx: Battle-tested reverse proxy
- TLS 1.2+: Only TLSv1.2 and TLSv1.3 are enabled, with ECDHE AES-GCM cipher suites
- Static Serving: Frontend from /usr/share/nginx/html
- API Proxy: /api/* to backend service
- Security Headers: HSTS, CSP, X-Frame-Options (note: the configuration below does not yet set a Content-Security-Policy header)
Configuration
# nginx.conf
server {
    listen 443 ssl http2;
    server_name ops.example.com;
    # TLS Configuration
    ssl_certificate /etc/nginx/ssl/cert.pem;
    ssl_certificate_key /etc/nginx/ssl/key.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    ssl_prefer_server_ciphers off;
    # Security Headers
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;
    # Frontend static files
    location / {
        root /usr/share/nginx/html;
        # Fall back to index.html so client-side (SPA) routes resolve
        try_files $uri $uri/ /index.html;
        # Cache static assets
        location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg|woff|woff2)$ {
            expires 1y;
            add_header Cache-Control "public, immutable";
            # An add_header at this level stops inheritance of the server-level
            # headers, so the security headers are repeated here.
            add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
            add_header X-Frame-Options "SAMEORIGIN" always;
            add_header X-Content-Type-Options "nosniff" always;
            add_header X-XSS-Protection "1; mode=block" always;
        }
    }
    # API proxy
    location /api/ {
        # No URI part on proxy_pass, so the original /api/... path is forwarded unchanged
        proxy_pass http://backend:8888;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # Timeouts
        proxy_connect_timeout 60s;
        proxy_read_timeout 120s;
        proxy_send_timeout 60s;
    }
    # Health check endpoint
    location /health {
        proxy_pass http://backend:8888;
    }
    # Gzip compression
    gzip on;
    gzip_types text/plain application/json application/javascript text/css;
    gzip_min_length 1000;
}
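The Docker configuration publishes port 80 as well as 443, while the server block above listens only on 443. The ADR does not say what port 80 should do. A minimal sketch, assuming the intent is to redirect plain HTTP to HTTPS (consistent with the HSTS header), could look like this:

```nginx
# Assumption: port 80 exists only to redirect to HTTPS.
server {
    listen 80;
    server_name ops.example.com;

    # Permanent redirect, preserving the requested host, path, and query string
    return 301 https://$host$request_uri;
}
```

This block would sit alongside the HTTPS server block in the same file.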
Docker Configuration
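# docker-compose.yml (entry under services:)
nginx:
  image: nginx:alpine
  ports:
    - "443:443"
    - "80:80"
  volumes:
    # The server block above is not a complete main config (no events/http
    # blocks), so it is mounted into conf.d, which the image's stock
    # nginx.conf includes inside its http block.
    - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
    - ./frontend/dist:/usr/share/nginx/html:ro
    - ./ssl:/etc/nginx/ssl:ro
  depends_on:
    - backend
    - frontend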
Consequences
Positive
- Performance: Efficient static serving
- Security: TLS termination, security headers
- Caching: Long-lived asset caching
- Load Balancing: Built-in upstream support
- Battle Tested: Proven at scale
Negative
- Configuration: nginx config syntax learning curve
- Cert Management: Must rotate certificates
- Additional Component: Another thing to monitor
- Debug Complexity: Proxy layer adds complexity
Neutral
- Alternatives: Traefik, Caddy are options
- Cloud Options: ALB/Cloud Load Balancer could replace
Alternatives Considered
1. Direct Backend Exposure
- Approach: Backend serves everything
- Rejected: Less efficient, no TLS termination
2. Traefik
- Approach: Modern reverse proxy
- Rejected: More complex, less documentation
3. Cloud Load Balancer Only
- Approach: AWS ALB / GCP Load Balancer
- Rejected: Vendor lock-in, less control
Implementation Status
- Core implementation complete
- Tests written and passing
- Documentation updated
- Migration/upgrade path defined
- Monitoring/observability in place
Implementation Details
- Config: docker/nginx/nginx.conf
- Docker: docker-compose.yml
- SSL: docker/nginx/ssl/ (generated or mounted)
- Frontend Build: frontend/dist/
LLM Council Review
Review Date: 2025-01-16
Confidence Level: High (100%)
Verdict: APPROVED
Quality Metrics
- Consensus Strength Score (CSS): 0.90
- Deliberation Depth Index (DDI): 0.85
Council Feedback Summary
nginx is the correct choice for an SRE Operations Platform. The configuration covers TLS, security headers, and caching. The council also identified specific operational gaps for SRE workloads.
Key Concerns Identified:
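- Large File Uploads: SRE platforms need log/artifact uploads; nginx's default client_max_body_size of 1m rejects larger request bodies with a 413 error
- Long-Running Queries: The 120s proxy_read_timeout configured above (nginx's default is 60s) will cut off long-running SLO calculation queries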
- WebSocket/SSE: Configuration doesn't address MCP SSE or real-time features
- Certificate Management: No automation for certificate rotation
Required Modifications:
- Increase Body Size: client_max_body_size 100m; for log uploads
- Extend Timeouts:
    proxy_read_timeout 300s; # 5 minutes for long queries
    proxy_send_timeout 300s;
- SSE/WebSocket Support (the block below covers SSE; WebSocket additionally needs the Upgrade and Connection headers forwarded):
    location /api/v1/events {
        proxy_pass http://backend:8888; # same backend as /api/
        proxy_buffering off;
        proxy_cache off;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
- SPA Routing Fix: Ensure try_files $uri $uri/ /index.html handles all routes
- Rate Limiting: Add limit_req_zone for API abuse protection
- Certificate Automation: Use certbot/acme.sh or cloud-managed certs
- Request ID: Add $request_id header for distributed tracing
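The rate limiting, request ID, and WebSocket items above are named without configuration. The following is a sketch of how they might be expressed, not the deployed configuration. The zone name api_limit, the rates and burst size, the X-Request-ID header name, and the /api/v1/ws path are all assumptions chosen for illustration. It also assumes the file is included inside nginx's http context (for example from conf.d/), because limit_req_zone and map are only valid there.

```nginx
# --- http context ---
# Shared 10 MB zone keyed by client address; each client gets 10 requests/second.
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

# Send "Connection: upgrade" upstream only when the client requested an upgrade.
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

server {
    listen 443 ssl;
    server_name ops.example.com;
    ssl_certificate     /etc/nginx/ssl/cert.pem;
    ssl_certificate_key /etc/nginx/ssl/key.pem;

    # Return the generated ID to clients. Kept at server level with the other
    # add_header lines, since add_header inside a location replaces inherited ones.
    add_header X-Request-ID $request_id always;

    location /api/ {
        # Allow short bursts of up to 20 extra requests, then answer 429.
        limit_req zone=api_limit burst=20 nodelay;
        limit_req_status 429;

        proxy_set_header Host $host;
        # Forward the same ID to the backend so logs can be correlated.
        proxy_set_header X-Request-ID $request_id;
        proxy_pass http://backend:8888;
    }

    # Hypothetical WebSocket endpoint; the longer prefix wins over /api/.
    location /api/v1/ws {
        proxy_pass http://backend:8888;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_read_timeout 3600s;  # keep idle sockets open for up to an hour
    }
}
```

In a real deployment these directives would be merged into the existing server block rather than added as a second one. Logging $request_id in the access log format would let proxy and backend log lines be joined on the same value.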
Modifications Applied
- Documented large file upload configuration
- Added extended timeout requirements
- Documented SSE/WebSocket configuration
- Added rate limiting recommendation
- Documented certificate automation options
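For the certbot option, one common arrangement is the ACME HTTP-01 challenge served from a webroot directory. The sketch below assumes a port-80 server block is acceptable and that /var/www/certbot (a hypothetical path) is shared between nginx and the ACME client. It would take the place of any plain redirect-only block on port 80.

```nginx
server {
    listen 80;
    server_name ops.example.com;

    # Challenge files written by the ACME client are served over plain HTTP.
    location /.well-known/acme-challenge/ {
        root /var/www/certbot;  # hypothetical webroot shared with the client
    }

    # Everything else is redirected to HTTPS.
    location / {
        return 301 https://$host$request_uri;
    }
}
```

With this in place, a command such as certbot certonly --webroot -w /var/www/certbot -d ops.example.com could obtain and renew the certificate. nginx would then need a reload (nginx -s reload) after each renewal, and ssl_certificate would have to point at the files the client writes.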
Council Ranking
- gpt-5.2: Best Response (SRE workloads)
- gemini-3-pro: Strong (caching)
- claude-opus-4.5: Good (security)
References
ADR-045 | Infrastructure Layer | Implemented