ADR-045: nginx Reverse Proxy

Status

Implemented

Date

2025-01-16 (Retrospective)

Decision Makers

  • DevOps Team - Production architecture
  • Security Team - TLS termination

Layer

Infrastructure

Related

  • ADR-010: Docker Compose Development

Supersedes

None

Depends On

None

Context

Production deployment needs a reverse proxy for:

  1. TLS Termination: Handle HTTPS
  2. Load Balancing: Distribute traffic
  3. Static Files: Serve frontend efficiently
  4. API Routing: Route to backend
  5. Security: Rate limiting, headers

Requirements:

  • HTTPS with modern TLS
  • Efficient static file serving
  • API proxying to backend
  • Security headers
  • Access logging

Decision

We use nginx as the production reverse proxy.

Key Design Decisions

  1. nginx: Battle-tested reverse proxy
  2. TLS 1.2+: Modern TLS configuration
  3. Static Serving: Frontend from /usr/share/nginx/html
  4. API Proxy: /api/* to backend service
  5. Security Headers: HSTS, CSP, X-Frame-Options

Configuration

# nginx.conf
server {
    listen 443 ssl http2;
    server_name ops.example.com;

    # TLS Configuration
    ssl_certificate /etc/nginx/ssl/cert.pem;
    ssl_certificate_key /etc/nginx/ssl/key.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    ssl_prefer_server_ciphers off;

    # Security Headers
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;

    # Frontend static files (SPA fallback to index.html)
    location / {
        root /usr/share/nginx/html;
        try_files $uri $uri/ /index.html;

        # Cache static assets
        # Note: add_header at this level suppresses inheritance of the
        # server-level headers; repeat security headers here if they are
        # required on asset responses.
        location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg|woff|woff2)$ {
            expires 1y;
            add_header Cache-Control "public, immutable";
        }
    }

    # API proxy
    location /api/ {
        proxy_pass http://backend:8888;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Timeouts
        proxy_connect_timeout 60s;
        proxy_read_timeout 120s;
        proxy_send_timeout 60s;
    }

    # Health check endpoint
    location /health {
        proxy_pass http://backend:8888;
    }

    # Gzip compression
    gzip on;
    gzip_types text/plain application/json application/javascript text/css;
    gzip_min_length 1000;
}
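The Docker configuration below publishes port 80 as well, but the server block above only listens on 443. A companion port-80 server is the usual way to close that gap; it is not part of the original config, so treat this as a sketch:

```nginx
# Redirect all plain-HTTP traffic to HTTPS (sketch; not in the original config)
server {
    listen 80;
    server_name ops.example.com;
    return 301 https://$host$request_uri;
}
```

Without it, requests to port 80 hit nginx's default server rather than being upgraded to HTTPS.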

Docker Configuration

# docker-compose.yml
nginx:
  image: nginx:alpine
  ports:
    - "443:443"
    - "80:80"
  volumes:
    # The file above is a bare server block, so mount it under conf.d
    # (mounting it as /etc/nginx/nginx.conf would fail: "server" is not
    # allowed at the top level of the main config)
    - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
    - ./frontend/dist:/usr/share/nginx/html:ro
    - ./ssl:/etc/nginx/ssl:ro
  depends_on:
    - backend
    - frontend

Consequences

Positive

  • Performance: Efficient static serving
  • Security: TLS termination, security headers
  • Caching: Long-lived asset caching
  • Load Balancing: Built-in upstream support
  • Battle Tested: Proven at scale

Negative

  • Configuration: nginx config syntax learning curve
  • Cert Management: Must rotate certificates
  • Additional Component: Another thing to monitor
  • Debug Complexity: Proxy layer adds complexity

Neutral

  • Alternatives: Traefik and Caddy remain viable options
  • Cloud Options: A managed ALB/Cloud Load Balancer could replace nginx later

Alternatives Considered

1. Direct Backend Exposure

  • Approach: Backend serves TLS, static files, and the API itself
  • Rejected: Inefficient static file serving and no dedicated TLS termination layer

2. Traefik

  • Approach: Modern reverse proxy with automatic service discovery
  • Rejected: More moving parts and a smaller body of documentation for this use case

3. Cloud Load Balancer Only

  • Approach: AWS ALB / GCP Load Balancer
  • Rejected: Vendor lock-in and less control over routing and headers

Implementation Status

  • Core implementation complete
  • Tests written and passing
  • Documentation updated
  • Migration/upgrade path defined
  • Monitoring/observability in place

Implementation Details

  • Config: docker/nginx/nginx.conf
  • Docker: docker-compose.yml
  • SSL: docker/nginx/ssl/ (generated or mounted)
  • Frontend Build: frontend/dist/
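For the "generated" SSL option, a self-signed pair matching the mount paths above can be produced with openssl. This is a local-development sketch only (the CN is an assumption matching `server_name`); production should use certbot/acme.sh or cloud-managed certificates, as the council review below recommends:

```shell
# Generate a self-signed cert/key pair into docker/nginx/ssl/
# (local development only; CN matches server_name in nginx.conf)
mkdir -p docker/nginx/ssl
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -subj "/CN=ops.example.com" \
  -keyout docker/nginx/ssl/key.pem \
  -out docker/nginx/ssl/cert.pem
```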

LLM Council Review

Review Date: 2025-01-16
Confidence Level: High (100%)
Verdict: APPROVED

Quality Metrics

  • Consensus Strength Score (CSS): 0.90
  • Deliberation Depth Index (DDI): 0.85

Council Feedback Summary

nginx is the correct choice for an SRE Operations Platform. The configuration covers TLS, security headers, and caching. Specific operational gaps for SRE workloads were identified.

Key Concerns Identified:

  1. Large File Uploads: SRE platforms need log/artifact uploads; default limits will fail
  2. Long-Running Queries: Standard timeouts will kill SLO calculation queries
  3. WebSocket/SSE: Configuration doesn't address MCP SSE or real-time features
  4. Certificate Management: No automation for certificate rotation

Required Modifications:

  1. Increase Body Size: client_max_body_size 100m; for log uploads
  2. Extend Timeouts:
    proxy_read_timeout 300s;  # 5 minutes for long queries
    proxy_send_timeout 300s;
  3. SSE/WebSocket Support:
    location /api/v1/events {
    proxy_buffering off;
    proxy_cache off;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    }
  4. SPA Routing Fix: Ensure try_files $uri $uri/ /index.html handles all routes
  5. Rate Limiting: Add limit_req_zone for API abuse protection
  6. Certificate Automation: Use certbot/acme.sh or cloud-managed certs
  7. Request ID: Add $request_id header for distributed tracing

Modifications Applied

  1. Documented large file upload configuration
  2. Added extended timeout requirements
  3. Documented SSE/WebSocket configuration
  4. Added rate limiting recommendation
  5. Documented certificate automation options

Council Ranking

  • gpt-5.2: Best Response (SRE workloads)
  • gemini-3-pro: Strong (caching)
  • claude-opus-4.5: Good (security)

ADR-045 | Infrastructure Layer | Implemented