ADR-024: Structured Logging with structlog
Status
Implemented
Date
2025-01-16 (Retrospective)
Decision Makers
- SRE Team - Logging requirements
- Backend Team - Implementation approach
Layer
Observability
Related ADRs
- ADR-009: OpenTelemetry Instrumentation
- ADR-043: ClickStack Integration
Supersedes
None
Depends On
None
Context
Production logging needs more than printf-style messages:
- Machine Parseable: JSON format for log aggregation
- Correlation: Link logs to traces and requests
- Context: Automatic context propagation
- Filtering: Easy filtering by fields
- Security: Sensitive data masking
Key constraints:
- Must work with Python logging ecosystem
- Need JSON output for production
- Human-readable for development
- Automatic trace ID injection
- PII/sensitive data protection
Decision
We adopt structlog for structured logging throughout the backend:
Key Design Decisions
- structlog Library: Structured logging for Python
- JSON in Production: Machine-readable format
- Console in Development: Human-readable colored output
- Automatic Context: Request ID, trace ID injection
- Sensitive Data Masking: Automatic PII protection
Configuration
import structlog
structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        mask_sensitive_data,
        add_trace_context,
        # Production: JSON
        structlog.processors.JSONRenderer(),
        # Development: Console
        # structlog.dev.ConsoleRenderer(),
    ],
    context_class=dict,
    logger_factory=structlog.PrintLoggerFactory(),
)
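The `add_trace_context` processor referenced in the chain is not defined in this ADR. A minimal sketch, assuming the OpenTelemetry SDK adopted in ADR-009, could look like the following (the function body here is illustrative, not the production implementation):

```python
# Hypothetical add_trace_context processor: injects the active OpenTelemetry
# trace/span IDs into each event so logs can be joined with spans (ADR-009).
try:
    from opentelemetry import trace
except ImportError:  # tracing is optional in some environments
    trace = None

def add_trace_context(logger, method_name, event_dict):
    """structlog processor: attach trace_id/span_id when a span is active."""
    if trace is not None:
        ctx = trace.get_current_span().get_span_context()
        if ctx.is_valid:
            # Fixed-width hex, matching OpenTelemetry conventions
            event_dict["trace_id"] = format(ctx.trace_id, "032x")
            event_dict["span_id"] = format(ctx.span_id, "016x")
    return event_dict
```

When no span is active (or tracing is not installed), the event passes through unchanged, so the processor is safe to leave in the chain everywhere.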
Usage Pattern
import structlog
logger = structlog.get_logger()
# Basic logging
logger.info(
    "Request processed",
    endpoint="/api/v1/requirements",
    method="GET",
    duration_ms=45,
)
# With context binding
log = logger.bind(request_id="req-123", user="user@example.com")
log.info("Created requirement", requirement_id="REQ-000001")
log.error("Validation failed", errors=["Title required"])
Output Formats
Production (JSON):
{
  "timestamp": "2025-01-16T10:30:00.000Z",
  "level": "info",
  "event": "Request processed",
  "endpoint": "/api/v1/requirements",
  "method": "GET",
  "duration_ms": 45,
  "request_id": "req-123",
  "trace_id": "abc123def456"
}
Development (Console):
2025-01-16 10:30:00 [info ] Request processed endpoint=/api/v1/requirements method=GET duration_ms=45
Sensitive Data Masking
SENSITIVE_FIELDS = ['password', 'token', 'secret', 'api_key', 'authorization']

def mask_sensitive_data(logger, method_name, event_dict):
    for key in list(event_dict.keys()):
        if any(s in key.lower() for s in SENSITIVE_FIELDS):
            event_dict[key] = "***MASKED***"
    return event_dict
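This top-level scan misses PII nested inside payloads, a gap the council review below also flags. A recursive variant, sketched here with the same field list (`_scrub` is an illustrative helper name), would cover nested dicts and lists:

```python
SENSITIVE_FIELDS = ("password", "token", "secret", "api_key", "authorization")

def _scrub(value):
    """Recursively mask sensitive keys in nested dicts and sequences."""
    if isinstance(value, dict):
        return {
            key: "***MASKED***"
            if any(s in key.lower() for s in SENSITIVE_FIELDS)
            else _scrub(val)
            for key, val in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [_scrub(item) for item in value]
    return value

def mask_sensitive_data(logger, method_name, event_dict):
    """structlog processor: recursive masking of the whole event dict."""
    return _scrub(event_dict)
```

The trade-off is extra traversal cost per event, which is one reason the council also suggests allowlisting known-safe fields instead.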
Consequences
Positive
- Searchable: Query logs by any field
- Correlated: Trace IDs connect logs to spans
- Secure: Automatic PII masking
- Consistent: Same format everywhere
- Performant: Lazy evaluation, fast serialization
Negative
- Learning Curve: Different from print-style logging
- Verbosity: More code for structured fields
- Size: JSON logs larger than plain text
- Tooling: Requires log aggregation for full benefit
Neutral
- Migration: Can coexist with standard logging
- Third-Party: Libraries may use standard logging
Alternatives Considered
1. Python Standard Logging
- Approach: Built-in logging module
- Rejected: Less structured, harder to add context
2. loguru
- Approach: Alternative structured logging
- Rejected: Less flexible processors, smaller ecosystem
3. Custom Logger
- Approach: Build structured logging from scratch
- Rejected: Reinventing well-solved problem
Implementation Status
- Core implementation complete
- Tests written and passing
- Documentation updated
- Migration/upgrade path defined
- Monitoring/observability in place
Implementation Details
- Logger Config: backend/core/logging_config.py
- Middleware: backend/core/middleware/logging.py
- Processors: backend/core/logging_processors.py
- Environment: LOG_FORMAT=json|console, LOG_LEVEL
Compliance/Validation
- Automated checks: Log format validation in tests
- Manual review: Sensitive data audit
- Metrics: Log volume, error rate by logger
LLM Council Review
Review Date: 2025-01-16
Confidence Level: High (100%)
Verdict: APPROVE WITH CRITICAL MODIFICATIONS
Quality Metrics
- Consensus Strength Score (CSS): 0.92
- Deliberation Depth Index (DDI): 0.88
Council Feedback Summary
structlog is the superior choice over loguru or standard logging due to its processor-chain architecture and OpenTelemetry integration. However, the ADR lacks specifications for async safety, performance, and recursive masking.
Key Concerns Identified:
- Shallow Masking Will Fail: Simple regex or top-level key matching misses PII in nested payloads
- Standard JSON Library Too Slow: High-throughput logging needs faster serialization
- Thread-Local Context Breaks Async: FastAPI uses asyncio; old-school threading will leak context
Required Modifications:
- Mandate orjson: Use orjson or ujson as the JSON serializer for performance
- Specify contextvars: Context propagation must use Python's contextvars for async safety
- Recursive Masking:
  - Traverse nested dictionaries and JSON bodies
  - Apply strict scrubbing to HTTP headers and exception locals
  - Consider allowlisting (log known safe fields only)
- Standard Library Interop: Configure structlog.stdlib.LoggerFactory to capture third-party library logs
- Type-Based Redaction: Use types like SecretStr that auto-redact in logs
- Default Deny for Bodies: Don't log raw HTTP request/response bodies by default
Modifications Applied
- Documented orjson serializer requirement
- Added contextvars for async safety
- Documented recursive masking strategy
- Added stdlib interop for third-party logs
Council Ranking
- gpt-5.2: Best Response (async safety)
- gemini-3-pro: Strong (masking depth)
- grok-4.1: Good (performance focus)
References
ADR-024 | Observability Layer | Implemented