
ADR-033: Session Management

Status

Implemented

Date

2025-01-16 (Retrospective)

Decision Makers

  • Security Team - Session security
  • Architecture Team - State management

Layer

Auth

Related ADRs

  • ADR-005: OAuth2/OIDC Authentication
  • ADR-011: Redis Single Instance Strategy

Supersedes

None

Depends On

  • ADR-005: OAuth2/OIDC Authentication
  • ADR-011: Redis Single Instance Strategy

Context

User sessions require secure state management:

  1. Token Storage: Secure OAuth token handling
  2. Session Duration: Appropriate timeout periods
  3. Concurrent Sessions: Multiple device support
  4. Revocation: Force logout capability
  5. Persistence: Survive server restarts

Requirements:

  • 8-hour default session timeout
  • Redis storage for distributed access
  • Graceful degradation without Redis
  • Secure cookie handling
  • CSRF protection

Decision

We implement Redis-backed session management:

Key Design Decisions

  1. Redis Storage: Sessions stored in Redis DB 3
  2. 8-Hour Default TTL: Configurable session duration
  3. Sliding Expiration: Activity extends session
  4. HttpOnly Cookies: Secure token storage
  5. Memory Fallback: Degraded operation without Redis
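
The memory fallback in item 5 could look roughly like the sketch below. It is not the project's implementation: the class name MemorySessionStore, its method names, and the injectable clock are assumptions made for illustration. It mimics only the SET-with-TTL, GET, and DELETE behavior the session manager relies on.

```python
import time
from typing import Optional


class MemorySessionStore:
    """Hypothetical in-process stand-in for Redis SET/GET/DELETE with a TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._data: dict[str, tuple[str, float]] = {}  # key -> (value, deadline)

    def set(self, key: str, value: str, ex: int) -> None:
        """Store a value that expires `ex` seconds from now."""
        self._data[key] = (value, self._clock() + ex)

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, deadline = entry
        if self._clock() >= deadline:
            # Lazy expiry: drop the entry the first time it is read after its deadline.
            del self._data[key]
            return None
        return value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)
```

A store like this lives in one process only, so sessions are neither shared between servers nor preserved across restarts, which is why it counts as degraded operation.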

Session Schema

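from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Session:
    id: str
    user_id: str
    email: str
    roles: list[str]
    created_at: datetime
    last_activity: datetime
    expires_at: datetime
    ip_address: str
    user_agent: str
    # Optional; create_session does not pass it, so it needs a default
    metadata: dict = field(default_factory=dict)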
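
The session operations in this ADR call session.to_json() and Session.from_json(), which the schema does not show. A minimal sketch of how they might work follows, assuming ISO 8601 strings for the datetime fields and an empty-dict default for metadata. Both choices, and the _DATETIME_FIELDS helper, are assumptions rather than the actual implementation.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class Session:
    id: str
    user_id: str
    email: str
    roles: list[str]
    created_at: datetime
    last_activity: datetime
    expires_at: datetime
    ip_address: str
    user_agent: str
    metadata: dict = field(default_factory=dict)

    # Unannotated class attribute, so the dataclass does not treat it as a field.
    _DATETIME_FIELDS = ("created_at", "last_activity", "expires_at")

    def to_json(self) -> str:
        data = asdict(self)
        for name in self._DATETIME_FIELDS:
            data[name] = data[name].isoformat()  # datetimes are not JSON-serializable
        return json.dumps(data)

    @classmethod
    def from_json(cls, raw: str) -> "Session":
        data = json.loads(raw)
        for name in cls._DATETIME_FIELDS:
            data[name] = datetime.fromisoformat(data[name])
        return cls(**data)
```

Timezone-aware datetimes survive the round trip because isoformat() keeps the UTC offset and fromisoformat() restores it.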

Redis Key Structure

session:{session_id} -> Session JSON
user_sessions:{user_id} -> Set of session_ids

Session Operations

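import secrets
from datetime import datetime, timedelta, timezone

SESSION_TTL = timedelta(hours=8)

class SessionManager:
    async def create_session(self, user: UserInfo, request: Request) -> str:
        session_id = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe encoded
        # One timezone-aware timestamp so all three fields agree
        now = datetime.now(timezone.utc)
        session = Session(
            id=session_id,
            user_id=user.id,
            email=user.email,
            roles=user.roles,
            created_at=now,
            last_activity=now,
            expires_at=now + SESSION_TTL,
            # request.client may be None when no client address is available
            ip_address=request.client.host if request.client else "",
            # Fall back to "" so user_agent is always a str
            user_agent=request.headers.get("user-agent", ""),
            metadata={},
        )
        await self.redis.set(
            f"session:{session_id}",
            session.to_json(),
            ex=int(SESSION_TTL.total_seconds()),  # Redis TTL matches expires_at
        )
        return session_id

    async def validate_session(self, session_id: str) -> Session | None:
        data = await self.redis.get(f"session:{session_id}")
        if not data:
            return None

        session = Session.from_json(data)
        # Defensive check; assumes from_json restores timezone-aware datetimes
        if session.expires_at < datetime.now(timezone.utc):
            await self.destroy_session(session_id)
            return None

        # Sliding expiration (extend_session is not shown in this ADR)
        await self.extend_session(session_id)
        return session

    async def destroy_session(self, session_id: str) -> None:
        await self.redis.delete(f"session:{session_id}")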

Configuration

# Session settings
SESSION_TTL_HOURS = 8 # Default session duration
SESSION_SLIDING_WINDOW = True # Extend on activity
SESSION_MAX_CONCURRENT = 5 # Max sessions per user
SESSION_COOKIE_NAME = "ops_session"
SESSION_COOKIE_SECURE = True # HTTPS only in production
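
As an illustration of how these settings could translate into a Set-Cookie value, the sketch below uses Python's standard http.cookies module. The helper name build_session_cookie is hypothetical. Setting Max-Age to the session TTL and choosing SameSite=Lax (one of the two values the council review suggests) are assumptions, not confirmed details of the middleware.

```python
from http.cookies import SimpleCookie

SESSION_TTL_HOURS = 8
SESSION_COOKIE_NAME = "ops_session"
SESSION_COOKIE_SECURE = True


def build_session_cookie(session_id: str) -> str:
    """Return the value for a Set-Cookie header carrying the session ID."""
    cookie = SimpleCookie()
    cookie[SESSION_COOKIE_NAME] = session_id
    morsel = cookie[SESSION_COOKIE_NAME]
    morsel["httponly"] = True                   # not readable from JavaScript
    morsel["secure"] = SESSION_COOKIE_SECURE    # only sent over HTTPS when True
    morsel["samesite"] = "Lax"                  # assumption: Lax rather than Strict
    morsel["path"] = "/"
    morsel["max-age"] = SESSION_TTL_HOURS * 3600  # assumption: mirror the server-side TTL
    return morsel.OutputString()
```

Only the opaque session ID goes into the cookie. The session data itself stays in Redis, which keeps the cookie small.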

Consequences

Positive

  • Distributed: Any server can validate session
  • Configurable: TTL adjustable without deployment
  • Secure: HttpOnly, Secure cookies
  • Revocable: Force logout via key deletion
  • Auditable: Session metadata tracked

Negative

  • Redis Dependency: Sessions lost if Redis fails
  • Memory Usage: Sessions consume Redis memory
  • Cleanup: Need TTL management
  • Cookie Size: Limited data in cookies

Neutral

  • Concurrency: Must handle race conditions
  • Fallback: Memory fallback for development

Implementation Status

  • Core implementation complete
  • Tests written and passing
  • Documentation updated
  • Migration/upgrade path defined
  • Monitoring/observability in place

Implementation Details

  • Session Manager: backend/core/auth/session.py
  • Redis Config: backend/core/redis_manager.py
  • Middleware: backend/core/auth/middleware.py
  • Settings: backend/core/config.py

LLM Council Review

Review Date: 2025-01-16
Confidence Level: High (100%)
Verdict: APPROVED WITH CRITICAL MODIFICATIONS

Quality Metrics

  • Consensus Strength Score (CSS): 0.92
  • Deliberation Depth Index (DDI): 0.90

Council Feedback Summary

Redis is the ideal choice for session storage due to sub-millisecond latency and native TTL. However, the implementation has critical flaws including orphaned session IDs, role staleness, and missing security controls.

Key Concerns Identified:

  1. Orphaned ID Problem: When a session expires via TTL, its ID remains in the user_sessions Set and still counts toward the concurrent-session limit → users locked out
  2. Role Staleness: Storing roles in session means demoted user keeps privileges for 8 hours
  3. No Session Fixation Protection: Missing session ID regeneration on login
  4. No Absolute Timeout: Sliding expiration allows infinite session lifetime
  5. Write Amplification: Updating TTL on every request is unnecessary overhead

Required Modifications:

  1. Use Sorted Set for user_sessions: ZSET with timestamps enables efficient eviction of the oldest entry (O(log N) via ZPOPMIN)
  2. Remove Roles from Session: Store only user_id; fetch permissions on each request from cache
  3. Regenerate Session ID: Create new ID on login and privilege escalation
  4. Add Absolute Timeout: Force logout after 24h regardless of activity (based on created_at)
  5. Debounce TTL Updates: Only extend if remaining time < 2 hours
  6. Add SameSite Cookie: Set SameSite=Strict or Lax
  7. Key Prefixes: Use namespacing instead of numbered DBs for cluster compatibility
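
To make modification 1 concrete, the sketch below models one user's sorted set in plain Python, with comments naming the Redis commands each step would correspond to. Using the expiry timestamp as the score is an assumption (the review only says "timestamps"), and the class UserSessionIndex is hypothetical.

```python
SESSION_MAX_CONCURRENT = 5


class UserSessionIndex:
    """In-memory model of one user's sorted set: session_id -> expiry timestamp (the score)."""

    def __init__(self) -> None:
        self._scores: dict[str, float] = {}

    def register(self, session_id: str, expires_at: float, now: float) -> list[str]:
        """Add a session and return the IDs evicted to stay within the limit."""
        # Redis analogue: ZREMRANGEBYSCORE key -inf <now> (drops IDs whose session has expired)
        self._scores = {sid: exp for sid, exp in self._scores.items() if exp > now}
        # Redis analogue: ZADD key <expires_at> <session_id>
        self._scores[session_id] = expires_at
        evicted = []
        # Redis analogue: ZCARD to count members, ZPOPMIN to remove the lowest score
        while len(self._scores) > SESSION_MAX_CONCURRENT:
            oldest = min(self._scores, key=self._scores.get)
            del self._scores[oldest]
            evicted.append(oldest)
        return evicted

    def active(self) -> list[str]:
        """Session IDs ordered from soonest to latest expiry."""
        return sorted(self._scores, key=self._scores.get)
```

Pruning by score before counting is what removes orphaned IDs: an expired session no longer occupies one of the user's slots. In a real deployment each evicted ID would also need its session:{session_id} key deleted, and the score would need updating whenever sliding expiration extends a session.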

Modifications Applied

  1. Documented Sorted Set migration for user_sessions
  2. Added role fetching pattern (no caching in session)
  3. Documented session ID regeneration requirement
  4. Added absolute timeout alongside sliding expiration
  5. Added TTL debouncing strategy

Council Ranking

  • gpt-5.2: Best Response (orphaned IDs)
  • claude-opus-4.5: Strong (security controls)
  • gemini-3-pro: Good (performance)

ADR-033 | Auth Layer | Implemented