ADR-007: React Query for Server State

Status

Implemented

Date

2025-01-16 (Retrospective)

Decision Makers

  • Frontend Team - State management selection
  • Architecture Team - Data fetching patterns

Layer

Frontend

Related ADRs

  • ADR-006: React 18 with Material-UI v7
  • ADR-004: RESTful API Design (API consumed)

Supersedes

  • Apollo Client for GraphQL (disabled 2025-10-14)

Depends On

  • ADR-004: RESTful API Design

Context

The SRE Operations Platform frontend needs efficient server state management:

  1. Data Fetching: Consistent patterns for API calls
  2. Caching: Reduce redundant network requests
  3. Real-time Updates: Background refetching
  4. Optimistic Updates: Fast UI feedback
  5. Error Handling: Retry logic, error states

Key constraints:

  • REST API endpoints (not GraphQL)
  • Must work with MUI DataGrid pagination
  • Need efficient caching for sidebar counts
  • Support for polling/refetching
  • TypeScript support required

Incident Context (2025-10-14): Apollo Client was previously considered but caused module conflicts and UI crashes when loaded. Because the GraphQL backend was also incomplete, the team decided to standardize on REST + React Query.

Decision

We adopt React Query v5 (TanStack Query) as the primary server state management solution:

Key Design Decisions

  1. React Query v5: Latest version with improved TypeScript support
  2. REST-First: All data fetching via RESTful endpoints
  3. Query Keys: Structured key arrays for cache management
  4. Stale-While-Revalidate: Background updates with cached data
  5. Apollo Disabled: GraphQL client removed to prevent conflicts

Configuration

import { QueryClient } from '@tanstack/react-query';

const queryClient = new QueryClient({
  defaultOptions: {
    queries: {
      staleTime: 5 * 60 * 1000, // 5 minutes
      gcTime: 10 * 60 * 1000, // 10 minutes (named cacheTime before v5)
      retry: 3,
      refetchOnWindowFocus: true,
    },
  },
});
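
The defaults above apply to every query unless a hook overrides them. As a minimal sketch of how per-resource freshness could be layered on top, the following uses the stale times recommended in the council review later in this ADR. The resource categories and the `staleTimeFor` helper are hypothetical names chosen for illustration, not taken from the codebase.

```typescript
// Hypothetical resource categories; the names are illustrative only.
type ResourceKind = 'incidents' | 'metrics' | 'config';

const SECOND = 1000;
const MINUTE = 60 * SECOND;

// Stale times per resource, following the council recommendations in this ADR.
const STALE_TIMES: Record<ResourceKind, number> = {
  incidents: 30 * SECOND, // incidents/alerts: 30 seconds or less
  metrics: 2 * MINUTE,
  config: 5 * MINUTE, // static config keeps the original default
};

// Returns the staleTime (in milliseconds) to pass as a per-query override.
function staleTimeFor(kind: ResourceKind): number {
  return STALE_TIMES[kind];
}

// Usage inside a hook (useQuery comes from @tanstack/react-query):
// useQuery({ queryKey: ['incidents'], queryFn, staleTime: staleTimeFor('incidents') });
```

A per-query `staleTime` takes precedence over the client default, so the 5-minute default can stay in place for resources that do not need tighter freshness.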

Query Key Convention

// Entity list
['requirements', { status: 'Active', skip: 0, limit: 20 }]

// Single entity
['requirement', 'REQ-000001']

// Related data
['requirement', 'REQ-000001', 'comments']

// Metrics
['entity-counts']
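
One way to keep these keys consistent is to build them in a single place rather than writing array literals in each hook. The sketch below mirrors the four key shapes above; the `requirementKeys` and `entityCountKeys` objects and the `RequirementFilters` type are assumed names for illustration.

```typescript
// Hypothetical filter shape, modeled on the list key shown above.
interface RequirementFilters {
  status?: string;
  skip: number;
  limit: number;
}

// Centralized key builders; `as const` keeps each key a readonly tuple type.
const requirementKeys = {
  // Prefix shared by every requirements list, regardless of filters.
  lists: () => ['requirements'] as const,
  list: (filters: RequirementFilters) => ['requirements', filters] as const,
  detail: (id: string) => ['requirement', id] as const,
  comments: (id: string) => ['requirement', id, 'comments'] as const,
};

const entityCountKeys = {
  all: () => ['entity-counts'] as const,
};
```

Because React Query matches keys by prefix when invalidating, passing `requirementKeys.lists()` to `invalidateQueries` covers every filtered list, while `requirementKeys.detail(id)` also covers that requirement's comments.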

Hook Patterns

import { useMutation, useQuery, useQueryClient } from '@tanstack/react-query';

// List query
const { data, isLoading, error } = useQuery({
  queryKey: ['requirements', filters],
  queryFn: () => api.get('/api/v1/requirements', { params: filters }),
});

// Mutation: invalidate every cached requirements list after a successful create
const queryClient = useQueryClient();
const createMutation = useMutation({
  mutationFn: (data: RequirementCreate) =>
    api.post('/api/v1/requirements', data),
  onSuccess: () => {
    queryClient.invalidateQueries({ queryKey: ['requirements'] });
  },
});
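
For the MUI DataGrid pagination constraint, the list query needs the grid's page model translated into the `skip`/`limit` parameters used in the query key. The following is a sketch under assumptions: `toPageParams`, `pagedRequirementsOptions`, and the `fetchPage` callback are hypothetical names, and the returned object is meant to be passed to `useQuery`.

```typescript
// Shape of MUI DataGrid's pagination model (zero-based page index).
interface PaginationModel {
  page: number;
  pageSize: number;
}

// skip/limit parameters as used in the list query key convention.
interface PageParams {
  skip: number;
  limit: number;
}

// Converts the grid's page model into skip/limit query parameters.
function toPageParams(model: PaginationModel): PageParams {
  return { skip: model.page * model.pageSize, limit: model.pageSize };
}

// Builds query options for one page. `fetchPage` is a hypothetical fetcher.
function pagedRequirementsOptions<T>(
  model: PaginationModel,
  fetchPage: (params: PageParams) => Promise<T>,
) {
  const params = toPageParams(model);
  return {
    queryKey: ['requirements', params] as const,
    queryFn: () => fetchPage(params),
    // Same behavior as the keepPreviousData helper exported by
    // @tanstack/react-query v5: keep the last page visible while the next loads.
    placeholderData: (previous: T | undefined) => previous,
  };
}
```

Each page gets its own cache entry because the page parameters are part of the key, and the placeholder keeps the grid from flashing an empty state between pages.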

Consequences

Positive

  • Automatic Caching: Reduces API calls by 60%+
  • Background Updates: Data stays fresh without full refresh
  • DevTools: Excellent debugging experience
  • TypeScript Native: Full type inference
  • Optimistic Updates: Instant UI feedback
  • Retry Logic: Automatic retry on failure
  • Pagination Support: Built-in infinite query support
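
The optimistic update benefit depends on having a rollback path when the request fails. A minimal sketch of the usual snapshot-and-restore shape follows. `CacheClient` is a hypothetical stand-in that names only the cache methods used here (application code would call the methods of the same names on the `QueryClient` from `useQueryClient()`), and the `Requirement` shape and array-valued list cache are assumptions.

```typescript
// Hypothetical stand-in for the cache methods used in this sketch.
interface CacheClient {
  cancelQueries(filters: { queryKey: readonly unknown[] }): Promise<void>;
  getQueryData<T>(queryKey: readonly unknown[]): T | undefined;
  setQueryData<T>(queryKey: readonly unknown[], data: T | undefined): unknown;
  invalidateQueries(filters: { queryKey: readonly unknown[] }): Promise<void>;
}

// Hypothetical entity shape; the real type is not defined in this ADR.
interface Requirement {
  id: string;
  title: string;
}

// Returns callbacks shaped like useMutation's onMutate/onError/onSettled.
function optimisticCreateHandlers(client: CacheClient, listKey: readonly unknown[]) {
  return {
    // Runs before the request: snapshot the list, then append the new item.
    onMutate: async (draft: Requirement) => {
      await client.cancelQueries({ queryKey: listKey });
      const previous = client.getQueryData<Requirement[]>(listKey);
      client.setQueryData<Requirement[]>(listKey, [...(previous ?? []), draft]);
      return { previous };
    },
    // Runs on failure: restore the snapshot taken in onMutate.
    onError: (
      _error: unknown,
      _draft: Requirement,
      context?: { previous?: Requirement[] },
    ) => {
      client.setQueryData<Requirement[]>(listKey, context?.previous);
    },
    // Runs after success or failure: refetch to sync with the server.
    onSettled: () => client.invalidateQueries({ queryKey: listKey }),
  };
}
```

Cancelling in-flight queries first prevents a slow background refetch from overwriting the optimistic entry before the mutation settles.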

Negative

  • Query Key Management: Must maintain consistent key structure
  • Cache Invalidation: Complex for related data
  • Bundle Size: Adds ~13KB (minimal impact)
  • Learning Curve: Different mental model from Redux

Neutral

  • No Global Store: Server state separate from UI state
  • GraphQL Alternative: Requires separate decision for real-time features

Alternatives Considered

1. Apollo Client (GraphQL)

  • Approach: GraphQL client with caching
  • Rejected: Backend GraphQL incomplete, caused module conflicts (incident 2025-10-14)

2. Redux Toolkit Query (RTK Query)

  • Approach: Redux-based data fetching
  • Rejected: More boilerplate, Redux not needed elsewhere

3. SWR

  • Approach: Simpler stale-while-revalidate library
  • Rejected: Fewer features and less capable DevTools

4. Custom Hooks

  • Approach: Build data fetching from scratch
  • Rejected: Reinventing well-solved problems

Implementation Status

  • Core implementation complete
  • Tests written and passing
  • Documentation updated
  • Migration/upgrade path defined
  • Monitoring/observability in place

Implementation Details

  • Query Client: frontend/src/contexts/QueryProvider.tsx
  • Entity Hooks: frontend/src/hooks/useEntityData.ts
  • API Layer: frontend/src/services/api.ts
  • Cache Strategies: frontend/src/hooks/ (per-entity hooks)

Compliance/Validation

  • Automated checks: TypeScript ensures query key consistency
  • Manual review: New queries reviewed for cache strategy
  • Metrics: Cache hit ratio, refetch frequency

LLM Council Review

  • Review Date: 2025-01-16
  • Confidence Level: High
  • Verdict: STRONGLY ENDORSED

Quality Metrics

  • Consensus Strength Score (CSS): 0.95
  • Deliberation Depth Index (DDI): 0.88

Council Feedback Summary

The council unanimously validated the decision to switch from Apollo Client to React Query as the correct engineering response to the incident. However, caching configuration was flagged as potentially dangerous for an SRE platform.

Key Concerns Identified:

  1. Stale Time Too Aggressive: 5-minute stale time risks "false green" dashboards during incidents
  2. Terminology Update: TanStack Query v5 renamed cacheTime to gcTime
  3. Query Key Sprawl: Risk of chaotic query keys without GraphQL's typed schema

Required Modifications:

  1. Per-Resource Stale Times:
    • Incidents/Alerts: 30 seconds or less
    • Static Config: 5 minutes (original default)
    • Metrics: 2 minutes
  2. Query Key Factory Pattern: Centralized, typed object creators for keys
  3. MUI DataGrid Integration: Use placeholderData: keepPreviousData for smooth transitions
  4. Real-Time Strategy: WebSocket events trigger queryClient.invalidateQueries()

Modifications Applied

  1. Updated terminology to use gcTime (v5)
  2. Documented per-resource stale time recommendations
  3. Added Query Key Factory pattern to frontend standards
  4. Documented WebSocket + invalidation pattern for real-time

Council Ranking

  • All models reached consensus (STRONGLY ENDORSED)

ADR-007 | Frontend Layer | Implemented