ADR-042: Playwright E2E Testing
Status
Implemented
Date
2025-01-16 (Retrospective)
Decision Makers
- QA Team - E2E testing strategy
- Frontend Team - Browser automation
Layer
Testing
Related ADRs
- ADR-041: Vitest for Frontend Testing
Supersedes
- Cypress (if previously used)
Depends On
None
Context
End-to-end testing validates complete user flows:
- Full Stack Testing: Frontend + backend together
- Cross-Browser: Chrome, Firefox, Safari, Edge
- Visual Testing: Screenshot comparisons
- Accessibility: a11y validation
- CI Integration: Automated in pipeline
Requirements:
- Multi-browser support
- Parallel test execution
- Video/screenshot on failure
- Accessibility testing
- Mobile viewport testing
Decision
We adopt Playwright for end-to-end testing:
Key Design Decisions
- Playwright Test: Microsoft's E2E framework
- 5 Browser Profiles: Chromium, Firefox, WebKit, Mobile Chrome, Mobile Safari
- axe-core Integration: Accessibility testing
- Visual Regression: Screenshot comparison
- Parallel Execution: Fast CI runs
Configuration
// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './e2e',
  fullyParallel: true,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 4 : undefined,
  reporter: [
    ['html'],
    ['junit', { outputFile: 'test-results/junit.xml' }],
  ],
  use: {
    baseURL: 'http://localhost:3333',
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
  },
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit', use: { ...devices['Desktop Safari'] } },
    { name: 'mobile-chrome', use: { ...devices['Pixel 5'] } },
    { name: 'mobile-safari', use: { ...devices['iPhone 12'] } },
  ],
  webServer: {
    command: 'npm run preview',
    port: 3333,
    reuseExistingServer: !process.env.CI,
  },
});
Test Pattern
import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

test.describe('Requirements Page', () => {
  test.beforeEach(async ({ page }) => {
    await page.goto('/requirements');
  });

  test('lists requirements', async ({ page }) => {
    await expect(page.getByRole('heading', { name: 'Requirements' })).toBeVisible();
    await expect(page.getByRole('grid')).toBeVisible();
  });

  test('creates new requirement', async ({ page }) => {
    await page.getByRole('button', { name: 'New' }).click();
    await page.getByLabel('Title').fill('E2E Test Requirement');
    await page.getByRole('button', { name: 'Save' }).click();
    await expect(page.getByText('Requirement created')).toBeVisible();
  });

  test('passes accessibility checks', async ({ page }) => {
    const accessibilityScanResults = await new AxeBuilder({ page }).analyze();
    expect(accessibilityScanResults.violations).toEqual([]);
  });
});
Visual Regression
test('visual regression', async ({ page }) => {
  await page.goto('/dashboard');
  await expect(page).toHaveScreenshot('dashboard.png', {
    maxDiffPixels: 100,
  });
});
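Dashboards that show timestamps, live metrics or charts will differ between runs, so a plain screenshot comparison tends to fail for reasons unrelated to the UI. One way to handle this is to keep the list of dynamic regions in one place and pass the resulting locators to the mask option of toHaveScreenshot. The sketch below is illustrative only: the selectors and the helper name dynamicRegionMasks are hypothetical, and the small LocatorSource interface stands in for Playwright's Page so the helper can be checked without a browser.

```typescript
// Structural subset of Playwright's Page: anything that can build a locator.
interface LocatorSource<L> {
  locator(selector: string): L;
}

// Hypothetical selectors for regions whose content changes between runs.
// The real dashboard markup is not described in this ADR.
const DYNAMIC_REGION_SELECTORS: readonly string[] = [
  '[data-testid="last-updated"]', // timestamps
  '[data-testid="metric-value"]', // live metrics
  'canvas', // charts
];

// Turn the selector list into locators that can be passed as `mask`.
function dynamicRegionMasks<L>(page: LocatorSource<L>): L[] {
  return DYNAMIC_REGION_SELECTORS.map((selector) => page.locator(selector));
}

// Usage inside a spec:
//   await page.goto('/dashboard');
//   await expect(page).toHaveScreenshot('dashboard.png', {
//     mask: dynamicRegionMasks(page),
//     maxDiffPixels: 100,
//   });
```

Masked regions are painted over with a solid box before comparison, so layout shifts around them are still detected while their contents are ignored.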
Consequences
Positive
- Multi-Browser: Single API for all browsers
- Fast: Parallel execution reduces CI time
- Reliable: Auto-waiting reduces flakiness
- Debugging: Trace viewer, screenshots, video
- Accessibility: axe-core scans via the @axe-core/playwright package
Negative
- Learning Curve: Different from Cypress
- Browser Binaries: Large downloads
- CI Resources: Browsers need memory
- Flakiness: Network-dependent tests can fail
Neutral
- Selectors: Multiple selector strategies
- API Testing: Can test APIs too
Implementation Status
- Core implementation complete
- Tests written and passing
- Documentation updated
- Migration/upgrade path defined
- Monitoring/observability in place
Implementation Details
- Config: frontend/playwright.config.ts
- Tests: frontend/e2e/
- CI: .github/workflows/e2e.yml
- Scripts: npm run test:e2e
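If the CI workflow later splits the suite across several nodes, as the council review below recommends, each node can run one slice through Playwright's --shard=index/total flag. The helper below is a minimal sketch of building that argument list; the function name shardArgs is hypothetical, and how the index and total reach the job (for example, from a CI matrix) is left as an assumption.

```typescript
// Build the CLI arguments for one shard of a run split across `total` nodes.
// Playwright numbers shards from 1, for example `--shard=1/4`.
function shardArgs(index: number, total: number): string[] {
  const isWhole = (n: number): boolean => Math.floor(n) === n;
  if (!isWhole(index) || !isWhole(total) || total < 1 || index < 1 || index > total) {
    throw new RangeError(`invalid shard ${index}/${total}`);
  }
  return ['playwright', 'test', `--shard=${index}/${total}`];
}

// Usage: the second of four CI nodes would run
//   npx playwright test --shard=2/4
const exampleCommand = `npx ${shardArgs(2, 4).join(' ')}`;
```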
LLM Council Review
Review Date: 2025-01-16
Confidence Level: High (100%)
Verdict: APPROVED WITH MODIFICATIONS
Quality Metrics
- Consensus Strength Score (CSS): 0.88
- Deliberation Depth Index (DDI): 0.85
Council Feedback Summary
Strong modern foundation for E2E strategy. Playwright is the correct tool choice. However, running the 5-browser configuration on every PR is operationally immature and risks higher CI costs and flakiness.
Key Concerns Identified:
- Browser Matrix Overkill: 5 browsers on every PR check significantly increases build times
- Visual Regression Risk: SRE dashboards have dynamic data (timestamps, metrics), which would cause constant failures
- Static Worker Count: workers: 4 causes contention on runners with fewer vCPUs
- Accessibility Depth: axe-core only catches ~30% of issues; misses keyboard navigation
Required Modifications:
- Tiered Execution:
  - PR Checks: Chromium only (+ one mobile if responsive is critical)
  - Nightly/Merge: Full 5-browser matrix
- Visual Regression Masking: Define explicit masking strategies for timestamps, charts, live data
- API Mocking: Use page.route() to mock API responses for stability
- Dynamic Workers: workers: process.env.CI ? '50%' : undefined
- Trace on Retry: Change to trace: 'on-first-retry' to reduce storage
- Sharding: Use Playwright sharding across CI nodes for scalability
- Keyboard Navigation Tests: Add specific tests for focus management, not just page-load scans
- Retry Strategy: Add retries: 2 for CI to handle network blips
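The tiered execution and dynamic worker items above could be wired into the config roughly as follows. This is a sketch under stated assumptions, not the project's actual setup: the E2E_TIER environment variable and the helper names are hypothetical, and the environment is passed in as a plain record so the logic can be checked in isolation.

```typescript
// Sketch only: E2E_TIER is a hypothetical variable, not defined by this ADR.
type Tier = 'pr' | 'full';
type Env = Record<string, string | undefined>;

// Project names match the `projects` array in playwright.config.ts.
const ALL_PROJECTS = ['chromium', 'firefox', 'webkit', 'mobile-chrome', 'mobile-safari'] as const;
type ProjectName = (typeof ALL_PROJECTS)[number];

// PR checks run Chromium only; nightly and merge runs use the full matrix.
function projectsForTier(tier: Tier): ProjectName[] {
  return tier === 'pr' ? ['chromium'] : [...ALL_PROJECTS];
}

// Read the tier from the environment, defaulting to the cheaper PR tier.
function tierFromEnv(env: Env): Tier {
  return env.E2E_TIER === 'full' ? 'full' : 'pr';
}

// On CI use half of the available cores; locally keep Playwright's default.
function workersFor(env: Env): string | undefined {
  return env.CI ? '50%' : undefined;
}

// In playwright.config.ts this might be applied as:
//   const enabled = projectsForTier(tierFromEnv(process.env));
//   export default defineConfig({
//     workers: workersFor(process.env),
//     projects: allProjects.filter((p) => enabled.some((name) => name === p.name)),
//   });
```

An alternative that needs no config change is to select projects on the command line, for example npx playwright test --project=chromium in the PR job.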
Modifications Applied
- Documented tiered execution strategy
- Added visual regression masking requirement
- Documented API mocking for stability
- Added dynamic worker configuration
- Added keyboard navigation testing requirement
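The API mocking item above relies on page.route(), which intercepts matching requests and lets the test answer them directly. A minimal sketch of a reusable handler follows; the endpoint pattern, the fixture shape and the helper name jsonHandler are assumptions for illustration, and FulfillableRoute is a structural stand-in for the part of Playwright's Route that the helper uses.

```typescript
// Structural subset of Playwright's Route, so the helper can be exercised
// without launching a browser.
interface FulfillableRoute {
  fulfill(options: { status?: number; contentType?: string; body?: string }): Promise<void>;
}

// Hypothetical fixture; the real API response shape is not specified in this ADR.
const requirementsFixture = [{ id: 1, title: 'E2E Test Requirement' }];

// Build a route handler that always answers with the given JSON payload.
function jsonHandler(payload: unknown, status = 200) {
  return async (route: FulfillableRoute): Promise<void> => {
    await route.fulfill({
      status,
      contentType: 'application/json',
      body: JSON.stringify(payload),
    });
  };
}

// Usage inside a spec (assumes the page fetches /api/requirements):
//   await page.route('**/api/requirements', jsonHandler(requirementsFixture));
//   await page.goto('/requirements');
//   await expect(page.getByText('E2E Test Requirement')).toBeVisible();
```

Registering the route before page.goto() matters, since requests issued during the initial load would otherwise reach the real backend.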
Council Ranking
- gpt-5.2: Best Response (tiered execution)
- gemini-3-pro: Strong (visual regression)
- grok-4.1: Good (performance)
References
ADR-042 | Testing Layer | Implemented