ADR-042: Playwright E2E Testing

Status

Implemented

Date

2025-01-16 (Retrospective)

Decision Makers

  • QA Team - E2E testing strategy
  • Frontend Team - Browser automation

Layer

Testing

Related ADRs

  • ADR-041: Vitest for Frontend Testing

Supersedes

  • Cypress (if previously used)

Depends On

None

Context

End-to-end testing validates complete user flows. The strategy needs to cover:

  1. Full Stack Testing: Frontend + backend together
  2. Cross-Browser: Chrome, Firefox, Safari, Edge
  3. Visual Testing: Screenshot comparisons
  4. Accessibility: a11y validation
  5. CI Integration: Automated in pipeline

Requirements:

  • Multi-browser support
  • Parallel test execution
  • Video/screenshot on failure
  • Accessibility testing
  • Mobile viewport testing

Decision

We adopt Playwright for end-to-end testing.

Key Design Decisions

  1. Playwright Test: Microsoft's E2E framework
  2. 5 Browser Profiles: Chromium, Firefox, WebKit, Mobile Chrome, Mobile Safari
  3. axe-core Integration: Accessibility testing
  4. Visual Regression: Screenshot comparison
  5. Parallel Execution: Fast CI runs

Configuration

// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './e2e',
  fullyParallel: true,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 4 : undefined,
  reporter: [
    ['html'],
    ['junit', { outputFile: 'test-results/junit.xml' }],
  ],
  use: {
    baseURL: 'http://localhost:3333',
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
  },
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit', use: { ...devices['Desktop Safari'] } },
    { name: 'mobile-chrome', use: { ...devices['Pixel 5'] } },
    { name: 'mobile-safari', use: { ...devices['iPhone 12'] } },
  ],
  webServer: {
    command: 'npm run preview',
    port: 3333,
    reuseExistingServer: !process.env.CI,
  },
});

Test Pattern

import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

test.describe('Requirements Page', () => {
  test.beforeEach(async ({ page }) => {
    await page.goto('/requirements');
  });

  test('lists requirements', async ({ page }) => {
    await expect(page.getByRole('heading', { name: 'Requirements' })).toBeVisible();
    await expect(page.getByRole('grid')).toBeVisible();
  });

  test('creates new requirement', async ({ page }) => {
    await page.getByRole('button', { name: 'New' }).click();
    await page.getByLabel('Title').fill('E2E Test Requirement');
    await page.getByRole('button', { name: 'Save' }).click();

    await expect(page.getByText('Requirement created')).toBeVisible();
  });

  test('passes accessibility checks', async ({ page }) => {
    const accessibilityScanResults = await new AxeBuilder({ page }).analyze();
    expect(accessibilityScanResults.violations).toEqual([]);
  });
});
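The axe scan above inspects the page as loaded and does not exercise focus order. A minimal sketch of a focus-order helper follows, assuming the page's focusable controls carry `id` attributes and that some element holds focus after each Tab press. `collectTabOrder` and `FocusProbePage` are hypothetical names; the interface is only a structural stand-in for the subset of Playwright's `Page` that the helper touches.

```typescript
// Structural stand-in for the subset of Playwright's Page used below
// (hypothetical name; a real Page object satisfies this shape).
interface FocusProbePage {
  keyboard: { press(key: string): Promise<void> };
  locator(selector: string): { getAttribute(name: string): Promise<string | null> };
}

// Presses Tab `steps` times and records the chosen attribute of whichever
// element holds focus after each press.
async function collectTabOrder(
  page: FocusProbePage,
  steps: number,
  attribute: string = 'id',
): Promise<Array<string | null>> {
  const order: Array<string | null> = [];
  for (let i = 0; i < steps; i++) {
    await page.keyboard.press('Tab');
    order.push(await page.locator(':focus').getAttribute(attribute));
  }
  return order;
}
```

In a spec this might be used as `expect(await collectTabOrder(page, 3)).toEqual([...])`, with the expected ids taken from the page's real controls.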

Visual Regression

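import { test, expect } from '@playwright/test';

test('visual regression', async ({ page }) => {
  await page.goto('/dashboard');
  // Allow up to 100 differing pixels before the comparison fails.
  await expect(page).toHaveScreenshot('dashboard.png', {
    maxDiffPixels: 100,
  });
});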
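A full-page comparison like the one above is sensitive to any region that changes between runs. One way to handle this is the `mask` option of `toHaveScreenshot()`, which takes a list of locators to cover before comparing. The sketch below builds that list; the selectors, `dynamicRegionMasks`, and `LocatorSource` are hypothetical and would need to match the dashboard's real markup.

```typescript
// Structural stand-in for the one Page method used here (hypothetical name).
interface LocatorSource<L> {
  locator(selector: string): L;
}

// Hypothetical selectors for regions that change between runs.
const DYNAMIC_REGION_SELECTORS: readonly string[] = [
  '[data-testid="last-updated"]', // timestamps
  '[data-testid="metrics-chart"]', // charts fed by live data
];

// Builds the locator list for the `mask` option of toHaveScreenshot().
function dynamicRegionMasks<L>(
  page: LocatorSource<L>,
  selectors: readonly string[] = DYNAMIC_REGION_SELECTORS,
): L[] {
  return selectors.map((selector) => page.locator(selector));
}
```

With a real `page`, the call would read `await expect(page).toHaveScreenshot('dashboard.png', { mask: dynamicRegionMasks(page), maxDiffPixels: 100 })`.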

Consequences

Positive

  • Multi-Browser: Single API for all browsers
  • Fast: Parallel execution reduces CI time
  • Reliable: Auto-waiting reduces flakiness
  • Debugging: Trace viewer, screenshots, video
  • Accessibility: axe-core support through the @axe-core/playwright package

Negative

  • Learning Curve: Different from Cypress
  • Browser Binaries: Large downloads
  • CI Resources: Browsers need memory
  • Flakiness: Network-dependent tests can fail intermittently

Neutral

  • Selectors: Multiple selector strategies
  • API Testing: Can test APIs too

Implementation Status

  • Core implementation complete
  • Tests written and passing
  • Documentation updated
  • Migration/upgrade path defined
  • Monitoring/observability in place

Implementation Details

  • Config: frontend/playwright.config.ts
  • Tests: frontend/e2e/
  • CI: .github/workflows/e2e.yml
  • Scripts: npm run test:e2e

LLM Council Review

Review Date: 2025-01-16
Confidence Level: High (100%)
Verdict: APPROVED WITH MODIFICATIONS

Quality Metrics

  • Consensus Strength Score (CSS): 0.88
  • Deliberation Depth Index (DDI): 0.85

Council Feedback Summary

Strong modern foundation for the E2E strategy. Playwright is the correct tool choice. However, running the 5-browser configuration on every PR is operationally immature and risks higher CI costs and flakiness.

Key Concerns Identified:

  1. Browser Matrix Overkill: 5 browsers on every PR check significantly increases build times
  2. Visual Regression Risk: SRE dashboards show dynamic data (timestamps, metrics), so unmasked screenshot comparisons would fail constantly
  3. Static Worker Count: workers: 4 causes contention on runners with fewer vCPUs
  4. Accessibility Depth: axe-core only catches ~30% of issues; misses keyboard navigation

Required Modifications:

  1. Tiered Execution:
    • PR Checks: Chromium only (+ one mobile if responsive is critical)
    • Nightly/Merge: Full 5-browser matrix
  2. Visual Regression Masking: Define explicit masking strategies for timestamps, charts, live data
  3. API Mocking: Use page.route() to mock API responses for stability
  4. Dynamic Workers: workers: process.env.CI ? '50%' : undefined
  5. Trace on Retry: Change to trace: 'on-first-retry' to reduce storage
  6. Sharding: Use Playwright sharding across CI nodes for scalability
  7. Keyboard Navigation Tests: Add specific tests for focus management, not just page-load scans
  8. Retry Strategy: Add retries: 2 for CI to handle network blips
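Items 1 and 4 above can be derived from the environment rather than hard-coded. The following is a sketch under stated assumptions, not the project's actual configuration: `E2E_TIER` is a hypothetical variable that the nightly or merge workflow would set to `full`, and `tieredSettings` is a hypothetical helper name. The project names are the ones from the configuration in this ADR.

```typescript
// Project names from the configuration in this ADR.
const ALL_PROJECTS = ['chromium', 'firefox', 'webkit', 'mobile-chrome', 'mobile-safari'];
// Subset run on pull requests (Chromium only, per the tiered strategy).
const PR_PROJECTS = ['chromium'];

interface TieredSettings {
  projectNames: string[];
  workers: string | undefined;
}

// E2E_TIER is a hypothetical variable; CI is the variable the existing config already reads.
function tieredSettings(env: Record<string, string | undefined>): TieredSettings {
  const fullMatrix = env.E2E_TIER === 'full';
  return {
    projectNames: fullMatrix ? [...ALL_PROJECTS] : [...PR_PROJECTS],
    // '50%' asks Playwright to size the worker pool from the logical CPU count.
    workers: env.CI ? '50%' : undefined,
  };
}
```

In `playwright.config.ts`, the result of `tieredSettings(process.env)` could be used to filter the existing `projects` array by `name` and to set `workers`.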

Modifications Applied

  1. Documented tiered execution strategy
  2. Added visual regression masking requirement
  3. Documented API mocking for stability
  4. Added dynamic worker configuration
  5. Added keyboard navigation testing requirement
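For the API mocking item, a handler passed to `page.route()` can answer requests from a fixture instead of the live backend. This is a minimal sketch: the fixture contents, `fulfillWithRequirements`, and `FulfillableRoute` are hypothetical, and the API's real response shape is not specified in this ADR.

```typescript
// Structural stand-in for the part of Playwright's Route used here
// (hypothetical name; a real Route object satisfies this shape).
interface FulfillableRoute {
  fulfill(response: { status?: number; contentType?: string; body?: string }): Promise<void>;
}

// Hypothetical fixture; the real API response shape is an assumption.
const REQUIREMENTS_FIXTURE = [{ id: 1, title: 'E2E Test Requirement' }];

// Route handler that answers with the fixture instead of reaching the backend.
async function fulfillWithRequirements(route: FulfillableRoute): Promise<void> {
  await route.fulfill({
    status: 200,
    contentType: 'application/json',
    body: JSON.stringify(REQUIREMENTS_FIXTURE),
  });
}
```

A spec would register it before navigating, for example `await page.route('**/api/requirements', fulfillWithRequirements)`, where the URL pattern is a placeholder for the real endpoint.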

Council Ranking

  • gpt-5.2: Best Response (tiered execution)
  • gemini-3-pro: Strong (visual regression)
  • grok-4.1: Good (performance)


ADR-042 | Testing Layer | Implemented