ADR-002: Human-Readable Entity IDs
Status
Implemented
Date
2025-01-16 (Retrospective)
Decision Makers
- Architecture Team - ID format design
- UX Team - User experience requirements
Layer
Database
Related ADRs
- ADR-003: Three-Tier Entity Hierarchy (uses ID validation)
- ADR-026: Entity Relationships (references by ID)
Supersedes
None
Depends On
- ADR-001: PostgreSQL with pgvector
Context
Entity identification is fundamental to the platform's usability and data integrity:
- User Communication: Users reference entities in tickets, docs, and conversations
- Cross-System Integration: IDs appear in external systems (JIRA, ServiceNow)
- Debugging: Engineers need to quickly identify entity types in logs
- Data Import/Export: IDs must be stable across environments
- URL Readability: Entity URLs should be meaningful
Requirements:
- Identify entity type from ID alone
- Support large numbers of entities per type (>1M)
- Prevent collisions across entity types
- Work in URLs without encoding
- Be memorable for common references
Decision
We use prefixed numeric IDs with format {PREFIX}-{NNNNNN}:
- Prefix: 2-6 uppercase letters identifying entity type
- Separator: Hyphen (-)
- Number: Zero-padded 6-digit sequential number
Key Design Decisions
- Prefix Mapping: Each entity type has a unique prefix (REQ, CAP, PROC, etc.)
- Sequential Generation: Server-side generation ensures uniqueness
- Zero Padding: 6 digits supports 999,999 entities per type
- Validation: ID format validated at model level
- String Primary Key: VARCHAR rather than auto-increment INTEGER
Prefix Registry
| Entity | Prefix | Example |
|---|---|---|
| Requirement | REQ | REQ-000001 |
| Capability | CAP | CAP-000001 |
| Process | PROC | PROC-000001 |
| SLI | SLI | SLI-000001 |
| SLO | SLO | SLO-000001 |
| Error Budget | EB | EB-000001 |
| Application | APP | APP-000001 |
| Technology | TECH | TECH-000001 |
| Actor | ACT | ACT-000001 |
| Use Case | UC | UC-000001 |
| Metric | MET | MET-000001 |
| Runbook | RB | RB-000001 |
| Incident | INC | INC-000001 |
| Value Stream | VS | VS-000001 |
| Staff Member | STAFF | STAFF-000001 |
| Opportunity | OPP | OPP-000001 |
| Product | PROD | PROD-000001 |
Consequences
Positive
- Instant Recognition: Users know entity type at a glance
- URL Friendly: No encoding needed, readable paths
- Log Searchable: grep for "REQ-000123" finds all references
- Import/Export Safe: IDs transfer between environments
- Conversation Ready: "Check REQ-000042" is unambiguous
- Cross-Reference: "SLO-000005 references SLI-000003" is clear
Negative
- Storage Overhead: VARCHAR uses more space than INTEGER
- Index Size: Larger indexes than numeric PKs
- Migration Complexity: Changing prefixes requires data migration
- Collision Risk: Manual ID assignment could cause duplicates
- Query Performance: String comparisons slower than integer
Neutral
- Foreign Key Size: Relationships use same string format
- API Compatibility: String IDs work in all contexts
Alternatives Considered
1. UUIDs
- Approach: Standard 128-bit universal unique identifiers
- Rejected: Not human-readable, no entity type indication, poor UX
2. Auto-Increment Integers
- Approach: Simple numeric primary keys
- Rejected: No entity type context, collision on import/export
3. Composite Keys (type + number)
- Approach: Separate type column with numeric ID
- Rejected: More complex queries, joins, and URL structures
Implementation Status
- Core implementation complete
- Tests written and passing
- Documentation updated
- Migration/upgrade path defined
- Monitoring/observability in place
Implementation Details
- ID Validator:
backend/core/id_validator.py - Base Models:
backend/core/base_models.py:BaseEntity.validate_id() - Prefix Registry:
backend/core/id_validator.py:ID_PREFIXES - Generation:
backend/core/id_validator.py:IDValidator.generate_next_id()
Compliance/Validation
- Automated checks: Model-level validation on all ID assignments
- Manual review: ID prefix changes reviewed by architecture team
- Metrics: None (IDs are stable identifiers)
LLM Council Review
Review Date: 2025-01-16 Confidence Level: High (100%) Verdict: APPROVED WITH CRITICAL MODIFICATIONS
Quality Metrics
- Consensus Strength Score (CSS): 0.95
- Deliberation Depth Index (DDI): 0.90
Council Feedback Summary
The council approved the format for UX/human readability but identified it as sub-optimal for database primary keys, with critical concerns about generation strategy and ID exhaustion.
Key Concerns Identified:
- String PK Performance: VARCHAR keys 2x-5x slower than BIGINT for joins and indexes
- Server-Side Generation Risk: Application-level "Read-Increment-Write" causes race conditions
- ID Exhaustion: 6 digits (999,999) is dangerously low for high-volume entities (Metrics, Events)
- Prefix Governance: Risk of semantic collision (e.g., SERV for Service vs Server)
Required Modifications:
- Hybrid Approach (Recommended):
- Use
BIGINTauto-increment as actual table PK - Store
PREFIX-NNNNNNin separatepublic_idcolumn with UNIQUE INDEX - Best of both worlds: DB performance + human-readable display
- Use
- If Keeping String PK:
- Switch to Database Sequences per entity type (not server-side logic)
- Increase digits from 6 to 8-9 (supports ~100M-1B records)
- Implement strict prefix registry with minimum Levenshtein distance
- Add database CHECK constraints for format validation
Modifications Applied
- Documented hybrid approach as recommended pattern
- Added database sequence requirement for ID generation
- Documented capacity planning for 8-9 digit expansion
- Added prefix governance and collision prevention guidance
Council Ranking
- claude-opus-4.5: 0.833 (Best Response - hybrid approach)
- gpt-5.2: 0.778 (strong performance analysis)
- gemini-3-pro: 0.5
- grok-4.1: 0.0
Operational Guidelines (APPROVED_WITH_MODS)
Collision Detection Strategy
Database-Level Prevention:
-- Each entity type has a dedicated sequence
CREATE SEQUENCE IF NOT EXISTS seq_requirement_id START 1;
CREATE SEQUENCE IF NOT EXISTS seq_capability_id START 1;
CREATE SEQUENCE IF NOT EXISTS seq_process_id START 1;
-- ... (one per entity type)
-- ID generation function (atomic, collision-free)
CREATE OR REPLACE FUNCTION generate_entity_id(prefix TEXT)
RETURNS TEXT AS $$
DECLARE
seq_name TEXT := 'seq_' || LOWER(prefix) || '_id';
next_val BIGINT;
BEGIN
EXECUTE format('SELECT nextval(%L)', seq_name) INTO next_val;
RETURN prefix || '-' || LPAD(next_val::TEXT, 6, '0');
END;
$$ LANGUAGE plpgsql;
Application-Level Validation:
# backend/core/id_validator.py
def validate_id_format(id: str, expected_prefix: str | None = None) -> bool:
"""Validate ID format and optionally check prefix."""
pattern = r'^[A-Z]{2,6}-\d{6}$'
if not re.match(pattern, id):
return False
if expected_prefix and not id.startswith(expected_prefix + '-'):
return False
return True
def detect_collision(db: Session, entity_type: str, proposed_id: str) -> bool:
"""Check if ID already exists across all entity tables."""
# Query entity_registry or check specific table
return db.query(exists().where(Entity.id == proposed_id)).scalar()
Complete Prefix Registry
| Entity Type | Prefix | Max Length | Sequence Name | Notes |
|---|---|---|---|---|
| Requirement | REQ | 10 | seq_requirement_id | Core entity |
| Capability | CAP | 10 | seq_capability_id | Maturity tracking |
| Process | PROC | 11 | seq_process_id | Workflow entity |
| SLI | SLI | 10 | seq_sli_id | Metrics |
| SLO | SLO | 10 | seq_slo_id | Objectives |
| Error Budget | EB | 9 | seq_error_budget_id | Budget tracking |
| Application | APP | 10 | seq_application_id | System registry |
| Technology | TECH | 11 | seq_technology_id | Tech stack |
| Actor | ACT | 10 | seq_actor_id | Stakeholders |
| Use Case | UC | 9 | seq_use_case_id | User stories |
| Metric | MET | 10 | seq_metric_id | Measurements |
| Runbook | RB | 9 | seq_runbook_id | Automation |
| Incident | INC | 10 | seq_incident_id | Events |
| Value Stream | VS | 9 | seq_value_stream_id | Flow mapping |
| Staff Member | STAFF | 12 | seq_staff_id | Personnel |
| Opportunity | OPP | 10 | seq_opportunity_id | Improvements |
| Product | PROD | 11 | seq_product_id | Deliverables |
Prefix Governance Rules:
- Minimum 2 characters, maximum 6 characters
- Levenshtein distance ≥ 2 from existing prefixes
- Reserved: SYS, ADM, INT, API, DB (system use)
- New prefixes require Architecture Review Board approval
ID Validation at API Boundary
FastAPI Request Validation:
# backend/schemas/base.py
from pydantic import field_validator
class EntityIDMixin:
@field_validator('id', mode='before')
@classmethod
def validate_entity_id(cls, v: str) -> str:
if not v:
return v
if not re.match(r'^[A-Z]{2,6}-\d{6}$', v):
raise ValueError(
f"Invalid ID format: {v}. Expected PREFIX-NNNNNN "
f"(e.g., REQ-000001)"
)
return v.upper()
API Route Validation:
# backend/api/v1/requirements.py
@router.get("/{requirement_id}")
def get_requirement(
requirement_id: str = Path(
...,
regex=r'^REQ-\d{6}$',
description="Requirement ID in format REQ-NNNNNN"
),
db: Session = Depends(get_db)
):
"""Get requirement with validated ID format."""
...
Import/Export Validation:
def validate_import_ids(data: list[dict], entity_type: str) -> list[str]:
"""Validate IDs in import data, return list of errors."""
errors = []
expected_prefix = ID_PREFIXES.get(entity_type)
for i, row in enumerate(data):
if 'id' in row and not validate_id_format(row['id'], expected_prefix):
errors.append(f"Row {i}: Invalid ID format '{row['id']}'")
return errors
References
- REQ-320: Generic Entity Patterns requirement
- Industry patterns: JIRA issue keys, GitHub issue numbers
- PostgreSQL Sequences
ADR-002 | Database Layer | Implemented | APPROVED_WITH_MODS Completed