ADR-020: MCP Server v2 with FastMCP
Status
Implemented
Date
2025-01-16 (Retrospective)
Decision Makers
- MCP Team - Server architecture
- Architecture Team - Integration patterns
Layer
MCP
Related ADRs
- ADR-021: JWT for MCP Authentication
- ADR-039: Workflow Tools
- ADR-040: Client Proxy
Supersedes
- MCP Server v1 (internal implementation)
Depends On
- ADR-008: FastAPI with Pydantic
Context
The Model Context Protocol (MCP) server enables AI assistants to interact with the platform:
- Tool Exposure: 28 generic tools for entity operations
- AI Integration: Claude, GPT, and other LLM access
- Remote Access: Secure access from external clients
- Workflow Support: Complex multi-entity operations
- Audit Trail: Track all AI operations
Requirements:
- Modern async framework
- Type-safe tool definitions
- JWT authentication
- Comprehensive audit logging
- Docker containerized deployment
Decision
We adopt FastMCP as the MCP server framework:
Key Design Decisions
- FastMCP Framework: Modern Python MCP implementation
- 28 Generic Tools: CRUD + specialized operations
- JWT Authentication: Separate from UI OAuth
- Subprocess Architecture: Isolated execution
- SSE Transport: Server-Sent Events for streaming
Tool Categories
| Category | Tools | Description |
|---|---|---|
| Entity CRUD | 6 | create, read, update, delete, list, search |
| Relationships | 4 | link, unlink, get_related, graph |
| Bulk Operations | 3 | bulk_create, bulk_update, bulk_delete |
| Semantic | 3 | semantic_search, find_similar, categorize |
| Workflow | 11 | Complex multi-entity orchestration |
| System | 1 | health_check |
Architecture
Client (Claude Desktop)
↓ SSE/HTTP
MCP Proxy (Docker)
↓ Internal
MCP Server (FastMCP)
↓ JWT Auth
Backend API
↓
Database
Tool Definition Example
@mcp.tool()
async def create_requirement(
    title: str,
    description: str = "",
    type: str = "Functional",
    priority: str = "Medium",
    ctx: Context = None,
) -> dict:
    """Create a new requirement.

    Args:
        title: Requirement title
        description: Detailed description
        type: Functional, Non-Functional, or Constraint
        priority: Low, Medium, High, or Critical

    Returns:
        Created requirement with ID
    """
    # Implementation
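The example above accepts type and priority as free-form strings, while the docstring lists a closed set of values. As a minimal sketch of how the "type-safe tool definitions" requirement could be tightened, the snippet below narrows both parameters with typing.Literal and validates them at runtime. The helper names validate_choice and build_requirement_payload are hypothetical and do not come from the codebase; the allowed values are taken from the docstring.

```python
from typing import Literal, get_args

# Allowed values copied from the create_requirement docstring.
RequirementType = Literal["Functional", "Non-Functional", "Constraint"]
Priority = Literal["Low", "Medium", "High", "Critical"]


def validate_choice(value: str, allowed: object, field: str) -> str:
    """Return value unchanged if it is one of the Literal's options, else raise."""
    options = get_args(allowed)
    if value not in options:
        raise ValueError(f"{field} must be one of {options}, got {value!r}")
    return value


def build_requirement_payload(
    title: str,
    description: str = "",
    type: RequirementType = "Functional",
    priority: Priority = "Medium",
) -> dict:
    """Hypothetical helper: validate tool inputs and build the request body."""
    if not title.strip():
        raise ValueError("title must not be empty")
    return {
        "title": title.strip(),
        "description": description,
        "type": validate_choice(type, RequirementType, "type"),
        "priority": validate_choice(priority, Priority, "priority"),
    }
```

With Literal annotations the generated tool schema can advertise the permitted values to the client, and the runtime check rejects anything a model invents outside that set.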
Consequences
Positive
- Modern Framework: Async-native, type-safe
- Rich Tooling: 28 tools cover all operations
- Isolated Execution: Container separation
- Audit Trail: All operations logged
- Claude Integration: Native Claude Desktop support
Negative
- Protocol Complexity: MCP learning curve
- Debugging: Distributed debugging harder
- Version Sync: Tools must match backend
- Container Overhead: Additional deployment
Neutral
- SSE Limitations: One-way (server-to-client) streaming
- Tool Discovery: Clients must refresh tools
Alternatives Considered
1. Custom Protocol
- Approach: Build proprietary tool protocol
- Rejected: MCP is becoming standard
2. OpenAI Function Calling
- Approach: Use OpenAI-specific approach
- Rejected: Vendor lock-in
3. Direct API Access
- Approach: AI clients call API directly
- Rejected: Less control, no tool abstraction
Implementation Status
- Core implementation complete
- Tests written and passing
- Documentation updated
- Migration/upgrade path defined
- Monitoring/observability in place
Implementation Details
- MCP Server: backend/api/mcp_config.py
- Tool Definitions: backend/api/v1/mcp_*.py
- SSE Transport: backend/api/v1/mcp_sse.py
- Workflow Tools: backend/services/mcp_workflow_tools.py
- Audit Logger: backend/services/mcp_audit_logger.py
- Docker: docker/mcp-proxy/
Compliance/Validation
- Automated checks: Tool response validation
- Manual review: New tools reviewed for security
- Metrics: Tool invocation count, latency
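The invocation-count and latency metrics, together with the audit trail, can be captured in one place by wrapping each tool. The following is a sketch only: the in-memory stores INVOCATIONS, LATENCIES, and AUDIT_LOG stand in for whatever metrics backend and audit sink the project actually uses, the audited decorator is a hypothetical name, and the body of health_check is invented for illustration.

```python
import asyncio
import functools
import time
from collections import defaultdict

# In-memory stand-ins for a real metrics backend and audit sink (assumptions).
INVOCATIONS = defaultdict(int)
LATENCIES = defaultdict(list)
AUDIT_LOG = []


def audited(tool_name: str):
    """Wrap an async tool so every call is counted, timed, and logged."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            start = time.perf_counter()
            status = "ok"
            try:
                return await func(*args, **kwargs)
            except Exception:
                status = "error"
                raise
            finally:
                # Runs on success and on failure, so no call goes unrecorded.
                elapsed = time.perf_counter() - start
                INVOCATIONS[tool_name] += 1
                LATENCIES[tool_name].append(elapsed)
                AUDIT_LOG.append({"tool": tool_name, "status": status, "args": kwargs})
        return wrapper
    return decorator


@audited("health_check")
async def health_check() -> dict:
    # Placeholder body; the real tool would query the backend.
    return {"status": "healthy"}
```

Recording in a finally block means failed calls are logged with status "error" before the exception propagates to the caller.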
LLM Council Review
Review Date: 2025-01-16
Confidence Level: High (100%)
Verdict: CONDITIONALLY ACCEPTED
Quality Metrics
- Consensus Strength Score (CSS): 0.88
- Deliberation Depth Index (DDI): 0.92
Council Feedback Summary
FastMCP and subprocess isolation are excellent choices, but 28 generic tools pose significant risk for AI reliability (context saturation) and security. The security model requires RBAC beyond simple JWTs.
Key Concerns Identified:
- 28 Tools Too Granular: Generic CRUD tools confuse LLMs and increase the risk of accidental destructive actions
- Context Pollution: 28 complex schemas clog the context window, increasing latency and cost
- Missing RBAC/Scopes: A JWT should not grant root access to all tools
- Subprocess Overhead: Spawning a process per call will hurt performance; worker pools are needed
- SSE Limitations: Proxies time out long-lived connections, and browser connection limits impede dashboards
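To illustrate the subprocess concern, the sketch below starts a fixed number of worker processes once and reuses them for every call instead of spawning a process per invocation. Everything here is an assumption made for illustration: the WarmWorkerPool class, the JSON-lines protocol over stdin/stdout, and the echo worker that stands in for real tool execution. It is synchronous and not thread-safe; an async server would more likely use asyncio subprocesses.

```python
import json
import subprocess
import sys

# Child program: reads one JSON request per line, writes one JSON reply per line.
# The echo logic is a placeholder for real tool execution.
WORKER_SOURCE = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    result = {"echo": request["payload"]}
    sys.stdout.write(json.dumps(result) + "\n")
    sys.stdout.flush()
'''


class WarmWorkerPool:
    """Start N long-lived worker processes once and reuse them for every call."""

    def __init__(self, size: int = 2):
        self._workers = [
            subprocess.Popen(
                [sys.executable, "-c", WORKER_SOURCE],
                stdin=subprocess.PIPE,
                stdout=subprocess.PIPE,
                text=True,
            )
            for _ in range(size)
        ]
        self._next = 0

    def call(self, payload: dict) -> dict:
        """Send one request to the next worker (round-robin) and wait for its reply."""
        worker = self._workers[self._next]
        self._next = (self._next + 1) % len(self._workers)
        worker.stdin.write(json.dumps({"payload": payload}) + "\n")
        worker.stdin.flush()
        return json.loads(worker.stdout.readline())

    def pids(self) -> list:
        """Process IDs of the workers; these stay constant across calls."""
        return [w.pid for w in self._workers]

    def close(self) -> None:
        """Close stdin so each worker sees EOF and exits, then reap it."""
        for worker in self._workers:
            worker.stdin.close()
            worker.wait(timeout=5)
            worker.stdout.close()
```

The interpreter start-up cost is paid once per worker at pool creation, so each subsequent call only pays for one pipe round trip.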
Required Modifications:
- Consolidate to ~15 Intent-Based Tools:
  - Replace generic CRUD with domain-specific actions (remediate_alert, incident_triage)
  - Merge CRUD verbs into unified tools with operation parameter
- Security Model:
  - Tool-Level Authorization: Check permissions inside tool based on JWT scopes
  - Human-in-the-Loop (HITL): Require confirmation for Bulk/Delete operations
  - Dry-Run Default: All mutation tools default to dry_run=True
- Worker Pools: Pre-warmed processes instead of spawn-per-call
- SSE Hardening:
  - Implement heartbeats and reconnection logic (Last-Event-ID)
  - Document matching POST endpoint for client-to-server
- Prompt Injection Guardrails: Safeguards against malicious log content
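The SSE hardening items can be made concrete with a small sketch of the text/event-stream wire format. It shows an event carrying an id field (which a reconnecting client echoes back in the Last-Event-ID header), a comment-line heartbeat, and replay of missed events. The function names and the in-memory history list of (event_id, payload) tuples are assumptions for illustration, not the contents of mcp_sse.py.

```python
import json
from typing import Optional


def format_sse_event(data: dict, event: Optional[str] = None,
                     event_id: Optional[str] = None) -> str:
    """Serialize one event in the text/event-stream wire format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps emits no raw newlines by default, so one data line suffices.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


def format_heartbeat() -> str:
    """A comment line (leading colon) that clients ignore; keeps the connection active."""
    return ": heartbeat\n\n"


def events_after(history: list, last_event_id: Optional[str]) -> list:
    """Return the events a reconnecting client missed, given its Last-Event-ID."""
    if last_event_id is None:
        return list(history)
    ids = [eid for eid, _ in history]
    if last_event_id not in ids:
        # Unknown id: replay everything still buffered.
        return list(history)
    return history[ids.index(last_event_id) + 1:]
```

Sending the heartbeat on a timer shorter than the proxy idle timeout keeps intermediaries from closing the stream, and the id field lets a client resume without losing events.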
Modifications Applied
- Documented tool consolidation strategy
- Added scope-based authorization requirement
- Documented HITL for destructive operations
- Added worker pool architecture
- Documented prompt injection mitigation
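The scope-based authorization, HITL, and dry-run requirements could combine in a single unified tool along the following lines. This is a sketch under stated assumptions: the scope names, the manage_entity function, and the confirmed flag are hypothetical, and claims is assumed to be the payload of a JWT that has already been verified, carrying a space-delimited scope claim.

```python
from typing import Optional

# Hypothetical scope names; the ADR does not define the actual vocabulary.
REQUIRED_SCOPES = {
    "read": "entities:read",
    "create": "entities:write",
    "update": "entities:write",
    "delete": "entities:delete",
}
DESTRUCTIVE = {"delete"}


class PermissionDenied(Exception):
    """Raised when the caller's token lacks the scope an operation needs."""


def manage_entity(claims: dict, operation: str, entity_id: Optional[str] = None,
                  dry_run: bool = True, confirmed: bool = False) -> dict:
    """Unified tool: one entry point, with the CRUD verb passed as `operation`."""
    if operation not in REQUIRED_SCOPES:
        raise ValueError(f"unknown operation: {operation!r}")

    # Tool-level authorization: scopes come from the already-verified JWT claims.
    granted = set(claims.get("scope", "").split())
    needed = REQUIRED_SCOPES[operation]
    if needed not in granted:
        raise PermissionDenied(f"{operation} requires scope {needed!r}")

    # Dry-run default: mutations only report what they would do.
    if operation != "read" and dry_run:
        return {"status": "dry_run", "operation": operation, "entity_id": entity_id}

    # Human-in-the-loop: destructive operations need explicit confirmation.
    if operation in DESTRUCTIVE and not confirmed:
        return {"status": "confirmation_required", "operation": operation,
                "entity_id": entity_id}

    # The real backend call would go here.
    return {"status": "executed", "operation": operation, "entity_id": entity_id}
```

Under this ordering a delete takes effect only when the token carries the delete scope, the caller passes dry_run=False, and a human has confirmed; any missing step returns a non-destructive result.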
Council Ranking
- claude-opus-4.5: Best Response (security model)
- gpt-5.2: Strong (tool consolidation)
- gemini-3-pro: Good (SSE hardening)
References
- Model Context Protocol
- FastMCP Documentation
- /docs/mcp/
ADR-020 | MCP Layer | Implemented