ADR-020: MCP Server v2 with FastMCP

Status

Implemented

Date

2025-01-16 (Retrospective)

Decision Makers

MCP Team - Server architecture
Architecture Team - Integration patterns

Layer

MCP

Related ADRs

ADR-021: JWT for MCP Authentication
ADR-039: Workflow Tools
ADR-040: Client Proxy

Supersedes

MCP Server v1 (internal implementation)

Depends On

ADR-008: FastAPI with Pydantic

Context

The Model Context Protocol (MCP) server enables AI assistants to interact with the platform:

Tool Exposure: 28 generic tools for entity operations
AI Integration: Claude, GPT, and other LLM access
Remote Access: Secure access from external clients
Workflow Support: Complex multi-entity operations
Audit Trail: Track all AI operations

Requirements:

Modern async framework
Type-safe tool definitions
JWT authentication
Comprehensive audit logging
Docker containerized deployment

Decision

We adopt FastMCP as the MCP server framework:

Key Design Decisions

FastMCP Framework: Modern Python MCP implementation
28 Generic Tools: CRUD + specialized operations
JWT Authentication: Separate from UI OAuth
Subprocess Architecture: Isolated execution
SSE Transport: Server-Sent Events for streaming

Tool Categories

Category	Tools	Description
Entity CRUD	6	create, read, update, delete, list, search
Relationships	4	link, unlink, get_related, graph
Bulk Operations	3	bulk_create, bulk_update, bulk_delete
Semantic	3	semantic_search, find_similar, categorize
Workflow	11	Complex multi-entity orchestration
System	1	health_check
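
The per-category counts above should add up to the 28 tools cited throughout this ADR. A minimal sketch of that consistency check follows; `TOOL_COUNTS` and `NAMED_TOOLS` are hypothetical names, not taken from the codebase, and the eleven workflow tools are recorded only as a count because this ADR does not name them.

```python
# Hypothetical registry mirroring the Tool Categories table.
TOOL_COUNTS = {
    "Entity CRUD": 6,
    "Relationships": 4,
    "Bulk Operations": 3,
    "Semantic": 3,
    "Workflow": 11,
    "System": 1,
}

# Tool names exactly as listed in the table (Workflow names are not given).
NAMED_TOOLS = {
    "Entity CRUD": ["create", "read", "update", "delete", "list", "search"],
    "Relationships": ["link", "unlink", "get_related", "graph"],
    "Bulk Operations": ["bulk_create", "bulk_update", "bulk_delete"],
    "Semantic": ["semantic_search", "find_similar", "categorize"],
    "System": ["health_check"],
}


def total_tools(counts):
    """Sum the per-category tool counts."""
    return sum(counts.values())


def mismatched_categories(counts, named):
    """Return categories whose listed names disagree with the stated count."""
    return [cat for cat, names in named.items() if len(names) != counts[cat]]
```

A check like this could run in CI so the documented total cannot drift from the registered tools.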

Architecture

Client (Claude Desktop)
    ↓ SSE/HTTP
MCP Proxy (Docker)
    ↓ Internal
MCP Server (FastMCP)
    ↓ JWT Auth
Backend API
    ↓
Database
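
The "JWT Auth" hop in the diagram can be illustrated with a small token round trip. This is a sketch under stated assumptions, not the platform's implementation: it assumes HS256 with a shared secret and an `exp` claim, and the helper names (`sign_token`, `verify_token`, `backend_headers`) are hypothetical. A production server would normally rely on a vetted JWT library rather than hand-rolled code.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_encode(raw: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding stripped during encoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    """Build an HS256-signed JWT (assumed algorithm) from the given claims."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url_encode(signature)}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if the signature and expiry check out, else raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    # A token without an exp claim is treated as expired.
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims


def backend_headers(token: str) -> dict:
    """Headers the MCP server would attach when calling the Backend API."""
    return {"Authorization": f"Bearer {token}"}
```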

Tool Definition Example

# Assumes the standalone fastmcp package; the server name is illustrative.
from fastmcp import Context, FastMCP

mcp = FastMCP("platform-mcp")


@mcp.tool()
async def create_requirement(
    title: str,
    description: str = "",
    type: str = "Functional",
    priority: str = "Medium",
    ctx: Context | None = None,  # request context, injected by FastMCP
) -> dict:
    """Create a new requirement.

    Args:
        title: Requirement title
        description: Detailed description
        type: Functional, Non-Functional, or Constraint
        priority: Low, Medium, High, or Critical

    Returns:
        Created requirement with ID
    """
    # Implementation
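
The example above documents the allowed values for `type` and `priority` only in the docstring. One way to meet the type-safe requirement is to encode them as `Literal` types, which a framework that builds its tool schema from annotations can expose to the client as enumerations. The sketch below uses only the standard library; `RequirementType`, `Priority`, and `validate_requirement_fields` are hypothetical names introduced for illustration.

```python
from typing import Literal, get_args

# Allowed values copied from the docstring of create_requirement.
RequirementType = Literal["Functional", "Non-Functional", "Constraint"]
Priority = Literal["Low", "Medium", "High", "Critical"]


def validate_requirement_fields(req_type: str, priority: str) -> None:
    """Raise ValueError if either field is outside its allowed set."""
    if req_type not in get_args(RequirementType):
        raise ValueError(f"type must be one of {get_args(RequirementType)}")
    if priority not in get_args(Priority):
        raise ValueError(f"priority must be one of {get_args(Priority)}")
```

With these aliases, the tool signature could declare `type: RequirementType = "Functional"` and `priority: Priority = "Medium"`, and the explicit validator would only be needed for inputs that bypass schema validation.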

Consequences

Positive

Modern Framework: Async-native, type-safe
Rich Tooling: 28 tools cover all operations
Isolated Execution: Container separation
Audit Trail: All operations logged
Claude Integration: Native Claude Desktop support

Negative

Protocol Complexity: MCP learning curve
Debugging: Distributed debugging harder
Version Sync: Tools must match backend
Container Overhead: Additional deployment

Neutral

SSE Limitations: One-way streaming
Tool Discovery: Clients must refresh tools
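
Because SSE streams flow only from server to client, dropped connections are usually handled with comment-line heartbeats and replay based on the `Last-Event-ID` request header, which the council review below also recommends. The following is a minimal sketch of the wire framing and replay buffer; `format_sse`, `HEARTBEAT`, and `EventLog` are hypothetical names and do not describe the code in `mcp_sse.py`.

```python
from collections import deque
from typing import Optional

# A line starting with ":" is an SSE comment; clients ignore it, but it keeps
# intermediaries from treating the connection as idle.
HEARTBEAT = ": heartbeat\n\n"


def format_sse(data: str, event_id: Optional[int] = None, event: Optional[str] = None) -> str:
    """Serialize one event in the text/event-stream format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads must be sent as one "data:" line per line of text.
    lines.extend(f"data: {chunk}" for chunk in data.split("\n"))
    return "\n".join(lines) + "\n\n"


class EventLog:
    """Bounded buffer of recent events so reconnecting clients can catch up."""

    def __init__(self, capacity: int = 100):
        self._events = deque(maxlen=capacity)
        self._next_id = 1

    def publish(self, data: str) -> str:
        """Assign the next ID, remember the frame, and return it for sending."""
        event_id = self._next_id
        self._next_id += 1
        frame = format_sse(data, event_id=event_id)
        self._events.append((event_id, frame))
        return frame

    def replay_after(self, last_event_id: Optional[str]) -> list:
        """Return buffered frames newer than the client's Last-Event-ID header."""
        try:
            last = int(last_event_id) if last_event_id is not None else 0
        except ValueError:
            last = 0  # unparseable header: replay everything still buffered
        return [frame for event_id, frame in self._events if event_id > last]
```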

Alternatives Considered

1. Custom Protocol

Approach: Build proprietary tool protocol
Rejected: MCP is becoming the standard protocol for AI tool access

2. OpenAI Function Calling

Approach: Use OpenAI-specific approach
Rejected: Vendor lock-in

3. Direct API Access

Approach: AI clients call API directly
Rejected: Less control, no tool abstraction

Implementation Status

Implementation Details

MCP Server: backend/api/mcp_config.py
Tool Definitions: backend/api/v1/mcp_*.py
SSE Transport: backend/api/v1/mcp_sse.py
Workflow Tools: backend/services/mcp_workflow_tools.py
Audit Logger: backend/services/mcp_audit_logger.py
Docker: docker/mcp-proxy/

Compliance/Validation

Automated checks: Tool response validation
Manual review: New tools reviewed for security
Metrics: Tool invocation count, latency
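
The two metrics named above, invocation count and latency, can be collected with a decorator around each tool function. This is a sketch only: `ToolMetrics`, `METRICS`, and `instrumented` are hypothetical names, and a real deployment would likely export to a metrics backend instead of keeping counters in memory.

```python
import asyncio
import functools
import time
from collections import defaultdict


class ToolMetrics:
    """In-memory counters for per-tool invocation count and latency."""

    def __init__(self):
        self.invocations = defaultdict(int)
        self.total_latency_s = defaultdict(float)

    def record(self, tool: str, elapsed_s: float) -> None:
        self.invocations[tool] += 1
        self.total_latency_s[tool] += elapsed_s

    def mean_latency_s(self, tool: str) -> float:
        count = self.invocations[tool]
        return self.total_latency_s[tool] / count if count else 0.0


METRICS = ToolMetrics()


def instrumented(func):
    """Wrap an async tool so every call is counted and timed."""

    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        started = time.perf_counter()
        try:
            return await func(*args, **kwargs)
        finally:
            # Recorded in finally so failed calls are measured too.
            METRICS.record(func.__name__, time.perf_counter() - started)

    return wrapper


@instrumented
async def health_check() -> dict:
    """Stand-in for the System tool listed in the tool table."""
    return {"status": "ok"}
```

Calling `asyncio.run(health_check())` increments `METRICS.invocations["health_check"]` and adds the elapsed time to the running total.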

LLM Council Review

Review Date: 2025-01-16
Confidence Level: High (100%)
Verdict: CONDITIONALLY ACCEPTED

Quality Metrics

Consensus Strength Score (CSS): 0.88
Deliberation Depth Index (DDI): 0.92

Council Feedback Summary

FastMCP and subprocess isolation are excellent choices, but 28 generic tools pose significant risk for AI reliability (context saturation) and security. The security model requires RBAC beyond simple JWTs.

Key Concerns Identified:

28 Tools Too Granular: Generic CRUD tools confuse LLMs and increase the risk of accidental destructive actions
Context Pollution: 28 complex schemas crowd the context window, increasing latency and cost
Missing RBAC/Scopes: A JWT should not grant root access to all tools
Subprocess Overhead: Spawning a process per call will degrade performance; worker pools are needed
SSE Limitations: Proxies time out long-lived connections, and browser connection limits impede dashboards
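
The subprocess overhead concern contrasts spawn-per-call with pre-warmed workers. A minimal sketch of the latter follows, assuming a line-delimited JSON protocol over stdin/stdout; `WorkerPool` and the echo worker are hypothetical and only demonstrate that a process started once can serve many calls.

```python
import json
import queue
import subprocess
import sys

# Hypothetical worker: reads one JSON request per line, writes one JSON reply.
WORKER_SOURCE = """
import json, os, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"pid": os.getpid(), "echo": request["payload"]}
    print(json.dumps(reply), flush=True)
"""


class WorkerPool:
    """Starts all workers up front and reuses them for every call."""

    def __init__(self, size: int = 2):
        self._workers = []
        self._idle = queue.Queue()
        for _ in range(size):
            proc = subprocess.Popen(
                [sys.executable, "-c", WORKER_SOURCE],
                stdin=subprocess.PIPE,
                stdout=subprocess.PIPE,
                text=True,
            )
            self._workers.append(proc)
            self._idle.put(proc)

    def call(self, payload):
        """Send one request to an idle worker and wait for its reply."""
        proc = self._idle.get()  # blocks until a worker is free
        try:
            proc.stdin.write(json.dumps({"payload": payload}) + "\n")
            proc.stdin.flush()
            return json.loads(proc.stdout.readline())
        finally:
            self._idle.put(proc)

    def close(self) -> None:
        """Closing stdin ends each worker's read loop so it exits cleanly."""
        for proc in self._workers:
            proc.stdin.close()
            proc.wait(timeout=5)
            proc.stdout.close()
```

Two consecutive calls on a pool of size one return the same `pid`, which shows that the process was reused instead of respawned.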

Required Modifications:

Consolidate to ~15 Intent-Based Tools:
- Replace generic CRUD with domain-specific actions (remediate_alert, incident_triage)
- Merge CRUD verbs into unified tools with operation parameter
Security Model:
- Tool-Level Authorization: Check permissions inside tool based on JWT scopes
- Human-in-the-Loop (HITL): Require confirmation for Bulk/Delete operations
- Dry-Run Default: All mutation tools default to dry_run=True
Worker Pools: Pre-warmed processes instead of spawn-per-call
SSE Hardening:
- Implement heartbeats and reconnection logic (Last-Event-ID)
- Document matching POST endpoint for client-to-server
Prompt Injection Guardrails: Safeguards against malicious log content

Modifications Applied

Documented tool consolidation strategy
Added scope-based authorization requirement
Documented HITL for destructive operations
Added worker pool architecture
Documented prompt injection mitigation

Council Ranking

claude-opus-4.5: Best Response (security model)
gpt-5.2: Strong (tool consolidation)
gemini-3-pro: Good (SSE hardening)
