ADR-020: MCP Server v2 with FastMCP

Status

Implemented

Date

2025-01-16 (Retrospective)

Decision Makers

  • MCP Team - Server architecture
  • Architecture Team - Integration patterns

Layer

MCP

Related ADRs

  • ADR-021: JWT for MCP Authentication
  • ADR-039: Workflow Tools
  • ADR-040: Client Proxy

Supersedes

  • MCP Server v1 (internal implementation)

Depends On

  • ADR-008: FastAPI with Pydantic

Context

The Model Context Protocol (MCP) server enables AI assistants to interact with the platform:

  1. Tool Exposure: 28 generic tools for entity operations
  2. AI Integration: Claude, GPT, and other LLM access
  3. Remote Access: Secure access from external clients
  4. Workflow Support: Complex multi-entity operations
  5. Audit Trail: Track all AI operations

Requirements:

  • Modern async framework
  • Type-safe tool definitions
  • JWT authentication
  • Comprehensive audit logging
  • Docker containerized deployment

Decision

We adopt FastMCP as the MCP server framework.

Key Design Decisions

  1. FastMCP Framework: Modern Python MCP implementation
  2. 28 Generic Tools: CRUD + specialized operations
  3. JWT Authentication: Separate from UI OAuth
  4. Subprocess Architecture: Isolated execution
  5. SSE Transport: Server-Sent Events for streaming
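To make the SSE transport choice more concrete, the sketch below shows how a single Server-Sent Events frame is laid out on the wire. It is an illustration only and is not taken from backend/api/v1/mcp_sse.py; the helper names format_sse_event and heartbeat are hypothetical. The framing itself (id, event, and data fields, a blank line terminating each event, and comment lines starting with a colon) follows the SSE format.

```python
from typing import List, Optional


def format_sse_event(
    data: str,
    event_id: Optional[str] = None,
    event: Optional[str] = None,
) -> str:
    """Serialize one Server-Sent Events frame (hypothetical helper)."""
    lines: List[str] = []
    if event_id is not None:
        # Clients echo the last seen id back in the Last-Event-ID header on reconnect.
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # A multi-line payload needs one "data:" field per line.
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


def heartbeat() -> str:
    """Comment frame: ignored by clients, but keeps an idle connection open."""
    return ": keep-alive\n\n"
```

Because the stream only flows from server to client, requests from the client travel over a separate HTTP call.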

Tool Categories

| Category        | Tools | Description                                |
|-----------------|-------|--------------------------------------------|
| Entity CRUD     | 6     | create, read, update, delete, list, search |
| Relationships   | 4     | link, unlink, get_related, graph           |
| Bulk Operations | 3     | bulk_create, bulk_update, bulk_delete      |
| Semantic        | 3     | semantic_search, find_similar, categorize  |
| Workflow        | 11    | Complex multi-entity orchestration         |
| System          | 1     | health_check                               |
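The per-category counts should add up to the 28 tools stated in the decision. A small consistency check along the following lines could guard against the table and the registered tools drifting apart; the names TOOL_CATEGORIES and check_tool_inventory are hypothetical and only the counts come from the table.

```python
# Counts copied from the Tool Categories table.
TOOL_CATEGORIES = {
    "Entity CRUD": 6,
    "Relationships": 4,
    "Bulk Operations": 3,
    "Semantic": 3,
    "Workflow": 11,
    "System": 1,
}
EXPECTED_TOTAL = 28


def check_tool_inventory(categories: dict, expected_total: int) -> int:
    """Return the total tool count, raising if it differs from the documented total."""
    total = sum(categories.values())
    if total != expected_total:
        raise ValueError(f"expected {expected_total} tools, found {total}")
    return total
```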

Architecture

Client (Claude Desktop)
↓ SSE/HTTP
MCP Proxy (Docker)
↓ Internal
MCP Server (FastMCP)
↓ JWT Auth
Backend API
↓
Database

Tool Definition Example

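# Assumed import path (standalone fastmcp package); with the official SDK
# the equivalent is `from mcp.server.fastmcp import Context, FastMCP`.
from fastmcp import Context, FastMCP

mcp = FastMCP("mcp-server")  # server name is illustrative


@mcp.tool()
async def create_requirement(
    title: str,
    description: str = "",
    type: str = "Functional",
    priority: str = "Medium",
    ctx: Context = None,
) -> dict:
    """Create a new requirement.

    Args:
        title: Requirement title
        description: Detailed description
        type: Functional, Non-Functional, or Constraint
        priority: Low, Medium, High, or Critical

    Returns:
        Created requirement with ID
    """
    # Implementation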
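
The docstring restricts type and priority to a fixed set of values, while the signature accepts any string. One possible way to enforce the documented values is sketched below using typing.Literal. This is an assumption about how validation might be done, not the project's implementation; RequirementType, Priority, and validate_requirement_fields are hypothetical names.

```python
from typing import Literal, get_args

# Allowed values, taken from the docstring of create_requirement.
RequirementType = Literal["Functional", "Non-Functional", "Constraint"]
Priority = Literal["Low", "Medium", "High", "Critical"]


def validate_requirement_fields(type: str, priority: str) -> None:
    """Raise ValueError if either field is outside its documented value set."""
    if type not in get_args(RequirementType):
        raise ValueError(f"type must be one of {get_args(RequirementType)}, got {type!r}")
    if priority not in get_args(Priority):
        raise ValueError(f"priority must be one of {get_args(Priority)}, got {priority!r}")
```

Annotating the tool parameters with these Literal types directly would also surface the allowed values in the generated tool schema, assuming the framework derives schemas from type hints.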

Consequences

Positive

  • Modern Framework: Async-native, type-safe
  • Rich Tooling: 28 tools cover all operations
  • Isolated Execution: Container separation
  • Audit Trail: All operations logged
  • Claude Integration: Native Claude Desktop support

Negative

  • Protocol Complexity: MCP learning curve
  • Debugging: Distributed debugging harder
  • Version Sync: Tools must match backend
  • Container Overhead: Additional deployment

Neutral

  • SSE Limitations: One-way streaming
  • Tool Discovery: Clients must refresh tools

Alternatives Considered

1. Custom Protocol

  • Approach: Build proprietary tool protocol
  • Rejected: MCP is becoming standard

2. OpenAI Function Calling

  • Approach: Use OpenAI-specific approach
  • Rejected: Vendor lock-in

3. Direct API Access

  • Approach: AI clients call API directly
  • Rejected: Less control, no tool abstraction

Implementation Status

  • Core implementation complete
  • Tests written and passing
  • Documentation updated
  • Migration/upgrade path defined
  • Monitoring/observability in place

Implementation Details

  • MCP Server: backend/api/mcp_config.py
  • Tool Definitions: backend/api/v1/mcp_*.py
  • SSE Transport: backend/api/v1/mcp_sse.py
  • Workflow Tools: backend/services/mcp_workflow_tools.py
  • Audit Logger: backend/services/mcp_audit_logger.py
  • Docker: docker/mcp-proxy/

Compliance/Validation

  • Automated checks: Tool response validation
  • Manual review: New tools reviewed for security
  • Metrics: Tool invocation count, latency
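
The invocation count and latency metrics could be collected with a decorator wrapped around each async tool. The sketch below is a minimal illustration under that assumption; it does not reflect the contents of backend/services/mcp_audit_logger.py, and the names audited, invocation_counts, and the "mcp.audit" logger are hypothetical.

```python
import asyncio
import functools
import logging
import time
from collections import Counter

logger = logging.getLogger("mcp.audit")  # hypothetical logger name
invocation_counts = Counter()  # in-memory counter; a real deployment would export metrics


def audited(func):
    """Wrap an async tool to count invocations and log status and latency."""

    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        invocation_counts[func.__name__] += 1
        start = time.perf_counter()
        status = "error"
        try:
            result = await func(*args, **kwargs)
            status = "ok"
            return result
        finally:
            # Runs on success and on failure, so every call leaves an audit record.
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.info(
                "tool=%s status=%s latency_ms=%.1f", func.__name__, status, elapsed_ms
            )

    return wrapper


@audited
async def health_check() -> dict:
    return {"status": "ok"}


if __name__ == "__main__":
    print(asyncio.run(health_check()))
```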

LLM Council Review

  • Review Date: 2025-01-16
  • Confidence Level: High (100%)
  • Verdict: CONDITIONALLY ACCEPTED

Quality Metrics

  • Consensus Strength Score (CSS): 0.88
  • Deliberation Depth Index (DDI): 0.92

Council Feedback Summary

FastMCP and subprocess isolation are excellent choices, but 28 generic tools pose significant risk for AI reliability (context saturation) and security. The security model requires RBAC beyond simple JWTs.

Key Concerns Identified:

  1. 28 Tools Too Granular: Generic CRUD tools confuse LLMs and increase the risk of accidental destructive actions
  2. Context Pollution: 28 complex schemas clog the context window, increasing latency and cost
  3. Missing RBAC/Scopes: A valid JWT should not grant root access to all tools
  4. Subprocess Overhead: Spawn-per-call will severely degrade performance; worker pools are needed
  5. SSE Limitations: Proxies time out idle connections, and browser connection limits impede dashboards

Required Modifications:

  1. Consolidate to ~15 Intent-Based Tools:
    • Replace generic CRUD with domain-specific actions (remediate_alert, incident_triage)
    • Merge CRUD verbs into unified tools with operation parameter
  2. Security Model:
    • Tool-Level Authorization: Check permissions inside tool based on JWT scopes
    • Human-in-the-Loop (HITL): Require confirmation for Bulk/Delete operations
    • Dry-Run Default: All mutation tools default to dry_run=True
  3. Worker Pools: Pre-warmed processes instead of spawn-per-call
  4. SSE Hardening:
    • Implement heartbeats and reconnection logic (Last-Event-ID)
    • Document matching POST endpoint for client-to-server
  5. Prompt Injection Guardrails: Safeguards against malicious log content
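
The tool consolidation and security model items above can be combined in one unified tool. The following sketch assumes JWT scopes have already been decoded into a set and uses a hypothetical "entity:<operation>" scope naming scheme; Caller, PermissionDenied, and manage_entity are illustrative names, not the project's API.

```python
from dataclasses import dataclass
from typing import FrozenSet

MUTATING_OPERATIONS = {"create", "update", "delete"}
DESTRUCTIVE_OPERATIONS = {"delete"}


class PermissionDenied(Exception):
    """Raised when the caller's scopes do not cover the requested operation."""


@dataclass(frozen=True)
class Caller:
    subject: str
    scopes: FrozenSet[str]  # assumed to be extracted from a verified JWT


def manage_entity(
    caller: Caller,
    operation: str,
    entity_id: str,
    dry_run: bool = True,
    confirmed: bool = False,
) -> dict:
    """Unified entity tool: one entry point with an operation parameter."""
    # Tool-level authorization: check the scope inside the tool itself.
    required_scope = f"entity:{operation}"
    if required_scope not in caller.scopes:
        raise PermissionDenied(f"{caller.subject} lacks scope {required_scope}")

    result = {"operation": operation, "entity_id": entity_id}
    # Dry-run default: mutations only report what would happen unless opted out.
    if operation in MUTATING_OPERATIONS and dry_run:
        return {**result, "status": "dry_run"}
    # Human-in-the-loop: destructive operations need explicit confirmation.
    if operation in DESTRUCTIVE_OPERATIONS and not confirmed:
        return {**result, "status": "confirmation_required"}
    return {**result, "status": "executed"}
```

With this shape, a model that calls the tool with default arguments cannot delete anything: it first receives a dry-run result, then a confirmation request, and only an explicit confirmed=True executes the operation.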

Modifications Applied

  1. Documented tool consolidation strategy
  2. Added scope-based authorization requirement
  3. Documented HITL for destructive operations
  4. Added worker pool architecture
  5. Documented prompt injection mitigation
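
As an illustration of the worker pool modification, the sketch below creates a pool once and reuses it across tool calls instead of spawning a process per call. It is a simplified assumption built on the standard library's concurrent.futures, not the architecture that was actually documented; make_pool, warm, and run_tool are hypothetical names.

```python
import os
from concurrent.futures import Executor, Future, ProcessPoolExecutor, ThreadPoolExecutor
from typing import Any, Callable, Set


def _report_pid() -> int:
    """Trivial task used to prompt the pool to start its workers."""
    return os.getpid()


def make_pool(size: int, use_processes: bool = True) -> Executor:
    """Create a long-lived pool; threads are offered as a lightweight option for tests."""
    if use_processes:
        return ProcessPoolExecutor(max_workers=size)
    return ThreadPoolExecutor(max_workers=size)


def warm(pool: Executor, size: int) -> Set[int]:
    """Submit one trivial task per slot so workers start before real traffic.

    This encourages, but does not strictly guarantee, that every slot is started.
    Returns the set of process ids that handled the warm-up tasks.
    """
    futures = [pool.submit(_report_pid) for _ in range(size)]
    return {future.result() for future in futures}


def run_tool(pool: Executor, fn: Callable[..., Any], *args: Any) -> Future:
    """Dispatch a tool call to an already running worker."""
    return pool.submit(fn, *args)
```

When processes are used, submitted callables must be picklable (defined at module level), and on platforms that start workers with spawn the entry point needs an `if __name__ == "__main__":` guard.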

Council Ranking

  • claude-opus-4.5: Best Response (security model)
  • gpt-5.2: Strong (tool consolidation)
  • gemini-3-pro: Good (SSE hardening)


ADR-020 | MCP Layer | Implemented