MCP Bidirectional Traffic: Fixing SSE Buffering and Rate Limits
Published: 2025-01-16
When the LLM Council reviewed our MCP client proxy (ADR-040), they identified a critical gap: our nginx configuration was buffering SSE responses, causing tool execution to hang. Additionally, our standard API rate limits (60 req/min) were breaking MCP negotiation, which is inherently chatty.
This post details how Issue #460 fixed both the buffering and rate-limiting issues.
The Problem
Problem 1: nginx Buffering Blocks SSE
Default nginx proxy configuration buffers responses:
```nginx
# Default behavior (problematic for SSE)
location /api/ {
    proxy_pass http://backend;
    # proxy_buffering is ON by default!
}
```
When SSE events are buffered, they arrive in bursts instead of in real time. For MCP, this means:
- Tool execution appears to hang for seconds
- Timeouts during long-running operations
- Poor user experience in Claude Desktop
Problem 2: Rate Limits Break MCP Negotiation
MCP protocol is chatty during initialization:
- Capabilities exchange
- Tool listing
- Prompt listing
- Resource queries
Our standard 60 req/min limit triggered during normal MCP negotiation, causing connection failures.
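As a rough illustration of how negotiation alone eats that budget, here is a back-of-the-envelope sketch. The method names follow the MCP JSON-RPC specification; the client count is illustrative, not a measurement:

```python
# JSON-RPC methods a single MCP client typically issues before its
# first tool call (method names per the MCP specification)
negotiation_calls = [
    "initialize",
    "notifications/initialized",
    "tools/list",
    "prompts/list",
    "resources/list",
]

# Illustrative: several clients for one user reconnecting at once
# (e.g., a Claude Desktop restart) multiplies the burst
clients = 15
total = clients * len(negotiation_calls)
print(total)  # 75 requests within seconds, blowing a 60 req/min budget
```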
The Solution
1. nginx SSE Location Block
Added dedicated location block for MCP SSE endpoints:
```nginx
# MCP SSE endpoint - MUST be before generic /api/
location ~ ^/api/v1/mcp/(sse|message) {
    proxy_pass ${API_PROXY_URL};
    proxy_http_version 1.1;

    # Disable buffering for SSE (critical for real-time events)
    proxy_buffering off;
    proxy_cache off;
    proxy_set_header X-Accel-Buffering "no";

    # Extended timeouts for long-running SSE (4 hours)
    proxy_read_timeout 14400s;
    proxy_send_timeout 14400s;

    # Connection headers for SSE
    proxy_set_header Connection '';
    chunked_transfer_encoding on;
}
```
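To see the difference buffering makes, here is a self-contained sketch: a local stand-in SSE endpoint (not our production backend) emits events 100 ms apart, and a client timestamps their arrival. With no buffering in the path, the gaps between arrivals match the emission interval; a buffering proxy would collapse them into one burst at the end.

```python
import http.server
import threading
import time
import urllib.request

class SSEHandler(http.server.BaseHTTPRequestHandler):
    """Local stand-in SSE server: emits one event every 100 ms."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/event-stream")
        self.end_headers()
        for i in range(3):
            self.wfile.write(f"data: event-{i}\n\n".encode())
            self.wfile.flush()  # emit immediately, no server-side buffering
            time.sleep(0.1)

    def log_message(self, *args):  # silence request logging
        pass

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), SSEHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Record when each `data:` line actually arrives at the client
arrivals = []
with urllib.request.urlopen(f"http://127.0.0.1:{server.server_port}/sse") as resp:
    for line in resp:
        if line.startswith(b"data:"):
            arrivals.append(time.monotonic())
server.shutdown()

gaps = [b - a for a, b in zip(arrivals, arrivals[1:])]
print(all(g > 0.05 for g in gaps))  # True: events trickled in, not one burst
```

A buffered proxy in front of this server would make every gap near zero, with all events landing together when the buffer flushes: exactly the "tool execution appears to hang" symptom.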
Key configurations:
- `proxy_buffering off`: Events stream immediately
- `X-Accel-Buffering: no`: Header for upstream servers
- `14400s` timeouts: 4-hour sessions for long operations
- `Connection ''`: Prevents connection-header interference
2. Split Rate Limiting
Created separate rate limiters for SSE and messages:
```python
# SSE: Connection-based limit (5 concurrent per user)
class MCPSSERateLimiter:
    def __init__(self, max_connections: int = 5):
        self.max_connections = max_connections

    async def acquire(self, user_id: str, connection_id: str) -> bool:
        """Acquire a connection slot."""
        # Uses Redis SET for atomic connection counting
        ...

# Messages: Token bucket (200 req/min, 50 burst)
class MCPMessageRateLimiter:
    def __init__(self, rate_limit: int = 200, burst_limit: int = 50):
        self.rate_limit = rate_limit
        self.burst_limit = burst_limit

    async def check(self, user_id: str) -> bool:
        """Check if message is allowed."""
        # Uses Redis token bucket algorithm
        ...
```
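The message limiter's refill math can be sketched in memory like this. The production version keeps `{tokens, last_refill}` in a Redis HASH; this sketch mirrors the same token-bucket logic without the Redis dependency, and the class name is illustrative:

```python
import time

class TokenBucket:
    """In-memory sketch of the Redis token-bucket logic behind
    MCPMessageRateLimiter (illustrative, not the production class)."""

    def __init__(self, rate_limit: int = 200, burst_limit: int = 50):
        self.rate_per_sec = rate_limit / 60.0  # refill rate: 200/min
        self.capacity = burst_limit            # max burst size
        self.tokens = float(burst_limit)       # start full
        self.last_refill = time.monotonic()

    def check(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at burst capacity
        self.tokens = min(
            self.capacity,
            self.tokens + (now - self.last_refill) * self.rate_per_sec,
        )
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = TokenBucket()
allowed = sum(bucket.check() for _ in range(60))
print(allowed)  # 50: the burst passes, the rest are throttled
```

A rapid run of 60 checks admits exactly the 50-request burst; the remaining requests wait for the 200/min refill.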
Why split limits?
| Endpoint | Limit Type | Value | Reason |
|---|---|---|---|
| SSE `/sse` | Concurrent | 5 per user | Long-lived connections, prevent resource exhaustion |
| POST `/message/{id}` | Token bucket | 200/min, 50 burst | Handle chatty negotiation, allow burst |
3. Extended Session TTL
MCP sessions now have 4-hour TTL to match nginx timeouts:
```python
# backend/api/v1/mcp_sse.py
MCP_SESSION_TTL_SECONDS = 14400   # 4 hours
SSE_READ_TIMEOUT_SECONDS = 14400  # Matches nginx config
```
Implementation Details
Rate Limiter Storage
Uses Redis DB 4 (separate from API rate limiting DB 3):
```python
MCP_RATE_LIMIT_DB = 4

# SSE: Uses Redis SET for connection tracking
key = f"mcp_sse:{user_id}:connections"
# SET contains active connection_ids

# Messages: Uses Redis HASH for token bucket
key = f"mcp_msg:{user_id}"
# HASH contains {tokens: N, last_refill: timestamp}
```
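The SET-based connection accounting can be sketched the same way, with SADD/SCARD/SREM semantics mirrored by an in-memory set (class name, user, and connection IDs here are illustrative):

```python
class SSEConnectionLimiter:
    """In-memory sketch of the Redis SET logic behind MCPSSERateLimiter."""

    def __init__(self, max_connections: int = 5):
        self.max_connections = max_connections
        self._connections: dict[str, set[str]] = {}

    def acquire(self, user_id: str, connection_id: str) -> bool:
        conns = self._connections.setdefault(user_id, set())
        if len(conns) >= self.max_connections:
            return False          # all 5 slots in use
        conns.add(connection_id)  # Redis: SADD
        return True

    def release(self, user_id: str, connection_id: str) -> None:
        self._connections.get(user_id, set()).discard(connection_id)  # Redis: SREM

limiter = SSEConnectionLimiter()
results = [limiter.acquire("user@example.com", f"conn-{i}") for i in range(6)]
print(results)  # [True, True, True, True, True, False]

limiter.release("user@example.com", "conn-0")  # slot freed on disconnect
reacquired = limiter.acquire("user@example.com", "conn-6")
print(reacquired)  # True: the freed slot is reusable
```

The production version does the count and insert atomically in Redis; the sketch shows why releasing on disconnect is essential, which the next section covers.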
Rate Limiter Release
Crucial: Release SSE slot when connection closes:
```python
async def remove_connection(self, connection_id: str):
    # ... disconnect logic ...

    # Release the SSE rate limit slot
    user_key = connection.user_info.email
    await sse_limiter.release(user_key, connection_id)
```
Without this, users would exhaust their connection limit and be unable to reconnect.
Impact
| Metric | Before | After |
|---|---|---|
| SSE event latency | Buffered (seconds) | Real-time (<100ms) |
| MCP negotiation | Often rate limited | Reliable |
| Session duration | 30 minutes | 4 hours |
| Concurrent connections | No limit | 5 per user |
| ADR-040 verdict | CONDITIONAL | APPROVED |
Lessons Learned
- SSE needs special handling: Standard proxy configs don't work for SSE
- Different endpoints, different limits: API rate limits don't fit all protocols
- Match timeouts end-to-end: nginx, backend, and client must agree
- Resource cleanup matters: Release rate limit slots on disconnect
Issue #460 | ADR-040 | LLM Council Blocking Issue Resolved