# ADR-007: LLM Integration Architecture

## Status

Implemented (Phases 1-4 Complete as of v4.15.0 - 2026-02-03)

### Implementation Status
| Phase | Version | Status | Description |
|---|---|---|---|
| Phase 1 | v4.11.0 | ✅ Complete | Agent Skills, MCP ReadOnly tools, Chat UI |
| Phase 1B | v4.11.0 | ✅ Complete | ReadOnly MCP tools for state visibility |
| Phase 2 | v4.12.0 | ✅ Complete | Plan/Apply workflow, TOCTOU protection |
| Phase 3 | v4.13.0-v4.15.0 | ✅ Complete | All 5 providers, real SSE streaming, batch ops, conversation persistence (P3-05), cost tracking UI (P3-06) |
| Phase 4 | v4.14.0 | ✅ Complete | HardwareIO tier, audit logging, rate limiting, undo/redo |
| Phase 5 | Future | Pending | OAuth2 remote access, A2A protocol, conversation optimization |
**Original Approval**: LLM Council Reviewed - 2026-01-31

## Context
Conductor is a mature multi-protocol input mapping system with a Tauri GUI, daemon infrastructure, and comprehensive configuration capabilities. Users currently configure devices, modes, mappings, profiles, and plugins through the GUI or by editing TOML files directly.
### Problem Statement

- **Complex Configuration**: Setting up sophisticated mappings requires understanding trigger types, action parameters, velocity curves, and conditional logic
- **No Natural Language Interface**: Users cannot describe their desired behavior in plain language
- **Limited Automation**: No way for AI assistants to help configure the system
- **Fragmented Tooling**: External LLMs cannot interact with a running Conductor daemon
### Requirements

- **R1**: Integrate a chat interface into the GUI enabling natural language discussion of requirements and mappings
- **R2**: Enable LLMs to manage MIDI capture and configure devices, modes, mappings, profiles, plugins, and settings
- **R3**: Support multiple LLM providers: OpenAI, Anthropic, Google, and routers such as OpenRouter and LiteLLM
- **R4**: Enable external LLMs to control Conductor via MCP (Model Context Protocol), A2A (Agent-to-Agent), and skills.md
- **R5**: Maintain security boundaries: LLMs must not have unrestricted shell access
## Decision

### Skills vs MCP: Complementary Architectures
Conductor implements both Agent Skills and MCP because they serve fundamentally different purposes:
| Aspect | Agent Skills (SKILL.md) | MCP Tools |
|---|---|---|
| Purpose | Knowledge & expertise ("how to") | Capabilities & actions ("what to do") |
| Analogy | "Brain and playbook" | "Arms and legs" |
| Token Cost | ~100 tokens (metadata) to ~5K (full skill) | 10K-50K+ tokens (tool schemas) |
| Security | Prompt-based, sandboxed by agent | Requires auth, API boundaries |
| Cross-platform | Claude Code, Copilot, Cursor, Codex | MCP-compatible clients |
| Example | "How to create a velocity-sensitive mapping" | `conductor_create_mapping(...)` |
Why both? "Skills without MCP are well-written instructions. MCP without Skills is raw power with no guidance." Skills teach the agent Conductor's domain concepts (triggers, actions, patterns), while MCP provides the actual execution capabilities.
### Architecture Overview

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                             CONDUCTOR ECOSYSTEM                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                        CONDUCTOR GUI (Tauri)                        │    │
│  │  ┌─────────────────┐   ┌─────────────────────────────────────────┐  │    │
│  │  │   Chat Panel    │   │          Configuration Views            │  │    │
│  │  │  ┌───────────┐  │   │  Devices │ Modes │ Mappings │ Profiles  │  │    │
│  │  │  │ Messages  │  │   └─────────────────────────────────────────┘  │    │
│  │  │  │  [User]   │  │                     ↑                          │    │
│  │  │  │  [LLM]    │  │                     │ Config Updates           │    │
│  │  │  │ [System]  │  │   ┌─────────────────┴───────────────────────┐  │    │
│  │  │  └───────────┘  │   │        LLM Agent Controller             │  │    │
│  │  │  ┌───────────┐  │   │  - Tool execution                       │  │    │
│  │  │  │   Input   │◄─┼──►│  - Config generation                    │  │    │
│  │  │  │  [Send]   │  │   │  - MIDI Learn coordination              │  │    │
│  │  │  └───────────┘  │   │  - Validation & preview                 │  │    │
│  │  └─────────────────┘   └─────────────────────────────────────────┘  │    │
│  │          │                              │                           │    │
│  │          │ Tauri Events                 │ Tauri Commands            │    │
│  │          ▼                              ▼                           │    │
│  │  ┌──────────────────────────────────────────────────────────────┐   │    │
│  │  │                  LLM Provider Abstraction                    │   │    │
│  │  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────────┐  │   │    │
│  │  │  │  OpenAI  │  │Anthropic │  │  Google  │  │ OpenRouter/  │  │   │    │
│  │  │  │  GPT-4   │  │  Claude  │  │  Gemini  │  │ LiteLLM      │  │   │    │
│  │  │  └──────────┘  └──────────┘  └──────────┘  └──────────────┘  │   │    │
│  │  └──────────────────────────────────────────────────────────────┘   │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                   │                                         │
│                                   │ IPC (Unix Socket)                       │
│                                   ▼                                         │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                          CONDUCTOR DAEMON                           │    │
│  │  ┌──────────────────────────────────────────────────────────────┐   │    │
│  │  │                       Engine Manager                         │   │    │
│  │  │   - Config management        - Event processing              │   │    │
│  │  │   - MIDI Learn mode          - Action execution              │   │    │
│  │  │   - Device management        - Mode switching                │   │    │
│  │  └──────────────────────────────────────────────────────────────┘   │    │
│  │                               │                                     │    │
│  │  ┌────────────────────────────┴─────────────────────────────────┐   │    │
│  │  │                      MCP Server (New)                        │   │    │
│  │  │   - Tool exposure for external LLMs                          │   │    │
│  │  │   - A2A protocol support                                     │   │    │
│  │  │   - skills.md generation                                     │   │    │
│  │  └──────────────────────────────────────────────────────────────┘   │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                   ▲                                         │
│                                   │ MCP/A2A/HTTP                            │
│                                   │                                         │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                        EXTERNAL LLM CLIENTS                         │    │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────────┐   │    │
│  │  │  Claude Code │  │  Cursor/Cody │  │    Custom MCP Clients    │   │    │
│  │  │   (via MCP)  │  │   (via MCP)  │  │      (A2A Protocol)      │   │    │
│  │  └──────────────┘  └──────────────┘  └──────────────────────────┘   │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────────────┘
```
### Component Design

#### 1. LLM Provider Abstraction Layer

**Location**: `conductor-gui/src-tauri/src/llm/`
```rust
/// Unified trait for LLM providers
#[async_trait]
pub trait LLMProvider: Send + Sync {
    /// Send a message and get a response (with tool use support)
    async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, LLMError>;

    /// Stream a response for real-time display
    async fn chat_stream(&self, request: ChatRequest) -> Result<ChatStream, LLMError>;

    /// Check provider health/availability
    async fn health_check(&self) -> Result<ProviderStatus, LLMError>;

    /// Get provider capabilities (tool use, vision, etc.)
    fn capabilities(&self) -> ProviderCapabilities;
}

/// Provider configuration
#[derive(Clone, Serialize, Deserialize)]
pub struct ProviderConfig {
    pub provider_type: ProviderType,
    pub api_key: SecretString,       // Encrypted in config
    pub base_url: Option<String>,    // For OpenRouter/LiteLLM
    pub model: String,
    pub max_tokens: u32,
    pub temperature: f32,
}

#[derive(Clone, Serialize, Deserialize)]
pub enum ProviderType {
    OpenAI,
    Anthropic,
    Google,
    OpenRouter,
    LiteLLM,
    Custom { base_url: String },
}
```
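Endpoint resolution follows from `ProviderType`: routers and `Custom` providers carry their own `base_url`, while first-party providers can fall back to well-known hosts. A minimal sketch under stated assumptions (`resolve_base_url` is a hypothetical helper; the default URLs are illustrative, not normative):

```rust
/// Hypothetical default-endpoint resolution for ProviderConfig.base_url.
/// A local copy of the ProviderType enum keeps this sketch self-contained.
#[derive(Debug)]
enum ProviderType {
    OpenAI,
    Anthropic,
    Google,
    OpenRouter,
    LiteLLM,
    Custom { base_url: String },
}

fn resolve_base_url(provider: &ProviderType, override_url: Option<&str>) -> String {
    // An explicit base_url in the config always wins; this is what lets one
    // codepath serve OpenRouter, LiteLLM, and arbitrary self-hosted proxies.
    if let Some(url) = override_url {
        return url.to_string();
    }
    match provider {
        ProviderType::OpenAI => "https://api.openai.com/v1".into(),
        ProviderType::Anthropic => "https://api.anthropic.com".into(),
        ProviderType::Google => "https://generativelanguage.googleapis.com".into(),
        ProviderType::OpenRouter => "https://openrouter.ai/api/v1".into(),
        // LiteLLM's proxy commonly listens on port 4000 when self-hosted.
        ProviderType::LiteLLM => "http://localhost:4000".into(),
        ProviderType::Custom { base_url } => base_url.clone(),
    }
}
```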
**Supported Providers**:

| Provider | Models | Tool Use | Streaming | Notes |
|---|---|---|---|---|
| OpenAI | GPT-4, GPT-4o | Yes | Yes | Native function calling |
| Anthropic | Claude 3.5, Claude 4 | Yes | Yes | Native tool use |
| Google | Gemini Pro, Ultra | Yes | Yes | Function declarations |
| OpenRouter | 100+ models | Varies | Yes | Unified API, model routing |
| LiteLLM | Local/Proxy | Varies | Yes | Self-hosted, privacy-focused |
**Capability Negotiation** (LLM Council Recommendation):
Runtime detection of provider capabilities enables graceful degradation when features aren't available:
```rust
/// Provider capabilities detected at runtime
#[derive(Clone, Default, Serialize, Deserialize)]
pub struct ProviderCapabilities {
    /// Supports tool/function calling
    pub tool_use: bool,
    /// Supports streaming responses
    pub streaming: bool,
    /// Supports vision/image input
    pub vision: bool,
    /// Maximum context window (tokens)
    pub max_context: u32,
    /// Maximum output tokens
    pub max_output: u32,
    /// Supports structured JSON output
    pub json_mode: bool,
    /// Cost per 1K input tokens (for tracking)
    pub input_cost_per_1k: Option<f64>,
    /// Cost per 1K output tokens
    pub output_cost_per_1k: Option<f64>,
}
```
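The per-1K cost fields feed the cost-tracking UI (P3-06). A dependency-free sketch of the accumulation (`estimate_cost` is a hypothetical helper; a missing rate contributes zero, so tracking stays best-effort):

```rust
/// Estimate the dollar cost of one request from token counts and the
/// Option<f64> per-1K rates on ProviderCapabilities.
fn estimate_cost(
    input_tokens: u32,
    output_tokens: u32,
    input_cost_per_1k: Option<f64>,
    output_cost_per_1k: Option<f64>,
) -> f64 {
    // Unknown rates (e.g. self-hosted LiteLLM) count as zero cost.
    let input = input_cost_per_1k.unwrap_or(0.0) * input_tokens as f64 / 1000.0;
    let output = output_cost_per_1k.unwrap_or(0.0) * output_tokens as f64 / 1000.0;
    input + output
}
```

For example, 2,000 prompt tokens at $0.003/1K plus 500 completion tokens at $0.015/1K totals $0.0135.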
```rust
/// Capability probing, shared by all providers (an inherent impl on the
/// trait object, so individual providers don't repeat it).
impl dyn LLMProvider {
    /// Probe provider to detect actual capabilities.
    /// (test_tool_call / test_streaming / get_model_info are provider
    /// helpers elided from the trait definition above.)
    pub async fn detect_capabilities(&self) -> ProviderCapabilities {
        // Try a test message with tool use
        let tool_use = self.test_tool_call().await.is_ok();
        // Check streaming support
        let streaming = self.test_streaming().await.is_ok();
        // Get model info from provider API if available
        let model_info = self.get_model_info().await.ok();
        ProviderCapabilities {
            tool_use,
            streaming,
            vision: model_info.as_ref().map(|i| i.vision).unwrap_or(false),
            max_context: model_info.as_ref().map(|i| i.context_window).unwrap_or(4096),
            max_output: model_info.as_ref().map(|i| i.max_output).unwrap_or(4096),
            json_mode: model_info.as_ref().map(|i| i.json_mode).unwrap_or(false),
            input_cost_per_1k: model_info.as_ref().and_then(|i| i.pricing.input),
            output_cost_per_1k: model_info.and_then(|i| i.pricing.output),
        }
    }
}
```
```rust
/// Graceful degradation when capabilities are missing
pub struct LLMAgentController {
    provider: Box<dyn LLMProvider>,
    capabilities: ProviderCapabilities,
}

impl LLMAgentController {
    pub async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, LLMError> {
        if !self.capabilities.tool_use {
            // Fall back to instruction-based prompting
            return self.chat_without_tools(request).await;
        }
        if !self.capabilities.streaming && request.stream {
            // Fall back to non-streaming with progress simulation
            return self.chat_with_simulated_streaming(request).await;
        }
        self.provider.chat(request).await
    }
}
```
#### 2. Tool System for LLM Actions

**Location**: `conductor-gui/src-tauri/src/llm/tools.rs`
```rust
/// Tools exposed to the LLM for Conductor control
pub enum ConductorTool {
    // Device Management
    ListDevices,
    ConnectDevice { port: usize },
    DisconnectDevice,

    // MIDI Learning
    StartMidiLearn,
    StopMidiLearn,
    GetCapturedEvents,

    // Configuration
    GetConfig,
    GetModes,
    CreateMode { name: String, color: Option<String> },
    DeleteMode { name: String },

    // Mapping Management
    GetMappings { mode: String },
    CreateMapping { mode: String, trigger: Trigger, action: Action },
    UpdateMapping { mode: String, index: usize, trigger: Trigger, action: Action },
    DeleteMapping { mode: String, index: usize },

    // Profile Management
    ListProfiles,
    CreateProfile { name: String, bundle_id: String },
    UpdateProfile { id: String, mappings: Vec<Mapping> },
    DeleteProfile { id: String },

    // Settings
    GetSettings,
    UpdateSettings { settings: AdvancedSettings },

    // Validation & Preview
    ValidateConfig { config: Config },
    PreviewAction { action: Action },

    // Plugin Management
    ListPlugins,
    EnablePlugin { name: String },
    DisablePlugin { name: String },
}

/// Tool execution result
pub struct ToolResult {
    pub success: bool,
    pub data: Option<serde_json::Value>,
    pub error: Option<String>,
    pub suggestions: Vec<String>, // Follow-up suggestions for LLM
}
```
**Tool Risk Tiers** (LLM Council Recommendation):
Classify tools by impact level for graduated safety controls:
```rust
/// Tool risk classification for graduated safety controls
#[derive(Clone, Copy, Debug, Serialize, Deserialize)]
pub enum ToolRiskTier {
    /// Read-only operations - safe to auto-execute
    /// Examples: GetConfig, ListDevices, GetMappings
    ReadOnly,

    /// Stateful but reversible operations (session-scoped)
    /// Examples: StartMidiLearn, StopMidiLearn, ConnectDevice
    /// Note: Creates temporary server-side state that does NOT persist
    Stateful,

    /// Configuration changes - require Plan/Apply pattern
    /// Examples: CreateMapping, UpdateMapping, DeleteMapping
    ConfigChange,

    /// Hardware I/O operations - CRITICAL RISK (LLM Council Addition)
    /// Examples: SendRawSysEx, FirmwareUpdate, DeviceReset
    /// WARNING: SysEx messages can brick hardware! Requires multi-step confirmation.
    HardwareIO,

    /// Privileged operations - require explicit user confirmation
    /// Examples: DeleteProfile, ResetConfig, ExportSecrets
    Privileged,
}

impl ConductorTool {
    pub fn risk_tier(&self) -> ToolRiskTier {
        match self {
            // ReadOnly - auto-execute
            Self::GetConfig | Self::GetModes | Self::GetMappings { .. }
            | Self::ListDevices | Self::ListProfiles | Self::ListPlugins
            | Self::GetSettings | Self::GetCapturedEvents => ToolRiskTier::ReadOnly,

            // Stateful - auto-execute with logging
            Self::StartMidiLearn | Self::StopMidiLearn
            | Self::ConnectDevice { .. } | Self::DisconnectDevice => ToolRiskTier::Stateful,

            // ConfigChange - require Plan/Apply
            Self::CreateMode { .. } | Self::DeleteMode { .. }
            | Self::CreateMapping { .. } | Self::UpdateMapping { .. }
            | Self::DeleteMapping { .. } | Self::CreateProfile { .. }
            | Self::UpdateProfile { .. } | Self::UpdateSettings { .. } => ToolRiskTier::ConfigChange,

            // Privileged - explicit confirmation
            Self::DeleteProfile { .. } | Self::ValidateConfig { .. }
            | Self::PreviewAction { .. } | Self::EnablePlugin { .. }
            | Self::DisablePlugin { .. } => ToolRiskTier::Privileged,
        }
    }
}
```
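The tier-to-handling mapping the executor applies can be sketched as a pure function (the `Dispatch` enum is an illustrative stand-in for the executor's match arms; per the tiers above, `HardwareIO` demands multi-step confirmation while both `ConfigChange` and `Privileged` route through Plan/Apply):

```rust
/// Local copies of the tier enum keep the sketch self-contained.
#[derive(Clone, Copy, PartialEq, Debug)]
enum ToolRiskTier { ReadOnly, Stateful, ConfigChange, HardwareIO, Privileged }

/// How the executor handles a call at each tier (hypothetical names).
#[derive(PartialEq, Debug)]
enum Dispatch {
    AutoExecute,        // return data immediately
    AutoExecuteLogged,  // execute, then emit a state-change event
    PlanApply,          // return a Plan; user must approve before apply
    MultiStepConfirm,   // Plan/Apply plus additional explicit confirmation
}

fn dispatch_for(tier: ToolRiskTier) -> Dispatch {
    match tier {
        ToolRiskTier::ReadOnly => Dispatch::AutoExecute,
        ToolRiskTier::Stateful => Dispatch::AutoExecuteLogged,
        ToolRiskTier::ConfigChange | ToolRiskTier::Privileged => Dispatch::PlanApply,
        ToolRiskTier::HardwareIO => Dispatch::MultiStepConfirm,
    }
}
```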
**Plan/Apply Pattern** (LLM Council Critical Recommendation):
For ConfigChange and Privileged operations, use a Plan/Apply pattern where the LLM proposes changes, the UI displays a diff, and the user explicitly confirms before application:
```
User: "Create a mapping so pad 36 triggers Cmd+C"
                    │
                    ▼
┌─────────────────────────────────────────┐
│         LLM Agent Controller            │
│  1. Parse user intent                   │
│  2. Generate proposed config change     │
│  3. Return Plan (not applied yet)       │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│            Plan Preview UI              │
│  ┌─────────────────────────────────┐    │
│  │ + [[modes.mappings]]            │    │
│  │ + trigger = { type = "Note",    │    │
│  │ +   note = 36 }                 │    │
│  │ + action = { type = "Keystroke",│    │
│  │ +   keys = ["cmd","c"]          │    │
│  │ + }                             │    │
│  └─────────────────────────────────┘    │
│       [Cancel]  [Apply Changes]         │
└─────────────────────────────────────────┘
                    │
           User clicks [Apply]
                    │
                    ▼
┌─────────────────────────────────────────┐
│             Tool Executor               │
│  1. Apply validated changes             │
│  2. Emit state update events            │
│  3. Return confirmation                 │
└─────────────────────────────────────────┘
```
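The diff shown in the Plan Preview UI is plain prefixed-line rendering over serialized TOML. A minimal sketch of what `ConfigPlan::to_diff` might produce (the `Change` type and `to_diff` free function are illustrative stand-ins, not the real signatures):

```rust
/// Illustrative change representation: each serialized TOML line a plan
/// adds or removes, rendered with "+"/"-" prefixes for the preview UI.
enum Change {
    Add(String),
    Remove(String),
}

fn to_diff(changes: &[Change]) -> String {
    changes
        .iter()
        .map(|c| match c {
            Change::Add(line) => format!("+ {line}"),
            Change::Remove(line) => format!("- {line}"),
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```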
**Plan/Apply Data Structures**:
```rust
/// A proposed configuration change
#[derive(Clone, Serialize, Deserialize)]
pub struct ConfigPlan {
    pub id: Uuid,
    pub description: String,
    pub changes: Vec<ConfigChange>,
    pub created_at: DateTime<Utc>,
    pub expires_at: DateTime<Utc>,   // Auto-expire after 5 minutes
    pub base_state_hash: String,     // CRITICAL: Hash of config at plan time (TOCTOU protection)
    pub touched_files: Vec<PathBuf>, // All files that will be modified
}

#[derive(Clone, Serialize, Deserialize)]
pub enum ConfigChange {
    CreateMapping { mode: String, mapping: Mapping },
    UpdateMapping { mode: String, index: usize, old: Mapping, new: Mapping },
    DeleteMapping { mode: String, index: usize, mapping: Mapping },
    CreateMode { mode: Mode },
    DeleteMode { name: String, mode: Mode },
    UpdateSettings { old: Settings, new: Settings },
    // ... other changes
}

impl ConfigPlan {
    /// Generate human-readable diff for UI display
    pub fn to_diff(&self) -> String { ... }

    /// Validate plan is still applicable (TOCTOU protection)
    pub fn validate(&self, current_config: &Config) -> Result<(), PlanError> {
        let current_hash = current_config.compute_hash();
        if current_hash != self.base_state_hash {
            return Err(PlanError::StaleState {
                message: "Configuration changed since plan was created. Please re-run plan.".into(),
                plan_hash: self.base_state_hash.clone(),
                current_hash,
            });
        }
        // Additional validation...
        Ok(())
    }
}
```
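The TOCTOU guard reduces to comparing digests of serialized state. A dependency-free sketch (std's `DefaultHasher` stands in for whatever stable digest `Config::compute_hash` actually uses; a real implementation would likely prefer a cryptographic hash):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for Config::compute_hash(): digest the serialized config.
fn state_hash(serialized_config: &str) -> String {
    let mut h = DefaultHasher::new();
    serialized_config.hash(&mut h);
    format!("{:016x}", h.finish())
}

/// TOCTOU guard: a plan may only apply against the exact state it was built on.
fn validate_plan(base_state_hash: &str, current_config: &str) -> Result<(), String> {
    let current = state_hash(current_config);
    if current != base_state_hash {
        return Err(format!(
            "Configuration changed since plan was created (plan {base_state_hash}, now {current}). Please re-run plan."
        ));
    }
    Ok(())
}
```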
**Plan Expiration Policy** (LLM Council Recommendation):
```rust
const DEFAULT_PLAN_TTL: Duration = Duration::from_secs(300); // 5 minutes

impl ConfigPlan {
    pub fn is_expired(&self) -> bool {
        Utc::now() > self.expires_at
    }
}
```
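A self-contained sketch of the expiry check and the `CONDUCTOR_PLAN_TTL_SECONDS` override, using `std::time` in place of the `chrono` types above (the `Plan` struct and `ttl_from_env` helper are illustrative):

```rust
use std::time::{Duration, SystemTime};

const DEFAULT_PLAN_TTL: Duration = Duration::from_secs(300); // 5 minutes

struct Plan {
    expires_at: SystemTime,
}

impl Plan {
    fn new(created_at: SystemTime, ttl: Duration) -> Self {
        Plan { expires_at: created_at + ttl }
    }

    /// Applying after this point returns a clear "stale plan" error.
    fn is_expired(&self, now: SystemTime) -> bool {
        now > self.expires_at
    }
}

/// CONDUCTOR_PLAN_TTL_SECONDS overrides the default; malformed values fall back.
fn ttl_from_env() -> Duration {
    std::env::var("CONDUCTOR_PLAN_TTL_SECONDS")
        .ok()
        .and_then(|s| s.parse::<u64>().ok())
        .map(Duration::from_secs)
        .unwrap_or(DEFAULT_PLAN_TTL)
}
```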
- Plans expire after 5 minutes by default (configurable via `CONDUCTOR_PLAN_TTL_SECONDS`)
- Applying an expired plan returns a clear error message
- Rationale: long-lived plans create false confidence in stale state
**Tool Execution Flow**:
```
User Message: "Create a mapping so pad 36 triggers Cmd+C"
                    │
                    ▼
┌─────────────────────────────────────────┐
│         LLM Agent Controller            │
│  1. Parse user intent                   │
│  2. Select tools: StartMidiLearn or     │
│     CreateMapping                       │
│  3. Generate tool call parameters       │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│   Transport-Agnostic Tool Executor      │
│  1. Validate parameters                 │
│  2. Check tool risk tier                │
│  3. For ReadOnly/Stateful: Execute      │
│  4. For ConfigChange: Return Plan       │
│  5. Emit Tauri Events for state sync    │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│              LLM Response               │
│  "I've prepared a mapping for pad 36    │
│   to trigger Cmd+C. Please review the   │
│   changes above and click Apply."       │
└─────────────────────────────────────────┘
```
#### 3. Transport-Agnostic Tool Executor
**Location**: `conductor-gui/src-tauri/src/llm/executor.rs`
A centralized tool executor that works identically whether invoked from the GUI chat or via MCP:
```rust
/// Centralized tool executor - same logic for GUI and MCP
pub struct ToolExecutor {
config_manager: Arc<RwLock<ConfigManager>>,
device_manager: Arc<RwLock<DeviceManager>>,
midi_learn: Arc<RwLock<MidiLearnState>>,
event_emitter: TauriEventEmitter,
pending_plans: Arc<RwLock<HashMap<Uuid, ConfigPlan>>>,
}
impl ToolExecutor {
/// Execute a tool call with risk-tier-appropriate handling
pub async fn execute(&self, tool: ConductorTool) -> ToolResult {
let risk_tier = tool.risk_tier();
// Log all tool executions for audit
self.log_tool_execution(&tool, risk_tier).await;
match risk_tier {
ToolRiskTier::ReadOnly => {
// Execute immediately, return data
self.execute_readonly(tool).await
}
ToolRiskTier::Stateful => {
// Execute, emit state change event
let result = self.execute_stateful(tool).await;
self.emit_state_change(&result).await;
result
}
ToolRiskTier::ConfigChange => {
// Generate plan, require user approval
let plan = self.generate_plan(tool).await?;
self.pending_plans.write().await.insert(plan.id, plan.clone());
self.emit_plan_ready(&plan).await;
ToolResult::plan_pending(plan)
}
ToolRiskTier::Privileged => {
// Generate plan with explicit warning
let plan = self.generate_plan_with_warning(tool).await?;
self.pending_plans.write().await.insert(plan.id, plan.clone());
self.emit_plan_ready(&plan).await;
ToolResult::plan_pending_privileged(plan)
}
}
}
/// Apply a previously generated plan (called after user approval)
pub async fn apply_plan(&self, plan_id: Uuid) -> Result<ToolResult, PlanError> {
let plan = self.pending_plans.write().await.remove(&plan_id)
.ok_or(PlanError::NotFound)?;
// Validate plan is still applicable
let config = self.config_manager.read().await;
plan.validate(&config)?;
drop(config);
// Apply all changes atomically
self.apply_changes(&plan.changes).await?;
// Emit config updated event
self.emit_config_updated().await;
Ok(ToolResult::success_with_message(
format!("Applied {} changes", plan.changes.len())
))
}
/// Reject a pending plan
pub async fn reject_plan(&self, plan_id: Uuid) -> Result<(), PlanError> {
self.pending_plans.write().await.remove(&plan_id);
Ok(())
}
}
```
**Wrapper for MCP**:
```rust
/// MCP handler wraps the same ToolExecutor
pub struct McpToolHandler {
    executor: Arc<ToolExecutor>,
}

impl McpToolHandler {
    pub async fn handle_tool_call(&self, tool_name: &str, params: Value) -> McpResult {
        let tool = ConductorTool::from_mcp(tool_name, params)?;
        let result = self.executor.execute(tool).await;
        result.to_mcp_response()
    }
}
```
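`ConductorTool::from_mcp` is essentially name-based routing over the tool manifest. A stripped-down sketch (parameter decoding via serde is elided; `Routed` and `route` are illustrative stand-ins for the real conversion):

```rust
/// Subset of routing outcomes, enough to show the shape of from_mcp.
#[derive(Debug, PartialEq)]
enum Routed {
    GetStatus,
    ListDevices,
    ConnectDevice { port: usize },
    Unknown,
}

/// Map an MCP tool name (plus its one decoded parameter, here simplified
/// to an Option<usize>) onto an internal tool variant.
fn route(tool_name: &str, port_param: Option<usize>) -> Routed {
    match (tool_name, port_param) {
        ("conductor_get_status", _) => Routed::GetStatus,
        ("conductor_list_devices", _) => Routed::ListDevices,
        ("conductor_connect_device", Some(port)) => Routed::ConnectDevice { port },
        _ => Routed::Unknown, // surfaces as an MCP "unknown tool" error
    }
}
```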
#### 4. State Synchronization via Tauri Events

**Location**: `conductor-gui/src-tauri/src/llm/events.rs`
Use Tauri's event system for real-time state synchronization between the LLM agent, GUI, and any external clients:
```rust
/// Events emitted by the LLM system
/// (Serialize is required by the serde_json::to_value call below)
#[derive(Clone, Serialize)]
pub enum LlmEvent {
    /// Chat message received/sent
    ChatMessage { message: ChatMessage },
    /// Tool execution started
    ToolExecutionStarted { tool: String, params: Value },
    /// Tool execution completed
    ToolExecutionCompleted { tool: String, result: ToolResult },
    /// Config plan ready for review
    PlanReady { plan: ConfigPlan },
    /// Config plan applied
    PlanApplied { plan_id: Uuid, changes_count: usize },
    /// Config plan rejected
    PlanRejected { plan_id: Uuid },
    /// Configuration updated (sync all views)
    ConfigUpdated { config: Config },
    /// MIDI Learn state changed
    MidiLearnStateChanged { state: MidiLearnState },
    /// LLM streaming token
    StreamingToken { token: String },
    /// LLM provider status changed
    ProviderStatusChanged { provider: String, status: ProviderStatus },
}

/// Event emitter for Tauri
pub struct TauriEventEmitter {
    app_handle: AppHandle,
}

impl TauriEventEmitter {
    pub fn emit(&self, event: LlmEvent) {
        let event_name = event.event_name();
        let payload = serde_json::to_value(&event).unwrap();
        self.app_handle.emit_all(&event_name, payload).ok();
    }
}
```
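Event names are derived from variant names into a kebab-case, `llm:`-prefixed form (e.g. `PlanReady` becomes `llm:plan-ready`), which is what the frontend subscribes to. A sketch of that derivation (the real `event_name()` may well be hand-written per variant rather than computed):

```rust
/// Derive the frontend-facing event name from a CamelCase variant name:
/// "PlanReady" -> "llm:plan-ready", "ConfigUpdated" -> "llm:config-updated".
fn event_name(variant: &str) -> String {
    let mut out = String::from("llm:");
    for (i, ch) in variant.chars().enumerate() {
        if ch.is_ascii_uppercase() {
            if i > 0 {
                out.push('-'); // word boundary before every interior capital
            }
            out.push(ch.to_ascii_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}
```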
**Frontend Event Handling**:
```svelte
<!-- ChatView.svelte -->
<script>
  import { listen } from '@tauri-apps/api/event';
  import { onMount, onDestroy } from 'svelte';

  let unlisten = [];

  onMount(async () => {
    // Listen for plan ready events
    unlisten.push(await listen('llm:plan-ready', (event) => {
      showPlanReviewModal(event.payload.plan);
    }));

    // Listen for config updates (sync all views)
    unlisten.push(await listen('llm:config-updated', (event) => {
      configStore.set(event.payload.config);
    }));

    // Listen for streaming tokens
    unlisten.push(await listen('llm:streaming-token', (event) => {
      chatStore.appendToLastMessage(event.payload.token);
    }));
  });

  onDestroy(() => {
    unlisten.forEach(fn => fn());
  });
</script>
```
#### 5. MCP Server for External LLM Access (Priority)

**Location**: `conductor-daemon/src/mcp/`
The daemon exposes an MCP (Model Context Protocol) server enabling external LLMs (Claude Code, Cursor, etc.) to control Conductor.
```rust
/// MCP Server configuration
pub struct McpServerConfig {
    pub enabled: bool,
    pub socket_path: PathBuf,       // Unix socket for local access
    pub http_port: Option<u16>,     // Optional HTTP for remote (with auth)
    pub auth_token: Option<SecretString>,
    pub allowed_tools: Vec<String>, // Tool allowlist
}

/// MCP Tool definitions exposed to external clients
pub fn get_mcp_tools() -> Vec<McpTool> {
    vec![
        McpTool {
            name: "conductor_get_status",
            description: "Get Conductor daemon status including device connection and lifecycle state",
            parameters: json!({}),
        },
        McpTool {
            name: "conductor_list_devices",
            description: "List available MIDI and gamepad devices",
            parameters: json!({}),
        },
        McpTool {
            name: "conductor_connect_device",
            description: "Connect to a specific MIDI device by port index",
            parameters: json!({
                "type": "object",
                "properties": {
                    "port": { "type": "integer", "description": "MIDI port index" }
                },
                "required": ["port"]
            }),
        },
        McpTool {
            name: "conductor_get_config",
            description: "Get the current Conductor configuration including modes and mappings",
            parameters: json!({}),
        },
        McpTool {
            name: "conductor_create_mapping",
            description: "Create a new mapping in the specified mode",
            parameters: json!({
                "type": "object",
                "properties": {
                    "mode": { "type": "string" },
                    "trigger": { "$ref": "#/definitions/Trigger" },
                    "action": { "$ref": "#/definitions/Action" }
                },
                "required": ["mode", "trigger", "action"]
            }),
        },
        // ... additional tools
    ]
}
```
**MCP Protocol Flow**:
```
External LLM (e.g., Claude Code)
         │
         │ MCP Initialize
         ▼
┌─────────────────────────────────────────┐
│         Conductor MCP Server            │
│  1. Authenticate (if configured)        │
│  2. Return tool manifest                │
│  3. Handle tool calls                   │
└─────────────────────────────────────────┘
         │
         │ Tool Results
         ▼
External LLM formulates response to user
```
#### 6. A2A (Agent-to-Agent) Protocol Support (Deferred to Phase 4+)

**LLM Council Recommendation**: Defer A2A implementation to focus on MCP first. MCP has broader adoption and Claude Code support. A2A can be added later when the protocol stabilizes.
For future agent interoperability beyond MCP, support the emerging A2A protocol:
```rust
/// A2A Agent Card (discovery)
pub struct AgentCard {
    pub name: String,        // "Conductor MIDI Mapper"
    pub description: String,
    pub url: String,         // Agent endpoint
    pub capabilities: Vec<Capability>,
    pub authentication: AuthMethod,
    pub skills: Vec<Skill>,
}

/// A2A Task execution
pub struct A2ATask {
    pub id: String,
    pub skill: String,
    pub input: serde_json::Value,
    pub context: Option<TaskContext>,
}

/// Skills exposed via A2A
pub fn get_a2a_skills() -> Vec<Skill> {
    vec![
        Skill {
            name: "configure_midi_mapping",
            description: "Configure a MIDI controller mapping from natural language",
            input_schema: json!({
                "type": "object",
                "properties": {
                    "description": {
                        "type": "string",
                        "description": "Natural language description of desired mapping"
                    }
                }
            }),
        },
        Skill {
            name: "capture_midi_events",
            description: "Start MIDI capture and return detected events",
            input_schema: json!({
                "type": "object",
                "properties": {
                    "duration_secs": { "type": "integer", "default": 30 }
                }
            }),
        },
    ]
}
```
#### 7. Agent Skills (SKILL.md Standard)

**Important Distinction**: Agent Skills and MCP serve complementary roles:

- **Skills** = Knowledge packages ("the brain and playbook") - teach *how* to do things
- **MCP Tools** = Capability interfaces ("the arms and legs") - enable *what* can be done

> "Skills without MCP are well-written instructions. MCP without Skills is raw power with no guidance." — Agent Skills vs MCP comparison

Conductor provides Agent Skills following the agentskills.io specification for cross-platform compatibility with Claude Code, VS Code Copilot, Cursor, Codex CLI, and other MCP-compatible clients.

**Location**: `~/.conductor/skills/` (user-installable) or bundled with Conductor
**Skill Directory Structure**

```
~/.conductor/skills/
├── conductor-midi-mapping/
│   ├── SKILL.md                  # Required: instructions + metadata
│   ├── references/
│   │   ├── TRIGGERS.md           # Detailed trigger type reference
│   │   ├── ACTIONS.md            # Detailed action type reference
│   │   └── EXAMPLES.md           # Common mapping patterns
│   └── scripts/
│       └── validate_mapping.py
├── conductor-midi-learn/
│   ├── SKILL.md
│   └── references/
│       └── PATTERNS.md           # Pattern detection guide
├── conductor-device-setup/
│   ├── SKILL.md
│   └── references/
│       └── DEVICES.md            # Supported devices
└── conductor-profile-manager/
    └── SKILL.md
```
**Primary Skill**: `conductor-midi-mapping/SKILL.md`

**LLM Council Guidance**: Skills should teach judgment and decision-making, not just procedures. Focus on mental models, decision frameworks, and failure modes.
```markdown
---
name: conductor-midi-mapping
description: >
  Create and manage MIDI controller mappings for Conductor. Use when the user
  wants to configure what happens when they press pads, turn knobs, or move
  faders on their MIDI controller. Handles triggers (Note, VelocityRange,
  LongPress, DoubleTap, Chord, Encoder, CC) and actions (Keystroke, Launch,
  Shell, SendMIDI, ModeChange, Sequence).
license: Apache-2.0
compatibility: Requires Conductor daemon running with MCP server enabled
metadata:
  author: amiable
  version: "4.11.0"
  category: midi
allowed-tools: Bash(conductor:*) Read Write
---

# MIDI Mapping Configuration

You help users create MIDI controller mappings for Conductor, translating natural
language descriptions into precise trigger/action configurations.

## Scope & Non-Goals

**This skill covers:**
- Creating, modifying, and deleting MIDI mappings
- Understanding trigger types and when to use each
- Selecting appropriate actions for user intent

**This skill does NOT cover:**
- OS-level MIDI driver configuration (direct user to system preferences)
- DAW-specific scripting (Ableton Live, Logic Pro internal scripting)
- Hardware firmware updates (NEVER attempt this)
- Raw SysEx message construction (requires explicit HardwareIO tier approval)

## Safety Policy

**CRITICAL RULES - Never violate these:**

1. **Never modify configuration without Plan/Apply** - Always generate a plan first
2. **Never assume MIDI note/CC numbers** - Ask or use MIDI Learn to discover
3. **Never send raw SysEx without explicit user confirmation** - Can brick hardware
4. **Never execute shell commands from user input without sanitization**

## Core Mental Model

MIDI mappings are **routing rules** with optional **transforms**. Think of them as:

Source (what hardware sends) → Transform (how to interpret) → Target (what to control)

### Decision Framework

When creating mappings, resolve these questions **in order**:

**1. Source Identification**
- What device? (May require `conductor_list_devices` if user is vague)
- What message type? (CC for continuous controls, Note for triggers/buttons)
- Is this a **relative encoder** or **absolute fader**? (Critical for transform choice)
  - Clue: "knob" often means encoder; "fader/slider" means absolute
  - When unsure: ASK the user or use MIDI Learn

**2. Trigger Selection**
- Simple press → `Note`
- Velocity-sensitive → `VelocityRange` (ask about soft/medium/hard thresholds)
- Hold behavior → `LongPress` (ask about duration if not specified)
- Quick double-press → `DoubleTap`
- Multiple simultaneous → `NoteChord`
- Continuous control → `CC` or `EncoderTurn`

**3. Action Selection**
- Keyboard shortcut → `Keystroke`
- Launch app → `Launch`
- Script execution → `Shell` (validate path exists!)
- Mode switching → `ModeChange`
- MIDI output → `SendMIDI`

## Common Pitfalls

| Pitfall | Why It Happens | How to Avoid |
|---------|----------------|--------------|
| Wrong CC number | Assumed instead of discovered | Use MIDI Learn or ask user |
| Encoder vs fader confusion | "Knob" is ambiguous | Ask: "Does it spin forever or have endpoints?" |
| Value range mismatch | 0-127 vs 0.0-1.0 | Check target application's expected range |
| Mapping conflict | Same trigger, different actions | Check existing mappings first |

## Execution Rules

**To implement changes, you MUST:**

1. Use `conductor_get_config` to read current state
2. Use `conductor_create_mapping` (returns Plan, not immediate change)
3. Present the Plan diff to user
4. Only proceed when user explicitly approves

**DO NOT:**
- Generate Python scripts to edit config files directly
- Output raw JSON and tell user to paste it
- Bypass Plan/Apply for "simple" changes

## Quick Reference Tables

### Trigger Types

| User Says | Trigger Type | Example |
|-----------|--------------|---------|
| "when I press pad 36" | Note | `{ type: "Note", note: 36 }` |
| "when I hit it hard" | VelocityRange | `{ type: "VelocityRange", note: 36, min_velocity: 100, max_velocity: 127 }` |
| "when I hold the button" | LongPress | `{ type: "LongPress", note: 36, duration_ms: 2000 }` |
| "when I double-tap" | DoubleTap | `{ type: "DoubleTap", note: 36, timeout_ms: 300 }` |
| "when I press multiple pads" | NoteChord | `{ type: "NoteChord", notes: [36, 37, 38] }` |
| "when I turn the knob" | EncoderTurn | `{ type: "EncoderTurn", cc: 16, direction: "any" }` |

### Action Types

| User Says | Action Type | Example |
|-----------|-------------|---------|
| "copy" / "Cmd+C" | Keystroke | `{ type: "Keystroke", keys: ["cmd", "c"] }` |
| "open Safari" | Launch | `{ type: "Launch", app: "Safari" }` |
| "run a script" | Shell | `{ type: "Shell", command: "~/scripts/foo.sh" }` |
| "switch to DJ mode" | ModeChange | `{ type: "ModeChange", mode: "DJ" }` |
| "send MIDI note" | SendMIDI | `{ type: "SendMIDI", message_type: "NoteOn", channel: 1, note: 60 }` |

## Error Recovery

**"No MIDI devices found"**
- Check: Is Conductor daemon running? (`conductor status`)
- Check: OS permissions for MIDI access
- Guide user to system MIDI preferences

**"Mapping conflict detected"**
- Present both mappings to user
- Ask which takes priority
- NEVER auto-resolve conflicts

**"Unknown note/CC number"**
- Suggest using MIDI Learn mode
- Guide user: "Press the control you want to map"

See [TRIGGERS.md](https://github.com/amiable-dev/conductor/blob/741b613135a07d04ba7f17c310c076329f7eef36/docs/adrs/references/TRIGGERS.md) for complete trigger documentation.
See [ACTIONS.md](https://github.com/amiable-dev/conductor/blob/741b613135a07d04ba7f17c310c076329f7eef36/docs/adrs/references/ACTIONS.md) for complete action documentation.
```
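The trigger table above implies a matching discipline the engine applies to incoming events: a plain `Note` fires on any velocity, while `VelocityRange` gates on a velocity band. A reduced sketch covering just those two trigger types (a local `Trigger` enum and a hypothetical `matches` helper; the real engine handles all variants):

```rust
/// Two of the trigger types from the quick-reference table.
#[derive(Debug)]
enum Trigger {
    Note { note: u8 },
    VelocityRange { note: u8, min_velocity: u8, max_velocity: u8 },
}

/// Does an incoming NoteOn (note, velocity) fire this trigger?
fn matches(trigger: &Trigger, note: u8, velocity: u8) -> bool {
    match trigger {
        // Plain Note ignores velocity entirely.
        Trigger::Note { note: n } => *n == note,
        // VelocityRange additionally gates on the inclusive velocity band.
        Trigger::VelocityRange { note: n, min_velocity, max_velocity } => {
            *n == note && (*min_velocity..=*max_velocity).contains(&velocity)
        }
    }
}
```

This is why "when I hit it hard" maps to `VelocityRange` with `min_velocity: 100`: the same pad can carry several velocity-layered mappings without conflict, since their bands are disjoint.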
**MIDI Learn Skill**: `conductor-midi-learn/SKILL.md`
---
name: conductor-midi-learn
description: >
Guide users through MIDI Learn mode to capture controller inputs and create
mappings. Use when the user wants to "learn" or "capture" what their controller
does, or when they don't know the MIDI note numbers for their pads/knobs.
Detects patterns like LongPress, DoubleTap, and Chords automatically.
license: Apache-2.0
compatibility: Requires Conductor daemon with MIDI device connected
metadata:
author: amiable
version: "4.11.0"
category: midi
---
# MIDI Learn Mode
Help users discover their controller's MIDI messages and create mappings from
captured events.
## Workflow
1. **Start Capture**: Use `conductor_start_midi_learn` tool
2. **Guide the User**: Ask them to press/turn the controls they want to map
3. **Analyze Events**: Use `conductor_get_captured_events` to see what was captured
4. **Detect Patterns**: Identify if events suggest LongPress, DoubleTap, or Chord triggers
5. **Suggest Mappings**: Propose mappings based on captured events
6. **Apply with Approval**: Use Plan/Apply pattern for user confirmation
## Pattern Detection
When analyzing captured events, look for:
| Pattern | Detection Criteria | Suggested Trigger |
|---------|-------------------|-------------------|
| LongPress | NoteOn duration > 500ms before NoteOff | `LongPress` with detected duration |
| DoubleTap | Same note twice within 400ms | `DoubleTap` with detected interval |
| Chord | Multiple notes within 50ms window | `NoteChord` with detected notes |
| Velocity Layers | Same note at different velocities | Multiple `VelocityRange` mappings |
## Example Session
User: "I want to set up my Launchpad"

Agent: Let me start MIDI Learn mode. Press the pads you want to configure.
[Uses `conductor_start_midi_learn`]

User: [Presses pads]

Agent: [Uses `conductor_get_captured_events`] I detected:
- Pad at note 36 (bottom-left)
- Pad at note 37 (next to it)

What would you like these to do?
See [PATTERNS.md](https://github.com/amiable-dev/conductor/blob/741b613135a07d04ba7f17c310c076329f7eef36/docs/adrs/references/PATTERNS.md) for advanced pattern detection.
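The detection criteria in the skill's pattern table can be sketched as simple window scans over captured events. A hypothetical TypeScript sketch, assuming a simplified `CapturedEvent` shape (the daemon's actual capture API may differ):

```typescript
// Simplified captured-event shape; names are assumptions for illustration.
interface CapturedEvent { kind: "NoteOn" | "NoteOff"; note: number; timeMs: number; velocity: number; }

// Chord: multiple distinct NoteOns inside a 50ms window.
function detectChord(events: CapturedEvent[], windowMs = 50): number[] | null {
  const ons = events.filter(e => e.kind === "NoteOn").sort((a, b) => a.timeMs - b.timeMs);
  for (const anchor of ons) {
    const group = ons.filter(e => e.timeMs >= anchor.timeMs && e.timeMs - anchor.timeMs <= windowMs);
    const notes = [...new Set(group.map(e => e.note))];
    if (notes.length > 1) return notes.sort((a, b) => a - b);
  }
  return null;
}

// DoubleTap: the same note struck twice within timeoutMs.
function detectDoubleTap(events: CapturedEvent[], timeoutMs = 400): number | null {
  const ons = events.filter(e => e.kind === "NoteOn").sort((a, b) => a.timeMs - b.timeMs);
  for (let i = 1; i < ons.length; i++) {
    if (ons[i].note === ons[i - 1].note && ons[i].timeMs - ons[i - 1].timeMs <= timeoutMs) {
      return ons[i].note;
    }
  }
  return null;
}
```

LongPress detection follows the same shape, pairing each NoteOn with its NoteOff and checking the gap against the 500ms threshold.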
### Progressive Disclosure Model

Following the Agent Skills specification, Conductor skills use three-level progressive disclosure:

```
Level 1: Metadata (~100 tokens)
├── name: "conductor-midi-mapping"
├── description: "Create and manage MIDI controller mappings..."
└── Loaded at startup for ALL skills

Level 2: Instructions (~2000-5000 tokens)
├── Full SKILL.md body
├── Workflow steps, quick reference tables
└── Loaded only when skill is activated

Level 3: Resources (as needed)
├── references/TRIGGERS.md - Complete trigger documentation
├── references/ACTIONS.md - Complete action documentation
└── Loaded only when agent needs specific details
```

This approach lets Conductor provide comprehensive documentation without overwhelming the context window. An agent can have access to all Conductor skills (~400 tokens of metadata total) while loading full instructions only when needed.
### Skills + MCP Integration

Skills and MCP work together in Conductor:

```
┌─────────────────────────────────────────────────────────────┐
│ User Request                                                │
│ "Set up pad 36 to copy when I tap it, paste when I hold"    │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│ Agent Skills (Knowledge Layer)                              │
│ conductor-midi-mapping SKILL.md provides:                   │
│ - Understanding that this needs two triggers                │
│   (Note, LongPress)                                         │
│ - Knowledge of Keystroke action format                      │
│ - Workflow: parse → plan → apply                            │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│ MCP Tools (Capability Layer)                                │
│ conductor_create_mapping executes:                          │
│ - Validates trigger/action schemas                          │
│ - Generates ConfigPlan for review                           │
│ - Applies changes after user approval                       │
└─────────────────────────────────────────────────────────────┘
```
### Skill Installation

Users can install Conductor skills via:

```shell
# From Conductor's bundled skills
conductor skills install conductor-midi-mapping

# From GitHub
conductor skills install github:amiable/conductor-skills/midi-mapping

# From local directory
conductor skills install ./my-custom-skill
```

Skills are installed to `~/.conductor/skills/` and automatically discovered by compatible agents.
### 8. GUI Chat Interface

**Location**: `conductor-gui/ui/src/lib/views/ChatView.svelte`

```svelte
<script>
  import { llmStore, chatStore } from '../stores/llm.js';
  import { configStore } from '../stores.js';
  // UI components used below (paths illustrative)
  import MessageBubble from '../components/MessageBubble.svelte';
  import ToolResultCard from '../components/ToolResultCard.svelte';
  import StreamingIndicator from '../components/StreamingIndicator.svelte';

  let inputMessage = '';
  let isStreaming = false;

  async function sendMessage() {
    if (!inputMessage.trim()) return;

    // Add user message
    chatStore.addMessage({ role: 'user', content: inputMessage });
    const userInput = inputMessage;
    inputMessage = '';

    // Stream LLM response
    isStreaming = true;
    try {
      await llmStore.chat(userInput, {
        onToken: (token) => chatStore.appendToLastMessage(token),
        onToolCall: (tool, result) => chatStore.addToolResult(tool, result),
        onComplete: () => (isStreaming = false),
      });
    } catch (error) {
      chatStore.addMessage({ role: 'system', content: `Error: ${error.message}` });
      isStreaming = false;
    }
  }

  function handleKeydown(e) {
    // Enter sends; Shift+Enter inserts a newline
    if (e.key === 'Enter' && !e.shiftKey) {
      e.preventDefault();
      sendMessage();
    }
  }
</script>

<div class="chat-container">
  <div class="messages">
    {#each $chatStore.messages as message}
      <div class="message {message.role}">
        {#if message.role === 'tool'}
          <ToolResultCard tool={message.tool} result={message.result} />
        {:else}
          <MessageBubble {message} />
        {/if}
      </div>
    {/each}
    {#if isStreaming}
      <StreamingIndicator />
    {/if}
  </div>
  <div class="input-area">
    <textarea
      bind:value={inputMessage}
      placeholder="Describe what you want to configure..."
      on:keydown={handleKeydown}
    />
    <button on:click={sendMessage} disabled={isStreaming}>
      Send
    </button>
  </div>
</div>
```
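Behind `llmStore.chat`, the streaming path normalizes provider events into the per-token callbacks above. A minimal, hypothetical sketch of SSE chunk handling (the `data: {"token": ...}` / `data: [DONE]` wire format is an assumption for illustration; real provider formats differ per vendor):

```typescript
// Parse one SSE chunk, invoking onToken for each token line.
// Returns true once the stream terminator is seen.
function parseSseChunk(chunk: string, onToken: (t: string) => void): boolean {
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data: ")) continue; // ignore comments/keepalives
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") return true;
    const parsed = JSON.parse(payload);
    if (typeof parsed.token === "string") onToken(parsed.token);
  }
  return false;
}
```

Graceful degradation (mentioned under Phase 3) would fall back to a single non-streaming completion when a provider reports no streaming capability.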
## Security Considerations
| Concern | Mitigation |
|---|---|
| API Key Storage | Encrypted in OS keychain (keytar), never in plain config |
| Shell Command Injection | LLM cannot execute arbitrary shell commands; only predefined actions |
| MCP Authentication | Optional token-based auth for remote access |
| Tool Allowlisting | Admin can restrict which tools are available to LLM |
| Rate Limiting | Per-provider rate limits to prevent API abuse |
| Config Validation | All LLM-generated configs validated before application |
| Audit Logging | All LLM actions logged with timestamps for review |
## Configuration Schema

```toml
# ~/.conductor/config.toml
[llm]
enabled = true
default_provider = "anthropic"

[llm.providers.anthropic]
api_key_env = "ANTHROPIC_API_KEY"  # Reference env var
model = "claude-3-5-sonnet-20241022"
max_tokens = 4096
temperature = 0.7

[llm.providers.openai]
api_key_env = "OPENAI_API_KEY"
model = "gpt-4o"
max_tokens = 4096

[llm.providers.openrouter]
api_key_env = "OPENROUTER_API_KEY"
base_url = "https://openrouter.ai/api/v1"
model = "anthropic/claude-3.5-sonnet"

[llm.providers.litellm]
base_url = "http://localhost:8000"
model = "gpt-4"

[llm.mcp]
enabled = true
socket_path = "~/.conductor/mcp.sock"
http_enabled = false
# http_port = 8080
# auth_token_env = "CONDUCTOR_MCP_TOKEN"

[llm.a2a]
enabled = true
agent_name = "Conductor MIDI Mapper"
```
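A consequence of the `api_key_env` convention above is that keys are resolved from the environment at runtime and never persisted in the config file. A hypothetical TypeScript sketch of that resolution (the `ProviderConfig` shape and function name are illustrative, not Conductor's actual code):

```typescript
// Subset of a provider entry from config.toml, as illustrated above.
interface ProviderConfig { api_key_env?: string; base_url?: string; model: string; }

// Resolve the API key named by api_key_env, or null when no key is
// required (e.g. a local LiteLLM proxy).
function resolveApiKey(cfg: ProviderConfig, env: Record<string, string | undefined>): string | null {
  if (!cfg.api_key_env) return null;
  const key = env[cfg.api_key_env];
  if (!key) throw new Error(`environment variable ${cfg.api_key_env} is not set`);
  return key;
}
```

Failing loudly on a missing variable surfaces misconfiguration at startup instead of as an opaque 401 from the provider.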
## Consequences

### Positive
- Natural Language Configuration: Users can describe desired behavior in plain English
- Reduced Learning Curve: No need to understand trigger/action schemas upfront
- External Automation: Claude Code, Cursor, and other tools can configure Conductor
- Provider Flexibility: Users choose their preferred LLM (cost, privacy, capability)
- Extensibility: MCP/A2A enable future integrations without code changes
- Interactive Learning: LLM can guide users through MIDI Learn process
### Negative
- API Costs: LLM API calls incur costs (mitigated by local LiteLLM option)
- Latency: LLM responses add latency vs. direct configuration
- Complexity: Additional infrastructure (MCP server, provider abstraction)
- Security Surface: New attack vectors through LLM tool execution
### Neutral
- Optional Feature: Users can disable LLM integration entirely
- Backward Compatible: Existing config editing remains fully functional
## Alternatives Considered

### Alternative 1: Single Provider (Anthropic Only)

**Rejected**: Limits user choice, creates vendor lock-in, no offline option

### Alternative 2: LLM in Daemon (Rust)

**Rejected**: Rust LLM libraries are less mature than the JS/TS ecosystem; the Tauri GUI already has a web runtime

### Alternative 3: External Chat Application

**Rejected**: Poor UX: users would need to context-switch between apps

### Alternative 4: REST API Instead of MCP

**Rejected**: MCP is the emerging standard for LLM tool integration; REST would require custom client code
## Implementation Plan

### Phase 1A: Skills Foundation (v4.11.0) ✅ COMPLETE

- Agent Skills: Bundled SKILL.md files for core workflows
  - `conductor-midi-mapping` - Mapping creation skill (judgment-based)
  - `conductor-midi-learn` - MIDI Learn guidance skill
  - `conductor-device-setup` - Device connection skill
- Skills validation tooling (`conductor skills validate`)
- Cross-platform testing (Claude Code, Cursor, VS Code Copilot)
- Chat UI component (skills-only mode)

### Phase 1B: ReadOnly MCP Tools (v4.11.1) ✅ COMPLETE

**Rationale**: "Skills without Tools are knowledge without hands." LLMs need to read current state for skill instructions to be useful and grounded.

- MCP Server with ReadOnly tools only:
  - `conductor_get_status` - Daemon status
  - `conductor_list_devices` - Available devices
  - `conductor_get_config` - Current configuration
  - `conductor_list_mappings` - Mappings by mode
  - `conductor_get_mapping` - Single mapping details
- LLM provider abstraction layer with capability negotiation
- OpenAI and Anthropic provider implementations
- API key management (keychain storage)
### Phase 2: Mutations & Plan/Apply (v4.12.0) ✅ COMPLETE

- Transport-agnostic ToolExecutor with risk tiers
- Plan/Apply pattern with TOCTOU protection (`base_state_hash`)
- Plan expiration (5-minute TTL)
- ConfigChange tools:
  - `conductor_create_mapping` (returns Plan)
  - `conductor_update_mapping` (returns Plan)
  - `conductor_delete_mapping` (returns Plan)
- Stateful tools:
  - `conductor_start_midi_learn`
  - `conductor_stop_midi_learn`
- Tauri event system for state synchronization
- Chat UI with plan review modal
- Agentic tool loop in chat store
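The Plan/Apply TOCTOU guard reduces to a hash-plus-TTL check: a plan records a hash of the config state it was generated against, and apply refuses if the live state has drifted or the plan has expired. A hypothetical sketch mirroring `base_state_hash` and the 5-minute TTL (shapes and field names are illustrative, not Conductor's actual Rust types):

```typescript
import { createHash } from "node:crypto";

// Illustrative plan shape: hash of the base config plus expiry bookkeeping.
interface ConfigPlan { id: string; baseStateHash: string; createdAtMs: number; ttlMs: number; }

function hashState(configJson: string): string {
  return createHash("sha256").update(configJson).digest("hex");
}

// Apply-time gate: reject expired plans and plans whose base state has drifted.
function canApply(plan: ConfigPlan, currentConfigJson: string, nowMs: number): { ok: boolean; reason?: string } {
  if (nowMs - plan.createdAtMs > plan.ttlMs) return { ok: false, reason: "plan expired" };
  if (hashState(currentConfigJson) !== plan.baseStateHash) {
    return { ok: false, reason: "stale plan: config changed since planning" };
  }
  return { ok: true };
}
```

Because the hash covers the whole serialized config, any concurrent edit (GUI, TOML file, another MCP client) invalidates outstanding plans rather than silently clobbering it.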
### Phase 3: Enhanced Features (v4.13.0-v4.15.0) ✅ COMPLETE

- OpenRouter and LiteLLM support (v4.13.0)
- Streaming responses with graceful degradation (v4.13.0)
- Tool result visualization in chat (v4.13.0)
- Google Gemini support (v4.13.0)
- Batch operations support (v4.13.0)
- Conversation history persistence (SQLite) (v4.15.0 - P3-05)
  - Conversations persist across app restarts
  - History sidebar with conversation list
  - Load, delete, and create new conversations
  - Messages persisted with tool calls
- Cost tracking display (v4.15.0 - P3-06)
  - Total usage cost summary
  - Per-conversation cost display
  - Breakdown by provider and model
  - Token usage statistics
### Phase 4: Advanced MCP & Security (v4.14.0) ✅ COMPLETE

- HardwareIO tier tools (SysEx with multi-step confirmation)
- Audit logging for all tool executions (SQLite `conductor.db`)
- Rate limiting per provider/client (token bucket)
- Undo/redo support for config changes (history stack)
- Atomic rollback on failed applies (transaction wrapper)
- Remote MCP access with authentication (OAuth2) - deferred
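The per-provider token-bucket rate limiting above can be sketched as follows (capacity and refill rate are placeholder numbers, not Conductor's defaults):

```typescript
// Classic token bucket: refill continuously at refillPerSec up to
// capacity; each request consumes one token or is rejected.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(private capacity: number, private refillPerSec: number, nowMs: number) {
    this.tokens = capacity;
    this.lastRefillMs = nowMs;
  }

  tryAcquire(nowMs: number): boolean {
    const elapsedSec = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

One bucket per provider/client pair lets bursts through up to capacity while bounding sustained request rate; a rejected acquire maps to a "rate limited, retry later" tool error rather than a silent drop.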
### Phase 5: Polish & Future (v4.16.0+)
- Multi-turn conversation context optimization
- Suggestion chips for common actions
- A2A protocol support (deferred)
- User-provided skills with sandboxing
- Voice input (optional)
- Offline mode with local models
## TDD Test Plan

### Unit Tests

- `test_provider_trait_implementation` - Each provider implements trait correctly
- `test_tool_parameter_validation` - Invalid tool params rejected
- `test_tool_execution_success` - Tools modify config correctly
- `test_tool_execution_rollback` - Failed tools don't corrupt state
- `test_api_key_encryption` - Keys stored encrypted
- `test_mcp_tool_manifest` - MCP tools match internal tools
- `test_tool_risk_tier_classification` - All tools have correct risk tier
- `test_plan_expiry` - Plans auto-expire after timeout

### Agent Skills Tests

- `test_skill_md_frontmatter_valid` - All SKILL.md files have valid YAML frontmatter
- `test_skill_md_name_matches_directory` - Skill name matches parent directory
- `test_skill_references_exist` - All referenced files in skills exist
- `test_skill_progressive_disclosure` - Metadata < 200 tokens, full < 5000 tokens
- `test_skill_cross_platform_compat` - Skills validate against agentskills.io schema

### Integration Tests

- `test_chat_roundtrip` - User message → LLM → tool → response
- `test_midi_learn_with_llm` - LLM coordinates capture and mapping creation
- `test_mcp_external_client` - External tool calls execute correctly
- `test_provider_failover` - Graceful handling of provider errors
- `test_skill_activation_loads_context` - Activating skill loads SKILL.md body
- `test_skill_mcp_tool_coordination` - Skills guide MCP tool usage correctly

### E2E Tests

- `test_full_mapping_workflow` - "Create a mapping for pad 36 to copy" → working mapping
- `test_multi_turn_conversation` - Context maintained across messages
- `test_skill_guided_midi_learn` - Skill guides user through MIDI Learn workflow
## Requirements Traceability
| Requirement | Component | Test Coverage |
|---|---|---|
| R1: Chat Interface | ChatView.svelte, llm/chat.rs | test_chat_roundtrip |
| R2: LLM Config Management | llm/tools.rs, ConductorTool enum | test_tool_execution_* |
| R3: Multi-Provider Support | llm/providers/*.rs | test_provider_trait_* |
| R4: MCP/Skills | mcp/server.rs, skills/*.md | test_mcp_*, test_skill_* |
| R5: Security Boundaries | tool allowlisting, Plan/Apply | test_tool_parameter_validation, test_plan_* |
| R6: Cross-Platform Skills | ~/.conductor/skills/ | test_skill_cross_platform_compat |
## Open Questions

- **Conversation Persistence**: Should chat history persist across sessions? If so, where?
  - RESOLVED (v4.15.0): Yes, a SQLite database stores conversations with messages and tool calls
- **Multi-User**: If MCP is exposed over HTTP, how to handle multiple concurrent users?
- **Model Selection UI**: Should users be able to switch models mid-conversation?
- **Cost Tracking**: Should we display estimated API costs to users?
  - RESOLVED (v4.15.0): Yes, CostSummaryPanel displays total, per-conversation, and breakdown by provider/model
## References

### Agent Skills
- Agent Skills Specification - Official SKILL.md format specification
- Anthropic Skills Repository - Official example skills
- Skills vs MCP Comparison - When to use which
- Claude Agent Skills Deep Dive - Technical analysis
- VS Code Agent Skills Guide - Cross-platform usage
### MCP (Model Context Protocol)
- MCP Specification - Official protocol specification
- MCP Best Practices - Architecture & implementation guide
- MCP Security Considerations - Security risks and controls
### Other
- A2A Protocol Draft - Agent-to-Agent protocol (deferred)
- OpenRouter API - Multi-model router
- LiteLLM Proxy - Self-hosted LLM proxy
- Tauri Security - GUI security model
## LLM Council Review

### Review 1: Initial Architecture (2026-01-31)

**Verdict**: APPROVED with Critical Architectural Modifications
**Consensus**: High (CSS: 0.92)

The LLM Council approved ADR-007 as a comprehensive and well-structured architecture for LLM integration. Critical modifications applied:
- Plan/Apply Pattern - Config changes require user approval via diff preview
- Tool Risk Tiers - Four-tier classification (ReadOnly, Stateful, ConfigChange, Privileged)
- Transport-Agnostic Tool Executor - Unified logic for GUI and MCP paths
- State Synchronization - Tauri Events for real-time propagation
- Capability Negotiation - Runtime detection with graceful degradation
- A2A Deferral - Focus on MCP first
### Review 2: Agent Skills Integration (2026-02-01)

**Verdict**: APPROVED with Critical Modifications
**Consensus**: High

After adding Agent Skills (SKILL.md per the agentskills.io spec), the council provided additional critical feedback:

#### Critical Modifications Applied
1. **TOCTOU Vulnerability Fix** (Critical)
   - Issue: Plans could become stale if config changes between plan and apply
   - Resolution: Added `base_state_hash` to ConfigPlan; apply verifies the hash matches current state
2. **Hardware I/O Risk Tier** (Critical)
   - Issue: SysEx messages can brick MIDI hardware but weren't specially protected
   - Resolution: Added `HardwareIO` tier with multi-step confirmation for dangerous operations
3. **SKILL.md Content Style** (Critical)
   - Issue: Skills were procedural ("Step 1, 2, 3") instead of judgment-based
   - Resolution: Rewrote skills to focus on mental models, decision frameworks, and failure modes
4. **Skills Must Include Safety Policy** (Important)
   - Issue: No explicit rules about what skills cannot do
   - Resolution: Added Scope/Non-Goals and Safety Policy sections to all skills
5. **Implementation Phasing Correction** (Important)
   - Issue: Skills in Phase 1 without Tools means "knowledge without hands"
   - Resolution: Added Phase 1B with ReadOnly MCP tools so the LLM can see current state
6. **Skills-to-Tools Execution Guidance** (Important)
   - Issue: LLM might try to edit files directly instead of using MCP tools
   - Resolution: Added explicit "Execution Rules" section: MUST use MCP tools, DO NOT generate scripts
7. **Treat Skills as Untrusted** (Important)
   - Issue: Malicious SKILL.md could try to bypass safety controls
   - Resolution: Runtime (MCP layer) enforces all boundaries; skills are guidance only
### Council Quality Metrics (Review 2)
| Metric | Score | Notes |
|---|---|---|
| Consensus Strength (CSS) | 0.88 | Strong agreement on security fixes |
| Deliberation Depth (DDI) | 0.91 | Thorough analysis of Skills+MCP interaction |
| Synthesis Attribution (SAS) | 0.93 | All modifications traceable |
### Model Rankings
| Model | Score | Key Contribution |
|---|---|---|
| anthropic/claude-opus-4.5 | 0.833 | SKILL.md rewrite guidance, trust boundaries |
| openai/gpt-5.2 | 0.333 | TOCTOU vulnerability identification |
| x-ai/grok-4.1-fast | 0.333 | Skill-MCP contract validation |
| google/gemini-3-pro-preview | 0.333 | Hardware I/O risk tier |
### Dissenting Views (Review 2)
Minor dissent on skill content style: One council member suggested keeping some procedural content for "quick reference" alongside judgment-based guidance. The compromise was to include Quick Reference Tables at the end of skills after the decision frameworks.
### Verification Requirements (Combined)
- Security Review: Plan/Apply with TOCTOU protection must be penetration tested
- UX Testing: Plan review modal clarity with non-technical users
- Integration Tests: ToolExecutor 100% coverage including HardwareIO tier
- Skills Validation: All SKILL.md files must pass agentskills.io schema validation
- Platform Test Matrix: Validate skills across Claude Code, Cursor, VS Code Copilot, Codex CLI
- Token Budget Testing: Verify skills stay under recommended token limits