ADR-023: Knowledge Layer Deployment Boundary
Status
Draft (Revised after LLM Council review — GPT-5.4, Claude Opus 4.6, Gemini 3.1 Pro)
Relationship to other ADRs:
- ADR-018 (Knowledge Layer & Retrieval Architecture): Defines the three-layer knowledge system (L1/L2/L3), embedding model, retrieval pipeline, and provider-aware tuning. This ADR constrains where that infrastructure runs.
- ADR-007 (LLM Integration Architecture): Defines the provider abstraction, MCP tools, and chat UI — all currently in conductor-gui. This ADR preserves that boundary.
- ADR-022 (Device Discovery, Bindings & Multi-Protocol Routing): Phase 6 depends on L2 retrieval for device profile suggestions. The deployment boundary determines which process serves those suggestions.
Scope: This ADR covers the initial desktop deployment of the knowledge layer. Mobile (Tauri iOS/Android), web, and server-side deployments are explicitly out of scope but the architecture preserves escape hatches for each (see D3, Future Considerations).
Implementation note: The knowledge code was initially placed in conductor-daemon/src/daemon/knowledge/ during ADR-018 Phase 2 implementation (#648–#651). This ADR establishes the correct boundary and mandates migration to a dedicated conductor-knowledge crate.
Context
Problem Statement
ADR-018 specifies a local embedding model (all-MiniLM-L6-v2 via ONNX Runtime, 80MB) loaded via the ort Rust crate, a SQLite vector index (2-5MB), and a retrieval pipeline that runs before every LLM API call. The ADR describes the knowledge architecture in detail but is silent on a critical deployment question: which binary hosts this infrastructure?
The workspace has four crates:
| Crate | Purpose | Binary Size (release) | Resident Memory | Dependency Profile |
|---|---|---|---|---|
| conductor-core | Pure engine library | N/A (library) | N/A | Zero I/O deps, no runtime |
| conductor-daemon | Event pipeline, MCP server, IPC | ~5MB | 10-15MB | midir, gilrs, tokio, rusqlite, enigo |
| conductor-gui | Tauri visual interface + LLM chat | ~20MB (Tauri baseline) | 80-120MB | tauri, reqwest, rusqlite, webview |
| conductor (root) | Compatibility re-export | N/A | N/A | Just re-exports core |
The ort crate (ONNX Runtime bindings) pulls in a ~50MB native C++ shared library (libonnxruntime). Combined with the model weights (80MB for MiniLM, 33MB for bge-small), this adds ~130MB disk and ~150-200MB resident memory (model weights + ONNX Runtime session buffers + thread pool) to whichever binary hosts it.
Why This Matters
1. The daemon is a headless open-source tool. Users run conductor-daemon without the GUI for headless setups (Raspberry Pi, studio rack servers, CI/CD-driven config), via SSH, or as a system service. Today it's a 5MB binary with 10-15MB resident memory. Adding 130MB of ONNX infrastructure to the daemon means every user pays for LLM knowledge features whether they use them or not. This violates the project's lightweight-daemon design principle.
2. The knowledge layer is an LLM feature. L2/L3 retrieval only runs when an LLM provider is configured and the user is in a chat session. The daemon's MCP server exposes tools that the LLM calls, but the MCP server itself doesn't need embeddings — it serves structured data (device status, port lists, config diffs). The retrieval pipeline sits between the user's message and the LLM API call, not between the MCP tool and the daemon.
3. Cross-compilation. ort bundles platform-specific C++ binaries. The daemon currently cross-compiles cleanly for x86_64 and aarch64 (macOS universal, Linux ARM). Adding ort complicates this: ONNX Runtime pre-built binaries are available for major platforms but not all targets, and building from source requires a C++ toolchain with CMake. The GUI already accepts a heavier build chain (Tauri requires Node.js + system webview SDK), so the marginal cost is lower there.
4. Target user context. Conductor's primary users are musicians and audio engineers running DAWs (Ableton, Logic, Bitwig). DAW workstations are typically memory-constrained — the DAW + plugins consume 4-12GB on an 8-16GB machine. The knowledge layer's ~200MB RSS footprint is meaningful in this context and should be isolated from the always-running daemon.
Current Architecture (LLM Chat Data Flow)
User types in Chat UI (conductor-gui)
│
▼
┌──────────────────────────────┐
│ GUI: Chat Provider │
│ - Formats system prompt │
│ - Injects L1 core ref │
│ - Injects T1/T2/T3 signals │
│ - Calls LLM API (SSE) │
│ - Processes tool calls │
└──────────┬───────────────────┘
│ (tool calls)
▼
┌──────────────────────────────┐
│ GUI: ToolExecutor │
│ - Dispatches to daemon IPC │
│ - Or executes locally │
└──────────┬───────────────────┘
│ (IPC: JSON over Unix socket)
▼
┌──────────────────────────────┐
│ Daemon: MCP Server │
│ - Executes ReadOnly tools │
│ - Queues ConfigChange plans │
│ - Returns structured JSON │
└──────────────────────────────┘
Note: the LLM API call happens in the GUI process. The daemon never talks to LLM providers. L1 injection (system prompt) already happens in the GUI. The question is where L2/L3 retrieval goes.
Decision
D1: Knowledge Retrieval Runs in the Application Layer, Not the Daemon
For desktop builds, retrieval is hosted in the application layer, initially in-process in conductor-gui, behind a stable KnowledgeService trait boundary. conductor-daemon remains knowledge-unaware.
Rationale:
- The retrieval pipeline fires before the LLM API call. The GUI already owns this call path (system prompt assembly → L1 injection → API request → SSE streaming → tool dispatch). L2 retrieval inserts at one point in this existing pipeline.
- The daemon stays at 5MB and cross-compiles without C++ toolchain requirements.
- Users who run daemon-only get zero knowledge layer overhead.
- The KnowledgeService trait boundary (D3) preserves the option to move retrieval to a sidecar process or remote service without changing consumers.
User types in Chat UI (conductor-gui)
│
▼
┌──────────────────────────────────────────┐
│ GUI: Chat Provider │
│ - Formats system prompt │
│ - Injects L1 core ref │
│ - Injects T1/T2/T3 signals │
│ ┌────────────────────────────────────┐ │
│ │ KnowledgeService::retrieve_chunks()│ │ ← trait call
│ │ (initially: InProcessKnowledge) │ │
│ │ - Query formation │ │
│ │ - Domain filter │ │
│ │ - ONNX embed (ort) │ │
│ │ - Cosine similarity │ │
│ │ - Inject into context │ │
│ └────────────────────────────────────┘ │
│ - Calls LLM API (SSE) │
│ - Processes tool calls │
└──────────┬───────────────────────────────┘
│ (tool calls via IPC)
▼
┌──────────────────────────────────────────┐
│ Daemon: MCP Server │
│ (unchanged — no ort, no knowledge DB) │
└──────────────────────────────────────────┘
D2: Knowledge Infrastructure Behind a Cargo Feature Flag
Two feature flags control the knowledge layer at different levels:
- conductor-gui's knowledge feature (future): gates the optional conductor-knowledge crate dependency. When disabled, the GUI builds without any knowledge code.
- conductor-knowledge's onnx feature: gates the ort dependency for real ONNX embeddings. When disabled (current default), stub embeddings are used.
Current and planned Cargo configuration:
# conductor-knowledge/Cargo.toml
[dependencies]
serde = { version = "1", features = ["derive"] }
# ort = { version = "2.0.0-rc.12", optional = true } # Added when ONNX is wired
[features]
default = []
onnx = [] # Placeholder — change to onnx = ["dep:ort"] when ort is added
# conductor-gui/src-tauri/Cargo.toml (FUTURE — not yet added)
[dependencies]
conductor-knowledge = { path = "../../conductor-knowledge", optional = true }
[features]
default = ["custom-protocol", "plugin-registry", "knowledge"]
knowledge = ["dep:conductor-knowledge"]
Rationale:
- Developers building the GUI without LLM features can disable knowledge and avoid the ONNX build dependency.
- CI can build with --no-default-features for faster test cycles where knowledge isn't under test.
- The knowledge feature is in the default feature set because the GUI is the LLM-enabled product.
- The GUI frontend must detect the feature at runtime via a Tauri command (is_knowledge_available()) to conditionally show/hide knowledge UI elements.
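A minimal sketch of that runtime probe, assuming the command simply reflects the compile-time feature (the #[tauri::command] attribute is omitted so the sketch stands alone without the Tauri crate):

```rust
// Sketch of the runtime capability probe named in this ADR. In the
// real app this function would carry #[tauri::command] and be invoked
// from the frontend to decide whether to render knowledge UI.
pub fn is_knowledge_available() -> bool {
    // True only when the crate was compiled with the `knowledge` feature.
    cfg!(feature = "knowledge")
}
```

Because the check is cfg!-based, the answer is fixed at compile time; a richer implementation could additionally report whether the model file is present and loaded.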
D3: Dedicated conductor-knowledge Workspace Crate
The knowledge layer lives in its own workspace crate, not inside conductor-gui or conductor-daemon:
conductor-knowledge/ ← New workspace crate
├── Cargo.toml
└── src/
├── lib.rs # KnowledgeService trait + InProcessKnowledge impl
├── index.rs # In-memory vector index, cosine similarity, stub_embed()
├── retrieval.rs # Pipeline: query → filter → embed → search → budget
├── chunker.rs # Section-aware document chunker for index builds
├── provider_tuning.rs # Per-provider retrieval parameter adjustment
├── device_profiles.rs # Built-in L2 device profiles
└── community.rs # L3 community profile stub
# Future additions (not yet implemented):
# src/embedder.rs — ONNX model loading (when ort is wired)
# src/bin/conductor_knowledge.rs — CLI: build-index, validate, inspect
The KnowledgeService trait:
/// Trait boundary that enables swapping in-process, sidecar, or remote implementations.
pub trait KnowledgeService: Send + Sync {
/// Retrieve relevant knowledge chunks for a user query.
fn retrieve_chunks(
&self,
query: &str,
domain: Option<&str>,
max_tokens: usize,
) -> RetrievedContext;
/// Check if the knowledge service is available and healthy.
fn is_available(&self) -> bool;
}
/// In-process implementation using in-memory KnowledgeIndex.
/// Default for desktop. Uses stub embeddings until ONNX is wired.
pub struct InProcessKnowledge { index: KnowledgeIndex }
/// Future: sidecar implementation using IPC to a separate process.
// pub struct SidecarKnowledge { /* ... */ }
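The retrieval core behind InProcessKnowledge can be sketched as a brute-force cosine search with a token budget. All names below (stub_embed, Chunk, search) are illustrative; the real crate replaces the stub embedding with ONNX MiniLM behind the onnx feature:

```rust
// Illustrative chunk record: text, its embedding, and a token estimate.
struct Chunk {
    text: String,
    embedding: Vec<f32>,
    tokens: usize,
}

// Deterministic toy embedding: byte-bucket counts, L2-normalized.
// Stands in for the ONNX model so the sketch runs without `ort`.
fn stub_embed(text: &str) -> Vec<f32> {
    let mut v = vec![0f32; 16];
    for b in text.bytes() {
        v[(b % 16) as usize] += 1.0;
    }
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt().max(1e-6);
    v.into_iter().map(|x| x / norm).collect()
}

// Vectors are pre-normalized, so the dot product equals cosine similarity.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

// query → embed → score every chunk → sort by similarity → fill the
// token budget greedily (mirrors retrieve_chunks's max_tokens).
fn search(chunks: &[Chunk], query: &str, max_tokens: usize) -> Vec<String> {
    let q = stub_embed(query);
    let mut scored: Vec<(f32, &Chunk)> = chunks
        .iter()
        .map(|c| (cosine(&q, &c.embedding), c))
        .collect();
    scored.sort_by(|a, b| b.0.total_cmp(&a.0));
    let mut budget = max_tokens;
    let mut out = Vec::new();
    for (_, c) in scored {
        if c.tokens > budget {
            break;
        }
        budget -= c.tokens;
        out.push(c.text.clone());
    }
    out
}
```

Brute-force scoring is adequate here: per ADR-018 the index is 2-5MB, small enough that a linear scan per query is cheap.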
Rationale:
- The CLI tool (conductor-knowledge build) requires ort + rusqlite but has no dependency on Tauri, webview, or any GUI infrastructure. Placing it in conductor-gui would force the full Tauri build toolchain for a headless index builder.
- The KnowledgeService trait decouples consumers (GUI) from the hosting model. The GUI calls retrieve_chunks() without knowing whether it runs in-process, in a sidecar, or via a remote API. This is the escape hatch for mobile, web, and server deployments.
- Both the GUI (runtime retrieval) and the CLI (index building) depend on the same library code. A shared crate eliminates duplication.
Integration point in conductor-gui:
// In conductor-gui/src-tauri/src/chat/system_prompt.rs (future integration)
pub fn assemble_system_prompt(
knowledge: Option<&dyn KnowledgeService>,
config: &AppConfig,
signals: &SignalContext,
user_message: &str,
) -> String {
let mut prompt = String::new();
// L1: Static core reference (existing)
prompt.push_str(&load_l1_reference());
// L2: Retrieved knowledge chunks
if let Some(svc) = knowledge {
if svc.is_available() {
let ctx = svc.retrieve_chunks(user_message, None, 1500);
let formatted = conductor_knowledge::format_context(&ctx);
if !formatted.is_empty() {
prompt.push_str("\n\n");
prompt.push_str(&formatted);
}
}
}
// T1/T2/T3: Runtime signals (existing)
prompt.push_str(&format_signal_context(signals));
prompt
}
D4: Daemon MCP Tools Remain Knowledge-Unaware
The daemon's MCP tools return structured data only. They do not inject knowledge context, suggest matchers based on device profiles, or perform any retrieval. The LLM receives the tool result as raw JSON and relies on its system prompt context (L1 + L2 injected by the GUI) to interpret and present the result intelligently.
Rationale: The daemon is a data service. The GUI is the intelligence layer. The LLM bridges them.
Exception: conductor_suggest_binding (ADR-022 Phase 5). The daemon tool returns the raw fingerprint classification ("CC-only traffic, channel 1, CC range 64-67, likely foot controller"). The LLM interprets this using the L2 device-profile context the GUI already injected into the system prompt; no separate GUI-side enrichment step runs, and the tool stays knowledge-unaware.
Sequence diagram — ADR-022 Phase 6 device profile suggestion:
User: "Set up my new controller"
│
▼
┌─────────────────────────────────────────────┐
│ GUI: assemble_system_prompt() │
│ 1. Inject L1 core reference │
│ 2. KnowledgeService::retrieve("controller")│
│ → returns device profile chunks from L2 │
│ 3. Inject T1/T2/T3 signals │
│ 4. Call LLM API with enriched context │
└──────────────┬──────────────────────────────┘
│ LLM calls conductor_suggest_binding
▼
┌─────────────────────────────────────────────┐
│ Daemon: conductor_suggest_binding │
│ Returns: { category: "foot_controller", │
│ cc_range: [64,67], confidence: 0.7 } │
└──────────────┬──────────────────────────────┘
│ raw fingerprint returned
▼
┌─────────────────────────────────────────────┐
│ GUI: Tool result received │
│ LLM already has L2 device profiles in │
│ system prompt context — it can match │
│ "foot_controller" to known devices. │
│ No GUI-side enrichment step needed. │
└─────────────────────────────────────────────┘
D5: L1 Stays in the Daemon (Unchanged)
L1 (Core Reference) is a static Markdown file (docs/llm-reference.md) shipped with the distribution. The GUI fetches L1 from the filesystem at startup and caches it for system prompt injection. The daemon also reads it for MCP tool descriptions and help text. L1 has no ONNX dependency — it's a text file. No change from ADR-018.
D6: L3 Online Retrieval Is GUI-Only
L3 (community knowledge, optional online retrieval) runs exclusively in the GUI process. It requires network access (already available via reqwest), user opt-in (settings UI is in the GUI), and privacy controls. The daemon has no involvement in L3. See Security Considerations for L3 privacy requirements.
D7: Model File Distribution
The ONNX model file (all-MiniLM-L6-v2.onnx, ~80MB) is downloaded on first launch with a Tauri progress UI:
- First launch: GUI detects missing model, shows download progress dialog with cancel button. Model fetched from a project-hosted URL over HTTPS. SHA-256 checksum verified before use (see Security Considerations).
- GUI installer (optional bundling): Distributors may bundle the model in the app resources directory for offline-first installs. This increases installer size from ~20MB to ~100MB.
- Cargo build from source: Model is not downloaded during build. Fetched on first GUI launch. A CONDUCTOR_ONNX_MODEL_PATH env var allows overriding with a local file.
- Daemon-only install: No model needed. The daemon doesn't depend on ort and doesn't load the model.
- Fallback: If the model cannot be downloaded (firewall, offline, timeout), the GUI continues with L1 only. A status indicator shows "Knowledge: L1 only (model not available)".
The pre-built knowledge index (knowledge.db) is bundled with the GUI installer. Users can rebuild it via conductor-knowledge build if they add custom source documents.
Model size note: all-MiniLM-L6-v2 quantized to INT8 is ~23MB. gte-small is 33MB. If download size is a concern, a smaller or quantized model can be substituted — the KnowledgeService trait is model-agnostic.
D8: Model Loading Strategy
The ONNX model is loaded lazily on first knowledge-enabled request, not eagerly at GUI startup:
- GUI starts without loading the model. Startup time is unaffected.
- On the first chat message (if knowledge is enabled), the model loads asynchronously on a dedicated thread pool, separate from the Tauri async runtime. Typical load time: 1-3 seconds.
- During loading, retrieval returns empty results (L1-only mode). A brief "Loading knowledge model..." status is shown.
- Once loaded, the model session is cached for the lifetime of the GUI process.
- If loading fails (corrupt model, OOM, unsupported platform), the GUI logs the error and continues with L1-only permanently for that session.
Threading: ONNX inference runs on a dedicated ort thread pool (SessionBuilder::with_inter_threads(2)), isolated from the Tauri async runtime and webview renderer. This prevents inference from blocking UI interactions.
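The lazy-load strategy can be sketched with std primitives; EmbedSession and the load body are stand-ins for the real ort session construction:

```rust
use std::sync::{Mutex, OnceLock};
use std::thread;

// Stand-in for an ONNX session; the real type would wrap an `ort`
// Session built with its own inference thread pool.
pub struct EmbedSession;

static SESSION: OnceLock<EmbedSession> = OnceLock::new();
static LOAD_STARTED: Mutex<bool> = Mutex::new(false);

// Returns the cached session if the model is loaded. Otherwise kicks
// off the load exactly once on a background thread and returns None,
// so the caller proceeds L1-only for this request (per D8).
pub fn session() -> Option<&'static EmbedSession> {
    if let Some(s) = SESSION.get() {
        return Some(s);
    }
    let mut started = LOAD_STARTED.lock().unwrap();
    if !*started {
        *started = true;
        thread::spawn(|| {
            // Real impl: read the model file, verify its checksum, build
            // the session (typically 1-3 seconds). On failure, SESSION
            // simply stays unset and the GUI remains in L1-only mode.
            let _ = SESSION.set(EmbedSession);
        });
    }
    None
}
```

OnceLock guarantees the session is published at most once; the separate LOAD_STARTED flag ensures only one background load is ever spawned, even if several chat requests race in before loading completes.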
D9: Index Lifecycle Management
- Index schema version: Stored in the SQLite database metadata. The conductor-knowledge crate will define a SCHEMA_VERSION constant to track and enforce schema compatibility.
- Compatibility check: On startup, if the index schema version doesn't match the library version, the GUI shows "Knowledge index outdated — rebuild required" and falls back to L1-only until rebuilt.
- Embedding model identity: The index stores the model name and a hash of the model file used to generate embeddings. If the model changes, the index is incompatible (embeddings from different models are not comparable).
- Concurrent access: The index uses SQLite WAL mode. The CLI (conductor-knowledge build) acquires a write lock during index builds. The GUI holds a read connection. If a rebuild is in progress while the GUI is running, the GUI continues reading the old index until the rebuild completes.
- Rebuild triggers: Manual only (conductor-knowledge build). No automatic file watching or background rebuilding. Future: the GUI settings panel could offer a "Rebuild Index" button.
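The compatibility gate can be sketched as a pure function over the stored metadata. Field names are illustrative; SCHEMA_VERSION is the constant this ADR mandates:

```rust
// Schema version compiled into the conductor-knowledge library.
const SCHEMA_VERSION: u32 = 1;

// Illustrative shape of the metadata row stored in the SQLite index.
pub struct IndexMeta {
    pub schema_version: u32,
    pub model_name: String,
    pub model_sha256: String,
}

#[derive(Debug, PartialEq)]
pub enum IndexStatus {
    Ready,
    RebuildRequired(&'static str),
}

// Startup check: any mismatch drops the GUI to L1-only until the user
// rebuilds the index with the CLI.
pub fn check_index(meta: &IndexMeta, model_name: &str, model_sha256: &str) -> IndexStatus {
    if meta.schema_version != SCHEMA_VERSION {
        IndexStatus::RebuildRequired("schema version mismatch")
    } else if meta.model_name != model_name || meta.model_sha256 != model_sha256 {
        // Embeddings from different models are not comparable.
        IndexStatus::RebuildRequired("embedding model changed")
    } else {
        IndexStatus::Ready
    }
}
```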
Specification: Migration from Current State
The knowledge code was migrated from conductor-daemon/src/daemon/knowledge/ in PR #768.
Completed (ADR-023 initial implementation):
1. Create conductor-knowledge/ workspace crate — Done
2. Move all 6 modules from daemon — Done (with provider_profile.rs → provider_tuning.rs rename)
3. Add KnowledgeService trait and InProcessKnowledge implementation — Done
4. Remove all knowledge imports from conductor-daemon — Done
5. Update workspace Cargo.toml — Done
Remaining (future PRs):
6. Add conductor-knowledge as optional dependency of conductor-gui (behind knowledge feature)
7. Wire KnowledgeService::retrieve_chunks() into GUI's system prompt assembly
8. Create CLI binary at conductor-knowledge/src/bin/conductor_knowledge.rs
Dependency Changes
# conductor-knowledge/Cargo.toml (NEW — matches actual crate)
[package]
name = "conductor-knowledge"
version.workspace = true
edition.workspace = true
[dependencies]
serde = { version = "1", features = ["derive"] }
# ort = { version = "2.0.0-rc.12", optional = true } # Added when ONNX is wired
[features]
default = []
onnx = [] # Placeholder — change to onnx = ["dep:ort"] when ort is added
# conductor-gui/src-tauri/Cargo.toml (MODIFIED)
[dependencies]
conductor-knowledge = { path = "../../conductor-knowledge", optional = true }
[features]
default = ["custom-protocol", "plugin-registry", "knowledge"]
knowledge = ["dep:conductor-knowledge"]
# conductor-daemon/Cargo.toml (NO CHANGES)
# No ort dependency. No knowledge module. No conductor-knowledge dependency.
Specification: Impact on ADR-018
ADR-018 is amended as follows. These changes are additive — no existing decisions are reversed.
Amendment to D3.3 (Embedding Model)
Before: "The Rust backend loads it via ort at startup."
After: "The conductor-knowledge crate loads the model lazily on first retrieval request via ort, gated behind the onnx Cargo feature. The conductor-daemon process has no ort dependency and does not load the model."
Amendment to D3.4 (Index Storage)
Before: "Database location: $CONDUCTOR_DATA_DIR/knowledge/knowledge.db alongside knowledge.onnx"
After: Unchanged path. "The index is read by conductor-gui (via conductor-knowledge library) at runtime and by the conductor-knowledge CLI during index builds. The daemon does not access this file. The index uses SQLite WAL mode for concurrent read/write safety."
Amendment to D3.5 (Retrieval Pipeline)
Before: Pipeline described without specifying which process hosts it.
After: "The retrieval pipeline runs in the conductor-gui process via the conductor-knowledge library crate, integrated into the system prompt assembly path (see ADR-023 D3). It fires before each LLM API call. The daemon is not involved in retrieval."
Amendment to Phase 2A (GitHub #648)
Before: "Knowledge index infrastructure — ONNX embedding, SQLite storage, cosine search"
After: Infrastructure is implemented in the conductor-knowledge workspace crate, not in conductor-daemon. The ort dependency lives in conductor-knowledge's Cargo.toml. No changes to conductor-daemon's Cargo.toml.
Amendment to Phase 2B (GitHub #649)
Before: "Index build CLI — section-aware Markdown chunker and build pipeline"
After: The CLI binary (conductor-knowledge) lives in the conductor-knowledge crate as a primary binary, not in conductor-daemon or conductor-gui.
Specification: Impact on ADR-022
ADR-022 Phase 6 (LLM Integration) is unaffected in scope but clarified in deployment:
- Phase 6A (binding-topology Canvas artifact): Canvas rendering is already GUI-side. No change.
- Phase 6C (Device profile retrieval via L2): The retrieval happens in the GUI process via conductor-knowledge. The daemon's conductor_suggest_binding tool returns raw fingerprint data; the LLM uses L2 context already in the system prompt to interpret it (see D4 sequence diagram).
- Phase 6D (Community profile sharing stub): L3 is GUI-only per D6.
Specification: Binary Size & Memory Impact
Disk Size
| Binary | Before ADR-023 | After ADR-023 | Delta |
|---|---|---|---|
| conductor-daemon | ~5MB | ~5MB | 0 |
| conductor-gui (with knowledge) | ~20MB | ~70MB (+ort) | +50MB |
| conductor-gui (no knowledge) | ~20MB | ~20MB | 0 |
| conductor-knowledge CLI | N/A | ~55MB | New binary |
| Model file (downloaded) | 0 | ~80MB | +80MB (data dir, downloaded on first launch) |
| Knowledge index | 0 | ~3MB | +3MB (data dir) |
Total install size increase for GUI users: ~130MB (model downloaded on first launch). Total install size increase for daemon-only users: 0.
Resident Memory (RSS) — Peak Estimates
| Component | Estimate | Notes |
|---|---|---|
| ONNX Runtime session | ~100MB | Model weights + inference buffers |
| ONNX thread pool (2 threads) | ~20MB | Dedicated inference threads |
| SQLite knowledge index | ~5-10MB | Page cache, WAL |
| Total knowledge overhead | ~130-150MB | On top of existing GUI baseline |
| GUI without knowledge | 80-120MB | Tauri + webview + chat history |
| GUI with knowledge (peak) | 210-270MB | During active inference |
| GUI with knowledge (idle) | ~150-180MB | Model loaded, no active inference |
Context: DAW workstations typically run 4-12GB of audio plugins. The knowledge layer's ~150MB idle footprint is significant but manageable on 8GB+ machines. On 4GB machines or memory-constrained environments, disable the knowledge feature.
Failure Modes & Degradation
The knowledge layer must never prevent the GUI from functioning. All failure modes degrade gracefully to L1-only operation.
| Failure | Behavior | User-Visible |
|---|---|---|
| Model file missing | Skip L2 retrieval, L1-only | Status: "Knowledge: L1 only (model not available)" |
| Model download fails (network) | Retry on next launch, L1-only for this session | Download dialog shows error with retry option |
| Model file corrupt (checksum mismatch) | Refuse to load, delete corrupt file, L1-only | Status: "Knowledge: model verification failed" |
| ONNX Runtime init fails (unsupported platform) | Log error, L1-only permanently | Status: "Knowledge: not available on this platform" |
| Index missing or incompatible schema | Skip L2, L1-only | Status: "Knowledge: index rebuild required" |
| Index corrupt (SQLite error) | Delete index, L1-only until rebuilt | Log warning |
| Retrieval timeout (>500ms) | Cancel retrieval, proceed with L1-only for this request | No visible indicator (transparent fallback) |
| OOM during inference | Catch panic, unload model, L1-only for rest of session | Log error |
| ONNX Runtime segfault | Process crash (GUI restarts via Tauri) | Crash report; on next launch, disable knowledge auto-load, offer "try again" |
Crash isolation note: The ort C++ runtime can segfault on malformed models or extreme memory pressure. Running it in-process means a crash takes down the GUI. This is acceptable for the initial desktop deployment because: (a) the model is integrity-verified before loading, (b) crashes are rare with verified models, (c) Tauri supports process restart. If crash frequency exceeds acceptable levels, escalate to sidecar (see Sidecar Escalation Criteria).
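The 500ms retrieval-timeout row above can be sketched with a worker thread and a hard deadline; retrieve here is a hypothetical stand-in for the real pipeline:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for the real retrieval pipeline.
fn retrieve(query: &str) -> Vec<String> {
    vec![format!("chunk for: {query}")]
}

// Runs retrieval on a worker thread under a deadline. On timeout the
// request proceeds L1-only (empty context, transparent fallback); the
// worker's late result is silently dropped along with the channel.
pub fn retrieve_with_budget(query: String, budget: Duration) -> Vec<String> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let _ = tx.send(retrieve(&query));
    });
    rx.recv_timeout(budget).unwrap_or_default()
}
```

Note the worker thread is not cancelled on timeout — it finishes and its send fails harmlessly. True cancellation of an in-flight ONNX inference would need cooperation from the inference API.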
Security Considerations
Model Supply Chain (D7)
The ONNX model is an executable artifact — a tampered model could produce adversarial embeddings that bias retrieval results, constituting an indirect prompt injection vector.
Mitigations:
- Integrity verification: A SHA-256 checksum of the official model file is embedded in the conductor-knowledge binary at compile time. The model is verified before every load. Checksum mismatch → refuse to load.
- Download security: Model downloaded over HTTPS from a project-controlled URL. Certificate pinning is not required (HTTPS provides sufficient integrity for this threat model), but the URL is not user-configurable via UI — only via the CONDUCTOR_ONNX_MODEL_PATH env var for advanced users.
- CONDUCTOR_ONNX_MODEL_PATH risk: This env var allows loading an arbitrary model file, bypassing the embedded checksum. This is intentional (for development and custom models) but means an attacker who can set environment variables can swap the model. This is equivalent to the attacker already having code execution, so it does not expand the threat surface.
Knowledge Index at Rest (D9)
The SQLite knowledge index contains embeddings of device profiles, workflow patterns, and potentially user-specific configuration context. This constitutes a fingerprint of the user's studio setup.
Mitigations:
- The index is stored in the user's data directory ($CONDUCTOR_DATA_DIR/knowledge/), protected by OS file permissions.
- The index contains embeddings of reference documentation, not user conversations or personal data. User-specific data (chat history, API keys) is in separate databases.
- Encryption at rest is not required for the initial deployment (the index content is derived from shipped documentation). Revisit if user-generated content (custom profiles, learned patterns) is added to the index.
- The index is excluded from any future cloud sync or backup integration by default.
L3 Online Retrieval Privacy (D6)
When L3 online retrieval is enabled (opt-in only), queries are sent to an external service.
Requirements:
- Data sanitization: Before any L3 network request, the query is stripped of: hardware serial numbers, internal IP addresses, file paths, environment variables, and device aliases. Only the semantic query text is transmitted.
- Trust tiers: L1 = authoritative/trusted (shipped with binary). L2 = trusted (locally indexed from shipped content). L3 = external/untrusted (community-contributed, cached locally). L3 results are injected into the system prompt with a [Community — Unverified] prefix so the LLM can weight them appropriately.
- No embedding vectors transmitted: L3 is strictly text-query-based — the hub is responsible for computing its own embeddings from the query text. Client-side APIs (e.g., the community module) must not accept or transmit client-computed embedding vectors for L3 lookup. The user's local embeddings never leave the machine. Note: the current community.rs stub accepts an embedding parameter for API symmetry; this will be changed to a text query when L3 is implemented.
- User consent: L3 is disabled by default. Enabling it requires explicit opt-in via the Knowledge Sources settings panel, with a clear description of what data leaves the machine.
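A deliberately naive sketch of the sanitization requirement above, assuming simple token-level filtering; the real rules (serial numbers, device aliases) would be stricter and data-driven:

```rust
// Drops query tokens that look like file paths, env var references, or
// IPv4 addresses before the query leaves the machine for L3 lookup.
// Heuristics are illustrative, not the production rule set.
pub fn sanitize_l3_query(query: &str) -> String {
    query
        .split_whitespace()
        .filter(|t| {
            let looks_like_path = t.contains('/') || t.contains('\\');
            let looks_like_env = t.starts_with('$');
            let looks_like_ip = t.matches('.').count() >= 3
                && t.chars().all(|c| c.is_ascii_digit() || c == '.');
            !(looks_like_path || looks_like_env || looks_like_ip)
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```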
Daemon Tool Validation (D4)
The daemon's MCP tool validation (from ADR-007) must not trust GUI-provided context. A compromised knowledge index could theoretically bias the LLM into generating malicious tool calls (e.g., a poisoned L2 profile suggesting a dangerous shell command). The daemon's existing risk-tier validation (ConfigChange requires Plan/Apply, HardwareIO requires confirmation) is the defense. No additional measures are needed because the daemon validates tool parameters, not the reasoning that produced them.
Alternatives Considered
A1: ONNX in the Daemon
The ort dependency is added to conductor-daemon. The retrieval pipeline runs daemon-side.
Rejected because:
- +130MB to the daemon binary affects all users
- +150-200MB RSS on a process that should be 10-15MB
- Daemon loses clean cross-compilation story
- Daemon-only users (headless, no LLM) pay for LLM infrastructure
- Violates the "daemon is a lightweight engine" design principle
A2: Separate conductor-knowledge Sidecar Process
A standalone binary that runs alongside the daemon and GUI. The GUI sends retrieval requests to it via IPC (Unix socket or stdin/stdout, managed by Tauri's sidecar system).
Deferred (not rejected). Revisit when:
- ONNX crashes take down the GUI more than once per 1,000 sessions (crash isolation needed)
- GUI RSS exceeds 300MB with knowledge loaded (memory isolation needed)
- Tauri mobile target is pursued (can't bundle C++ ONNX on iOS/Android easily)
- Third-party clients need knowledge without the GUI (independent lifecycle needed)
- Index rebuild needs to run in the background while GUI is closed
The KnowledgeService trait (D3) makes this migration mechanical: implement SidecarKnowledge that wraps IPC calls to the sidecar process, swap the implementation at startup. No consumer code changes.
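Assuming the sidecar client implements the same trait, the swap reduces to a startup-time factory. SidecarKnowledge and make_knowledge are illustrative names; the trait is repeated in minimal form so the sketch stands alone:

```rust
// Minimal mirror of the trait so the sketch is self-contained.
pub trait KnowledgeService: Send + Sync {
    fn is_available(&self) -> bool;
}

pub struct InProcessKnowledge;
impl KnowledgeService for InProcessKnowledge {
    fn is_available(&self) -> bool {
        true
    }
}

// Hypothetical sidecar client; the real one would wrap IPC calls to
// the separate knowledge process.
pub struct SidecarKnowledge;
impl KnowledgeService for SidecarKnowledge {
    fn is_available(&self) -> bool {
        false // until the sidecar handshake completes
    }
}

// Consumers hold Box<dyn KnowledgeService>, so changing the hosting
// model is a one-line change here, not in any call site.
pub fn make_knowledge(use_sidecar: bool) -> Box<dyn KnowledgeService> {
    if use_sidecar {
        Box::new(SidecarKnowledge)
    } else {
        Box::new(InProcessKnowledge)
    }
}
```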
A3: API-Based Embedding (No Local ONNX)
Use an external embedding API (OpenAI text-embedding-3-small, etc.) instead of local ONNX inference.
Rejected because:
- Requires network for every LLM message (breaks offline operation)
- Ties the knowledge layer to a specific provider
- Privacy: query text sent to embedding API provider
- ADR-018 explicitly chose local embedding for privacy and cost reasons
A4: Feature-Gated ONNX in the Daemon (Optional)
Add ort to conductor-daemon behind an opt-in feature flag.
Rejected because:
- Feature flags in the daemon create a matrix of supported configurations
- The retrieval pipeline needs to integrate with system prompt assembly, which is GUI-side
- If the daemon has the knowledge module, MCP tools would be expected to use it, creating implicit coupling
A5: WebAssembly-Based Inference in the Webview
Run onnxruntime-web or a Rust ML framework (candle) compiled to WASM inside the Tauri webview. This eliminates the native C++ dependency entirely.
Deferred (not rejected) because:
- WASM SIMD support is inconsistent across webview engines
- Inference is 3-10x slower than native ONNX Runtime
- The ort Rust crate does not support WASM targets today
- SQLite index access from WASM requires additional bridging (sql.js or IPC to Rust backend)
- Revisit if: native ONNX distribution becomes untenable for cross-platform builds, or if a web-only deployment target is pursued
A6: Lexical Fallback (BM25/FTS5, No Embeddings)
Use SQLite FTS5 full-text search instead of semantic vector search. Eliminates the ONNX dependency entirely at the cost of retrieval quality.
Not adopted as primary, but available as fallback:
- BM25 retrieval is valuable as a degraded-mode fallback when the ONNX model is unavailable (see Failure Modes)
- The conductor-knowledge crate should implement both FtsRetrieval (BM25) and SemanticRetrieval (ONNX) behind the same KnowledgeService trait
- On platforms where ONNX is unavailable (e.g., future WASM target), FTS5 provides baseline retrieval
Consequences
Positive
- Daemon stays lightweight. 5MB binary, 10-15MB resident, clean cross-compilation. Open-source users get the full event pipeline without LLM overhead.
- Single retrieval integration point. The GUI already assembles the system prompt. L2 retrieval inserts at one point in that pipeline. No new IPC protocols or cross-process coordination.
- Build simplicity. Only the conductor-knowledge crate needs the ort dependency. The daemon builds with pure Rust dependencies. The GUI optionally pulls in conductor-knowledge.
- Clean crate boundary. The KnowledgeService trait enables swapping in-process, sidecar, WASM, or remote implementations without changing consumers.
Negative
- MCP tools can't use L2 knowledge directly. A tool like conductor_suggest_binding can't enrich its response with device profile data from L2. The LLM must use L2 context already present in the system prompt to interpret tool results. This is architecturally clean but means the LLM does the work of connecting tool output to knowledge context, adding cognitive load to the model.
- Daemon-only users get no knowledge features. A headless setup running conductorctl commands via SSH has no L2/L3. L1 (static reference) is still available. If a future web UI, CLI chat, or VS Code extension connects directly to the daemon, it would need its own knowledge integration — the intelligence is not in the platform, it's in the client.
- GUI memory pressure. The knowledge layer adds ~130-150MB RSS to the GUI process. On memory-constrained DAW workstations (8GB with plugins loaded), this is significant. Mitigated by lazy loading (D8) and the option to disable the knowledge feature.
- Model distribution complexity. First-launch download requires async progress UI, retry logic, checksum verification, and error handling. This is a one-time UX cost but adds engineering effort to the release pipeline.
- Update coupling. If the embedding model changes or the index schema evolves, users must update the GUI and potentially redownload the model. The daemon and GUI can no longer be updated fully independently when knowledge features are involved.
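The lazy-loading mitigation (D8) mentioned above can be sketched with `std::sync::OnceLock`. The `EmbeddingModel` type and the background-thread trigger are illustrative assumptions; the real crate would deserialize the ONNX model via `ort` on a dedicated thread pool.

```rust
use std::sync::OnceLock;
use std::thread;

/// Placeholder for the ~80MB embedding model; the real type would wrap an ort session.
struct EmbeddingModel {
    name: &'static str,
}

static MODEL: OnceLock<EmbeddingModel> = OnceLock::new();

/// Load the model at most once. Callers that arrive before loading completes
/// can fall back to FTS5 retrieval instead of blocking the UI.
fn model() -> &'static EmbeddingModel {
    MODEL.get_or_init(|| {
        // In the real crate: read the model from disk and build the ONNX session.
        EmbeddingModel { name: "all-MiniLM-L6-v2" }
    })
}

fn main() {
    // Kick off loading in the background at the first chat interaction,
    // not at app launch, so the GUI stays at baseline RSS until then.
    let handle = thread::spawn(|| model().name);
    println!("loaded: {}", handle.join().unwrap());
}
```

`OnceLock` guarantees the ~130-150MB cost is paid once and only on demand; a user who never opens the chat panel never pays it.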
Neutral
- L1 is unaffected. The static core reference is a text file read by both daemon and GUI.
- The CLI tool in the `conductor-knowledge` crate is architecturally clean. It shares library code with the GUI without requiring the Tauri build toolchain.
Future Considerations
Mobile (Tauri iOS/Android)
The `ort` crate's support for mobile targets (via Core ML on iOS, NNAPI on Android) is experimental. If Conductor targets mobile:
- Option A: Disable the `knowledge` feature on mobile builds. Use L1-only or the FTS5 fallback (A6).
- Option B: Implement a `MobileKnowledge` variant of `KnowledgeService` using platform-native ML frameworks (Core ML, NNAPI) via the `candle` crate or direct FFI.
- Option C: Use the API-based embedding option (A3) on mobile, where network is typically available.
Web Deployment
If a web-based UI replaces or supplements Tauri:
- The `KnowledgeService` trait allows a `RemoteKnowledge` implementation that calls a backend knowledge API.
- The sidecar (A2) becomes a standalone knowledge server.
Multi-Model Future
If larger models are needed (e.g., for reranking), the `KnowledgeService` trait boundary isolates consumers from model changes. The sidecar escalation criteria (A2) should be re-evaluated if model RSS exceeds 300MB.
Implementation
This ADR requires migrating existing code and adding new infrastructure:
- Create the `conductor-knowledge` workspace crate (D3)
- Migrate 6 modules from `conductor-daemon/src/daemon/knowledge/` (see Migration section)
- Implement the `KnowledgeService` trait + `InProcessKnowledge` (D3)
- Implement the `is_knowledge_available()` Tauri command for frontend feature detection (D2)
- Wire retrieval into `system_prompt.rs` via `KnowledgeService` (D1)
- Implement model download + checksum verification (D7)
- Implement lazy model loading on a dedicated thread pool (D8)
- Implement FTS5 fallback retrieval (A6 fallback path)
- Remove all knowledge code from `conductor-daemon`
- Update CI to build the `conductor-knowledge` crate and test both `--features onnx` and `--no-default-features`
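The feature-detection item (D2) can be sketched with `cfg` gating, assuming a Cargo feature named `knowledge` as described in this ADR. In the real GUI this would be exposed as a Tauri command; the bare function here is a simplified stand-in, and the body would additionally check that the model has been downloaded and loaded.

```rust
// With the `knowledge` feature enabled, the GUI links conductor-knowledge
// and reports availability; with it disabled, the symbol still exists so
// the frontend can query it and hide knowledge UI.

#[cfg(feature = "knowledge")]
fn is_knowledge_available() -> bool {
    // Real implementation: also verify model download + successful load.
    true
}

#[cfg(not(feature = "knowledge"))]
fn is_knowledge_available() -> bool {
    false
}

fn main() {
    println!("knowledge available: {}", is_knowledge_available());
}
```

Compiling both cfg branches in CI (`--features onnx` and `--no-default-features`) is what catches drift between the gated and ungated code paths.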
Review Checklist
- Does the deployment boundary hold if the GUI is rewritten (e.g., web-based instead of Tauri)?
  - Yes: the `KnowledgeService` trait allows `RemoteKnowledge` for web backends.
- Does this prevent future daemon-side intelligence (e.g., proactive suggestions without GUI)?
  - Partially: daemon-only mode with LLM support would need the sidecar (A2) or a daemon-embedded `KnowledgeService`.
- Is the `knowledge` feature flag tested in CI?
  - Must be: CI should build `conductor-knowledge` with both `--features onnx` and `--no-default-features`, and `conductor-gui` with both `--features knowledge` and `--no-default-features`.
- Is crash isolation sufficient for desktop?
  - Acceptable for initial deployment with integrity-verified models. Monitor crash frequency; escalate to the sidecar if >1 crash per 1,000 sessions.
- Is the memory footprint acceptable for target users?
  - Acceptable with lazy loading on 8GB+ machines. Document the `--no-default-features` escape hatch for constrained environments.
- Are security considerations addressed?
  - Model integrity: SHA-256 checksum. Index privacy: OS file permissions, no user data in the initial index. L3: data sanitization, trust tiers, opt-in only.