
ADR-037: n8n Workflow Automation Integration

Status: DRAFT
Date: 2025-12-30
Context: Enabling LLM Council as an "agent jury" in workflow automation
Depends On: ADR-009 (HTTP API), ADR-025 (Future Integration Capabilities)
Author: @amiable-dev
Council Review: 2025-12-30 (High Tier, 4/4 models)

Context

LLM Council's HTTP API enables integration with workflow automation platforms. Among these, n8n stands out as a popular open-source alternative to Zapier with strong LLM integration capabilities and a visual workflow editor.

Problem Statement

Engineers building automation workflows face a fundamental limitation: a single model offers one unchecked perspective, carries no consensus signal, and is prone to hallucination. When automating high-stakes decisions (code reviews, support triage, design approvals), teams need:

  1. Consensus-based decisions - Multiple perspectives reduce single-model bias
  2. Confidence signals - Understanding when the AI is uncertain
  3. Auditability - Transparent reasoning for compliance and debugging
  4. Flexibility - Both binary (go/no-go) and synthesized (detailed analysis) outputs

n8n Platform Analysis

Why n8n as the first integration target:

| Factor | n8n | Zapier | Make |
| --- | --- | --- | --- |
| Open Source | Yes (Fair-code) | No | No |
| Self-hosted | Yes | No | No |
| LLM Integration | Native AI nodes | Add-ons | Add-ons |
| HTTP Flexibility | Full control | Limited | Limited |
| Community Templates | 7600+ | Larger | Medium |
| Target Audience | Engineers/DevOps | Business users | Mixed |

n8n's engineer-focused design, HTTP Request node flexibility, and self-hosting capability align with LLM Council's technical audience.

Current State (Pre-ADR)

ADR-025 identified n8n as P2 priority:

"Create n8n integration example/template"

HTTP API availability (ADR-009):

  • POST /v1/council/run - Synchronous deliberation
  • GET /v1/council/stream - SSE streaming
  • GET /v1/health - Health check

Gap: No documented patterns, example workflows, or security guidance for n8n integration.

Decision

Implement comprehensive n8n integration with three workflow templates, security patterns, and documentation targeting engineer audiences.

1. Integration Architecture

┌─────────────────────────────────────────────────────────────┐
│ n8n Workflow │
├─────────────────────────────────────────────────────────────┤
│ ┌──────────┐ ┌───────────────┐ ┌─────────────────┐ │
│ │ Trigger │───▶│ HTTP Request │───▶│ Parse Response │ │
│ │ (Webhook)│ │ POST /v1/ │ │ (Set Node) │ │
│ └──────────┘ │ council/run │ └────────┬────────┘ │
│ └───────────────┘ │ │
│ ▼ │
│ ┌───────────────────┐ │
│ │ Conditional Logic │ │
│ │ (IF verdict=...) │ │
│ └─────────┬─────────┘ │
│ │ │
│ ┌───────────────┼───────────┐ │
│ ▼ ▼ ▼ │
│ ┌─────────┐ ┌─────────┐ ┌──────┐│
│ │ Approve │ │ Reject │ │ Flag ││
│ │ Action │ │ Action │ │Human ││
│ └─────────┘ └─────────┘ └──────┘│
└─────────────────────────────────────────────────────────────┘
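The "Parse Response" step typically flattens the council output into the handful of fields the conditional node branches on. A minimal Code-node equivalent of the Set node shown above, assuming the response schema defined in Section 2:

// Parse Response - pull out the fields the conditional logic needs
const res = $input.first().json;

return [{
  json: {
    request_id: res.request_id,
    verdict: res.stage3?.verdict ?? null,        // set for binary verdicts
    confidence: res.stage3?.confidence ?? null,
    synthesis: res.stage3?.synthesis ?? '',
    models_consulted: res.metadata?.models_consulted ?? []
  }
}];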

2. API Contract

Request Schema

{
  "prompt": "Review this code for security issues:\n{{ $json.diff }}",
  "verdict_type": "synthesis | binary | tie_breaker",
  "confidence": "quick | balanced | high",
  "include_dissent": false,
  "metadata": {
    "correlation_id": "{{ $execution.id }}",
    "workflow_id": "code-review-v1",
    "idempotency_key": "{{ $json.event_id }}"
  }
}

Response Schema

{
  "request_id": "council_abc123",
  "stage1": [
    {"model": "gpt-4", "content": "...", "tokens": 450}
  ],
  "stage2": [
    {"reviewer": "claude-3", "rankings": [...], "rationale": "..."}
  ],
  "stage3": {
    "synthesis": "The council identified three concerns...",
    "verdict": "approved | rejected",
    "confidence": 0.85,
    "dissent": [{"model": "gemini-pro", "opinion": "..."}]
  },
  "metadata": {
    "correlation_id": "exec_xyz789",
    "models_consulted": ["gpt-4", "claude-3", "gemini-pro"],
    "aggregate_rankings": {"Response A": 1.5, "Response B": 2.3},
    "timing": {
      "total_ms": 18500,
      "stage1_ms": 8200,
      "stage2_ms": 7100,
      "stage3_ms": 3200
    },
    "token_usage": {
      "input": 2400,
      "output": 1850,
      "total": 4250
    }
  }
}

Error Response Schema

{
  "error": {
    "code": "COUNCIL_TIMEOUT | RATE_LIMITED | PARTIAL_FAILURE | VALIDATION_ERROR",
    "message": "Human-readable error description",
    "details": {
      "models_succeeded": ["gpt-4"],
      "models_failed": ["claude-3", "gemini-pro"],
      "retry_after_seconds": 30
    }
  }
}

3. Verdict Types for Automation

| Verdict Type | Use Case | Response Structure | Confidence Interpretation |
| --- | --- | --- | --- |
| synthesis | Detailed analysis (code review, design feedback) | stage3.synthesis (string) | Agreement level among models |
| binary | Go/no-go gates (triage, approval) | stage3.verdict (approved/rejected), confidence (0-1) | Proportion of models agreeing |
| tie_breaker | Deadlocked decisions | Chairman resolves split votes | N/A (chairman decides) |

Confidence Score Semantics:

  • 0.0-0.5: Strong disagreement, recommend human review
  • 0.5-0.7: Mixed consensus, proceed with caution
  • 0.7-0.9: Moderate agreement, generally reliable
  • 0.9-1.0: Strong consensus, high reliability
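These bands translate directly into workflow branches. A Code-node sketch of the mapping (the route labels are illustrative, not part of the API):

// Map the confidence score onto the bands above
const confidence = $json.stage3?.confidence ?? 0;

let route;
if (confidence < 0.5) {
  route = 'human_review';            // strong disagreement
} else if (confidence < 0.7) {
  route = 'proceed_with_caution';    // mixed consensus
} else {
  route = 'auto_proceed';            // moderate-to-strong agreement
}

return [{ json: { ...$json, route } }];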

4. Workflow Templates

4.1 Code Review Automation

  • Trigger: GitHub PR webhook
  • Verdict: synthesis
  • Output: Detailed security, performance, and style analysis
  • Action: Post review comment to PR

4.2 Support Ticket Triage

  • Trigger: Ticket creation webhook
  • Verdict: binary
  • Output: approved (URGENT) or rejected (STANDARD)
  • Action: Route to appropriate queue based on verdict + confidence
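A routing sketch for this template, combining verdict and confidence (the queue names are hypothetical):

// Route the ticket based on verdict + confidence
const { verdict, confidence } = $json.stage3 ?? {};

let queue = 'standard';
if (verdict === 'approved' && confidence >= 0.7) {
  queue = 'urgent';                  // confident escalation
} else if (verdict === 'approved') {
  queue = 'urgent-needs-review';     // escalated, but a human confirms first
}

return [{ json: { ...$json, queue } }];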

4.3 Technical Design Decision

  • Trigger: Design doc submission
  • Verdict: synthesis with include_dissent: true
  • Output: Council recommendation + minority opinions
  • Action: Send to Slack/email for human review
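For this template, the request body is the Section 2 schema with dissent surfaced. A sketch (the prompt text comes from Appendix A; the workflow_id value is an assumed label):

{
  "prompt": "As a council of senior architects, evaluate this design:\n{{ $json.design_doc }}",
  "verdict_type": "synthesis",
  "confidence": "high",
  "include_dissent": true,
  "metadata": {
    "correlation_id": "{{ $execution.id }}",
    "workflow_id": "design-decision-v1"
  }
}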

5. Security Model

5.1 Trust Boundaries

┌─────────────────────────────────────────────────────────────────┐
│ UNTRUSTED ZONE │
│ ┌─────────────┐ │
│ │ GitHub/Jira │──webhook──┐ │
│ │ (PR, Ticket)│ │ │
│ └─────────────┘ │ │
└────────────────────────────┼────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ n8n WORKFLOW (SEMI-TRUSTED) │
│ ┌──────────┐ ┌──────────────┐ ┌───────────────────────┐ │
│ │ Validate │───▶│ Sanitize │───▶│ Call LLM Council │ │
│ │ Webhook │ │ Input │ │ (API Key + TLS) │ │
│ └──────────┘ └──────────────┘ └───────────────────────┘ │
└────────────────────────────────────────────────────────────────┘

▼ HTTPS + API Key
┌────────────────────────────────────────────────────────────────┐
│ LLM COUNCIL API (TRUSTED) │
│ - Input validation │
│ - Rate limiting │
│ - Audit logging │
└────────────────────────────────────────────────────────────────┘

5.2 Authentication (Outbound to LLM Council)

API Key Authentication:

// n8n HTTP Request node - Header Auth
{
  "headerAuth": {
    "name": "Authorization",
    "value": "Bearer {{ $credentials.llmCouncilApiKey }}"
  }
}

Credential Storage Requirements:

  • Store API keys in n8n Credential Objects (encrypted at rest)
  • Never hardcode in workflow JSON
  • Use environment-specific credentials (dev/staging/prod)
  • Rotate keys quarterly or on suspected compromise

5.3 HMAC Signature Verification (Inbound Webhooks)

For webhook callbacks from LLM Council:

// n8n Function node - HMAC verification
const crypto = require('crypto');

const payload = JSON.stringify($input.first().json);
const secret = $credentials.llmCouncilWebhookSecret;
const receivedSig = $input.first().headers['x-council-signature'];
const receivedNonce = $input.first().headers['x-council-nonce'];
const receivedTimestamp = parseInt($input.first().headers['x-council-timestamp'], 10);

// 1. Timestamp validation (±5 minutes)
const now = Math.floor(Date.now() / 1000);
if (Math.abs(now - receivedTimestamp) > 300) {
  throw new Error('Timestamp too old or in future');
}

// 2. Nonce replay protection (requires an external store;
//    $env.cache stands in for your Redis/DB client)
//    Store nonces with a TTL matching the timestamp window
const nonceKey = `nonce:${receivedNonce}`;
if (await $env.cache.exists(nonceKey)) {
  throw new Error('Nonce already used - replay attack detected');
}
await $env.cache.set(nonceKey, '1', { EX: 600 }); // 10 min TTL

// 3. HMAC verification with timing-safe comparison
const expectedSig = 'sha256=' + crypto
  .createHmac('sha256', secret)
  .update(`${receivedTimestamp}.${receivedNonce}.${payload}`)
  .digest('hex');

// timingSafeEqual throws on unequal lengths, so check the length explicitly first
if (
  !receivedSig ||
  receivedSig.length !== expectedSig.length ||
  !crypto.timingSafeEqual(Buffer.from(expectedSig), Buffer.from(receivedSig))
) {
  throw new Error('Invalid signature');
}

return $input.first();

5.4 Input Validation & Prompt Injection Mitigation

Input Sanitization:

// n8n Function node - Input sanitization
const input = $input.first().json;

// 1. Size limits
const MAX_DIFF_SIZE = 50000; // 50KB
const diff = input.diff?.substring(0, MAX_DIFF_SIZE) || '';

// 2. Remove potential prompt injection patterns
const sanitized = diff
  .replace(/```system/gi, '```code')   // Prevent system prompt injection
  .replace(/\[INST\]/gi, '[CODE]')     // Prevent instruction injection
  .replace(/<<SYS>>/gi, '<<CODE>>');   // Prevent Llama-style injection

// 3. Escape special characters in structured fields
const safeSubject = input.subject?.replace(/[<>]/g, '') || '';

return {
  json: {
    diff: sanitized,
    subject: safeSubject,
    // Preserve original for audit
    _original_size: input.diff?.length || 0,
    _was_truncated: (input.diff?.length || 0) > MAX_DIFF_SIZE
  }
};

LLM Council Server-Side Protections:

  • Input length validation (reject oversized payloads)
  • Rate limiting per API key
  • Audit logging of all requests
  • System prompt isolation from user content

5.5 Secret Rotation Strategy

| Secret | Rotation Frequency | Procedure |
| --- | --- | --- |
| API Key | 90 days or on compromise | Generate new key → update n8n credential → revoke old key |
| Webhook Secret | 90 days | Support dual secrets during rotation window |
| TLS Certificates | Auto-renew (Let's Encrypt) | Managed by reverse proxy |
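A sketch of dual-secret acceptance during the webhook-secret rotation window, reusing the timestamp, nonce, payload, and signature variables from the Section 5.3 snippet (the credential field names here are hypothetical):

const crypto = require('crypto');

// Compute and compare the signature for one candidate secret
function matches(secret, timestamp, nonce, payload, receivedSig) {
  const expected = 'sha256=' + crypto
    .createHmac('sha256', secret)
    .update(`${timestamp}.${nonce}.${payload}`)
    .digest('hex');
  return receivedSig.length === expected.length &&
    crypto.timingSafeEqual(Buffer.from(expected), Buffer.from(receivedSig));
}

// Accept either the current or the previous secret while both are active
const ok =
  matches($credentials.webhookSecretCurrent, receivedTimestamp, receivedNonce, payload, receivedSig) ||
  matches($credentials.webhookSecretPrevious, receivedTimestamp, receivedNonce, payload, receivedSig);

if (!ok) {
  throw new Error('Invalid signature under both active webhook secrets');
}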

6. Performance Configuration

| Confidence Tier | Models | Typical Latency | Timeout Setting | Est. Cost/Call |
| --- | --- | --- | --- | --- |
| quick | 2 | 5-10s | 60000ms | ~$0.02 |
| balanced | 3 | 10-20s | 90000ms | ~$0.04 |
| high | 4 | 20-40s | 120000ms | ~$0.08 |

Cost Estimation Formula:

cost ≈ (input_tokens / 1000 × $0.003 + output_tokens / 1000 × $0.015) × num_models

Assumes GPT-4-class pricing of roughly $3 per million input tokens and $15 per million output tokens. Actual cost varies by model mix.
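For example, at the balanced tier with roughly 800 input and 600 output tokens per model: (0.8 × $0.003 + 0.6 × $0.015) × 3 ≈ $0.034 per call, broadly in line with the balanced-tier estimate above (the token counts are illustrative).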

HTTP Request Node Settings:

{
  "parameters": {
    "options": {
      "timeout": 120000
    }
  },
  "retryOnFail": true,
  "maxTries": 3,
  "waitBetweenTries": 5000
}

n8n's built-in retry does not filter by status code; 429- and 5xx-specific handling (including retry_after_seconds) is implemented in the workflow itself (Section 8).

7. Partial Failure Handling

Scenario: 2 of 3 models respond successfully.

| Policy | Behavior | Use Case |
| --- | --- | --- |
| require_all | Fail entire request | High-stakes decisions |
| require_majority | Proceed with 2/3 | Default behavior |
| best_effort | Proceed with any response | Low-stakes, speed-critical |

Default Behavior (require_majority):

  • If a majority of models respond (e.g., 2 of 3), proceed with the available responses
  • Flag partial failure in response metadata
  • Include models_failed array for debugging
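A small Function-node sketch that surfaces partial failures to the downstream branch (the field locations follow the schemas above and should be treated as assumptions):

// Flag partial failures before the conditional logic node
const item = $input.first().json;
const failed = item.metadata?.models_failed
  ?? item.error?.details?.models_failed
  ?? [];

return [{
  json: {
    ...item,
    partial_failure: failed.length > 0,
    models_failed: failed
  }
}];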

8. Error Handling Strategy

┌─────────────┐     ┌──────────────┐     ┌─────────────────┐
│ Council │────▶│ IF Error? │────▶│ Retry w/ Backoff│
│ Request │ └──────┬───────┘ └────────┬────────┘
└─────────────┘ │ │
│ ▼
┌──────┴──────┐ ┌─────────────┐
│ Process │ │ Error Type? │
│ Success │ └──────┬──────┘
└─────────────┘ │
┌─────────────┼─────────────┐
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ 429: Wait│ │ 5xx: │ │ Timeout: │
│ & Retry │ │ Fallback │ │ Human │
└──────────┘ └──────────┘ └──────────┘

n8n Implementation:

// Error handling with n8n Error Trigger
{
  "nodes": [
    {
      "name": "Council Request",
      "type": "n8n-nodes-base.httpRequest",
      "continueOnFail": true  // Don't stop workflow on error
    },
    {
      "name": "Check Error",
      "type": "n8n-nodes-base.if",
      "parameters": {
        "conditions": {
          "string": [{
            "value1": "={{ $json.error?.code }}",
            "operation": "isNotEmpty"
          }]
        }
      }
    },
    {
      "name": "Handle 429",
      "type": "n8n-nodes-base.wait",
      "parameters": {
        "amount": "={{ $json.error?.details?.retry_after_seconds || 30 }}"
      }
    }
  ]
}

Fallback Chain:

  1. Retry (3x with exponential backoff) - For transient errors
  2. Degrade (single model) - For persistent council failures (configurable per workflow)
  3. Human escalation - For critical decisions, never auto-approve

Policy Configuration (per workflow):

const FALLBACK_POLICY = {
  "code-review": {
    allow_single_model_fallback: true,   // Code review can degrade
    human_escalation_threshold: 0.5      // Escalate if confidence < 0.5
  },
  "security-approval": {
    allow_single_model_fallback: false,  // Security requires council
    human_escalation_threshold: 0.8      // Higher bar for auto-approve
  }
};

9. Observability & Debugging

9.1 Correlation ID Propagation

// Pass the correlation ID through the entire workflow
const correlationId = $json.correlation_id || `n8n_${$execution.id}`;

// Include in the LLM Council request metadata
const requestMetadata = {
  correlation_id: correlationId,
  n8n_execution_id: $execution.id,
  n8n_workflow_id: $workflow.id
};

// Log for tracing
console.log(`[${correlationId}] Council request initiated`);

return [{ json: { ...$json, metadata: requestMetadata } }];

9.2 Metrics to Monitor

| Metric | Description | Alert Threshold |
| --- | --- | --- |
| council_latency_p95 | 95th percentile response time | > 60s |
| council_error_rate | Percentage of failed requests | > 5% |
| council_consensus_rate | Requests with confidence > 0.7 | < 80% |
| council_fallback_rate | Requests using single-model fallback | > 10% |

9.3 Troubleshooting Guide

| Symptom | Likely Cause | Resolution |
| --- | --- | --- |
| COUNCIL_TIMEOUT | High tier + large input | Reduce tier or truncate input |
| RATE_LIMITED (429) | Too many concurrent requests | Implement request queueing |
| PARTIAL_FAILURE | Model API issues | Check models_failed, retry later |
| Inconsistent verdicts | Ambiguous prompt | Improve prompt specificity |
| Low confidence scores | Models disagree | Review dissent, consider human review |
| HMAC validation failed | Clock drift or secret mismatch | Sync NTP, verify secret |

10. Versioning Strategy

API Versioning:

  • Templates target LLM Council API v1 (/v1/council/*)
  • v1 will remain supported for 18 months after v2 GA
  • Breaking changes require major version bump
  • Additive changes (new fields) are non-breaking
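To keep the /v1 prefix in one place rather than repeated in every node, the base URL can come from an environment variable. A sketch (the variable name is an assumption, and $env access must be enabled on the n8n instance):

// Resolve the council endpoint from a single environment variable
const baseUrl = $env.LLM_COUNCIL_BASE_URL || 'https://council.example.com/v1';

return [{
  json: {
    ...$json,
    council_run_url: `${baseUrl}/council/run`
  }
}];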

Template Versioning:

{
  "name": "LLM Council - Code Review",
  "meta": {
    "llm_council_template_version": "1.0.0",
    "llm_council_api_version": "v1",
    "n8n_min_version": "1.20.0"
  }
}

Compatibility Matrix:

| Template Version | API Version | n8n Version |
| --- | --- | --- |
| 1.0.x | v1 | ≥1.20.0 |
| 2.0.x (future) | v1, v2 | ≥1.30.0 |

Consequences

Positive

  1. Lower integration barrier - Engineers can import ready-to-use templates
  2. Best practices encoded - Security patterns (HMAC, input validation) built into examples
  3. Discoverability - n8n Creator Hub expands reach to automation engineers
  4. Reference architecture - Patterns applicable to other platforms (Make, Zapier)

Negative

  1. Maintenance burden - Templates need updates when API changes
  2. Version coupling - n8n workflow format may evolve
  3. Limited scope - Only covers HTTP integration, not MCP
  4. Sync limitations - Long deliberations may exceed timeouts

Trade-offs

| Choice | Alternative | Rationale |
| --- | --- | --- |
| n8n first | Zapier, Make | Open source, engineer audience, self-hosted |
| HTTP API | MCP Server | HTTP is universal, n8n lacks MCP support |
| 3 templates | More use cases | Quality over quantity; community can extend |
| Sync-first | Async-first | Simpler to document; covers 95% of use cases |
| HTTP Request node | Custom n8n node | Faster to ship; custom node for future roadmap |

Deferred: Async Webhook Callbacks

  • Why not now: Adds complexity (callback URL registration, delivery guarantees, state management)
  • When to revisit: If users report timeout failures > 5% of calls

Implementation

Files Created

| File | Purpose |
| --- | --- |
| docs/integrations/index.md | Integration landing page |
| docs/integrations/n8n.md | Comprehensive n8n guide |
| docs/examples/n8n/code-review-workflow.json | PR review template |
| docs/examples/n8n/support-triage-workflow.json | Triage template |
| docs/examples/n8n/design-decision-workflow.json | Design review template |
| docs/blog/08-n8n-workflow-automation.md | Blog post |
| tests/test_n8n_examples.py | TDD test suite (28 tests) |

Files Modified

| File | Change |
| --- | --- |
| mkdocs.yml | Added Integrations section, blog entry |

Future Work

  1. Async webhook callbacks - When timeout failures exceed 5%
  2. Custom n8n node - Better UX, typed fields, versioned behavior
  3. n8n Creator Hub - Submit templates for official listing
  4. Additional platforms - Make, Zapier adapters
  5. MCP integration - When n8n adds MCP support

Validation

Acceptance Criteria

  • All workflow JSON files are valid and importable
  • Documentation covers security (HMAC, auth, input validation)
  • Documentation covers timeouts and error handling
  • Blog post reviewed by LLM Council for engineer audience
  • 28 TDD tests pass
  • mkdocs builds without warnings

Test Results

tests/test_n8n_examples.py::TestN8nWorkflowExamples - 19 passed
tests/test_n8n_examples.py::TestIntegrationDocs - 5 passed
tests/test_n8n_examples.py::TestBlogPost - 4 passed
Total: 28 passed

References

External Resources

Appendix A: Prompt Engineering Patterns

Code Review Prompt

You are a code review expert. Review this code change for:
1. Security vulnerabilities (injection, XSS, etc.)
2. Performance issues
3. Code style and best practices
4. Potential bugs

Diff:
{{ $json.diff }}

Provide specific, actionable feedback with line references.

Triage Prompt

Should this ticket be escalated to URGENT priority?

Criteria for URGENT:
- Production system down
- Data loss or corruption
- Security incident
- Multiple users affected

Ticket Subject: {{ $json.subject }}
Ticket Body: {{ $json.body }}

Respond approved (URGENT) or rejected (STANDARD).

Design Review Prompt

As a council of senior architects, evaluate this design:

{{ $json.design_doc }}

Consider: Scalability, Maintainability, Security, Cost, Complexity.

Provide:
- Recommendation (proceed/revise/reject)
- Critical concerns
- Suggested improvements

Appendix B: Council Review Summary

Review Date: 2025-12-30
Tier: High (4 models)
Models: grok-4.1-fast, gemini-3-pro-preview, gpt-5.2, claude-opus-4.5

Key Feedback Incorporated

  1. API Schemas - Added complete request/response/error schemas (Section 2)
  2. Security Depth - Expanded to include authentication, input validation, prompt injection mitigation, nonce-based replay protection, secret rotation (Section 5)
  3. Partial Failure Semantics - Documented require_all/require_majority/best_effort policies (Section 7)
  4. Error Handling - Added n8n-specific implementation patterns (Section 8)
  5. Observability - Added correlation ID propagation, metrics, troubleshooting guide (Section 9)
  6. Versioning - Added API and template versioning strategy (Section 10)
  7. Cost Estimation - Added per-tier cost estimates and formula (Section 6)
  8. Async Trade-off - Documented why sync-first, when to revisit (Trade-offs section)

Deferred Recommendations

  • Custom n8n node (prioritized for future roadmap)
  • Extension points for custom verdict types (future work)
  • mTLS for enterprise deployments (not yet required)