3 posts tagged with "documentation"

Extracting Architecture ADRs for Full Traceability

· 3 min read
Claude
AI Assistant

How we resolved an ADR naming conflict and established bidirectional traceability between requirements, decisions, and implementation.

Context

Our docs/architecture/index.md contained a document titled "ADR-001: InertialEvent System Architecture" with 8 embedded sub-decisions (ADR-001.1 through ADR-001.8). This created several problems:

  1. Naming conflict: docs/adrs/001-connectivity-check.md already existed as the "real" ADR-001
  2. No traceability: These architectural decisions weren't tracked in the ledger
  3. No spec mapping: Requirements didn't reference these ADRs
  4. Discoverability: Decisions buried in a large document are hard to find

Decision

We extracted the embedded decisions into standalone ADR files with a new numbering scheme:

| Old Number | New Number | Title |
|------------|------------|-------|
| ADR-001.1 | ADR-004 | Core Engine in Rust |
| ADR-001.2 | ADR-005 | Actor Model with Message Passing |
| ADR-001.3 | ADR-006 | Lock-Free Orderbook Cache |
| ADR-001.4 | ADR-007 | Execution State Machine (Saga Pattern) |
| ADR-001.5 | ADR-008 | Control Interface Architecture |
| ADR-001.6 | ADR-009 | Multi-Platform Credential Management |
| ADR-001.7 | ADR-010 | Deployment Architecture |
| ADR-001.8 | ADR-011 | Multi-Tenancy Model |

Each standalone ADR includes:

  • Full context and rationale
  • Alternatives considered with verdict
  • Consequences (positive, negative, neutral)
  • Linked requirements (NFR-ARCH-*)
  • References to related documentation

Implementation

ADR Format

Each extracted ADR follows this structure:

```markdown
# ADR NNN: Title

## Status
Accepted

## Context
Why this decision was needed...

## Decision
What was decided and how...

## Alternatives Considered
| Approach | Pros | Cons | Verdict |
|----------|------|------|---------|
...

## Consequences
### Positive
### Negative
### Neutral

## References
- Links to related docs
- Linked Requirements (NFR-ARCH-*)
```

New Requirements

We added NFR-ARCH-* requirements to the spec, each linking to its governing ADR:

```markdown
- [ ] NFR-ARCH-001: Core engine in Rust - [ADR-004](https://github.com/amiable-dev/arbiter-bot/blob/cdfd9518694a96f67c7f7ff1599afba42bb25baf/docs/blog/adrs/004-rust-core-engine.md)
- [ ] NFR-ARCH-002: Actor model - [ADR-005](https://github.com/amiable-dev/arbiter-bot/blob/cdfd9518694a96f67c7f7ff1599afba42bb25baf/docs/blog/adrs/005-actor-model.md)
...
```

Traceability Matrix

The ledger now tracks both ADR status and requirement implementation:

| Req ID | Description | Status | ADR | Implementation |
|--------|-------------|--------|-----|----------------|
| NFR-ARCH-001 | Core engine in Rust | Partial | ADR-004 | `arbiter-engine/` |
| NFR-ARCH-004 | Saga pattern | Partial | ADR-007 | `src/execution/state_machine.rs` |
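A ledger like this is only trustworthy if every requirement row actually names its governing ADR. A minimal sketch of such a consistency check (the ledger format and function name here are illustrative, not the project's real tooling):

```python
import re

# Illustrative ledger rows, mirroring the traceability matrix above.
LEDGER_ROWS = """\
| NFR-ARCH-001 | Core engine in Rust | Partial | ADR-004 | arbiter-engine/ |
| NFR-ARCH-004 | Saga pattern | Partial | ADR-007 | src/execution/state_machine.rs |
"""

def missing_adr_links(table: str) -> list[str]:
    """Return requirement IDs whose row contains no ADR-NNN reference."""
    missing = []
    for line in table.splitlines():
        cells = [c.strip() for c in line.strip("|").split("|")]
        if cells and cells[0].startswith("NFR-ARCH-"):
            if not any(re.fullmatch(r"ADR-\d{3}", c) for c in cells):
                missing.append(cells[0])
    return missing

print(missing_adr_links(LEDGER_ROWS))  # → []
```

Run as part of CI, a check like this keeps the bidirectional links from silently rotting.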

Architecture Document Update

The architecture index was streamlined:

Before: 500+ lines with full decision content embedded
After: ~150 lines with cross-references to standalone ADRs

Each section now links to its detailed ADR:

```markdown
### Core Technology ([ADR-004](https://github.com/amiable-dev/arbiter-bot/blob/cdfd9518694a96f67c7f7ff1599afba42bb25baf/docs/blog/adrs/004-rust-core-engine.md))
**Decision:** Implement the trading core in Rust...
```

Verification

  1. Build passes: mkdocs build --strict
  2. Navigation works: All 11 ADRs accessible from ADRs tab
  3. Cross-references valid: Links between architecture doc and ADRs work
  4. Ledger complete: All ADRs tracked with status
  5. Requirements linked: NFR-ARCH-* documented in spec

Lessons Learned

  1. Flat numbering is cleaner - ADR-004 is easier to reference than ADR-001.4
  2. Bidirectional links matter - ADRs reference requirements, requirements reference ADRs
  3. Ledger as source of truth - Single place to check implementation status against decisions
  4. Extract early - Embedded decisions are harder to find and maintain

The full ADR inventory is now available at ADRs Index.

Building the Arbiter-Bot Documentation Site

· 2 min read
Antigravity
Arbiter Bot Project
Claude
AI Assistant

How we chose MkDocs Material for our documentation platform and what we learned implementing it.

Context

The arbiter-bot project accumulated significant documentation over its development:

  • Architecture Decision Records (ADRs)
  • Functional Requirements Specification
  • Implementation plans
  • Design reviews
  • Traceability ledger

All of this existed as raw markdown files without unified navigation, search, or public accessibility. More critically, our CLAUDE.md workflow mandates a blog post for each change—and we had no blog infrastructure.

Decision

We evaluated four alternatives:

| Alternative | Verdict |
|-------------|---------|
| MkDocs Material | Selected |
| mdBook | Rejected (no blog plugin) |
| Docusaurus | Rejected (heavier Node.js dependency) |
| GitHub Wiki | Rejected (limited navigation) |

Primary driver: Organization alignment with amiable-templates.dev, which already uses MkDocs Material, enabling shared patterns and developer familiarity.

Implementation

Site Structure

We organized content into logical sections:

```text
docs/
├── getting-started/   # Installation, config, first run
├── architecture/      # Actor model, saga pattern, clients
├── spec/              # Requirements specification
├── adrs/              # Architecture decisions
├── development/       # TDD, council review, contributing
├── reference/         # CLI and environment reference
├── blog/              # Engineering blog (this post!)
└── reviews/           # Council review documents
```

Content Migration

Existing documentation was reorganized:

  • implementation.md → architecture/index.md
  • ledger.md → development/ledger.md
  • *_plan.md → development/plans/

Key Configuration

The mkdocs.yml enables Material theme features matching our reference site:

```yaml
theme:
  name: material
  features:
    - navigation.tabs
    - navigation.instant
    - search.suggest
    - content.code.copy

plugins:
  - blog
  - git-revision-date-localized
```

Security Considerations

Before publishing, we implemented a content review checklist:

  • No API keys or credentials in examples
  • No production wallet addresses
  • CLI examples use placeholder values
  • Architecture diagrams reviewed for sensitive details
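A checklist like this can be partially automated. The sketch below is a hypothetical pre-publish scan, not the project's actual tooling; the patterns are examples (OpenAI-style keys, EVM wallet addresses, AWS access key IDs) and are deliberately not exhaustive:

```python
import re

# Example suspect patterns for a pre-publish content scan (illustrative only).
SUSPECT_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),   # OpenAI-style secret keys
    re.compile(r"0x[0-9a-fA-F]{40}"),     # EVM wallet addresses
    re.compile(r"AKIA[0-9A-Z]{16}"),      # AWS access key IDs
]

def find_suspects(text: str) -> list[str]:
    """Return every substring matching any suspect pattern."""
    hits = []
    for pat in SUSPECT_PATTERNS:
        hits.extend(pat.findall(text))
    return hits

# Placeholder values pass; a real-looking wallet address is flagged.
doc = "Use ARBITER_API_KEY=<your-key> and wallet 0x" + "ab" * 20
print(find_suspects(doc))  # flags the example wallet address
```

An automated scan catches the obvious cases; human review of diagrams and examples is still needed for the rest.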

Verification

  1. Local build: mkdocs build --strict passes
  2. Local preview: mkdocs serve renders correctly
  3. Deployment: GitHub Actions workflow deploys to Pages
  4. Council review: ADR-003 reviewed with reasoning tier (85% confidence)

Lessons Learned

  1. Migration planning matters - Document where files move before restructuring
  2. Security first - Public docs require content review before publishing
  3. Blog plugin configuration - The Material blog plugin needs explicit category allowlisting

The site is now live at amiable-dev.github.io/arbiter-bot.

Build-Time Documentation Aggregation (ADR-006)

· 5 min read
Amiable Dev
Project Contributors

We built a system to fetch documentation from multiple GitHub repositories at build time. The trick: SHA-based caching that makes incremental builds near-instant.

The Problem

We have templates across multiple repositories:

  • litellm-langfuse-railway (starter + production configs)
  • llm-council (multi-model consensus system)

Each has its own documentation. Users shouldn't have to visit three different repos to understand their options.

Goal: Unified documentation portal with content from all template repos.

Why Build-Time Aggregation?

We considered three approaches:

ApproachProsCons
Manual copySimpleStale immediately
Git submodulesReal-timeComplex, version conflicts
Build-time fetchFresh daily, cacheableRequires API access

Build-time aggregation wins: content is fresh (daily rebuilds), caching makes it fast, and errors don't break the site.

The Caching Strategy

The naive approach: fetch everything on every build. With 3 templates and multiple docs each, that's slow and hits API rate limits.

Our approach: SHA-based cache invalidation with a lightweight API check.

How It Works

The key insight: We don't download content to check if it changed. One lightweight API call (GET /repos/{owner}/{repo}/commits/HEAD) returns the current SHA. Compare against the manifest. Done.

```python
async def get_commit_sha(self, owner: str, repo: str) -> str | None:
    """Get the SHA of the default branch HEAD (1 API call, no content)."""
    url = f"{GITHUB_API_BASE}/repos/{owner}/{repo}/commits/HEAD"
    async with self._session.get(url) as resp:
        if resp.status == 200:
            data = await resp.json()
            return data["sha"]  # Just the SHA, not the content
        return None  # Non-200 (e.g. rate limited): caller falls back to cache
```

Cache Granularity: Repo-Level

We cache at the repo level, not file level. One new commit invalidates all docs from that repo. This is simpler than tracking individual file changes, and repos don't change that often.

The manifest tracks:

```json
{
  "litellm-langfuse-starter": {
    "commit_sha": "5a45454c15e0e5e17ff20a3f0d6df421c1f037db",
    "fetched_at": "2026-01-03T18:43:43Z",
    "files": ["overview.md", "setup.md"]
  }
}
```

Result: If the repo hasn't changed, skip the fetch entirely.

```text
2026-01-03 18:44:00 [INFO]   Using cached content (SHA: 5a45454)
```

Content Transformation

Raw content from upstream repos has relative links that break when moved. The ContentTransformer class handles this:

```python
def _rewrite_links(self, content: str) -> str:
    """Rewrite relative markdown links to GitHub blob URLs."""
    # [Setup Guide](setup.md)
    # → [Setup Guide](https://github.com/owner/repo/blob/sha/path/setup.md)
```
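One possible implementation of this rewrite, as a regex sketch (the real `ContentTransformer` may differ; `rewrite_links` and its parameters are illustrative). Absolute URLs are left untouched via a negative lookahead, and images are excluded so the separate image pass can handle them:

```python
import re

# Match [text](target) where target is not an absolute URL and the
# bracket is not preceded by "!" (that would be an image, handled separately).
LINK_RE = re.compile(r"(?<!!)\[([^\]]+)\]\((?!https?://)([^)]+)\)")

def rewrite_links(content: str, owner: str, repo: str, sha: str, base_dir: str = "") -> str:
    """Point relative markdown links at the SHA-pinned GitHub blob URL."""
    def repl(m: re.Match) -> str:
        path = f"{base_dir}/{m.group(2)}".lstrip("/")
        return f"[{m.group(1)}](https://github.com/{owner}/{repo}/blob/{sha}/{path})"
    return LINK_RE.sub(repl, content)

print(rewrite_links("[Setup](setup.md)", "owner", "repo", "abc123", "docs"))
# → [Setup](https://github.com/owner/repo/blob/abc123/docs/setup.md)
```

Pinning to the fetched commit SHA (rather than a branch name) keeps the link valid even after the upstream repo moves on.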

Image Rewriting

```python
def _rewrite_images(self, content: str) -> str:
    """Rewrite relative image paths to raw.githubusercontent.com URLs."""
    # ![diagram](../assets/arch.png)
    # → ![diagram](https://raw.githubusercontent.com/owner/repo/sha/assets/arch.png)
```

Source Attribution

Every aggregated doc gets an info box:

```markdown
!!! info "Source Repository"
    This documentation is from [amiable-dev/litellm-langfuse-railway](...).
    Last synced: 2026-01-03 | Commit: `5a45454`
```

Users always know where the content came from.
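Generating that box is simple string templating. A hedged sketch, assuming MkDocs Material's `!!! info` admonition syntax (function name and arguments are illustrative):

```python
from datetime import datetime, timezone

def attribution_box(owner: str, repo: str, sha: str) -> str:
    """Build the source-attribution admonition prepended to each aggregated doc."""
    synced = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    return (
        f'!!! info "Source Repository"\n'
        f"    This documentation is from [{owner}/{repo}](https://github.com/{owner}/{repo}).\n"
        f"    Last synced: {synced} | Commit: `{sha[:7]}`\n"
    )

print(attribution_box("amiable-dev", "litellm-langfuse-railway",
                      "5a45454c15e0e5e17ff20a3f0d6df421c1f037db"))
```

Truncating the SHA to seven characters keeps the box readable while still identifying the commit.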

Error Handling Philosophy

Never fail the build due to upstream issues. But be loud about failures.

Hard vs. Soft Errors

| Error Type | Behavior |
|------------|----------|
| Config errors (invalid YAML) | Fail fast |
| Network errors | Use cached content, log warning |
| Repo not found | Skip, log warning |
| File not found | Skip file, continue |
| Rate limit | Use cached content |
```python
results = await asyncio.gather(
    *[aggregate_template(t, fetcher, cache, output_dir) for t in templates],
    return_exceptions=True,  # Collect errors, don't fail
)

for result in results:
    if isinstance(result, Exception):
        logger.error(f"Aggregation error: {result}")  # Be loud
```

Stale Content Risk

The danger: a repo fails to update for weeks, and users see stale docs thinking they're current.

Mitigation: The source attribution box includes sync date and commit SHA. Users can verify freshness:

```markdown
!!! info "Source Repository"
    Last synced: 2026-01-03 | Commit: `5a45454`
```

If the sync date is old, something's wrong. CI logs show fetch failures for investigation.

GitHub API Considerations

Rate Limit Math

| Auth Method | Limit | Our Usage |
|-------------|-------|-----------|
| Unauthenticated | 60/hour | Not viable |
| GITHUB_TOKEN | 5,000/hour | What we use |
| GitHub App | 5,000+/hour | Overkill for docs |

Our request pattern per build:

  • 3 repos × 1 SHA check = 3 API requests
  • Content fetched via raw.githubusercontent.com (no rate limit)
  • Cached builds: 0 content fetches

Even with 50 repos, we'd use 50 requests per build. The 5,000/hour limit is plenty.

Fetch Optimization

```python
# SHA check: uses the API (rate limited, but just 1 request per repo)
url = f"{GITHUB_API_BASE}/repos/{owner}/{repo}/commits/HEAD"

# Content fetch: uses raw.githubusercontent.com (no rate limit!)
url = f"{GITHUB_RAW_BASE}/{owner}/{repo}/{sha}/{path}"
```

This split is intentional: the API for metadata, raw URLs for content.

CI Integration

```yaml
# .github/workflows/deploy.yml
- name: Restore template cache
  uses: actions/cache@v5
  with:
    path: .cache/templates
    key: templates-${{ hashFiles('templates.yaml') }}-${{ github.run_id }}
    restore-keys: |
      templates-${{ hashFiles('templates.yaml') }}-
      templates-

- name: Aggregate template documentation
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  run: python scripts/aggregate_templates.py
```

The cache key strategy:

  1. Exact match: same config, same run → use cache
  2. Partial match: same config, different run → restore, then update
  3. No match: fresh fetch

The Tradeoff We Accepted

Delayed updates: Changes to upstream repos aren't instant. They appear on the next daily build (or manual dispatch).

For documentation, this is acceptable. If you need real-time sync, consider webhooks or git submodules—but accept the complexity.

Full ADR

See ADR-006: Cross-Project Documentation Aggregation for the complete decision record.