
7 posts tagged with "amiable-templates"


Building Template Management Tooling: ADR-007

· 4 min read
Amiable Dev
Project Contributors

How we built a CLI tool and Claude Code skill to manage our template registry with three levels of validation.

The Problem

ADR-003 gave us a declarative template registry (templates.yaml), but managing it was painful:

  1. Error-prone: Nested YAML structures are easy to mess up
  2. Undiscoverable: New contributors didn't know required fields
  3. No feedback: Errors only surfaced during CI builds
  4. Manual validation: Run JSON Schema checks by hand

We needed tooling for both humans and LLMs to manage templates reliably.

The Solution: Hybrid Approach

We evaluated four options:

| Option | Verdict |
|---|---|
| Claude Code Skills only | Limited to Claude Code users |
| MCP Server | Overkill for 3 templates |
| Makefile only | No guided prompts |
| Hybrid (Skills + Makefile + CLI) | Best of all worlds |

The hybrid approach uses a single Python CLI as the canonical implementation, with both Skills and Makefile as interfaces.

The CLI: template_manager.py

All operations go through one entry point:

# Validation
python scripts/template_manager.py validate
python scripts/template_manager.py validate --deep # Network checks

# List templates
python scripts/template_manager.py list
python scripts/template_manager.py list --category observability --format json

# CRUD operations
python scripts/template_manager.py add --id my-template --repo owner/repo ...
python scripts/template_manager.py update my-template --tier production
python scripts/template_manager.py remove old-template

Why One CLI?

  • Single source of truth: Skills and Makefile both call the same code
  • Testable: 54 unit tests cover all operations
  • Consistent: Same validation logic everywhere
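A minimal sketch of what a single entry point like this could look like with argparse subcommands (the subcommand names come from the examples above; the flags and dispatch shape are illustrative, not the actual template_manager.py code):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """One parser, one entry point: Skills and Makefile both shell out to this."""
    parser = argparse.ArgumentParser(prog="template_manager.py")
    sub = parser.add_subparsers(dest="command", required=True)

    validate = sub.add_parser("validate", help="Run schema + semantic checks")
    validate.add_argument("--deep", action="store_true", help="Also run network checks")

    list_cmd = sub.add_parser("list", help="List templates")
    list_cmd.add_argument("--category")
    list_cmd.add_argument("--format", choices=["table", "json"], default="table")

    add = sub.add_parser("add", help="Add a template entry")
    add.add_argument("--id", required=True)
    add.add_argument("--repo", required=True)  # owner/repo

    return parser

# Example: parse the same invocation the Makefile would use
args = build_parser().parse_args(["validate", "--deep"])
```

Because both interfaces end up in this one parser, a flag added here is immediately available to Skills, Makefile, and humans alike.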

Three Levels of Validation

Not all validation is equal. We separated checks by speed and importance:

| Level | When | What | Blocking? |
|---|---|---|---|
| Level 1: Schema | Always | JSON Schema conformance, types, required fields | Yes |
| Level 2: Semantic | Always | Unique IDs, valid category refs, HTTPS URLs | Yes |
| Level 3: Network | --deep only | URL reachability, GitHub repo existence | No (warning) |

Level 3 is opt-in because network checks are slow and external services can be flaky:

$ python scripts/template_manager.py validate --deep

Network warnings:
- Template 'litellm-langfuse-starter' links.railway_template not found (404)
Validation passed: templates.yaml

The template is still valid—we just warn about the broken link.
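The tiering can be sketched as one function that always runs the cheap checks and only touches the network on request. This is an illustrative sketch, not the actual template_manager.py implementation; the field names follow the registry examples in this post and url_reachable is a stub standing in for a real HTTP check:

```python
def validate_registry(templates: list[dict], deep: bool = False) -> tuple[list[str], list[str]]:
    """Return (blocking_errors, warnings). Levels 1-2 always run; level 3 is opt-in."""
    errors: list[str] = []
    warnings: list[str] = []

    # Levels 1-2: structural and semantic checks (fast, always blocking)
    seen_ids = set()
    for t in templates:
        tid = t.get("id")
        if not tid:
            errors.append("template missing required field 'id'")
            continue
        if tid in seen_ids:
            errors.append(f"duplicate template id: {tid}")
        seen_ids.add(tid)
        for url in t.get("links", {}).values():
            if not url.startswith("https://"):
                errors.append(f"{tid}: non-HTTPS URL: {url}")

    # Level 3: network checks (slow, opt-in, never blocking)
    if deep:
        for t in templates:
            for url in t.get("links", {}).values():
                if not url_reachable(url):
                    warnings.append(f"{t['id']}: link not reachable: {url}")

    return errors, warnings

def url_reachable(url: str) -> bool:
    return True  # stub; a real check would issue an HTTP request and inspect the status
```

Keeping the two return channels separate is what lets level 3 downgrade to a warning instead of failing the run.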

Claude Code Skill Integration

For LLM-assisted workflows, we created a skill at .claude/skills/template-registry/:

template-registry/
├── SKILL.md # Main instructions (safety rules, CLI commands)
├── schema-reference.md # Field documentation
└── examples.md # Common patterns

The skill teaches Claude to use the CLI safely:

## Important Safety Rules
1. ALWAYS run validation before any write operation
2. NEVER commit directly to main - create a branch/PR
3. Treat all LLM outputs as untrusted until validated

Now you can ask Claude: "Add a new template for my-awesome-project" and it will:

  1. Use the CLI with proper arguments
  2. Run validation
  3. Create a branch and PR

Security Hardening

LLM-assisted editing introduces risks. We added multiple protections:

Input Validation

# Reject malformed GitHub owner/repo names
GITHUB_OWNER_PATTERN = re.compile(r"^[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?$")
GITHUB_REPO_PATTERN = re.compile(r"^[a-zA-Z0-9._-]{1,100}$")
# Reject symlinks to prevent LFI attacks
fd = os.open(str(path), os.O_RDONLY | os.O_NOFOLLOW)
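Putting those patterns to work on a CLI argument might look like this. The two regexes are from the snippet above; the wrapper function is a hypothetical sketch, not the real CLI's helper:

```python
import re

GITHUB_OWNER_PATTERN = re.compile(r"^[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?$")
GITHUB_REPO_PATTERN = re.compile(r"^[a-zA-Z0-9._-]{1,100}$")

def parse_repo_arg(value: str) -> tuple[str, str]:
    """Split and validate an 'owner/repo' CLI argument before it touches the registry."""
    owner, sep, repo = value.partition("/")
    if not sep or not GITHUB_OWNER_PATTERN.match(owner) or not GITHUB_REPO_PATTERN.match(repo):
        raise ValueError(f"malformed owner/repo: {value!r}")
    return owner, repo
```

Rejecting early means path-traversal-shaped input (for example `../../etc/passwd`) never reaches any file or URL construction code.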

YAML Hardening

yaml.allow_duplicate_keys = False  # Catch accidental overwrites

Atomic Writes

# Write to temp file, then atomic rename
fd, temp_path = tempfile.mkstemp(dir=path.parent)
# ... write content ...
os.replace(temp_path, path) # Atomic on POSIX
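Sketched end to end, that write path could look like the following (the error-cleanup branch is added here for illustration; the real script's handling may differ):

```python
import os
import tempfile
from pathlib import Path

def atomic_write(path: Path, content: str) -> None:
    """Write to a temp file in the same directory, then atomically rename over the target."""
    fd, temp_path = tempfile.mkstemp(dir=path.parent)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(content)
        # Atomic on POSIX: readers see the old file or the new one, never a half-written file.
        os.replace(temp_path, path)
    except BaseException:
        os.unlink(temp_path)  # don't leave temp files behind on failure
        raise
```

Creating the temp file in the same directory matters: `os.replace` is only atomic when source and destination are on the same filesystem.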

Makefile Integration

For automation and CI, everything is available via make:

make validate        # Level 1 + 2
make validate-deep   # Level 1 + 2 + 3
make templates       # List all
make templates-json  # JSON output
make help            # Show all targets

The build target runs validation first:

build: validate
	python scripts/aggregate_templates.py
	mkdocs build --strict

Pre-commit Hook

Validation runs automatically before commits:

# .pre-commit-config.yaml
- repo: local
  hooks:
    - id: template-manager-validate
      name: Validate templates.yaml (semantic)
      entry: python scripts/template_manager.py validate
      language: system
      files: ^templates\.yaml$

Now invalid templates can't even be committed locally.

What We Learned

  1. One CLI, many interfaces: Skills and Makefile are just wrappers
  2. Tiered validation saves time: Fast checks always, slow checks on demand
  3. LLMs need guardrails: Validation-first prevents hallucinated YAML
  4. Atomic operations matter: Temp file + rename prevents corruption

Implementation Stats

  • 4 phases over 2 days
  • 54 tests with full coverage
  • 1,053 lines of Python
  • 16 GitHub issues tracked and closed

What's Next

  • MCP Server: Reconsider at 20+ templates (current: 3)
  • Template Linting: Check for common misconfigurations
  • Auto-sync: Fetch metadata from Railway API


Building an OSS Foundation: ADR-001 Implementation

· 3 min read
Amiable Dev
Project Contributors

How we established community standards for amiable-templates using Architecture Decision Records (ADRs) and multi-model AI review.

!!! info "What's an ADR?"
    An Architecture Decision Record documents significant technical decisions with context, options considered, and rationale. It creates a searchable history of why things are the way they are.

The Problem

We're building amiable-templates to aggregate deployment templates for AI infrastructure into a single portal. Before writing any aggregation code, we needed to answer: How do we structure an OSS project that invites contribution?

Starting from scratch means making a lot of decisions:

  • What license?
  • How do contributors know what's expected?
  • How do we handle security reports?
  • What governance model fits a small project?

The Solution: Adopt Proven Patterns

Instead of reinventing the wheel, we borrowed from the existing OSS ADR-033, which had already been reviewed with the LLM Council and battle-tested on llm-council.dev.

The Files

| File | Purpose |
|---|---|
| LICENSE | MIT - maximum flexibility |
| CODE_OF_CONDUCT.md | Contributor Covenant v2.1 |
| CONTRIBUTING.md | How to contribute |
| SECURITY.md | 48hr target response time |
| GOVERNANCE.md | Decision-making process |
| SUPPORT.md | Where to get help |

GitHub Configuration

.github/
├── CODEOWNERS # Auto-assign reviewers
├── dependabot.yml # Keep deps updated
├── ISSUE_TEMPLATE/ # Structured bug reports
└── PULL_REQUEST_TEMPLATE.md

Example: CODEOWNERS

Here's how we route reviews to the right people:

# Default: maintainers review everything
* @amiable-dev/maintainers

# Critical config requires explicit maintainer approval
templates.yaml @amiable-dev/maintainers
mkdocs.yml @amiable-dev/maintainers

# CI/CD changes are sensitive
.github/ @amiable-dev/maintainers

# ADRs need architectural review
docs/adrs/ @amiable-dev/maintainers

This means any PR touching templates.yaml (our template registry) automatically requests review from maintainers. As the project grows, we can split ownership - e.g., docs/ @docs-team.

The Interesting Part: LLM Council Review

We used LLM Council to review our ADR before accepting it. LLM Council is an MCP server that queries multiple AI models in parallel, has them critique each other's responses, and synthesizes a consensus verdict.

Four models (GPT-5.2, Claude Opus 4.5, Gemini 3 Pro, Grok 4.1) reviewed our draft ADR:

What they caught:

| Finding | Our Response |
|---|---|
| Missing CI/CD workflows | Added deploy.yml and security.yml |
| GOVERNANCE.md premature for solo project | Simplified, will expand at 3+ maintainers |
| Need template intake policy | Added to CONTRIBUTING.md |

The full review is documented in ADR-001.

Tracking It All

We used GitHub Issues to track implementation:

  • Epic: #5 - Complete OSS Foundation
  • Sub-issues: Labels (#6), Branch Protection (#7), Blog (#8), etc.

This gives visibility into what's done and what's remaining.

What's Next

With the foundation in place, we're moving through the remaining ADRs:

  • ADR-002: MkDocs site architecture
  • ADR-003: Template configuration system
  • ADR-004: CI/CD & deployment
  • ADR-005: DevSecOps implementation
  • ADR-006: Cross-project documentation aggregation

Each follows the same process: draft, LLM Council review, implement, document.



Choosing MkDocs Material: ADR-002 Site Architecture

· 4 min read
Amiable Dev
Project Contributors

Why we chose MkDocs Material over Docusaurus or a custom solution, and how we structured the site.

The Problem

We needed a documentation site that could showcase templates in a scannable, attractive format. The site also needed to aggregate documentation from multiple template repositories, provide excellent search, support dark/light mode for accessibility, and be easy for contributors to work with.

Three options emerged: MkDocs Material, Docusaurus, or a custom Next.js/Astro site.

Why MkDocs Material?

| Criteria | MkDocs Material | Docusaurus | Custom |
|---|---|---|---|
| Stack alignment | Python (matches scripts) | React/Node | Varies |
| Setup time | Hours | Hours | Days/Weeks |
| Maintenance | Low | Medium | High |
| Search | Built-in (lunr.js) | Algolia needed | Build it |
| Dark mode | Built-in | Built-in | Build it |

Why not Docusaurus? It's a great framework (we use it for our own blog, amiable.dev), but it would introduce React/Node into a Python-focused project. Our aggregation scripts are Python, and a consistent stack reduces cognitive load.

Why not custom? A Next.js or Astro site would give us full control, but it's overkill for documentation. We'd spend weeks building what MkDocs Material gives us out of the box.

The deciding factor: consistency. Our llm-council docs already use MkDocs Material. Same tooling, same patterns, same contributor experience.

The Architecture

docs/
├── index.md # Hero + featured templates
├── quickstart.md # Prominent, top-level
├── templates/ # Template grid + aggregated docs
├── adrs/ # Architecture decisions
├── blog/ # You're reading it
└── stylesheets/
└── extra.css # Hero + grid styling

We use top-level tabs for main sections:

nav:
  - Home: index.md
  - Quick Start: quickstart.md
  - Templates: templates/index.md
  - ADRs: adrs/index.md
  - Contributing: contributing.md
  - Blog: blog/index.md

Quick Start gets its own tab because that's what most visitors want.

The Template Grid

We wanted a scannable grid of template cards without any JavaScript complexity. Here's how we built it using pure markdown with custom CSS:

<div class="template-grid" markdown="1">

<div class="template-card" markdown="1">

### LiteLLM + Langfuse Starter

Production-ready LLM proxy with observability.

**Features:**
- 100+ LLM providers via LiteLLM
- Request tracing with Langfuse
- Cost tracking and analytics

**Estimated Cost:** ~$29-68/month

[:octicons-rocket-16: Deploy](https://railway.app/template/...)
[:octicons-mark-github-16: Source](https://github.com/amiable-dev/litellm-langfuse-railway)

</div>

<div class="template-card" markdown="1">

### Another Template

Description here...

</div>

</div>

The markdown attribute is key—it tells MkDocs Material to process the markdown inside the HTML divs.

The CSS does the heavy lifting:

.template-grid {
  display: grid;
  grid-template-columns: repeat(auto-fit, minmax(320px, 1fr));
  gap: 1.5rem;
}

.template-card {
  border: 1px solid var(--md-default-fg-color--lightest);
  border-radius: 0.5rem;
  padding: 1.5rem;
}

.template-card:hover {
  box-shadow: 0 4px 12px rgba(0, 0, 0, 0.1);
  transform: translateY(-2px);
}

No framework needed. Pure CSS grid with markdown content.

Dark Mode

MkDocs Material handles dark mode elegantly with palette configuration. Users get automatic detection based on their system preference, plus a toggle to override:

theme:
  name: material
  palette:
    - media: "(prefers-color-scheme: light)"
      scheme: default
      primary: deep purple
      accent: deep purple
      toggle:
        icon: material/brightness-7
        name: Switch to dark mode
    - media: "(prefers-color-scheme: dark)"
      scheme: slate
      primary: deep purple
      accent: deep purple
      toggle:
        icon: material/brightness-4
        name: Switch to light mode

The toggle icon appears in the header. No JavaScript to write, no state to manage—it just works.

What We Learned

  1. Start with constraints: "No JavaScript" forced simpler, more maintainable solutions
  2. Reuse organizational patterns: Same theme as llm-council = less cognitive load
  3. Put Quick Start first: Most visitors want to deploy, not read architecture docs

What's Next

  • ADR-003: Template configuration system (templates.yaml)
  • ADR-006: Cross-project documentation aggregation


Designing the Template Registry: ADR-003

· 4 min read
Amiable Dev
Project Contributors

How we built a declarative configuration system for Railway templates with JSON Schema validation.

The Problem

We needed a way to register templates without touching code. Every new template shouldn't require modifying Python scripts or HTML—just add an entry to a configuration file and the site rebuilds.

The configuration needed to:

  1. Define which templates appear on the site
  2. Specify where to find documentation in each repo
  3. Provide metadata for the template grid (features, cost, tags)
  4. Catch errors before they reach production

Why YAML?

We considered three formats:

| Format | Pros | Cons |
|---|---|---|
| YAML | Human-readable, comments allowed, matches mkdocs.yml | Syntax can be tricky |
| TOML | Matches Railway's railway.toml | Less common in Python ecosystem |
| JSON | Strict, universal | No comments, verbose |

YAML won because:

  1. Consistency: Our mkdocs.yml is already YAML
  2. Comments: We can document inline why certain fields exist
  3. Readability: Non-engineers can understand and edit it

The Schema

Here's what a template entry looks like:

templates:
  - id: litellm-langfuse-starter
    repo:
      owner: "amiable-dev"
      name: "litellm-langfuse-railway"
    title: "LiteLLM + Langfuse Starter"
    description: "Production-ready LLM gateway with observability"
    category: observability
    tags:
      - litellm
      - langfuse
    directories:
      docs:
        - path: "starter/README.md"
          target: "overview.md"
    links:
      railway_template: "https://railway.app/template/..."
      github: "https://github.com/amiable-dev/litellm-langfuse-railway"
    features:
      - "100+ LLM Providers"
      - "Cost Tracking"
    estimated_cost:
      min: 29
      max: 68
      currency: "USD"
      period: "month"

Required fields: id, repo, title, description, category, directories.docs

Everything else is optional.

Validation with JSON Schema

YAML is flexible, which means it's easy to make mistakes. We use JSON Schema to catch errors early:

# templates.schema.yaml
properties:
  templates:
    items:
      type: object
      additionalProperties: false  # Catch typos!
      required:
        - id
        - repo
        - title
        - description
        - category
        - directories
      properties:
        id:
          type: string
          pattern: "^[a-z][a-z0-9-]*$"
        # ...

The additionalProperties: false is important. Here's what happens with a typo:

# Bad config - spot the typo
templates:
  - id: my-template
    repo:
      owner: "amiable-dev"
      name: "my-repo"
    title: "My Template"
    description: "A template"
    category: observability
    directories:
      docs:
        - path: "README.md"
          target: "overview.md"
    featurs: # Typo!
      - "Feature 1"

With additionalProperties: false, the schema rejects this:

$.templates[0]: Additional properties are not allowed ('featurs' was unexpected)

Without it, the typo would silently pass—and the aggregation script would just skip the field.
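That strictness is easy to reproduce in plain Python, which makes clear what additionalProperties: false is actually doing. The field list below comes from the schema in this post; the helper function itself is a hypothetical sketch, not part of the project's tooling:

```python
# Top-level fields the schema knows about (required + optional, per this post).
ALLOWED_FIELDS = {
    "id", "repo", "title", "description", "category", "directories",
    "tags", "links", "features", "estimated_cost",
}

def unknown_fields(template: dict) -> set[str]:
    """Mimic additionalProperties: false -- report keys the schema doesn't recognize."""
    return set(template) - ALLOWED_FIELDS
```

Any non-empty result is a typo or an undeclared field, surfaced at validation time instead of silently dropped at build time.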

CI Integration

Validation runs on every PR:

# .github/workflows/validate.yml
- name: Validate templates.yaml schema
  run: check-jsonschema --schemafile templates.schema.yaml templates.yaml

- name: Validate unique template IDs
  run: |
    python -c "
    import yaml
    with open('templates.yaml') as f:
        config = yaml.safe_load(f)
    ids = [t['id'] for t in config.get('templates', [])]
    duplicates = [id for id in ids if ids.count(id) > 1]
    if duplicates:
        exit(1)
    "

Error messages are actionable:

$.templates[0]: 'repo' is a required property
$.templates[0].id: 'INVALID_ID' does not match '^[a-z][a-z0-9-]*$'

The Aggregation Flow

  1. CI validates templates.yaml against the JSON Schema
  2. Python aggregator reads the config (YAML → Python dict)
  3. For each template, fetches docs from directories.docs paths
  4. Transforms content (rewrites links, adds attribution)
  5. Writes to docs/templates/{id}/
  6. MkDocs builds the static site

The script uses .get() for optional fields, so missing features or estimated_cost don't break the build.
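In practice that convention looks something like this (a sketch of the pattern; the real aggregator's rendering code may differ):

```python
def render_cost(template: dict) -> str:
    """Optional fields degrade gracefully instead of raising KeyError."""
    cost = template.get("estimated_cost")
    if cost is None:
        return ""  # field absent: omit the cost line rather than break the build
    return f"~${cost['min']}-{cost['max']}/{cost['period']}"
```

The schema guarantees that when estimated_cost is present it has the right shape, so the `.get()` check only needs to handle absence, not malformed data.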

What We Learned

  1. Strict schemas catch bugs early: additionalProperties: false is your friend
  2. Validate on PRs, not just deploys: Developers should see errors before merge
  3. JSON Schema can't do everything: We added a Python check for unique IDs

What's Next

  • ADR-006: Cross-project documentation aggregation (the script that reads this config)


CI/CD for a Docs Site: ADR-004

· 4 min read
Amiable Dev
Project Contributors

How we built a deployment pipeline that stays fresh without manual intervention.

The Problem

We needed a CI/CD pipeline that could:

  1. Deploy on merge to main
  2. Aggregate docs from upstream repos daily
  3. Allow manual rebuilds with cache bypass
  4. Run security scanning without slowing deploys

Why GitHub Pages?

We considered three options:

| Platform | Cost | PR Previews | HTTPS | Vendor Count |
|---|---|---|---|---|
| GitHub Pages | Free | No | Auto (*.github.io) | 1 |
| Netlify/Vercel | Free tier | Yes | Auto | 2 |
| Railway | ~$5/mo | Yes | Auto | 2 |

Cost wasn't the deciding factor—all have generous free tiers. What mattered:

  1. Vendor consolidation - secrets, permissions, and logs in one place
  2. No external OAuth - fewer security surface areas
  3. Workflow simplicity - deploy-pages action just works

The trade-off: No PR preview deployments. We accepted this because our site is documentation—reviewing markdown diffs is sufficient. For a React app with visual changes, we'd choose differently.

Note: Custom domains need DNS configuration and propagation time. The *.github.io subdomain gets HTTPS immediately.

The Pipeline

Key insight: security.yml runs in parallel with deploy.yml. A linting failure doesn't block deployment—but it does show up as a failed check on the commit.

Three triggers, one pipeline:

on:
  push:
    branches: [main]
  schedule:
    - cron: '0 6 * * *'  # Daily at 6 AM UTC
  workflow_dispatch:
    inputs:
      force_refresh:
        type: boolean
        default: false

Caching Strategy

Template aggregation fetches docs from GitHub repos. Without caching, every build would re-fetch everything.

Our approach:

  1. Cache key includes hashFiles('templates.yaml') - config changes invalidate
  2. Restore keys allow partial cache hits
  3. Manifest tracking in aggregation script compares commit SHAs

- name: Restore template cache
  if: ${{ github.event.inputs.force_refresh != 'true' }}
  uses: actions/cache@v5
  with:
    path: .cache/templates
    key: templates-${{ hashFiles('templates.yaml') }}-${{ github.run_id }}
    restore-keys: |
      templates-${{ hashFiles('templates.yaml') }}-
      templates-

The force refresh option clears the cache entirely:

- name: Clear cache (if force refresh)
  if: ${{ github.event.inputs.force_refresh == 'true' }}
  run: rm -rf .cache/templates

Security Scanning

Separate workflow, parallel execution:

# security.yml
jobs:
  gitleaks:
    # Secret scanning on every push

  dependency-review:
    # License and vulnerability check on PRs

  yaml-lint:
    # Configuration validation

This keeps security checks from blocking deploys while still catching issues.

The yamllint War Story

Our first security run failed spectacularly:

##[error]mkdocs.yml:88:5 [indentation] wrong indentation: expected 6 but found 4
##[error]templates.yaml:45:121 [line-length] line too long (156 > 120 characters)
##[warning].github/workflows/deploy.yml:3:1 [truthy] truthy value should be one of [false, true]

The investigation revealed three conflicts:

  1. on: is not a boolean - GitHub Actions uses on: as a keyword, but yamllint sees it as a truthy value
  2. MkDocs doesn't require --- - yamllint's document-start rule expects it
  3. Description fields are long - template descriptions exceed 120 characters

The fix: .yamllint.yml configuration that respects ecosystem conventions:

rules:
  # GitHub Actions uses `on:` as a keyword
  truthy:
    allowed-values: ['true', 'false', 'on']

  # MkDocs files don't need document start
  document-start: disable

  # Allow longer lines for descriptions
  line-length:
    max: 200

Lesson: Linting tools need per-ecosystem configuration. Default rules assume vanilla YAML.

Build Times

| Scenario | Time |
|---|---|
| Cold build (no cache) | ~45s |
| Warm build (cached) | ~20s |
| Force refresh | ~45s |

Most deploys hit the cache. Daily scheduled builds may be slower if upstream repos changed.

What We Learned

  1. Separate security from deploy - don't let linting failures block urgent content fixes
  2. Cache aggressively, invalidate precisely - manifest-based tracking beats time-based expiry
  3. Make force refresh easy - when caching goes wrong, you need an escape hatch

What's Next

  • ADR-005: DevSecOps implementation (the security.yml details)


DevSecOps for a Docs Site (ADR-005)

· 4 min read
Amiable Dev
Project Contributors

We added security scanning to a documentation site. Most DevSecOps guides assume you have application code. We don't.

The Problem

Documentation repositories have different security concerns than application code:

  • No server-side runtime - no SQL injection or RCE vectors (though DOM-based XSS remains possible)
  • No application secrets - but build-time secrets (GitHub tokens, API keys) can still leak
  • Community contributions - forks need to pass CI without repository secrets

Most DevSecOps tooling is overkill here. SAST (static code analysis) and DAST (runtime probing) assume you have application code. Container scanning assumes you have containers. We needed a minimal, fork-friendly approach.

The 3-Layer Pipeline

Layer 1 catches issues before they're committed. Layer 2 validates PRs from forks (no secrets required). Layer 3 runs post-merge for ongoing protection.

Fork-Friendly Design

This was the key constraint. GitHub intentionally isolates repository secrets from fork PRs to prevent malicious PRs from exfiltrating credentials.

The failure mode we avoided: If your security workflow requires SONAR_TOKEN or similar, every community contribution triggers a CI failure. Contributors wait for maintainers to manually approve, friction accumulates, contributions slow down.

Our security workflow uses only:

env:
  GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

GITHUB_TOKEN is automatically provided to all workflows, including forks. No API keys, no OAuth tokens, no external services.

What this enables:

  • Contributors don't need to configure anything
  • All security checks pass on fork PRs
  • No "skip CI" friction for external contributions
  • Avoids the pull_request_target security footgun

The Gitleaks Gotcha

Our first implementation had a dangerous allowlist:

.gitleaks.toml (DANGEROUS)

# DON'T DO THIS - excludes all markdown from scanning
[allowlist]
paths = [
    '''\.md$''',
]

This excludes all markdown files from secret scanning. For a documentation repository, that's most of the codebase.

Why this matters: Documentation often contains tutorial code blocks. Engineers copy-paste examples and accidentally include real API keys. Markdown files are where secrets leak in docs repos.

The fix: allowlist specific patterns, not entire file types:

.gitleaks.toml (SAFE)
# DO THIS - only ignore explicit example patterns
[[rules]]
id = "example-api-key"
regex = '''sk-example-[a-zA-Z0-9]+'''
allowlist = { regexes = ['''sk-example-'''] }

[[rules]]
id = "placeholder-key"
regex = '''YOUR_API_KEY|your-api-key'''
allowlist = { regexes = ['''YOUR_API_KEY|your-api-key'''] }

Real secrets in markdown files will still be caught. Only explicit example patterns (sk-example-*, YOUR_API_KEY) are ignored.
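Conceptually, pattern-level allowlisting filters findings rather than skipping files. A toy version of that logic in Python (the detector and allowlist regexes here are deliberately simplified illustrations, not gitleaks' actual rules):

```python
import re

# A deliberately simple "secret" detector: sk- followed by 20+ key-like characters.
SECRET_PATTERN = re.compile(r"sk-[A-Za-z0-9-]{20,}")

# Allowlist specific example patterns, not whole file types.
ALLOWED_EXAMPLES = [re.compile(r"sk-example-")]

def scan(text: str) -> list[str]:
    """Return suspected secrets, minus explicit example patterns."""
    return [
        m.group(0)
        for m in SECRET_PATTERN.finditer(text)
        if not any(allow.search(m.group(0)) for allow in ALLOWED_EXAMPLES)
    ]
```

The key property: every match is still scanned, and only matches that positively identify themselves as examples are dropped. A real key pasted into a tutorial still surfaces.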

Tools We Didn't Use

| Tool | Why Excluded |
|---|---|
| CodeQL | No codebase to analyze |
| Snyk | Dependabot sufficient at this scale |
| Trivy | No containers |
| SonarCloud | Overkill for docs |
| Semgrep | No application code |

The right amount of security tooling is the minimum that covers your actual risks.

War Story: The YAML 1.1 Truthy (aka "The Norway Problem")

Our security workflow failed immediately:

3:1  error  truthy value should be one of [false, true]  (truthy)

GitHub Actions uses on: as a keyword. But YAML 1.1 treats on, off, yes, and no as booleans. This is sometimes called "The Norway Problem" because country code NO gets parsed as false.

Fix in .yamllint.yml:

.yamllint.yml

rules:
  truthy:
    allowed-values: ['true', 'false', 'on']
    check-keys: false

The Minimal Stack

Total configuration: 3 files, ~50 lines of YAML.

Full ADR

See ADR-005: DevSecOps Implementation for the complete Architecture Decision Record.

Build-Time Documentation Aggregation (ADR-006)

· 5 min read
Amiable Dev
Project Contributors

We built a system to fetch documentation from multiple GitHub repositories at build time. The trick: SHA-based caching that makes incremental builds near-instant.

The Problem

We have templates across multiple repositories:

  • litellm-langfuse-railway (starter + production configs)
  • llm-council (multi-model consensus system)

Each has its own documentation. Users shouldn't have to visit multiple repos to understand their options.

Goal: Unified documentation portal with content from all template repos.

Why Build-Time Aggregation?

We considered three approaches:

| Approach | Pros | Cons |
|---|---|---|
| Manual copy | Simple | Stale immediately |
| Git submodules | Real-time | Complex, version conflicts |
| Build-time fetch | Fresh daily, cacheable | Requires API access |

Build-time aggregation wins: content is fresh (daily rebuilds), caching makes it fast, and errors don't break the site.

The Caching Strategy

The naive approach: fetch everything on every build. With 3 templates and multiple docs each, that's slow and hits API rate limits.

Our approach: SHA-based cache invalidation with a lightweight API check.

How It Works

The key insight: We don't download content to check if it changed. One lightweight API call (GET /repos/{owner}/{repo}/commits/HEAD) returns the current SHA. Compare against the manifest. Done.

async def get_commit_sha(self, owner: str, repo: str) -> str | None:
    """Get the SHA of the default branch HEAD (1 API call, no content)."""
    url = f"{GITHUB_API_BASE}/repos/{owner}/{repo}/commits/HEAD"
    async with self._session.get(url) as resp:
        if resp.status == 200:
            data = await resp.json()
            return data["sha"]  # Just the SHA, not the content

Cache Granularity: Repo-Level

We cache at the repo level, not file level. One new commit invalidates all docs from that repo. This is simpler than tracking individual file changes, and repos don't change that often.

The manifest tracks:

{
  "litellm-langfuse-starter": {
    "commit_sha": "5a45454c15e0e5e17ff20a3f0d6df421c1f037db",
    "fetched_at": "2026-01-03T18:43:43Z",
    "files": ["overview.md", "setup.md"]
  }
}

Result: If the repo hasn't changed, skip the fetch entirely.

2026-01-03 18:44:00 [INFO]   Using cached content (SHA: 5a45454)
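The decision itself reduces to a single comparison against the manifest (a sketch of the idea; the real aggregator's manifest handling may differ):

```python
def needs_refetch(manifest: dict, template_id: str, current_sha: str) -> bool:
    """True when the repo's HEAD SHA differs from the one we last fetched."""
    cached = manifest.get(template_id, {}).get("commit_sha")
    return cached != current_sha
```

A template missing from the manifest has no cached SHA, so it is always fetched on first sight; everything else is fetched only when upstream actually moved.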

Content Transformation

Raw content from upstream repos has relative links that break when moved. The ContentTransformer class handles this:

def _rewrite_links(self, content: str) -> str:
    """Rewrite relative markdown links to GitHub blob URLs."""
    # [Setup Guide](setup.md)
    # → [Setup Guide](https://github.com/owner/repo/blob/sha/path/setup.md)
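A self-contained sketch of that rewrite (the regex and function shape are illustrative; the real ContentTransformer may handle more edge cases):

```python
import re

# Match markdown links, but not images (leading !) or pure anchors (#...).
MD_LINK = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)#\s][^)\s]*)\)")

def rewrite_links(content: str, owner: str, repo: str, sha: str) -> str:
    """Point relative markdown links at the upstream repo at the fetched commit."""
    def repl(m: re.Match) -> str:
        text, target = m.group(1), m.group(2)
        if target.startswith(("http://", "https://")):
            return m.group(0)  # absolute links are left alone
        return f"[{text}](https://github.com/{owner}/{repo}/blob/{sha}/{target})"
    return MD_LINK.sub(repl, content)
```

Pinning links to the fetched SHA (rather than a branch name) means the aggregated page and its outbound links always describe the same snapshot of the upstream repo.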

Image Rewriting

def _rewrite_images(self, content: str) -> str:
    """Rewrite relative image paths to raw.githubusercontent.com URLs."""
    # ![diagram](../assets/arch.png)
    # → ![diagram](https://raw.githubusercontent.com/owner/repo/sha/assets/arch.png)

Source Attribution

Every aggregated doc gets an info box:

!!! info "Source Repository"
    This documentation is from [amiable-dev/litellm-langfuse-railway](...).
    Last synced: 2026-01-03 | Commit: `5a45454`

Users always know where the content came from.
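Generating that box is a small string template. A sketch of what the helper could look like (the function name and signature are illustrative, not the actual aggregator API):

```python
def attribution_box(owner: str, repo: str, synced: str, sha: str) -> str:
    """Build the MkDocs admonition prepended to every aggregated page."""
    url = f"https://github.com/{owner}/{repo}"
    return (
        f'!!! info "Source Repository"\n'
        f"    This documentation is from [{owner}/{repo}]({url}).\n"
        f"    Last synced: {synced} | Commit: `{sha[:7]}`\n"
    )
```

Because the SHA and sync date come straight from the fetch step, the box can't drift out of sync with the content it describes.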

Error Handling Philosophy

Never fail the build due to upstream issues. But be loud about failures.

Hard vs. Soft Errors

| Error Type | Behavior |
|---|---|
| Config errors (invalid YAML) | Fail fast |
| Network errors | Use cached content, log warning |
| Repo not found | Skip, log warning |
| File not found | Skip file, continue |
| Rate limit | Use cached content |

results = await asyncio.gather(
    *[aggregate_template(t, fetcher, cache, output_dir) for t in templates],
    return_exceptions=True,  # Collect errors, don't fail
)

for result in results:
    if isinstance(result, Exception):
        logger.error(f"Aggregation error: {result}")  # Be loud

Stale Content Risk

The danger: a repo fails to update for weeks, and users see stale docs thinking they're current.

Mitigation: The source attribution box includes sync date and commit SHA. Users can verify freshness:

!!! info "Source Repository"
    Last synced: 2026-01-03 | Commit: `5a45454`

If the sync date is old, something's wrong. CI logs show fetch failures for investigation.

GitHub API Considerations

Rate Limit Math

| Auth Method | Limit | Our Usage |
|---|---|---|
| Unauthenticated | 60/hour | Not viable |
| GITHUB_TOKEN | 5,000/hour | What we use |
| GitHub App | 5,000+/hour | Overkill for docs |

Our request pattern per build:

  • 3 repos × 1 SHA check = 3 API requests
  • Content fetched via raw.githubusercontent.com (no rate limit)
  • Cached builds: 0 content fetches

Even with 50 repos, we'd use 50 requests per build. The 5,000/hour limit is plenty.

Fetch Optimization

# SHA check: Uses API (rate limited, but just 1 request per repo)
url = f"{GITHUB_API_BASE}/repos/{owner}/{repo}/commits/HEAD"

# Content fetch: Uses raw.githubusercontent.com (no rate limit!)
url = f"{GITHUB_RAW_BASE}/{owner}/{repo}/{sha}/{path}"

This split is intentional: the API for metadata, raw URLs for content.

CI Integration

.github/workflows/deploy.yml
- name: Restore template cache
  uses: actions/cache@v5
  with:
    path: .cache/templates
    key: templates-${{ hashFiles('templates.yaml') }}-${{ github.run_id }}
    restore-keys: |
      templates-${{ hashFiles('templates.yaml') }}-
      templates-

- name: Aggregate template documentation
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  run: python scripts/aggregate_templates.py

The cache key strategy:

  1. Exact match: same config, same run → use cache
  2. Partial match: same config, different run → restore, then update
  3. No match: fresh fetch

The Tradeoff We Accepted

Delayed updates: Changes to upstream repos aren't instant. They appear on the next daily build (or manual dispatch).

For documentation, this is acceptable. If you need real-time sync, consider webhooks or git submodules—but accept the complexity.

Full ADR

See ADR-006: Cross-Project Documentation Aggregation for the complete decision record.