
Architecture Decisions: Building on Upptime's Foundation

Chris (Amiable Dev) · Claude (AI Assistant) · 7 min read

This post covers the technical decisions behind Stentorosaur—why we chose certain approaches, what we rejected, and the trade-offs you should understand before adopting it.

Foundation: Upptime's Architecture

Stentorosaur builds on Upptime by Anand Chowdhary. Upptime established three patterns we adopted:

  1. GitHub Issues as the incident database
  2. GitHub Actions for scheduled health checks
  3. Static site generation from committed JSON data

We ported these concepts to work as a Docusaurus plugin rather than a standalone site.

System Overview

Decision 1: GitHub Issues as Incident Storage

What we chose: Store incidents as GitHub Issues with structured labels.

What we rejected:

  • SQLite in the repo (requires build-time queries, complex migrations)
  • External database (adds infrastructure, defeats the purpose)
  • JSON files only (loses collaboration features—comments, assignments, reactions)

Why Issues work:

  • Engineers already know how to create/close/comment on issues
  • Built-in permissions via repo access controls
  • Full audit trail with timestamps
  • Markdown support for incident details
  • Labels enable filtering (critical, system:api, process:checkout)

The label schema:

| Label | Purpose |
| --- | --- |
| `status` | Marks issue as status-trackable |
| `critical` / `major` / `minor` | Severity level |
| `system:{name}` | Links to a system entity |
| `process:{name}` | Links to a process entity |
| `automated` | Created by monitoring workflow |
| `maintenance` | Scheduled maintenance window |
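To make the schema concrete, here is a hypothetical helper that derives incident metadata from an issue's labels. The `IncidentMeta` shape and `parseLabels` name are illustrative, not part of Stentorosaur's actual API:

```typescript
// Hypothetical sketch: derive incident metadata from GitHub Issue labels
// following the label schema above. Types and names are illustrative only.
type Severity = 'critical' | 'major' | 'minor';

interface IncidentMeta {
  severity: Severity | null;
  systems: string[];
  processes: string[];
  automated: boolean;
  maintenance: boolean;
}

function parseLabels(labels: string[]): IncidentMeta {
  const meta: IncidentMeta = {
    severity: null,
    systems: [],
    processes: [],
    automated: labels.includes('automated'),
    maintenance: labels.includes('maintenance'),
  };
  for (const label of labels) {
    if (label === 'critical' || label === 'major' || label === 'minor') {
      meta.severity = label;
    } else if (label.startsWith('system:')) {
      meta.systems.push(label.slice('system:'.length));
    } else if (label.startsWith('process:')) {
      meta.processes.push(label.slice('process:'.length));
    }
  }
  return meta;
}
```

For example, `parseLabels(['status', 'critical', 'system:api'])` yields severity `critical` linked to the `api` system.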

Trade-off: API Rate Limits

GitHub's API allows 5,000 requests/hour with authentication. To avoid hitting limits:

  • Monitoring workflow commits data to JSON files
  • Plugin reads from committed files at build time, not from the API
  • Issues are only fetched when the workflow runs (not on every page load)

Decision 2: GitHub Actions for Monitoring

What we chose: Cron-scheduled GitHub Actions workflow.

What we rejected:

  • External monitoring service (adds cost, external dependency)
  • Client-side health checks (CORS issues, unreliable)
  • Webhook-based triggers (requires always-on infrastructure)

The monitoring workflow:

```yaml
name: Monitor Systems
on:
  schedule:
    - cron: '*/5 * * * *' # Minimum: 5 minutes
  workflow_dispatch: # Manual trigger

concurrency:
  group: monitor
  cancel-in-progress: false # Don't cancel running checks

permissions:
  contents: write
  issues: write

jobs:
  monitor:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run health checks
        run: |
          npx -y stentorosaur-monitor \
            --config .monitorrc.json \
            --output-dir ./status-data

      - name: Commit results
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add status-data/
          git diff --staged --quiet || git commit -m "Update status [skip ci]"
          git push
```

How stentorosaur-monitor works:

  1. Reads .monitorrc.json for endpoint definitions
  2. Makes HTTP requests with configurable timeout (default: 10s)
  3. Checks response code against expectedCodes array
  4. Measures response time
  5. Appends result to current.json
  6. If status changed: creates/closes GitHub Issue via API
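Steps 2-5 can be sketched as follows. This is an illustrative reconstruction, not stentorosaur-monitor's actual source; it assumes Node 18+ (global `fetch`, `AbortSignal.timeout`) and splits the pass/fail decision into a pure `evaluate` function:

```typescript
// Illustrative sketch of the health-check loop (steps 2-5 above).
// Assumes Node 18+; the real stentorosaur-monitor may differ.
interface CheckConfig {
  system: string;
  url: string;
  method?: string;
  timeout?: number;         // ms, default 10s
  expectedCodes?: number[]; // default [200]
  maxResponseTime?: number; // ms, optional
}

interface CheckResult {
  t: number;                // Unix timestamp (ms)
  svc: string;
  state: 'up' | 'down';
  code: number | null;
  lat: number;              // latency (ms)
}

// Pure decision logic: response code must be expected, latency within bounds.
function evaluate(cfg: CheckConfig, code: number, lat: number): 'up' | 'down' {
  const codeOk = (cfg.expectedCodes ?? [200]).includes(code);
  const latOk = cfg.maxResponseTime === undefined || lat <= cfg.maxResponseTime;
  return codeOk && latOk ? 'up' : 'down';
}

async function runCheck(cfg: CheckConfig): Promise<CheckResult> {
  const start = Date.now();
  try {
    const res = await fetch(cfg.url, {
      method: cfg.method ?? 'GET',
      signal: AbortSignal.timeout(cfg.timeout ?? 10_000),
    });
    const lat = Date.now() - start;
    return { t: start, svc: cfg.system, state: evaluate(cfg, res.status, lat), code: res.status, lat };
  } catch {
    // Timeout or network error counts as down
    return { t: start, svc: cfg.system, state: 'down', code: null, lat: Date.now() - start };
  }
}
```

Keeping `evaluate` pure makes the up/down rules unit-testable without a live endpoint.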

Monitor configuration (.monitorrc.json):

```json
{
  "systems": [
    {
      "system": "api",
      "url": "https://api.example.com/health",
      "method": "GET",
      "timeout": 10000,
      "expectedCodes": [200],
      "maxResponseTime": 5000,
      "headers": {
        "Authorization": "Bearer ${API_KEY}"
      }
    }
  ]
}
```
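The `${API_KEY}` placeholder presumably expands from the environment (e.g., a repository secret exported to the workflow). A minimal sketch of such interpolation, assuming `${VAR}` maps to environment variables:

```typescript
// Minimal sketch of ${VAR} expansion in header values, assuming placeholders
// refer to environment variables (e.g. secrets exported in the workflow).
function interpolate(value: string, env: Record<string, string | undefined>): string {
  return value.replace(/\$\{(\w+)\}/g, (_, name: string) => {
    const v = env[name];
    if (v === undefined) throw new Error(`Missing environment variable: ${name}`);
    return v;
  });
}
```

Usage: `interpolate('Bearer ${API_KEY}', process.env)`. Failing loudly on a missing variable avoids silently sending a literal `${API_KEY}` header.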

Trade-offs:

| Constraint | Impact | Mitigation |
| --- | --- | --- |
| 5-minute minimum cron | Can't detect sub-5-minute outages | Acceptable for status pages (not alerting) |
| Actions minutes (private repos) | ~8,600 min/month at 5-min intervals | Use hourly checks, or public repos (unlimited) |
| GitHub Actions outages | Monitoring stops | Accept this; GA has 99.9%+ uptime |

Decision 3: Data Storage in Git

What we chose: Commit monitoring data as JSON/JSONL files to the repository.

What we rejected:

  • GitHub Gists (no history, harder to query)
  • External storage (S3, etc.) — adds infrastructure
  • In-memory only — no persistence

The file structure:

```text
status-data/
├── current.json       # Rolling 14-day window
├── incidents.json     # Denormalized incident data
├── maintenance.json   # Scheduled maintenance windows
└── archives/
    └── 2025/11/
        ├── 2025-11-01.jsonl.gz  # Compressed historical data
        └── 2025-11-19.jsonl     # Today (uncompressed)
```
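A hypothetical helper mirroring this layout, mapping a date to its archive path (the `archives/YYYY/MM/YYYY-MM-DD.jsonl` convention is read off the tree above; the function name is ours):

```typescript
// Hypothetical helper mirroring the archive layout above:
// archives/YYYY/MM/YYYY-MM-DD.jsonl (UTC dates assumed).
function archivePath(date: Date): string {
  const yyyy = date.getUTCFullYear().toString();
  const mm = String(date.getUTCMonth() + 1).padStart(2, '0');
  const dd = String(date.getUTCDate()).padStart(2, '0');
  return `archives/${yyyy}/${mm}/${yyyy}-${mm}-${dd}.jsonl`;
}
```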

Schema: current.json entry

```json
{
  "t": 1700000000000,
  "svc": "api",
  "state": "up",
  "code": 200,
  "lat": 145
}
```

| Field | Type | Description |
| --- | --- | --- |
| `t` | number | Unix timestamp (ms) |
| `svc` | string | System/service name |
| `state` | string | `up` or `down` |
| `code` | number | HTTP response code |
| `lat` | number | Response latency (ms) |
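With this schema, per-system uptime is a straightforward aggregation. A sketch (our own function, not Stentorosaur's; it treats each check equally, whereas a real calculation might weight by interval duration):

```typescript
// Entry shape mirrors the current.json schema above.
interface Entry {
  t: number;
  svc: string;
  state: 'up' | 'down';
  code: number;
  lat: number;
}

// Sketch: percentage of 'up' checks for one system over a window of entries.
// Assumes evenly spaced checks; no entries is treated as 100% (no known downtime).
function uptimePercent(entries: Entry[], svc: string): number {
  const mine = entries.filter((e) => e.svc === svc);
  if (mine.length === 0) return 100;
  const up = mine.filter((e) => e.state === 'up').length;
  return (up / mine.length) * 100;
}
```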

Trade-off: Git as a Database

Using Git for data storage is unconventional. Here's the honest assessment:

Pros:

  • Zero infrastructure (no database to manage)
  • Full history via Git commits
  • Works offline (data is in the repo)
  • Free (included with GitHub)

Cons:

  • Repo size growth: Each commit adds to history. We mitigate this with:
    • JSONL format (append-only, single line per entry)
    • Gzip compression for archives (text-based, Git can delta-compress)
    • Rolling window in current.json (14 days, not forever)
  • Merge conflicts: Possible if two workflow runs overlap. We use concurrency groups to prevent this.
  • Not queryable: Can't run SQL. The plugin loads JSON into memory at build time.

Measured impact: A site checking 3 endpoints every 5 minutes generates ~50KB/month of JSONL data. After compression, archives grow at ~5KB/month. After one year, expect on the order of 100KB of status history (mostly compressed).
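The rolling-window mitigation mentioned above amounts to a filter on entry timestamps before rewriting current.json. A minimal sketch (`pruneWindow` is our name for it):

```typescript
const WINDOW_MS = 14 * 24 * 60 * 60 * 1000; // 14 days in milliseconds

// Sketch of the rolling-window mitigation: keep only entries whose timestamp
// falls within the last 14 days relative to `now`, dropping older ones
// (which live on in the compressed archives).
function pruneWindow<T extends { t: number }>(entries: T[], now: number): T[] {
  return entries.filter((e) => now - e.t <= WINDOW_MS);
}
```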

Decision 4: Build-Time vs. Runtime Data

What we chose: Static Site Generation (SSG) with optional runtime fetching (v0.16+).

What we rejected:

  • Client-side fetching only (CSR) — CORS issues, stale data visible during navigation
  • Server-side rendering (SSR) — requires Node.js server, defeats static hosting

How it works:

  1. Plugin registers a loadContent hook in Docusaurus
  2. At build time, plugin reads status-data/*.json
  3. Data is passed to React components as props
  4. /status page is generated as static HTML with embedded JSON
  5. Optional (v0.16+): Runtime fetching via dataSource configuration
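The build-time flow maps onto Docusaurus's plugin lifecycle (`loadContent` to produce data, `contentLoaded` to expose it). A minimal sketch of steps 1-3 — not Stentorosaur's actual plugin, which also registers routes and themed components; the `dataDir` option and plugin name are ours, and the real lifecycle hooks may be async:

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Minimal sketch of the Docusaurus plugin lifecycle described above.
// `dataDir` is an assumed option; the real plugin does considerably more.
export default function statusPlugin(
  _context: unknown,
  options: { dataDir: string },
) {
  return {
    name: 'stentorosaur-sketch',
    // Step 2: read committed JSON at build time (no API calls)
    loadContent() {
      const read = (f: string) =>
        JSON.parse(fs.readFileSync(path.join(options.dataDir, f), 'utf8'));
      return { current: read('current.json'), incidents: read('incidents.json') };
    },
    // Step 3: expose the loaded data to React components
    async contentLoaded({ content, actions }: { content: unknown; actions: any }) {
      actions.setGlobalData(content);
    },
  };
}
```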

New in v0.16: Runtime Data Fetching

The dataSource option enables client-side data updates without rebuilding:

```js
dataSource: {
  strategy: 'github', // or 'http', 'static', 'build-only'
  owner: 'your-org',
  repo: 'your-repo',
}
```

| Strategy | Use Case | Runtime Fetch |
| --- | --- | --- |
| `github` | Public repos via raw.githubusercontent.com | Yes |
| `http` | Custom APIs, proxies for private repos | Yes |
| `static` | Local/bundled JSON files | Yes |
| `build-only` | No runtime fetch (original behavior) | No |
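For the `github` strategy, the client presumably constructs raw-content URLs from the configured owner and repo. A sketch of that URL construction — the `status-data/` path and `main` default branch are assumptions here, not documented behavior:

```typescript
// Assumed sketch of how the 'github' strategy could locate committed data:
// files served from raw.githubusercontent.com on the default branch.
// The status-data/ path and 'main' default are illustrative assumptions.
function rawStatusUrl(owner: string, repo: string, file: string, branch = 'main'): string {
  return `https://raw.githubusercontent.com/${owner}/${repo}/${branch}/status-data/${file}`;
}
```

The page can then `fetch(rawStatusUrl('your-org', 'your-repo', 'current.json'))` on an interval to refresh without a rebuild.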

Private Repository Support: Use the http strategy with a server-side proxy (e.g., Cloudflare Worker) that adds authentication headers. This keeps tokens server-side while enabling runtime updates.
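The proxy idea can be sketched as a small request rewriter: the worker maps an incoming path onto the private repo's raw URL and attaches a server-side token. Everything here is illustrative (the hardcoded owner/repo, the path mapping, the Bearer scheme); it assumes a runtime with the WHATWG `Request` API (Cloudflare Workers, Node 18+):

```typescript
// Sketch of the proxy idea: rewrite an incoming request to the private repo's
// raw-content URL and attach a server-side token. Owner, repo, and path
// mapping are illustrative, not Stentorosaur's actual proxy.
function buildUpstreamRequest(incomingUrl: string, token: string): Request {
  const url = new URL(incomingUrl);
  // e.g. /status-data/current.json -> the same path in the private repo
  const upstream = `https://raw.githubusercontent.com/your-org/your-repo/main${url.pathname}`;
  return new Request(upstream, {
    headers: { Authorization: `Bearer ${token}` }, // token never reaches the browser
  });
}
```

In a Cloudflare Worker, the `fetch` handler would pass this request to the global `fetch` and return the response, so the browser only ever sees the proxy's origin.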

Trade-off: Update latency

| Approach | Latency | Complexity |
| --- | --- | --- |
| Build-only (original) | Minutes to hours | Low |
| Scheduled deploys | Up to 1 hour | Low |
| `dataSource` runtime fetch | Seconds | Low |
| Webhook-triggered deploy | Seconds | Medium |

We recommend dataSource with the github or http strategy for real-time updates.

Failure Modes

What happens when things break:

| Failure | Impact | Recovery |
| --- | --- | --- |
| GitHub Actions down | Monitoring stops | Automatic when GA recovers |
| Health check timeout | Marked as down | Auto-closes issue when recovered |
| Git push conflict | Workflow fails | Next run succeeds (no data loss) |
| Rate limit exceeded | API calls fail | Plugin falls back to committed data |

Flapping prevention:

The monitor doesn't immediately create an issue on the first failure. The `consecutiveFailures` option (default: 1) controls how many consecutive failures must occur before an incident is opened; set it to 3 for noisy endpoints.
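The guard logic behind `consecutiveFailures` can be sketched as a small state machine (our own illustration, not the monitor's actual code): count failures, emit "open" only when the threshold is first crossed, and emit "close" on the first success after an open incident.

```typescript
// Sketch of the flapping guard: open an incident only after N consecutive
// failures; close it on the first successful check afterwards.
class FlapGuard {
  private failures = 0;
  constructor(private readonly threshold: number = 1) {}

  // Returns 'open' when the failure threshold is first crossed,
  // 'close' on recovery from an open incident, otherwise 'none'.
  record(ok: boolean): 'open' | 'close' | 'none' {
    if (ok) {
      const wasOpen = this.failures >= this.threshold;
      this.failures = 0;
      return wasOpen ? 'close' : 'none';
    }
    this.failures += 1;
    return this.failures === this.threshold ? 'open' : 'none';
  }
}
```

With `threshold = 3`, two isolated failures produce no incident at all; only a third consecutive failure opens one.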

Concurrency Handling

Problem: What if two workflow runs overlap?

Solution: GitHub Actions concurrency groups:

```yaml
concurrency:
  group: monitor-${{ github.repository }}
  cancel-in-progress: false
```

This ensures only one monitoring job runs at a time. If a job is running, the next scheduled run waits (doesn't cancel the in-progress one).

Why Not Just Use Upptime?

Upptime is excellent for standalone status pages. We built Stentorosaur for a specific use case:

| Feature | Upptime | Stentorosaur |
| --- | --- | --- |
| Standalone site | Yes | No (Docusaurus plugin) |
| Docs integration | No | Yes |
| Framework | Svelte | React/Docusaurus |
| Config location | `.upptimerc.yml` | `docusaurus.config.js` |
| Theming | Custom | Inherits Docusaurus theme |

If you don't use Docusaurus, use Upptime. If your docs are on Docusaurus and you want integrated status, use Stentorosaur.

Summary

Stentorosaur makes specific trade-offs:

  • Monitoring granularity: 5+ minutes (GitHub Actions limitation)
  • Update latency: Depends on deploy frequency
  • Storage: Git-based (unconventional but zero-infrastructure)
  • Querying: In-memory at build time (no database)

These trade-offs are acceptable for status pages where 5-minute resolution is fine and the goal is simplicity over real-time precision.

Next: Quick Start Guide — full setup walkthrough.