
Architecture Decisions: Building on Upptime's Foundation

Chris (Amiable Dev) · Claude (AI Assistant) · 7 min read

This post covers the technical decisions behind Stentorosaur—why we chose certain approaches, what we rejected, and the trade-offs you should understand before adopting it.

Foundation: Upptime's Architecture

Stentorosaur builds on Upptime by Anand Chowdhary. Upptime established three patterns we adopted:

  1. GitHub Issues as the incident database
  2. GitHub Actions for scheduled health checks
  3. Static site generation from committed JSON data

We ported these concepts to work as a Docusaurus plugin rather than a standalone site.

System Overview

Decision 1: GitHub Issues as Incident Storage

What we chose: Store incidents as GitHub Issues with structured labels.

What we rejected:

  • SQLite in the repo (requires build-time queries, complex migrations)
  • External database (adds infrastructure, defeats the purpose)
  • JSON files only (loses collaboration features—comments, assignments, reactions)

Why Issues work:

  • Engineers already know how to create/close/comment on issues
  • Built-in permissions via repo access controls
  • Full audit trail with timestamps
  • Markdown support for incident details
  • Labels enable filtering (critical, system:api, process:checkout)

The label schema:

| Label | Purpose |
| --- | --- |
| `status` | Marks issue as status-trackable |
| `critical` / `major` / `minor` | Severity level |
| `system:{name}` | Links to a system entity |
| `process:{name}` | Links to a process entity |
| `automated` | Created by monitoring workflow |
| `maintenance` | Scheduled maintenance window |
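To make the schema concrete, here is a hypothetical helper that derives incident metadata from an issue's labels. The `IncidentMeta` shape and `parseLabels` name are illustrative, not part of Stentorosaur's actual API:

```typescript
// Hypothetical sketch: derive incident metadata from GitHub Issue labels
// following the label schema above. Types and names are illustrative only.
type Severity = 'critical' | 'major' | 'minor';

interface IncidentMeta {
  severity: Severity | null;
  systems: string[];
  processes: string[];
  automated: boolean;
  maintenance: boolean;
}

function parseLabels(labels: string[]): IncidentMeta {
  const meta: IncidentMeta = {
    severity: null,
    systems: [],
    processes: [],
    automated: labels.includes('automated'),
    maintenance: labels.includes('maintenance'),
  };
  for (const label of labels) {
    if (label === 'critical' || label === 'major' || label === 'minor') {
      meta.severity = label;
    } else if (label.startsWith('system:')) {
      meta.systems.push(label.slice('system:'.length));
    } else if (label.startsWith('process:')) {
      meta.processes.push(label.slice('process:'.length));
    }
  }
  return meta;
}
```

For example, `parseLabels(['status', 'critical', 'system:api'])` yields severity `critical` linked to the `api` system.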

Trade-off: API Rate Limits

GitHub's API allows 5,000 requests/hour with authentication. To avoid hitting limits:

  • Monitoring workflow commits data to JSON files
  • Plugin reads from committed files at build time, not from the API
  • Issues are only fetched when the workflow runs (not on every page load)

Decision 2: GitHub Actions for Monitoring

What we chose: Cron-scheduled GitHub Actions workflow.

What we rejected:

  • External monitoring service (adds cost, external dependency)
  • Client-side health checks (CORS issues, unreliable)
  • Webhook-based triggers (requires always-on infrastructure)

The monitoring workflow:

```yaml
name: Monitor Systems
on:
  schedule:
    - cron: '*/5 * * * *' # Minimum: 5 minutes
  workflow_dispatch: # Manual trigger

concurrency:
  group: monitor
  cancel-in-progress: false # Don't cancel running checks

permissions:
  contents: write
  issues: write

jobs:
  monitor:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run health checks
        run: |
          npx -y stentorosaur-monitor \
            --config .monitorrc.json \
            --output-dir ./status-data

      - name: Commit results
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add status-data/
          git diff --staged --quiet || git commit -m "Update status [skip ci]"
          git push
```

How stentorosaur-monitor works:

  1. Reads .monitorrc.json for endpoint definitions
  2. Makes HTTP requests with configurable timeout (default: 10s)
  3. Checks response code against expectedCodes array
  4. Measures response time
  5. Appends result to current.json
  6. If status changed: creates/closes GitHub Issue via API
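Steps 2-5 can be sketched as follows. This is an illustrative reconstruction, not stentorosaur-monitor's actual source; it assumes Node 18+ (global `fetch`, `AbortSignal.timeout`) and splits the pass/fail decision into a pure `evaluate` function:

```typescript
// Illustrative sketch of the health-check loop (steps 2-5 above).
// Assumes Node 18+; the real stentorosaur-monitor may differ.
interface CheckConfig {
  system: string;
  url: string;
  method?: string;
  timeout?: number;         // ms, default 10s
  expectedCodes?: number[]; // default [200]
  maxResponseTime?: number; // ms, optional
}

interface CheckResult {
  t: number;                // Unix timestamp (ms)
  svc: string;
  state: 'up' | 'down';
  code: number | null;
  lat: number;              // latency (ms)
}

// Pure decision logic: response code must be expected, latency within bounds.
function evaluate(cfg: CheckConfig, code: number, lat: number): 'up' | 'down' {
  const codeOk = (cfg.expectedCodes ?? [200]).includes(code);
  const latOk = cfg.maxResponseTime === undefined || lat <= cfg.maxResponseTime;
  return codeOk && latOk ? 'up' : 'down';
}

async function runCheck(cfg: CheckConfig): Promise<CheckResult> {
  const start = Date.now();
  try {
    const res = await fetch(cfg.url, {
      method: cfg.method ?? 'GET',
      signal: AbortSignal.timeout(cfg.timeout ?? 10_000),
    });
    const lat = Date.now() - start;
    return { t: start, svc: cfg.system, state: evaluate(cfg, res.status, lat), code: res.status, lat };
  } catch {
    // Timeout or network error counts as down
    return { t: start, svc: cfg.system, state: 'down', code: null, lat: Date.now() - start };
  }
}
```

Keeping `evaluate` pure makes the up/down rules unit-testable without a live endpoint.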

Monitor configuration (.monitorrc.json):

```json
{
  "systems": [
    {
      "system": "api",
      "url": "https://api.example.com/health",
      "method": "GET",
      "timeout": 10000,
      "expectedCodes": [200],
      "maxResponseTime": 5000,
      "headers": {
        "Authorization": "Bearer ${API_KEY}"
      }
    }
  ]
}
```
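The `${API_KEY}` placeholder presumably expands from the environment (e.g., a repository secret exported to the workflow). A minimal sketch of such interpolation, assuming `${VAR}` maps to environment variables:

```typescript
// Minimal sketch of ${VAR} expansion in header values, assuming placeholders
// refer to environment variables (e.g. secrets exported in the workflow).
function interpolate(value: string, env: Record<string, string | undefined>): string {
  return value.replace(/\$\{(\w+)\}/g, (_, name: string) => {
    const v = env[name];
    if (v === undefined) throw new Error(`Missing environment variable: ${name}`);
    return v;
  });
}
```

Usage: `interpolate('Bearer ${API_KEY}', process.env)`. Failing loudly on a missing variable avoids silently sending a literal `${API_KEY}` header.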

Trade-offs:

| Constraint | Impact | Mitigation |
| --- | --- | --- |
| 5-minute minimum cron | Can't detect sub-5-minute outages | Acceptable for status pages (not alerting) |
| Actions minutes (private repos) | ~8,600 min/month at 5-min intervals | Use hourly checks, or public repos (unlimited) |
| GitHub Actions outages | Monitoring stops | Accept this; GA has 99.9%+ uptime |

Decision 3: Data Storage in Git

What we chose: Commit monitoring data as JSON/JSONL files to the repository.

What we rejected:

  • GitHub Gists (no history, harder to query)
  • External storage (S3, etc.) — adds infrastructure
  • In-memory only — no persistence

The file structure:

```text
status-data/
├── current.json       # Rolling 14-day window
├── incidents.json     # Denormalized incident data
├── maintenance.json   # Scheduled maintenance windows
└── archives/
    └── 2025/11/
        ├── 2025-11-01.jsonl.gz  # Compressed historical data
        └── 2025-11-19.jsonl     # Today (uncompressed)
```
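A hypothetical helper mirroring this layout, mapping a date to its archive path (the `archives/YYYY/MM/YYYY-MM-DD.jsonl` convention is read off the tree above; the function name is ours):

```typescript
// Hypothetical helper mirroring the archive layout above:
// archives/YYYY/MM/YYYY-MM-DD.jsonl (UTC dates assumed).
function archivePath(date: Date): string {
  const yyyy = date.getUTCFullYear().toString();
  const mm = String(date.getUTCMonth() + 1).padStart(2, '0');
  const dd = String(date.getUTCDate()).padStart(2, '0');
  return `archives/${yyyy}/${mm}/${yyyy}-${mm}-${dd}.jsonl`;
}
```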

Schema: current.json entry

```json
{
  "t": 1700000000000,
  "svc": "api",
  "state": "up",
  "code": 200,
  "lat": 145
}
```

| Field | Type | Description |
| --- | --- | --- |
| `t` | number | Unix timestamp (ms) |
| `svc` | string | System/service name |
| `state` | string | `up` or `down` |
| `code` | number | HTTP response code |
| `lat` | number | Response latency (ms) |
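With this schema, per-system uptime is a straightforward aggregation. A sketch (our own function, not Stentorosaur's; it treats each check equally, whereas a real calculation might weight by interval duration):

```typescript
// Entry shape mirrors the current.json schema above.
interface Entry {
  t: number;
  svc: string;
  state: 'up' | 'down';
  code: number;
  lat: number;
}

// Sketch: percentage of 'up' checks for one system over a window of entries.
// Assumes evenly spaced checks; no entries is treated as 100% (no known downtime).
function uptimePercent(entries: Entry[], svc: string): number {
  const mine = entries.filter((e) => e.svc === svc);
  if (mine.length === 0) return 100;
  const up = mine.filter((e) => e.state === 'up').length;
  return (up / mine.length) * 100;
}
```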

Trade-off: Git as a Database

Using Git for data storage is unconventional. Here's the honest assessment:

Pros:

  • Zero infrastructure (no database to manage)
  • Full history via Git commits
  • Works offline (data is in the repo)
  • Free (included with GitHub)

Cons:

  • Repo size growth: Each commit adds to history. We mitigate this with:
    • JSONL format (append-only, single line per entry)
    • Gzip compression for archives (text-based, Git can delta-compress)
    • Rolling window in current.json (14 days, not forever)
  • Merge conflicts: Possible if two workflow runs overlap. We use concurrency groups to prevent this.
  • Not queryable: Can't run SQL. The plugin loads JSON into memory at build time.

Measured impact: A site checking 3 endpoints every 5 minutes generates ~50KB/month of JSONL data. After compression, archives grow at ~5KB/month. After one year, expect on the order of 100KB of status history (mostly compressed).
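The rolling-window mitigation mentioned above amounts to a filter on entry timestamps before rewriting current.json. A minimal sketch (`pruneWindow` is our name for it):

```typescript
const WINDOW_MS = 14 * 24 * 60 * 60 * 1000; // 14 days in milliseconds

// Sketch of the rolling-window mitigation: keep only entries whose timestamp
// falls within the last 14 days relative to `now`, dropping older ones
// (which live on in the compressed archives).
function pruneWindow<T extends { t: number }>(entries: T[], now: number): T[] {
  return entries.filter((e) => now - e.t <= WINDOW_MS);
}
```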

Decision 4: Build-Time vs. Runtime Data

What we chose: Static Site Generation (SSG) with optional runtime fetching (v0.16+).

What we rejected:

  • Client-side fetching only (CSR) — CORS issues, stale data visible during navigation
  • Server-side rendering (SSR) — requires Node.js server, defeats static hosting

How it works:

  1. Plugin registers a loadContent hook in Docusaurus
  2. At build time, plugin reads status-data/*.json
  3. Data is passed to React components as props
  4. /status page is generated as static HTML with embedded JSON
  5. Optional (v0.16+): Runtime fetching via dataSource configuration
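The build-time flow maps onto Docusaurus's plugin lifecycle (`loadContent` to produce data, `contentLoaded` to expose it). A minimal sketch of steps 1-3 — not Stentorosaur's actual plugin, which also registers routes and themed components; the `dataDir` option and plugin name are ours, and the real lifecycle hooks may be async:

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Minimal sketch of the Docusaurus plugin lifecycle described above.
// `dataDir` is an assumed option; the real plugin does considerably more.
export default function statusPlugin(
  _context: unknown,
  options: { dataDir: string },
) {
  return {
    name: 'stentorosaur-sketch',
    // Step 2: read committed JSON at build time (no API calls)
    loadContent() {
      const read = (f: string) =>
        JSON.parse(fs.readFileSync(path.join(options.dataDir, f), 'utf8'));
      return { current: read('current.json'), incidents: read('incidents.json') };
    },
    // Step 3: expose the loaded data to React components
    async contentLoaded({ content, actions }: { content: unknown; actions: any }) {
      actions.setGlobalData(content);
    },
  };
}
```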

New in v0.16: Runtime Data Fetching

The dataSource option enables client-side data updates without rebuilding:

```js
dataSource: {
  strategy: 'github', // or 'http', 'static', 'build-only'
  owner: 'your-org',
  repo: 'your-repo',
}
```

| Strategy | Use Case | Runtime Fetch |
| --- | --- | --- |
| `github` | Public repos via raw.githubusercontent.com | Yes |
| `http` | Custom APIs, proxies for private repos | Yes |
| `static` | Local/bundled JSON files | Yes |
| `build-only` | No runtime fetch (original behavior) | No |
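For the `github` strategy, the client presumably constructs raw-content URLs from the configured owner and repo. A sketch of that URL construction — the `status-data/` path and `main` default branch are assumptions here, not documented behavior:

```typescript
// Assumed sketch of how the 'github' strategy could locate committed data:
// files served from raw.githubusercontent.com on the default branch.
// The status-data/ path and 'main' default are illustrative assumptions.
function rawStatusUrl(owner: string, repo: string, file: string, branch = 'main'): string {
  return `https://raw.githubusercontent.com/${owner}/${repo}/${branch}/status-data/${file}`;
}
```

The page can then `fetch(rawStatusUrl('your-org', 'your-repo', 'current.json'))` on an interval to refresh without a rebuild.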

Private Repository Support: Use the http strategy with a server-side proxy (e.g., Cloudflare Worker) that adds authentication headers. This keeps tokens server-side while enabling runtime updates.
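The proxy idea can be sketched as a small request rewriter: the worker maps an incoming path onto the private repo's raw URL and attaches a server-side token. Everything here is illustrative (the hardcoded owner/repo, the path mapping, the Bearer scheme); it assumes a runtime with the WHATWG `Request` API (Cloudflare Workers, Node 18+):

```typescript
// Sketch of the proxy idea: rewrite an incoming request to the private repo's
// raw-content URL and attach a server-side token. Owner, repo, and path
// mapping are illustrative, not Stentorosaur's actual proxy.
function buildUpstreamRequest(incomingUrl: string, token: string): Request {
  const url = new URL(incomingUrl);
  // e.g. /status-data/current.json -> the same path in the private repo
  const upstream = `https://raw.githubusercontent.com/your-org/your-repo/main${url.pathname}`;
  return new Request(upstream, {
    headers: { Authorization: `Bearer ${token}` }, // token never reaches the browser
  });
}
```

In a Cloudflare Worker, the `fetch` handler would pass this request to the global `fetch` and return the response, so the browser only ever sees the proxy's origin.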

Trade-off: Update latency

| Approach | Latency | Complexity |
| --- | --- | --- |
| Build-only (original) | Minutes to hours | Low |
| Scheduled deploys | Up to 1 hour | Low |
| `dataSource` runtime fetch | Seconds | Low |
| Webhook-triggered deploy | Seconds | Medium |

We recommend dataSource with the github or http strategy for real-time updates.

Failure Modes

What happens when things break:

| Failure | Impact | Recovery |
| --- | --- | --- |
| GitHub Actions down | Monitoring stops | Automatic when GA recovers |
| Health check timeout | Marked as down | Auto-closes issue when recovered |
| Git push conflict | Workflow fails | Next run succeeds (no data loss) |
| Rate limit exceeded | API calls fail | Plugin falls back to committed data |

Flapping prevention:

The monitor doesn't immediately create an issue on the first failure. The `consecutiveFailures` option (default: 1) controls how many consecutive failures must occur before an incident is opened; set it to 3 for noisy endpoints.
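The guard logic behind `consecutiveFailures` can be sketched as a small state machine (our own illustration, not the monitor's actual code): count failures, emit "open" only when the threshold is first crossed, and emit "close" on the first success after an open incident.

```typescript
// Sketch of the flapping guard: open an incident only after N consecutive
// failures; close it on the first successful check afterwards.
class FlapGuard {
  private failures = 0;
  constructor(private readonly threshold: number = 1) {}

  // Returns 'open' when the failure threshold is first crossed,
  // 'close' on recovery from an open incident, otherwise 'none'.
  record(ok: boolean): 'open' | 'close' | 'none' {
    if (ok) {
      const wasOpen = this.failures >= this.threshold;
      this.failures = 0;
      return wasOpen ? 'close' : 'none';
    }
    this.failures += 1;
    return this.failures === this.threshold ? 'open' : 'none';
  }
}
```

With `threshold = 3`, two isolated failures produce no incident at all; only a third consecutive failure opens one.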

Concurrency Handling

Problem: What if two workflow runs overlap?

Solution: GitHub Actions concurrency groups:

```yaml
concurrency:
  group: monitor-${{ github.repository }}
  cancel-in-progress: false
```

This ensures only one monitoring job runs at a time. If a job is running, the next scheduled run waits (doesn't cancel the in-progress one).

Why Not Just Use Upptime?

Upptime is excellent for standalone status pages. We built Stentorosaur for a specific use case:

| Feature | Upptime | Stentorosaur |
| --- | --- | --- |
| Standalone site | Yes | No (Docusaurus plugin) |
| Docs integration | No | Yes |
| Framework | Svelte | React/Docusaurus |
| Config location | `.upptimerc.yml` | `docusaurus.config.js` |
| Theming | Custom | Inherits Docusaurus theme |

If you don't use Docusaurus, use Upptime. If your docs are on Docusaurus and you want integrated status, use Stentorosaur.

Summary

Stentorosaur makes specific trade-offs:

  • Monitoring granularity: 5+ minutes (GitHub Actions limitation)
  • Update latency: Depends on deploy frequency
  • Storage: Git-based (unconventional but zero-infrastructure)
  • Querying: In-memory at build time (no database)

These trade-offs are acceptable for status pages where 5-minute resolution is fine and the goal is simplicity over real-time precision.

Next: Quick Start Guide — full setup walkthrough.