# Architecture Decisions: Building on Upptime's Foundation
This post covers the technical decisions behind Stentorosaur—why we chose certain approaches, what we rejected, and the trade-offs you should understand before adopting it.
## Foundation: Upptime's Architecture
Stentorosaur builds on Upptime by Anand Chowdhary. Upptime established three patterns we adopted:
- GitHub Issues as the incident database
- GitHub Actions for scheduled health checks
- Static site generation from committed JSON data
We ported these concepts to work as a Docusaurus plugin rather than a standalone site.
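For orientation, here is a minimal sketch of what adopting the plugin looks like. The plugin id and option names below are assumptions for illustration, not the published API:

```ts
// docusaurus.config.ts -- hypothetical wiring; the actual plugin id and
// option names may differ from the published package.
import type { Config } from '@docusaurus/types';

const config: Config = {
  title: 'Example Docs',
  url: 'https://docs.example.com',
  baseUrl: '/',
  plugins: [
    [
      'stentorosaur', // assumed plugin id
      {
        dataDir: './status-data', // assumed option: where the committed JSON lives
      },
    ],
  ],
};

export default config;
```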
## System Overview

At a high level: a scheduled GitHub Actions workflow checks each endpoint, commits the results as JSON to the repository, and opens or closes GitHub Issues when a system's status changes. The Docusaurus plugin then reads the committed data at build time (or, from v0.16, at runtime) to render the status page.
## Decision 1: GitHub Issues as Incident Storage
What we chose: Store incidents as GitHub Issues with structured labels.
What we rejected:
- SQLite in the repo (requires build-time queries, complex migrations)
- External database (adds infrastructure, defeats the purpose)
- JSON files only (loses collaboration features—comments, assignments, reactions)
Why Issues work:
- Engineers already know how to create/close/comment on issues
- Built-in permissions via repo access controls
- Full audit trail with timestamps
- Markdown support for incident details
- Labels enable filtering (`critical`, `system:api`, `process:checkout`)
The label schema:
| Label | Purpose |
|---|---|
| `status` | Marks issue as status-trackable |
| `critical` / `major` / `minor` | Severity level |
| `system:{name}` | Links to a system entity |
| `process:{name}` | Links to a process entity |
| `automated` | Created by monitoring workflow |
| `maintenance` | Scheduled maintenance window |
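To make the schema concrete, here is a sketch of listing open incidents for one system with the official `@octokit/rest` client. The owner/repo values are placeholders:

```ts
// Sketch: list open incidents for one system using the label schema above.
import { Octokit } from '@octokit/rest';

const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

async function listOpenIncidents(system: string) {
  const { data: issues } = await octokit.rest.issues.listForRepo({
    owner: 'your-org', // placeholder
    repo: 'your-repo', // placeholder
    state: 'open',
    labels: `status,system:${system}`, // comma means AND: both labels required
  });
  for (const issue of issues) {
    const severity = issue.labels
      .map((l) => (typeof l === 'string' ? l : l.name ?? ''))
      .find((name) => ['critical', 'major', 'minor'].includes(name));
    console.log(`#${issue.number} [${severity ?? 'unlabeled'}] ${issue.title}`);
  }
}

listOpenIncidents('api').catch(console.error);
```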
### Trade-off: API Rate Limits
GitHub's API allows 5,000 requests/hour with authentication. To avoid hitting limits:
- Monitoring workflow commits data to JSON files
- Plugin reads from committed files at build time, not from the API
- Issues are only fetched when the workflow runs (not on every page load)
## Decision 2: GitHub Actions for Monitoring
What we chose: Cron-scheduled GitHub Actions workflow.
What we rejected:
- External monitoring service (adds cost, external dependency)
- Client-side health checks (CORS issues, unreliable)
- Webhook-based triggers (requires always-on infrastructure)
The monitoring workflow:
```yaml
name: Monitor Systems

on:
  schedule:
    - cron: '*/5 * * * *' # Minimum: 5 minutes
  workflow_dispatch: # Manual trigger

concurrency:
  group: monitor
  cancel-in-progress: false # Don't cancel running checks

permissions:
  contents: write
  issues: write

jobs:
  monitor:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run health checks
        run: |
          npx -y stentorosaur-monitor \
            --config .monitorrc.json \
            --output-dir ./status-data
      - name: Commit results
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add status-data/
          git diff --staged --quiet || git commit -m "Update status [skip ci]"
          git push
```
How `stentorosaur-monitor` works:
- Reads `.monitorrc.json` for endpoint definitions
- Makes HTTP requests with a configurable timeout (default: 10s)
- Checks the response code against the `expectedCodes` array
- Measures response time
- Appends the result to `current.json`
- If the status changed: creates/closes a GitHub Issue via the API
Monitor configuration (`.monitorrc.json`):
```json
{
  "systems": [
    {
      "system": "api",
      "url": "https://api.example.com/health",
      "method": "GET",
      "timeout": 10000,
      "expectedCodes": [200],
      "maxResponseTime": 5000,
      "headers": {
        "Authorization": "Bearer ${API_KEY}"
      }
    }
  ]
}
```
Trade-offs:
| Constraint | Impact | Mitigation |
|---|---|---|
| 5-minute minimum cron | Can't detect sub-5-minute outages | Acceptable for status pages (not alerting) |
| Actions minutes (private repos) | ~8,600 min/month at 5-min interval | Use hourly checks, or public repos (unlimited) |
| GitHub Actions outages | Monitoring stops | Accept this; GA has 99.9%+ uptime |
## Decision 3: Data Storage in Git
What we chose: Commit monitoring data as JSON/JSONL files to the repository.
What we rejected:
- GitHub Gists (no history, harder to query)
- External storage (S3, etc.) — adds infrastructure
- In-memory only — no persistence
The file structure:
```text
status-data/
├── current.json       # Rolling 14-day window
├── incidents.json     # Denormalized incident data
├── maintenance.json   # Scheduled maintenance windows
└── archives/
    └── 2025/11/
        ├── 2025-11-01.jsonl.gz  # Compressed historical data
        └── 2025-11-19.jsonl     # Today (uncompressed)
```
### Schema: `current.json` entry
```json
{
  "t": 1700000000000,
  "svc": "api",
  "state": "up",
  "code": 200,
  "lat": 145
}
```
| Field | Type | Description |
|---|---|---|
| `t` | number | Unix timestamp (ms) |
| `svc` | string | System/service name |
| `state` | string | `up` or `down` |
| `code` | number | HTTP response code |
| `lat` | number | Response latency (ms) |
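One payoff of the compact schema is that aggregation is trivial. A hypothetical build step could compute per-system uptime like this (assuming one JSON entry per line, as in the monitor sketch above):

```ts
// Hypothetical aggregation over current.json entries; not plugin code.
import { readFileSync } from 'node:fs';

interface Entry { t: number; svc: string; state: 'up' | 'down'; code: number; lat: number }

const entries: Entry[] = readFileSync('status-data/current.json', 'utf8')
  .split('\n')
  .filter(Boolean)
  .map((line) => JSON.parse(line));

const bySystem = new Map<string, { up: number; total: number }>();
for (const e of entries) {
  const s = bySystem.get(e.svc) ?? { up: 0, total: 0 };
  s.total += 1;
  if (e.state === 'up') s.up += 1;
  bySystem.set(e.svc, s);
}

for (const [svc, { up, total }] of bySystem) {
  console.log(`${svc}: ${((100 * up) / total).toFixed(2)}% uptime over ${total} checks`);
}
```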
### Trade-off: Git as a Database
Using Git for data storage is unconventional. Here's the honest assessment:
Pros:
- Zero infrastructure (no database to manage)
- Full history via Git commits
- Works offline (data is in the repo)
- Free (included with GitHub)
Cons:
- Repo size growth: Each commit adds to history. We mitigate this with:
  - JSONL format (append-only, single line per entry)
  - Gzip compression for archives (text-based, Git can delta-compress)
  - Rolling window in `current.json` (14 days, not forever; see the sketch after this list)
- Merge conflicts: Possible if two workflow runs overlap. We use `concurrency` groups to prevent this.
- Not queryable: Can't run SQL. The plugin loads JSON into memory at build time.
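The rolling window itself takes only a few lines to maintain. A minimal sketch, again assuming one JSON entry per line (this is a hypothetical helper, not the actual implementation):

```ts
// Hypothetical pruning step for the 14-day rolling window in current.json.
import { readFileSync, writeFileSync } from 'node:fs';

const WINDOW_MS = 14 * 24 * 60 * 60 * 1000; // 14 days
const cutoff = Date.now() - WINDOW_MS;

const kept = readFileSync('status-data/current.json', 'utf8')
  .split('\n')
  .filter(Boolean)
  .filter((line) => JSON.parse(line).t >= cutoff);

writeFileSync('status-data/current.json', kept.join('\n') + '\n');
```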
Measured impact: A site checking 3 endpoints every 5 minutes generates ~50KB/month of JSONL data; after compression, archives grow at ~5KB/month. The dominant growth is Git history from the frequent commits: after one year, expect roughly 100MB of repository history (mostly compressed).
## Decision 4: Build-Time vs. Runtime Data
What we chose: Static Site Generation (SSG) with optional runtime fetching (v0.16+).
What we rejected:
- Client-side fetching only (CSR) — CORS issues, stale data visible during navigation
- Server-side rendering (SSR) — requires Node.js server, defeats static hosting
How it works:
1. Plugin registers a `loadContent` hook in Docusaurus
2. At build time, the plugin reads `status-data/*.json`
3. Data is passed to React components as props
4. The `/status` page is generated as static HTML with embedded JSON
5. Optional (v0.16+): Runtime fetching via `dataSource` configuration
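In code, the build-time path looks roughly like this. The `loadContent`/`contentLoaded` hooks are Docusaurus's real plugin API, but the file handling and route details below are a simplified sketch, not the actual plugin source:

```ts
// Simplified sketch of the build-time data flow in a Docusaurus plugin.
import { readFileSync } from 'node:fs';
import path from 'node:path';
import type { LoadContext, Plugin } from '@docusaurus/types';

export default function statusPluginSketch(context: LoadContext): Plugin<unknown> {
  return {
    name: 'status-plugin-sketch',
    async loadContent() {
      // Read the committed monitoring data at build time (no API calls).
      const dir = path.join(context.siteDir, 'status-data');
      // Assuming one JSON entry per line in current.json, as sketched earlier.
      const current = readFileSync(path.join(dir, 'current.json'), 'utf8')
        .split('\n')
        .filter(Boolean)
        .map((line) => JSON.parse(line));
      const incidents = JSON.parse(readFileSync(path.join(dir, 'incidents.json'), 'utf8'));
      return { current, incidents };
    },
    async contentLoaded({ content, actions }) {
      // Serialize the loaded data and attach it to a static /status route.
      const dataPath = await actions.createData('status.json', JSON.stringify(content));
      actions.addRoute({
        path: '/status',
        component: '@site/src/components/StatusPage', // assumed component path
        modules: { status: dataPath },
        exact: true,
      });
    },
  };
}
```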
### New in v0.16: Runtime Data Fetching

The `dataSource` option enables client-side data updates without rebuilding:
```js
dataSource: {
  strategy: 'github', // or 'http', 'static', 'build-only'
  owner: 'your-org',
  repo: 'your-repo',
}
```
| Strategy | Use Case | Runtime Fetch |
|---|---|---|
| `github` | Public repos via raw.githubusercontent.com | Yes |
| `http` | Custom APIs, proxies for private repos | Yes |
| `static` | Local/bundled JSON files | Yes |
| `build-only` | No runtime fetch (original behavior) | No |
Private Repository Support: Use the `http` strategy with a server-side proxy (e.g., a Cloudflare Worker) that adds authentication headers. This keeps tokens server-side while enabling runtime updates.
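A minimal sketch of such a proxy as a Cloudflare Worker module. The repo path is a placeholder, and `GITHUB_TOKEN` is assumed to be configured as a Worker secret:

```ts
// Hypothetical Cloudflare Worker proxying status-data files from a private
// repo; owner/repo/branch are placeholders.
interface Env {
  GITHUB_TOKEN: string;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    // Only allow the known data files to be proxied.
    if (!/^\/status-data\/[\w.-]+\.json$/.test(url.pathname)) {
      return new Response('Not found', { status: 404 });
    }
    const upstream = `https://raw.githubusercontent.com/your-org/your-repo/main${url.pathname}`;
    const res = await fetch(upstream, {
      headers: {
        Authorization: `token ${env.GITHUB_TOKEN}`, // token stays server-side
        'User-Agent': 'status-proxy',
      },
    });
    // Pass the body through with permissive CORS for the status page.
    return new Response(res.body, {
      status: res.status,
      headers: {
        'Content-Type': 'application/json',
        'Access-Control-Allow-Origin': '*',
      },
    });
  },
};
```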
### Trade-off: Update Latency

| Approach | Latency | Complexity |
|---|---|---|
| Build-only (original) | Minutes to hours | Low |
| Scheduled deploys | Up to 1 hour | Low |
| `dataSource` runtime fetch | Seconds | Low |
| Webhook-triggered deploy | Seconds | Medium |
We recommend `dataSource` with the `github` or `http` strategy for real-time updates.
## Failure Modes
What happens when things break:
| Failure | Impact | Recovery |
|---|---|---|
| GitHub Actions down | Monitoring stops | Automatic when GA recovers |
| Health check timeout | Marked as down | Auto-closes issue when recovered |
| Git push conflict | Workflow fails | Next run succeeds (no data loss) |
| Rate limit exceeded | API calls fail | Plugin falls back to committed data |
Flapping prevention:
The monitor doesn't have to open an issue on the first failure: the `consecutiveFailures` option (default: 1) controls how many consecutive failures must occur before an incident is opened. Set it to 3 for noisy endpoints.
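The logic amounts to a small counter per system; a hypothetical sketch:

```ts
// Hypothetical debounce helper, not the actual monitor implementation.
// Note: each workflow run is a fresh process, so in practice the failure
// streak would be derived from the persisted entries in current.json.
const failureCounts = new Map<string, number>();

function shouldOpenIncident(system: string, up: boolean, consecutiveFailures = 1): boolean {
  if (up) {
    failureCounts.set(system, 0); // any success resets the streak
    return false;
  }
  const streak = (failureCounts.get(system) ?? 0) + 1;
  failureCounts.set(system, streak);
  return streak >= consecutiveFailures; // open only after N failures in a row
}
```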
## Concurrency Handling
Problem: What if two workflow runs overlap?
Solution: GitHub Actions concurrency groups:
```yaml
concurrency:
  group: monitor-${{ github.repository }}
  cancel-in-progress: false
```
This ensures only one monitoring job runs at a time. If a job is running, the next scheduled run waits (doesn't cancel the in-progress one).
## Why Not Just Use Upptime?
Upptime is excellent for standalone status pages. We built Stentorosaur for a specific use case:
| Feature | Upptime | Stentorosaur |
|---|---|---|
| Standalone site | Yes | No (Docusaurus plugin) |
| Docs integration | No | Yes |
| Framework | Svelte | React/Docusaurus |
| Config location | .upptimerc.yml | docusaurus.config.js |
| Theming | Custom | Inherits Docusaurus theme |
If you don't use Docusaurus, use Upptime. If your docs are on Docusaurus and you want integrated status, use Stentorosaur.
## Summary
Stentorosaur makes specific trade-offs:
- Monitoring granularity: 5+ minutes (GitHub Actions limitation)
- Update latency: Depends on deploy frequency
- Storage: Git-based (unconventional but zero-infrastructure)
- Querying: In-memory at build time (no database)
These trade-offs are acceptable for status pages where 5-minute resolution is fine and the goal is simplicity over real-time precision.
Next: Quick Start Guide — full setup walkthrough.

