Scoped Role Bindings: From Global Chaos to Team-Level Control

February 10, 2026 · 5 min read

SRE Operations Platform

Published: 2025-01-16

When the LLM Council reviewed our RBAC implementation (ADR-032), the verdict was unanimous and uncomfortable: REQUEST_CHANGES. The core problem? Our "simple" role model was a ticking time bomb. A user with WRITE permission could modify any resource in the system—Team A's runbooks, Team B's SLOs, production configurations they had no business touching.

This post details how we addressed Issue #450 by implementing scoped role bindings, transforming our authorization model from a liability into a proper multi-tenant foundation.

The Problem

Our original RBAC model was deceptively simple:

class Permission(Enum):
    READ = "read"
    WRITE = "write"
    DELETE = "delete"
    ADMIN = "admin"

Users were assigned roles (admin, editor, viewer) that mapped to these permissions globally. The LLM Council identified several critical gaps:

No Resource Scoping: A Team-A editor could modify Team-B's runbooks
No Environment Separation: Couldn't be Admin in staging but Viewer in production
Missing Operational Verbs: No DEPLOY, SCALE, or SILENCE_ALERT permissions
No Audit Trail: Permission checks weren't logged for compliance

The "blast radius" of any permission was the entire system. For an SRE platform managing production infrastructure, this was unacceptable.

The Solution

We implemented a three-layer scoping model:

User → Role → Scope → Permission

Scope Model

Scopes define boundaries for resource access:

class AuthScope(Base):
    __tablename__ = "auth_scopes"

    scope_type: Mapped[str]   # team, environment, service, organization
    scope_value: Mapped[str]  # team-alpha, production, api-gateway
    parent_id: Mapped[str | None]  # Parent scope for hierarchy (org → team → project); None at the root

The hierarchy enables inheritance—organization-level access flows down to teams within that org.
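
To make the inheritance rule concrete, here is a minimal in-memory sketch of walking parent_id links. The Scope dataclass and the ancestors_of and grants_access_to helpers are hypothetical names introduced for illustration; the real lookup presumably runs against the auth_scopes table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scope:
    """Hypothetical in-memory stand-in for an auth_scopes row."""
    id: str
    scope_type: str
    scope_value: str
    parent_id: Optional[str] = None


def ancestors_of(scope_id: str, scopes: dict[str, Scope]) -> list[str]:
    """Return scope_id followed by its ancestors, nearest first."""
    chain: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = scope_id
    while current is not None and current not in seen:
        seen.add(current)  # guard against accidental cycles in the data
        chain.append(current)
        current = scopes[current].parent_id
    return chain


def grants_access_to(
    bound_scope_id: str, target_scope_id: str, scopes: dict[str, Scope]
) -> bool:
    """A binding on a scope covers that scope and everything beneath it."""
    return bound_scope_id in ancestors_of(target_scope_id, scopes)


scopes = {
    "org-1": Scope("org-1", "organization", "default"),
    "team-1": Scope("team-1", "team", "team-alpha", parent_id="org-1"),
    "team-2": Scope("team-2", "team", "team-beta", parent_id="org-1"),
}
```

Under these assumptions, a binding on org-1 covers both teams, while a binding on team-1 covers neither team-2 nor the organization above it: access flows down the tree, never up or sideways.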

Scoped Role Bindings

Instead of global role assignments, users now have scoped bindings:

class AuthRoleScopeBinding(Base):
    __tablename__ = "auth_role_scope_bindings"

    user_email: Mapped[str]      # Who
    role_id: Mapped[str]         # What role
    scope_id: Mapped[str]        # Where (team, env, etc.)
    expires_at: Mapped[datetime] # Time-bound access
    assigned_by: Mapped[str]     # Audit: who granted this
    reason: Mapped[str]          # Audit: why granted

This enables scenarios like:

Alice: Admin in Team-Alpha, Viewer in Team-Beta
Bob: Deployer in staging, Reader in production
Carol: On-call engineer with SILENCE_ALERT for 8 hours
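
Carol's case can be sketched with plain dataclasses. The Binding class, the is_expired_at method, and the example addresses below are illustrative assumptions; treating a missing expires_at as "never expires" is also an assumption, not something the schema above states.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Binding:
    """Hypothetical stand-in for an auth_role_scope_bindings row."""
    user_email: str
    role_id: str
    scope_id: str
    assigned_by: str
    reason: str
    expires_at: Optional[datetime] = None  # assumption: None means no expiry

    def is_expired_at(self, now: datetime) -> bool:
        """True once `now` has reached the expiry time."""
        return self.expires_at is not None and now >= self.expires_at


shift_start = datetime(2025, 1, 16, 10, 0, tzinfo=timezone.utc)

carol = Binding(
    user_email="carol@example.com",
    role_id="on-call",
    scope_id="production",
    assigned_by="lead@example.com",
    reason="On-call shift",
    expires_at=shift_start + timedelta(hours=8),
)
```

Passing the current time in explicitly, rather than calling datetime.now inside the method, keeps the expiry rule easy to test. Using timezone-aware datetimes throughout avoids comparison errors between naive and aware values.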

Operational Verbs

We added SRE-specific permissions:

class Permission(str, Enum):
    # Basic CRUD
    READ = "read"
    WRITE = "write"
    DELETE = "delete"
    EXECUTE = "execute"
    ADMIN = "admin"

    # Operational verbs (Issue #450)
    APPROVE = "approve"         # Approve deployments, changes
    DEPLOY = "deploy"           # Deploy to environments
    SCALE = "scale"             # Scale services
    SILENCE_ALERT = "silence_alert"  # Silence during maintenance
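
How roles map onto these verbs is not shown in this post, so the following is only a sketch. The ROLE_PERMISSIONS table and the grants for the deployer and on-call roles are assumptions; only the editor and viewer grants match the examples later in this post.

```python
from enum import Enum


class Permission(str, Enum):
    READ = "read"
    WRITE = "write"
    DELETE = "delete"
    EXECUTE = "execute"
    ADMIN = "admin"
    APPROVE = "approve"
    DEPLOY = "deploy"
    SCALE = "scale"
    SILENCE_ALERT = "silence_alert"


# Hypothetical role definitions; the real role names and grants may differ.
ROLE_PERMISSIONS: dict[str, frozenset[Permission]] = {
    "viewer": frozenset({Permission.READ}),
    "editor": frozenset({Permission.READ, Permission.WRITE}),
    "deployer": frozenset({Permission.READ, Permission.DEPLOY, Permission.SCALE}),
    "on-call": frozenset({Permission.READ, Permission.SILENCE_ALERT}),
}


def get_role_permissions(role: str) -> frozenset[Permission]:
    """Unknown roles get no permissions, so the default is to deny."""
    return ROLE_PERMISSIONS.get(role, frozenset())
```

Keeping DEPLOY out of the editor role is the point of separate verbs: someone who can edit a runbook does not automatically gain the ability to ship to an environment.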

Implementation

Scoped Permission Checker

The core permission checker evaluates access within context:

class ScopedPermissionChecker:
    def check_scoped_permission(
        self,
        user: UserInfo,
        required_permission: Permission,
        resource_context: dict[str, Any],  # {"team": "team-alpha", "environment": "production"}
    ) -> bool:
        # 1. Get user's scope bindings
        bindings = self._get_user_bindings(user.email)

        # 2. Filter to active (non-expired) bindings
        active = [b for b in bindings if not b.is_expired]

        # 3. Check if any binding grants access
        for binding in active:
            if self._scope_matches_context(binding.scope, resource_context):
                if required_permission in self._get_role_permissions(binding.role):
                    return True

        return False
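
The checker leans on _scope_matches_context, which is not shown here. One plausible reading, sketched below as a standalone function, is that a scope matches when the request context names the same scope type with the same value. This is an assumption about the matching rule, and it deliberately ignores the parent hierarchy.

```python
from typing import Any


def scope_matches_context(
    scope_type: str, scope_value: str, context: dict[str, Any]
) -> bool:
    """True when the context names this scope's type with the same value.

    A missing key counts as a non-match, so a team-scoped binding never
    applies to a request that does not say which team it targets.
    """
    return scope_type in context and context[scope_type] == scope_value


context = {"team": "team-alpha", "environment": "staging"}

in_alpha = scope_matches_context("team", "team-alpha", context)
in_beta = scope_matches_context("team", "team-beta", context)
in_service = scope_matches_context("service", "api-gateway", context)
```

Treating an absent key as a non-match keeps the check fail-closed: a route that forgets to pass the team falls through to the final return False instead of matching by accident.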

Audit Logging

Every permission check is logged with structured fields:

def _log_permission_check(self, user, permission, context, result):
    log_data = {
        "user_email": user.email,
        "permission": permission.value,
        "resource_context": context,
        "result": "granted" if result else "denied",
        "scope_path": self._build_scope_path(context),
    }

    if result:
        logger.info("Permission granted", extra=log_data)
    else:
        logger.warning("Permission denied", extra=log_data)
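
Fields passed through extra become attributes on the log record, but the standard formatter does not print them unless told to. The sketch below shows one way to emit them as JSON with only the standard library. AuditJsonFormatter, AUDIT_FIELDS, and the audit_sketch logger name are hypothetical; the platform's actual logging setup is not described in this post.

```python
import io
import json
import logging

# Fields copied from the log record into the JSON payload (assumed list).
AUDIT_FIELDS = ("user_email", "permission", "resource_context", "result", "scope_path")


class AuditJsonFormatter(logging.Formatter):
    """Render a record plus any audit fields set via `extra` as one JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {"level": record.levelname, "message": record.getMessage()}
        for field in AUDIT_FIELDS:
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        return json.dumps(payload, sort_keys=True)


# Write to an in-memory stream so the example has no side effects.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(AuditJsonFormatter())

audit_logger = logging.getLogger("audit_sketch")
audit_logger.setLevel(logging.INFO)
audit_logger.addHandler(handler)
audit_logger.propagate = False

audit_logger.warning(
    "Permission denied",
    extra={
        "user_email": "alice@example.com",
        "permission": "delete",
        "resource_context": {"team": "team-beta"},
        "result": "denied",
        "scope_path": "team:team-beta",
    },
)
```

Keys in extra must not collide with built-in record attributes such as message or name, which is one reason to use prefixed or domain-specific field names like the ones above.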

Usage in Routes

Protecting endpoints with scoped authorization:

from fastapi import Depends, HTTPException
from sqlalchemy.orm import Session

from core.auth.scoped_rbac import ScopedPermissionChecker, create_scope_resolver

team_resolver = create_scope_resolver(
    scope_type="team",
    extract_scope=lambda r: r.path_params.get("team_id"),
)

@app.delete("/teams/{team_id}/runbooks/{runbook_id}")
async def delete_runbook(
    team_id: str,
    runbook_id: str,
    user: UserInfo = Depends(get_current_user),
    db: Session = Depends(get_db),
):
    checker = ScopedPermissionChecker(db)

    if not checker.check_scoped_permission(
        user, Permission.DELETE, {"team": team_id}
    ):
        raise HTTPException(403, "Not authorized for this team")

    # Safe to proceed—user has DELETE in this team's scope

Results

Before: Global Permissions

alice@example.com → editor → {READ, WRITE}  # Everywhere!

After: Scoped Bindings

alice@example.com → editor → Team-Alpha/staging     → {READ, WRITE}
alice@example.com → viewer → Team-Alpha/production  → {READ}
alice@example.com → (none)  → Team-Beta/*           → (no access)

Audit Trail

{
  "timestamp": "2025-01-16T10:30:00Z",
  "user_email": "alice@example.com",
  "permission": "write",
  "resource_context": {"team": "team-alpha", "environment": "staging"},
  "result": "granted",
  "scope_path": "team:team-alpha -> environment:staging"
}

Lessons Learned

Start with scopes, not roles: If we'd designed with scopes from day one, the global-permission mistake wouldn't have happened.
Operational verbs matter: CRUD permissions don't capture SRE workflows. DEPLOY, SCALE, and SILENCE_ALERT are fundamentally different from WRITE.
Audit everything: The compliance team was delighted. Every permission check now has a paper trail.
Time-bound access enables break-glass: On-call engineers get elevated permissions that auto-expire after their shift.
Hierarchy simplifies management: Org → Team → Project scope hierarchy means fewer bindings to manage.

Migration Path

For existing users, we created a migration that:

Seeds default scopes (development, staging, production, default org)
Keeps existing global roles working (backward compatible)
Allows new scoped bindings to be added incrementally
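
One way backward compatibility like this can work is to try scoped bindings first and fall back to the legacy global roles. The sketch below is an assumption about the mechanism, not a description of the actual migration: check_with_fallback, LEGACY_ROLE_PERMISSIONS, and the returned source labels are all hypothetical.

```python
from typing import Callable

# Hypothetical legacy mapping from global role name to permission values.
LEGACY_ROLE_PERMISSIONS: dict[str, set[str]] = {
    "viewer": {"read"},
    "editor": {"read", "write"},
}


def check_with_fallback(
    scoped_check: Callable[[], bool],
    legacy_roles: list[str],
    required_permission: str,
) -> tuple[bool, str]:
    """Try scoped bindings first, then fall back to legacy global roles.

    Returns (allowed, source) so that audit logs can record which path
    granted access.
    """
    if scoped_check():
        return True, "scoped"
    for role in legacy_roles:
        if required_permission in LEGACY_ROLE_PERMISSIONS.get(role, set()):
            return True, "legacy_global"
    return False, "none"


via_scope = check_with_fallback(lambda: True, [], "write")
via_legacy = check_with_fallback(lambda: False, ["editor"], "write")
denied = check_with_fallback(lambda: False, ["viewer"], "write")
```

Recording the source of each grant would make it possible to see how many requests still depend on global roles before those roles are retired. Note that while the fallback is active, global roles still carry their original system-wide reach.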

What's Next

With scoped RBAC in place, we can now:

Implement capability-level permissions (per-entity access)
Add break-glass workflows with automatic scope escalation
Build a permission management UI for team leads
Enable self-service scope requests with approval workflows

This post details the implementation of ADR-032: RBAC Model, resolving Issue #450.
