Scoped Role Bindings: From Global Chaos to Team-Level Control

Published: 2025-01-16


When the LLM Council reviewed our RBAC implementation (ADR-032), the verdict was unanimous and uncomfortable: REQUEST_CHANGES. The core problem? Our "simple" role model was a ticking time bomb. A user with WRITE permission could modify any resource in the system—Team A's runbooks, Team B's SLOs, production configurations they had no business touching.

This post details how we addressed Issue #450 by implementing scoped role bindings, transforming our authorization model from a liability into a proper multi-tenant foundation.

The Problem

Our original RBAC model was deceptively simple:

class Permission(Enum):
    READ = "read"
    WRITE = "write"
    DELETE = "delete"
    ADMIN = "admin"

Users were assigned roles (admin, editor, viewer) that mapped to these permissions globally. The LLM Council identified several critical gaps:

  1. No Resource Scoping: A Team-A editor could modify Team-B's runbooks
  2. No Environment Separation: A user couldn't be Admin in staging but Viewer in production
  3. Missing Operational Verbs: No DEPLOY, SCALE, or SILENCE_ALERT permissions
  4. No Audit Trail: Permission checks weren't logged for compliance

The "blast radius" of any permission was the entire system. For an SRE platform managing production infrastructure, this was unacceptable.
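To make the blast radius concrete, here is a minimal sketch of a global check. The role table and the `can` helper are illustrative assumptions rather than our original code; the point is only that the check never consults which team owns the resource.

```python
from enum import Enum


class Permission(Enum):
    READ = "read"
    WRITE = "write"
    DELETE = "delete"
    ADMIN = "admin"


# Assumed role-to-permission table, for illustration only.
GLOBAL_ROLES = {
    "admin": set(Permission),
    "editor": {Permission.READ, Permission.WRITE},
    "viewer": {Permission.READ},
}


def can(role: str, permission: Permission, resource_owner: str) -> bool:
    """Global check: resource_owner is accepted but never consulted."""
    return permission in GLOBAL_ROLES.get(role, set())


# An editor who belongs to Team A can write to either team's resources.
team_a_write = can("editor", Permission.WRITE, resource_owner="team-a")
team_b_write = can("editor", Permission.WRITE, resource_owner="team-b")
```

Both calls return the same answer because `resource_owner` plays no part in the decision, which is exactly the gap the Council flagged.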

The Solution

We implemented a three-layer scoping model:

User → Role → Scope → Permission

Scope Model

Scopes define boundaries for resource access:

class AuthScope(Base):
    __tablename__ = "auth_scopes"

    scope_type: Mapped[str]   # team, environment, service, organization
    scope_value: Mapped[str]  # team-alpha, production, api-gateway
    parent_id: Mapped[str]    # For hierarchy (org → team → project)

The hierarchy enables inheritance—organization-level access flows down to teams within that org.
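One way to picture the inheritance is as a walk up the `parent_id` chain. The sketch below is an in-memory approximation, not the database-backed implementation: the `Scope` dataclass, the `covers` helper, the scope IDs, and the example names (`acme`, `checkout`) are all hypothetical, and it assumes a root scope has no parent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scope:
    id: str
    scope_type: str
    scope_value: str
    parent_id: Optional[str] = None  # None marks a root scope (assumption)


def covers(granted_id: str, target_id: str, scopes: dict[str, Scope]) -> bool:
    """Return True if granted_id is target_id or one of its ancestors."""
    seen: set[str] = set()
    current: Optional[str] = target_id
    while current is not None and current not in seen:
        if current == granted_id:
            return True
        seen.add(current)  # guards against accidental cycles in parent_id
        node = scopes.get(current)
        current = node.parent_id if node else None
    return False


scopes = {
    "org-1": Scope("org-1", "organization", "acme"),
    "team-1": Scope("team-1", "team", "team-alpha", parent_id="org-1"),
    "proj-1": Scope("proj-1", "project", "checkout", parent_id="team-1"),
    "team-2": Scope("team-2", "team", "team-beta", parent_id="org-1"),
}
```

Access flows down only: `covers("org-1", "proj-1", scopes)` holds, while a project-level grant never covers its organization and a sibling team never covers another team's project.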

Scoped Role Bindings

Instead of global role assignments, users now have scoped bindings:

class AuthRoleScopeBinding(Base):
    __tablename__ = "auth_role_scope_bindings"

    user_email: Mapped[str]       # Who
    role_id: Mapped[str]          # What role
    scope_id: Mapped[str]         # Where (team, env, etc.)
    expires_at: Mapped[datetime]  # Time-bound access
    assigned_by: Mapped[str]      # Audit: who granted this
    reason: Mapped[str]           # Audit: why granted

This enables scenarios like:

  • Alice: Admin in Team-Alpha, Viewer in Team-Beta
  • Bob: Deployer in staging, Viewer in production
  • Carol: On-call engineer with SILENCE_ALERT for 8 hours
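Carol's case can be sketched with a plain dataclass that mirrors the binding columns. This is an assumption-laden illustration: the `Binding` class, the `on-call` role name, the email addresses, and the convention that `expires_at=None` means "never expires" are not taken from the real model, and `is_expired` takes the current time as an argument only to keep the example deterministic.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Binding:
    user_email: str
    role_id: str
    scope_id: str
    assigned_by: str
    reason: str
    expires_at: Optional[datetime] = None  # None = no expiry (assumption)

    def is_expired(self, now: datetime) -> bool:
        """A binding is expired once `now` reaches its expiry time."""
        return self.expires_at is not None and now >= self.expires_at


shift_start = datetime(2025, 1, 16, 9, 0, tzinfo=timezone.utc)

# Eight-hour grant for an on-call shift; names are hypothetical.
carol = Binding(
    user_email="carol@example.com",
    role_id="on-call",
    scope_id="production",
    assigned_by="lead@example.com",
    reason="on-call shift",
    expires_at=shift_start + timedelta(hours=8),
)

# A binding without an expiry stays active indefinitely.
alice = Binding("alice@example.com", "admin", "team-alpha", "lead@example.com", "team owner")
```

Using timezone-aware datetimes throughout avoids comparing naive and aware values, which Python rejects with a `TypeError`.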

Operational Verbs

We added SRE-specific permissions:

class Permission(str, Enum):
    # Basic CRUD
    READ = "read"
    WRITE = "write"
    DELETE = "delete"
    EXECUTE = "execute"
    ADMIN = "admin"

    # Operational verbs (Issue #450)
    APPROVE = "approve"              # Approve deployments, changes
    DEPLOY = "deploy"                # Deploy to environments
    SCALE = "scale"                  # Scale services
    SILENCE_ALERT = "silence_alert"  # Silence during maintenance
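To show why separate verbs matter, the sketch below pairs the enum with a hypothetical role table. The role definitions and the `role_allows` helper are assumptions made for illustration; this post does not list the actual role contents.

```python
from enum import Enum


class Permission(str, Enum):
    READ = "read"
    WRITE = "write"
    DELETE = "delete"
    EXECUTE = "execute"
    ADMIN = "admin"
    APPROVE = "approve"
    DEPLOY = "deploy"
    SCALE = "scale"
    SILENCE_ALERT = "silence_alert"


# Hypothetical role definitions; the real ones may differ.
ROLE_PERMISSIONS: dict[str, frozenset[Permission]] = {
    "editor": frozenset({Permission.READ, Permission.WRITE}),
    "deployer": frozenset({Permission.READ, Permission.DEPLOY}),
    "on-call": frozenset({Permission.READ, Permission.SILENCE_ALERT}),
}


def role_allows(role: str, permission: Permission) -> bool:
    """Unknown roles grant nothing."""
    return permission in ROLE_PERMISSIONS.get(role, frozenset())
```

With this table an editor can change a runbook but cannot deploy, and a deployer can ship a release without being able to edit anything. Because the enum mixes in `str`, members also compare equal to their raw values, which keeps serialized permissions easy to check.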

Implementation

Scoped Permission Checker

The core permission checker evaluates access within context:

from typing import Any


class ScopedPermissionChecker:
    def check_scoped_permission(
        self,
        user: UserInfo,
        required_permission: Permission,
        resource_context: dict[str, Any],  # {"team": "team-alpha", "environment": "production"}
    ) -> bool:
        # 1. Get user's scope bindings
        bindings = self._get_user_bindings(user.email)

        # 2. Filter to active (non-expired) bindings
        active = [b for b in bindings if not b.is_expired]

        # 3. Check if any binding grants access
        for binding in active:
            if self._scope_matches_context(binding.scope, resource_context):
                if required_permission in self._get_role_permissions(binding.role):
                    return True

        return False
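The matching step is where scoping actually happens, so here is a self-contained sketch of one plausible rule. It assumes a binding's scope can be written as a set of key/value requirements that must all appear in the resource context; the `Binding` class, `scope_matches_context`, `check`, and the role table are hypothetical stand-ins for the private helpers above, not their real implementations.

```python
from dataclasses import dataclass

# Hypothetical role table for the example.
ROLE_PERMISSIONS = {"editor": {"read", "write"}, "viewer": {"read"}}


@dataclass
class Binding:
    role: str
    scope: dict[str, str]  # every key/value must match the resource context


def scope_matches_context(scope: dict[str, str], context: dict[str, str]) -> bool:
    """Match only when every scope requirement is present in the context."""
    if not scope:
        # Fail closed: an empty scope would otherwise match everything.
        return False
    return all(context.get(key) == value for key, value in scope.items())


def check(bindings: list[Binding], permission: str, context: dict[str, str]) -> bool:
    """Grant access if any binding matches the context and carries the permission."""
    return any(
        scope_matches_context(b.scope, context)
        and permission in ROLE_PERMISSIONS.get(b.role, set())
        for b in bindings
    )


alice = [
    Binding("editor", {"team": "team-alpha", "environment": "staging"}),
    Binding("viewer", {"team": "team-alpha", "environment": "production"}),
]
```

Rejecting an empty scope is a deliberate choice in this sketch: `all()` over no requirements is true, so without the guard a misconfigured binding would quietly become a global grant.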

Audit Logging

Every permission check is logged with structured fields:

def _log_permission_check(self, user, permission, context, result):
    log_data = {
        "user_email": user.email,
        "permission": permission.value,
        "resource_context": context,
        "result": "granted" if result else "denied",
        "scope_path": self._build_scope_path(context),
    }

    # Grants are informational; denials are logged at warning level
    if result:
        logger.info("Permission granted", extra=log_data)
    else:
        logger.warning("Permission denied", extra=log_data)

Usage in Routes

Protecting endpoints with scoped authorization:

# FastAPI and SQLAlchemy imports are assumed from the idioms used below
from fastapi import Depends, HTTPException
from sqlalchemy.orm import Session

from core.auth.scoped_rbac import ScopedPermissionChecker, create_scope_resolver

# app, Permission, UserInfo, get_current_user, and get_db are defined
# elsewhere in the application.

team_resolver = create_scope_resolver(
    scope_type="team",
    extract_scope=lambda r: r.path_params.get("team_id"),
)

@app.delete("/teams/{team_id}/runbooks/{runbook_id}")
async def delete_runbook(
    team_id: str,
    runbook_id: str,
    user: UserInfo = Depends(get_current_user),
    db: Session = Depends(get_db),
):
    checker = ScopedPermissionChecker(db)

    if not checker.check_scoped_permission(
        user, Permission.DELETE, {"team": team_id}
    ):
        raise HTTPException(status_code=403, detail="Not authorized for this team")

    # Safe to proceed: user has DELETE in this team's scope
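The body of `create_scope_resolver` is not shown in this post. As a rough guess at its shape, the sketch below treats it as a factory that turns a request into the resource context the checker expects. Everything here is hypothetical, including the fail-closed behavior and the `SimpleNamespace` stand-in for a request object.

```python
from types import SimpleNamespace
from typing import Any, Callable, Optional


def create_scope_resolver(
    scope_type: str,
    extract_scope: Callable[[Any], Optional[str]],
) -> Callable[[Any], dict[str, str]]:
    """Build a function that turns a request into a resource context."""

    def resolve(request: Any) -> dict[str, str]:
        value = extract_scope(request)
        if value is None:
            # Fail closed: without a scope there is nothing to authorize against.
            raise ValueError(f"request carries no {scope_type!r} scope")
        return {scope_type: value}

    return resolve


team_resolver = create_scope_resolver(
    scope_type="team",
    extract_scope=lambda r: r.path_params.get("team_id"),
)

# Stand-in for a framework request; only path_params is needed here.
fake_request = SimpleNamespace(path_params={"team_id": "team-alpha", "runbook_id": "rb-1"})
context = team_resolver(fake_request)
```

Under these assumptions `context` is `{"team": "team-alpha"}`, which could be passed straight to `check_scoped_permission` instead of building the dictionary by hand in every route.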

Results

Before: Global Permissions

alice@example.com → editor → {READ, WRITE}  # Everywhere!

After: Scoped Bindings

alice@example.com → editor → Team-Alpha/staging     → {READ, WRITE}
alice@example.com → viewer → Team-Alpha/production → {READ}
alice@example.com → (none) → Team-Beta/* → (no access)

Audit Trail

{
  "timestamp": "2025-01-16T10:30:00Z",
  "user_email": "alice@example.com",
  "permission": "write",
  "resource_context": {"team": "team-alpha", "environment": "staging"},
  "result": "granted",
  "scope_path": "team:team-alpha -> environment:staging"
}
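A record like the one above can be produced with the standard library alone, because keys passed through `extra` become attributes on the `LogRecord`. The formatter below is a minimal sketch under that assumption; the `AuditJsonFormatter` class, the `AUDIT_FIELDS` tuple, and the `audit_sketch` logger name are illustrative and not part of our codebase.

```python
import io
import json
import logging
from datetime import datetime, timezone

AUDIT_FIELDS = ("user_email", "permission", "resource_context", "result", "scope_path")


class AuditJsonFormatter(logging.Formatter):
    """Render the audit fields attached via `extra` as one JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        payload = {"timestamp": created.strftime("%Y-%m-%dT%H:%M:%SZ")}
        for field in AUDIT_FIELDS:
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        return json.dumps(payload)


# Write to an in-memory stream so the example has no side effects.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(AuditJsonFormatter())

logger = logging.getLogger("audit_sketch")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info(
    "Permission granted",
    extra={
        "user_email": "alice@example.com",
        "permission": "write",
        "resource_context": {"team": "team-alpha", "environment": "staging"},
        "result": "granted",
        "scope_path": "team:team-alpha -> environment:staging",
    },
)
entry = json.loads(stream.getvalue())
```

Keeping the field list explicit means unrelated attributes on the record never leak into the audit output.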

Lessons Learned

  1. Start with scopes, not roles: If we'd designed with scopes from day one, the global-permission mistake wouldn't have happened.

  2. Operational verbs matter: CRUD permissions don't capture SRE workflows. DEPLOY, SCALE, and SILENCE_ALERT are fundamentally different from WRITE.

  3. Audit everything: The compliance team was delighted. Every permission check now has a paper trail.

  4. Time-bound access enables break-glass: On-call engineers get elevated permissions that auto-expire after their shift.

  5. Hierarchy simplifies management: Org → Team → Project scope hierarchy means fewer bindings to manage.

Migration Path

For existing users, we created a migration that:

  1. Seeds default scopes (development, staging, production, default org)
  2. Keeps existing global roles working (backward compatible)
  3. Lets new scoped bindings be added incrementally
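The migration steps above can be sketched as an idempotent seed plus a check that falls back to the legacy global role. This is an illustration under stated assumptions: the scope types assigned to the defaults, the in-memory `existing` set, the `is_allowed` helper, and the legacy role table are all hypothetical.

```python
from typing import Optional

# Assumed (scope_type, scope_value) pairs for the seeded defaults.
DEFAULT_SCOPES = [
    ("environment", "development"),
    ("environment", "staging"),
    ("environment", "production"),
    ("organization", "default"),
]

LEGACY_ROLE_PERMISSIONS = {"editor": {"read", "write"}, "viewer": {"read"}}


def seed_default_scopes(existing: set[tuple[str, str]]) -> list[tuple[str, str]]:
    """Add any missing default scopes and return the ones created. Safe to re-run."""
    created = [scope for scope in DEFAULT_SCOPES if scope not in existing]
    existing.update(created)
    return created


def is_allowed(
    permission: str,
    context: dict[str, str],
    scoped_grants: list[tuple[dict[str, str], set[str]]],
    legacy_role: Optional[str],
) -> bool:
    """Check scoped grants first, then fall back to the legacy global role."""
    for scope, permissions in scoped_grants:
        in_scope = bool(scope) and all(context.get(k) == v for k, v in scope.items())
        if in_scope and permission in permissions:
            return True
    return permission in LEGACY_ROLE_PERMISSIONS.get(legacy_role or "", set())


existing_scopes: set[tuple[str, str]] = set()
first_run = seed_default_scopes(existing_scopes)
second_run = seed_default_scopes(existing_scopes)
```

One consequence of a fallback like this is worth keeping in mind: while a user still holds a legacy global role, that role continues to grant system-wide access, so the blast radius only shrinks once the global role is removed in favor of scoped bindings.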

What's Next

With scoped RBAC in place, we can now:

  • Implement capability-level permissions (per-entity access)
  • Add break-glass workflows with automatic scope escalation
  • Build a permission management UI for team leads
  • Enable self-service scope requests with approval workflows

This post details the implementation of ADR-032: RBAC Model, resolving Issue #450.