Scoped Role Bindings: From Global Chaos to Team-Level Control
Published: 2025-01-16
When the LLM Council reviewed our RBAC implementation (ADR-032), the verdict was unanimous and uncomfortable: REQUEST_CHANGES. The core problem? Our "simple" role model was a ticking time bomb. A user with WRITE permission could modify any resource in the system—Team A's runbooks, Team B's SLOs, production configurations they had no business touching.
This post details how we addressed Issue #450 by implementing scoped role bindings, transforming our authorization model from a liability into a proper multi-tenant foundation.
The Problem
Our original RBAC model was deceptively simple:
from enum import Enum


class Permission(Enum):
    READ = "read"
    WRITE = "write"
    DELETE = "delete"
    ADMIN = "admin"
Users were assigned roles (admin, editor, viewer) that mapped to these permissions globally. The LLM Council identified several critical gaps:
- No Resource Scoping: A Team-A editor could modify Team-B's runbooks
- No Environment Separation: A user couldn't be Admin in staging but Viewer in production
- Missing Operational Verbs: No DEPLOY, SCALE, or SILENCE_ALERT permissions
- No Audit Trail: Permission checks weren't logged for compliance
The "blast radius" of any permission was the entire system. For an SRE platform managing production infrastructure, this was unacceptable.
The Solution
We implemented a three-layer scoping model:
User → Role → Scope → Permission
Scope Model
Scopes define boundaries for resource access:
class AuthScope(Base):
    __tablename__ = "auth_scopes"

    scope_type: Mapped[str]   # team, environment, service, organization
    scope_value: Mapped[str]  # team-alpha, production, api-gateway
    parent_id: Mapped[str]    # For hierarchy (org → team → project)
The hierarchy enables inheritance—organization-level access flows down to teams within that org.
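One way to picture the inheritance rule is a walk up the `parent_id` chain from the resource's scope until the granted scope is found. The sketch below is illustrative only: the `Scope` dataclass, the `scope_covers` helper, the string IDs, and the convention that a root scope has `parent_id=None` are all assumptions, not the platform's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scope:
    id: str
    scope_type: str
    scope_value: str
    parent_id: Optional[str] = None  # assumption: None marks a root scope


def scope_covers(granted_id: str, target_id: str, scopes: dict[str, Scope]) -> bool:
    """Return True if target is the granted scope or one of its descendants."""
    seen: set[str] = set()
    current: Optional[str] = target_id
    while current is not None and current not in seen:
        if current == granted_id:
            return True
        seen.add(current)  # guard against accidental cycles in parent links
        node = scopes.get(current)
        current = node.parent_id if node else None
    return False


# Hypothetical scope tree: one organization with two teams.
scopes = {
    "org-1": Scope("org-1", "organization", "acme"),
    "team-a": Scope("team-a", "team", "team-alpha", parent_id="org-1"),
    "team-b": Scope("team-b", "team", "team-beta", parent_id="org-1"),
}

org_grant_reaches_team = scope_covers("org-1", "team-a", scopes)
team_grant_reaches_sibling = scope_covers("team-a", "team-b", scopes)
```

Access flows downward only: a grant on the organization reaches both teams, while a grant on one team reaches neither its sibling nor its parent.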
Scoped Role Bindings
Instead of global role assignments, users now have scoped bindings:
class AuthRoleScopeBinding(Base):
    __tablename__ = "auth_role_scope_bindings"

    user_email: Mapped[str]       # Who
    role_id: Mapped[str]          # What role
    scope_id: Mapped[str]         # Where (team, env, etc.)
    expires_at: Mapped[datetime]  # Time-bound access
    assigned_by: Mapped[str]      # Audit: who granted this
    reason: Mapped[str]           # Audit: why granted
This enables scenarios like:
- Alice: Admin in Team-Alpha, Viewer in Team-Beta
- Bob: Deployer in staging, Reader in production
- Carol: On-call engineer with SILENCE_ALERT for 8 hours
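Carol's case can be sketched as a binding whose `expires_at` is set eight hours after the shift starts. This is a minimal illustration using a plain dataclass rather than the ORM model; the `Binding` class, the `is_active` helper, the email addresses, the role and scope names, and the rule that `None` means "no expiry" are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Binding:
    user_email: str
    role_id: str
    scope_id: str
    expires_at: Optional[datetime]  # assumption: None means no expiry
    assigned_by: str
    reason: str


def is_active(binding: Binding, now: datetime) -> bool:
    """A binding is active until its expiry time is reached."""
    return binding.expires_at is None or now < binding.expires_at


shift_start = datetime(2025, 1, 16, 9, 0, tzinfo=timezone.utc)
carol = Binding(
    user_email="carol@example.com",      # hypothetical user
    role_id="on-call",                   # hypothetical role name
    scope_id="production",
    expires_at=shift_start + timedelta(hours=8),
    assigned_by="lead@example.com",      # hypothetical approver
    reason="On-call shift",
)

during_shift = is_active(carol, shift_start + timedelta(hours=7))
after_shift = is_active(carol, shift_start + timedelta(hours=8))
```

Passing `now` explicitly and using timezone-aware datetimes keeps the expiry check easy to test and avoids comparing naive and aware timestamps.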
Operational Verbs
We added SRE-specific permissions:
class Permission(str, Enum):
    # Basic CRUD
    READ = "read"
    WRITE = "write"
    DELETE = "delete"
    EXECUTE = "execute"
    ADMIN = "admin"

    # Operational verbs (Issue #450)
    APPROVE = "approve"              # Approve deployments, changes
    DEPLOY = "deploy"                # Deploy to environments
    SCALE = "scale"                  # Scale services
    SILENCE_ALERT = "silence_alert"  # Silence during maintenance
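To show why separate verbs matter, the sketch below maps a few roles to permission sets in which WRITE and DEPLOY are independent. The enum is trimmed to five members for brevity, and the role names and their permission sets are hypothetical; the post does not list the real role catalogue. It also shows a side effect of the `str` mixin: members compare equal to their string values and serialize to JSON without a custom encoder.

```python
import json
from enum import Enum


class Permission(str, Enum):
    READ = "read"
    WRITE = "write"
    DEPLOY = "deploy"
    SCALE = "scale"
    SILENCE_ALERT = "silence_alert"


# Hypothetical role definitions, for illustration only.
ROLE_PERMISSIONS: dict[str, frozenset[Permission]] = {
    "viewer": frozenset({Permission.READ}),
    "editor": frozenset({Permission.READ, Permission.WRITE}),
    "deployer": frozenset({Permission.READ, Permission.DEPLOY, Permission.SCALE}),
    "on-call": frozenset({Permission.READ, Permission.SILENCE_ALERT}),
}

# An editor can change runbooks but cannot ship them; a deployer is the reverse.
editor_can_deploy = Permission.DEPLOY in ROLE_PERMISSIONS["editor"]
deployer_can_write = Permission.WRITE in ROLE_PERMISSIONS["deployer"]

# The str mixin lets members round-trip through JSON and plain-string comparisons.
serialized = json.dumps({"permission": Permission.DEPLOY})
```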
Implementation
Scoped Permission Checker
The core permission checker evaluates access within context:
class ScopedPermissionChecker:
    def check_scoped_permission(
        self,
        user: UserInfo,
        required_permission: Permission,
        resource_context: dict[str, Any],  # e.g. {"team": "team-alpha", "environment": "staging"}
    ) -> bool:
        # 1. Get user's scope bindings
        bindings = self._get_user_bindings(user.email)

        # 2. Filter to active (non-expired) bindings
        active = [b for b in bindings if not b.is_expired]

        # 3. Check if any binding grants access
        for binding in active:
            if self._scope_matches_context(binding.scope, resource_context):
                if required_permission in self._get_role_permissions(binding.role):
                    return True
        return False
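The checker relies on `_scope_matches_context`, which is not shown. One plausible way to implement the match, sketched here under assumptions, is to represent a binding's scope as its path of `(type, value)` pairs and require the resource context to agree with every pair. The `ScopePath` alias and the standalone `scope_matches_context` function are hypothetical stand-ins, not the platform's code.

```python
from typing import Any

# A scope expressed as its path from the root, e.g.
# (("team", "team-alpha"), ("environment", "staging")).
ScopePath = tuple[tuple[str, str], ...]


def scope_matches_context(scope_path: ScopePath, resource_context: dict[str, Any]) -> bool:
    """Match when the context agrees with every (type, value) pair on the path."""
    # An empty path would match every resource, so reject it explicitly.
    if not scope_path:
        return False
    return all(
        resource_context.get(scope_type) == scope_value
        for scope_type, scope_value in scope_path
    )


alpha = (("team", "team-alpha"),)
alpha_staging = (("team", "team-alpha"), ("environment", "staging"))

staging_request = {"team": "team-alpha", "environment": "staging"}
production_request = {"team": "team-alpha", "environment": "production"}
```

Under this rule a team-level scope covers every environment inside the team, while a team-plus-environment scope covers only that environment. A context that omits a key never matches a scope that requires it, which keeps the check deny-by-default.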
Audit Logging
Every permission check is logged with structured fields:
def _log_permission_check(self, user, permission, context, result):
    log_data = {
        "user_email": user.email,
        "permission": permission.value,
        "resource_context": context,
        "result": "granted" if result else "denied",
        "scope_path": self._build_scope_path(context),
    }
    # Plain strings: the messages contain no placeholders, so no f-prefix is needed
    if result:
        logger.info("Permission granted", extra=log_data)
    else:
        logger.warning("Permission denied", extra=log_data)
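Fields passed through `extra` become attributes on the `LogRecord`, but the default formatter does not print them. The sketch below shows one way to surface them as JSON using only the standard library; the `JsonAuditFormatter` class, the `AUDIT_FIELDS` tuple, and the logger name are hypothetical, and the post does not say which formatter the platform uses.

```python
import io
import json
import logging

AUDIT_FIELDS = ("user_email", "permission", "resource_context", "result", "scope_path")


class JsonAuditFormatter(logging.Formatter):
    """Render the message plus any audit fields supplied via `extra` as JSON."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {"level": record.levelname, "message": record.getMessage()}
        for field in AUDIT_FIELDS:
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        return json.dumps(payload)


# Capture output in memory so the example has no side effects.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonAuditFormatter())

logger = logging.getLogger("audit.sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

logger.warning(
    "Permission denied",
    extra={
        "user_email": "alice@example.com",
        "permission": "delete",
        "resource_context": {"team": "team-beta"},
        "result": "denied",
        "scope_path": "team:team-beta",
    },
)

entry = json.loads(stream.getvalue())
```

Keys in `extra` must not collide with built-in `LogRecord` attribute names such as `message` or `name`; the field names used here avoid that.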
Usage in Routes
Protecting endpoints with scoped authorization:
from fastapi import Depends, HTTPException

from core.auth.scoped_rbac import ScopedPermissionChecker, create_scope_resolver

team_resolver = create_scope_resolver(
    scope_type="team",
    extract_scope=lambda r: r.path_params.get("team_id"),
)


@app.delete("/teams/{team_id}/runbooks/{runbook_id}")
async def delete_runbook(
    team_id: str,
    runbook_id: str,
    user: UserInfo = Depends(get_current_user),
    db: Session = Depends(get_db),
):
    checker = ScopedPermissionChecker(db)
    if not checker.check_scoped_permission(
        user, Permission.DELETE, {"team": team_id}
    ):
        raise HTTPException(status_code=403, detail="Not authorized for this team")
    # Safe to proceed: user has DELETE in this team's scope
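The route example builds `team_resolver` with `create_scope_resolver` but does not show what a resolver returns. One plausible shape, sketched here with a hypothetical stand-in named `make_scope_resolver`, is a function that turns a request into the resource context dictionary the checker expects. The real implementation in `core.auth.scoped_rbac` may differ; only the two keyword arguments are taken from the call above.

```python
from types import SimpleNamespace
from typing import Any, Callable, Optional


def make_scope_resolver(
    scope_type: str,
    extract_scope: Callable[[Any], Optional[str]],
) -> Callable[[Any], dict[str, str]]:
    """Return a function that maps a request to a resource context."""

    def resolve(request: Any) -> dict[str, str]:
        value = extract_scope(request)
        # With no value, return an empty context so the caller denies by default.
        return {scope_type: value} if value else {}

    return resolve


team_resolver = make_scope_resolver(
    scope_type="team",
    extract_scope=lambda r: r.path_params.get("team_id"),
)

# Stand-in for a framework request object that exposes path_params.
fake_request = SimpleNamespace(path_params={"team_id": "team-alpha"})
context = team_resolver(fake_request)
```

Deriving the context from the request in one place means individual routes do not have to rebuild `{"team": team_id}` by hand, which reduces the chance of a route checking the wrong scope.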
Results
Before: Global Permissions
alice@example.com → editor → {READ, WRITE} # Everywhere!
After: Scoped Bindings
alice@example.com → editor → Team-Alpha/staging → {READ, WRITE}
alice@example.com → viewer → Team-Alpha/production → {READ}
alice@example.com → (none) → Team-Beta/* → (no access)
Audit Trail
{
  "timestamp": "2025-01-16T10:30:00Z",
  "user_email": "alice@example.com",
  "permission": "write",
  "resource_context": {"team": "team-alpha", "environment": "staging"},
  "result": "granted",
  "scope_path": "team:team-alpha -> environment:staging"
}
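The `scope_path` field can be derived from the resource context alone. The sketch below reproduces the format shown in the log entry; the `SCOPE_ORDER` tuple and the `build_scope_path` function are assumptions, and in particular the relative order of service and environment is a guess, since the post only shows team followed by environment.

```python
# Assumed ordering from broadest to narrowest scope type.
SCOPE_ORDER = ("organization", "team", "service", "environment")


def build_scope_path(resource_context: dict[str, str]) -> str:
    """Join the known scope types in a fixed order, e.g. 'team:x -> environment:y'."""
    parts = [
        f"{scope_type}:{resource_context[scope_type]}"
        for scope_type in SCOPE_ORDER
        if scope_type in resource_context
    ]
    return " -> ".join(parts)


path = build_scope_path({"environment": "staging", "team": "team-alpha"})
```

Using a fixed order rather than dictionary order keeps the path stable across callers, which makes audit entries easier to group and search.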
Lessons Learned
- Start with scopes, not roles: If we'd designed with scopes from day one, the global-permission mistake wouldn't have happened.
- Operational verbs matter: CRUD permissions don't capture SRE workflows. DEPLOY, SCALE, and SILENCE_ALERT are fundamentally different from WRITE.
- Audit everything: The compliance team was delighted. Every permission check now has a paper trail.
- Time-bound access enables break-glass: On-call engineers get elevated permissions that auto-expire after their shift.
- Hierarchy simplifies management: Org → Team → Project scope hierarchy means fewer bindings to manage.
Migration Path
For existing users, we created a migration that:
- Seeds default scopes (development, staging, production, default org)
- Keeps existing global roles working (backward compatible)
- Allows new scoped bindings to be added incrementally
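One way backward compatibility could work during the transition is to try the scoped check first and fall back to the legacy global role only when no scoped binding grants access. This is a sketch under that assumption; the `check_with_fallback` helper and its return labels are hypothetical, and the post does not describe the actual fallback logic.

```python
from typing import Callable


def check_with_fallback(
    scoped_check: Callable[[], bool],
    legacy_global_check: Callable[[], bool],
) -> tuple[bool, str]:
    """Try the scoped check first, then the legacy global role."""
    if scoped_check():
        return True, "scoped"
    if legacy_global_check():
        # Still granted, but through the old global role.
        return True, "legacy-global"
    return False, "none"


# A user with no scoped binding yet who still holds a global editor role.
granted, source = check_with_fallback(
    scoped_check=lambda: False,
    legacy_global_check=lambda: True,
)
```

Recording which path granted access makes it possible to find users who still depend on global roles. Those grants keep the original system-wide blast radius until they are replaced by scoped bindings.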
What's Next
With scoped RBAC in place, we can now:
- Implement capability-level permissions (per-entity access)
- Add break-glass workflows with automatic scope escalation
- Build a permission management UI for team leads
- Enable self-service scope requests with approval workflows
This post details the implementation of ADR-032: RBAC Model, resolving Issue #450.
