ADR-006: Permission System Redesign

Status

Accepted (implemented March 2026, PRs #494, #497, #498)

Context

A performance investigation (March 2026) revealed that every authenticated API request spends 80–160ms on authorization database queries — looking up data that rarely changes (user identity, roles, permissions). Combined with the switch from transaction-mode to session-mode pooling (which cut response times 30–50%), auth overhead is now the dominant bottleneck.

A comprehensive audit of the permission system revealed accumulated complexity beyond business requirements, redundant data, and dead code. This document captures all findings and proposes a redesign.

Investigation Findings

Finding 1: Auth overhead per request

Every authenticated request makes 2–3 DB round trips (~40ms each) before any business logic runs:

Step	Query	Cost
`resolveAppUserId`	`SELECT id FROM users WHERE auth_user_id = $1`	~40ms
`getUserRoles` (inside `hasPermission`)	`SELECT user + roles WHERE auth_user_id = $1`	~40ms
`userPermission.findMany` (inside `hasPermission`)	`SELECT * FROM user_permissions WHERE user_id = $1`	~40ms

checkEntityPermission() calls hasPermission() up to 4 times (location scope, resource scope, group scope, base fallback), so a single entity permission check can fire up to 8 DB queries.

Production Sentry traces (post-session-mode switch):

Endpoint	Duration	DB queries	Auth overhead
users/me	261ms	2	~80ms (31%)
profile	318ms	4	~120ms (38%)
account/overview	582ms	20	~160ms (27%)
reservations (10)	487ms	7	~80ms (16%)

Finding 2: The permission grammar is over-engineered

Permissions use a dot-separated grammar: domain.action.subjectScope.resourceScopeType.resourceId (e.g., reservations.cancel.any.location.<uuid>). A wildcard engine supports * at any position. A PermissionCatalog validates all strings against canonical patterns.

However, GLOBAL_ROLE_PERMISSIONS and SCOPED_ROLE_PERMISSIONS — the maps that define what each role can do — are static constants in code. They are not stored in the database, not editable through a UI, and never change at runtime. The grammar/wildcard/catalog machinery exists solely to match permission strings against these static lists.

Every route permission check can be equivalently expressed as: "does this user have role X, optionally scoped to location Y?"

Key files:

apps/api/src/utils/auth/permission-check.ts — hasPermission(), role maps, wildcard matching (~700 lines)
apps/api/src/lib/permissions/PermissionCatalog.ts — grammar definition, parsing, validation
apps/api/src/lib/permissions/entity-permission-check.ts — entity-scoped wrappers

Finding 3: Dynamic permission grants are fully redundant

The user_permissions table stores explicit per-user permission grants with validity windows. It is written from three places:

Stripe purchase webhooks → grants reservations.create.own (credit purchase), reservations.create.own + resources.view.any (subscription), waitlists.view.own + reservations.create.own (waitlist)
Lockout management → grants reservations.create.own
Admin UI → manual grants (ADMIN-only)

The USER role (assigned to every user at signup) already has reservations.*.own and waitlists.*.own, which cover all dynamically-granted permissions via wildcard matching.

The resources.view.any grant given on subscription purchase is an antipattern — there is no reason to gate viewing resources behind a permission. Resources are publicly visible.

The "paywall" (credits or Stripe payment required to book) is enforced at the business logic layer (checkout validates payment method/balance), not the permission layer.

Production data (queried 2026-03-14):

Metric	Value
Total active grants	108
Redundant (covered by role)	108 (100%)

Breakdown of active grants:

85× reservations.create.own — USER role covers via reservations.*.own
18× waitlists.view.own — USER role covers via waitlists.*.own
3× * — ADMIN role covers
1× resources.view.any — antipattern (no reason to gate resource viewing)
1× resources.<uuid>.access — not a valid catalog permission, test artifact

Finding 4: Dead code

Code	Status
`requirePermission()` middleware	Defined, never used by any route
`checkOwnershipPermission()`	Defined, never called from any route or service
`PermissionService.processPurchasePermissions()`	Writes redundant grants
`PermissionService.handleSubscriptionCancellation()`	Revokes redundant grants
`PermissionService.grantResourcePermission()`	Writes redundant grants
`users/[id]/role/route.ts` (PATCH)	Dead from frontend — zero callers in web app

Finding 5: The JWT already has the role — but not enough

app_metadata.role exists in the Supabase JWT, set via supabase.auth.admin.updateUserById(). The code reads it in getAuthenticatedUser(). However:

It's a single string (e.g., "ADMIN") — doesn't include scoped roles
app_user_id (the app's internal UUID) is not in the token at all
No Custom Access Token Hook is configured (config.toml has it commented out)
The code ignores the JWT role for authorization and queries the DB instead

Finding 6: Supabase recommends JWT claims for this

Supabase's official RBAC documentation recommends:

Store roles in a DB table (source of truth)
Embed roles in JWT via Custom Access Token Hook at token issuance
Read roles from JWT at request time — zero DB lookups

Their example uses a single role per user embedded via jsonb_set(claims, '{user_role}', to_jsonb(user_role)). Our case extends this to multiple scoped roles via jsonb_agg. The staleness window (up to 1 hour until token refresh) is accepted as a reasonable tradeoff for most applications.

Supabase recommends a profiles table where profiles.id = auth.users.id (same UUID). Our system diverges — users.id is a separate app-generated UUID bridged to auth.users.id via users.auth_user_id. This divergence is the reason resolveAppUserId() exists. A follow-up UUID unification project will eliminate this indirection (see Finding 13).

Finding 7: Permission checks are really role checks

Audit of all ~100 route files shows permission checks fall into clear categories:

Pattern	Count	Equivalent role check
Admin-only operations (`credits.adjust.any`, etc.)	~25 routes	`isAdmin(claims)`
Staff operations (`users.list.any`, `pricing.view.any`, etc.)	~30 routes	`isStaffOrAbove(claims)`
Location-scoped operations (`reservations.cancel.any.location.<id>`)	~22 routes	`isStaffOrAboveAt(claims, locationId)`
Own-vs-any operations (`reservations.*.own` / `.any`)	~18 routes	Ownership check + `isStaffOrAboveAt` fallback
Own-only (no permission check needed)	~5 routes	Just `isAuthenticated`

Finding 8: Query execution is not the bottleneck

EXPLAIN ANALYZE on the slowest production queries shows all execute in <13ms on the database. The ~40ms per query visible in traces is almost entirely network round trip between Vercel (us-west-2) and Supabase (us-west-1). Reducing round trips is the only lever.

Finding 9: Session-mode pooling eliminated protocol overhead

Switching DATABASE_URL from port 6543 (transaction mode + pgbouncer=true) to port 5432 (session mode) eliminated BEGIN/DEALLOCATE ALL/COMMIT wrapping per query, cutting response times 30–50% across all endpoints. This was the single biggest performance win.

Finding 10: PARTNER role is active

The PARTNER role is a real, active role used for the referral program. It has a defined permission set (users.view.own, users.update.own, transactions.view.own) and is involved in live business logic:

Stripe webhook creates PartnerReferralPayout records for PARTNER-owned referral codes
Staff referral codes UI filters by PARTNER owner type
Admin can assign PARTNER role via user creation and role update endpoints
Seeded and used in test/dev environments

The role-check module must handle PARTNER.

Finding 11: STAFF vs COMMUNITY_MANAGER differences are nuanced

The original assessment grouped STAFF and COMMUNITY_MANAGER together as "staff or above." The actual role maps show significant differences:

Capability	STAFF	COMMUNITY_MANAGER
`organizations.*`	Yes (global)	No
`location-groups.view.*`	Yes (global)	No
`resources.list.*`	Yes (global)	No
`users.invite.any`	Yes (global)	No
`pricing.money.create.any`	Yes (global)	No
`security.codes.*`	Yes (global)	No
`resource-groups.members.manage.any`	Yes (global)	No
`media.*`	Yes (global)	No
`dashboard.*.any`	No	Yes (global + scoped)
`users.approve.any`	No	Yes (scoped only)
`security.impersonate.any`	No	Yes (global)
`reservations.create.any`	Yes (global)	Scoped only

Resource/location write operations (resources.create/update/delete.any, locations.create/update/delete.any) are ADMIN-only at global scope. STAFF and CM get them only when location-scoped via SCOPED_ROLE_PERMISSIONS.

migration.list.any is not in any role's permission list — only ADMIN can access migration routes (via the * wildcard).

Finding 12: New user race condition is impossible

The public.users record is created by a Postgres AFTER INSERT trigger (on_auth_user_created) on auth.users, running within the same transaction. By the time Supabase issues a token (after the auth.users INSERT commits), the public.users row already exists. The Custom Access Token Hook will find the user row on first token issuance for all users.

Finding 13: Dual-UUID architecture creates unnecessary indirection

Our users table uses its own id (app-generated UUID) with auth_user_id as a FK to auth.users. Supabase recommends profiles.id = auth.users.id (same UUID). Our pattern means:

Every request needs resolveAppUserId() to translate auth UUID → app UUID (~40ms)
~35 routes do where: { authUserId: auth.userId } Prisma lookups
~17 routes do entity.user.authUserId === auth.userId ownership comparisons
Two UUID spaces to reason about across the codebase

No Supabase admin API call (supabase.auth.admin.*) depends on auth.userId directly — they all resolve user.authUserId from Prisma first.

Deferred: A follow-up UUID unification project will make users.id = auth.users.id, eliminating resolveAppUserId(), the authUserId column, and all auth-to-app UUID translation. This is out of scope for the permission redesign but will provide the final ~40ms reduction.

Finding 14: Impersonation permission check is a no-op bug

admin/impersonate/route.ts queries userPermission with permission: { in: ['ADMIN', 'STAFF'] } — checking if role names exist as permission strings. They never do (these are role names, not valid permission strings). This check always returns zero rows and provides no protection. The intent was to prevent impersonating admin users.

Finding 15: `app_metadata.role` has limited but real usage

The single app_metadata.role string is read in 6 frontend files (post-login redirects, checkout context, resource display) and several API files (JWT validation, auth user filtering). It is written by only one endpoint (users/[id]/role/route.ts, which is dead from the frontend) and one migration script. Since we are embedding the full roles array in JWT claims, this field becomes redundant and should be removed rather than maintained for backward compatibility.

Finding 16: `user_permissions` write paths are fully cataloged

Only four permission strings are ever written automatically:

Permission	Written by	Redundant?
`reservations.create.own`	Stripe webhooks, lockout fulfillment	Yes — USER role has `reservations.*.own`
`resources.view.any`	Subscription purchase	Yes — antipattern, no reason to gate
`waitlists.view.own`	Waitlist purchase	Yes — USER role has `waitlists.*.own`
(any string)	Admin manual grant	Must audit production before cutover

Dashboard stats/revenue routes parse .location.<id> from permission strings to extract scoped locations. This should read from user_roles scoped entries instead.

Decision

Redesign the permission system to:

Embed role assignments in JWT via Custom Access Token Hook
Replace DB-based permission checks with in-memory role checks against JWT claims
Eliminate the permission grammar, wildcard engine, and PermissionCatalog
Stop writing to user_permissions from Stripe webhooks and lockout management
Remove app_metadata.role field and the dead legacy endpoint that writes it
Keep entity-level ownership checks unchanged (still compare authUserId — deferred to UUID unification)
Keep resolveAppUserId() for audit context (deferred to UUID unification)

Explicitly deferred to UUID unification: Changing auth.userId from Supabase auth UUID to app UUID, updating ~35 Prisma lookups and ~17 ownership comparisons. This avoids throwaway work since the UUID unification will eliminate the dual-UUID architecture entirely.

Consequences

Positive:

Auth overhead drops from 80–160ms to ~40ms on every authenticated request (permission DB queries eliminated; resolveAppUserId remains until UUID unification)
Permission checks become synchronous, pure functions — easier to test, no DB dependency
~1500 lines of grammar/wildcard/catalog/permission-service code removed
Simpler mental model: "what role do you have?" instead of "does your permission string match this wildcard pattern?"
app_metadata.role removed — single source of truth for roles is user_roles table embedded in JWT

Negative:

Role changes take up to 1 hour to take effect (token refresh cycle) — mitigated by forced refresh on role change
Migration touches ~100 route files (mechanical but large surface area)

Neutral:

user_permissions table kept temporarily but deprecated — no new writes, must audit production for manual admin grants before cutover
Test infrastructure needs updating (mock claims instead of DB role registration)
Ownership checks and authUserId Prisma lookups stay unchanged (deferred to UUID unification)

Implementation Context

This section provides the detail an implementing agent needs to understand the current system.

Current auth flow (per request)

Entry point: validateAuthentication(req) in apps/api/src/lib/validation/auth/auth-validator.ts

1. Test mode? → inject hardcoded userId, skip everything
2. Impersonation header? → look up ImpersonationSession in DB → resolve target user
3. Supabase auth → validateSupabaseAuth(req) → getClaims(token) → extract sub + app_metadata.role
4. resolveAppUserId(authUserId) → SELECT id FROM users WHERE auth_user_id = $1  ← DB HIT
5. Store audit context on request object
6. Return { userId: authUserId, email, isAuthenticated }

Key: auth.userId returned to routes is the Supabase auth UUID, NOT the app's users.id. The resolveAppUserId maps it to the app UUID for audit context only. Then hasPermission() does its own lookup by authUserId again internally. This remains unchanged in this project — the UUID mapping moves to the follow-up UUID unification.

Current JWT structure

{
  "sub": "supabase-auth-uuid",
  "aud": "authenticated",
  "role": "authenticated",
  "email": "user@example.com",
  "app_metadata": {
    "role": "STAFF",
    "provider": "email",
    "providers": ["email"]
  },
  "user_metadata": { "name": "..." }
}

app_metadata.role is set manually via supabase.auth.admin.updateUserById() in apps/api/src/app/api/users/[id]/role/route.ts (dead endpoint — zero frontend callers). No Custom Access Token Hook exists — config.toml has the section commented out.

How routes use auth today (typical pattern)

// 1. Authenticate
const auth = await validateAuthentication(req)
if (auth.error) return auth.error

// 2. Get request-scoped Prisma client
const reqPrisma = getRequestPrismaClient(req as ExtendedRequest)

// 3. Check permission (DB query)
const canDo = await hasPermission(auth.userId || '', 'domain.action.scope', reqPrisma)
if (!canDo) return createErrorResponse('Insufficient permissions', { type: ErrorType.FORBIDDEN, status: 403 })

// 4. Business logic

For entity-scoped checks (e.g., cancel reservation):

const isOwn = reservation.user?.authUserId === auth.userId
const permission = isOwn ? 'reservations.cancel.own' : 'reservations.cancel.any'
const canCancel = await checkEntityPermission(
  auth.userId || '',
  permission,
  { locationId: reservation.resource?.locationId, resourceId: reservation.resourceId },
  tx
)

Test infrastructure

registerTestUserRole(userId, role) / registerTestUserScopedRole(userId, role, scopeType, scopeId) — in-memory role registration for unit tests
checkTestModeFallback() — checks in-memory maps before DB lookup
Integration tests use x-test-user-id + x-test-role headers
API_TEST_MODE=true + API_TEST_MODE_ROLE=ADMIN for dev server
STRICT_PERMISSION_TESTS=true makes permission failures throw instead of returning false

Impersonation

Table: ImpersonationSession { id, adminUserId, targetUserId, expiresAt, endedAt }

Created via POST /api/admin/impersonate (requires security.impersonate.any)
Requests with x-impersonate: <sessionId> header operate as target user
impersonatorId stored in audit context
Impersonation bypasses normal auth flow — looks up target user directly from DB
This flow should remain DB-based (it needs to verify the session is valid and not expired)
Bug: The target-is-admin check queries userPermission for role name strings ('ADMIN', 'STAFF') which are not valid permission strings. This check is a no-op and should be fixed to query user_roles instead.

account/overview uses a different pattern

apps/api/src/app/api/account/overview/route.ts doesn't use hasPermission(). It calls getAllEffectivePermissions() once to get the full permission set, then uses checkPermissionAgainstList() (in-memory) to decide which sections to include. This is actually closer to the target design — it fetches once and checks in-memory.

Supabase local dev config

apps/api/supabase/config.toml:

jwt_expiry = 3600 (1 hour)
enable_refresh_token_rotation = true
Google OAuth enabled
Custom Access Token Hook section commented out — needs uncommenting and configuring

Verified schema: `user_roles` table

Confirmed via Prisma schema and migration (20260125222753_add_user_roles):

scope_type: TEXT, nullable
scope_id: UUID, nullable
Indexed: (scope_type, scope_id), (user_id), (deleted_at)
Unique constraint: (user_id, role, scope_type, scope_id)

Files to understand

File	What it does
`apps/api/src/lib/validation/auth/auth-validator.ts`	`validateAuthentication()`, `resolveAppUserId()`, impersonation
`apps/api/src/lib/auth/supabase/server.ts`	`validateSupabaseAuth()`, `getAuthenticatedUser()`, `getClaims()`
`apps/api/src/utils/auth/permission-check.ts`	`hasPermission()`, `getUserRoles()`, role maps, wildcard matching, test mode
`apps/api/src/lib/permissions/PermissionCatalog.ts`	Grammar, `parsePermission()`, `buildPermission()`, `isValidPermission()`
`apps/api/src/lib/permissions/entity-permission-check.ts`	`checkEntityPermission()`, `checkOwnershipPermission()` (dead)
`apps/api/src/lib/api/middleware/require-permission.ts`	`requirePermission()` (dead)
`apps/api/src/services/shared/PermissionService.ts`	`processPurchasePermissions()`, `grantResourcePermission()` (redundant)
`apps/api/src/app/api/account/overview/route.ts`	Uses `getAllEffectivePermissions()` + `checkPermissionAgainstList()` pattern
`packages/shared-types/src/auth/index.ts`	`Role` enum, `JwtClaims`, `UserRoleAssignment` types
`apps/api/supabase/config.toml`	Local Supabase config, hook configuration
`apps/api/src/app/api/users/[id]/roles/route.ts`	Role assignment (POST), already calls `updateUserById`
`apps/api/src/app/api/users/[id]/role/route.ts`	Legacy single-role update — dead from frontend, to be deleted
`apps/api/src/lib/auth/jwt/jwt-utils.ts`	JWT claim validation/extraction, reads `app_metadata.role`

Complete permission → role equivalence table

Current hasPermission string → what role check it's actually doing, verified against GLOBAL_ROLE_PERMISSIONS and SCOPED_ROLE_PERMISSIONS:

ADMIN-only (via * wildcard — no other role has these globally): credits.adjust.any, credits.create.any, credits.update.any, credits.delete.any, credits.transfer.any, permissions.view.any, permissions.assign.any, permissions.revoke.any, users.role.update.any, users.ban.any, users.delete.any, resources.restore.any, access-gates.create.any, access-gates.update.any, access-gates.delete.any, location-groups.create.any, location-groups.delete.any, location-groups.update.any, pricing.create.any, pricing.delete.any, transactions.create.any, migration.list.any, products.delete.any

STAFF (global — ADMIN + STAFF, but NOT COMMUNITY_MANAGER): users.invite.any, organizations.*, location-groups.view.*, resources.list.*, pricing.money.create.any, security.codes.view.any, security.codes.create.any, resource-groups.members.manage.any, media.*, reservations.create.any (global)

STAFF_OR_ABOVE (ADMIN + STAFF + COMMUNITY_MANAGER): users.list.any, users.view.any, users.update.any, users.create.any, locations.view.*, resources.view.*, assets.*.any, pricing.view.*, credits.view.any, credits.list.any, transactions.view.any, transactions.list.any, reservations.view.any, reservations.list.any, inquiries.*.any, checkout.create.any

COMMUNITY_MANAGER-specific (ADMIN + CM, but NOT STAFF): dashboard.*.any, security.impersonate.any

Location-scoped (STAFF + CM via SCOPED_ROLE_PERMISSIONS): reservations.cancel.any, reservations.noshow.any, reservations.update.any, waitlists.*.any, locations.*.any (write ops), resources.*.any (write ops)

CM-scoped-only (CM via SCOPED_ROLE_PERMISSIONS, not STAFF): users.approve.any, dashboard.*.any

Own-data (USER + PARTNER — any authenticated user): reservations.*.own, credits.*.own, waitlists.*.own, users.view.own, users.update.own, transactions.*.own, purchases.*.own, checkout.create.own

PARTNER-only: users.view.own, users.update.own, transactions.view.own (subset of USER)

References

Supabase Custom Claims & RBAC Guide — Official RBAC pattern: roles in DB → embed in JWT via hook → read from claims
Supabase Custom Access Token Hook — Hook API: inputs (user_id, claims, authentication_method), outputs (modified claims), required/optional JWT fields
Supabase Auth Hooks Overview — Postgres function vs HTTP endpoint, lifecycle, registration, local dev config
Supabase JWT Claims Reference — Default JWT structure, app_metadata vs user_metadata, what's user-editable vs admin-only
Supabase Managing User Data — Recommended profiles.id = auth.users.id pattern; our system diverges with separate users.id
GitHub Discussion #34925: Scoped Roles per Group — Community approach to per-org/location roles; DB-backed authorize() vs JWT-embedded roles
GitHub Discussion #1148: Multi-tenant Custom Claims — raw_app_metadata pattern, staleness concerns acknowledged by maintainers, helper function approach
Supabase RLS Guide — Defense-in-depth at DB layer (not applicable to Prisma connections using postgres role)
Supabase PgBouncer / Connection Pooling — Prisma's pgbouncer=true behavior: forces BEGIN/DEALLOCATE ALL/COMMIT per query
Supavisor Named Prepared Statements (Issue #239) — Why session mode works but transaction mode forces overhead
JWT Best Practices — LogRocket — General JWT claims vs DB lookup tradeoffs
Serverless Auth Strategies — Token claims vs per-request DB lookup in serverless contexts