Skip to content

ADR-006: Permission System Redesign

Status

Accepted (implemented March 2026, PRs #494, #497, #498)

Context

A performance investigation (March 2026) revealed that every authenticated API request spends 80–160ms on authorization database queries — looking up data that rarely changes (user identity, roles, permissions). Combined with the switch from transaction-mode to session-mode pooling (which cut response times 30–50%), auth overhead is now the dominant bottleneck.

A comprehensive audit of the permission system revealed accumulated complexity beyond business requirements, redundant data, and dead code. This document captures all findings and proposes a redesign.

Investigation Findings

Finding 1: Auth overhead per request

Every authenticated request makes 2–3 DB round trips (~40ms each) before any business logic runs:

Step Query Cost
resolveAppUserId SELECT id FROM users WHERE auth_user_id = $1 ~40ms
getUserRoles (inside hasPermission) SELECT user + roles WHERE auth_user_id = $1 ~40ms
userPermission.findMany (inside hasPermission) SELECT * FROM user_permissions WHERE user_id = $1 ~40ms

checkEntityPermission() calls hasPermission() up to 4 times (location scope, resource scope, group scope, base fallback), so a single entity permission check can fire up to 8 DB queries.

Production Sentry traces (post-session-mode switch):

Endpoint Duration DB queries Auth overhead
users/me 261ms 2 ~80ms (31%)
profile 318ms 4 ~120ms (38%)
account/overview 582ms 20 ~160ms (27%)
reservations (10) 487ms 7 ~80ms (16%)

Finding 2: The permission grammar is over-engineered

Permissions use a dot-separated grammar: domain.action.subjectScope.resourceScopeType.resourceId (e.g., reservations.cancel.any.location.<uuid>). A wildcard engine supports * at any position. A PermissionCatalog validates all strings against canonical patterns.

However, GLOBAL_ROLE_PERMISSIONS and SCOPED_ROLE_PERMISSIONS — the maps that define what each role can do — are static constants in code. They are not stored in the database, not editable through a UI, and never change at runtime. The grammar/wildcard/catalog machinery exists solely to match permission strings against these static lists.

Every route permission check can be equivalently expressed as: "does this user have role X, optionally scoped to location Y?"

Key files:

  • apps/api/src/utils/auth/permission-check.tshasPermission(), role maps, wildcard matching (~700 lines)
  • apps/api/src/lib/permissions/PermissionCatalog.ts — grammar definition, parsing, validation
  • apps/api/src/lib/permissions/entity-permission-check.ts — entity-scoped wrappers

Finding 3: Dynamic permission grants are fully redundant

The user_permissions table stores explicit per-user permission grants with validity windows. It is written from three places:

  1. Stripe purchase webhooks → grants reservations.create.own (credit purchase), reservations.create.own + resources.view.any (subscription), waitlists.view.own + reservations.create.own (waitlist)
  2. Lockout management → grants reservations.create.own
  3. Admin UI → manual grants (ADMIN-only)

The USER role (assigned to every user at signup) already has reservations.*.own and waitlists.*.own, which cover all dynamically-granted permissions via wildcard matching.

The resources.view.any grant given on subscription purchase is an antipattern — there is no reason to gate viewing resources behind a permission. Resources are publicly visible.

The "paywall" (credits or Stripe payment required to book) is enforced at the business logic layer (checkout validates payment method/balance), not the permission layer.

Production data (queried 2026-03-14):

Metric Value
Total active grants 108
Redundant (covered by role) 108 (100%)

Breakdown of active grants:

  • 85× reservations.create.own — USER role covers via reservations.*.own
  • 18× waitlists.view.own — USER role covers via waitlists.*.own
  • * — ADMIN role covers
  • resources.view.any — antipattern (no reason to gate resource viewing)
  • resources.<uuid>.access — not a valid catalog permission, test artifact

Finding 4: Dead code

Code Status
requirePermission() middleware Defined, never used by any route
checkOwnershipPermission() Defined, never called from any route or service
PermissionService.processPurchasePermissions() Writes redundant grants
PermissionService.handleSubscriptionCancellation() Revokes redundant grants
PermissionService.grantResourcePermission() Writes redundant grants
users/[id]/role/route.ts (PATCH) Dead from frontend — zero callers in web app

Finding 5: The JWT already has the role — but not enough

app_metadata.role exists in the Supabase JWT, set via supabase.auth.admin.updateUserById(). The code reads it in getAuthenticatedUser(). However:

  • It's a single string (e.g., "ADMIN") — doesn't include scoped roles
  • app_user_id (the app's internal UUID) is not in the token at all
  • No Custom Access Token Hook is configured (config.toml has it commented out)
  • The code ignores the JWT role for authorization and queries the DB instead

Finding 6: Supabase recommends JWT claims for this

Supabase's official RBAC documentation recommends:

  1. Store roles in a DB table (source of truth)
  2. Embed roles in JWT via Custom Access Token Hook at token issuance
  3. Read roles from JWT at request time — zero DB lookups

Their example uses a single role per user embedded via jsonb_set(claims, '{user_role}', to_jsonb(user_role)). Our case extends this to multiple scoped roles via jsonb_agg. The staleness window (up to 1 hour until token refresh) is accepted as a reasonable tradeoff for most applications.

Supabase recommends a profiles table where profiles.id = auth.users.id (same UUID). Our system diverges — users.id is a separate app-generated UUID bridged to auth.users.id via users.auth_user_id. This divergence is the reason resolveAppUserId() exists. A follow-up UUID unification project will eliminate this indirection (see Finding 13).

Finding 7: Permission checks are really role checks

Audit of all ~100 route files shows permission checks fall into clear categories:

Pattern Count Equivalent role check
Admin-only operations (credits.adjust.any, etc.) ~25 routes isAdmin(claims)
Staff operations (users.list.any, pricing.view.any, etc.) ~30 routes isStaffOrAbove(claims)
Location-scoped operations (reservations.cancel.any.location.<id>) ~22 routes isStaffOrAboveAt(claims, locationId)
Own-vs-any operations (reservations.*.own / .any) ~18 routes Ownership check + isStaffOrAboveAt fallback
Own-only (no permission check needed) ~5 routes Just isAuthenticated

Finding 8: Query execution is not the bottleneck

EXPLAIN ANALYZE on the slowest production queries shows all execute in <13ms on the database. The ~40ms per query visible in traces is almost entirely network round trip between Vercel (us-west-2) and Supabase (us-west-1). Reducing round trips is the only lever.

Finding 9: Session-mode pooling eliminated protocol overhead

Switching DATABASE_URL from port 6543 (transaction mode + pgbouncer=true) to port 5432 (session mode) eliminated BEGIN/DEALLOCATE ALL/COMMIT wrapping per query, cutting response times 30–50% across all endpoints. This was the single biggest performance win.

Finding 10: PARTNER role is active

The PARTNER role is a real, active role used for the referral program. It has a defined permission set (users.view.own, users.update.own, transactions.view.own) and is involved in live business logic:

  • Stripe webhook creates PartnerReferralPayout records for PARTNER-owned referral codes
  • Staff referral codes UI filters by PARTNER owner type
  • Admin can assign PARTNER role via user creation and role update endpoints
  • Seeded and used in test/dev environments

The role-check module must handle PARTNER.

Finding 11: STAFF vs COMMUNITY_MANAGER differences are nuanced

The original assessment grouped STAFF and COMMUNITY_MANAGER together as "staff or above." The actual role maps show significant differences:

Capability STAFF COMMUNITY_MANAGER
organizations.* Yes (global) No
location-groups.view.* Yes (global) No
resources.list.* Yes (global) No
users.invite.any Yes (global) No
pricing.money.create.any Yes (global) No
security.codes.* Yes (global) No
resource-groups.members.manage.any Yes (global) No
media.* Yes (global) No
dashboard.*.any No Yes (global + scoped)
users.approve.any No Yes (scoped only)
security.impersonate.any No Yes (global)
reservations.create.any Yes (global) Scoped only

Resource/location write operations (resources.create/update/delete.any, locations.create/update/delete.any) are ADMIN-only at global scope. STAFF and CM get them only when location-scoped via SCOPED_ROLE_PERMISSIONS.

migration.list.any is not in any role's permission list — only ADMIN can access migration routes (via the * wildcard).

Finding 12: New user race condition is impossible

The public.users record is created by a Postgres AFTER INSERT trigger (on_auth_user_created) on auth.users, running within the same transaction. By the time Supabase issues a token (after the auth.users INSERT commits), the public.users row already exists. The Custom Access Token Hook will find the user row on first token issuance for all users.

Finding 13: Dual-UUID architecture creates unnecessary indirection

Our users table uses its own id (app-generated UUID) with auth_user_id as a FK to auth.users. Supabase recommends profiles.id = auth.users.id (same UUID). Our pattern means:

  • Every request needs resolveAppUserId() to translate auth UUID → app UUID (~40ms)
  • ~35 routes do where: { authUserId: auth.userId } Prisma lookups
  • ~17 routes do entity.user.authUserId === auth.userId ownership comparisons
  • Two UUID spaces to reason about across the codebase

No Supabase admin API call (supabase.auth.admin.*) depends on auth.userId directly — they all resolve user.authUserId from Prisma first.

Deferred: A follow-up UUID unification project will make users.id = auth.users.id, eliminating resolveAppUserId(), the authUserId column, and all auth-to-app UUID translation. This is out of scope for the permission redesign but will provide the final ~40ms reduction.

Finding 14: Impersonation permission check is a no-op bug

admin/impersonate/route.ts queries userPermission with permission: { in: ['ADMIN', 'STAFF'] } — checking if role names exist as permission strings. They never do (these are role names, not valid permission strings). This check always returns zero rows and provides no protection. The intent was to prevent impersonating admin users.

Finding 15: app_metadata.role has limited but real usage

The single app_metadata.role string is read in 6 frontend files (post-login redirects, checkout context, resource display) and several API files (JWT validation, auth user filtering). It is written by only one endpoint (users/[id]/role/route.ts, which is dead from the frontend) and one migration script. Since we are embedding the full roles array in JWT claims, this field becomes redundant and should be removed rather than maintained for backward compatibility.

Finding 16: user_permissions write paths are fully cataloged

Only four permission strings are ever written automatically:

Permission Written by Redundant?
reservations.create.own Stripe webhooks, lockout fulfillment Yes — USER role has reservations.*.own
resources.view.any Subscription purchase Yes — antipattern, no reason to gate
waitlists.view.own Waitlist purchase Yes — USER role has waitlists.*.own
(any string) Admin manual grant Must audit production before cutover

Dashboard stats/revenue routes parse .location.<id> from permission strings to extract scoped locations. This should read from user_roles scoped entries instead.

Decision

Redesign the permission system to:

  1. Embed role assignments in JWT via Custom Access Token Hook
  2. Replace DB-based permission checks with in-memory role checks against JWT claims
  3. Eliminate the permission grammar, wildcard engine, and PermissionCatalog
  4. Stop writing to user_permissions from Stripe webhooks and lockout management
  5. Remove app_metadata.role field and the dead legacy endpoint that writes it
  6. Keep entity-level ownership checks unchanged (still compare authUserId — deferred to UUID unification)
  7. Keep resolveAppUserId() for audit context (deferred to UUID unification)

Explicitly deferred to UUID unification: Changing auth.userId from Supabase auth UUID to app UUID, updating ~35 Prisma lookups and ~17 ownership comparisons. This avoids throwaway work since the UUID unification will eliminate the dual-UUID architecture entirely.

Consequences

Positive:

  • Auth overhead drops from 80–160ms to ~40ms on every authenticated request (permission DB queries eliminated; resolveAppUserId remains until UUID unification)
  • Permission checks become synchronous, pure functions — easier to test, no DB dependency
  • ~1500 lines of grammar/wildcard/catalog/permission-service code removed
  • Simpler mental model: "what role do you have?" instead of "does your permission string match this wildcard pattern?"
  • app_metadata.role removed — single source of truth for roles is user_roles table embedded in JWT

Negative:

  • Role changes take up to 1 hour to take effect (token refresh cycle) — mitigated by forced refresh on role change
  • Migration touches ~100 route files (mechanical but large surface area)

Neutral:

  • user_permissions table kept temporarily but deprecated — no new writes, must audit production for manual admin grants before cutover
  • Test infrastructure needs updating (mock claims instead of DB role registration)
  • Ownership checks and authUserId Prisma lookups stay unchanged (deferred to UUID unification)

Implementation Context

This section provides the detail an implementing agent needs to understand the current system.

Current auth flow (per request)

Entry point: validateAuthentication(req) in apps/api/src/lib/validation/auth/auth-validator.ts

1. Test mode? → inject hardcoded userId, skip everything
2. Impersonation header? → look up ImpersonationSession in DB → resolve target user
3. Supabase auth → validateSupabaseAuth(req) → getClaims(token) → extract sub + app_metadata.role
4. resolveAppUserId(authUserId) → SELECT id FROM users WHERE auth_user_id = $1  ← DB HIT
5. Store audit context on request object
6. Return { userId: authUserId, email, isAuthenticated }

Key: auth.userId returned to routes is the Supabase auth UUID, NOT the app's users.id. The resolveAppUserId maps it to the app UUID for audit context only. Then hasPermission() does its own lookup by authUserId again internally. This remains unchanged in this project — the UUID mapping moves to the follow-up UUID unification.

Current JWT structure

{
  "sub": "supabase-auth-uuid",
  "aud": "authenticated",
  "role": "authenticated",
  "email": "user@example.com",
  "app_metadata": {
    "role": "STAFF",
    "provider": "email",
    "providers": ["email"]
  },
  "user_metadata": { "name": "..." }
}

app_metadata.role is set manually via supabase.auth.admin.updateUserById() in apps/api/src/app/api/users/[id]/role/route.ts (dead endpoint — zero frontend callers). No Custom Access Token Hook exists — config.toml has the section commented out.

How routes use auth today (typical pattern)

// 1. Authenticate
const auth = await validateAuthentication(req)
if (auth.error) return auth.error

// 2. Get request-scoped Prisma client
const reqPrisma = getRequestPrismaClient(req as ExtendedRequest)

// 3. Check permission (DB query)
const canDo = await hasPermission(auth.userId || '', 'domain.action.scope', reqPrisma)
if (!canDo) return createErrorResponse('Insufficient permissions', { type: ErrorType.FORBIDDEN, status: 403 })

// 4. Business logic

For entity-scoped checks (e.g., cancel reservation):

const isOwn = reservation.user?.authUserId === auth.userId
const permission = isOwn ? 'reservations.cancel.own' : 'reservations.cancel.any'
const canCancel = await checkEntityPermission(
  auth.userId || '',
  permission,
  { locationId: reservation.resource?.locationId, resourceId: reservation.resourceId },
  tx
)

Test infrastructure

  • registerTestUserRole(userId, role) / registerTestUserScopedRole(userId, role, scopeType, scopeId) — in-memory role registration for unit tests
  • checkTestModeFallback() — checks in-memory maps before DB lookup
  • Integration tests use x-test-user-id + x-test-role headers
  • API_TEST_MODE=true + API_TEST_MODE_ROLE=ADMIN for dev server
  • STRICT_PERMISSION_TESTS=true makes permission failures throw instead of returning false

Impersonation

Table: ImpersonationSession { id, adminUserId, targetUserId, expiresAt, endedAt }

  • Created via POST /api/admin/impersonate (requires security.impersonate.any)
  • Requests with x-impersonate: <sessionId> header operate as target user
  • impersonatorId stored in audit context
  • Impersonation bypasses normal auth flow — looks up target user directly from DB
  • This flow should remain DB-based (it needs to verify the session is valid and not expired)
  • Bug: The target-is-admin check queries userPermission for role name strings ('ADMIN', 'STAFF') which are not valid permission strings. This check is a no-op and should be fixed to query user_roles instead.

account/overview uses a different pattern

apps/api/src/app/api/account/overview/route.ts doesn't use hasPermission(). It calls getAllEffectivePermissions() once to get the full permission set, then uses checkPermissionAgainstList() (in-memory) to decide which sections to include. This is actually closer to the target design — it fetches once and checks in-memory.

Supabase local dev config

apps/api/supabase/config.toml:

  • jwt_expiry = 3600 (1 hour)
  • enable_refresh_token_rotation = true
  • Google OAuth enabled
  • Custom Access Token Hook section commented out — needs uncommenting and configuring

Verified schema: user_roles table

Confirmed via Prisma schema and migration (20260125222753_add_user_roles):

  • scope_type: TEXT, nullable
  • scope_id: UUID, nullable
  • Indexed: (scope_type, scope_id), (user_id), (deleted_at)
  • Unique constraint: (user_id, role, scope_type, scope_id)

Files to understand

File What it does
apps/api/src/lib/validation/auth/auth-validator.ts validateAuthentication(), resolveAppUserId(), impersonation
apps/api/src/lib/auth/supabase/server.ts validateSupabaseAuth(), getAuthenticatedUser(), getClaims()
apps/api/src/utils/auth/permission-check.ts hasPermission(), getUserRoles(), role maps, wildcard matching, test mode
apps/api/src/lib/permissions/PermissionCatalog.ts Grammar, parsePermission(), buildPermission(), isValidPermission()
apps/api/src/lib/permissions/entity-permission-check.ts checkEntityPermission(), checkOwnershipPermission() (dead)
apps/api/src/lib/api/middleware/require-permission.ts requirePermission() (dead)
apps/api/src/services/shared/PermissionService.ts processPurchasePermissions(), grantResourcePermission() (redundant)
apps/api/src/app/api/account/overview/route.ts Uses getAllEffectivePermissions() + checkPermissionAgainstList() pattern
packages/shared-types/src/auth/index.ts Role enum, JwtClaims, UserRoleAssignment types
apps/api/supabase/config.toml Local Supabase config, hook configuration
apps/api/src/app/api/users/[id]/roles/route.ts Role assignment (POST), already calls updateUserById
apps/api/src/app/api/users/[id]/role/route.ts Legacy single-role update — dead from frontend, to be deleted
apps/api/src/lib/auth/jwt/jwt-utils.ts JWT claim validation/extraction, reads app_metadata.role

Complete permission → role equivalence table

Current hasPermission string → what role check it's actually doing, verified against GLOBAL_ROLE_PERMISSIONS and SCOPED_ROLE_PERMISSIONS:

ADMIN-only (via * wildcard — no other role has these globally): credits.adjust.any, credits.create.any, credits.update.any, credits.delete.any, credits.transfer.any, permissions.view.any, permissions.assign.any, permissions.revoke.any, users.role.update.any, users.ban.any, users.delete.any, resources.restore.any, access-gates.create.any, access-gates.update.any, access-gates.delete.any, location-groups.create.any, location-groups.delete.any, location-groups.update.any, pricing.create.any, pricing.delete.any, transactions.create.any, migration.list.any, products.delete.any

STAFF (global — ADMIN + STAFF, but NOT COMMUNITY_MANAGER): users.invite.any, organizations.*, location-groups.view.*, resources.list.*, pricing.money.create.any, security.codes.view.any, security.codes.create.any, resource-groups.members.manage.any, media.*, reservations.create.any (global)

STAFF_OR_ABOVE (ADMIN + STAFF + COMMUNITY_MANAGER): users.list.any, users.view.any, users.update.any, users.create.any, locations.view.*, resources.view.*, assets.*.any, pricing.view.*, credits.view.any, credits.list.any, transactions.view.any, transactions.list.any, reservations.view.any, reservations.list.any, inquiries.*.any, checkout.create.any

COMMUNITY_MANAGER-specific (ADMIN + CM, but NOT STAFF): dashboard.*.any, security.impersonate.any

Location-scoped (STAFF + CM via SCOPED_ROLE_PERMISSIONS): reservations.cancel.any, reservations.noshow.any, reservations.update.any, waitlists.*.any, locations.*.any (write ops), resources.*.any (write ops)

CM-scoped-only (CM via SCOPED_ROLE_PERMISSIONS, not STAFF): users.approve.any, dashboard.*.any

Own-data (USER + PARTNER — any authenticated user): reservations.*.own, credits.*.own, waitlists.*.own, users.view.own, users.update.own, transactions.*.own, purchases.*.own, checkout.create.own

PARTNER-only: users.view.own, users.update.own, transactions.view.own (subset of USER)

References