ADR-014: Two-Plane Architecture Split

Status: Accepted
Date: 2026-03-26
Deciders: AI Engineering NAMER
Supersedes: Partially refines ADR-005 (ALB JWT for inference path remains unchanged)

ADR-005 chose ALB JWT validation over API Gateway to eliminate per-request costs on the inference path. The C-Series refinements (C.1-C.5) introduce admin and secondary APIs:

  • C.2 Usage API — teams query their own usage (read-only)
  • C.3 Pricing Admin — operators manage dynamic pricing overrides (CRUD)
  • Team Registration — self-service team onboarding
  • Budget Admin — budget CRUD
  • Routing Config — routing rule management
  • Content Scanner — guardrails configuration

These admin endpoints were initially deployed as Lambda Function URLs with hand-rolled JWT validation (validate_admin_scope() in each handler). This created two problems:

  1. Auth duplication: Every admin handler re-implemented JWT extraction, scope validation, and error formatting. A bug in one handler’s auth check could silently bypass authorization.
  2. Inconsistent enforcement: Lambda Function URLs have no built-in auth layer — authorization depends entirely on application code running correctly.
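To make the first problem concrete, the sketch below contrasts two hypothetical versions of a per-handler scope check. Only the name validate_admin_scope comes from this ADR. The function signature, the scope names, and the assumption that the token was already verified and decoded into a claims dict are illustrative, not the actual handler code.

```python
def validate_admin_scope_buggy(claims: dict, required: str) -> bool:
    # Bug: substring match against the raw scope string, so a longer
    # scope that merely starts with the required one is accepted.
    return required in claims.get("scope", "")


def validate_admin_scope(claims: dict, required: str) -> bool:
    # Cognito access tokens carry scopes as one space-delimited string,
    # so split it and compare whole scope names.
    granted = claims.get("scope", "").split()
    return required in granted


# Hypothetical claims: the caller holds a different, longer scope.
claims = {"scope": "openid admin/pricing.readonly"}
required = "admin/pricing.read"

print(validate_admin_scope_buggy(claims, required))  # accepts the request
print(validate_admin_scope(claims, required))        # rejects the request
```

Both versions look plausible in review, which is the risk when the same check is copied into six handlers and each copy can drift independently.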

Split the architecture into two planes:

| Plane | Transport | Auth | Traffic Pattern |
|---|---|---|---|
| Inference | ALB with validate_token | ALB-native JWT validation | High-volume, latency-sensitive |
| Admin | API Gateway REST API | Cognito Authorizer (COGNITO_USER_POOLS) | Low-volume, correctness-sensitive |

All admin endpoints move behind a single API Gateway REST API with a Cognito authorizer. The ALB continues handling the inference path (/v1/chat/completions, /v1/messages).

| Path | Lambda | Purpose |
|---|---|---|
| /teams | team_registration | Self-service onboarding |
| /budgets | budget_admin | Budget CRUD |
| /routing | routing_config | Routing rule management |
| /scanner | content_scanner | Guardrails configuration |
| /pricing | pricing_admin | Dynamic pricing overrides |
| /usage | usage_api | Real-time usage self-service |

Each path prefix gets a {proxy+} child resource for sub-paths, with ANY methods and AWS_PROXY Lambda integrations.
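As a rough sketch of what a handler might look like behind this setup, the example below reads the sub-path and caller identity from a REST API proxy event. With an AWS_PROXY integration and a COGNITO_USER_POOLS authorizer, API Gateway passes the greedy path segment in pathParameters["proxy"] and the validated token claims in requestContext.authorizer.claims. The handler body, the sample event values, and the response shape are assumptions for illustration.

```python
import json


def handler(event, context):
    """Hypothetical admin handler; no JWT parsing or scope check here."""
    request_context = event.get("requestContext") or {}
    authorizer = request_context.get("authorizer") or {}
    # Claims were validated by the gateway before this code runs.
    claims = authorizer.get("claims") or {}

    # pathParameters is None when the prefix itself is called (e.g. /pricing).
    sub_path = (event.get("pathParameters") or {}).get("proxy", "")

    result = {
        "method": event.get("httpMethod"),
        "sub_path": sub_path,
        "caller": claims.get("sub"),
    }
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(result),
    }


# Trimmed-down sample event with made-up values.
sample_event = {
    "httpMethod": "GET",
    "pathParameters": {"proxy": "overrides/123"},
    "requestContext": {"authorizer": {"claims": {"sub": "user-123"}}},
}
response = handler(sample_event, None)
print(response["body"])
```

Trusting the claims this way is only safe if the Lambda cannot be invoked through another route, for example a leftover Function URL.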

Positive:

  • Auth is enforced once at the gateway layer — individual Lambda handlers drop their auth code. API Gateway rejects unauthorized requests before they reach Lambda.
  • Single Cognito authorizer with authorization_scopes covers all admin endpoints uniformly.
  • Admin APIs gain API Gateway features for free: access logging, CloudWatch metrics, request throttling, WAF attachment if needed later.
  • Feature-flagged via enable_admin_api variable — can be enabled per environment.

Negative:

  • API Gateway adds ~10-15ms latency to admin calls (acceptable for admin traffic).
  • API Gateway REST API cost: $3.50/million requests. At admin-level traffic (<10K req/day), cost is negligible (~$1/month).
  • One more infrastructure module to maintain (modules/admin_api).
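The cost estimate above can be checked with a short calculation. This is a back-of-the-envelope sketch that assumes a flat $3.50 per million requests and a 30-day month, and ignores data transfer and any free-tier allowance. The function name is made up for the example.

```python
PRICE_PER_MILLION_USD = 3.50  # REST API request price quoted in this ADR


def monthly_request_cost(requests_per_day: float, days: int = 30) -> float:
    """Estimated monthly API Gateway request cost in USD."""
    monthly_requests = requests_per_day * days
    return monthly_requests / 1_000_000 * PRICE_PER_MILLION_USD


# Upper bound for admin traffic: 10K requests/day is 300K requests/month.
upper_bound = monthly_request_cost(10_000)
print(f"${upper_bound:.2f}/month")  # prints "$1.05/month"
```

Since admin traffic is stated to stay below 10K requests per day, roughly $1 per month is a ceiling rather than an expected figure.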

Neutral:

  • Lambda handlers retain their business logic unchanged — only the auth check is removed.
  • The inference path (ALB) is unaffected.

Alternatives considered:

  1. Keep Lambda Function URLs with per-handler auth: rejected because auth duplication is a security liability and maintenance burden.
  2. API Gateway HTTP API: considered, but REST API provides a native COGNITO_USER_POOLS authorizer with built-in authorization_scopes enforcement. HTTP API’s JWT authorizer requires a custom Lambda authorizer to enforce Cognito scopes.
  3. Single API Gateway for everything (inference + admin): rejected per ADR-005 reasoning. API Gateway on the inference path adds $260-2,400/month and 10-15ms latency for zero benefit, since ALB JWT validation handles it natively.