ADR-014: Two-Plane Architecture Split
Status: Accepted
Date: 2026-03-26
Deciders: AI Engineering NAMER
Supersedes: Partially refines ADR-005 (ALB JWT for inference path remains unchanged)
Context
ADR-005 chose ALB JWT validation over API Gateway to eliminate per-request costs on the inference path. The C-Series refinements (C.1-C.5) introduce admin and secondary APIs:
- C.2 Usage API — teams query their own usage (read-only)
- C.3 Pricing Admin — operators manage dynamic pricing overrides (CRUD)
- Team Registration — self-service team onboarding
- Budget Admin — budget CRUD
- Routing Config — routing rule management
- Content Scanner — guardrails configuration
These admin endpoints were initially deployed as Lambda Function URLs with hand-rolled JWT validation (validate_admin_scope() in each handler). This created two problems:
- Auth duplication: Every admin handler re-implemented JWT extraction, scope validation, and error formatting. A bug in one handler’s auth check could silently bypass authorization.
- Inconsistent enforcement: Lambda Function URLs have no built-in auth layer — authorization depends entirely on application code running correctly.
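To make the bypass risk concrete, the following is a hypothetical sketch of how two hand-written copies of a scope check can drift apart. The function names, the `admin` scope string, and the space-delimited `scope` claim are assumptions for illustration; this is not the project's actual validate_admin_scope() code.

```python
# Hypothetical illustration only: two hand-rolled scope checks that look alike.
# The scope name "admin" and the space-delimited "scope" claim are assumptions.


def has_admin_scope_strict(claims: dict) -> bool:
    """Exact-match check: split the scope claim into tokens first."""
    return "admin" in claims.get("scope", "").split()


def has_admin_scope_buggy(claims: dict) -> bool:
    """Substring check: accepts any scope string that merely contains 'admin'."""
    return "admin" in claims.get("scope", "")


claims = {"scope": "usage/read admin-readonly"}
print(has_admin_scope_strict(claims))  # False
print(has_admin_scope_buggy(claims))   # True (over-permissive)
```

Both functions pass a happy-path test with `{"scope": "admin"}`, which is why a difference like this can go unnoticed when each handler carries its own copy.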
Decision
Split the architecture into two planes:
| Plane | Transport | Auth | Traffic Pattern |
|---|---|---|---|
| Inference | ALB with validate_token | ALB-native JWT validation | High-volume, latency-sensitive |
| Admin | API Gateway REST API | Cognito Authorizer (COGNITO_USER_POOLS) | Low-volume, correctness-sensitive |
All admin endpoints move behind a single API Gateway REST API with a Cognito authorizer. The ALB continues handling the inference path (/v1/chat/completions, /v1/messages).
Admin API Route Map
| Path | Lambda | Purpose |
|---|---|---|
| /teams | team_registration | Self-service onboarding |
| /budgets | budget_admin | Budget CRUD |
| /routing | routing_config | Routing rule management |
| /scanner | content_scanner | Guardrails configuration |
| /pricing | pricing_admin | Dynamic pricing overrides |
| /usage | usage_api | Real-time usage self-service |
Each path prefix gets a {proxy+} child resource for sub-paths, with ANY methods and AWS_PROXY Lambda integrations.
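As a rough sketch of what a handler behind one of these prefixes might look like, the example below dispatches on the HTTP method and the greedy `{proxy+}` sub-path from a Lambda proxy integration event. The handler body, the `budget_id` field, and the response shapes are hypothetical; only the event fields `httpMethod` and `pathParameters` come from the proxy integration format.

```python
import json


def handler(event: dict, context: object) -> dict:
    """Dispatch on HTTP method and the greedy {proxy+} sub-path."""
    method = event["httpMethod"]
    # pathParameters is None when the request hits the bare prefix (e.g. /budgets).
    sub_path = (event.get("pathParameters") or {}).get("proxy", "")
    segments = [s for s in sub_path.split("/") if s]

    if method == "GET" and len(segments) == 1:
        # Hypothetical placeholder for a real lookup by ID.
        body = {"budget_id": segments[0]}
        return {"statusCode": 200, "body": json.dumps(body)}

    return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
```

Because the method is ANY and the sub-path is greedy, routing below the prefix is the handler's job; API Gateway only selects which Lambda receives the request.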
Consequences
Positive:
- Auth is enforced once at the gateway layer — individual Lambda handlers drop their auth code. API Gateway rejects unauthorized requests before they reach Lambda.
- A single Cognito authorizer with authorization_scopes covers all admin endpoints uniformly.
- Admin APIs gain API Gateway features for free: access logging, CloudWatch metrics, request throttling, WAF attachment if needed later.
- Feature-flagged via the enable_admin_api variable, so it can be enabled per environment.
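A minimal sketch of a handler once gateway-level auth is in place, assuming a Lambda proxy integration behind a Cognito user pool authorizer: the handler reads the already-validated claims from the request context instead of parsing the JWT itself. The helper name `caller_claims` and the response body are hypothetical.

```python
import json


def caller_claims(event: dict) -> dict:
    """Return the claims API Gateway attached after the authorizer passed."""
    authorizer = event.get("requestContext", {}).get("authorizer") or {}
    return authorizer.get("claims") or {}


def handler(event: dict, context: object) -> dict:
    # No JWT parsing or scope check here: unauthorized requests were
    # rejected by API Gateway before this function was invoked.
    claims = caller_claims(event)
    body = {"caller": claims.get("sub")}  # hypothetical response shape
    return {"statusCode": 200, "body": json.dumps(body)}
```

The handler still needs the claims for business decisions, such as limiting a usage query to the caller's own team, but it no longer decides whether the caller is allowed in at all.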
Negative:
- API Gateway adds ~10-15ms latency to admin calls (acceptable for admin traffic).
- API Gateway REST API cost: $3.50/million requests. At admin-level traffic (<10K req/day), cost is negligible (~$1/month).
- One more infrastructure module to maintain (modules/admin_api).
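The cost estimate above can be checked with a few lines of arithmetic. The price and daily volume are the figures quoted in this ADR; the 30-day month is an assumption.

```python
PRICE_PER_MILLION_USD = 3.50  # REST API request price quoted in this ADR
REQUESTS_PER_DAY = 10_000     # upper bound on admin traffic quoted in this ADR
DAYS_PER_MONTH = 30           # assumption


def monthly_cost_usd(
    requests_per_day: int,
    days: int = DAYS_PER_MONTH,
    price_per_million: float = PRICE_PER_MILLION_USD,
) -> float:
    """Request charges only; excludes data transfer and Lambda costs."""
    return requests_per_day * days * price_per_million / 1_000_000


print(f"${monthly_cost_usd(REQUESTS_PER_DAY):.2f}/month")  # $1.05/month
```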
Neutral:
- Lambda handlers retain their business logic unchanged — only the auth check is removed.
- The inference path (ALB) is unaffected.
Alternatives Considered
- Keep Lambda Function URLs with per-handler auth: rejected because auth duplication is a security liability and maintenance burden.
- API Gateway HTTP API: considered, but REST API provides a native COGNITO_USER_POOLS authorizer with built-in authorization_scopes enforcement. HTTP API’s JWT authorizer requires a custom Lambda authorizer to enforce Cognito scopes.
- Single API Gateway for everything (inference + admin): rejected per ADR-005 reasoning, since API Gateway on the inference path adds $260-2,400/month and 10-15ms latency for zero benefit when ALB JWT validation handles it natively.