Architecture Decision Records

Architecture Decision Records (ADRs) capture significant technical decisions along with their context, alternatives considered, and consequences. They serve as a historical record so that future contributors understand why a particular approach was chosen, not just what was built.

ADRs are stored in the adr/ directory at the repository root.

| ADR | Title | Status | Summary |
|-----|-------|--------|---------|
| 001 | Portkey OSS as LLM Gateway Proxy | Accepted | Selected Portkey OSS over LiteLLM due to LiteLLM’s 14 CVEs (including critical RCE and active SSRF exploitation), systemic memory leaks, and 800 MB+ image size. Portkey has zero known CVEs and a ~62 MB image. |
| 002 | python:3.13-slim Over Chainguard | Accepted | Chose python:3.13-slim with multi-stage hardening over Chainguard because the free Chainguard tier locks to latest (Python 3.14), and the 3.13 tag requires a paid subscription. |
| 003 | Single NAT Gateway + VPC Endpoints | Accepted | Deployed one NAT Gateway instead of two, combined with VPC endpoints for ECR, CloudWatch, Secrets Manager, and S3. Saves approximately $32/month with acceptable HA trade-off for outbound internet traffic. |
| 004 | 3-Phase Container Security Pipeline | Accepted | Structured the security pipeline into three phases: pre-build (hadolint + checkov), post-build (trivy + syft), and post-scan (cosign). Skipped grype (trivy covers it) and osv-scanner (uv audit provides native OSV scanning). |
| 005 | ALB JWT Validation Over API Gateway | Accepted | Uses ALB-native validate_token action (launched Nov 2025) instead of API Gateway HTTP API for JWT authentication. Saves $260-2,400/month depending on request volume with zero additional latency. |
| 006 | Portkey Dual-Format API | Accepted | Verified that Portkey OSS natively serves both OpenAI Chat Completions (/v1/chat/completions) and Anthropic Messages (/v1/messages) on a single port. No custom middleware or translation layer needed. |
| 007 | AWS Provider Upgrade to >= 6.22 | Accepted | Upgraded the Terraform AWS provider from ~> 5.0 to ~> 6.22 to enable the ALB JWT validation resource (jwt_validation block in aws_lb_listener). Zero-risk upgrade since infrastructure was deployed fresh on v6. |
| 008 | Multi-Tenant Client Isolation | Accepted | Per-team Cognito app clients via a clients Terraform module. Each team gets isolated credentials, scopes, and usage tracking. |
| 009 | Provider Routing Strategy | Accepted | Portkey’s native routing engine for provider-level fallback and load-balance strategies via x-portkey-config header or default environment variables. |
| 010 | Cost Attribution Pipeline | Accepted | Lambda subscribes to gateway CloudWatch logs, extracts token usage, computes estimated cost, and publishes custom CloudWatch metrics per team. |
| 011 | Bedrock Guardrails Integration | Accepted | Terraform module for Amazon Bedrock Guardrails with configurable content filtering, PII blocking, topic denial, and word filtering policies. |
| 012 | Response Cache Strategy | Accepted | ElastiCache Redis cluster in VPC private subnets for exact-match response caching via Portkey Gateway. |
| 013 | Identity Center SAML/OIDC Federation | Proposed | SAML 2.0 and OIDC federation with the Cognito User Pool, plus a Pre-Token-Generation V2 Lambda for IdP group-to-claim mapping. |
| 014 | Two-Plane Architecture Split | Accepted | ALB stays on the inference path; admin APIs (teams, budgets, routing, scanner, pricing, usage) move behind API Gateway with a Cognito authorizer. Eliminates duplicated JWT validation across handlers. |

ADR files follow the pattern:

adr/NNN-short-descriptive-title.md

Here NNN is a zero-padded, three-digit sequential number (e.g., 008).
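
The naming pattern can be checked, and the next filename derived, with a small helper. This is a minimal sketch, not tooling that exists in the repository: `ADR_NAME`, `slugify`, and `next_adr_filename` are hypothetical names, and the lowercase, hyphen-separated slug is an assumption about what "short-descriptive-title" means.

```python
import re

# Assumed shape of adr/ filenames: three digits, a hyphen, a lowercase
# hyphen-separated slug, then ".md" (the slug rules are an assumption).
ADR_NAME = re.compile(r"^(\d{3})-([a-z0-9]+(?:-[a-z0-9]+)*)\.md$")


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def next_adr_filename(existing: list[str], title: str) -> str:
    """Return the next zero-padded filename given the names already in adr/.

    Names that do not match the ADR pattern (e.g. README.md) are ignored.
    """
    numbers = [int(m.group(1)) for name in existing if (m := ADR_NAME.match(name))]
    return f"{max(numbers, default=0) + 1:03d}-{slugify(title)}.md"
```

In practice `existing` would come from listing the adr/ directory, for example `[p.name for p in pathlib.Path("adr").iterdir()]`. Taking the highest existing number plus one, rather than counting files, keeps the sequence correct even if the directory holds unrelated files.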

Use this template for new ADRs:

# ADR-NNN: Title
**Status**: Proposed | Accepted | Deprecated | Superseded by ADR-XXX
**Date**: YYYY-MM-DD
**Deciders**: AI Engineering NAMER
## Context
What is the issue that we're seeing that is motivating this decision or change?
## Decision
What is the change that we're proposing and/or doing?
## Alternatives Considered
| Criteria | Option A | Option B | Option C |
|----------|----------|----------|----------|
| ... | ... | ... | ... |
## Rationale
Why was this option chosen over the alternatives?
## Consequences
**Positive**: What becomes easier or possible as a result?
**Negative**: What becomes harder or is introduced as a trade-off?

To add a new ADR:

  1. Copy the template above into a new file: adr/NNN-your-title.md.
  2. Set the status to Proposed.
  3. Fill in context, decision, alternatives, rationale, and consequences.
  4. Open a PR. Discussion happens in the PR review.
  5. Once approved and merged, update the status to Accepted.