ADR-017: agentgateway as the data plane
Status: Accepted
Date: 2026-06-27
Deciders: AI Engineering NAMER
Supersedes: ADR-012 (response cache)
Relates to: ADR-001, ADR-006, ADR-009, ADR-011, ADR-014, ADR-015, ADR-016
Decision
Replace Portkey OSS with agentgateway as the data plane. The control plane (the gwcore Lambdas, DynamoDB, Cognito, Firehose to Iceberg) and the identity layer (Cognito M2M, ALB JWT, per-team clients, SSO) stay unchanged. The swap is contained to the ECS container, the rendered agentgateway config, and a small set of integration seams. The LLM response cache is removed; response and semantic caching are out of scope.
agentgateway is a Rust LLM/MCP/A2A proxy (Linux Foundation). Its llm/ crate is a stronger data-plane engine than Portkey OSS on translation fidelity, inline guardrail breadth, reasoning-token handling, and rerank/realtime surfaces, and it is natively an MCP and A2A gateway. The engine choice in ADR-001 aged; the control-plane bet did not.
What changed
- Data-plane container. The ECS task runs the agentgateway image (Rust, distroless base, port 8787), pinned by digest in versions.env (AGENTGATEWAY_REF / AGENTGATEWAY_IMAGE_DIGEST) and re-tagged into ECR. The Node/npm Portkey build and its CVE-patch apparatus are gone.
- Config. agentgateway reads a YAML config delivered inline via -c, rendered from infrastructure/modules/compute/agentgateway-config.yaml.tftpl. Routing is an ai.groups priority-group failover (Bedrock primary, Anthropic fallback). The x-portkey-* header mechanism is gone.
- Guardrails. budget_enforcement is the one in-path Lambda, speaking agentgateway’s {action: pass | reject} webhook contract. Content safety runs inline via agentgateway calling Bedrock Guardrails (ApplyGuardrail API), detect/log-only by default, configured by the guardrails module. The standalone content_scanner Lambda was removed.
- Cost attribution. cost_attribution parses agentgateway’s flat access-log shape (synthesizing the nested usage block from flat token fields, reading the flat oidc_data field for identity).
- Response cache removed. The ElastiCache Redis exact-match response cache is decommissioned (supersedes ADR-012). The replacement is provider-native prompt caching: agentgateway’s promptCaching policy injects Bedrock cachePoint markers on the Bedrock path. Prompt caching is opt-in, Bedrock-path only, and cuts input-token cost on prefix reuse. It is not a response cache: it still round-trips to the model and bills output tokens.
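The {action: pass | reject} contract spoken by budget_enforcement can be pictured with a minimal Python sketch. Only the action field and its two values come from this ADR. The function name, the event fields (team_id, estimated_cost_usd), the reason field, and the in-memory budget table are hypothetical stand-ins for whatever the real Lambda reads from the webhook payload and DynamoDB.

```python
from dataclasses import dataclass


@dataclass
class TeamBudget:
    """Hypothetical budget record; the real Lambda would load this from DynamoDB."""
    limit_usd: float
    spent_usd: float


# Assumed in-memory stand-in for the budget store, keyed by team id.
BUDGETS = {
    "team-a": TeamBudget(limit_usd=100.0, spent_usd=42.0),
    "team-b": TeamBudget(limit_usd=50.0, spent_usd=50.0),
}


def budget_decision(event: dict, budgets: dict = BUDGETS) -> dict:
    """Return {"action": "pass"} or {"action": "reject", ...} for one request.

    The input field names are assumptions, not agentgateway's actual payload.
    """
    budget = budgets.get(event.get("team_id"))
    if budget is None:
        # Failing closed for unknown callers is a design choice of this sketch.
        return {"action": "reject", "reason": "unknown team"}

    projected = budget.spent_usd + float(event.get("estimated_cost_usd", 0.0))
    if projected > budget.limit_usd:
        return {"action": "reject", "reason": "budget exceeded"}
    return {"action": "pass"}
```

Keeping the decision a pure function of the event and the budget record makes the pass/reject logic easy to unit-test without a running gateway.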
Prompt caching is not a response cache
The removed Redis layer returned a cached completion for an identical request, saving latency and the full call. Prompt caching reuses prompt prefixes to discount input tokens on a cache hit, but still calls the model and still bills output. The latency/throughput win of the response cache is not relocated; it is dropped by decision. Cache token accounting survives: agentgateway emits cachedInputTokens and cacheCreationInputTokens, which cost_attribution reads.
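A rough Python sketch of how the cache token fields might feed cost attribution, and why a full prompt-cache hit still costs money. The names cachedInputTokens and cacheCreationInputTokens come from this page. The inputTokens and outputTokens field names, the nested key names, the prices, and the cache discount factors are all illustrative assumptions, as is the convention that inputTokens counts only uncached input.

```python
def synthesize_usage(record: dict) -> dict:
    """Build a nested usage block from flat access-log token fields."""
    def tokens(key: str) -> int:
        # Missing or null fields are treated as zero tokens.
        return int(record.get(key) or 0)

    return {
        "input_tokens": tokens("inputTokens"),    # assumed flat field name
        "output_tokens": tokens("outputTokens"),  # assumed flat field name
        "cache_read_input_tokens": tokens("cachedInputTokens"),
        "cache_creation_input_tokens": tokens("cacheCreationInputTokens"),
    }


def request_cost_usd(
    usage: dict,
    input_price: float,
    output_price: float,
    cache_read_factor: float = 0.1,    # illustrative discount, not a quoted price
    cache_write_factor: float = 1.25,  # illustrative surcharge, not a quoted price
) -> float:
    """Estimate request cost; prices are USD per 1,000 tokens."""
    input_cost = usage["input_tokens"] * input_price
    read_cost = usage["cache_read_input_tokens"] * input_price * cache_read_factor
    write_cost = usage["cache_creation_input_tokens"] * input_price * cache_write_factor
    # Output tokens are billed in full whether or not the prompt prefix was cached.
    output_cost = usage["output_tokens"] * output_price
    return (input_cost + read_cost + write_cost + output_cost) / 1000.0
```

Even when the whole prompt is served from the provider cache, the output term stays in the total. A response cache would have skipped the model call, and that cost, entirely.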
What improves
- Typed, compile-checked, snapshot-tested cross-API translation, including streaming SSE to AWS event-stream.
- Inline request- and response-side guardrails (Bedrock Guardrails, plus regex/PII, OpenAI Moderation, Google Model Armor, Azure Content Safety).
- First-class reasoning-token accounting and reasoning_content signature replay.
- Native MCP and A2A governance, the strategic gap the parity study flagged.
- ai.groups priority-group provider failover.
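The idea behind priority-group failover can be sketched in a few lines of Python. This is a conceptual illustration only, not agentgateway's implementation or its config schema. The provider stubs and function names are hypothetical. Within a group the sketch simply tries providers in listed order, which may differ from how agentgateway selects inside a group.

```python
from typing import Callable, Sequence

# Hypothetical provider signature: takes a prompt, returns a completion.
Provider = Callable[[str], str]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in every priority group has failed."""


def call_with_failover(prompt: str, groups: Sequence[Sequence[Provider]]) -> str:
    """Try groups in priority order and return the first successful completion."""
    errors = []
    for priority, group in enumerate(groups):
        for provider in group:
            try:
                return provider(prompt)
            except Exception as exc:
                # A real proxy would fail over only on retryable errors.
                errors.append((priority, exc))
    raise AllProvidersFailed(f"{len(errors)} provider call(s) failed")


def bedrock_primary(prompt: str) -> str:
    """Stub that simulates an outage on the primary path."""
    raise TimeoutError("simulated Bedrock outage")


def anthropic_fallback(prompt: str) -> str:
    """Stub that always succeeds."""
    return f"anthropic:{prompt}"
```

Calling call_with_failover("hello", [[bedrock_primary], [anthropic_fallback]]) exercises the Bedrock-primary, Anthropic-fallback ordering described in this ADR: the first group fails and the second answers.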
Consequences
- v0.1 is agentgateway-only. The dual-contract scaffolding kept during the migration spike was removed in the v0.1 cleanup: the Portkey {verdict} path, to_portkey_config(), the content_scanner Lambda, the Redis cache module, the Portkey routing-config presets, and the Portkey release scanner are all deleted. Rollback is no longer a routing flip.
- Conditional routing is a known gap. agentgateway has no request-field predicate routing, so Portkey conditional strategies collapse to ordered fallback. See Routing Strategies.
- Provider breadth narrows on paper. agentgateway types 8 providers (5 provisioned) versus Portkey’s advertised 200+; practical exposure is unchanged.
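To make the conditional-routing gap concrete, here is a hedged Python sketch of what collapsing a conditional strategy to ordered fallback could look like. The input shape is modeled loosely on a Portkey-style conditional config and is an assumption. The function name and the ordering rule (condition order first, then the default) are hypothetical, not the project's documented behavior.

```python
def collapse_conditional(config: dict) -> list[str]:
    """Flatten a conditional strategy into an ordered, de-duplicated fallback list.

    The request-field predicates are discarded, which is exactly the gap:
    the target no longer depends on what the request contains.
    """
    strategy = config.get("strategy", {})
    if strategy.get("mode") != "conditional":
        raise ValueError("expected a conditional strategy")

    ordered = [condition["then"] for condition in strategy.get("conditions", [])]
    if "default" in strategy:
        ordered.append(strategy["default"])
    # dict.fromkeys keeps first-seen order while dropping repeats.
    return list(dict.fromkeys(ordered))


# Hypothetical conditional config used only for illustration.
EXAMPLE_CONFIG = {
    "strategy": {
        "mode": "conditional",
        "conditions": [
            {"query": {"metadata.tier": {"$eq": "premium"}}, "then": "bedrock"},
            {"query": {"metadata.tier": {"$eq": "basic"}}, "then": "anthropic"},
        ],
        "default": "bedrock",
    }
}
```

With EXAMPLE_CONFIG, both tiers end up on the same ordered list, so a "basic" request is served by the first healthy target rather than the one its predicate would have selected.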
This page summarizes the decision. The full ADR — including the original spike’s seam-by-seam migration table, risk register, and phased path — lives at adr/017-agentgateway-data-plane-spike.md in the repository root.