
Provider Routing Strategies

The AI Gateway is the agentgateway proxy. Routing is expressed as priority-group failover through agentgateway's ai.groups setting: the gateway tries the providers in group 0, then group 1, and so on. Terraform renders the active chain into the gateway’s YAML config, which is delivered inline at container start.

This page covers how the strategies map onto agentgateway, how to manage custom configs through the routing-config API, and the known limitations carried over from the previous (Portkey) routing engine.

The default rendered config ships a two-tier chain:

backends:
- ai:
    groups:
    - providers:
      - name: bedrock-primary # group 0: tried first
        provider:
          bedrock:
            model: anthropic.claude-sonnet-4-20250514-v1:0
    - providers:
      - name: anthropic-fallback # group 1: tried if group 0 is evicted
        provider:
          anthropic:
            model: claude-sonnet-4-20250514

A route also carries an ai policy with modelAliases, which maps a requested model ID onto a backend model (for example, gpt-4* to a Bedrock Claude model). The model field in the request body is resolved against these aliases and the active chain.
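
The alias lookup described above can be pictured with a small sketch. This is an illustration only: the alias table, the `resolve_model` name, the glob-style matching via `fnmatchcase`, and the "exact match first, then first matching pattern" rule are all assumptions, not agentgateway's documented matching semantics.

```python
from fnmatch import fnmatchcase

# Hypothetical alias table: requested model pattern -> backend model.
# The wildcard syntax and precedence rule here are assumptions.
MODEL_ALIASES = {
    "gpt-4*": "anthropic.claude-sonnet-4-20250514-v1:0",
}


def resolve_model(requested: str, aliases: dict = MODEL_ALIASES) -> str:
    """Return the backend model for a requested model ID.

    An exact alias wins; otherwise the first matching wildcard pattern
    is used. IDs that match nothing pass through unchanged.
    """
    if requested in aliases:
        return aliases[requested]
    for pattern, target in aliases.items():
        if fnmatchcase(requested, pattern):
            return target
    return requested
```

Under these assumptions, a request for `gpt-4o` would be served by the Bedrock Claude model, while a model ID with no alias is passed through as-is.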

agentgateway fails over between groups on connection failure and health eviction. It does not fail over on a specific upstream HTTP status code — there is no per-edge on_status_codes trigger like the Portkey engine had. This is the most important behavioral difference to keep in mind when migrating a routing config.
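
The group-selection rule described above can be modeled roughly as follows. This is a minimal sketch, not agentgateway's code: the `Provider` and `Group` classes, the `healthy` flag, and `pick_group` are hypothetical names used to show that only connection failure or health eviction moves traffic to the next group.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    name: str
    # Set to False when the provider is evicted (connection failure or
    # failed health checks). An upstream HTTP status code alone does not
    # flip this flag in this model.
    healthy: bool = True


@dataclass
class Group:
    providers: list = field(default_factory=list)


def pick_group(groups: list) -> list:
    """Return the healthy providers of the first group that has any."""
    for group in groups:
        healthy = [p for p in group.providers if p.healthy]
        if healthy:
            return healthy
    return []  # every group is evicted


chain = [
    Group([Provider("bedrock-primary")]),     # group 0
    Group([Provider("anthropic-fallback")]),  # group 1
]
```

While `bedrock-primary` is healthy, `pick_group(chain)` returns it; once it is marked unhealthy, the same call returns `anthropic-fallback`.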

The routing-config API accepts three strategy modes. Each renders to agentgateway via RoutingConfig.to_agentgateway_backend():

| Strategy | agentgateway mapping | Notes |
| --- | --- | --- |
| fallback | Each target becomes its own priority group, in order. | Reproduces an ordered fallback chain. Failover is on connection/health eviction, not on a status code. |
| loadbalance | All targets in one group. | agentgateway load-balances within a group using power-of-two-choices. Balancing is capacity-based, not a 0 to 1 weight ratio. A per-target weight is carried but does not set an exact traffic split. |
| conditional | Request-field predicates are dropped; targets collapse to an ordered fallback chain. | agentgateway has no request-field predicate routing (e.g. on max_tokens). See the limitation below. |
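
As a rough illustration of the mapping in the table above, the sketch below turns a routing config into a list of priority groups. It is not the actual `RoutingConfig.to_agentgateway_backend()` implementation: `render_provider` and `render_groups` are hypothetical helpers, the output shape is modeled on the default YAML chain, and weights are left out because the source does not say how they are carried in the rendered form. The provider-key mapping follows the field reference on this page.

```python
# Routing-config provider name -> agentgateway provider key.
PROVIDER_KEYS = {
    "bedrock": "bedrock",
    "anthropic": "anthropic",
    "openai": "openAI",
    "azure-openai": "azure",
    "google": "gemini",
}


def render_provider(target: dict) -> dict:
    """Render one target as a named provider entry (hypothetical shape)."""
    key = PROVIDER_KEYS[target["provider"]]
    model = target.get("override_params", {}).get("model")
    return {"name": target["name"], "provider": {key: {"model": model}}}


def render_groups(config: dict) -> list:
    """Map a routing config's targets onto ordered priority groups."""
    mode = config["strategy"]["mode"]
    providers = [render_provider(t) for t in config["targets"]]
    if mode == "loadbalance":
        # All targets share a single group.
        return [{"providers": providers}]
    if mode in ("fallback", "conditional"):
        # One group per target, in order. For "conditional", the
        # request-field predicates are simply not rendered.
        return [{"providers": [p]} for p in providers]
    raise ValueError(f"unknown strategy mode: {mode!r}")
```

The point of the sketch is the shape of the result: fallback and conditional configs yield one group per target, loadbalance yields a single group holding every target.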

The fallback strategy tries providers in priority order. If the primary group fails to connect or is evicted by health checks, agentgateway moves to the next group. Use this for “Bedrock primary, Anthropic-direct backup” style resilience.

The loadbalance strategy spreads requests across the targets in a single group. agentgateway uses power-of-two-choices load balancing weighted by backend capacity. This works well for spreading traffic across providers or regions, but it does not implement an exact percentage split, so it is not a precise A/B-test traffic splitter.
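
To see why power-of-two-choices does not produce a fixed ratio, consider this generic sketch of the technique. It is not agentgateway's implementation: `pick_p2c`, the `load` and `capacity` dictionaries, and the load-to-capacity comparison are assumptions chosen to illustrate the idea.

```python
import random


def pick_p2c(targets: list, load: dict, capacity: dict, rng=random) -> str:
    """Power-of-two-choices: sample two targets, keep the less loaded one.

    "Less loaded" is measured here as in-flight load divided by capacity,
    which is an assumption about how capacity enters the decision.
    """
    if len(targets) == 1:
        return targets[0]
    a, b = rng.sample(targets, 2)  # two distinct candidates
    return a if load[a] / capacity[a] <= load[b] / capacity[b] else b
```

Because each decision depends on the current load of the two sampled candidates, the long-run split drifts with traffic conditions rather than following a configured percentage.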

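Conditional routing is a known limitation. The Portkey engine could inspect a request field such as max_tokens and route to a different model tier (a “cost-optimized” pattern). agentgateway has no equivalent request-field predicate routing, so a conditional config renders as an ordered fallback chain with its predicates dropped.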

The routing_config Lambda exposes a CRUD API (available when the Admin API is enabled). Custom configs are stored in DynamoDB as the rendered agentgateway backend JSON. Mutations require the admin scope and emit audit events.

| Method | Path | Description |
| --- | --- | --- |
| GET | /routing/configs | List custom config summaries |
| GET | /routing/configs/{name} | Get a specific custom config |
| POST | /routing/configs | Create a custom config |
| PUT | /routing/configs/{name} | Update a custom config |
| DELETE | /routing/configs/{name} | Delete a custom config |

List custom config summaries:

curl -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  https://<admin-api-url>/routing/configs

Get a specific custom config:

curl -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  https://<admin-api-url>/routing/configs/my-fallback

Each target becomes its own priority group, tried in order:

curl -X POST https://<admin-api-url>/routing/configs \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "bedrock-then-anthropic",
    "strategy": {"mode": "fallback"},
    "targets": [
      {"name": "primary", "provider": "bedrock", "override_params": {"model": "anthropic.claude-sonnet-4-20250514-v1:0"}},
      {"name": "backup", "provider": "anthropic", "override_params": {"model": "claude-sonnet-4-20250514"}}
    ],
    "metadata": {"description": "Bedrock primary, Anthropic-direct fallback"}
  }'

All targets share one group; agentgateway balances by capacity:

curl -X POST https://<admin-api-url>/routing/configs \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "spread-bedrock-anthropic",
    "strategy": {"mode": "loadbalance"},
    "targets": [
      {"name": "bedrock", "provider": "bedrock", "weight": 0.6, "override_params": {"model": "anthropic.claude-sonnet-4-20250514-v1:0"}},
      {"name": "anthropic", "provider": "anthropic", "weight": 0.4, "override_params": {"model": "claude-sonnet-4-20250514"}}
    ],
    "metadata": {"description": "Spread Anthropic-model traffic across Bedrock and direct"}
  }'

Delete a custom config:

curl -X DELETE https://<admin-api-url>/routing/configs/spread-bedrock-anthropic \
  -H "Authorization: Bearer ${ADMIN_TOKEN}"
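
For scripted use, the same calls can be built from Python's standard library. The sketch below only constructs the request; `build_request` is a hypothetical helper, the base URL and token are placeholders you supply, and the response format is not covered here because this page does not specify it.

```python
import json
import urllib.parse
import urllib.request


def build_request(base_url, token, method, name=None, body=None):
    """Build a urllib Request for the routing-config API.

    base_url stands in for https://<admin-api-url>; token is an admin
    bearer token. Both are placeholders supplied by the caller.
    """
    url = base_url.rstrip("/") + "/routing/configs"
    if name is not None:
        url += "/" + urllib.parse.quote(name, safe="")
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

Passing the returned object to `urllib.request.urlopen` sends it. For example, `build_request(url, token, "DELETE", name="spread-bedrock-anthropic")` corresponds to the delete command above.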

The routing-config API accepts the following fields. Note which ones still affect agentgateway behavior and which are carried for compatibility only.

| Field | Description | agentgateway effect |
| --- | --- | --- |
| strategy.mode | "fallback", "loadbalance", or "conditional" | Determines how targets map to priority groups (see table above) |
| strategy.on_status_codes | HTTP status codes that triggered failover under Portkey | No effect: agentgateway fails over on connection/health eviction |
| strategy.conditions | Condition objects (conditional mode only) | Dropped on render: no predicate routing |
| targets[].name | Unique target name within the config | Carried as the provider name |
| targets[].provider | bedrock, anthropic, openai, azure-openai, google | Mapped to the agentgateway provider key (openai to openAI, azure-openai to azure, google to gemini) |
| targets[].override_params.model | Model ID for this target | Set as the provider model |
| targets[].weight | Traffic weight, 0.0 to 1.0 (loadbalance only) | Carried, but balancing is capacity-based, not an exact ratio |
| targets[].retry | Per-target retry config | No effect: no per-edge retry equivalent |
| targets[].virtual_key | Provider virtual-key reference | No effect: credentials come from Secrets Manager / the ECS task role |
| metadata.description | Human-readable description | Stored with the config |
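
When migrating a Portkey-era config, it can help to flag the fields that the reference above lists as having no effect or being dropped. The following is a small sketch of such a check; `inert_fields` is a hypothetical helper, not part of the routing-config API.

```python
def inert_fields(config: dict) -> list:
    """List fields in a routing config that do not change agentgateway behavior."""
    found = []
    strategy = config.get("strategy", {})
    # Strategy-level fields that are ignored or dropped on render.
    for key in ("on_status_codes", "conditions"):
        if strategy.get(key):
            found.append(f"strategy.{key}")
    # Per-target fields with no agentgateway equivalent.
    for i, target in enumerate(config.get("targets", [])):
        for key in ("retry", "virtual_key"):
            if key in target:
                found.append(f"targets[{i}].{key}")
    return found
```

An empty result means the config uses only fields that still influence routing; anything returned is a candidate for review before relying on the migrated behavior.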

The default provider chain (served to requests with no custom config) lives in the rendered gateway config, infrastructure/modules/compute/agentgateway-config.yaml.tftpl. To change which providers are reachable or their failover order, edit that template and re-apply Terraform. The enable_provider_fallback and routing_configs Terraform variables control whether named configs are wired in. See the Admin Guide for the deployment workflow.