Error Codes

Complete reference for error responses returned by the AI Gateway, organized by source: ALB/JWT validation, WAF, gateway application logic, rate limiting, and upstream providers.

HTTP Status Code Summary

Code	Meaning	Source	Common Cause
`200`	OK	Gateway	Request succeeded
`401`	Unauthorized	ALB	Invalid, expired, or missing JWT
`403`	Forbidden	WAF or ALB	WAF block, wrong OAuth scope, or IP rate limit
`429`	Too Many Requests	Gateway or provider	Budget exhausted, RPM/token limit, or provider rate limit
`502`	Bad Gateway	Gateway	Upstream provider unreachable or returned an error
`503`	Service Unavailable	ALB	No healthy ECS tasks or gateway overloaded

200 OK

The request was processed successfully. The response body matches the format of the upstream provider (OpenAI Chat Completions or Anthropic Messages).

401 Unauthorized

The ALB rejected the request because the JWT is missing, expired, or otherwise invalid. The ALB validates the JWT before the request reaches the gateway container.

Triggers

Trigger	Details
Missing `Authorization` header	No `Bearer <jwt>` token in the request
Expired token	Cognito JWTs have a 1-hour TTL
Invalid signature	Token not signed by the expected Cognito User Pool
Malformed token	Token is not a valid JWT (wrong format, corrupt base64)

Response

The ALB returns an HTML error page (not JSON) with HTTP 401. Because JWT validation fails at the ALB layer, the request never reaches the gateway and there is no JSON body to parse.

Troubleshooting

Refresh your token — Cognito JWTs expire after 1 hour:
```
TOKEN=$(./scripts/get-gateway-token.sh)
```

Verify the token is well-formed:

echo "$TOKEN" | cut -d. -f2 | tr '_-' '/+' | base64 -d 2>/dev/null | python3 -m json.tool

Check expiry:

echo "$TOKEN" | cut -d. -f2 | tr '_-' '/+' | base64 -d 2>/dev/null | python3 -c "
import json, sys, time
data = json.load(sys.stdin)
exp = data.get('exp', 0)
remaining = exp - int(time.time())
# A non-positive value means the token has already expired
print(f'Expires in {remaining // 60} minutes' if remaining > 0 else 'Token has expired')
"

Claude Code users — Ensure apiKeyHelper is configured and executable:

claude config set --global apiKeyHelper ~/workplace/ai-gateway/scripts/get-gateway-token.sh
chmod +x ~/workplace/ai-gateway/scripts/get-gateway-token.sh
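The token checks earlier in this section depend on `base64 -d`, which expects standard, padded base64, whereas JWT segments are base64url-encoded with the padding stripped. The following is a minimal Python sketch that handles both details and reports the remaining lifetime. The function names `decode_jwt_payload` and `seconds_until_expiry` are illustrative and not part of the gateway tooling, and the sketch only inspects claims; it does not verify the signature.

```python
import base64
import json
import time
from typing import Optional


def decode_jwt_payload(token: str) -> dict:
    """Return the claims in a JWT's payload segment WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a JWT: expected three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url-encoded with '=' padding removed; restore it.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def seconds_until_expiry(claims: dict, now: Optional[float] = None) -> int:
    """Seconds until the 'exp' claim; negative if the token has already expired."""
    current = time.time() if now is None else now
    return int(claims.get("exp", 0) - current)
```

For example, `seconds_until_expiry(decode_jwt_payload(token))` returning a negative number means the token should be refreshed before retrying the request.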

403 Forbidden

The request was authenticated (valid JWT) but rejected by WAF rules or authorization checks.

Triggers

Trigger	Details
WAF block	AWS Managed Rules (common exploits, IP reputation) matched the request
IP rate limit	WAF per-IP rate limit exceeded (2,000 requests per 5-minute window)
Wrong OAuth scope	JWT does not contain the required `https://gateway.internal/invoke` scope

WAF Block Response

When WAF blocks a request, the response includes the x-amzn-waf-action header:

HTTP/1.1 403 Forbidden
x-amzn-waf-action: BLOCK

The response body is an AWS WAF default block page (HTML), not a JSON payload.

WAF Rules in Effect

Rule Group	Description
AWS Common Rule Set	Blocks common web exploits (SQL injection, XSS, etc.)
IP Reputation List	Blocks requests from known-bad IP addresses
Per-IP Rate Limit	2,000 requests per 5-minute window per source IP

Troubleshooting

Check for the WAF header in the response:

curl -v -H "Authorization: Bearer $TOKEN" \
     "${GATEWAY_URL}/v1/chat/completions" 2>&1 | grep -i waf

If IP rate limited — Wait for the 5-minute window to expire, or reduce request volume.
If scope is wrong — Decode the JWT and verify the scope claim:
```
echo "$TOKEN" | cut -d. -f2 | tr '_-' '/+' | base64 -d 2>/dev/null | python3 -m json.tool
```
Contact the gateway admin to update the Cognito app client scopes.
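Once the claims are decoded, the scope check itself is a one-liner. This sketch assumes the `scope` claim is a single space-delimited string, which is the OAuth 2.0 convention; the helper name `has_scope` is hypothetical.

```python
REQUIRED_SCOPE = "https://gateway.internal/invoke"


def has_scope(claims: dict, required: str = REQUIRED_SCOPE) -> bool:
    """True if the space-delimited 'scope' claim contains the required scope."""
    granted = claims.get("scope", "").split()
    return required in granted
```

A token whose claims only list `https://gateway.internal/admin` would return `False` here, which matches the wrong-scope trigger described above.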

429 Too Many Requests

The gateway or upstream provider is rate-limiting your requests.

Gateway Rate Limit (Team Layer)

Per-team rate limiting is enforced via DynamoDB atomic counters when the Admin API (C.1) is enabled. Two dimensions are tracked:

Dimension	Window	Description
RPM (requests per minute)	1-minute sliding window	Atomic counter per team per minute bucket
Daily tokens	Calendar day (UTC)	Cumulative token count per team per day

Rate Limit Response

{
  "allowed": false,
  "reason": "RPM limit exceeded (101/100 requests per minute)",
  "retry_after_seconds": 42
}

Field	Description
`allowed`	Always `false` when rate limited
`reason`	Human-readable explanation of which limit was hit
`retry_after_seconds`	Number of seconds to wait before retrying
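To make the RPM dimension concrete, here is a small in-memory stand-in for the per-team, per-minute counters. It is a sketch under stated assumptions, not the gateway's implementation: it uses a Python dictionary instead of DynamoDB, assumes fixed one-minute buckets keyed by epoch minute, and the `{"allowed": True}` shape for permitted requests is invented for illustration (the document only specifies the rejection shape). The class name `RpmLimiter` is hypothetical.

```python
import time
from collections import defaultdict
from typing import Optional


class RpmLimiter:
    """In-memory stand-in for per-team minute counters (assumes fixed minute buckets)."""

    def __init__(self, rpm_limit: int) -> None:
        self.rpm_limit = rpm_limit
        self._counts = defaultdict(int)  # (team, minute bucket) -> request count

    def check(self, team: str, now: Optional[float] = None) -> dict:
        current = time.time() if now is None else now
        bucket = int(current // 60)
        # Increment first, then compare, mirroring an atomic add-and-return counter.
        self._counts[(team, bucket)] += 1
        count = self._counts[(team, bucket)]
        if count <= self.rpm_limit:
            return {"allowed": True}  # assumed shape for the allowed case
        return {
            "allowed": False,
            "reason": f"RPM limit exceeded ({count}/{self.rpm_limit} requests per minute)",
            # Seconds left until the next minute bucket starts.
            "retry_after_seconds": 60 - int(current % 60),
        }
```

With a limit of 100, the 101st call inside the same minute produces the rejection payload shown above, and the wait shrinks as the bucket approaches its end.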

Tier Defaults

Tier	RPM	Daily Tokens	Monthly Budget (USD)
sandbox	20	100,000	$25
standard	100	500,000	$100
premium	500	5,000,000	$1,000
unlimited	2,000	unlimited	$10,000

Budget Enforcement

When budgets are enabled, the budget_enforcement Lambda runs in-path as an agentgateway promptGuard request webhook. If a team’s monthly budget is exhausted (utilization reaches the hard limit percentage, default 100%), the Lambda returns agentgateway’s reject contract:

{"action": "reject"}

agentgateway maps action: reject to an HTTP 429 for the client; the request never reaches the provider. The Lambda also supports model-level budget caps — if a specific model’s spend exceeds its configured limit, only that model is rejected.

Budget enforcement includes a warning threshold (default 80%). Once utilization reaches that threshold, requests are still allowed, and a warning is logged.
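The two thresholds can be summarized as a small decision function. This is a sketch of the logic as described in this section, not the Lambda's actual code; the function name `budget_decision` and the string labels are hypothetical, and only the `reject` outcome corresponds to the `{"action": "reject"}` body sent to agentgateway.

```python
def budget_decision(spend_usd: float, budget_usd: float,
                    hard_limit_pct: float = 100.0,
                    warning_pct: float = 80.0) -> str:
    """Classify spend against a budget as 'allow', 'warn', or 'reject'."""
    if budget_usd <= 0:
        raise ValueError("budget_usd must be positive")
    utilization_pct = spend_usd * 100 / budget_usd
    if utilization_pct >= hard_limit_pct:
        return "reject"  # the Lambda would return {"action": "reject"}
    if utilization_pct >= warning_pct:
        return "warn"    # request proceeds, a warning is logged
    return "allow"
```

The same function could be applied per model by passing that model's spend and its configured cap, which is one way to picture the model-level budget caps mentioned above.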

Provider Rate Limit

Upstream providers (OpenAI, Anthropic, Google, Azure) return their own 429 responses. These are passed through to the caller. Provider rate limits are independent of gateway rate limits.

Retry strategy: Implement exponential backoff. Most 429 responses include a Retry-After header, and gateway rate-limit responses carry the equivalent wait in retry_after_seconds.
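One possible shape for such a retry loop is sketched below. It prefers an explicit `Retry-After` header (only the delta-seconds form is handled), then the `retry_after_seconds` field from a gateway rate-limit body, and otherwise falls back to capped exponential backoff. The names `retry_delay` and `call_with_retry`, the `(status, headers, body)` response tuple, and the default base and cap values are all assumptions made for illustration; a production client would typically add random jitter as well.

```python
import json
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body text)


def retry_delay(attempt: int, headers: Mapping[str, str], body: str,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retrying after failed attempt number `attempt` (0-based)."""
    # 1. An explicit Retry-After header in delta-seconds form wins.
    for name, value in headers.items():
        if name.lower() == "retry-after" and value.strip().isdigit():
            return float(value.strip())
    # 2. Gateway team-limit responses carry retry_after_seconds in the JSON body.
    try:
        hint = json.loads(body).get("retry_after_seconds")
        if isinstance(hint, (int, float)):
            return float(hint)
    except (ValueError, AttributeError):
        pass  # body is empty, HTML, or not a JSON object
    # 3. Otherwise use capped exponential backoff.
    return min(cap, base * (2 ** attempt))


def call_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        response = send()
        status, headers, body = response
        if status != 429 or attempt == max_attempts - 1:
            return response
        sleep(retry_delay(attempt, headers, body))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. Retrying helps with RPM and provider limits; a 429 caused by an exhausted budget is unlikely to clear on a short retry.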

502 Bad Gateway

The gateway could not reach the upstream LLM provider or received an error from it.

Triggers

Trigger	Details
Provider outage	Upstream provider is down or returning errors
Invalid API key	The stored provider key is expired, revoked, or still set to `REPLACE_ME`
Network issue	ECS task lacks outbound internet access (NAT Gateway misconfiguration)
Invalid model	The specified model does not exist at the provider

Response

The gateway passes through the upstream provider’s error response when available. When the provider is completely unreachable, the ALB returns a generic 502 error page.

Troubleshooting

Test a different provider to isolate the issue:

curl -H "Authorization: Bearer $TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"model":"claude-sonnet-4-20250514","max_tokens":1,"messages":[{"role":"user","content":"hi"}]}' \
     "${GATEWAY_URL}/v1/chat/completions"

Run the health check with provider testing:

./scripts/check-health.sh --url "$GATEWAY_URL" --token "$TOKEN" --providers

Verify the API key is set in Secrets Manager:

aws secretsmanager get-secret-value \
  --secret-id ai-gateway/openai-api-key \
  --query SecretString --output text

If the value is REPLACE_ME, the key has not been configured.

503 Service Unavailable

The gateway itself is overloaded or all ECS tasks are unhealthy.

Triggers

Trigger	Details
No healthy targets	All ECS tasks failed health checks; ALB has no targets to route to
Gateway overloaded	Concurrent request count exceeds capacity; autoscaling has not yet caught up
Deployment in progress	A rolling deployment temporarily reduces available capacity

Response

The ALB returns its default 503 page (HTML). There is no JSON body.

Troubleshooting

Wait 30 to 60 seconds for autoscaling to add capacity.

Check ECS service health:

aws ecs describe-services \
  --cluster ai-gateway-prod \
  --services ai-gateway-gateway \
  --query 'services[0].{desired:desiredCount,running:runningCount,events:events[:3]}'

Check the ALB target group:

aws elbv2 describe-target-health \
  --target-group-arn "$TARGET_GROUP_ARN"

Run the basic health check:

curl -s -o /dev/null -w "%{http_code}\n" "${GATEWAY_URL}/"
# Expected: 200

Gateway Application Errors

These errors come from the agentgateway proxy itself or are passed through from the upstream provider.

No Provider Header (migration note)

agentgateway selects the provider server-side from its rendered config, so there is no x-portkey-provider header and no {"error": "provider is not set"} failure mode. If you are migrating from an earlier (Portkey-based) release, remove any x-portkey-* headers from your client. See the API Reference for how provider and model selection works now.

Invalid Model

The provider rejected the model name, or no modelAlias / backend in the active chain serves it. This typically means:

The model does not exist at the resolved provider
The model name is misspelled
The provider account does not have access to the model

The error response is passed through from the upstream provider.

Cognito Token Endpoint Errors

These errors occur when obtaining a JWT from the Cognito token endpoint, before any gateway request is made.

`invalid_grant`

{"error": "invalid_grant"}

Cause	Fix
Wrong token endpoint	Verify `GATEWAY_TOKEN_ENDPOINT` is `https://<domain>.auth.<region>.amazoncognito.com/oauth2/token`
Wrong credentials	Confirm `GATEWAY_CLIENT_ID` and `GATEWAY_CLIENT_SECRET` are correct
Wrong grant type	The Cognito app client must be configured for `client_credentials` grant

`invalid_client`

{"error": "invalid_client"}

The client ID or secret is incorrect. This is a standard OAuth2 error returned by the Cognito token endpoint. Double-check the values or contact the gateway admin.

`invalid_scope`

{"error": "invalid_scope"}

The requested scope is not configured on the Cognito app client. This is a standard OAuth2 error returned by the Cognito token endpoint. Valid scopes are https://gateway.internal/invoke and https://gateway.internal/admin.
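For reference when debugging these errors, the sketch below builds (but does not send) a `client_credentials` request of the kind the Cognito token endpoint accepts: a form-encoded POST with the client ID and secret in an HTTP Basic `Authorization` header. Whether `get-gateway-token.sh` constructs its request exactly this way is an assumption; the function names and the hint table are illustrative, with the hints paraphrased from this section.

```python
import base64
import json
import urllib.parse
import urllib.request

TOKEN_ERROR_HINTS = {
    "invalid_grant": "check the token endpoint URL, the credentials, and the client_credentials grant",
    "invalid_client": "client ID or secret is incorrect",
    "invalid_scope": "requested scope is not configured on the Cognito app client",
}


def build_token_request(token_endpoint: str, client_id: str, client_secret: str,
                        scope: str = "https://gateway.internal/invoke") -> urllib.request.Request:
    """Build a client_credentials token request without sending it."""
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    form = urllib.parse.urlencode({"grant_type": "client_credentials", "scope": scope})
    return urllib.request.Request(
        token_endpoint,
        data=form.encode(),
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def describe_token_error(body: str) -> str:
    """Map a token-endpoint error body such as {"error": "invalid_grant"} to a hint."""
    try:
        code = json.loads(body).get("error", "")
    except (ValueError, AttributeError):
        return "unrecognized response from the token endpoint"
    return TOKEN_ERROR_HINTS.get(code, f"unexpected error code: {code!r}")
```

The request can be sent with `urllib.request.urlopen(request)`, using values taken from `GATEWAY_TOKEN_ENDPOINT`, `GATEWAY_CLIENT_ID`, and `GATEWAY_CLIENT_SECRET`.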

Guardrail Errors

Content safety is inline Bedrock Guardrails. agentgateway’s promptGuard policy calls the ApplyGuardrail API in-path on both the request (source INPUT) and the response (source OUTPUT), signed with the gateway’s ECS task role. There is no separate content-scanner Lambda and no per-team PII-mode endpoint.

Detect / Log-Only by Default

With enable_guardrails = true and enforce_guardrails = false (the default), every filter action is NONE: ApplyGuardrail evaluates the content and emits assessments to the logs, but the gateway passes the request through untouched. In this mode a tripped guardrail does not produce an error.

Blocking Mode

When enforce_guardrails = true, filters BLOCK on a trip (and topic filters are attached). A request is denied before the provider call if the input filter trips, or after it if the output filter trips. The exact response depends on the guardrail policy that triggered:

Policy	Example Reason
Content filter	Harmful content detected (hate, violence, sexual, misconduct)
PII blocking	PII detected in input or output (SSN, credit card, email, phone)
Topic policy	Request matches a blocked topic (e.g., competitor products, internal financials)
Word policy	Request or response contains a blocked word or phrase

See Terraform Variables — Guardrails for the toggles.

Error Response Quick Reference

Symptom	Likely Code	First Check
HTML error page, no JSON	`401` or `503`	ALB-level rejection; check JWT or ECS health
`x-amzn-waf-action: BLOCK` header	`403`	WAF rule triggered; check IP and request patterns
`"allowed": false` with `retry_after_seconds`	`429`	Team rate limit (Admin-API counter) exceeded
`429` with no provider response	`429`	Budget exhausted; `budget_enforcement` returned `{"action": "reject"}`
Request blocked in enforce mode	Varies	Bedrock Guardrail tripped with `enforce_guardrails = true`
Provider error passthrough	`502`	Check API key in Secrets Manager; test an alternate model/provider
No response / timeout	—	Check gateway URL, VPN, and ECS service status
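The table above can be turned into a first-pass triage helper for client code. The sketch below is only one interpretation of those rows: the function name `classify_error` and the returned strings are hypothetical, and because the body of a budget rejection is not specified here, a 429 without the team-limit body is reported as either budget exhaustion or a provider rate limit.

```python
import json
from typing import Mapping


def classify_error(status: int, headers: Mapping[str, str], body: str) -> str:
    """Suggest a first check for a failed gateway response."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if lowered.get("x-amzn-waf-action", "").upper() == "BLOCK":
        return "WAF block: check source IP and request patterns"
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None  # empty or HTML body
    if status == 429:
        if isinstance(payload, dict) and payload.get("allowed") is False:
            return "team rate limit: wait retry_after_seconds, then retry"
        return "budget exhausted or provider rate limit: check team budget and provider quota"
    if status == 403:
        return "authorization failure: verify the scope claim in the JWT"
    if status in (401, 503) and payload is None:
        return "ALB-level rejection: check the JWT (401) or ECS health (503)"
    if status == 502:
        return "upstream provider error: check the API key in Secrets Manager or try another model/provider"
    return "unclassified: inspect the response body and gateway logs"
```

A timeout never reaches this function, so callers would still handle connection errors separately by checking the gateway URL, VPN, and ECS service status.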