Error Codes
Complete reference for error responses returned by the AI Gateway, organized by source: ALB/JWT validation, WAF, gateway application logic, rate limiting, and upstream providers.
HTTP Status Code Summary
| Code | Meaning | Source | Common Cause |
|---|---|---|---|
| 200 | OK | Gateway | Request succeeded |
| 401 | Unauthorized | ALB | Invalid, expired, or missing JWT |
| 403 | Forbidden | WAF or ALB | WAF block, wrong OAuth scope, or IP rate limit |
| 429 | Too Many Requests | Gateway or provider | Budget exhausted, RPM/token limit, or provider rate limit |
| 502 | Bad Gateway | Gateway | Upstream provider unreachable or returned an error |
| 503 | Service Unavailable | ALB | No healthy ECS tasks or gateway overloaded |
200 OK
The request was processed successfully. The response body matches the format of the upstream provider (OpenAI Chat Completions or Anthropic Messages).
401 Unauthorized
The ALB rejected the request because the JWT is invalid or missing. The ALB performs JWT validation before the request reaches the gateway container.
Triggers
| Trigger | Details |
|---|---|
| Missing `Authorization` header | No `Bearer <jwt>` token in the request |
| Expired token | Cognito JWTs have a 1-hour TTL |
| Invalid signature | Token not signed by the expected Cognito User Pool |
| Malformed token | Token is not a valid JWT (wrong format, corrupt base64) |
Response
The ALB returns an HTML error page (not JSON) with HTTP 401. There is no JSON response body when JWT validation fails at the ALB layer.
Troubleshooting
- Refresh your token. Cognito JWTs expire after 1 hour:

      TOKEN=$(./scripts/get-gateway-token.sh)

- Verify the token is well-formed:

      echo "$TOKEN" | cut -d. -f2 | base64 -d 2>/dev/null | python3 -m json.tool

- Check expiry:

      echo "$TOKEN" | cut -d. -f2 | base64 -d 2>/dev/null | python3 -c "
      import json, sys, datetime
      data = json.load(sys.stdin)
      exp = data.get('exp', 0)
      remaining = exp - int(datetime.datetime.now().timestamp())
      print(f'Expires in {remaining // 60} minutes')
      "

- Claude Code users: ensure `apiKeyHelper` is configured and executable:

      claude config set --global apiKeyHelper ~/workplace/ai-gateway/scripts/get-gateway-token.sh
      chmod +x ~/workplace/ai-gateway/scripts/get-gateway-token.sh
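JWT segments are base64url-encoded with the padding stripped, so a plain `base64 -d` can fail or truncate output on some tokens. The following is a minimal Python sketch of the same decode-and-check-expiry steps that restores the padding first. The function names `decode_jwt_payload` and `seconds_until_expiry` are illustrative, not part of the gateway tooling, and the sketch does not verify the signature.

```python
import base64
import json
import time
from typing import Optional


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload (second segment) of a JWT. Does NOT verify the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a JWT: expected three dot-separated segments")
    payload = parts[1]
    # base64url segments in a JWT have their '=' padding removed; restore it.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def seconds_until_expiry(claims: dict, now: Optional[float] = None) -> int:
    """Seconds until the `exp` claim; a negative value means the token has expired."""
    current = time.time() if now is None else now
    return int(claims.get("exp", 0) - current)
```

For example, `seconds_until_expiry(decode_jwt_payload(token)) // 60` gives the remaining lifetime in minutes, matching the shell one-liner above.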
403 Forbidden
The request was authenticated (valid JWT) but rejected by WAF rules or authorization checks.
Triggers
| Trigger | Details |
|---|---|
| WAF block | AWS Managed Rules (common exploits, IP reputation) matched the request |
| IP rate limit | WAF per-IP rate limit exceeded (2,000 requests per 5-minute window) |
| Wrong OAuth scope | JWT does not contain the required https://gateway.internal/invoke scope |
WAF Block Response
When WAF blocks a request, the response includes the `x-amzn-waf-action` header:

    HTTP/1.1 403 Forbidden
    x-amzn-waf-action: BLOCK

The response body is an AWS WAF default block page (HTML), not a JSON payload.
WAF Rules in Effect
| Rule Group | Description |
|---|---|
| AWS Common Rule Set | Blocks common web exploits (SQL injection, XSS, etc.) |
| IP Reputation List | Blocks requests from known-bad IP addresses |
| Per-IP Rate Limit | 2,000 requests per 5-minute window per source IP |
Troubleshooting
- Check for the WAF header in the response:

      curl -v -H "Authorization: Bearer $TOKEN" \
        ${GATEWAY_URL}/v1/chat/completions 2>&1 | grep -i waf

- If IP rate limited: wait for the 5-minute window to expire, or reduce request volume.

- If scope is wrong: decode the JWT and verify the `scope` claim:

      echo "$TOKEN" | cut -d. -f2 | base64 -d 2>/dev/null | python3 -m json.tool

  Contact the gateway admin to update the Cognito app client scopes.
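Once the payload is decoded, the scope check itself is a simple membership test. This sketch assumes the `scope` claim is a single space-delimited string, as is usual for OAuth2 access tokens; the helper name `has_scope` is hypothetical.

```python
REQUIRED_SCOPE = "https://gateway.internal/invoke"


def has_scope(claims: dict, required: str = REQUIRED_SCOPE) -> bool:
    """Return True if the decoded JWT claims grant the required scope."""
    # Assumption: `scope` is a space-delimited string of granted scopes.
    granted = str(claims.get("scope", "")).split()
    return required in granted
```

A token that carries only `https://gateway.internal/admin` would fail this check for invoke requests, which corresponds to the "Wrong OAuth scope" trigger in the table above.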
429 Too Many Requests
The gateway or upstream provider is rate-limiting your requests.
Gateway Rate Limit (Team Layer)
Per-team rate limiting is enforced via DynamoDB atomic counters when the Admin API (C.1) is enabled. Two dimensions are tracked:
| Dimension | Window | Description |
|---|---|---|
| RPM (requests per minute) | 1-minute sliding window | Atomic counter per team per minute bucket |
| Daily tokens | Calendar day (UTC) | Cumulative token count per team per day |
Rate Limit Response
    {
      "allowed": false,
      "reason": "RPM limit exceeded (101/100 requests per minute)",
      "retry_after_seconds": 42
    }

| Field | Description |
|---|---|
| `allowed` | Always `false` when rate limited |
| `reason` | Human-readable explanation of which limit was hit |
| `retry_after_seconds` | Number of seconds to wait before retrying |
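A client can tell a team-level rate limit apart from other 429 responses by looking for this body shape. The sketch below parses it defensively, since other 429 sources may return a different body or none at all. `RateLimitInfo` and `parse_rate_limit_body` are illustrative names, not part of any gateway SDK.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class RateLimitInfo:
    reason: str
    retry_after_seconds: float


def parse_rate_limit_body(body: str) -> Optional[RateLimitInfo]:
    """Return rate-limit details if `body` matches the gateway's team-limit shape, else None."""
    try:
        data = json.loads(body)
    except (ValueError, TypeError):
        return None  # HTML page, empty body, or otherwise not JSON
    if not isinstance(data, dict) or data.get("allowed") is not False:
        return None
    return RateLimitInfo(
        reason=str(data.get("reason", "")),
        retry_after_seconds=float(data.get("retry_after_seconds", 0)),
    )
```

When the function returns a value, sleeping for `retry_after_seconds` before the next attempt is the simplest way to honor the limit.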
Tier Defaults
| Tier | RPM | Daily Tokens | Monthly Budget (USD) |
|---|---|---|---|
| sandbox | 20 | 100,000 | $25 |
| standard | 100 | 500,000 | $100 |
| premium | 500 | 5,000,000 | $1,000 |
| unlimited | 2,000 | unlimited | $10,000 |
Budget Enforcement
When budgets are enabled, the budget_enforcement Lambda runs in-path as an agentgateway promptGuard request webhook. If a team’s monthly budget is exhausted (utilization reaches the hard limit percentage, default 100%), the Lambda returns agentgateway’s reject contract:

    {"action": "reject"}

agentgateway maps action: reject to an HTTP 429 for the client; the request never reaches the provider. The Lambda also supports model-level budget caps: if a specific model’s spend exceeds its configured limit, only that model is rejected.
Budget enforcement includes a warning threshold (default 80%). Once utilization reaches the warning threshold, requests are still allowed, but a warning is logged.
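The two thresholds can be pictured as a three-way decision on utilization. This is a sketch of the logic as described here, not the Lambda's actual code; the function name and the `"allow"` and `"warn"` labels are assumptions made for illustration (only `reject` appears in the documented contract).

```python
def budget_decision(
    spend_usd: float,
    budget_usd: float,
    warn_pct: float = 80.0,
    hard_pct: float = 100.0,
) -> str:
    """Classify a team's spend against its monthly budget."""
    if budget_usd <= 0:
        raise ValueError("budget_usd must be positive")
    # Compare spend * 100 against pct * budget to avoid a division.
    scaled_spend = spend_usd * 100
    if scaled_spend >= hard_pct * budget_usd:
        return "reject"  # maps to {"action": "reject"} and an HTTP 429
    if scaled_spend >= warn_pct * budget_usd:
        return "warn"  # request proceeds, warning is logged
    return "allow"
```

With the `standard` tier's $100 budget, $79.99 of spend is allowed, $80 triggers the warning, and $100 is rejected.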
Provider Rate Limit
Upstream providers (OpenAI, Anthropic, Google, Azure) return their own 429 responses. These are passed through to the caller. Provider rate limits are independent of gateway rate limits.
Retry strategy: Implement exponential backoff. Most 429 responses include a Retry-After header.
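One way to implement this strategy is to prefer the server's `Retry-After` hint when it is present and fall back to exponential backoff with jitter otherwise. The sketch below only computes the delay; the `base` and `cap` defaults are arbitrary assumptions, and it handles only the delay-in-seconds form of `Retry-After`, not the HTTP-date form.

```python
import random
from typing import Mapping


def retry_delay(
    attempt: int,
    headers: Mapping[str, str],
    base: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based) after a 429."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                break  # HTTP-date form: fall through to computed backoff
    # Exponential backoff with full jitter: uniform in [0, min(cap, base * 2^attempt)].
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0.0, ceiling)
```

A caller would typically loop over a bounded number of attempts, sleeping for `retry_delay(attempt, response_headers)` after each 429. The jitter spreads retries from many clients so they do not all arrive at the same moment.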
502 Bad Gateway
The gateway could not reach the upstream LLM provider or received an error from it.
Triggers
| Trigger | Details |
|---|---|
| Provider outage | Upstream provider is down or returning errors |
| Invalid API key | The stored provider key is expired, revoked, or still set to REPLACE_ME |
| Network issue | ECS task lacks outbound internet access (NAT Gateway misconfiguration) |
| Invalid model | The specified model does not exist at the provider |
Response
The gateway passes through the upstream provider’s error response when available. When the provider is completely unreachable, the ALB returns a generic 502 error page.
Troubleshooting
- Test a different provider to isolate the issue:

      curl -H "Authorization: Bearer $TOKEN" \
        -H "Content-Type: application/json" \
        -d '{"model":"claude-sonnet-4-20250514","max_tokens":1,"messages":[{"role":"user","content":"hi"}]}' \
        ${GATEWAY_URL}/v1/chat/completions

- Run the health check with provider testing:

      ./scripts/check-health.sh --url "$GATEWAY_URL" --token "$TOKEN" --providers

- Verify the API key is set in Secrets Manager:

      aws secretsmanager get-secret-value \
        --secret-id ai-gateway/openai-api-key \
        --query SecretString --output text

  If the value is `REPLACE_ME`, the key has not been configured.
503 Service Unavailable
The gateway itself is overloaded or all ECS tasks are unhealthy.
Triggers
| Trigger | Details |
|---|---|
| No healthy targets | All ECS tasks failed health checks; ALB has no targets to route to |
| Gateway overloaded | Concurrent request count exceeds capacity; autoscaling has not yet caught up |
| Deployment in progress | A rolling deployment temporarily reduces available capacity |
Response
The ALB returns its default 503 page (HTML). There is no JSON body.
Troubleshooting
- Wait 30 to 60 seconds. Autoscaling should add capacity.

- Check ECS service health:

      aws ecs describe-services \
        --cluster ai-gateway-prod \
        --services ai-gateway-gateway \
        --query 'services[0].{desired:desiredCount,running:runningCount,events:events[:3]}'

- Check the ALB target group:

      aws elbv2 describe-target-health \
        --target-group-arn "$TARGET_GROUP_ARN"

- Run the basic health check:

      curl -s -o /dev/null -w "%{http_code}" ${GATEWAY_URL}/
      # Expected: 200
Gateway Application Errors
These errors come from the agentgateway proxy itself or are passed through from the upstream provider.
No Provider Header (migration note)
agentgateway selects the provider server-side from its rendered config, so there is no x-portkey-provider header and no {"error": "provider is not set"} failure mode. If you are migrating from an earlier (Portkey-based) release, remove any x-portkey-* headers from your client. See the API Reference for how provider and model selection works now.
Invalid Model
The provider rejected the model name, or no modelAlias / backend in the active chain serves it. This typically means:
- The model does not exist at the resolved provider
- The model name is misspelled
- The provider account does not have access to the model
The error response is passed through from the upstream provider.
Cognito Token Endpoint Errors
These errors occur when obtaining a JWT from the Cognito token endpoint, before any gateway request is made.
invalid_grant
    {"error": "invalid_grant"}

| Cause | Fix |
|---|---|
| Wrong token endpoint | Verify GATEWAY_TOKEN_ENDPOINT is https://<domain>.auth.<region>.amazoncognito.com/oauth2/token |
| Wrong credentials | Confirm GATEWAY_CLIENT_ID and GATEWAY_CLIENT_SECRET are correct |
| Wrong grant type | The Cognito app client must be configured for client_credentials grant |
invalid_client
    {"error": "invalid_client"}

The client ID or secret is incorrect. This is a standard OAuth2 error returned by the Cognito token endpoint. Double-check the values or contact the gateway admin.
invalid_scope
    {"error": "invalid_scope"}

The requested scope is not configured on the Cognito app client. This is a standard OAuth2 error returned by the Cognito token endpoint. Valid scopes are https://gateway.internal/invoke and https://gateway.internal/admin.
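To make the three error cases concrete, here is a sketch of what a `client_credentials` token request looks like and how each error maps to a first check. It builds the request without sending it. It assumes the client authenticates with HTTP Basic credentials, which the Cognito token endpoint accepts; whether `get-gateway-token.sh` does it this way is not stated here. `build_token_request` and `TOKEN_ERROR_HINTS` are hypothetical names.

```python
import base64
import urllib.parse
import urllib.request

# First thing to check for each error code, summarized from this section.
TOKEN_ERROR_HINTS = {
    "invalid_grant": "check the token endpoint URL, credentials, and client_credentials grant",
    "invalid_client": "check the client ID and client secret",
    "invalid_scope": "request a scope configured on the Cognito app client",
}


def build_token_request(
    token_endpoint: str,
    client_id: str,
    client_secret: str,
    scope: str = "https://gateway.internal/invoke",
) -> urllib.request.Request:
    """Build (but do not send) an OAuth2 client_credentials token request."""
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": scope}
    ).encode()
    return urllib.request.Request(
        token_endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

Passing the returned request to `urllib.request.urlopen` would perform the call; on failure, the `error` field of the JSON body can be looked up in `TOKEN_ERROR_HINTS`.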
Guardrail Errors
Content safety is inline Bedrock Guardrails. agentgateway’s promptGuard policy calls the ApplyGuardrail API in-path on both the request (source INPUT) and the response (source OUTPUT), signed with the gateway’s ECS task role. There is no separate content-scanner Lambda and no per-team PII-mode endpoint.
Detect / Log-Only by Default
With enable_guardrails = true and enforce_guardrails = false (the default), every filter action is NONE: ApplyGuardrail evaluates the content and emits assessments to the logs, but the gateway passes the request through untouched. In this mode a tripped guardrail does not produce an error.
Blocking Mode
When enforce_guardrails = true, filters BLOCK on a trip (and topic filters are attached). A blocked request is denied before (or after) the provider call, depending on whether the input or output filter tripped. The exact response depends on the guardrail policy that triggered:
| Policy | Example Reason |
|---|---|
| Content filter | Harmful content detected (hate, violence, sexual, misconduct) |
| PII blocking | PII detected in input or output (SSN, credit card, email, phone) |
| Topic policy | Request matches a blocked topic (e.g., competitor products, internal financials) |
| Word policy | Request or response contains a blocked word or phrase |
See Terraform Variables — Guardrails for the toggles.
Error Response Quick Reference
| Symptom | Likely Code | First Check |
|---|---|---|
| HTML error page, no JSON | 401 or 503 | ALB-level rejection; check JWT or ECS health |
| `x-amzn-waf-action: BLOCK` header | 403 | WAF rule triggered; check IP and request patterns |
| `"allowed": false` with `retry_after_seconds` | 429 | Team rate limit (Admin-API counter) exceeded |
| 429 with no provider response | 429 | Budget exhausted; budget_enforcement returned `{"action": "reject"}` |
| Request blocked in enforce mode | Varies | Bedrock Guardrail tripped with enforce_guardrails = true |
| Provider error passthrough | 502 | Check API key in Secrets Manager; test an alternate model/provider |
| No response / timeout | — | Check gateway URL, VPN, and ECS service status |
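The table above can be turned into a small triage helper for client-side logging. This is a rough sketch: the returned labels are invented for illustration, and the checks only encode the symptoms listed in this reference, so ambiguous cases (for example a 429 without the team-limit body) are reported as such rather than guessed.

```python
import json
from typing import Mapping


def classify_error(status: int, headers: Mapping[str, str], body: str) -> str:
    """Map an error response to its most likely source, per the quick reference."""
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        parsed = json.loads(body)
    except ValueError:
        parsed = None  # HTML page or empty body

    if status == 403 and lowered.get("x-amzn-waf-action", "").upper() == "BLOCK":
        return "waf-block"
    if status == 401:
        return "alb-jwt-rejected"
    if status == 403:
        return "scope-or-waf"
    if status == 429:
        if isinstance(parsed, dict) and parsed.get("allowed") is False:
            return "team-rate-limit"
        return "budget-or-provider-limit"
    if status == 502:
        return "provider-error"
    if status == 503:
        return "no-healthy-targets-or-overload"
    return "unknown"
```

For instance, a 429 whose body contains `"allowed": false` is classified as a team rate limit, while a 403 carrying `x-amzn-waf-action: BLOCK` points at WAF rather than at the token's scope.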