Error Codes

Complete reference for error responses returned by the AI Gateway, organized by source: ALB/JWT validation, WAF, gateway application logic, rate limiting, and upstream providers.


| Code | Meaning | Source | Common Cause |
|------|---------|--------|--------------|
| 200 | OK | Gateway | Request succeeded |
| 401 | Unauthorized | ALB | Invalid, expired, or missing JWT |
| 403 | Forbidden | WAF or ALB | WAF block, wrong OAuth scope, or IP rate limit |
| 429 | Too Many Requests | Gateway or provider | Budget exhausted, RPM/token limit, or provider rate limit |
| 502 | Bad Gateway | Gateway | Upstream provider unreachable or returned an error |
| 503 | Service Unavailable | ALB | No healthy ECS tasks or gateway overloaded |

The request was processed successfully. The response body matches the format of the upstream provider (OpenAI Chat Completions or Anthropic Messages).


The ALB rejected the request because the JWT is invalid or missing. The ALB performs JWT validation before the request reaches the gateway container.

| Trigger | Details |
|---------|---------|
| Missing Authorization header | No Bearer <jwt> token in the request |
| Expired token | Cognito JWTs have a 1-hour TTL |
| Invalid signature | Token not signed by the expected Cognito User Pool |
| Malformed token | Token is not a valid JWT (wrong format, corrupt base64) |

The ALB returns an HTML error page (not JSON) with HTTP 401. Because JWT validation fails at the ALB layer, the request never reaches the gateway and no JSON error body is produced.

  1. Refresh your token — Cognito JWTs expire after 1 hour:

    TOKEN=$(./scripts/get-gateway-token.sh)
  2. Verify the token is well-formed:

    echo "$TOKEN" | cut -d. -f2 | base64 -d 2>/dev/null | python3 -m json.tool
  3. Check expiry:

    echo "$TOKEN" | cut -d. -f2 | base64 -d 2>/dev/null | python3 -c "
    import json, sys, datetime
    data = json.load(sys.stdin)
    exp = data.get('exp', 0)
    remaining = exp - int(datetime.datetime.now().timestamp())
    print(f'Expires in {remaining // 60} minutes')
    "
  4. Claude Code users — Ensure apiKeyHelper is configured and executable:

    claude config set --global apiKeyHelper ~/workplace/ai-gateway/scripts/get-gateway-token.sh
    chmod +x ~/workplace/ai-gateway/scripts/get-gateway-token.sh
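JWT segments are base64url-encoded without padding, so `base64 -d` in steps 2 and 3 can fail or truncate output on some tokens. The following is a minimal Python sketch of the same checks that restores the padding first. The function names `decode_jwt_payload` and `seconds_until_expiry` are illustrative and not part of the gateway scripts, and the payload is decoded without verifying the signature.

```python
import base64
import json
import time
from typing import Optional


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload (second segment) of a JWT. Does NOT verify the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a JWT: expected three dot-separated segments")
    payload = parts[1]
    # base64url segments omit '=' padding; restore it so the decoder accepts them.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def seconds_until_expiry(claims: dict, now: Optional[float] = None) -> int:
    """Seconds until the `exp` claim; a negative value means the token has expired."""
    now = time.time() if now is None else now
    return int(claims.get("exp", 0) - now)
```

For example, `seconds_until_expiry(decode_jwt_payload(token)) // 60` gives the remaining lifetime in minutes, where `token` holds the value of `$TOKEN`.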

The request was authenticated (valid JWT) but rejected by WAF rules or authorization checks.

| Trigger | Details |
|---------|---------|
| WAF block | AWS Managed Rules (common exploits, IP reputation) matched the request |
| IP rate limit | WAF per-IP rate limit exceeded (2,000 requests per 5-minute window) |
| Wrong OAuth scope | JWT does not contain the required https://gateway.internal/invoke scope |

When WAF blocks a request, the response includes the x-amzn-waf-action header:

HTTP/1.1 403 Forbidden
x-amzn-waf-action: BLOCK

The response body is an AWS WAF default block page (HTML), not a JSON payload.

| Rule Group | Description |
|------------|-------------|
| AWS Common Rule Set | Blocks common web exploits (SQL injection, XSS, etc.) |
| IP Reputation List | Blocks requests from known-bad IP addresses |
| Per-IP Rate Limit | 2,000 requests per 5-minute window per source IP |

  1. Check for the WAF header in the response:

    curl -v -H "Authorization: Bearer $TOKEN" \
    ${GATEWAY_URL}/v1/chat/completions 2>&1 | grep -i waf
  2. If IP rate limited — Wait for the 5-minute window to expire, or reduce request volume.

  3. If scope is wrong — Decode the JWT and verify the scope claim:

    echo "$TOKEN" | cut -d. -f2 | base64 -d 2>/dev/null | python3 -m json.tool

    Contact the gateway admin to update the Cognito app client scopes.
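The scope check in step 3 can also be done programmatically. This sketch assumes the decoded claims carry a space-delimited `scope` string, which is how Cognito access tokens usually present scopes. `has_scope` is a hypothetical helper, not part of the gateway tooling.

```python
REQUIRED_SCOPE = "https://gateway.internal/invoke"


def has_scope(claims: dict, required: str = REQUIRED_SCOPE) -> bool:
    """Return True if the decoded JWT claims grant the required scope.

    Assumes `scope` is a single space-delimited string (an assumption
    about the token layout, not something the gateway guarantees).
    """
    granted = str(claims.get("scope", "")).split()
    return required in granted
```

If this returns False for a token that is otherwise valid, the 403 is most likely a scope problem rather than a WAF block.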


The gateway or upstream provider is rate-limiting your requests.

Per-team rate limiting is enforced via DynamoDB atomic counters when the Admin API (C.1) is enabled. Two dimensions are tracked:

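| Dimension | Window | Description |
|-----------|--------|-------------|
| RPM (requests per minute) | 1-minute sliding window | Atomic counter per team per minute bucket |
| Daily tokens | Calendar day (UTC) | Cumulative token count per team per day |

When a team limit is exceeded, the response body looks like this:

{
  "allowed": false,
  "reason": "RPM limit exceeded (101/100 requests per minute)",
  "retry_after_seconds": 42
}

| Field | Description |
|-------|-------------|
| allowed | Always false when rate limited |
| reason | Human-readable explanation of which limit was hit |
| retry_after_seconds | Number of seconds to wait before retrying |

Limits by tier:

| Tier | RPM | Daily Tokens | Monthly Budget (USD) |
|------|-----|--------------|----------------------|
| sandbox | 20 | 100,000 | $25 |
| standard | 100 | 500,000 | $100 |
| premium | 500 | 5,000,000 | $1,000 |
| unlimited | 2,000 | unlimited | $10,000 |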
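To illustrate how a per-team, per-minute-bucket counter produces the response shown above, here is a minimal in-memory sketch. It is not the gateway's implementation: a plain dictionary stands in for the DynamoDB atomic counter, the class name `MinuteBucketLimiter` is hypothetical, and the `{"allowed": True}` success shape is an assumption.

```python
from collections import defaultdict

# RPM limits per tier, taken from the tier table.
TIER_RPM = {"sandbox": 20, "standard": 100, "premium": 500, "unlimited": 2000}


class MinuteBucketLimiter:
    """Counts requests per (team, minute bucket). Old buckets are never evicted here."""

    def __init__(self) -> None:
        self._counts = defaultdict(int)

    def check(self, team: str, tier: str, now: float) -> dict:
        limit = TIER_RPM[tier]
        bucket = int(now // 60)  # one bucket per wall-clock minute
        # In DynamoDB this increment would be a single atomic ADD update.
        self._counts[(team, bucket)] += 1
        count = self._counts[(team, bucket)]
        if count <= limit:
            return {"allowed": True}  # hypothetical success shape
        return {
            "allowed": False,
            "reason": f"RPM limit exceeded ({count}/{limit} requests per minute)",
            # Seconds remaining until the next minute bucket starts.
            "retry_after_seconds": 60 - int(now % 60),
        }
```

Incrementing before comparing means rejected requests also count toward the bucket, which is why the example reason string reads 101/100.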

When budgets are enabled, the budget_enforcement Lambda runs in-path as an agentgateway promptGuard request webhook. If a team’s monthly budget is exhausted (utilization reaches the hard limit percentage, default 100%), the Lambda returns agentgateway’s reject contract:

{"action": "reject"}

agentgateway maps action: reject to an HTTP 429 for the client; the request never reaches the provider. The Lambda also supports model-level budget caps — if a specific model’s spend exceeds its configured limit, only that model is rejected.

Budget enforcement also has a warning threshold (default 80%). Once utilization reaches the warning threshold, requests are still allowed, but a warning is logged.
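The threshold logic can be sketched as follows, assuming utilization is computed as spend divided by the monthly budget. The function name, parameters, and the "allow" / "warn" return labels are hypothetical. Only the `{"action": "reject"}` body comes from the text above.

```python
# Body returned to agentgateway when a request must be rejected.
REJECT_RESPONSE = {"action": "reject"}


def budget_decision(
    spend_usd: float,
    budget_usd: float,
    warn_pct: float = 80.0,
    hard_pct: float = 100.0,
) -> str:
    """Classify a request as 'allow', 'warn' (allowed but logged), or 'reject'."""
    if budget_usd <= 0:
        raise ValueError("budget_usd must be positive")
    utilization = 100.0 * spend_usd / budget_usd
    if utilization >= hard_pct:
        return "reject"  # caller responds with REJECT_RESPONSE
    if utilization >= warn_pct:
        return "warn"  # request proceeds; a warning is logged
    return "allow"
```

A model-level cap could reuse the same function with the model's own spend and limit in place of the team totals.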

Upstream providers (OpenAI, Anthropic, Google, Azure) return their own 429 responses. These are passed through to the caller. Provider rate limits are independent of gateway rate limits.

Retry strategy: Implement exponential backoff. Most 429 responses include a Retry-After header.
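One way to implement that retry strategy is to honor Retry-After when it is given in seconds and otherwise fall back to exponential backoff with jitter. The sketch below is a client-side illustration only. `retry_delay` and its defaults (`base`, `cap`) are assumptions, not values prescribed by the gateway.

```python
import random
from typing import Optional


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Seconds to sleep before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Retry-After given as delta-seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not parsed here; fall back to backoff
    # Exponential growth capped at `cap`, with full jitter to spread out retries.
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0.0, ceiling)
```

A caller would sleep for `retry_delay(attempt, response_headers.get("Retry-After"))` between attempts and give up after a fixed number of tries.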


The gateway could not reach the upstream LLM provider or received an error from it.

| Trigger | Details |
|---------|---------|
| Provider outage | Upstream provider is down or returning errors |
| Invalid API key | The stored provider key is expired, revoked, or still set to REPLACE_ME |
| Network issue | ECS task lacks outbound internet access (NAT Gateway misconfiguration) |
| Invalid model | The specified model does not exist at the provider |

The gateway passes through the upstream provider’s error response when available. When the provider is completely unreachable, the ALB returns a generic 502 error page.

  1. Test a different provider to isolate the issue:

    curl -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"model":"claude-sonnet-4-20250514","max_tokens":1,"messages":[{"role":"user","content":"hi"}]}' \
    ${GATEWAY_URL}/v1/chat/completions
  2. Run the health check with provider testing:

    ./scripts/check-health.sh --url "$GATEWAY_URL" --token "$TOKEN" --providers
  3. Verify the API key is set in Secrets Manager:

    aws secretsmanager get-secret-value \
    --secret-id ai-gateway/openai-api-key \
    --query SecretString --output text

    If the value is REPLACE_ME, the key has not been configured.


The gateway itself is overloaded or all ECS tasks are unhealthy.

| Trigger | Details |
|---------|---------|
| No healthy targets | All ECS tasks failed health checks; ALB has no targets to route to |
| Gateway overloaded | Concurrent request count exceeds capacity; autoscaling has not yet caught up |
| Deployment in progress | A rolling deployment temporarily reduces available capacity |

The ALB returns its default 503 page (HTML). There is no JSON body.

  1. Wait 30-60 seconds. Autoscaling should add capacity.

  2. Check ECS service health:

    aws ecs describe-services \
    --cluster ai-gateway-prod \
    --services ai-gateway-gateway \
    --query 'services[0].{desired:desiredCount,running:runningCount,events:events[:3]}'
  3. Check the ALB target group:

    aws elbv2 describe-target-health \
    --target-group-arn "$TARGET_GROUP_ARN"
  4. Run the basic health check:

    curl -s -o /dev/null -w "%{http_code}" ${GATEWAY_URL}/
    # Expected: 200

These errors come from the agentgateway proxy itself or are passed through from the upstream provider.

agentgateway selects the provider server-side from its rendered config, so there is no x-portkey-provider header and no {"error": "provider is not set"} failure mode. If you are migrating from an earlier (Portkey-based) release, remove any x-portkey-* headers from your client. See the API Reference for how provider and model selection works now.

The provider rejected the model name, or no modelAlias / backend in the active chain serves it. This typically means:

  • The model does not exist at the resolved provider
  • The model name is misspelled
  • The provider account does not have access to the model

The error response is passed through from the upstream provider.


These errors occur when obtaining a JWT from the Cognito token endpoint, before any gateway request is made.

{"error": "invalid_grant"}
| Cause | Fix |
|-------|-----|
| Wrong token endpoint | Verify GATEWAY_TOKEN_ENDPOINT is https://<domain>.auth.<region>.amazoncognito.com/oauth2/token |
| Wrong credentials | Confirm GATEWAY_CLIENT_ID and GATEWAY_CLIENT_SECRET are correct |
| Wrong grant type | The Cognito app client must be configured for client_credentials grant |

{"error": "invalid_client"}

The client ID or secret is incorrect. This is a standard OAuth2 error returned by the Cognito token endpoint. Double-check the values or contact the gateway admin.

{"error": "invalid_scope"}

The requested scope is not configured on the Cognito app client. This is a standard OAuth2 error returned by the Cognito token endpoint. Valid scopes are https://gateway.internal/invoke and https://gateway.internal/admin.
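For reference, a client_credentials token request to the Cognito token endpoint can be sketched as below. The function only builds the request so each part can be inspected. `build_token_request` is a hypothetical helper, and the endpoint, client ID, and secret are expected to come from GATEWAY_TOKEN_ENDPOINT, GATEWAY_CLIENT_ID, and GATEWAY_CLIENT_SECRET.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(
    endpoint: str,
    client_id: str,
    client_secret: str,
    scope: str = "https://gateway.internal/invoke",
) -> urllib.request.Request:
    """Build (but do not send) an OAuth2 client_credentials token request."""
    # Client credentials go in an HTTP Basic Authorization header.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    # The grant type and requested scope go in a form-encoded body.
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": scope}
    ).encode()
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

Sending it with `urllib.request.urlopen(request)` raises `urllib.error.HTTPError` on a 4xx response. Reading that error's body yields the JSON shown above, such as `{"error": "invalid_client"}`.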


Content safety is handled inline by Bedrock Guardrails. agentgateway’s promptGuard policy calls the ApplyGuardrail API in-path on both the request (source INPUT) and the response (source OUTPUT), signed with the gateway’s ECS task role. There is no separate content-scanner Lambda and no per-team PII-mode endpoint.

With enable_guardrails = true and enforce_guardrails = false (the default), every filter action is NONE: ApplyGuardrail evaluates the content and emits assessments to the logs, but the gateway passes the request through untouched. In this mode a tripped guardrail does not produce an error.

When enforce_guardrails = true, filters BLOCK on a trip (and topic filters are attached). A blocked request is denied before the provider call if the input filter tripped, or after it if the output filter tripped. The exact response depends on the guardrail policy that triggered:

| Policy | Example Reason |
|--------|----------------|
| Content filter | Harmful content detected (hate, violence, sexual, misconduct) |
| PII blocking | PII detected in input or output (SSN, credit card, email, phone) |
| Topic policy | Request matches a blocked topic (e.g., competitor products, internal financials) |
| Word policy | Request or response contains a blocked word or phrase |

See Terraform Variables — Guardrails for the toggles.


| Symptom | Likely Code | First Check |
|---------|-------------|-------------|
| HTML error page, no JSON | 401 or 503 | ALB-level rejection; check JWT or ECS health |
| x-amzn-waf-action: BLOCK header | 403 | WAF rule triggered; check IP and request patterns |
| "allowed": false with retry_after_seconds | 429 | Team rate limit (Admin-API counter) exceeded |
| 429 with no provider response | 429 | Budget exhausted; budget_enforcement returned {"action": "reject"} |
| Request blocked in enforce mode | Varies | Bedrock Guardrail tripped with enforce_guardrails = true |
| Provider error passthrough | 502 | Check API key in Secrets Manager; test an alternate model/provider |
| No response / timeout |  | Check gateway URL, VPN, and ECS service status |
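The table can be turned into a rough client-side triage helper. This is a sketch under simplifying assumptions: `triage` is a hypothetical function, an HTML body is detected by a leading `<`, and a 429 without the team-limit JSON is reported as either budget or provider because the two cannot be told apart from the status code alone.

```python
import json
from typing import Mapping, Optional


def triage(status: Optional[int], headers: Mapping[str, str], body: str) -> str:
    """Map a response (or its absence) to the first check from the table."""
    if status is None:
        return "no response: check gateway URL, VPN, and ECS service status"
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    if status == 403 and lowered.get("x-amzn-waf-action", "").upper() == "BLOCK":
        return "WAF rule triggered: check IP and request patterns"
    if status == 429:
        try:
            payload = json.loads(body)
        except ValueError:
            payload = None  # empty or non-JSON body
        if isinstance(payload, dict) and payload.get("allowed") is False:
            return "team rate limit exceeded: wait retry_after_seconds"
        return "budget exhausted or provider rate limit"
    if status in (401, 503) and body.lstrip().startswith("<"):
        return "ALB-level rejection: check JWT (401) or ECS health (503)"
    if status == 502:
        return "provider error: check API key in Secrets Manager"
    return "unclassified"
```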