Troubleshooting
Solutions for common issues when using the AI Gateway.
401 Unauthorized
The ALB rejected the request because the JWT is invalid or missing.
Possible causes:
- Expired token: Cognito JWTs have a 1-hour TTL. Re-run the token script:

      TOKEN=$(./scripts/get-gateway-token.sh)

- Missing apiKeyHelper (Claude Code): Ensure the helper is set and the script is executable:

      claude config set --global apiKeyHelper ~/workplace/ai-gateway/scripts/get-gateway-token.sh
      chmod +x ~/workplace/ai-gateway/scripts/get-gateway-token.sh

- Invalid JWT: Verify the token is well-formed:

      echo "$TOKEN" | cut -d. -f2 | base64 -d 2>/dev/null | python3 -m json.tool

- TTL not set as env var (Claude Code): CLAUDE_CODE_API_KEY_HELPER_TTL_MS must be a real environment variable, not set in the settings.json env block (bug #7660).
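The shell decode relies on base64 -d, which can fail quietly because JWT payloads are unpadded base64url. A minimal Python sketch that restores the padding and reports the time left on the token, assuming a standard three-part JWT. The function names decode_jwt_payload and seconds_until_expiry are illustrative and not part of the gateway scripts. It only inspects claims and does not verify the signature.

```python
import base64
import json
import time
from typing import Optional


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # JWTs use unpadded base64url; restore padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def seconds_until_expiry(claims: dict, now: Optional[float] = None) -> int:
    """Seconds left before the 'exp' claim; negative if already expired."""
    current = time.time() if now is None else now
    return int(claims.get("exp", 0) - current)
```

A malformed token raises ValueError instead of printing nothing, which makes the "Invalid JWT" case easier to tell apart from an expired token.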
403 Forbidden
The request was authenticated but rejected by authorization or WAF rules.
Possible causes:
- Wrong scope: The Cognito app client may not have the required https://gateway.internal/invoke scope. Contact the gateway admin to verify client configuration.
- WAF block: AWS WAF may have blocked the request. Check the x-amzn-waf-action response header.
- IP rate limit exceeded: WAF enforces a per-IP limit of 2,000 requests per 5 minutes. Wait and retry, or contact the admin for a higher threshold.
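The "Wrong scope" cause can be checked locally before contacting the admin. A small sketch, assuming the access token carries a space-separated scope claim (the usual shape for OAuth access tokens, which may not match this deployment). The helper has_scope is hypothetical, and the claims dict is assumed to be already decoded from the token payload.

```python
REQUIRED_SCOPE = "https://gateway.internal/invoke"  # scope named in this guide


def has_scope(claims: dict, required: str = REQUIRED_SCOPE) -> bool:
    """True if the decoded token claims include the required OAuth scope."""
    # Assumption: scopes are stored as one space-separated string.
    granted = str(claims.get("scope", "")).split()
    return required in granted
```

If this returns False for a fresh token, the app client configuration is the likely culprit rather than WAF.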
429 Too Many Requests
The gateway or upstream provider is rate-limiting your requests.
Possible causes:
- Budget exceeded: Your team’s token budget has been exhausted for the current period. Decode your token to confirm its team and client claims:

      # Decode your token to see team/client claims
      echo "$TOKEN" | cut -d. -f2 | base64 -d 2>/dev/null | python3 -m json.tool

  Contact the gateway admin to check remaining budget or request an increase.
- Provider rate limit: The upstream LLM provider (OpenAI, Anthropic, etc.) is rate-limiting requests. This is typically transient.
  - Wait 10-30 seconds and retry.
  - If persistent, the gateway may need higher provider-side rate limits.
- WAF rate limit: AWS WAF enforces a per-IP request limit (2,000 requests per 5 minutes). If you are sending high-volume automated requests, you may hit this limit.
  - Spread requests over time or contact the admin for a higher threshold.
Retry strategy: Implement exponential backoff. Most 429 responses include a Retry-After header with the number of seconds to wait.
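One way to sketch the retry strategy: honor Retry-After when it is a plain number of seconds, otherwise fall back to exponential backoff with jitter. The function name retry_delay and the base and cap defaults are assumptions chosen for illustration, not gateway settings.

```python
import random
from typing import Optional


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Honor the server's hint when it is a number of seconds.
            return max(float(retry_after), 0.0)
        except ValueError:
            pass  # e.g. an HTTP-date value; fall back to backoff
    # Exponential ceiling: base, 2*base, 4*base, ... limited by cap.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick a random delay up to the ceiling so that
    # many clients do not retry at the same instant.
    return random.uniform(0.0, ceiling)
```

A caller would sleep for retry_delay(attempt, response_headers.get("Retry-After")) between attempts and give up after a fixed number of retries.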
502 Bad Gateway
The gateway could not reach the upstream LLM provider.
Possible causes:
- Provider outage: The upstream provider (OpenAI, Anthropic, Google, etc.) may be experiencing downtime. Check the provider’s status page.
- Invalid provider API key: The gateway’s stored API key for the provider may be expired or revoked. Contact the gateway admin.
- Network issue: The ECS task may not have outbound internet access. Check NAT Gateway and VPC endpoint configuration.
What to try:
- Test a different provider to isolate the issue:

      # Try anthropic instead of openai, or vice versa
      curl -H "Authorization: Bearer $TOKEN" \
        -H "x-portkey-provider: anthropic" \
        -H "Content-Type: application/json" \
        -d '{"model":"claude-sonnet-4-20250514","max_tokens":1,"messages":[{"role":"user","content":"hi"}]}' \
        "${GATEWAY_URL}/v1/chat/completions"

- Run the health check with provider testing:

      TOKEN="$TOKEN" ./scripts/check-health.sh --url "$GATEWAY_URL" --token "$TOKEN" --providers
503 Service Unavailable
The gateway itself is overloaded or unhealthy.
Possible causes:
- All ECS tasks unhealthy: The ALB returns 503 when no healthy targets are available. Check ECS service events in the AWS console.
- Gateway overloaded: Too many concurrent requests for the current capacity. The auto-scaling policy should add more tasks, but there is a ramp-up delay.
- Deployment in progress: A rolling deployment may temporarily reduce capacity.
What to try:
- Wait 30-60 seconds and retry. Auto-scaling should recover.
- Run the basic health check:

      ./scripts/check-health.sh --url "$GATEWAY_URL"

- If persistent, contact the gateway admin to check ECS service health and scaling configuration.
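The wait-and-retry advice can be automated with a small polling loop. This is a sketch under assumptions: probe is a hypothetical callable that returns an HTTP status code (for example, a wrapper around a GET against the gateway URL), and the defaults of 6 attempts spaced 10 seconds apart are picked only to roughly cover the 30-60 second recovery window.

```python
import time
from typing import Callable


def wait_until_healthy(probe: Callable[[], int], attempts: int = 6,
                       delay: float = 10.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `probe` until it returns HTTP 200 or attempts run out."""
    for i in range(attempts):
        if probe() == 200:
            return True
        # Do not sleep after the final failed attempt.
        if i < attempts - 1:
            sleep(delay)
    return False
```

A False result after the full window is the point at which to contact the gateway admin.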
Token Refresh
Cognito JWTs expire after 1 hour. Here is how to refresh for each agent type.
Claude Code
Claude Code handles token refresh automatically. The apiKeyHelper script is re-invoked:
- Proactively, based on CLAUDE_CODE_API_KEY_HELPER_TTL_MS (recommended: 3000000 = 50 minutes)
- Reactively, on any 401 response
No manual intervention is needed. If token refresh is failing, check that the helper script is executable and that your M2M credentials are still valid:

    chmod +x ~/workplace/ai-gateway/scripts/get-gateway-token.sh
    ./scripts/get-gateway-token.sh  # Should print a JWT

Other Agents
For agents that read tokens from environment variables (OpenCode, Goose, Codex CLI, LangChain), tokens do not auto-refresh. Options:
- Re-export the variable when the token expires:

      export OPENAI_API_KEY=$(./scripts/get-gateway-token.sh)

- Use the caching wrapper to auto-refresh (recommended). See Shell Wrapper Pattern under Authentication.
- Restart the agent after exporting a fresh token, since some agents read the env var only at startup.
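For Python-based agents such as LangChain, the same caching idea can live in-process. The sketch below is not the Shell Wrapper Pattern from the Authentication page; it is an illustrative analogue. CachedToken and its fetch and clock parameters are hypothetical names, and the 3000-second default mirrors the 50-minute refresh interval recommended for Claude Code.

```python
import time
from typing import Callable, Optional


class CachedToken:
    """Cache a token and refetch it shortly before it expires."""

    def __init__(self, fetch: Callable[[], str], ttl: float = 3000.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch    # e.g. runs ./scripts/get-gateway-token.sh
        self._ttl = ttl        # 50 minutes, under the 1-hour JWT lifetime
        self._clock = clock
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        """Return the cached token, refetching once the TTL has elapsed."""
        now = self._clock()
        if self._token is None or now - self._fetched_at >= self._ttl:
            self._token = self._fetch()
            self._fetched_at = now
        return self._token
```

One possible fetch is `lambda: subprocess.check_output(["./scripts/get-gateway-token.sh"], text=True).strip()`, with the result of get() passed as the API key on each request.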
Checking Token Expiry
Use the health check script to see when your token expires:

    TOKEN=$(./scripts/get-gateway-token.sh)
    TOKEN="$TOKEN" ./scripts/check-health.sh --url "$GATEWAY_URL" --token "$TOKEN"

Or decode manually:

    echo "$TOKEN" | cut -d. -f2 | base64 -d 2>/dev/null | python3 -c "
    import json, sys, datetime
    data = json.load(sys.stdin)
    exp = data.get('exp', 0)
    remaining = exp - int(datetime.datetime.now().timestamp())
    print(f'Expires in {remaining // 60} minutes ({datetime.datetime.fromtimestamp(exp)})')
    "

Connection Refused / Timeout
The gateway URL is unreachable.
Possible causes:
- Wrong gateway URL: Verify the URL is correct:

      curl -s -o /dev/null -w "%{http_code}" "${GATEWAY_URL}/"

  Expect 200. If you get no response, the URL is wrong or the gateway is down.
- VPN required: If the ALB is in a private network, confirm your VPN is connected.
- Unhealthy ALB target: The ALB health check path is / on port 8787. If all targets are unhealthy, the ALB returns 503. Check ECS service events in the AWS console.
Missing x-portkey-provider Header

    {"error": "provider is not set"}

Every request must include the x-portkey-provider header. Verify per agent:
- Claude Code: Check that ANTHROPIC_CUSTOM_HEADERS is set (newline-separated format):

      echo "$ANTHROPIC_CUSTOM_HEADERS"
      # Should output: x-portkey-provider: anthropic

- OpenCode: Check that options.headers is present in opencode.json:

      "options": {
        "headers": {
          "x-portkey-provider": "openai"
        }
      }

- Goose: Check that OPENAI_EXTRA_HEADERS is set:

      echo "$OPENAI_EXTRA_HEADERS"
      # Should output: x-portkey-provider: openai

- Continue.dev: Check that requestOptions.headers is present in ~/.continue/config.yaml:

      requestOptions:
        headers:
          x-portkey-provider: openai

- LangChain: Check that default_headers is passed to ChatOpenAI:

      llm = ChatOpenAI(
          default_headers={"x-portkey-provider": "openai"},
          ...
      )

- Codex CLI: Check that the headers section exists in ~/.codex/config.toml:

      [model_providers.gateway.headers]
      x-portkey-provider = "openai"

Invalid Grant / Token Endpoint Error

    {"error": "invalid_grant"}

The Cognito token request failed.
Possible causes:
- Wrong token endpoint: Verify GATEWAY_TOKEN_ENDPOINT is the full URL: https://<domain>.auth.<region>.amazoncognito.com/oauth2/token
- Wrong credentials: Confirm GATEWAY_CLIENT_ID and GATEWAY_CLIENT_SECRET are correct.
- Wrong grant type: The Cognito app client must be configured for the client_credentials grant type. Contact the gateway admin if this is not set.
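To see which of these three inputs is at fault, it can help to build the token request by hand. The sketch below constructs a standard OAuth2 client_credentials request with HTTP Basic client authentication, which the Cognito token endpoint accepts. The helper build_token_request is illustrative, and the actual get-gateway-token.sh script may assemble its request differently.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(endpoint: str, client_id: str, client_secret: str,
                        scope: str) -> urllib.request.Request:
    """Build (but do not send) an OAuth2 client_credentials token request."""
    # Client authentication: base64("client_id:client_secret") as Basic auth.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    form = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": scope}
    ).encode()
    return urllib.request.Request(
        endpoint,
        data=form,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

Passing the values of GATEWAY_TOKEN_ENDPOINT, GATEWAY_CLIENT_ID, and GATEWAY_CLIENT_SECRET and sending the result with urllib.request.urlopen returns a JSON body on success. A rejected request raises urllib.error.HTTPError, whose body carries the error field.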
MCP Tool Search Disabled (Claude Code)
When connected to a non-first-party host, Claude Code disables MCP tool search by default. Symptoms: tool search returns no results, or tools from MCP servers are not discovered.
Fix:

    export ENABLE_TOOL_SEARCH=true

Add this to the same shell profile block as your other gateway environment variables.
Provider Override Not Working
Requests are going to api.openai.com instead of the gateway.
- Codex CLI: Do not use the built-in openai provider name; Codex CLI does not allow overriding it. Use a custom provider name:

      [model_providers.gateway]
      name = "AI Gateway"
      base_url = "${GATEWAY_URL}/v1"

  Launch with: codex --provider gateway --model gpt-4.1
- Goose: Use OPENAI_HOST (not OPENAI_BASE_URL), and omit the /v1 suffix:

      export OPENAI_HOST="${GATEWAY_URL}"

- Claude Code: Ensure ANTHROPIC_BASE_URL is exported in your shell profile, not only set in the settings JSON.
Config Precedence
When environment variables and config files conflict, the following precedence applies:
| Agent | Precedence (highest first) |
|---|---|
| Claude Code | Env vars > claude config settings > defaults |
| OpenCode | opencode.json in project root > global config |
| Goose | Env vars > ~/.config/goose/config.yaml |
| Continue.dev | ~/.continue/config.yaml (only source) |
| LangChain | Constructor args > env vars |
| Codex CLI | CLI flags > config.toml > env vars |