Troubleshooting

Solutions for common issues when using the AI Gateway.

401 Unauthorized

The ALB rejected the request because the JWT is missing, malformed, or expired.

Possible causes:

Expired token — Cognito JWTs have a 1-hour TTL. Re-run the token script:
```
TOKEN=$(./scripts/get-gateway-token.sh)
```

Missing apiKeyHelper (Claude Code) — Ensure the helper is set and the script is executable:

claude config set --global apiKeyHelper ~/workplace/ai-gateway/scripts/get-gateway-token.sh
chmod +x ~/workplace/ai-gateway/scripts/get-gateway-token.sh

Invalid JWT — Verify the token is well-formed:

echo "$TOKEN" | cut -d. -f2 | base64 -d 2>/dev/null | python3 -m json.tool

TTL not set as env var (Claude Code) — CLAUDE_CODE_API_KEY_HELPER_TTL_MS must be a real environment variable, not set in the settings.json env block (bug #7660).
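JWT segments are base64url-encoded without padding, so a plain `base64 -d` can fail on tokens whose payload length is not a multiple of four. The following is a minimal sketch of a more tolerant check in Python. The function name `decode_jwt_payload` is illustrative and not part of the gateway scripts, and the sketch only inspects the claims; it does not verify the signature.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT without verifying its signature."""
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected 3 dot-separated segments, got {len(parts)}")
    payload = parts[1]
    # Restore the padding that base64url-encoded JWT segments omit.
    payload += "=" * (-len(payload) % 4)
    # urlsafe_b64decode handles the '-' and '_' characters of base64url.
    return json.loads(base64.urlsafe_b64decode(payload))
```

A `ValueError` from this function suggests the token is not well-formed, for example because the token script printed an error message instead of a JWT.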

403 Forbidden

The request was authenticated but rejected by authorization or WAF rules.

Possible causes:

Wrong scope — The Cognito app client may not have the required https://gateway.internal/invoke scope. Contact the gateway admin to verify client configuration.
WAF block — AWS WAF may have blocked the request. Check the x-amzn-waf-action response header.
IP rate limit exceeded — WAF enforces a 2,000 requests/5-min per-IP limit. Wait and retry, or contact the admin for a higher threshold.

429 Too Many Requests

The gateway or upstream provider is rate-limiting your requests.

Possible causes:

Budget exceeded — Your team’s token budget has been exhausted for the current period. Check your budget status:
```
# Decode your token to see team/client claims
echo "$TOKEN" | cut -d. -f2 | base64 -d 2>/dev/null | python3 -m json.tool
```
Contact the gateway admin to check remaining budget or request an increase.
Provider rate limit — The upstream LLM provider (OpenAI, Anthropic, etc.) is rate-limiting requests. This is typically transient.
- Wait 10-30 seconds and retry.
- If persistent, the gateway may need higher provider-side rate limits.
WAF rate limit — AWS WAF enforces a per-IP request limit (2,000 requests per 5 minutes). If you are sending high-volume automated requests, you may hit this limit.
- Spread requests over time or contact the admin for a higher threshold.

Retry strategy: Implement exponential backoff. Most 429 responses include a Retry-After header with the number of seconds to wait.
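The retry strategy above could be sketched in Python as follows. This is an illustration under stated assumptions, not the gateway's client library: the helper names `backoff_delay` and `send_with_retry`, the 1-second base, and the 30-second cap are all hypothetical choices, and only a numeric Retry-After value is honored.

```python
import random
import time
import urllib.error
import urllib.request
from typing import Optional


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before retrying after failed attempt `attempt` (0-based)."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Retry-After can also be an HTTP date; fall back to backoff.
    delay = min(cap, base * (2 ** attempt))
    # Jitter: wait between half of the computed delay and the full delay.
    return delay / 2 + random.uniform(0, delay / 2)


def send_with_retry(request: urllib.request.Request, max_attempts: int = 5) -> bytes:
    """Send the request, retrying only on HTTP 429."""
    for attempt in range(max_attempts):
        try:
            with urllib.request.urlopen(request, timeout=60) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            time.sleep(backoff_delay(attempt, err.headers.get("Retry-After")))
    raise RuntimeError("max_attempts must be at least 1")
```

The jitter keeps many clients that were throttled at the same moment from retrying in lockstep.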

502 Bad Gateway

The gateway could not reach the upstream LLM provider.

Possible causes:

Provider outage — The upstream provider (OpenAI, Anthropic, Google, etc.) may be experiencing downtime. Check the provider’s status page.
Invalid provider API key — The gateway’s stored API key for the provider may be expired or revoked. Contact the gateway admin.
Network issue — The ECS task may not have outbound internet access. Check NAT Gateway and VPC endpoint configuration.

What to try:

Test a different provider to isolate the issue:

# Try anthropic instead of openai, or vice versa
curl -H "Authorization: Bearer $TOKEN" \
     -H "x-portkey-provider: anthropic" \
     -H "Content-Type: application/json" \
     -d '{"model":"claude-sonnet-4-20250514","max_tokens":1,"messages":[{"role":"user","content":"hi"}]}' \
     ${GATEWAY_URL}/v1/chat/completions

Run the health check with provider testing:

TOKEN="$TOKEN" ./scripts/check-health.sh --url "$GATEWAY_URL" --token "$TOKEN" --providers

503 Service Unavailable

The gateway itself is overloaded or unhealthy.

Possible causes:

All ECS tasks unhealthy — The ALB returns 503 when no healthy targets are available. Check ECS service events in the AWS console.
Gateway overloaded — Too many concurrent requests for the current capacity. The auto-scaling policy should add more tasks, but there is a ramp-up delay.
Deployment in progress — A rolling deployment may temporarily reduce capacity.

What to try:

Wait 30-60 seconds and retry. Auto-scaling should recover.

Run the basic health check:

./scripts/check-health.sh --url "$GATEWAY_URL"

If persistent, contact the gateway admin to check ECS service health and scaling configuration.

Token Refresh

Cognito JWTs expire after 1 hour. Here is how to refresh for each agent type.

Claude Code

Claude Code handles token refresh automatically. The apiKeyHelper script is re-invoked:

Proactively, based on CLAUDE_CODE_API_KEY_HELPER_TTL_MS (recommended: 3000000 ms = 50 minutes, which leaves a 10-minute margin before the 1-hour expiry)
Reactively, on any 401 response

No manual intervention is needed. If token refresh is failing, check that the helper script is executable and that your M2M credentials are still valid:

chmod +x ~/workplace/ai-gateway/scripts/get-gateway-token.sh
./scripts/get-gateway-token.sh  # Should print a JWT

Other Agents

For agents that read tokens from environment variables (OpenCode, Goose, Codex CLI, LangChain), tokens do not auto-refresh. Options:

Re-export the variable when the token expires:
```
export OPENAI_API_KEY=$(./scripts/get-gateway-token.sh)
```
Use the caching wrapper to auto-refresh (recommended). See Authentication — Shell Wrapper Pattern.
Restart the agent after exporting a fresh token, since some agents read the env var only at startup.
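For Python-based agents such as LangChain, the same idea can be applied in-process. The sketch below is not the Shell Wrapper Pattern from the Authentication page; it is a hypothetical Python analogue that caches the token and fetches a new one shortly before the `exp` claim is reached. The class name `TokenCache`, the 600-second margin, and the `fetch_from_script` helper are assumptions made for illustration.

```python
import base64
import json
import subprocess
import time
from typing import Callable, Optional


def token_expiry(token: str) -> int:
    """Read the exp claim (Unix seconds) from a JWT without verifying it."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore base64url padding
    return int(json.loads(base64.urlsafe_b64decode(payload))["exp"])


class TokenCache:
    """Cache a token and refresh it `margin_seconds` before it expires."""

    def __init__(self, fetch: Callable[[], str], margin_seconds: int = 600):
        self._fetch = fetch
        self._margin = margin_seconds
        self._token: Optional[str] = None
        self._expires_at = 0

    def get(self) -> str:
        if self._token is None or time.time() >= self._expires_at - self._margin:
            self._token = self._fetch()
            self._expires_at = token_expiry(self._token)
        return self._token


def fetch_from_script() -> str:
    """Hypothetical fetcher: run the token script and return its output."""
    result = subprocess.run(["./scripts/get-gateway-token.sh"],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

With this in place, `TokenCache(fetch_from_script).get()` would be called before each request instead of reading the environment variable once at startup.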

Checking Token Expiry

Use the health check script to see when your token expires:

TOKEN=$(./scripts/get-gateway-token.sh)
TOKEN="$TOKEN" ./scripts/check-health.sh --url "$GATEWAY_URL" --token "$TOKEN"

Or decode manually:

echo "$TOKEN" | cut -d. -f2 | base64 -d 2>/dev/null | python3 -c "
import json, sys, datetime
data = json.load(sys.stdin)
exp = data.get('exp', 0)
remaining = exp - int(datetime.datetime.now().timestamp())
print(f'Expires in {remaining // 60} minutes ({datetime.datetime.fromtimestamp(exp)})')
"

Connection Refused / Timeout

The gateway URL is unreachable.

Possible causes:

Wrong gateway URL — Verify the URL is correct:
```
curl -s -o /dev/null -w "%{http_code}\n" "${GATEWAY_URL}/"
```
Expect 200. If curl prints 000 or hangs, no HTTP response was received: the URL is wrong or the gateway is unreachable.
VPN required — If the ALB is in a private network, confirm your VPN is connected.
Unhealthy ALB target — The ALB health check path is / on port 8787. If all targets are unhealthy, the ALB returns 503. Check ECS service events in the AWS console.

Missing `x-portkey-provider` Header

{"error": "provider is not set"}

Every request must include the x-portkey-provider header. Verify per agent:

For Claude Code, check that ANTHROPIC_CUSTOM_HEADERS is set (newline-separated format):

echo "$ANTHROPIC_CUSTOM_HEADERS"
# Should output: x-portkey-provider: anthropic

For OpenCode, check that options.headers is present in opencode.json:

"options": {
  "headers": {
    "x-portkey-provider": "openai"
  }
}

For Goose, check that OPENAI_EXTRA_HEADERS is set:

echo "$OPENAI_EXTRA_HEADERS"
# Should output: x-portkey-provider: openai

For Continue.dev, check that requestOptions.headers is present in ~/.continue/config.yaml:

requestOptions:
  headers:
    x-portkey-provider: openai

For LangChain, check that default_headers is passed to ChatOpenAI:

llm = ChatOpenAI(
    default_headers={"x-portkey-provider": "openai"},
    ...
)

For Codex CLI, check that the headers section exists in ~/.codex/config.toml:

[model_providers.gateway.headers]
x-portkey-provider = "openai"
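For scripts that call the gateway directly rather than through an agent, the header has to be set on every request by hand. The sketch below assembles such a request with Python's standard library so the headers can be inspected before anything is sent. The function name `build_chat_request` is hypothetical; the path and body shape follow the curl example in the 502 section.

```python
import json
import urllib.request


def build_chat_request(gateway_url: str, token: str, provider: str,
                       model: str, prompt: str) -> urllib.request.Request:
    """Assemble a chat completion request that carries the provider header."""
    body = {
        "model": model,
        "max_tokens": 1,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=gateway_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Without this header the gateway answers "provider is not set".
            "x-portkey-provider": provider,
            "Content-Type": "application/json",
        },
    )
```

Note that `urllib` stores header names in capitalized form (`X-portkey-provider`); HTTP header names are case-insensitive, so this does not change what the gateway sees.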

Invalid Grant / Token Endpoint Error

{"error": "invalid_grant"}

The Cognito token request failed.

Possible causes:

Wrong token endpoint — Verify GATEWAY_TOKEN_ENDPOINT is the full URL:
```
https://<domain>.auth.<region>.amazoncognito.com/oauth2/token
```
Wrong credentials — Confirm GATEWAY_CLIENT_ID and GATEWAY_CLIENT_SECRET are correct.
Wrong grant type — The Cognito app client must be configured for client_credentials grant type. Contact the gateway admin if this is not set.
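To check the three causes above in isolation, it can help to build the token request by hand. The sketch below shows a standard OAuth 2.0 client_credentials request as the Cognito token endpoint accepts it: a form-encoded POST with the client ID and secret in a Basic Authorization header. The function name `build_token_request` is hypothetical, and the contents of `get-gateway-token.sh` are not shown in this guide, so the script may differ in detail.

```python
import base64
import urllib.parse
import urllib.request
from typing import Optional


def build_token_request(token_endpoint: str, client_id: str, client_secret: str,
                        scope: Optional[str] = None) -> urllib.request.Request:
    """Assemble a client_credentials token request (not sent here)."""
    form = {"grant_type": "client_credentials"}
    if scope:
        form["scope"] = scope
    # The client ID and secret travel as HTTP Basic credentials.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        url=token_endpoint,  # full URL ending in /oauth2/token
        data=urllib.parse.urlencode(form).encode(),
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

Passing the result to `urllib.request.urlopen` with the values of GATEWAY_TOKEN_ENDPOINT, GATEWAY_CLIENT_ID, and GATEWAY_CLIENT_SECRET should return a JSON body containing `access_token` when the configuration is correct.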

MCP Tool Search Disabled (Claude Code)

When connected to a non-first-party host, Claude Code disables MCP tool search by default. Symptoms: tool search returns no results, or tools from MCP servers are not discovered.

Fix:

export ENABLE_TOOL_SEARCH=true

Add this to the same shell profile block as your other gateway environment variables.

Provider Override Not Working

Requests are going to the provider's public API (for example, api.openai.com) instead of the gateway. The fix depends on the agent.

For Codex CLI, do not use the built-in openai provider name, because Codex CLI does not allow overriding it. Use a custom provider name:

[model_providers.gateway]
name = "AI Gateway"
base_url = "${GATEWAY_URL}/v1"

Launch with: codex --provider gateway --model gpt-4.1

For Goose, use OPENAI_HOST (not OPENAI_BASE_URL), and omit the /v1 suffix:

export OPENAI_HOST="${GATEWAY_URL}"

For Claude Code, ensure ANTHROPIC_BASE_URL is exported in your shell profile, not only set in the settings JSON.

Config Precedence

When environment variables and config files conflict, the following precedence applies:

| Agent | Precedence (highest first) |
| --- | --- |
| Claude Code | Env vars > `claude config` settings > defaults |
| OpenCode | `opencode.json` in project root > global config |
| Goose | Env vars > `~/.config/goose/config.yaml` |
| Continue.dev | `~/.continue/config.yaml` (only source) |
| LangChain | Constructor args > env vars |
| Codex CLI | CLI flags > `config.toml` > env vars |
