Skip to content

The AI Gateway uses Cognito Machine-to-Machine (M2M) authentication with the client_credentials OAuth2 grant. The ALB validates JWTs natively — no API Gateway or custom authorizer required.


Cognito M2M authentication is designed for service-to-service communication where no human login is involved. Each client (agent, script, or service) is issued a client ID and client secret, which it exchanges for a short-lived JWT access token.

The ALB performs JWT validation at the edge: it checks the token signature against Cognito’s JWKS endpoint, verifies iss, exp, nbf, iat, and the required scope claim. Invalid or expired tokens receive a 401 Unauthorized response directly from the ALB — the request never reaches the gateway container.


sequenceDiagram
    participant Client as AI Agent / Script
    participant Cognito as Cognito User Pool
    participant ALB as Application Load Balancer
    participant ECS as Portkey Gateway (ECS)

    Client->>Cognito: POST /oauth2/token<br/>grant_type=client_credentials<br/>client_id + client_secret
    Cognito-->>Client: 200 OK<br/>access_token (JWT, 1h TTL)

    Client->>ALB: POST /v1/messages<br/>Authorization: Bearer JWT<br/>x-portkey-provider: anthropic
    ALB->>ALB: Validate JWT signature (JWKS)<br/>Check iss, exp, scope

    alt Token valid
        ALB->>ECS: Forward request
        ECS-->>ALB: LLM response
        ALB-->>Client: 200 OK (response)
    else Token invalid or expired
        ALB-->>Client: 401 Unauthorized
    end
  1. Token request — The client sends a POST to the Cognito /oauth2/token endpoint with grant_type=client_credentials, providing the client ID and secret.
  2. Token issuance — Cognito validates the credentials and returns a signed JWT access token with a 1-hour TTL and the https://gateway.internal/invoke scope.
  3. Request with Bearer token — The client includes the JWT in the Authorization: Bearer <token> header on every request to the gateway.
  4. ALB JWT validation — The ALB validates the token signature against Cognito’s JWKS endpoint, checks standard claims (iss, exp, nbf, iat), and verifies the required scope. Invalid tokens are rejected with a 401 before reaching the backend.
  5. Forward to ECS — Valid requests pass through to the Portkey gateway container on ECS Fargate.

Use the provided script to obtain a token from the command line.

VariableDescription
GATEWAY_CLIENT_IDCognito app-client ID (issued per team or service account)
GATEWAY_CLIENT_SECRETCognito app-client secret
GATEWAY_TOKEN_ENDPOINTFull Cognito token URL, e.g. https://<domain>.auth.<region>.amazoncognito.com/oauth2/token

Set these in your shell profile (~/.zshrc or ~/.bashrc):

Terminal window
export GATEWAY_CLIENT_ID="<your-client-id>"
export GATEWAY_CLIENT_SECRET="<your-client-secret>"
export GATEWAY_TOKEN_ENDPOINT="<cognito-token-endpoint>"
Terminal window
# Fetch a token (raw JWT printed to stdout)
TOKEN=$(./scripts/get-gateway-token.sh)
# Verify the token payload
echo "$TOKEN" | cut -d. -f2 | base64 -d 2>/dev/null | python3 -m json.tool

The script exits non-zero on failure and writes diagnostics to stderr:

Exit CodeMeaning
0Success
1Missing environment variable
2Token request failed (curl error or non-200 HTTP)
3JSON parsing failed or access_token missing in response

Cognito JWTs have a default TTL of 3600 seconds (1 hour). To avoid redundant token requests, cache the token and refresh before expiry.

Claude Code handles caching automatically via apiKeyHelper:

  1. apiKeyHelper is called once at startup to fetch the token.
  2. CLAUDE_CODE_API_KEY_HELPER_TTL_MS=3000000 (50 minutes) tells Claude Code to re-invoke the helper before the token expires.
  3. On a 401 response, Claude Code immediately re-invokes the helper regardless of TTL.

No additional caching is needed.

For agents that read API keys from environment variables (Goose, OpenCode, Codex CLI), use a caching wrapper:

~/.local/bin/gateway-token-cached.sh
#!/usr/bin/env bash
#
# Caches the gateway token in a file, refreshing when older than 50 minutes.
set -euo pipefail
CACHE_FILE="${HOME}/.cache/ai-gateway/token"
MAX_AGE=3000 # seconds (50 minutes)
mkdir -p "$(dirname "$CACHE_FILE")"
# Refresh if cache is missing or stale
if [[ ! -f "$CACHE_FILE" ]] || \
[[ $(( $(date +%s) - $(stat -c %Y "$CACHE_FILE" 2>/dev/null || echo 0) )) -gt $MAX_AGE ]]; then
~/workplace/ai-gateway/scripts/get-gateway-token.sh > "$CACHE_FILE"
chmod 600 "$CACHE_FILE"
fi
cat "$CACHE_FILE"

Then in your shell profile:

Terminal window
export OPENAI_API_KEY=$(~/.local/bin/gateway-token-cached.sh)
export GATEWAY_API_KEY=$(~/.local/bin/gateway-token-cached.sh)