Authentication

The AI Gateway uses Cognito Machine-to-Machine (M2M) authentication with the client_credentials OAuth2 grant. The ALB validates JWTs natively — no API Gateway or custom authorizer required.

How It Works

Cognito M2M authentication is designed for service-to-service communication where no human login is involved. Each client (agent, script, or service) is issued a client ID and client secret, which it exchanges for a short-lived JWT access token.

The ALB performs JWT validation at the edge: it checks the token signature against the public keys published at Cognito’s JWKS endpoint, then verifies the iss, exp, nbf, and iat claims and confirms that the required scope is present. Invalid or expired tokens receive a 401 Unauthorized response directly from the ALB, so the request never reaches the gateway container.
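
As a rough client-side illustration of the claim checks described above, the sketch below decodes a token's payload and tests `exp` and `scope`. It does not verify the signature (only the ALB's JWKS check does that). The function names are made up for this example, and the `sed`-based claim extraction assumes compact single-line JSON, so a real JSON parser would be sturdier.

```shell
#!/usr/bin/env bash
# Client-side sanity check of a JWT payload. NOT a substitute for the ALB's
# signature validation; helper names here are illustrative only.

# Decode a base64url string: JWT segments use '-' and '_' and drop '=' padding.
b64url_decode() {
  local s
  s="$(printf '%s' "$1" | tr '_-' '/+')"
  # Restore the '=' padding that JWT segments omit.
  case $(( ${#s} % 4 )) in
    2) s="${s}==" ;;
    3) s="${s}=" ;;
  esac
  printf '%s' "$s" | base64 -d
}

# Succeed only if exp is in the future and the scope claim contains $2.
token_looks_usable() {
  local token required_scope payload json exp scope
  token="$1"
  required_scope="$2"
  payload="${token#*.}"       # drop the header segment
  payload="${payload%%.*}"    # drop the signature segment
  json="$(b64url_decode "$payload")" || return 1
  exp="$(printf '%s\n' "$json" | sed -n 's/.*"exp"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p')"
  scope="$(printf '%s\n' "$json" | sed -n 's/.*"scope"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')"
  [ -n "$exp" ] || return 1
  [ "$exp" -gt "$(date +%s)" ] || return 1
  # scope is a space-separated list, so match on whole entries
  case " $scope " in
    *" $required_scope "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A check like this is only useful for catching an obviously expired or mis-scoped token before sending a request; the ALB's decision remains authoritative.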

Auth Flow

sequenceDiagram
    participant Client as AI Agent / Script
    participant Cognito as Cognito User Pool
    participant ALB as Application Load Balancer
    participant ECS as Portkey Gateway (ECS)

    Client->>Cognito: POST /oauth2/token<br/>grant_type=client_credentials<br/>client_id + client_secret
    Cognito-->>Client: 200 OK<br/>access_token (JWT, 1h TTL)

    Client->>ALB: POST /v1/messages<br/>Authorization: Bearer JWT<br/>x-portkey-provider: anthropic
    ALB->>ALB: Validate JWT signature (JWKS)<br/>Check iss, exp, scope

    alt Token valid
        ALB->>ECS: Forward request
        ECS-->>ALB: LLM response
        ALB-->>Client: 200 OK (response)
    else Token invalid or expired
        ALB-->>Client: 401 Unauthorized
    end

Step-by-Step

Token request — The client sends a POST to the Cognito /oauth2/token endpoint with grant_type=client_credentials, providing the client ID and secret.
Token issuance — Cognito validates the credentials and returns a signed JWT access token with a 1-hour TTL and the https://gateway.internal/invoke scope.
Request with Bearer token — The client includes the JWT in the Authorization: Bearer <token> header on every request to the gateway.
ALB JWT validation — The ALB validates the token signature against Cognito’s JWKS endpoint, checks standard claims (iss, exp, nbf, iat), and verifies the required scope. Invalid tokens are rejected with a 401 before reaching the backend.
Forward to ECS — Valid requests pass through to the Portkey gateway container on ECS Fargate.
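
The steps above can be sketched with `curl`. This is a minimal illustration rather than the project's own tooling: the `GATEWAY_CLIENT_ID`, `GATEWAY_CLIENT_SECRET`, and `GATEWAY_TOKEN_ENDPOINT` variables are the ones described under Getting a Token, while `GATEWAY_URL`, `REQUEST_BODY`, and the function names are assumptions made for this example. It sends the client credentials via HTTP Basic auth, which the Cognito token endpoint accepts.

```shell
#!/usr/bin/env bash
# Sketch of the flow above using curl. GATEWAY_URL and REQUEST_BODY are
# placeholders assumed for this example, not values defined by the gateway.

# Steps 1-2: exchange the client ID and secret for a token response (JSON).
request_token() {
  curl --silent --show-error --fail \
    --user "${GATEWAY_CLIENT_ID}:${GATEWAY_CLIENT_SECRET}" \
    --data "grant_type=client_credentials" \
    "${GATEWAY_TOKEN_ENDPOINT}"
}

# Read a token response on stdin and print its access_token field.
# Regex-based for brevity; a JSON parser is the sturdier choice.
extract_access_token() {
  sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Steps 3-5: send a request through the ALB with the Bearer token attached.
call_gateway() {
  curl --silent --show-error \
    --header "Authorization: Bearer $1" \
    --header "x-portkey-provider: anthropic" \
    --header "Content-Type: application/json" \
    --data "$2" \
    "${GATEWAY_URL}/v1/messages"
}

# Usage (makes network calls, so shown as comments):
#   TOKEN="$(request_token | extract_access_token)"
#   call_gateway "$TOKEN" "$REQUEST_BODY"
```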

Getting a Token

Use the provided scripts/get-gateway-token.sh script to obtain a token from the command line.

Required Environment Variables

| Variable | Description |
|---|---|
| `GATEWAY_CLIENT_ID` | Cognito app-client ID (issued per team or service account) |
| `GATEWAY_CLIENT_SECRET` | Cognito app-client secret |
| `GATEWAY_TOKEN_ENDPOINT` | Full Cognito token URL, e.g. `https://<domain>.auth.<region>.amazoncognito.com/oauth2/token` |

Set these in your shell profile (~/.zshrc or ~/.bashrc):

export GATEWAY_CLIENT_ID="<your-client-id>"
export GATEWAY_CLIENT_SECRET="<your-client-secret>"
export GATEWAY_TOKEN_ENDPOINT="<cognito-token-endpoint>"

Using the Script

# Fetch a token (raw JWT printed to stdout)
TOKEN=$(./scripts/get-gateway-token.sh)

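# Verify the token payload (JWT segments are base64url without padding, so plain `base64 -d` can fail)
echo "$TOKEN" | cut -d. -f2 | python3 -c 'import sys, json, base64; p = sys.stdin.read().strip(); print(json.dumps(json.loads(base64.urlsafe_b64decode(p + "=" * (-len(p) % 4))), indent=2))'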

The script exits non-zero on failure and writes diagnostics to stderr:

| Exit Code | Meaning |
|---|---|
| `0` | Success |
| `1` | Missing environment variable |
| `2` | Token request failed (curl error or non-200 HTTP) |
| `3` | JSON parsing failed or `access_token` missing in response |
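
A calling script can branch on these exit codes. The sketch below is one possible way to do that; `describe_token_exit` and `fetch_token_or_explain` are hypothetical helper names, and the messages are paraphrases of the table, not output of the real script.

```shell
#!/usr/bin/env bash
# Map the exit codes from the table above to short messages.
describe_token_exit() {
  case "$1" in
    0) echo "success" ;;
    1) echo "missing environment variable" ;;
    2) echo "token request failed" ;;
    3) echo "unparseable token response" ;;
    *) echo "unexpected exit code" ;;
  esac
}

# Run the token script ($1, defaulting to the path used above) and either
# print the token or explain the failure on stderr and pass the code along.
fetch_token_or_explain() {
  local script token rc
  script="${1:-./scripts/get-gateway-token.sh}"
  rc=0
  token="$("$script")" || rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "token fetch failed (exit $rc): $(describe_token_exit "$rc")" >&2
    return "$rc"
  fi
  printf '%s\n' "$token"
}

# Usage: TOKEN="$(fetch_token_or_explain)" || exit $?
```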

Token Caching

Cognito JWTs have a default TTL of 3600 seconds (1 hour). To avoid redundant token requests, cache the token and refresh before expiry.

Claude Code

Claude Code handles caching automatically via apiKeyHelper:

apiKeyHelper is called once at startup to fetch the token.
CLAUDE_CODE_API_KEY_HELPER_TTL_MS=3000000 (50 minutes) tells Claude Code to re-invoke the helper before the token expires.
On a 401 response, Claude Code immediately re-invokes the helper regardless of TTL.

No additional caching is needed.
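
Scripts that call the gateway directly can imitate the same refresh-on-401 behaviour. The following is a sketch under assumptions: `GATEWAY_URL`, `REQUEST_BODY`, and all three function names are invented for this example, and it retries exactly once so that a persistent 401 (for instance, a revoked client) does not loop.

```shell
#!/usr/bin/env bash
# Refresh-on-401 sketch for scripts that call the gateway directly.
# GATEWAY_URL and REQUEST_BODY are placeholders assumed for this example.

# Print only the HTTP status code of a gateway call made with token $1.
fetch_status() {
  curl --silent --output /dev/null --write-out '%{http_code}' \
    --header "Authorization: Bearer $1" \
    --header "x-portkey-provider: anthropic" \
    --header "Content-Type: application/json" \
    --data "${REQUEST_BODY}" \
    "${GATEWAY_URL}/v1/messages"
}

# Obtain a brand-new token from the token script.
refresh_token() {
  ./scripts/get-gateway-token.sh
}

# Try once with token $1; on a 401, fetch a new token and retry exactly once.
# Prints the final HTTP status code.
call_with_retry() {
  local token status
  token="$1"
  status="$(fetch_status "$token")"
  if [ "$status" = "401" ]; then
    token="$(refresh_token)" || return 1
    status="$(fetch_status "$token")"
  fi
  printf '%s\n' "$status"
}
```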

Shell Wrapper Pattern (Other Agents)

For agents that read API keys from environment variables (Goose, OpenCode, Codex CLI), save a caching wrapper like the following as ~/.local/bin/gateway-token-cached.sh and make it executable:

#!/usr/bin/env bash
#
# Caches the gateway token in a file, refreshing when older than 50 minutes.

set -euo pipefail

CACHE_FILE="${HOME}/.cache/ai-gateway/token"
MAX_AGE=3000  # seconds (50 minutes)

mkdir -p "$(dirname "$CACHE_FILE")"

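# Modification time of the cache file: GNU stat first, then BSD/macOS stat
cache_mtime=$(stat -c %Y "$CACHE_FILE" 2>/dev/null || stat -f %m "$CACHE_FILE" 2>/dev/null || echo 0)

# Refresh if the cache is missing, empty, or stale
if [[ ! -s "$CACHE_FILE" ]] || (( $(date +%s) - cache_mtime > MAX_AGE )); then
  # Write to a temp file first so a failed fetch never leaves an empty cache behind
  tmp_file=$(mktemp "${CACHE_FILE}.XXXXXX")
  trap 'rm -f "$tmp_file"' EXIT
  ~/workplace/ai-gateway/scripts/get-gateway-token.sh > "$tmp_file"
  chmod 600 "$tmp_file"
  mv "$tmp_file" "$CACHE_FILE"
fi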

cat "$CACHE_FILE"

Then in your shell profile (the wrapper runs once per new shell, so open a new shell or re-run these exports once the token has expired):

export OPENAI_API_KEY="$(~/.local/bin/gateway-token-cached.sh)"
export GATEWAY_API_KEY="$OPENAI_API_KEY"
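
Because the exports above are evaluated when the profile is sourced, a shell that stays open longer than the token lifetime keeps a stale value. One possible workaround, sketched here, is a small function that resolves the token each time an agent is launched. `with_gateway_token` and the `TOKEN_HELPER` override are hypothetical names introduced for this example.

```shell
#!/usr/bin/env bash
# Run any command with a freshly resolved token in its environment.
# TOKEN_HELPER is a hypothetical override; it defaults to the caching wrapper.
with_gateway_token() {
  local helper token
  helper="${TOKEN_HELPER:-$HOME/.local/bin/gateway-token-cached.sh}"
  token="$("$helper")" || return 1
  # The assignments apply only to the command being launched.
  OPENAI_API_KEY="$token" GATEWAY_API_KEY="$token" "$@"
}

# Usage: with_gateway_token <agent-command> [args...]
```

Since the caching wrapper only contacts Cognito when its cache is stale, calling it on every launch adds little overhead.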

Next Steps

Agent Setup — Configure your AI agent to use the gateway
API Reference — Endpoints, headers, and request formats