Agent Setup
Configure any of the supported AI coding agents to route requests through the AI Gateway.
Overview
The gateway is the agentgateway proxy, which serves two API formats natively on a single port:
| Format | Endpoint | Used By |
|---|---|---|
| Anthropic Messages | /v1/messages | Claude Code |
| OpenAI Chat Completions | /v1/chat/completions | OpenCode, Goose, Continue.dev, LangChain, Codex CLI |
Provider and model selection is server-side. agentgateway reads its routing from a YAML config rendered by Terraform: a priority-group failover chain (Bedrock primary, Anthropic-direct fallback) plus modelAliases that map requested model IDs onto backend models. Clients do not send a provider routing header — there is no x-portkey-provider and no per-request routing override. You point your agent at the gateway URL with a valid JWT, and the gateway decides where the request goes.
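The server-side behavior described above can be pictured with a minimal sketch. This is not agentgateway's implementation: the alias target, the provider names as Python strings, the pass-through of unknown model IDs, and the BackendError type are all hypothetical stand-ins for whatever the Terraform-rendered YAML actually contains.

```python
from typing import Callable, Mapping, Sequence

# Hypothetical stand-ins for the Terraform-rendered YAML; the real alias
# targets and provider names come from your deployment's config.
MODEL_ALIASES = {"claude-sonnet-4-20250514": "example-backend-model-id"}
PROVIDER_PRIORITY = ("bedrock", "anthropic-direct")


class BackendError(Exception):
    """Raised by a provider call that should trigger failover."""


def resolve_model(requested: str, aliases: Mapping[str, str]) -> str:
    """Map a requested model ID to a backend model.

    Assumption: IDs without an alias pass through unchanged.
    """
    return aliases.get(requested, requested)


def route(
    requested: str,
    send: Callable[[str, str], object],
    aliases: Mapping[str, str] = MODEL_ALIASES,
    providers: Sequence[str] = PROVIDER_PRIORITY,
) -> object:
    """Try each provider in priority order and return the first success."""
    model = resolve_model(requested, aliases)
    last_error = None
    for provider in providers:
        try:
            return send(provider, model)
        except BackendError as exc:
            last_error = exc  # fall through to the next priority group
    raise BackendError(f"all providers failed for {model}") from last_error
```

The client never appears in this logic beyond the requested model ID, which is why no routing header is needed.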
Prerequisites
Before configuring any agent, ensure you have:
- The gateway URL (GATEWAY_URL): the ALB DNS name from terraform output alb_dns_name
- Cognito credentials: GATEWAY_CLIENT_ID, GATEWAY_CLIENT_SECRET, GATEWAY_TOKEN_ENDPOINT
- The token script: scripts/get-gateway-token.sh must be executable (chmod +x)
See Authentication for details on obtaining tokens.
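A quick pre-flight check can catch a missing prerequisite before an agent fails with an opaque auth error. The sketch below assumes all four values are exported as environment variables under the names listed above; the function names are illustrative only.

```python
import os
from typing import Mapping, Optional

# Variable names taken from the prerequisites list above.
REQUIRED_VARS = (
    "GATEWAY_URL",
    "GATEWAY_CLIENT_ID",
    "GATEWAY_CLIENT_SECRET",
    "GATEWAY_TOKEN_ENDPOINT",
)


def missing_gateway_vars(env: Optional[Mapping[str, str]] = None) -> list:
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def token_script_is_executable(path: str) -> bool:
    """True if the token script exists and has the executable bit set."""
    return os.path.isfile(path) and os.access(path, os.X_OK)
```

Calling missing_gateway_vars() with no arguments inspects the current process environment; an empty list means every variable is present.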
Agent Configurations
1. Claude Code
Claude Code speaks the native Anthropic Messages API (/v1/messages). It uses apiKeyHelper to fetch tokens and to re-fetch them automatically.
Step 1 — Set the API key helper

    claude config set --global apiKeyHelper ~/workplace/ai-gateway/scripts/get-gateway-token.sh

Step 2 — Set environment variables
Add to your shell profile (~/.zshrc or ~/.bashrc):

    # Gateway base URL -- no /v1 suffix (Claude Code appends it)
    export ANTHROPIC_BASE_URL="${GATEWAY_URL}"

    # Token TTL -- must be a real env var, NOT in settings.json env block.
    # Claude Code bug #7660: TTL in the settings.json env block is ignored.
    export CLAUDE_CODE_API_KEY_HELPER_TTL_MS=3000000

Step 3 (optional) — Re-enable MCP tool search
When Claude Code connects to a non-first-party host, it disables MCP tool search by default. To re-enable it:

    export ENABLE_TOOL_SEARCH=true

Full shell profile block

    # --- AI Gateway (Claude Code) ---
    export ANTHROPIC_BASE_URL="${GATEWAY_URL}"
    export CLAUDE_CODE_API_KEY_HELPER_TTL_MS=3000000
    export ENABLE_TOOL_SEARCH=true
    export GATEWAY_CLIENT_ID="<your-client-id>"
    export GATEWAY_CLIENT_SECRET="<your-client-secret>"
    export GATEWAY_TOKEN_ENDPOINT="<cognito-token-endpoint>"

2. OpenCode
OpenCode uses @ai-sdk/openai-compatible for custom providers.
Configuration file
Create or edit opencode.json in your project root:

    {
      "$schema": "https://opencode.ai/config.json",
      "provider": {
        "gateway": {
          "id": "gateway",
          "name": "AI Gateway",
          "type": "@ai-sdk/openai-compatible",
          "options": {
            "baseURL": "${GATEWAY_URL}/v1"
          },
          "models": {
            "gpt-4.1": {
              "id": "gpt-4.1",
              "name": "GPT-4.1 (via Gateway)",
              "type": "chat",
              "attachment": true
            }
          }
        }
      },
      "model": {
        "chat": "gateway/gpt-4.1"
      }
    }

API key
Set the API key in your environment:

    export OPENAI_API_KEY=$(~/workplace/ai-gateway/scripts/get-gateway-token.sh)

3. Goose
Goose reads provider configuration from environment variables.
Environment variables

    export GOOSE_PROVIDER=openai
    # Host only -- Goose appends /v1 internally. Do NOT add /v1 here.
    export OPENAI_HOST="${GATEWAY_URL}"
    export OPENAI_API_KEY=$(~/workplace/ai-gateway/scripts/get-gateway-token.sh)

Wrapper script for fresh tokens
Goose reads OPENAI_API_KEY once at startup. Create a wrapper at ~/bin/goose-gateway.sh to refresh the token on every launch:

    #!/usr/bin/env bash
    # Refresh token and launch Goose
    export GOOSE_PROVIDER=openai
    export OPENAI_HOST="${GATEWAY_URL}"
    export OPENAI_API_KEY=$(~/workplace/ai-gateway/scripts/get-gateway-token.sh)

    exec goose "$@"

Make the wrapper executable:

    chmod +x ~/bin/goose-gateway.sh

4. Continue.dev
Continue supports config.yaml (the config.json format is deprecated).
Configuration file
Edit ~/.continue/config.yaml:

    models:
      - name: GPT-4.1 (Gateway)
        provider: openai
        model: gpt-4.1
        apiBase: "${GATEWAY_URL}/v1"
        apiKey: "${TOKEN}"

      - name: Claude Sonnet (Gateway)
        provider: openai
        model: claude-sonnet-4-20250514
        apiBase: "${GATEWAY_URL}/v1"
        apiKey: "${TOKEN}"

Replace ${TOKEN} with the output of scripts/get-gateway-token.sh, or use the shell wrapper pattern to keep it fresh. No requestOptions.headers are needed: the gateway maps the requested model onto a backend via modelAliases and routes through its priority-group failover chain.
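One way to do that replacement without hand-editing is to render the file from a template. The sketch below uses Python's string.Template, whose ${NAME} placeholder syntax matches the one used above. It assumes the placeholders must be filled in before the file is written; the template text, function name, and example URL are illustrative only.

```python
from string import Template

# Illustrative template with one model entry; ${GATEWAY_URL} and ${TOKEN}
# are the same placeholders used in the config above.
CONFIG_TEMPLATE = Template(
    "models:\n"
    "  - name: Claude Sonnet (Gateway)\n"
    "    provider: openai\n"
    "    model: claude-sonnet-4-20250514\n"
    '    apiBase: "${GATEWAY_URL}/v1"\n'
    '    apiKey: "${TOKEN}"\n'
)


def render_config(gateway_url: str, token: str) -> str:
    """Fill in both placeholders; strip whitespace the token script may emit."""
    return CONFIG_TEMPLATE.substitute(
        GATEWAY_URL=gateway_url.rstrip("/"),
        TOKEN=token.strip(),
    )
```

Writing the rendered text to ~/.continue/config.yaml is left to the caller, since the token expires and the file would need to be re-rendered whenever it is refreshed.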
5. LangChain
Use ChatOpenAI from langchain-openai to route through the gateway:

    import os
    import subprocess

    from langchain_openai import ChatOpenAI

    # Python does not expand "${GATEWAY_URL}" inside a string, so read it from the environment.
    gateway_url = os.environ["GATEWAY_URL"]

    # Fetch a fresh token and strip any trailing newline from the script output.
    token = subprocess.run(
        ["./scripts/get-gateway-token.sh"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout.strip()

    llm = ChatOpenAI(
        base_url=f"{gateway_url}/v1",
        api_key=token,
        model="gpt-4.1",
    )

    response = llm.invoke("Hello from LangChain via the AI Gateway")
    print(response.content)

To target an Anthropic model, just change the model argument. The gateway resolves it against modelAliases and its provider failover chain, so no custom headers are needed:

    llm = ChatOpenAI(
        base_url=f"{gateway_url}/v1",
        api_key=token,
        model="claude-sonnet-4-20250514",
    )

6. Codex CLI
Codex CLI cannot override its built-in openai provider. Define a new provider name in the config.
Configuration file
Edit ~/.codex/config.toml:

    [model_providers.gateway]
    name = "AI Gateway"
    base_url = "${GATEWAY_URL}/v1"
    env_key = "GATEWAY_API_KEY"

API key and launch

    export GATEWAY_API_KEY=$(~/workplace/ai-gateway/scripts/get-gateway-token.sh)

    codex --provider gateway --model gpt-4.1

Token Caching
Cognito JWTs have a default TTL of 3600 seconds (1 hour). To avoid unnecessary token requests, cache the token and refresh it before it expires.
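To decide when a cached token is due for refresh, you can read its expiry directly instead of tracking file ages. The sketch below assumes the token is a standard three-part JWT whose payload carries an exp claim; it decodes the payload without verifying the signature, so it is only suitable for scheduling a refresh, not for authentication decisions. The function names and the 600-second margin are illustrative.

```python
import base64
import json
import time
from typing import Optional


def jwt_claims(token: str) -> dict:
    """Decode the (unverified) payload segment of a JWT."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def seconds_until_expiry(token: str, now: Optional[float] = None) -> float:
    """Seconds remaining before the token's exp claim."""
    now = time.time() if now is None else now
    return jwt_claims(token)["exp"] - now


def needs_refresh(token: str, margin: float = 600, now: Optional[float] = None) -> bool:
    """True once the token is within `margin` seconds of expiring."""
    return seconds_until_expiry(token, now) <= margin
```

With a 3600-second TTL and a 600-second margin, this refreshes at the 50-minute mark, the same point the TTL settings in this section target.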
Claude Code
Claude Code handles caching automatically:
- apiKeyHelper is called once at startup to fetch the token.
- CLAUDE_CODE_API_KEY_HELPER_TTL_MS=3000000 (50 minutes) tells Claude Code to proactively re-invoke the helper before the token expires.
- On a 401 response, Claude Code immediately re-invokes the helper regardless of TTL.
No additional caching is needed.
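The three behaviors above can be modeled in a few lines, which is useful if you want the same semantics in your own client. This is a sketch of the described behavior, not Claude Code's actual code: the class name, method names, and injectable clock are hypothetical, and the first fetch happens lazily on first use rather than at startup.

```python
import time
from typing import Callable, Optional


class HelperTokenCache:
    """Caches a helper's output, re-invoking it on TTL expiry or on a 401."""

    def __init__(
        self,
        helper: Callable[[], str],
        ttl_ms: int = 3_000_000,  # 50 minutes, as in the setting above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._helper = helper
        self._ttl_s = ttl_ms / 1000
        self._clock = clock
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def token(self) -> str:
        """Return the cached token, refreshing it once the TTL has elapsed."""
        if self._token is None or self._clock() - self._fetched_at >= self._ttl_s:
            self._refresh()
        return self._token

    def on_unauthorized(self) -> str:
        """Call after a 401: refresh immediately, regardless of TTL."""
        self._refresh()
        return self._token

    def _refresh(self) -> None:
        self._token = self._helper().strip()
        self._fetched_at = self._clock()
```

Keeping the TTL below the token lifetime (50 minutes against 60) means the 401 path should only fire when a token is revoked or the clock assumptions are wrong.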
Other Agents — Shell Wrapper Pattern
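For agents that read API keys from environment variables (Goose, OpenCode, Codex CLI), use a caching wrapper to avoid calling Cognito on every command:

    #!/usr/bin/env bash
    #
    # Caches the gateway token in a file, refreshing when older than 50 minutes.

    set -euo pipefail

    CACHE_FILE="${HOME}/.cache/ai-gateway/token"
    MAX_AGE=3000  # seconds (50 minutes)

    mkdir -p "$(dirname "$CACHE_FILE")"

    # File mtime in epoch seconds: GNU stat first, then BSD/macOS stat, else 0 (forces a refresh).
    mtime=$(stat -c %Y "$CACHE_FILE" 2>/dev/null || stat -f %m "$CACHE_FILE" 2>/dev/null || echo 0)

    # Refresh if the cache is missing or stale
    if [[ ! -f "$CACHE_FILE" ]] || (( $(date +%s) - mtime > MAX_AGE )); then
      # Write to a temp file first so a failed fetch never leaves an empty cache behind.
      tmp=$(mktemp "${CACHE_FILE}.XXXXXX")
      ~/workplace/ai-gateway/scripts/get-gateway-token.sh > "$tmp"
      chmod 600 "$tmp"
      mv "$tmp" "$CACHE_FILE"
    fi

    cat "$CACHE_FILE"

Save the script as ~/.local/bin/gateway-token-cached.sh and make it executable. Then in your shell profile:

    export OPENAI_API_KEY=$(~/.local/bin/gateway-token-cached.sh)
    export GATEWAY_API_KEY=$(~/.local/bin/gateway-token-cached.sh)

These assignments are evaluated when the profile is sourced, so open a new shell (or re-source the profile) to pick up a refreshed token.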
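Python callers such as the LangChain example can apply the same file-based caching without shelling out to the wrapper. The following is a minimal sketch under the same assumptions as the shell version (a 3000-second maximum age and a private cache file); the function name and the fetch callable are hypothetical, and fetch stands in for whatever runs scripts/get-gateway-token.sh.

```python
import os
import time
from pathlib import Path
from typing import Callable, Optional


def cached_token(
    cache_file: Path,
    fetch: Callable[[], str],
    max_age: float = 3000,  # seconds (50 minutes), as in the shell wrapper
    now: Optional[float] = None,
) -> str:
    """Return the cached token, calling `fetch` only if the cache is missing or stale."""
    now = time.time() if now is None else now
    if cache_file.exists() and now - cache_file.stat().st_mtime <= max_age:
        return cache_file.read_text().strip()

    token = fetch().strip()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with owner-only permissions before any secret is written.
    fd = os.open(cache_file, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(token)
    return token
```

Because fetch runs before the file is opened, a failed token request raises without truncating a previously cached token.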