AI Gateway

Lightweight LLM inference gateway on AWS — route any AI agent to any model provider through a single endpoint.

Overview

AI Gateway deploys agentgateway — a Rust LLM/MCP proxy on a distroless base — on ECS Fargate behind an Application Load Balancer, giving your AI coding agents a unified entry point to multiple model providers. It speaks both the OpenAI Chat Completions and Anthropic Messages API formats natively on a single port, so every major agent works without translation layers or custom adapters. Provider and model selection is server-side: agentgateway routes through a priority-group failover chain defined in its config.

Authentication is handled by Cognito M2M (client_credentials grant) with ALB-native JWT validation — no API Gateway required, no per-request cost added.

Key Features

Feature	Description
Dual API format	Serves `/v1/chat/completions` (OpenAI) and `/v1/messages` (Anthropic) natively on one port
Multi-provider routing	Routes to Bedrock, OpenAI, Anthropic, Google, and Azure OpenAI via server-side priority-group failover
Cognito M2M auth	Machine-to-machine JWT authentication with ALB-native validation
Zero per-request cost	ALB JWT validation eliminates the need for API Gateway
Auto-scaling	ECS Fargate scales on CPU utilization and ALB request count
Inline content safety	Bedrock Guardrails called in-path (ApplyGuardrail), detect/log-only by default
Observability	OpenTelemetry sidecar with CloudWatch logs, X-Ray traces, and operational dashboards

Compatible Agents

Agent	API Format	Endpoint
Claude Code	Anthropic Messages	`/v1/messages`
OpenCode	OpenAI Chat Completions	`/v1/chat/completions`
Goose	OpenAI Chat Completions	`/v1/chat/completions`
Continue.dev	OpenAI Chat Completions	`/v1/chat/completions`
LangChain	OpenAI Chat Completions	`/v1/chat/completions`
Codex CLI	OpenAI Chat Completions	`/v1/chat/completions`