Lightweight LLM inference gateway on AWS — route any AI agent to any model provider through a single endpoint.
AI Gateway deploys Portkey AI Gateway OSS (v1.15.2) on ECS Fargate behind an Application Load Balancer, giving your AI coding agents a unified entry point to multiple model providers. It speaks both the OpenAI Chat Completions and Anthropic Messages API formats natively, so every major agent works without translation layers or custom adapters.
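The two formats differ mainly in request shape. As a minimal sketch (the model names are illustrative, not a statement of what your deployment has enabled), the same prompt looks like this in each format:

```python
import json

# Hypothetical gateway URL -- substitute your ALB's DNS name.
GATEWAY = "https://gateway.example.com"

# OpenAI Chat Completions format: POST {GATEWAY}/v1/chat/completions
openai_body = {
    "model": "gpt-4o",  # illustrative model id
    "messages": [{"role": "user", "content": "Hello"}],
}

# Anthropic Messages format: POST {GATEWAY}/v1/messages
# (max_tokens is required in this format, unlike Chat Completions)
anthropic_body = {
    "model": "claude-sonnet-4-20250514",  # illustrative model id
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello"}],
}

print(json.dumps(openai_body))
print(json.dumps(anthropic_body))
```

Because the gateway serves both paths natively, an Anthropic-format agent and an OpenAI-format agent can share one deployment without any translation layer in between.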
Authentication is handled by Cognito M2M (client_credentials grant) with ALB-native JWT validation — no API Gateway required, no per-request cost added.
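The client_credentials grant is a single form-encoded POST to the Cognito token endpoint, authenticated with the app client's id and secret. A minimal sketch of building that request (the client id, secret, and domain are placeholders; the request itself is not sent here):

```python
import base64
from urllib.parse import urlencode

# Hypothetical Cognito app-client credentials -- substitute your own.
CLIENT_ID = "example-client-id"
CLIENT_SECRET = "example-client-secret"
TOKEN_URL = "https://auth.example.com/oauth2/token"  # your Cognito domain + /oauth2/token

# client_credentials uses HTTP Basic auth (client_id:client_secret, base64)
# with a form-encoded body naming the grant type.
basic = base64.b64encode(f"{CLIENT_ID}:{CLIENT_SECRET}".encode()).decode()
headers = {
    "Authorization": f"Basic {basic}",
    "Content-Type": "application/x-www-form-urlencoded",
}
body = urlencode({"grant_type": "client_credentials"})

# POSTing this to TOKEN_URL returns an access_token; the ALB then
# validates that JWT on every gateway request, with no API Gateway hop.
print(body)
```

Cache the returned access_token until it expires rather than requesting a fresh one per call; Cognito tokens from this grant are typically valid for an hour.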
Features

| Feature | Description |
|---|---|
| Dual API format | Serves /v1/chat/completions (OpenAI) and /v1/messages (Anthropic) natively |
| Multi-provider routing | Routes to Bedrock, OpenAI, Anthropic, Google, and Azure OpenAI via a single header |
| Cognito M2M auth | Machine-to-machine JWT authentication with ALB-native validation |
| Zero per-request cost | ALB JWT validation eliminates the need for API Gateway |
| Auto-scaling | ECS Fargate scales on CPU utilization and ALB request count |
| Observability | OpenTelemetry sidecar with CloudWatch logs, X-Ray traces, and operational dashboards |
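Provider selection happens per request via a routing header. The header name below follows the upstream Portkey gateway convention (x-portkey-provider) -- verify it against your deployed gateway version before relying on it:

```python
def request_headers(token: str, provider: str) -> dict:
    """Headers for a gateway call routed to the given provider.

    token    -- Cognito access token (validated by the ALB as a Bearer JWT)
    provider -- target provider slug, e.g. "bedrock", "openai", "anthropic"
    """
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "x-portkey-provider": provider,
    }

# Same endpoint, different backends -- only the header changes.
print(request_headers("TOKEN", "bedrock"))
print(request_headers("TOKEN", "openai"))
```

This is what makes the gateway a single entry point: agents keep one base URL and swap providers without reconfiguration.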
Supported Agents

| Agent | API Format | Endpoint |
|---|---|---|
| Claude Code | Anthropic Messages | /v1/messages |
| OpenCode | OpenAI Chat Completions | /v1/chat/completions |
| Goose | OpenAI Chat Completions | /v1/chat/completions |
| Continue.dev | OpenAI Chat Completions | /v1/chat/completions |
| LangChain | OpenAI Chat Completions | /v1/chat/completions |
| Codex CLI | OpenAI Chat Completions | /v1/chat/completions |
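Pointing an agent at the gateway usually comes down to two environment variables: a base URL and a token. The sketch below shows the commonly used names (ANTHROPIC_BASE_URL for Anthropic-format agents such as Claude Code; OPENAI_BASE_URL and OPENAI_API_KEY for OpenAI-format agents) -- exact variable names vary by agent, so check each tool's documentation:

```python
import os

# Hypothetical values -- substitute your ALB's DNS name and a real
# Cognito access token from the client_credentials flow.
GATEWAY = "https://gateway.example.com"
TOKEN = "example-access-token"

env = {
    # Anthropic-format agents append /v1/messages themselves.
    "ANTHROPIC_BASE_URL": GATEWAY,
    # OpenAI SDKs expect the base URL to include the /v1 prefix.
    "OPENAI_BASE_URL": f"{GATEWAY}/v1",
    "OPENAI_API_KEY": TOKEN,
}
os.environ.update(env)
print(os.environ["OPENAI_BASE_URL"])
```

With those variables exported, every agent in the table above talks to the same deployment through its native endpoint.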
Getting Started
Clone, install, deploy, and make your first request in under 5 minutes.
User Guide
Configure your AI agent, learn the API, and troubleshoot common issues.
Admin Guide
Deploy, manage environments, configure security, and monitor the gateway.
Developer Guide
Contribute to the project, understand the architecture, and run CI locally.