Deployment
Prerequisites
Before deploying the AI Gateway, ensure the following are in place:
| Prerequisite | Description |
|---|---|
| AWS Account | An AWS account with permissions to create VPC, ECS, ALB, Cognito, KMS, Secrets Manager, and CloudWatch resources |
| Terraform >= 1.9 | Required version specified in versions.tf |
| AWS Provider ~> 6.22 | Required for ALB JWT validation support |
| S3 Bucket | One per environment for Terraform state (e.g., ai-gateway-tfstate-dev) |
| DynamoDB Table | One per environment for state locking (e.g., ai-gateway-tfstate-lock-dev) |
| ACM Certificate | TLS certificate for the ALB HTTPS listener (optional for dev, required for prod) |
| AWS CLI | Configured with appropriate credentials |
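The Terraform version requirement in the table can be verified before anything is deployed. The following is a minimal sketch, not part of the repository: `version_ge` and `check_terraform` are hypothetical helper names, and the parsing assumes that `terraform version` prints a first line of the form `Terraform v1.9.5`.

```bash
# Hypothetical preflight helpers; not part of the repository.

# Succeeds when dotted numeric version $1 >= $2. Fields are compared as
# integers, so 1.10 correctly ranks above 1.9. Pre-release suffixes such
# as "-rc1" are not handled.
version_ge() {
  local a="$1" b="$2" x y
  while [ -n "$a" ] || [ -n "$b" ]; do
    x=${a%%.*}; y=${b%%.*}
    [ "${x:-0}" -gt "${y:-0}" ] && return 0
    [ "${x:-0}" -lt "${y:-0}" ] && return 1
    case $a in *.*) a=${a#*.} ;; *) a= ;; esac
    case $b in *.*) b=${b#*.} ;; *) b= ;; esac
  done
  return 0
}

# Compares the installed Terraform version against a minimum such as 1.9.
check_terraform() {
  local required="$1" line
  command -v terraform >/dev/null 2>&1 || { echo "terraform not found" >&2; return 1; }
  line=$(terraform version | head -n 1)   # assumed form: "Terraform v1.9.5"
  version_ge "${line#Terraform v}" "$required"
}
```

Calling `check_terraform 1.9 || exit 1` at the top of a deploy script would stop early on an outdated binary.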
First-Time Setup Checklist
1. Create the S3 state bucket for your target environment:

       aws s3api create-bucket \
         --bucket ai-gateway-tfstate-dev \
         --region us-east-1

       aws s3api put-bucket-versioning \
         --bucket ai-gateway-tfstate-dev \
         --versioning-configuration Status=Enabled

       aws s3api put-bucket-encryption \
         --bucket ai-gateway-tfstate-dev \
         --server-side-encryption-configuration \
         '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"aws:kms"}}]}'

2. Create the DynamoDB lock table:

       aws dynamodb create-table \
         --table-name ai-gateway-tfstate-lock-dev \
         --attribute-definitions AttributeName=LockID,AttributeType=S \
         --key-schema AttributeName=LockID,KeyType=HASH \
         --billing-mode PAY_PER_REQUEST \
         --region us-east-1

3. Request or import an ACM certificate for your domain (if using HTTPS):

       aws acm request-certificate \
         --domain-name gateway.example.com \
         --validation-method DNS \
         --region us-east-1

4. Set provider API keys in Secrets Manager after the first apply (the secrets are created with placeholder values):

       aws secretsmanager put-secret-value \
         --secret-id ai-gateway/openai-api-key \
         --secret-string "sk-your-actual-key"

       aws secretsmanager put-secret-value \
         --secret-id ai-gateway/anthropic-api-key \
         --secret-string "sk-ant-your-actual-key"

       aws secretsmanager put-secret-value \
         --secret-id ai-gateway/google-api-key \
         --secret-string "your-google-api-key"

       aws secretsmanager put-secret-value \
         --secret-id ai-gateway/azure-api-key \
         --secret-string "your-azure-api-key"
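Typing real keys directly on the command line leaves them in shell history. One possible alternative is to prompt for each value instead. This is a sketch only: `secret_ids` and `set_provider_keys` are hypothetical names, the secret IDs are the four used in the checklist, and the prompt relies on bash's `read -s`.

```bash
# Hypothetical helpers; not part of the repository.

# Prints the four secret IDs used in the setup checklist, one per line.
secret_ids() {
  local provider
  for provider in openai anthropic google azure; do
    echo "ai-gateway/${provider}-api-key"
  done
}

# Prompts for each key without echoing it, then stores it in Secrets Manager.
# Requires bash (read -s) and an AWS CLI configured with suitable credentials.
set_provider_keys() {
  local id key
  for id in $(secret_ids); do
    printf 'Value for %s: ' "$id" >&2
    read -r -s key
    echo >&2
    aws secretsmanager put-secret-value --secret-id "$id" --secret-string "$key"
  done
}
```

Run `set_provider_keys` once after the first apply. The value is still passed to the AWS CLI as an argument, so this only keeps it out of history, not out of the process list.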
Module Structure
The infrastructure is organized into four local modules under infrastructure/modules/:

    infrastructure/
      main.tf              # Root module: wires modules together
      variables.tf         # Root-level input variables
      outputs.tf           # Root-level outputs
      versions.tf          # Terraform and provider version constraints
      providers.tf         # AWS provider configuration
      otel-config.yaml     # OpenTelemetry Collector configuration
      moved.tf             # State migration blocks (safe to remove after first apply)
      environments/
        dev.tfvars         # Dev environment variable overrides
        prod.tfvars        # Prod environment variable overrides
      modules/
        observability/     # KMS, log groups, dashboard, saved queries
        networking/        # VPC, ALB, WAF, VPC endpoints
        auth/              # Cognito user pool, resource server, JWT listener
        compute/           # ECS cluster/service, ECR, IAM, Secrets Manager

Module Dependency Order
Modules must be applied in this order due to inter-module references:
1. observability: Creates KMS keys and log groups needed by all other modules.
2. networking: Creates VPC, ALB, and WAF. Receives the logs KMS key ARN from observability.
3. auth: Creates Cognito resources and the JWT listener. Receives ALB ARN and target group from networking.
4. compute: Creates ECS resources, ECR, IAM roles, and secrets. Receives subnets and ALB details from networking, and log group names from observability.
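The ordering above is a topological sort of the module references. As an illustration only, the sketch below encodes just the references stated explicitly in the list and lets the standard `tsort` utility derive a valid order. `module_edges` and `apply_order` are hypothetical names, not something the repository provides.

```bash
# Hypothetical illustration; not part of the repository.

# Each line reads "provider consumer": the first module produces outputs
# that the second module receives, as described in the dependency list.
module_edges() {
  cat <<'EOF'
observability networking
networking auth
networking compute
observability compute
EOF
}

# Prints the modules in an order that satisfies every reference.
apply_order() {
  module_edges | tsort
}
```

`apply_order` always lists observability first and networking second. auth and compute do not reference each other, so either may follow.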
Terraform Deployment
Backend Configuration
The backend is configured as an empty s3 block in versions.tf. You provide the actual bucket, key, region, and lock table at init time via -backend-config flags or a backend config file.
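For the backend config file option, the settings can be written to a file once per environment instead of being repeated as flags. The following sketch generates such a file. `backend_config` is a hypothetical helper, the file name in the usage note is an assumption, and the bucket and table names follow the examples on this page.

```bash
# Hypothetical helper; not part of the repository.

# Prints a partial S3 backend configuration for the given environment.
# $1 = environment name (dev, prod); $2 = region (defaults to us-east-1).
backend_config() {
  local env="$1" region="${2:-us-east-1}"
  cat <<EOF
bucket         = "ai-gateway-tfstate-${env}"
key            = "terraform.tfstate"
region         = "${region}"
encrypt        = true
dynamodb_table = "ai-gateway-tfstate-lock-${env}"
EOF
}
```

For example, `backend_config dev > dev.s3.tfbackend` followed by `terraform init -backend-config=dev.s3.tfbackend` should be equivalent to passing the five settings as individual flags.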
Deploy with var-file (Direct Terraform)
    cd infrastructure/

    # Initialize with backend configuration
    terraform init \
      -backend-config="bucket=ai-gateway-tfstate-dev" \
      -backend-config="key=terraform.tfstate" \
      -backend-config="region=us-east-1" \
      -backend-config="encrypt=true" \
      -backend-config="dynamodb_table=ai-gateway-tfstate-lock-dev"

    # Preview changes
    terraform plan -var-file=environments/dev.tfvars

    # Apply
    terraform apply -var-file=environments/dev.tfvars

For production, initialize against the prod backend. If the working directory was already initialized for dev, add -reconfigure to terraform init so that Terraform switches backends instead of stopping on the changed backend configuration:

    terraform init \
      -backend-config="bucket=ai-gateway-tfstate-prod" \
      -backend-config="key=terraform.tfstate" \
      -backend-config="region=us-east-1" \
      -backend-config="encrypt=true" \
      -backend-config="dynamodb_table=ai-gateway-tfstate-lock-prod"

    terraform plan -var-file=environments/prod.tfvars
    terraform apply -var-file=environments/prod.tfvars

Deploy with Terragrunt (Recommended)
Terragrunt wraps Terraform to manage multiple environments with DRY configuration. See Environments for the full Terragrunt directory layout.

    # Deploy dev
    cd terragrunt/dev/
    terragrunt plan
    terragrunt apply

    # Deploy prod (from the same starting directory as the dev commands)
    cd terragrunt/prod/
    terragrunt plan
    terragrunt apply

Terragrunt automatically:
- Configures the S3 backend with environment-specific bucket and lock table names
- Generates the provider block with the correct region and tags
- Merges common inputs (from _env/common.hcl) with environment-specific inputs
Updating the Gateway Image Version
The Portkey gateway image version is controlled by the portkey_image variable. To update:

1. Update the variable in the appropriate tfvars or Terragrunt inputs:

       portkey_image = "portkeyai/gateway:1.16.0"

   Or for Terragrunt, update _env/common.hcl:

       locals {
         project_name  = "ai-gateway"
         portkey_image = "portkeyai/gateway:1.16.0"
       }

2. Apply the change:

       terraform plan -var-file=environments/prod.tfvars
       terraform apply -var-file=environments/prod.tfvars

3. ECS performs a rolling deployment automatically (see Rolling Deployments below).
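If image bumps are scripted rather than edited by hand, the tfvars change in step 1 can be automated. The sketch below is one possible approach under stated assumptions: `set_portkey_image` is a hypothetical helper, and it assumes the variable is assigned on a single line that begins with `portkey_image`.

```bash
# Hypothetical helper; not part of the repository.

# Rewrites the portkey_image assignment in a tfvars file.
# $1 = path to the tfvars file; $2 = full image reference.
set_portkey_image() {
  local file="$1" image="$2" tmp
  # Refuse to continue if the variable is not already set in the file.
  grep -q '^portkey_image[[:space:]]*=' "$file" || {
    echo "portkey_image not set in $file" >&2
    return 1
  }
  tmp=$(mktemp) || return 1
  # Write to a temp file first, then copy back to keep the file's permissions.
  sed "s|^portkey_image[[:space:]]*=.*|portkey_image = \"${image}\"|" "$file" >"$tmp" \
    && cat "$tmp" >"$file"
  rm -f "$tmp"
}
```

For example, `set_portkey_image environments/prod.tfvars portkeyai/gateway:1.16.0` would update the prod overrides, after which the plan and apply commands from step 2 roll the change out.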
Rolling Deployments
The ECS service is configured for zero-downtime rolling deployments:
| Setting | Value | Effect |
|---|---|---|
| deployment_minimum_healthy_percent | 100 | All existing tasks stay running during deployment |
| deployment_maximum_percent | 200 | New tasks are launched alongside existing ones |
| Circuit breaker | Enabled with rollback | Automatically rolls back if new tasks fail health checks |
When you update the task definition (via terraform apply or aws ecs update-service --force-new-deployment), ECS:
1. Launches new tasks with the updated definition
2. Waits for new tasks to pass ALB health checks (HTTP 200 on port 8787, path /)
3. Drains connections from old tasks (30-second deregistration delay)
4. Stops old tasks
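The two percentages translate into concrete task counts during a deployment. ECS rounds the minimum healthy count up and the maximum count down. The sketch below works through that arithmetic. `deployment_bounds` is a hypothetical helper, and the desired count in the example is an assumption, since this page does not state it.

```bash
# Hypothetical illustration; not part of the repository.

# Prints "<min_running> <max_running>" for a rolling deployment.
# $1 = desired task count, $2 = minimum healthy percent, $3 = maximum percent.
deployment_bounds() {
  local desired="$1" min_pct="$2" max_pct="$3"
  local min_tasks=$(( (desired * min_pct + 99) / 100 ))  # rounded up
  local max_tasks=$(( desired * max_pct / 100 ))         # rounded down
  echo "$min_tasks $max_tasks"
}
```

With an assumed desired count of 3, `deployment_bounds 3 100 200` prints `3 6`: all three existing tasks keep serving while up to three replacements start beside them, which is what makes the rollout zero-downtime.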
Manual Force Deployment
To trigger a redeployment without changing the task definition:

    aws ecs update-service \
      --cluster ai-gateway-dev \
      --service ai-gateway-gateway \
      --force-new-deployment

    # Wait for stability (timeout: 10 minutes)
    aws ecs wait services-stable \
      --cluster ai-gateway-dev \
      --services ai-gateway-gateway

Destroying Infrastructure
To tear down an environment:

    # Direct Terraform
    terraform destroy -var-file=environments/dev.tfvars

    # Terragrunt
    cd terragrunt/dev/
    terragrunt destroy