Environments
The AI Gateway supports two environments out of the box: dev and prod. Each environment gets its own Terraform state, its own set of AWS resources, and its own configuration tuned for its purpose.
Dev vs Prod Comparison
| Setting | Dev | Prod |
|---|---|---|
| `environment` | dev | prod |
| `aws_region` | us-east-1 | us-east-1 |
| `gateway_cpu` | 512 (0.5 vCPU) | 1024 (1 vCPU) |
| `gateway_memory` | 1024 MiB | 2048 MiB |
| `gateway_desired_count` | 2 | 2 |
| `autoscaling_min_capacity` | 1 | 2 |
| `autoscaling_max_capacity` | 3 | 6 |
| `enable_waf` | false | true |
| `enable_jwt_auth` | false | false (enable when ready) |
| `certificate_arn` | "" (HTTP only) | "" (set to ACM cert ARN) |
| `cognito_domain_prefix` | ai-gateway-dev | ai-gateway-prod |
| Terraform state bucket | ai-gateway-tfstate-dev | ai-gateway-tfstate-prod |
| DynamoDB lock table | ai-gateway-tfstate-lock-dev | ai-gateway-tfstate-lock-prod |
tfvars Files
Environment-specific variables are stored in `infrastructure/environments/`:
dev.tfvars

```hcl
environment              = "dev"
aws_region               = "us-east-1"
gateway_desired_count    = 2
gateway_cpu              = 512
gateway_memory           = 1024
autoscaling_min_capacity = 1
autoscaling_max_capacity = 3
enable_waf               = false
certificate_arn          = ""
cognito_domain_prefix    = "ai-gateway-dev"
enable_jwt_auth          = false
```

prod.tfvars

```hcl
environment              = "prod"
aws_region               = "us-east-1"
gateway_desired_count    = 2
gateway_cpu              = 1024
gateway_memory           = 2048
autoscaling_min_capacity = 2
autoscaling_max_capacity = 6
enable_waf               = true
certificate_arn          = "" # Set to your ACM cert ARN
cognito_domain_prefix    = "ai-gateway-prod"
enable_jwt_auth          = false
```

Terragrunt Directory Structure
Terragrunt provides a cleaner multi-environment workflow. The directory layout:

```
terragrunt/
  terragrunt.hcl        # Root config: remote state, provider generation
  _env/
    common.hcl          # Shared inputs (project_name, portkey_image)
  dev/
    env.hcl             # Dev-specific locals (environment, region)
    terragrunt.hcl      # Dev inputs (CPU, memory, WAF, scaling)
  prod/
    env.hcl             # Prod-specific locals (environment, region)
    terragrunt.hcl      # Prod inputs (CPU, memory, WAF, scaling)
```

How It Works
The root `terragrunt.hcl` configures:
- Remote state: S3 bucket named `ai-gateway-tfstate-{environment}` with DynamoDB lock table `ai-gateway-tfstate-lock-{environment}`
- Provider generation: Injects the AWS provider block with the correct region and default tags (tagged `ManagedBy = "terragrunt"`)
- Terraform source: Points to `infrastructure/` at the repo root

Each environment directory contains:
- `env.hcl`: defines `environment` and `aws_region` as locals
- `terragrunt.hcl`: includes the root config, reads `common.hcl` and `env.hcl`, then merges all inputs
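
To make the three responsibilities of the root config concrete, here is a minimal sketch of what such a file could look like. It is not the repository's actual `terragrunt/terragrunt.hcl`: the generated file names (`backend.tf`, `provider.tf`), the state key, and the way `env.hcl` is located are assumptions for illustration.

```hcl
# Illustrative root terragrunt.hcl (assumed layout, not the project's real file).

locals {
  # get_terragrunt_dir() resolves to the environment directory being run
  # (e.g. terragrunt/dev), so this reads that environment's env.hcl.
  env         = read_terragrunt_config("${get_terragrunt_dir()}/env.hcl")
  environment = local.env.locals.environment
  aws_region  = local.env.locals.aws_region
}

# Remote state: one S3 bucket and one DynamoDB lock table per environment.
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf" # assumed file name
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "ai-gateway-tfstate-${local.environment}"
    key            = "terraform.tfstate" # assumed key
    region         = local.aws_region
    encrypt        = true
    dynamodb_table = "ai-gateway-tfstate-lock-${local.environment}"
  }
}

# Provider generation: writes an AWS provider block with region and default tags.
generate "provider" {
  path      = "provider.tf" # assumed file name
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "${local.aws_region}"

  default_tags {
    tags = {
      ManagedBy = "terragrunt"
    }
  }
}
EOF
}

# Terraform source: the infrastructure/ directory at the repo root.
terraform {
  source = "${get_repo_root()}/infrastructure"
}
```

Because the bucket and table names are interpolated from `environment`, a new environment directory only needs its own `env.hcl` to get isolated state.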
Deploying with Terragrunt
```bash
# Dev (run from the repo root)
cd terragrunt/dev/
terragrunt init
terragrunt plan
terragrunt apply

# Prod (run from the repo root)
cd terragrunt/prod/
terragrunt init
terragrunt plan
terragrunt apply
```

Creating Additional Environments
To add a new environment (e.g., staging):
Option A: tfvars
1. Create `infrastructure/environments/staging.tfvars`:

   ```hcl
   environment              = "staging"
   aws_region               = "us-east-1"
   gateway_desired_count    = 2
   gateway_cpu              = 1024
   gateway_memory           = 2048
   autoscaling_min_capacity = 1
   autoscaling_max_capacity = 4
   enable_waf               = true
   certificate_arn          = "arn:aws:acm:us-east-1:123456789012:certificate/abc-123"
   cognito_domain_prefix    = "ai-gateway-staging"
   enable_jwt_auth          = false
   ```

2. Update the `environment` variable validation in `infrastructure/variables.tf` to allow `"staging"`:

   ```hcl
   validation {
     condition     = contains(["dev", "staging", "prod"], var.environment)
     error_message = "Environment must be 'dev', 'staging', or 'prod'."
   }
   ```

3. Create the state backend resources:

   ```bash
   aws s3api create-bucket --bucket ai-gateway-tfstate-staging --region us-east-1
   aws s3api put-bucket-versioning --bucket ai-gateway-tfstate-staging \
     --versioning-configuration Status=Enabled
   aws dynamodb create-table --table-name ai-gateway-tfstate-lock-staging \
     --attribute-definitions AttributeName=LockID,AttributeType=S \
     --key-schema AttributeName=LockID,KeyType=HASH \
     --billing-mode PAY_PER_REQUEST --region us-east-1
   ```

4. Deploy:

   ```bash
   cd infrastructure/
   terraform init \
     -backend-config="bucket=ai-gateway-tfstate-staging" \
     -backend-config="key=terraform.tfstate" \
     -backend-config="region=us-east-1" \
     -backend-config="encrypt=true" \
     -backend-config="dynamodb_table=ai-gateway-tfstate-lock-staging"
   terraform apply -var-file=environments/staging.tfvars
   ```
Option B: Terragrunt
1. Create `terragrunt/staging/env.hcl`:

   ```hcl
   locals {
     environment = "staging"
     aws_region  = "us-east-1"
   }
   ```

2. Create `terragrunt/staging/terragrunt.hcl`:

   ```hcl
   include "root" {
     path = find_in_parent_folders()
   }

   locals {
     common = read_terragrunt_config(find_in_parent_folders("common.hcl", "_env/common.hcl"))
     env    = read_terragrunt_config("env.hcl")
   }

   inputs = merge(
     local.common.locals,
     local.env.locals,
     {
       gateway_desired_count    = 2
       gateway_cpu              = 1024
       gateway_memory           = 2048
       autoscaling_min_capacity = 1
       autoscaling_max_capacity = 4
       enable_waf               = true
       certificate_arn          = "arn:aws:acm:us-east-1:123456789012:certificate/abc-123"
       cognito_domain_prefix    = "ai-gateway-staging"
       enable_jwt_auth          = false
     }
   )
   ```

3. Update the `environment` variable validation (same as Option A, step 2).

4. Deploy:

   ```bash
   cd terragrunt/staging/
   terragrunt apply
   ```

   Terragrunt automatically creates the S3 bucket and DynamoDB table if they do not exist.
Common Customizations
VPC CIDR
The default VPC CIDR is `10.0.0.0/16`. To avoid conflicts with existing VPCs, override it:

```hcl
vpc_cidr = "10.1.0.0/16"
```

Subnet allocation is derived automatically from the CIDR:
| Subnet Type | CIDR Derivation | Example (10.0.0.0/16) |
|---|---|---|
| Public Subnet AZ-a | cidrsubnet(vpc_cidr, 8, 1) | 10.0.1.0/24 |
| Public Subnet AZ-b | cidrsubnet(vpc_cidr, 8, 2) | 10.0.2.0/24 |
| Private Subnet AZ-a | cidrsubnet(vpc_cidr, 8, 10) | 10.0.10.0/24 |
| Private Subnet AZ-b | cidrsubnet(vpc_cidr, 8, 20) | 10.0.20.0/24 |
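
As a worked example of the derivation, the sketch below applies the same `cidrsubnet` calls to the overridden CIDR `10.1.0.0/16`. The local names are hypothetical and chosen for illustration; the module's real identifiers may differ.

```hcl
locals {
  vpc_cidr = "10.1.0.0/16" # example override value

  # cidrsubnet(prefix, newbits, netnum): adding 8 bits to a /16 yields /24
  # blocks, and netnum selects which /24 block inside the VPC range.
  public_subnet_a  = cidrsubnet(local.vpc_cidr, 8, 1)  # "10.1.1.0/24"
  public_subnet_b  = cidrsubnet(local.vpc_cidr, 8, 2)  # "10.1.2.0/24"
  private_subnet_a = cidrsubnet(local.vpc_cidr, 8, 10) # "10.1.10.0/24"
  private_subnet_b = cidrsubnet(local.vpc_cidr, 8, 20) # "10.1.20.0/24"
}
```

Individual expressions can be checked interactively with `terraform console` before applying a new CIDR.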
Instance Sizing
CPU and memory are allocated at the ECS task level, then split between the gateway container and the ADOT sidecar:
| Variable | Task Total | Gateway | ADOT Sidecar |
|---|---|---|---|
| `gateway_cpu = 512` | 512 units | 256 units | 256 units |
| `gateway_cpu = 1024` | 1024 units | 768 units | 256 units |
| `gateway_memory = 1024` | 1024 MiB | 768 MiB | 256 MiB |
| `gateway_memory = 2048` | 2048 MiB | 1792 MiB | 256 MiB |
The ADOT sidecar always receives 256 CPU units and 256 MiB memory. The remainder goes to the Portkey gateway container.
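
The split amounts to a fixed subtraction, which could be expressed roughly as follows. This is a sketch of the arithmetic only: `gateway_cpu` and `gateway_memory` are the documented variables, while the local names are hypothetical and the compute module may implement the split differently.

```hcl
variable "gateway_cpu" {
  type    = number
  default = 512
}

variable "gateway_memory" {
  type    = number
  default = 1024
}

locals {
  # Fixed reservation for the ADOT sidecar, per the table above.
  adot_cpu    = 256
  adot_memory = 256

  # Whatever remains of the task-level allocation goes to the gateway container.
  # With the defaults: 512 - 256 = 256 CPU units, 1024 - 256 = 768 MiB.
  gateway_container_cpu    = var.gateway_cpu - local.adot_cpu
  gateway_container_memory = var.gateway_memory - local.adot_memory
}
```

One consequence of the fixed reservation is that task-level values at or below 256 leave nothing for the gateway container, so both variables need to stay above that floor.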
Autoscaling Thresholds
Two autoscaling policies are configured:
| Policy | Metric | Target | Scale-out Cooldown | Scale-in Cooldown |
|---|---|---|---|---|
| CPU | ECSServiceAverageCPUUtilization | 70% | 60s | 300s |
| Requests | ALBRequestCountPerTarget | 500 requests/target | 60s | 300s |
To adjust these, modify the autoscaling_policies block in modules/compute/main.tf.
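
For orientation, a target-tracking policy with the CPU values from the table looks roughly like this when written directly against the AWS provider. The resource names and the `resource_id` are hypothetical, and the module's `autoscaling_policies` block may be structured differently, so treat this as a guide to which settings correspond to the table columns rather than a drop-in replacement.

```hcl
resource "aws_appautoscaling_target" "gateway" {
  service_namespace  = "ecs"
  scalable_dimension = "ecs:service:DesiredCount"
  resource_id        = "service/example-cluster/example-service" # hypothetical
  min_capacity       = 1 # autoscaling_min_capacity (dev)
  max_capacity       = 3 # autoscaling_max_capacity (dev)
}

resource "aws_appautoscaling_policy" "cpu" {
  name               = "gateway-cpu-target-tracking" # hypothetical
  policy_type        = "TargetTrackingScaling"
  service_namespace  = aws_appautoscaling_target.gateway.service_namespace
  scalable_dimension = aws_appautoscaling_target.gateway.scalable_dimension
  resource_id        = aws_appautoscaling_target.gateway.resource_id

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }

    target_value       = 70  # Target column, in percent
    scale_out_cooldown = 60  # seconds
    scale_in_cooldown  = 300 # seconds
  }
}
```

A request-based policy follows the same shape with `predefined_metric_type = "ALBRequestCountPerTarget"`, which additionally requires a `resource_label` identifying the load balancer and target group.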
Region
Both dev and prod default to `us-east-1`. To deploy in a different region, update `aws_region` in your tfvars file or Terragrunt `env.hcl`. ACM certificates are regional, so if you set `certificate_arn`, make sure the certificate exists in the target region.
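
As an illustration, a tfvars override for a deployment in `eu-west-1` might look like the fragment below. The region choice, account ID, and certificate ID are placeholders, not values from this project.

```hcl
# Fragment of a hypothetical environments/<env>.tfvars
aws_region = "eu-west-1"

# Placeholder ARN: the certificate must be issued in the same region as aws_region.
certificate_arn = "arn:aws:acm:eu-west-1:123456789012:certificate/abc-123"
```

With Terragrunt, the equivalent change is setting `aws_region = "eu-west-1"` in the environment's `env.hcl` locals.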