
The AI Gateway supports two environments out of the box: dev and prod. Each environment gets its own Terraform state, its own set of AWS resources, and its own configuration tuned for its purpose.

| Setting | Dev | Prod |
| --- | --- | --- |
| `environment` | `dev` | `prod` |
| `aws_region` | `us-east-1` | `us-east-1` |
| `gateway_cpu` | 512 (0.5 vCPU) | 1024 (1 vCPU) |
| `gateway_memory` | 1024 MiB | 2048 MiB |
| `gateway_desired_count` | 2 | 2 |
| `autoscaling_min_capacity` | 1 | 2 |
| `autoscaling_max_capacity` | 3 | 6 |
| `enable_waf` | false | true |
| `enable_jwt_auth` | false | false (enable when ready) |
| `certificate_arn` | `""` (HTTP only) | `""` (set to ACM cert ARN) |
| `cognito_domain_prefix` | `ai-gateway-dev` | `ai-gateway-prod` |
| Terraform state bucket | `ai-gateway-tfstate-dev` | `ai-gateway-tfstate-prod` |
| DynamoDB lock table | `ai-gateway-tfstate-lock-dev` | `ai-gateway-tfstate-lock-prod` |

Environment-specific variables are stored in `infrastructure/environments/` as `dev.tfvars` and `prod.tfvars`:

```hcl
# infrastructure/environments/dev.tfvars
environment              = "dev"
aws_region               = "us-east-1"
gateway_desired_count    = 2
gateway_cpu              = 512
gateway_memory           = 1024
autoscaling_min_capacity = 1
autoscaling_max_capacity = 3
enable_waf               = false
certificate_arn          = ""
cognito_domain_prefix    = "ai-gateway-dev"
enable_jwt_auth          = false
```

```hcl
# infrastructure/environments/prod.tfvars
environment              = "prod"
aws_region               = "us-east-1"
gateway_desired_count    = 2
gateway_cpu              = 1024
gateway_memory           = 2048
autoscaling_min_capacity = 2
autoscaling_max_capacity = 6
enable_waf               = true
certificate_arn          = "" # Set to your ACM cert ARN
cognito_domain_prefix    = "ai-gateway-prod"
enable_jwt_auth          = false
```

Terragrunt provides a cleaner multi-environment workflow. The directory layout:

```
terragrunt/
├── terragrunt.hcl       # Root config: remote state, provider generation
├── _env/
│   └── common.hcl       # Shared inputs (project_name, portkey_image)
├── dev/
│   ├── env.hcl          # Dev-specific locals (environment, region)
│   └── terragrunt.hcl   # Dev inputs (CPU, memory, WAF, scaling)
└── prod/
    ├── env.hcl          # Prod-specific locals (environment, region)
    └── terragrunt.hcl   # Prod inputs (CPU, memory, WAF, scaling)
```

The root terragrunt.hcl configures:

  • Remote state: S3 bucket named ai-gateway-tfstate-{environment} with DynamoDB lock table ai-gateway-tfstate-lock-{environment}
  • Provider generation: Injects the AWS provider block with the correct region and default tags (tagged ManagedBy = "terragrunt")
  • Terraform source: Points to infrastructure/ at the repo root

Each environment directory contains:

  • env.hcl — defines environment and aws_region as locals
  • terragrunt.hcl — includes the root config, reads common.hcl and env.hcl, then merges all inputs
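Given the conventions above, the root `terragrunt.hcl` might look roughly like the sketch below. This is illustrative, not the repo's actual file: how `local.environment` and `local.aws_region` are wired in from the child `env.hcl` files is omitted.

```hcl
# terragrunt/terragrunt.hcl (illustrative sketch)
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    # local.environment / local.aws_region come from the child env.hcl (wiring omitted)
    bucket         = "ai-gateway-tfstate-${local.environment}"
    key            = "terraform.tfstate"
    region         = local.aws_region
    encrypt        = true
    dynamodb_table = "ai-gateway-tfstate-lock-${local.environment}"
  }
}

generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "${local.aws_region}"
  default_tags {
    tags = { ManagedBy = "terragrunt" }
  }
}
EOF
}

terraform {
  source = "${get_parent_terragrunt_dir()}/../infrastructure"
}
```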
```sh
# Dev
cd terragrunt/dev/
terragrunt init
terragrunt plan
terragrunt apply

# Prod
cd ../prod/
terragrunt init
terragrunt plan
terragrunt apply
```

To add a new environment (e.g., staging), use either Option A (plain Terraform with tfvars) or Option B (Terragrunt).

Option A (plain Terraform):

  1. Create `infrastructure/environments/staging.tfvars`:

     ```hcl
     environment              = "staging"
     aws_region               = "us-east-1"
     gateway_desired_count    = 2
     gateway_cpu              = 1024
     gateway_memory           = 2048
     autoscaling_min_capacity = 1
     autoscaling_max_capacity = 4
     enable_waf               = true
     certificate_arn          = "arn:aws:acm:us-east-1:123456789012:certificate/abc-123"
     cognito_domain_prefix    = "ai-gateway-staging"
     enable_jwt_auth          = false
     ```
  2. Update the `environment` variable validation in `infrastructure/variables.tf` to allow `"staging"`:

     ```hcl
     validation {
       condition     = contains(["dev", "staging", "prod"], var.environment)
       error_message = "Environment must be 'dev', 'staging', or 'prod'."
     }
     ```
  3. Create the state backend resources:

     ```sh
     aws s3api create-bucket --bucket ai-gateway-tfstate-staging --region us-east-1
     aws s3api put-bucket-versioning --bucket ai-gateway-tfstate-staging \
       --versioning-configuration Status=Enabled
     aws dynamodb create-table --table-name ai-gateway-tfstate-lock-staging \
       --attribute-definitions AttributeName=LockID,AttributeType=S \
       --key-schema AttributeName=LockID,KeyType=HASH \
       --billing-mode PAY_PER_REQUEST --region us-east-1
     ```
  4. Deploy:

     ```sh
     cd infrastructure/
     terraform init \
       -backend-config="bucket=ai-gateway-tfstate-staging" \
       -backend-config="key=terraform.tfstate" \
       -backend-config="region=us-east-1" \
       -backend-config="encrypt=true" \
       -backend-config="dynamodb_table=ai-gateway-tfstate-lock-staging"
     terraform apply -var-file=environments/staging.tfvars
     ```
Option B (Terragrunt):

  1. Create `terragrunt/staging/env.hcl`:

     ```hcl
     locals {
       environment = "staging"
       aws_region  = "us-east-1"
     }
     ```
  2. Create `terragrunt/staging/terragrunt.hcl`:

     ```hcl
     include "root" {
       path = find_in_parent_folders()
     }

     locals {
       common = read_terragrunt_config(find_in_parent_folders("common.hcl", "_env/common.hcl"))
       env    = read_terragrunt_config("env.hcl")
     }

     inputs = merge(
       local.common.locals,
       local.env.locals,
       {
         gateway_desired_count    = 2
         gateway_cpu              = 1024
         gateway_memory           = 2048
         autoscaling_min_capacity = 1
         autoscaling_max_capacity = 4
         enable_waf               = true
         certificate_arn          = "arn:aws:acm:us-east-1:123456789012:certificate/abc-123"
         cognito_domain_prefix    = "ai-gateway-staging"
         enable_jwt_auth          = false
       }
     )
     ```
  3. Update the environment variable validation (same as Option A, step 2).

  4. Deploy:

     ```sh
     cd terragrunt/staging/
     terragrunt apply
     ```

    Terragrunt automatically creates the S3 bucket and DynamoDB table if they do not exist.

The default VPC CIDR is 10.0.0.0/16. To avoid conflicts with existing VPCs, override it:

```hcl
vpc_cidr = "10.1.0.0/16"
```

Subnet allocation is derived automatically from the CIDR:

| Subnet Type | CIDR Derivation | Example (10.0.0.0/16) |
| --- | --- | --- |
| Public Subnet AZ-a | `cidrsubnet(vpc_cidr, 8, 1)` | 10.0.1.0/24 |
| Public Subnet AZ-b | `cidrsubnet(vpc_cidr, 8, 2)` | 10.0.2.0/24 |
| Private Subnet AZ-a | `cidrsubnet(vpc_cidr, 8, 10)` | 10.0.10.0/24 |
| Private Subnet AZ-b | `cidrsubnet(vpc_cidr, 8, 20)` | 10.0.20.0/24 |
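In HCL terms, the derivation above amounts to the following sketch (the actual resource wiring lives in the networking module; the local names here are illustrative). `cidrsubnet(prefix, newbits, netnum)` adds `newbits` to the prefix length, so a /16 yields /24 subnets, with `netnum` selecting the third octet:

```hcl
locals {
  vpc_cidr = "10.0.0.0/16"

  public_subnet_a  = cidrsubnet(local.vpc_cidr, 8, 1)  # 10.0.1.0/24
  public_subnet_b  = cidrsubnet(local.vpc_cidr, 8, 2)  # 10.0.2.0/24
  private_subnet_a = cidrsubnet(local.vpc_cidr, 8, 10) # 10.0.10.0/24
  private_subnet_b = cidrsubnet(local.vpc_cidr, 8, 20) # 10.0.20.0/24
}
```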

CPU and memory are allocated at the ECS task level, then split between the gateway container and the ADOT sidecar:

| Variable | Task Total | Gateway | ADOT Sidecar |
| --- | --- | --- | --- |
| `gateway_cpu = 512` | 512 units | 256 units | 256 units |
| `gateway_cpu = 1024` | 1024 units | 768 units | 256 units |
| `gateway_memory = 1024` | 1024 MiB | 768 MiB | 256 MiB |
| `gateway_memory = 2048` | 2048 MiB | 1792 MiB | 256 MiB |

The ADOT sidecar always receives 256 CPU units and 256 MiB memory. The remainder goes to the Portkey gateway container.
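The split can be expressed as a simple subtraction; a sketch with illustrative local names (the real definitions live in the compute module):

```hcl
locals {
  adot_cpu    = 256 # fixed sidecar allocation, CPU units
  adot_memory = 256 # fixed sidecar allocation, MiB

  # Gateway container gets whatever the task total leaves over
  portkey_cpu    = var.gateway_cpu - local.adot_cpu       # e.g. 1024 - 256 = 768
  portkey_memory = var.gateway_memory - local.adot_memory # e.g. 2048 - 256 = 1792
}
```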

Two autoscaling policies are configured:

| Policy | Metric | Target | Scale-out Cooldown | Scale-in Cooldown |
| --- | --- | --- | --- | --- |
| CPU | ECSServiceAverageCPUUtilization | 70% | 60s | 300s |
| Requests | ALBRequestCountPerTarget | 500 requests/target | 60s | 300s |

To adjust these, modify the autoscaling_policies block in modules/compute/main.tf.
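For reference, the CPU policy in the table corresponds to a standard `aws_appautoscaling_policy` target-tracking resource along these lines (a sketch with illustrative resource names; the actual definitions are in `modules/compute/main.tf`):

```hcl
resource "aws_appautoscaling_policy" "cpu" {
  name               = "cpu-target-tracking"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.ecs.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs.service_namespace

  target_tracking_scaling_policy_configuration {
    target_value       = 70  # target average CPU utilization (%)
    scale_out_cooldown = 60  # seconds
    scale_in_cooldown  = 300 # seconds

    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
  }
}
```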

Both dev and prod default to us-east-1. To deploy in a different region, update the aws_region in your tfvars or Terragrunt env.hcl. Ensure an ACM certificate is available in the target region.