
The AI Gateway implements defense in depth across network, application, container, and CI/CD layers. This page covers each security control and how to configure it.

WAFv2 is attached to the ALB and is controlled by the enable_waf variable (true in prod, false in dev). When enabled, four rules are evaluated in priority order:

| Priority | Rule Name | Type | Action | Description |
|---|---|---|---|---|
| 1 | AWSManagedRulesCommonRuleSet | AWS Managed | Override (none) | OWASP Top 10 protections: SQL injection, XSS, path traversal, etc. |
| 2 | AWSManagedRulesAmazonIpReputationList | AWS Managed | Override (none) | Blocks requests from IP addresses with poor reputation (botnets, scanners) |
| 3 | RateLimitPerIP | Custom | Block | Rate-limits to 2,000 requests per 5-minute window per source IP |
| 4 | AWSManagedRulesKnownBadInputsRuleSet | AWS Managed | Override (none) | Blocks request patterns known to be associated with exploitation (Log4j, etc.) |

The default action is Allow: only requests that match a blocking rule are rejected.

All rules emit CloudWatch metrics (e.g., ai-gateway-common-rules, ai-gateway-rate-limit) with sampled request logging enabled.
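
To make the RateLimitPerIP behaviour concrete, here is a minimal sketch of a per-IP sliding-window counter using the 2,000 requests per 5 minutes figure from the rule table. It is only an illustration: AWS WAF's real counting logic is internal to the service, and the `PerIpRateLimiter` class and its method names are hypothetical.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 300  # 5-minute window from the RateLimitPerIP rule
LIMIT = 2000          # requests allowed per source IP inside the window


class PerIpRateLimiter:
    """Toy sliding-window counter (hypothetical; not how WAF is implemented)."""

    def __init__(self, limit=LIMIT, window=WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # source IP -> timestamps of recent requests

    def allow(self, ip, now):
        """Record a request at time `now` (seconds) and return True if it passes."""
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the trailing window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        hits.append(now)
        # The default action is Allow; block only once the limit is exceeded.
        return len(hits) <= self.limit
```

Each source IP is tracked independently, so one noisy client does not affect the others.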

```hcl
# In your tfvars or Terragrunt inputs:
enable_waf = true
```

No other configuration is needed. The WAF ACL, logging, and ALB association are all created automatically when enable_waf = true.

When enable_jwt_auth = true and a certificate_arn is provided, the ALB replaces its standard HTTPS forward listener with a two-action listener that validates JWTs before forwarding traffic.

  1. The client obtains an access token from the Cognito token endpoint using the client_credentials grant.
  2. The client sends requests with the token in the Authorization: Bearer <token> header.
  3. The ALB validates the JWT:
    • Verifies the signature against the Cognito JWKS endpoint
    • Checks the issuer claim matches the Cognito User Pool
    • Confirms the scope claim contains https://gateway.internal/invoke
  4. If validation passes, the request is forwarded to the ECS target group.
  5. If validation fails, the ALB returns HTTP 401 automatically.
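
The issuer and scope checks in step 3 can be sketched as follows. This is a simplified illustration, not the ALB's implementation: it deliberately skips signature verification against the JWKS endpoint (which needs an RSA library), and the function names `decode_payload` and `claims_ok` are hypothetical.

```python
import base64
import json


def decode_payload(token):
    """Decode the JWT payload segment WITHOUT verifying the signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def claims_ok(claims, expected_issuer, required_scope):
    """Mirror the issuer and scope checks described for the ALB listener."""
    if claims.get("iss") != expected_issuer:
        return False
    # Cognito access tokens carry scopes as one space-separated string.
    return required_scope in claims.get("scope", "").split()
```

A client can use a check like this to debug a 401 locally, for example by confirming that its token actually carries `https://gateway.internal/invoke` before sending it.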
```mermaid
sequenceDiagram
    participant Client
    participant Cognito as Cognito Token Endpoint
    participant ALB as ALB (Port 443)
    participant ECS as ECS Fargate

    Client->>Cognito: POST /oauth2/token<br/>grant_type=client_credentials
    Cognito-->>Client: access_token (JWT, 1h TTL)

    Client->>ALB: POST /v1/chat/completions<br/>Authorization: Bearer {token}
    ALB->>ALB: Validate JWT signature (JWKS)<br/>Check issuer, scope claim

    alt Valid Token
        ALB->>ECS: Forward request
        ECS-->>ALB: Response
        ALB-->>Client: 200 OK
    else Invalid Token
        ALB-->>Client: 401 Unauthorized
    end
```

```hcl
# Both are required:
certificate_arn = "arn:aws:acm:us-east-1:123456789012:certificate/abc-123"
enable_jwt_auth = true
```

Use the provided helper script:

```bash
export GATEWAY_CLIENT_ID="your-client-id"
export GATEWAY_CLIENT_SECRET="your-client-secret"
export GATEWAY_TOKEN_ENDPOINT="https://ai-gateway-prod.auth.us-east-1.amazoncognito.com/oauth2/token"

# Fetch an access token, then call the gateway with it.
token=$(./scripts/get-gateway-token.sh)
curl -H "Authorization: Bearer $token" \
  -H "Content-Type: application/json" \
  https://gateway.example.com/v1/chat/completions \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}'
```
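
For clients that cannot shell out to the helper script, the token request can be built directly. The sketch below assumes the standard Cognito client_credentials request shape (HTTP Basic authentication with the client ID and secret, and a form-encoded body). The contents of `get-gateway-token.sh` are not shown on this page, so this is not necessarily what the script does, and `build_token_request` is a hypothetical name.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_endpoint, client_id, client_secret, scope):
    """Build (but do not send) a client_credentials token request."""
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": scope}
    ).encode()
    return urllib.request.Request(
        token_endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Sending it needs network access and real credentials, for example:
#   with urllib.request.urlopen(request) as response:
#       token = json.load(response)["access_token"]   # requires `import json`
```

Keeping request construction separate from sending makes the function easy to unit test without touching the network.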

The auth module creates a full Cognito M2M (machine-to-machine) authentication stack:

| Resource | Purpose |
|---|---|
| User Pool | Identity store with deletion_protection = "ACTIVE" and admin-only user creation |
| Resource Server | Defines the https://gateway.internal identifier with two scopes: invoke and admin |
| User Pool Client | M2M client using client_credentials grant with 1-hour access token validity |
| User Pool Domain | Provides the /oauth2/token endpoint (e.g., ai-gateway-prod.auth.us-east-1.amazoncognito.com) |

| Scope | Identifier | Purpose |
|---|---|---|
| invoke | https://gateway.internal/invoke | Required to call gateway API endpoints. Validated by the ALB JWT listener. |
| admin | https://gateway.internal/admin | Reserved for administrative operations. Not enforced by default. |

The default M2M client receives both scopes. For multi-client setups, use the B.1 Multi-Client Onboarding feature to issue per-team credentials with fine-grained scope assignments.

Four provider API keys are stored in AWS Secrets Manager, encrypted with a dedicated KMS key:

| Secret Path | Environment Variable | Provider |
|---|---|---|
| ai-gateway/openai-api-key | OPENAI_API_KEY | OpenAI |
| ai-gateway/anthropic-api-key | ANTHROPIC_API_KEY | Anthropic |
| ai-gateway/google-api-key | GOOGLE_API_KEY | Google AI |
| ai-gateway/azure-api-key | AZURE_API_KEY | Azure OpenAI |

ECS tasks retrieve these secrets at launch via the task execution role, which has secretsmanager:GetSecretValue permission scoped to arn:aws:secretsmanager:*:*:secret:ai-gateway/*.
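
To see which secrets the execution role's resource pattern covers, the wildcard can be approximated with a case-sensitive glob. This is a rough sketch only: IAM performs the real evaluation, `fnmatchcase` additionally treats `[` as special (IAM does not), and the example ARNs in the usage note, including their account ID and random suffixes, are made up.

```python
from fnmatch import fnmatchcase

# Resource pattern from the task execution role policy described above.
ALLOWED_PATTERN = "arn:aws:secretsmanager:*:*:secret:ai-gateway/*"


def is_readable(secret_arn):
    """Approximate IAM wildcard matching with a case-sensitive glob."""
    return fnmatchcase(secret_arn, ALLOWED_PATTERN)
```

For example, `is_readable("arn:aws:secretsmanager:us-east-1:123456789012:secret:ai-gateway/openai-api-key-AbCdEf")` is true, while a secret stored under a different prefix such as `other-app/` is not matched.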

Three dedicated KMS keys are used, all with automatic annual key rotation enabled:

| KMS Key Alias | Purpose |
|---|---|
| alias/ai-gateway-logs | Encrypts CloudWatch log groups |
| alias/ai-gateway-ecr | Encrypts ECR container images |
| alias/ai-gateway-secrets | Encrypts Secrets Manager values |

  • ECS tasks run in private subnets with no direct internet ingress.
  • A single NAT Gateway in a public subnet provides outbound internet access (for reaching LLM provider APIs).
  • The ALB is deployed in public subnets and is the only internet-facing component.
  • ALB security group allows inbound on ports 80 and 443 only; egress is restricted to the VPC CIDR.
  • ECS security group allows inbound on port 8787 only from the ALB security group; egress is unrestricted (for provider API calls via NAT).

Five VPC endpoints keep AWS service traffic off the NAT Gateway:

| Endpoint | Type | Service |
|---|---|---|
| S3 | Gateway | com.amazonaws.{region}.s3 |
| ECR API | Interface | com.amazonaws.{region}.ecr.api |
| ECR DKR | Interface | com.amazonaws.{region}.ecr.dkr |
| CloudWatch Logs | Interface | com.amazonaws.{region}.logs |
| Secrets Manager | Interface | com.amazonaws.{region}.secretsmanager |

Interface endpoints are secured by a dedicated security group that allows HTTPS (port 443) from private subnet CIDRs only.
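
The endpoint list and the security group rule can be modelled in a few lines. This is an illustrative sketch, not the Terraform module's logic: the function names are hypothetical, and the private subnet CIDRs must be supplied by the caller because this page does not state them.

```python
import ipaddress

# Service-name suffixes and endpoint types from the endpoint table above.
ENDPOINT_TYPES = {
    "s3": "Gateway",
    "ecr.api": "Interface",
    "ecr.dkr": "Interface",
    "logs": "Interface",
    "secretsmanager": "Interface",
}


def service_names(region):
    """Expand the {region} placeholder into full endpoint service names."""
    return {
        f"com.amazonaws.{region}.{suffix}": kind
        for suffix, kind in ENDPOINT_TYPES.items()
    }


def endpoint_ingress_allowed(source_ip, port, private_cidrs):
    """Model the endpoint security group: HTTPS from private subnets only."""
    if port != 443:
        return False
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in private_cidrs)
```

With hypothetical private subnets `10.0.10.0/24` and `10.0.11.0/24`, a task at `10.0.10.5` reaches the endpoints on port 443, while a host in any other subnet, or any traffic on another port, is rejected.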

| Control | Implementation |
|---|---|
| ECR scan-on-push | Every image pushed to ECR is automatically scanned for vulnerabilities |
| Immutable tags | ECR image_tag_mutability = "IMMUTABLE" prevents tag overwriting |
| KMS encryption | ECR images are encrypted at rest with a dedicated KMS key |
| Lifecycle policy | Only the last 10 images are retained; older images are automatically expired |
| Cosign signing | The CI pipeline signs images with Sigstore cosign (keyless) after push |
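
The effect of the lifecycle policy can be sketched as a simple selection over push times. ECR evaluates the real policy server-side; the `(tag, pushed_at)` tuple shape and the `images_to_expire` name are assumptions made for illustration.

```python
def images_to_expire(images, keep=10):
    """Return the images beyond the `keep` most recently pushed.

    `images` is a list of (tag, pushed_at) tuples; this shape is hypothetical.
    """
    newest_first = sorted(images, key=lambda image: image[1], reverse=True)
    return newest_first[keep:]
```

Because tags are immutable, an expired tag is never silently reused for different image contents; a rollback target must still be among the 10 most recent images.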

The CI/CD pipeline runs 12 security tools across three pipeline phases and a continuous track. All findings are uploaded as SARIF to the GitHub Security tab.

| Tool | Phase | Target | Purpose |
|---|---|---|---|
| Semgrep | Pre-build | Python source | SAST: OWASP Top 10, security audit, Python-specific rules |
| Gitleaks | Pre-build | Git history | Secret detection in code and commit history |
| Checkov | Pre-build | Terraform | IaC misconfiguration scanning (CIS, SOC2 benchmarks) |
| Hadolint | Pre-build | Dockerfiles | Dockerfile best-practice linting |
| TFLint | Pre-build | Terraform | Terraform linting and provider-specific checks |
| Trivy | Pre-build | Container image | Vulnerability scanning (CRITICAL, HIGH severity) |
| Syft | Pre-build | Container image | SBOM generation (CycloneDX format, retained 90 days) |
| CodeQL | Post-build | Python source | Semantic code analysis (security + quality queries) |
| Scorecard | Post-build | Repository | OpenSSF supply-chain security assessment |
| Dependency Review | Post-build | PR diffs | Vulnerability and license check on dependency changes |
| Dependabot | Continuous | All ecosystems | Automated dependency updates (Python, Terraform, GitHub Actions) |
| Cosign | Post-push | ECR image | Keyless image signing with Sigstore |

```mermaid
flowchart LR
    subgraph prebuild["Phase 1: Pre-Build"]
        direction TB
        Q["Code Quality\nRuff, Pyright"]
        SAST["SAST\nSemgrep, Gitleaks"]
        IAC["IaC Security\nCheckov, TFLint,\nTerraform validate"]
        CONT["Container Security\nHadolint, Trivy, Syft"]
    end

    subgraph build["Phase 2: Build and Push"]
        direction TB
        ECR["Pull, Tag, Push\nto ECR"]
        SIGN["Cosign Sign\nKeyless"]
    end

    subgraph postbuild["Phase 3: Post-Build"]
        direction TB
        DEPLOY["Deploy to ECS\nRolling Update"]
        CQL["CodeQL Analysis\nWeekly Schedule"]
        SC["OpenSSF Scorecard\nWeekly Schedule"]
    end

    subgraph continuous["Continuous"]
        direction TB
        DEP["Dependabot\nWeekly PRs"]
        DR["Dependency Review\nPR Gate"]
    end

    Q & SAST & IAC & CONT --> build
    ECR --> SIGN
    SIGN --> DEPLOY
    CQL ~~~ SC
    DEP ~~~ DR
```