Security
The AI Gateway implements defense in depth across network, application, container, and CI/CD layers. This page covers each security control and how to configure it.
WAFv2 Rules
WAFv2 is attached to the ALB and is controlled by the enable_waf variable (true in prod, false in dev). When enabled, four rules are evaluated in priority order:
| Priority | Rule Name | Type | Action | Description |
|---|---|---|---|---|
| 1 | AWSManagedRulesCommonRuleSet | AWS Managed | Override (none) | OWASP Top 10 protections: SQL injection, XSS, path traversal, etc. |
| 2 | AWSManagedRulesAmazonIpReputationList | AWS Managed | Override (none) | Blocks requests from IP addresses with poor reputation (botnets, scanners) |
| 3 | RateLimitPerIP | Custom | Block | Rate-limits to 2,000 requests per 5-minute window per source IP |
| 4 | AWSManagedRulesKnownBadInputsRuleSet | AWS Managed | Override (none) | Blocks request patterns known to be associated with exploitation (Log4j, etc.) |
The default action is Allow — only requests matching a blocking rule are rejected.
All rules emit CloudWatch metrics (e.g., ai-gateway-common-rules, ai-gateway-rate-limit) with sampled request logging enabled.
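The evaluation order and the rate-limit rule can be illustrated with a small model. This is a minimal sketch, not how AWS WAF tracks requests internally: the class `RateLimitPerIP`, the function `evaluate`, and the sliding-window bookkeeping are assumptions made for illustration, and the managed rule groups would be represented by stand-in callables.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 5 * 60  # 5-minute window from the rule table
LIMIT = 2000             # requests per window per source IP


class RateLimitPerIP:
    """Toy sliding-window counter (illustrative; not AWS WAF's own algorithm)."""

    def __init__(self, limit=LIMIT, window=WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # source IP -> timestamps of recent requests

    def action(self, source_ip, now):
        hits = self._hits[source_ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        hits.append(now)
        return "BLOCK" if len(hits) > self.limit else None


def evaluate(rules, source_ip, now):
    """Walk (priority, name, rule) tuples in ascending priority order.

    The first rule that returns "BLOCK" ends evaluation; if none does,
    the default action (Allow) applies.
    """
    for _priority, name, rule in sorted(rules, key=lambda r: r[0]):
        if rule(source_ip, now) == "BLOCK":
            return ("BLOCK", name)
    return ("ALLOW", "default")
```

With `rules = [(3, "RateLimitPerIP", RateLimitPerIP().action)]`, a source IP is allowed until it exceeds 2,000 requests inside one window, and is allowed again once its older requests age out.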
Enabling WAF
    # In your tfvars or Terragrunt inputs:
    enable_waf = true

No other configuration is needed. The WAF ACL, logging, and ALB association are all created automatically when enable_waf = true.
ALB JWT Validation
When enable_jwt_auth = true and a certificate_arn is provided, the ALB replaces its standard HTTPS forward listener with a two-action listener that validates JWTs before forwarding traffic.
How It Works
1. The client obtains an access token from the Cognito token endpoint using the client_credentials grant.
2. The client sends requests with the token in the Authorization: Bearer <token> header.
3. The ALB validates the JWT:
   - Verifies the signature against the Cognito JWKS endpoint
   - Checks that the issuer claim matches the Cognito User Pool
   - Confirms that the scope claim contains https://gateway.internal/invoke
4. If validation passes, the request is forwarded to the ECS target group.
5. If validation fails, the ALB returns HTTP 401 automatically.
sequenceDiagram
participant Client
participant Cognito as Cognito Token Endpoint
participant ALB as ALB (Port 443)
participant ECS as ECS Fargate
Client->>Cognito: POST /oauth2/token<br/>grant_type=client_credentials
Cognito-->>Client: access_token (JWT, 1h TTL)
Client->>ALB: POST /v1/chat/completions<br/>Authorization: Bearer {token}
ALB->>ALB: Validate JWT signature (JWKS)<br/>Check issuer, scope claim
alt Valid Token
ALB->>ECS: Forward request
ECS-->>ALB: Response
ALB-->>Client: 200 OK
else Invalid Token
ALB-->>Client: 401 Unauthorized
end
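When debugging a 401, it can help to inspect the claims the ALB checks. The sketch below decodes a token's payload and repeats the issuer and scope checks locally. It deliberately does not verify the signature, so it is a debugging aid only and not a substitute for the ALB's validation. The issuer value is an assumed placeholder (Cognito issuers take the form https://cognito-idp.{region}.amazonaws.com/{userPoolId}), and the function names are hypothetical.

```python
import base64
import json

# Assumed placeholder; substitute the issuer of your own User Pool.
EXPECTED_ISSUER = "https://cognito-idp.us-east-1.amazonaws.com/us-east-1_EXAMPLE"
REQUIRED_SCOPE = "https://gateway.internal/invoke"


def decode_claims(token):
    """Decode the JWT payload WITHOUT verifying the signature (debugging only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("a JWT has three dot-separated segments")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def claims_pass(claims, issuer=EXPECTED_ISSUER, scope=REQUIRED_SCOPE):
    """Repeat the issuer and scope checks; scope is a space-delimited string."""
    return claims.get("iss") == issuer and scope in claims.get("scope", "").split()
```

For example, `claims_pass(decode_claims(token))` returns False for a token issued with only the admin scope, which is one common reason for a rejected request.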
Enabling JWT Auth
    # Both are required:
    certificate_arn = "arn:aws:acm:us-east-1:123456789012:certificate/abc-123"
    enable_jwt_auth = true

Obtaining a Token
Use the provided helper script:
    export GATEWAY_CLIENT_ID="your-client-id"
    export GATEWAY_CLIENT_SECRET="your-client-secret"
    export GATEWAY_TOKEN_ENDPOINT="https://ai-gateway-prod.auth.us-east-1.amazoncognito.com/oauth2/token"

    # Fetch an access token with the helper script.
    token=$(./scripts/get-gateway-token.sh)

    # Send a request; -d makes curl issue a POST, and the body is JSON.
    curl -H "Authorization: Bearer $token" \
      -H "Content-Type: application/json" \
      https://gateway.example.com/v1/chat/completions \
      -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}'

Cognito Configuration
The auth module creates a full Cognito M2M (machine-to-machine) authentication stack:
| Resource | Purpose |
|---|---|
| User Pool | Identity store with deletion_protection = "ACTIVE" and admin-only user creation |
| Resource Server | Defines the https://gateway.internal identifier with two scopes: invoke and admin |
| User Pool Client | M2M client using client_credentials grant with 1-hour access token validity |
| User Pool Domain | Provides the /oauth2/token endpoint (e.g., ai-gateway-prod.auth.us-east-1.amazoncognito.com) |
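The exchange between the User Pool Client and the User Pool Domain is a standard OAuth 2.0 client_credentials request. The sketch below builds such a request with the Python standard library without sending it. The helper script's internals are not shown on this page, so this is an illustration of the protocol rather than a copy of that script, and `build_token_request` is a hypothetical name.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_endpoint, client_id, client_secret, scope=None):
    """Build (but do not send) an OAuth 2.0 client_credentials token request."""
    form = {"grant_type": "client_credentials"}
    if scope:
        form["scope"] = scope  # optional; omit to request the client's default scopes
    # The client authenticates with HTTP Basic: base64("client_id:client_secret").
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        token_endpoint,
        data=urllib.parse.urlencode(form).encode(),
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` and reading the `access_token` field of the JSON response would yield the bearer token used in the Authorization header.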
Scopes
| Scope | Identifier | Purpose |
|---|---|---|
| invoke | https://gateway.internal/invoke | Required to call gateway API endpoints. Validated by the ALB JWT listener. |
| admin | https://gateway.internal/admin | Reserved for administrative operations. Not enforced by default. |
The default M2M client receives both scopes. For multi-client setups, use the B.1 Multi-Client Onboarding feature to issue per-team credentials with fine-grained scope assignments.
Secrets Manager
Four provider API keys are stored in AWS Secrets Manager, encrypted with a dedicated KMS key:
| Secret Path | Environment Variable | Provider |
|---|---|---|
| ai-gateway/openai-api-key | OPENAI_API_KEY | OpenAI |
| ai-gateway/anthropic-api-key | ANTHROPIC_API_KEY | Anthropic |
| ai-gateway/google-api-key | GOOGLE_API_KEY | Google AI |
| ai-gateway/azure-api-key | AZURE_API_KEY | Azure OpenAI |
ECS tasks retrieve these secrets at launch via the task execution role, which has secretsmanager:GetSecretValue permission scoped to arn:aws:secretsmanager:*:*:secret:ai-gateway/*.
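To see which secrets the resource pattern covers, the wildcard can be approximated with shell-style globbing. This is a rough sketch: `fnmatchcase` stands in for IAM's own wildcard matching, `is_covered` is a hypothetical helper, and the example ARNs use a placeholder account ID. Secrets Manager appends a random suffix to each secret's ARN, which the trailing `*` absorbs.

```python
from fnmatch import fnmatchcase

POLICY_RESOURCE = "arn:aws:secretsmanager:*:*:secret:ai-gateway/*"


def is_covered(secret_arn, pattern=POLICY_RESOURCE):
    """Approximate IAM wildcard matching with case-sensitive globbing."""
    return fnmatchcase(secret_arn, pattern)


# Covered: lives under the ai-gateway/ prefix (any region, any account).
print(is_covered(
    "arn:aws:secretsmanager:us-east-1:123456789012:secret:ai-gateway/openai-api-key-AbCdEf"
))

# Not covered: a secret that belongs to a different prefix.
print(is_covered(
    "arn:aws:secretsmanager:us-east-1:123456789012:secret:other-app/db-password-XyZ123"
))
```

Because the region and account segments are wildcards, the prefix ai-gateway/ is the only thing that limits access, so unrelated secrets should not be stored under it.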
KMS Encryption
Three dedicated KMS keys are used, all with automatic annual key rotation enabled:
| KMS Key Alias | Purpose |
|---|---|
| alias/ai-gateway-logs | Encrypts CloudWatch log groups |
| alias/ai-gateway-ecr | Encrypts ECR container images |
| alias/ai-gateway-secrets | Encrypts Secrets Manager values |
Network Security
VPC Isolation
- ECS tasks run in private subnets with no direct internet ingress.
- A single NAT Gateway in a public subnet provides outbound internet access (for reaching LLM provider APIs).
- The ALB is deployed in public subnets and is the only internet-facing component.
- ALB security group allows inbound on ports 80 and 443 only; egress is restricted to the VPC CIDR.
- ECS security group allows inbound on port 8787 only from the ALB security group; egress is unrestricted (for provider API calls via NAT).
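The two ingress rule sets can be modelled as data to make the trust chain explicit. This is a simplified sketch under stated assumptions: the group names `alb` and `ecs` are hypothetical, the 0.0.0.0/0 source for the ALB is assumed from it being internet-facing, and egress rules are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    port: int
    source: str  # a CIDR block or the name of another security group


# Hypothetical names; the rules mirror the security group bullets above.
SECURITY_GROUPS = {
    "alb": [IngressRule(80, "0.0.0.0/0"), IngressRule(443, "0.0.0.0/0")],
    "ecs": [IngressRule(8787, "alb")],
}


def ingress_allowed(group, port, source):
    """Return True if some rule on `group` admits `port` from `source`."""
    return any(r.port == port and r.source == source for r in SECURITY_GROUPS[group])
```

Here `ingress_allowed("ecs", 8787, "alb")` is True while `ingress_allowed("ecs", 8787, "0.0.0.0/0")` is False: the container port is reachable only through the ALB's security group.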
VPC Endpoints
Five VPC endpoints eliminate the need to route AWS service traffic through the NAT Gateway:
| Endpoint | Type | Service |
|---|---|---|
| S3 | Gateway | com.amazonaws.{region}.s3 |
| ECR API | Interface | com.amazonaws.{region}.ecr.api |
| ECR DKR | Interface | com.amazonaws.{region}.ecr.dkr |
| CloudWatch Logs | Interface | com.amazonaws.{region}.logs |
| Secrets Manager | Interface | com.amazonaws.{region}.secretsmanager |
Interface endpoints are secured by a dedicated security group that allows HTTPS (port 443) from private subnet CIDRs only.
Container Security
| Control | Implementation |
|---|---|
| ECR scan-on-push | Every image pushed to ECR is automatically scanned for vulnerabilities |
| Immutable tags | ECR image_tag_mutability = "IMMUTABLE" prevents tag overwriting |
| KMS encryption | ECR images are encrypted at rest with a dedicated KMS key |
| Lifecycle policy | Only the last 10 images are retained; older images are automatically expired |
| Cosign signing | The CI pipeline signs images with Sigstore cosign (keyless) after push |
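The effect of the lifecycle policy can be sketched as a simple selection over push timestamps. This models only the "keep the last 10" behaviour described in the table. The dictionary shape and the name `images_to_expire` are assumptions for illustration, and the actual ECR policy document is not shown on this page.

```python
def images_to_expire(images, keep=10):
    """Return the images that fall outside the `keep` most recently pushed."""
    newest_first = sorted(images, key=lambda img: img["pushed_at"], reverse=True)
    return newest_first[keep:]


# Twelve images pushed in order v0..v11; the two oldest fall outside the window.
images = [{"tag": f"v{i}", "pushed_at": i} for i in range(12)]
print([img["tag"] for img in images_to_expire(images)])
```

Since tags are immutable and old images expire, a rollback target must still be among the ten most recent pushes.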
Security Scanning Pipeline
The CI/CD pipeline runs 12 security tools across three phases. All findings are uploaded as SARIF to the GitHub Security tab.
| Tool | Phase | Target | Purpose |
|---|---|---|---|
| Semgrep | Pre-build | Python source | SAST: OWASP Top 10, security audit, Python-specific rules |
| Gitleaks | Pre-build | Git history | Secret detection in code and commit history |
| Checkov | Pre-build | Terraform | IaC misconfiguration scanning (CIS, SOC2 benchmarks) |
| Hadolint | Pre-build | Dockerfiles | Dockerfile best-practice linting |
| TFLint | Pre-build | Terraform | Terraform linting and provider-specific checks |
| Trivy | Pre-build | Container image | Vulnerability scanning (CRITICAL, HIGH severity) |
| Syft | Pre-build | Container image | SBOM generation (CycloneDX format, retained 90 days) |
| CodeQL | Post-build | Python source | Semantic code analysis (security + quality queries) |
| Scorecard | Post-build | Repository | OpenSSF supply-chain security assessment |
| Dependency Review | Post-build | PR diffs | Vulnerability and license check on dependency changes |
| Dependabot | Continuous | All ecosystems | Automated dependency updates (Python, Terraform, GitHub Actions) |
| Cosign | Post-push | ECR image | Keyless image signing with Sigstore |
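Because the findings share the SARIF format, a single script can summarise the output of several tools before upload. The sketch below counts results per tool and severity level in a SARIF 2.1.0 document. `summarize_sarif` is a hypothetical helper, and falling back to "warning" when a result has no level is a simplification of the specification's resolution rules.

```python
import json
from collections import Counter


def summarize_sarif(sarif_text):
    """Count results per (tool name, level) in a SARIF 2.1.0 document."""
    doc = json.loads(sarif_text)
    counts = Counter()
    for run in doc.get("runs", []):
        tool = run["tool"]["driver"]["name"]
        for result in run.get("results", []):
            # Simplification: treat a missing level as "warning".
            counts[(tool, result.get("level", "warning"))] += 1
    return counts
```

A CI step could, for instance, fail the job when `counts[("Trivy", "error")]` is non-zero. The tool name must match whatever the scanner writes into `tool.driver.name`.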
Pipeline Flow
flowchart LR
subgraph prebuild["Phase 1: Pre-Build"]
direction TB
Q["Code Quality\nRuff, Pyright"]
SAST["SAST\nSemgrep, Gitleaks"]
IAC["IaC Security\nCheckov, TFLint,\nTerraform validate"]
CONT["Container Security\nHadolint, Trivy, Syft"]
end
subgraph build["Phase 2: Build and Push"]
direction TB
ECR["Pull, Tag, Push\nto ECR"]
SIGN["Cosign Sign\nKeyless"]
end
subgraph postbuild["Phase 3: Post-Build"]
direction TB
DEPLOY["Deploy to ECS\nRolling Update"]
CQL["CodeQL Analysis\nWeekly Schedule"]
SC["OpenSSF Scorecard\nWeekly Schedule"]
end
subgraph continuous["Continuous"]
direction TB
DEP["Dependabot\nWeekly PRs"]
DR["Dependency Review\nPR Gate"]
end
Q & SAST & IAC & CONT --> build
ECR --> SIGN
SIGN --> DEPLOY
CQL ~~~ SC
DEP ~~~ DR