Environment Variables

The AI Gateway ECS task runs two containers: the gateway (agentgateway, a Rust proxy on a distroless base) and the OTel collector (AWS Distro for OpenTelemetry). The only environment variables the gateway receives are provider API keys, which are injected from AWS Secrets Manager. Routing, providers, guardrails, and access-log shaping all live in the YAML config that Terraform renders and passes to agentgateway inline via -c, not in environment variables.

Gateway Container

Runtime Configuration

agentgateway takes its entire configuration inline from the -c argument (the rendered agentgateway-config.yaml.tftpl), so the container declares no plain-text environment variables (environment = [] in the task definition). The only variables present at runtime are the provider API keys injected through the secrets block, described in the next section. The container listens on port 8787, which matches the ALB target group health check; that port is fixed in the config, not set via an env var.

Provider API Keys (Secrets)

Provider API keys are injected from AWS Secrets Manager using the ECS secrets integration. The ECS agent fetches the secret value at task launch and exposes it as an environment variable inside the container.

Variable	Secrets Manager Path	Description
`OPENAI_API_KEY`	`ai-gateway/openai-api-key`	OpenAI API key
`ANTHROPIC_API_KEY`	`ai-gateway/anthropic-api-key`	Anthropic API key
`GOOGLE_API_KEY`	`ai-gateway/google-api-key`	Google API key
`AZURE_API_KEY`	`ai-gateway/azure-api-key`	Azure OpenAI API key

How secrets are injected

Terraform creates four aws_secretsmanager_secret resources under the ai-gateway/ prefix, encrypted with a dedicated KMS key (alias/ai-gateway-secrets).
The ECS task definition references each secret by ARN in the secrets block (not environment).
At task launch, the ECS agent calls secretsmanager:GetSecretValue using the task execution role and injects the plaintext value as the named environment variable.
The agentgateway config references the variable by name with a shell-style placeholder (e.g. ${ANTHROPIC_API_KEY}), so the secret value never appears in the task definition or CloudWatch logs.
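
Because a missing secret only surfaces when a task fails to launch, it can help to confirm that all four secrets exist before deploying. The following is a minimal sketch, not part of the repository: the function name check_gateway_secrets is hypothetical, and it assumes the AWS CLI is installed and the caller has secretsmanager:DescribeSecret on the ai-gateway/ prefix. It uses describe-secret, which returns metadata only and never prints the secret value.

```shell
# Hypothetical helper: confirm each provider secret from the table above exists.
# describe-secret returns metadata only; it does not read the secret value.
check_gateway_secrets() {
  missing=0
  for name in openai anthropic google azure; do
    secret_id="ai-gateway/${name}-api-key"
    if aws secretsmanager describe-secret --secret-id "$secret_id" \
         --query Name --output text >/dev/null 2>&1; then
      echo "ok       $secret_id"
    else
      echo "missing  $secret_id"
      missing=$((missing + 1))
    fi
  done
  # Exit status is non-zero if any secret could not be described.
  [ "$missing" -eq 0 ]
}
```

Calling check_gateway_secrets prints one line per secret and returns a non-zero status if any lookup fails, so it can gate a deploy script.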

# Update a secret via CLI
aws secretsmanager put-secret-value \
  --secret-id ai-gateway/openai-api-key \
  --secret-string "sk-..."

After updating a secret, force a new ECS deployment to pick up the change. Running tasks do not reload secrets automatically.

aws ecs update-service --cluster ai-gateway-prod \
  --service ai-gateway-gateway --force-new-deployment
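
The two commands above can be combined into a single rotation step that also waits for the new tasks to become healthy. This is a sketch under stated assumptions: the function name rotate_provider_key and the CLUSTER and SERVICE override variables are hypothetical, and the cluster and service names default to the ones used in the example above. It reads the new key from a file via the CLI's file:// prefix so the value stays out of shell history.

```shell
# Hypothetical helper: rotate one provider key, then roll the service.
# Usage: rotate_provider_key <secret-id> <file-containing-the-new-key>
# Write the key file without a trailing newline (e.g. with printf '%s'),
# since the CLI may pass file contents through unchanged.
rotate_provider_key() {
  secret_id=$1
  key_file=$2
  cluster=${CLUSTER:-ai-gateway-prod}
  service=${SERVICE:-ai-gateway-gateway}

  # Store the new value as the current secret version.
  aws secretsmanager put-secret-value \
    --secret-id "$secret_id" \
    --secret-string "file://$key_file" || return 1

  # Running tasks keep the old value, so start replacement tasks.
  aws ecs update-service --cluster "$cluster" \
    --service "$service" --force-new-deployment >/dev/null || return 1

  # Block until the deployment has settled.
  aws ecs wait services-stable --cluster "$cluster" --services "$service"
}
```

For example, rotate_provider_key ai-gateway/openai-api-key ./new-key.txt updates the OpenAI key and returns once the service is stable, or returns non-zero if any step fails.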

Routing, Guardrails, and Budget Enforcement (Inline Config, No Env Vars)

Routing, guardrails, prompt caching, and budget enforcement are not configured through environment variables. All four are rendered into the inline YAML config (agentgateway-config.yaml.tftpl):

Concern	Where it lives	Notes
Provider routing / failover	`ai.groups` priority groups	Bedrock primary, Anthropic-direct fallback by default. See Routing Strategies.
Model aliases	`policies.ai.modelAliases`	Maps requested model IDs onto backend models.
Prompt caching	`policies.ai.promptCaching`	Opt-in, Bedrock-only (injects `cachePoint` markers). Not a response cache.
Budget enforcement	`promptGuard.request` webhook	Points at the `budget_enforcement` Lambda Function URL (`budget_enforcement_webhook_url`). Renders only when set.
Content safety	`promptGuard` `bedrockGuardrails`	Inline Bedrock Guardrails (ApplyGuardrail), keyed by `bedrock_guardrail_id`. Renders only when set.

OTel Collector Container

The OTel collector sidecar runs the AWS Distro for OpenTelemetry (public.ecr.aws/aws-observability/aws-otel-collector:v0.47.0).

Variables

Variable	Description
`AOT_CONFIG_CONTENT`	Full content of the OTel Collector configuration YAML (injected inline)

The AOT_CONFIG_CONTENT variable contains the entire OTel configuration, which the collector reads at startup. The configuration is defined in infrastructure/otel-config.yaml and passed through Terraform.

OTel Configuration Summary

The collector configuration uses the ${env:AWS_REGION} placeholder, which the collector resolves from the AWS_REGION environment variable. Fargate sets that variable automatically, so it does not need to be declared in the task definition.

Receivers

Receiver	Endpoint	Description
OTLP gRPC	`localhost:4317`	Accepts traces, metrics, and logs over gRPC
OTLP HTTP	`localhost:4318`	Accepts traces, metrics, and logs over HTTP

Processors

Processor	Description
`batch`	Batches telemetry data (timeout: 5s, batch size: 512)
`memory_limiter`	Caps collector memory usage at 100 MiB
`resource`	Sets `service.name` attribute to `ai-gateway`
`attributes/genai`	Maps provider-specific attributes to OpenTelemetry GenAI semantic conventions

The attributes/genai processor runs in the traces and metrics pipelines and enriches that telemetry with the following mappings:

Source Attribute	GenAI Semantic Convention
`provider`	`gen_ai.system`
`model`	`gen_ai.request.model`
`usage.prompt_tokens`	`gen_ai.usage.input_tokens`
`usage.completion_tokens`	`gen_ai.usage.output_tokens`
`usage.cached_tokens`	`gen_ai.usage.cached_tokens`
`finish_reason`	`gen_ai.response.finish_reason`

Exporters

Exporter	Destination	Description
`awsxray`	AWS X-Ray	Distributed traces
`awsemf`	CloudWatch Metrics (Embedded Metric Format)	Custom metrics under the `AIGateway` namespace
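`awscloudwatchlogs`	CloudWatch Logs (`/ecs/ai-gateway/otel-logs`)	Log records received over OTLP (the `logs` pipeline); the collector's own output goes to the awslogs log group listed under Logging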

CloudWatch Metrics Published

The EMF exporter publishes the following metrics to the AIGateway namespace:

Metric	Dimensions	Description
`PromptTokens`	Provider, Model	Input tokens per request
`CompletionTokens`	Provider, Model	Output tokens per request
`CachedTokens`	Provider, Model	Cached tokens per request
`TokensUsed`	Provider, Model	Total tokens per request
`EstimatedCostUsd`	Provider, Model, Team	Estimated cost in USD
`RequestCount`	Provider, Model, StatusClass	Request count by status class
`ResponseTime`	Provider, Model	End-to-end response latency
`TimeToFirstToken`	Provider, Model	Time to first token (streaming)
`CacheHits`	Provider	Cache hit count
`CacheMisses`	Provider	Cache miss count
`CacheCostSavingsUsd`	Provider	Cost savings from cache hits
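
To spot-check that these metrics are arriving, one option is to query a single metric from the CLI. The sketch below is illustrative only: the function name gateway_token_sum is hypothetical, and the provider and model values passed to it must match whatever dimension values the gateway actually emits. It sums TokensUsed per hour using the Provider and Model dimensions from the table above.

```shell
# Hypothetical helper: hourly TokensUsed totals for one provider/model pair.
# Usage: gateway_token_sum <provider> <model> <start-iso8601> <end-iso8601>
gateway_token_sum() {
  aws cloudwatch get-metric-statistics \
    --namespace AIGateway \
    --metric-name TokensUsed \
    --dimensions "Name=Provider,Value=$1" "Name=Model,Value=$2" \
    --start-time "$3" --end-time "$4" \
    --period 3600 --statistics Sum \
    --query 'Datapoints[].[Timestamp,Sum]' --output text
}
```

CloudWatch does not guarantee datapoint order, so pipe the output through sort if a chronological listing is needed.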

Pipelines

Pipeline	Receivers	Processors	Exporters
`traces`	OTLP	memory_limiter, resource, attributes/genai, batch	AWS X-Ray
`metrics`	OTLP	memory_limiter, resource, attributes/genai, batch	AWS EMF
`logs`	OTLP	memory_limiter, resource, batch	CloudWatch Logs

Resource Allocation

The ECS task divides CPU and memory between the two containers:

Container	CPU	Memory
Gateway	`gateway_cpu - 256`	`gateway_memory - 256`
OTel Collector	256 units	256 MiB

With the default task size of 1024 CPU / 2048 MiB, the gateway gets 768 CPU units and 1792 MiB of memory.
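
The same subtraction can be reproduced for other task sizes. This is a small worked example, assuming (as the table above states) that the collector always reserves 256 CPU units and 256 MiB; the variable names otel_reserved, gateway_container_cpu, and gateway_container_memory are illustrative and not taken from the Terraform module.

```shell
# Task-level sizes (the defaults quoted above).
gateway_cpu=1024
gateway_memory=2048

# Fixed reservation for the OTel collector sidecar (CPU units and MiB).
otel_reserved=256

# What remains for the gateway container.
gateway_container_cpu=$((gateway_cpu - otel_reserved))
gateway_container_memory=$((gateway_memory - otel_reserved))

echo "gateway: ${gateway_container_cpu} CPU units, ${gateway_container_memory} MiB"
# prints: gateway: 768 CPU units, 1792 MiB
```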

Logging

Both containers use the awslogs log driver:

Container	Log Group	Stream Prefix
Gateway	`/ecs/ai-gateway/gateway` (from observability module)	`gateway`
OTel Collector	`/ecs/ai-gateway/otel` (from observability module)	`otel`
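
For quick debugging, either log group can be followed from a terminal. The following is a sketch rather than a supported tool: the function name tail_gateway_logs is hypothetical, it assumes AWS CLI v2 (which provides aws logs tail), and it relies on the log group names and stream prefixes listed in the table above.

```shell
# Hypothetical helper: follow one container's log group.
# Usage: tail_gateway_logs [gateway|otel] [since]
tail_gateway_logs() {
  container=${1:-gateway}   # selects both the log group suffix and the stream prefix
  since=${2:-10m}           # how far back to start, e.g. 10m, 1h
  aws logs tail "/ecs/ai-gateway/${container}" \
    --log-stream-name-prefix "$container" \
    --since "$since" --follow
}
```

Running tail_gateway_logs otel 1h follows the collector's log group starting one hour back; with no arguments it follows the gateway's logs from the last ten minutes.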