Environment Variables

The AI Gateway ECS task runs two containers: the gateway (agentgateway, a Rust proxy on a distroless base) and the OTel collector (AWS Distro for OpenTelemetry). Provider API keys are injected securely from AWS Secrets Manager. Everything else about routing, providers, guardrails, and access-log shaping lives in the YAML config that Terraform renders and passes to agentgateway inline via -c — not through environment variables.


agentgateway takes its entire configuration inline from the -c argument (the rendered agentgateway-config.yaml.tftpl), so the container sets no runtime environment variables (environment = [] in the task definition). The container listens on port 8787, which matches the ALB target group health check; that port is fixed in the config, not set via an env var.

Provider API keys are injected from AWS Secrets Manager using the ECS secrets integration. The ECS agent fetches the secret value at task launch and exposes it as an environment variable inside the container.

| Variable | Secrets Manager Path | Description |
|---|---|---|
| OPENAI_API_KEY | ai-gateway/openai-api-key | OpenAI API key |
| ANTHROPIC_API_KEY | ai-gateway/anthropic-api-key | Anthropic API key |
| GOOGLE_API_KEY | ai-gateway/google-api-key | Google API key |
| AZURE_API_KEY | ai-gateway/azure-api-key | Azure OpenAI API key |

1. Terraform creates four aws_secretsmanager_secret resources under the ai-gateway/ prefix, encrypted with a dedicated KMS key (alias/ai-gateway-secrets).
2. The ECS task definition references each secret by ARN in the secrets block (not environment).
3. At task launch, the ECS agent calls secretsmanager:GetSecretValue using the task execution role and injects the plaintext value as the named environment variable.
4. The agentgateway config references the variable with shell-style expansion (e.g. ${ANTHROPIC_API_KEY}); the secret value never appears in the task definition or CloudWatch logs.
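
To make steps 2 and 3 concrete, here is a minimal sketch that builds the environment-related fields of a container definition as plain Python data. The name/valueFrom shape follows the ECS container definition format; the build_container_secrets helper, the region, and the account ID are hypothetical placeholders. Real Secrets Manager ARNs also carry a generated suffix, which Terraform would supply.

```python
import json

# Variable name -> Secrets Manager path, as listed in the table above.
SECRET_PATHS = {
    "OPENAI_API_KEY": "ai-gateway/openai-api-key",
    "ANTHROPIC_API_KEY": "ai-gateway/anthropic-api-key",
    "GOOGLE_API_KEY": "ai-gateway/google-api-key",
    "AZURE_API_KEY": "ai-gateway/azure-api-key",
}


def build_container_secrets(arn_prefix: str) -> dict:
    """Return the env-related fields of a container definition.

    Hypothetical helper for illustration only. arn_prefix is an assumed
    placeholder such as "arn:aws:secretsmanager:<region>:<account>:secret:".
    """
    return {
        # No plain environment variables: configuration arrives inline via -c.
        "environment": [],
        # Each entry names the env var and points at the secret. The value
        # itself is fetched at task launch and is never stored here.
        "secrets": [
            {"name": name, "valueFrom": arn_prefix + path}
            for name, path in SECRET_PATHS.items()
        ],
    }


if __name__ == "__main__":
    # Placeholder region and account ID, not values from this deployment.
    prefix = "arn:aws:secretsmanager:us-east-1:123456789012:secret:"
    print(json.dumps(build_container_secrets(prefix), indent=2))
```

Because only ARNs appear in the definition, anyone who can read the task definition sees where each key lives but not the key itself.
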

```bash
# Update a secret via CLI
aws secretsmanager put-secret-value \
  --secret-id ai-gateway/openai-api-key \
  --secret-string "sk-..."
```

Routing, Guardrails, and Budget Enforcement (Inline Config, No Env Vars)


Routing, guardrails, prompt caching, and budget enforcement are not configured through environment variables. All four are rendered into the inline YAML config (agentgateway-config.yaml.tftpl):

| Concern | Where it lives | Notes |
|---|---|---|
| Provider routing / failover | ai.groups priority groups | Bedrock primary, Anthropic-direct fallback by default. See Routing Strategies. |
| Model aliases | policies.ai.modelAliases | Maps requested model IDs onto backend models. |
| Prompt caching | policies.ai.promptCaching | Opt-in, Bedrock-only (injects cachePoint markers). Not a response cache. |
| Budget enforcement | promptGuard.request webhook | Points at the budget_enforcement Lambda Function URL (budget_enforcement_webhook_url). Renders only when set. |
| Content safety | promptGuard bedrockGuardrails | Inline Bedrock Guardrails (ApplyGuardrail), keyed by bedrock_guardrail_id. Renders only when set. |

The OTel collector sidecar runs the AWS Distro for OpenTelemetry (public.ecr.aws/aws-observability/aws-otel-collector:v0.47.0).

| Variable | Description |
|---|---|
| AOT_CONFIG_CONTENT | Full content of the OTel Collector configuration YAML (injected inline) |

The AOT_CONFIG_CONTENT variable contains the entire OTel configuration, which the collector reads at startup. The configuration is defined in infrastructure/otel-config.yaml and passed through Terraform.

The collector configuration uses the ${env:AWS_REGION} placeholder, which the collector resolves from the AWS_REGION environment variable (set automatically by Fargate).
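
As a rough illustration of what a ${env:NAME} placeholder means, the sketch below substitutes values from a dictionary standing in for the container environment. The resolve_env_placeholders function is hypothetical and is not the collector's implementation; in particular, raising on an unset variable is a choice made for this sketch, and the real collector may behave differently.

```python
import re

# Matches ${env:NAME} placeholders as they appear in the collector configuration.
_ENV_PLACEHOLDER = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env_placeholders(text: str, env: dict) -> str:
    """Replace each ${env:NAME} in text with env[NAME] (illustrative only)."""

    def _lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in env:
            # Assumption for this sketch: fail loudly on unset variables.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _ENV_PLACEHOLDER.sub(_lookup, text)


if __name__ == "__main__":
    # The region value here is a placeholder, not taken from this deployment.
    print(resolve_env_placeholders("region: ${env:AWS_REGION}", {"AWS_REGION": "us-east-1"}))
```
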

| Receiver | Endpoint | Description |
|---|---|---|
| OTLP gRPC | localhost:4317 | Accepts traces, metrics, and logs over gRPC |
| OTLP HTTP | localhost:4318 | Accepts traces, metrics, and logs over HTTP |

| Processor | Description |
|---|---|
| batch | Batches telemetry data (timeout: 5s, batch size: 512) |
| memory_limiter | Caps collector memory usage at 100 MiB |
| resource | Sets service.name attribute to ai-gateway |
| attributes/genai | Maps provider-specific attributes to OpenTelemetry GenAI semantic conventions |

The attributes/genai processor enriches spans with the following mappings:

| Source Attribute | GenAI Semantic Convention |
|---|---|
| provider | gen_ai.system |
| model | gen_ai.request.model |
| usage.prompt_tokens | gen_ai.usage.input_tokens |
| usage.completion_tokens | gen_ai.usage.output_tokens |
| usage.cached_tokens | gen_ai.usage.cached_tokens |
| finish_reason | gen_ai.response.finish_reason |
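
The effect of these mappings on a single span's attributes can be sketched in Python. This is only an illustration: the actual processor is configured in the collector YAML, and the to_genai_attributes helper is hypothetical. The sketch assumes the source attributes are kept alongside the new GenAI keys, which the mapping table does not confirm.

```python
# Source attribute -> GenAI semantic convention key, from the mapping table.
GENAI_ATTRIBUTE_MAP = {
    "provider": "gen_ai.system",
    "model": "gen_ai.request.model",
    "usage.prompt_tokens": "gen_ai.usage.input_tokens",
    "usage.completion_tokens": "gen_ai.usage.output_tokens",
    "usage.cached_tokens": "gen_ai.usage.cached_tokens",
    "finish_reason": "gen_ai.response.finish_reason",
}


def to_genai_attributes(attributes: dict) -> dict:
    """Return a copy of attributes with GenAI keys added for known sources.

    Assumption: originals are preserved (copy, not rename). Attributes that
    are absent from the input produce no GenAI key.
    """
    enriched = dict(attributes)
    for source_key, genai_key in GENAI_ATTRIBUTE_MAP.items():
        if source_key in attributes:
            enriched[genai_key] = attributes[source_key]
    return enriched


if __name__ == "__main__":
    span = {"provider": "anthropic", "usage.prompt_tokens": 12}
    print(to_genai_attributes(span))
```
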

| Exporter | Destination | Description |
|---|---|---|
| awsxray | AWS X-Ray | Distributed traces |
| awsemf | CloudWatch Metrics (Embedded Metric Format) | Custom metrics under the AIGateway namespace |
| awscloudwatchlogs | CloudWatch Logs (/ecs/ai-gateway/otel-logs) | Collector logs |

The EMF exporter publishes the following metrics to the AIGateway namespace:

| Metric | Dimensions | Description |
|---|---|---|
| PromptTokens | Provider, Model | Input tokens per request |
| CompletionTokens | Provider, Model | Output tokens per request |
| CachedTokens | Provider, Model | Cached tokens per request |
| TokensUsed | Provider, Model | Total tokens per request |
| EstimatedCostUsd | Provider, Model, Team | Estimated cost in USD |
| RequestCount | Provider, Model, StatusClass | Request count by status class |
| ResponseTime | Provider, Model | End-to-end response latency |
| TimeToFirstToken | Provider, Model | Time to first token (streaming) |
| CacheHits | Provider | Cache hit count |
| CacheMisses | Provider | Cache miss count |
| CacheCostSavingsUsd | Provider | Cost savings from cache hits |

| Pipeline | Receivers | Processors | Exporters |
|---|---|---|---|
| traces | OTLP | memory_limiter, resource, attributes/genai, batch | AWS X-Ray |
| metrics | OTLP | memory_limiter, resource, attributes/genai, batch | AWS EMF |
| logs | OTLP | memory_limiter, resource, batch | CloudWatch Logs |
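
All three pipelines place memory_limiter first and batch last. A small sketch of that ordering as a check over plain data follows; the PIPELINES dictionary restates the pipeline table, the otlp receiver ID is assumed, and the pipelines_with_bad_order helper is hypothetical rather than part of the collector.

```python
# Pipelines restated from the table; exporter IDs follow the exporter table,
# and "otlp" is the assumed ID of the OTLP receiver.
PIPELINES = {
    "traces": {
        "receivers": ["otlp"],
        "processors": ["memory_limiter", "resource", "attributes/genai", "batch"],
        "exporters": ["awsxray"],
    },
    "metrics": {
        "receivers": ["otlp"],
        "processors": ["memory_limiter", "resource", "attributes/genai", "batch"],
        "exporters": ["awsemf"],
    },
    "logs": {
        "receivers": ["otlp"],
        "processors": ["memory_limiter", "resource", "batch"],
        "exporters": ["awscloudwatchlogs"],
    },
}


def pipelines_with_bad_order(pipelines: dict) -> list:
    """Return names of pipelines that do not run memory_limiter first and batch last."""
    bad = []
    for name, pipeline in pipelines.items():
        processors = pipeline["processors"]
        if not processors or processors[0] != "memory_limiter" or processors[-1] != "batch":
            bad.append(name)
    return bad


if __name__ == "__main__":
    print(pipelines_with_bad_order(PIPELINES))
```
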

The ECS task divides CPU and memory between the two containers:

| Container | CPU | Memory |
|---|---|---|
| Gateway | gateway_cpu - 256 | gateway_memory - 256 |
| OTel Collector | 256 units | 256 MiB |

With the default task size of 1024 CPU / 2048 MiB, the gateway gets 768 CPU units and 1792 MiB of memory.
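
The split is simple subtraction, sketched below. The parameter names gateway_cpu and gateway_memory mirror the table; the gateway_resources function and the ContainerResources type are hypothetical, and the guard against undersized tasks is an addition for the sketch.

```python
from typing import NamedTuple

# Fixed reservation for the OTel collector sidecar, from the table above.
OTEL_CPU_UNITS = 256
OTEL_MEMORY_MIB = 256


class ContainerResources(NamedTuple):
    cpu_units: int
    memory_mib: int


def gateway_resources(gateway_cpu: int, gateway_memory: int) -> ContainerResources:
    """Return what is left for the gateway after the sidecar's share is removed."""
    cpu = gateway_cpu - OTEL_CPU_UNITS
    memory = gateway_memory - OTEL_MEMORY_MIB
    if cpu <= 0 or memory <= 0:
        # Assumption for this sketch: reject task sizes the sidecar would exhaust.
        raise ValueError("task size leaves no resources for the gateway container")
    return ContainerResources(cpu, memory)


if __name__ == "__main__":
    # Default task size of 1024 CPU / 2048 MiB.
    print(gateway_resources(1024, 2048))
```
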


Both containers use the awslogs log driver:

| Container | Log Group | Stream Prefix |
|---|---|---|
| Gateway | /ecs/ai-gateway/gateway (from observability module) | gateway |
| OTel Collector | /ecs/ai-gateway/otel (from observability module) | otel |