
Upgrading

This guide covers the three main upgrade paths for the AI Gateway: upgrading the agentgateway data-plane image, upgrading Terraform providers, and enabling new features.

Upgrading the agentgateway Data-Plane Image


The data plane is the agentgateway proxy (Rust, distroless base), pinned by image digest in versions.env at the repository root. We do not build the binary from source — agentgateway publishes an official hardened image, so the Dockerfile re-tags the upstream image by digest into our ECR. The digest is the immutable supply-chain contract; the tag is informational. See ADR-017 for the decision that replaced the Portkey OSS build.

versions.env holds three values:

  • AGENTGATEWAY_REF — the upstream image tag published by the agentgateway project (e.g. v0.5.6). Informational; the digest is the real pin.
  • AGENTGATEWAY_VERSION — a human-readable label used in image tags, labels, and docs.
  • AGENTGATEWAY_IMAGE_DIGEST — the sha256:... digest of the upstream multi-arch image. This is the immutable pin; the build references ghcr.io/agentgateway/agentgateway@<digest>.
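Because the digest is the real pin, it can be worth checking the file's shape before committing a bump. The following is a minimal sketch, not a script from the repository: validate_versions_env is a hypothetical helper, and it assumes versions.env holds plain KEY=value lines with no quoting.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check a versions.env file before committing a bump.
# validate_versions_env is a hypothetical helper, not part of the repository.
validate_versions_env() {
  local file="$1" key value status=0
  for key in AGENTGATEWAY_REF AGENTGATEWAY_VERSION AGENTGATEWAY_IMAGE_DIGEST; do
    # Take the first KEY=value line and strip the key.
    value=$(sed -n "s/^${key}=//p" "$file" | head -n 1)
    if [ -z "$value" ]; then
      echo "missing or empty: $key" >&2
      status=1
    fi
  done
  # The digest must be a full pin: "sha256:" followed by 64 lowercase hex characters.
  if ! grep -Eq '^AGENTGATEWAY_IMAGE_DIGEST=sha256:[0-9a-f]{64}$' "$file"; then
    echo "AGENTGATEWAY_IMAGE_DIGEST is not a full sha256 digest" >&2
    status=1
  fi
  return "$status"
}
```

Calling validate_versions_env versions.env returns non-zero and prints the reason on stderr when a key is missing or the digest is truncated. It cannot confirm that the tag and the digest describe the same image; that still requires resolving the tag against the registry.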

A scheduled GitHub Actions workflow watches upstream agentgateway releases. When it finds a newer release, it pulls the new image, re-tags it by digest, runs the container security scans, and then opens a pull request that bumps AGENTGATEWAY_REF and AGENTGATEWAY_IMAGE_DIGEST in versions.env.

If you prefer to upgrade manually or need to pin a specific image:

1. Resolve the digest for the target tag

docker buildx imagetools inspect ghcr.io/agentgateway/agentgateway:v0.5.7

Copy the sha256:... digest from the output.
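To avoid copying the digest by hand, the output can be filtered. This is a sketch under one assumption: that the human-readable output of docker buildx imagetools inspect contains a line of the form "Digest: sha256:<hex>" for the top-level multi-arch manifest. extract_digest is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: read `docker buildx imagetools inspect` output on stdin and print
# the first "Digest:" value (assumed to be the top-level multi-arch digest).
# extract_digest is a hypothetical helper, not part of the repository.
extract_digest() {
  awk '$1 == "Digest:" { print $2; exit }'
}
```

It would be used as docker buildx imagetools inspect ghcr.io/agentgateway/agentgateway:v0.5.7 | extract_digest. If the output format differs in your buildx version, the helper prints nothing, so check for an empty result before writing it into versions.env.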

2. Update versions.env

Bump AGENTGATEWAY_REF and AGENTGATEWAY_IMAGE_DIGEST together — they must always describe the same image:

# versions.env
AGENTGATEWAY_REF=v0.5.7
AGENTGATEWAY_VERSION=0.5.7
AGENTGATEWAY_IMAGE_DIGEST=sha256:<digest-from-step-1>

3. Update the gateway_image default

Update the gateway_image default in infrastructure/variables.tf to match the new tag, so terraform plan/validate resolve a consistent default.

4. Test the build locally

docker build \
  --build-arg AGENTGATEWAY_REF=v0.5.7 \
  --build-arg AGENTGATEWAY_VERSION=0.5.7 \
  --build-arg AGENTGATEWAY_IMAGE=ghcr.io/agentgateway/agentgateway@sha256:<digest> \
  -t ai-gateway:test .

The build re-tags the pinned upstream image; it intentionally adds no layers, preserving the distroless attack surface.
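The build arguments in step 4 repeat values that already live in versions.env, so they can be derived instead of retyped. The sketch below assumes versions.env is a sourceable KEY=value file and that the digest value includes its sha256: prefix, as in step 2. print_build_args is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: print the step-4 build arguments derived from a versions.env file.
# print_build_args is a hypothetical helper, not part of the repository.
print_build_args() {
  local file="$1"
  # Source inside a subshell so the variables do not leak into the caller.
  (
    # shellcheck disable=SC1090
    . "$file"
    printf -- '--build-arg AGENTGATEWAY_REF=%s\n' "$AGENTGATEWAY_REF"
    printf -- '--build-arg AGENTGATEWAY_VERSION=%s\n' "$AGENTGATEWAY_VERSION"
    printf -- '--build-arg AGENTGATEWAY_IMAGE=ghcr.io/agentgateway/agentgateway@%s\n' \
      "$AGENTGATEWAY_IMAGE_DIGEST"
  )
}
```

One way to use it is docker build $(print_build_args ./versions.env) -t ai-gateway:test . with the command substitution left unquoted, so that each flag and value becomes its own word. That relies on the values containing no whitespace, which holds for tags and digests.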

5. Push a version tag to trigger the release

# Bump the project version (updates pyproject.toml, generates CHANGELOG.md, commits, tags)
mise run release:bump-patch # or bump-minor / bump-major
# Push the tag to trigger the release workflow
git push origin main --tags

The v* tag triggers .github/workflows/release.yml, which:

  • Re-tags the pinned upstream image by digest and pushes it to ECR (and GHCR).
  • Signs the image with cosign (keyless Sigstore).
  • Generates CycloneDX and SPDX SBOMs.
  • Creates a GitHub Release with an auto-generated changelog.

The Dockerfile is a single stage that re-tags the pinned upstream image:

Step | What Happens
base | FROM ghcr.io/agentgateway/agentgateway@<AGENTGATEWAY_IMAGE_DIGEST> (the pinned distroless image)
labels | OCI labels stamp AGENTGATEWAY_VERSION and the upstream base name
expose | Port 8787 (the gateway listener); readiness is on 15021, checked by ECS

No application layers are added: there is no shell, no package manager, and the entrypoint (/app/agentgateway) is inherited from the upstream image. The ECS task definition supplies the config via command: ["-c", "<rendered config>"].

The versions.env file is loaded by the CI and release workflows via cat versions.env >> "$GITHUB_ENV", making AGENTGATEWAY_REF, AGENTGATEWAY_VERSION, and AGENTGATEWAY_IMAGE_DIGEST available as --build-arg values during the build.


Upgrading Terraform Providers

Terraform provider versions are pinned in infrastructure/versions.tf:

terraform {
  required_version = "~> 1.14"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 6.22"
    }
  }
}

1. Update the version constraint

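Edit infrastructure/versions.tf and adjust the version constraint. The ~> operator lets only the rightmost specified version component increase, so a two-part constraint such as ~> 6.22 accepts any 6.x release from 6.22.0 onward (6.22.x, 6.23.x, and so on) but never 7.0. To restrict upgrades to patch releases, pin three parts (e.g., ~> 6.22.0).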

# Example: raise the minimum to 6.39 while staying below 7.0
version = "~> 6.39"
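The ~> rule is easy to misread, so here is a small sketch of it for purely numeric two-part and three-part constraints: every component left of the rightmost one must match exactly, and the rightmost one may stay equal or increase. satisfies_pessimistic is a hypothetical helper for illustration; Terraform itself performs this check and also handles pre-release versions, which this sketch does not.

```shell
#!/usr/bin/env bash
# Sketch: does VERSION satisfy a Terraform-style "~> CONSTRAINT"?
# Handles numeric two-part (A.B) and three-part (A.B.C) constraints only.
# satisfies_pessimistic is a hypothetical helper for illustration.
satisfies_pessimistic() {
  local constraint="$1" version="$2" i n
  local -a c v
  IFS=. read -r -a c <<< "$constraint"
  IFS=. read -r -a v <<< "$version"
  n=${#c[@]}
  # Every component left of the rightmost constrained one must match exactly.
  for (( i = 0; i < n - 1; i++ )); do
    (( ${v[i]:-0} == c[i] )) || return 1
  done
  # The rightmost constrained component may only stay equal or increase.
  (( ${v[n-1]:-0} >= c[n-1] ))
}
```

With this helper, satisfies_pessimistic 6.22 6.40.1 succeeds, while satisfies_pessimistic 6.22 7.0.0 and satisfies_pessimistic 6.22.0 6.23.0 both fail. The last case shows how a three-part constraint locks the minor version.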

2. Run terraform init -upgrade

cd infrastructure
terraform init -upgrade

This downloads the latest provider version that satisfies the constraint and updates .terraform.lock.hcl.
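To confirm which version the upgrade actually selected, the lock file can be read directly. This sketch assumes the usual lock-file layout, in which each provider block opens with provider "registry.terraform.io/<namespace>/<name>" and contains a version = "..." line. locked_provider_version is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: print the version recorded for one provider in .terraform.lock.hcl.
# locked_provider_version is a hypothetical helper, not part of the repository.
locked_provider_version() {
  local lock_file="$1" provider="$2"
  awk -v header="provider \"registry.terraform.io/${provider}\"" '
    # Enter the block whose header line starts with the requested provider.
    index($0, header) == 1 { in_block = 1; next }
    # Inside that block, print the quoted value of the version line.
    in_block && $1 == "version" { gsub(/"/, "", $3); print $3; exit }
    # Leave the block at its closing brace.
    in_block && /^}/ { in_block = 0 }
  ' "$lock_file"
}
```

Running locked_provider_version .terraform.lock.hcl hashicorp/aws from the infrastructure directory prints the selected version, which is a quick check that the bump landed before reviewing the plan.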

3. Run terraform plan

terraform plan -var-file=envs/dev.tfvars

Review the plan carefully. Provider upgrades can introduce new resource behaviors, deprecated arguments, or changed defaults.

4. Commit the lock file

git add versions.tf .terraform.lock.hcl
git commit -m "deps(terraform): bump hashicorp/aws to ~> 6.39"

Enabling New Features

Most optional features are controlled by Terraform boolean toggle variables and can be enabled independently. Per-team Cognito clients are the exception: they are driven by the client_configs map (non-empty enables the clients module), not a boolean. See Feature Toggles for the full list.

1. Add the toggle variables to your .tfvars file

# Platform features
enable_cost_attribution = true
# Per-team Cognito clients (map-driven, not a boolean toggle)
client_configs = {
  platform = {
    allowed_scopes = ["https://gateway.internal/invoke"]
    description    = "Platform engineering team"
  }
}
# Metering & governance
enable_admin_api = true
enable_audit_log = true
# Identity & SSO
enable_user_auth = true

2. Provide any required configuration

Some features require additional configuration variables beyond the toggle. For example, SSO requires identity_providers and group_mapping. Refer to the Feature Toggles page for the full variable list.
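A quick text-level check can catch a toggle that was enabled without its companion variables. The sketch below covers only the SSO example above and assumes all three variables are set in the same .tfvars file. It knows nothing about values supplied through other var files or TF_VAR_ environment variables. check_sso_vars is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: warn when enable_user_auth is true but the SSO variables are absent.
# check_sso_vars is a hypothetical helper; it only inspects one .tfvars file.
check_sso_vars() {
  local file="$1" var status=0
  # Nothing to check unless the SSO toggle is switched on in this file.
  grep -Eq '^[[:space:]]*enable_user_auth[[:space:]]*=[[:space:]]*true' "$file" || return 0
  for var in identity_providers group_mapping; do
    if ! grep -Eq "^[[:space:]]*${var}[[:space:]]*=" "$file"; then
      echo "enable_user_auth is true but ${var} is not set in ${file}" >&2
      status=1
    fi
  done
  return "$status"
}
```

Running check_sso_vars envs/dev.tfvars before terraform plan gives an earlier and more specific message than a failed plan would. Terraform's own variable validation remains the authoritative check.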

3. Apply

terraform plan -var-file=envs/dev.tfvars
terraform apply -var-file=envs/dev.tfvars

Setting a toggle to false and running terraform apply destroys only the resources created by that feature. Base infrastructure and other features are unaffected.

# Disable guardrails while keeping everything else
enable_guardrails = false

If a new agentgateway image introduces issues, revert versions.env (and the gateway_image default) to the previous known-good values and push a new release tag:

# Revert versions.env and the gateway_image default to the previous known-good values.
# HEAD~1 assumes the bump is the most recent commit; otherwise substitute the known-good commit SHA.
git checkout HEAD~1 -- versions.env infrastructure/variables.tf
# Commit and tag a new release
git add versions.env infrastructure/variables.tf
git commit -m "fix: revert agentgateway image to v0.5.5"
mise run release:bump-patch
git push origin main --tags

The release workflow will re-tag and deploy the reverted image by digest.
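Before committing a revert, it helps to see exactly which pins will change. The sketch below compares two copies of versions.env key by key. diff_pins is a hypothetical helper, and it assumes plain KEY=value lines.

```shell
#!/usr/bin/env bash
# Sketch: report which of the three pins differ between two versions.env files.
# diff_pins is a hypothetical helper, not part of the repository.
diff_pins() {
  local old="$1" new="$2" key old_val new_val
  for key in AGENTGATEWAY_REF AGENTGATEWAY_VERSION AGENTGATEWAY_IMAGE_DIGEST; do
    # Take the first KEY=value line from each file and strip the key.
    old_val=$(sed -n "s/^${key}=//p" "$old" | head -n 1)
    new_val=$(sed -n "s/^${key}=//p" "$new" | head -n 1)
    if [ "$old_val" != "$new_val" ]; then
      printf '%s: %s -> %s\n' "$key" "$old_val" "$new_val"
    fi
  done
}
```

For example, diff_pins versions.env <(git show <known-good-commit>:versions.env) lists the changes the revert would make. If the tag changes but the digest does not, or the reverse, the two no longer describe the same image and the revert should be rechecked.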

For infrastructure changes, use Terraform’s standard rollback approach:

Option 1: Revert the code and re-apply

git revert <commit-sha>
terraform apply -var-file=envs/dev.tfvars

Option 2: Re-apply with the previous variable values

If you only changed .tfvars values, revert those values and re-apply. Terraform will converge the infrastructure to match the previous configuration.

To roll back a feature, set its toggle to false and apply:

# Roll back SSO
enable_user_auth = false

terraform apply -var-file=envs/dev.tfvars

This destroys only the resources created by that feature. M2M authentication and other features continue to function.