Run Istio ambient mode with waypoint proxies

How I enroll workloads into ambient mesh and add service-scoped L7 waypoints.

This post records how I use Istio ambient mode in the same GitOps-managed Kubernetes cluster from the previous posts.

Sidecars are powerful, but for a home cluster I like ambient mode because the first step is much smaller: enroll a namespace, let ztunnel handle secure L4 traffic, then add a waypoint only where I actually need L7 behavior.

The example application set stays the same:

  • example-api: public API at api.example.com
  • example-worker: public worker UI at worker.example.com
  • example-admin: internal admin app, no public route

In this post, example-api gets a service-scoped waypoint. example-worker starts with ambient L4 only. example-admin stays internal, but it can still be enrolled in ambient mode later.

Series

This post is part of my home Kubernetes GitOps series:

  1. Bootstrap a new RKE cluster for GitOps
  2. Use Argo CD to manage my home Kubernetes cluster
  3. Use Vault and External Secrets in Kubernetes
  4. Run Istio ambient mode with waypoint proxies
  5. Expose Kubernetes services with Istio Gateway API
  6. Build an OpenTelemetry stack for Kubernetes apps
  7. Run Airflow on Kubernetes with GitOps-managed values
  8. Use Mozilla SOPS with GitOps for encrypted Kubernetes Secrets

What ambient mode changes

In sidecar mode, every meshed pod gets its own Envoy sidecar. In ambient mode, the first layer lives at the node level: Istio CNI redirects pod traffic to ztunnel, a per-node proxy that handles mTLS and L4 traffic.

For a namespace, the switch is just a label:

apiVersion: v1
kind: Namespace
metadata:
  name: example-api
  labels:
    istio.io/dataplane-mode: ambient

Once the label is applied, ztunnel handles L4 mesh traffic for the pods in that namespace. I do not need to inject a sidecar into every workload.

That does not mean I get every L7 feature automatically. For HTTP routing, authorization, telemetry, or policy that needs L7 visibility, I add a waypoint.

GitOps sync order

Ambient mode has a few platform pieces that must exist before application pods are enrolled.

In my app-of-apps layout, the order is:

  • wave -40: Gateway API CRDs
  • wave -30: Istio base CRDs and cluster roles
  • wave -20: Istio control plane
  • wave -10: Istio CNI
  • wave -5: ztunnel
  • wave 10+: application namespaces and workloads

The important part is that the namespace label should not be applied before Istio CNI and ztunnel are healthy. Otherwise, the first sync is harder to debug because the workloads and the mesh arrive at the same time.
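
The sync order above maps to Argo CD's argocd.argoproj.io/sync-wave annotation. As a minimal sketch, a ztunnel child Application at wave -5 could look like the following. The Application name, the argocd namespace, the default project, and the pinned chart version are illustrative assumptions; only the chart repository URL (Istio's official Helm repository) and the wave number come from known sources.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ztunnel                 # assumption: child app name
  namespace: argocd             # assumption: Argo CD install namespace
  annotations:
    # Wave -5 from the list above: after Istio CNI, before applications.
    argocd.argoproj.io/sync-wave: "-5"
spec:
  project: default              # assumption
  source:
    repoURL: https://istio-release.storage.googleapis.com/charts
    chart: ztunnel
    targetRevision: 1.24.0      # assumption: placeholder, pin to the istiod version
  destination:
    server: https://kubernetes.default.svc
    namespace: istio-system
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```

One caveat with app-of-apps: Argo CD only waits between waves for resources it can health-check, and as far as I know recent versions do not assess the health of Application resources by default. The parent app may need a custom health check for argoproj.io/Application before the waves actually gate each other.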

Add a waypoint

For service-scoped L7 processing, I create a Gateway with gatewayClassName: istio-waypoint.

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: example-api-waypoint
  namespace: example-api
  labels:
    istio.io/waypoint-for: service
spec:
  gatewayClassName: istio-waypoint
  listeners:
    - name: mesh
      port: 15008
      protocol: HBONE

Then I label the Service that should use it:

apiVersion: v1
kind: Service
metadata:
  name: example-api
  namespace: example-api
  labels:
    istio.io/use-waypoint: example-api-waypoint
spec:
  selector:
    app: example-api
  ports:
    - name: http
      port: 3000
      targetPort: 3000

This keeps the waypoint scoped to traffic addressed to the Service. I like that boundary: the namespace can be in ambient mode, but I still choose where L7 processing is worth the extra component.
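
With the waypoint attached, L7 policy can target the Service directly. As a hedged sketch, this AuthorizationPolicy allows only GET requests from one caller identity to example-api. The policy name and the example-worker service account principal are assumptions for illustration; nothing above says which workloads actually call example-api.

```yaml
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: example-api-allow-get   # hypothetical name
  namespace: example-api
spec:
  # targetRefs binds the policy to the Service, so the waypoint enforces it.
  targetRefs:
    - kind: Service
      group: ""
      name: example-api
  action: ALLOW
  rules:
    - from:
        - source:
            # Assumption: the caller runs as this service account.
            principals:
              - cluster.local/ns/example-worker/sa/example-worker
      to:
        - operation:
            # HTTP method matching needs the L7 waypoint; ztunnel alone cannot do it.
            methods: ["GET"]
```

Because this is an ALLOW policy, requests to example-api that match no rule are denied, so I would apply something like it only after confirming the real callers.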

Restart workloads after enrollment

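Ambient enrollment does not strictly require a restart, because Istio CNI also enrolls pods that are already running. I restart the workload anyway once the namespace label and mesh components are ready, so every pod starts from a known state under the new namespace setup.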

kubectl -n example-api rollout restart deployment example-api
kubectl -n example-worker rollout restart deployment example-worker

For a real migration, I do this gradually. Ambient mode is easier to introduce than sidecars, but I still want a clean before-and-after point when checking traffic and metrics.

External services

Some pods call HTTPS APIs outside the cluster. For those, I add explicit ServiceEntry resources so the mesh and Kiali can classify egress traffic.

apiVersion: networking.istio.io/v1
kind: ServiceEntry
metadata:
  name: example-vendor-api
  namespace: example-api
spec:
  hosts:
    - api.vendor.example
  location: MESH_EXTERNAL
  ports:
    - number: 443
      name: tls
      protocol: TLS
  resolution: DNS

I avoid wildcard hosts here. Explicit FQDNs are easier to audit and usually enough for service-to-service integrations.

Validate

First I check the ambient platform pieces.

kubectl -n istio-system get pods -l app=ztunnel
kubectl -n istio-system get pods -l k8s-app=istio-cni-node
kubectl -n istio-system get pods -l app=istiod

Then I check the namespace and waypoint.

kubectl get namespace example-api --show-labels
kubectl -n example-api get gateway example-api-waypoint
kubectl -n example-api get deploy,svc | grep waypoint
kubectl -n example-api get svc example-api --show-labels

For egress visibility:

kubectl -n example-api get serviceentry

For application health:

kubectl -n example-api get pods -o wide
kubectl -n example-api logs deploy/example-api --tail=80

Kiali access

I keep Kiali private. Even if anonymous auth is enabled for a home lab, the Kiali Service should stay ClusterIP and be reached only through a local port-forward.

kubectl -n istio-system port-forward svc/kiali 20001:20001

Then open http://127.0.0.1:20001 locally.

I do not expose Kiali through the public Gateway. It is an observability and control-plane tool, not an app endpoint.
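
For reference, here is a minimal sketch of Helm values that match this setup, assuming Kiali is installed from the kiali-server chart. The install method is an assumption, and the keys reflect my reading of that chart's options, so check them against the chart version in use.

```yaml
# Values for the kiali-server Helm chart (assumed install method).
auth:
  strategy: anonymous        # home lab only: no login prompt
deployment:
  service_type: ClusterIP    # no NodePort or LoadBalancer
  ingress:
    enabled: false           # do not create an Ingress for Kiali
```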

Common problems

If the waypoint does not appear, I check whether the Gateway API CRDs and the Istio control plane synced before the application.

If the Service label exists but traffic does not look right, I check whether the workload pods were restarted after namespace enrollment.

If Kiali shows unknown external traffic, I check whether the pod is calling a host that does not have a matching ServiceEntry.

If a namespace is not ready for mesh traffic, I remove the ambient label from that namespace instead of trying to debug everything at once.

Conclusion

Ambient mode gives me a nice migration path. I can enroll namespaces without touching every pod spec, then add waypoints only for services that need L7 features.

For my GitOps flow, the rule is simple: install mesh infrastructure early, enroll applications later, and keep Kiali private while I verify the graph.