
Run Airflow on Kubernetes with GitOps-managed values

How I keep Airflow Helm values in Git while runtime Secrets come from Vault.

This post records how I run Apache Airflow on Kubernetes with Argo CD.

Airflow is a good test for a GitOps cluster because it mixes ordinary Helm values with many sensitive runtime values: the Fernet key, webserver secret key, metadata database connection, broker URL, Redis password, default admin user, and sometimes an SSH key for a private DAG repository.

My rule is:

  • non-secret Helm values live in Git
  • secret values live in Vault
  • External Secrets Operator generates the Kubernetes Secrets
  • Argo CD deploys Airflow after those Secrets exist

Airflow is a platform service in this example series. It runs alongside the same example app set used in earlier posts, but it has its own namespace and Vault path.

Series

This post is part of my home Kubernetes GitOps series:

  1. Bootstrap a new RKE cluster for GitOps
  2. Use Argo CD to manage my home Kubernetes cluster
  3. Use Vault and External Secrets in Kubernetes
  4. Run Istio ambient mode with waypoint proxies
  5. Expose Kubernetes services with Istio Gateway API
  6. Build an OpenTelemetry stack for Kubernetes apps
  7. Run Airflow on Kubernetes with GitOps-managed values
  8. Use Mozilla SOPS with GitOps for encrypted Kubernetes Secrets

Application order

I split Airflow into two Argo CD Applications:

  • airflow-secrets: sync wave 5
  • airflow: sync wave 10

The secret Application creates the ClusterSecretStore and ExternalSecret objects. The Helm Application installs Airflow after that.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: airflow-secrets
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "5"
spec:
  project: default
  source:
    repoURL: ssh://git@git.example.com/platform/k8s-infra.git
    targetRevision: main
    path: apps/airflow
  destination:
    server: https://kubernetes.default.svc
    namespace: airflow
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
      - SkipDryRunOnMissingResource=true

The Airflow chart then uses a multi-source Application. One source is the upstream Helm chart, and the other source is my Git repository for values.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: airflow
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "10"
spec:
  project: default
  sources:
    - repoURL: https://airflow.apache.org
      chart: airflow
      targetRevision: 1.21.0
      helm:
        releaseName: airflow
        valueFiles:
          - $values/apps/airflow/airflow-values.yaml
    - repoURL: ssh://git@git.example.com/platform/k8s-infra.git
      targetRevision: main
      ref: values
  destination:
    server: https://kubernetes.default.svc
    namespace: airflow
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    retry:
      limit: 5
      backoff:
        duration: 30s
        factor: 2
        maxDuration: 5m
    syncOptions:
      - CreateNamespace=true

I pin the chart version. I do not want Airflow chart upgrades to happen just because Argo CD refreshed the Application.
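
Sync waves order resources inside a single sync operation, so the wave annotations on these two Applications only take effect when a parent Application applies both of them (the app-of-apps pattern). A minimal sketch of such a parent follows. The name platform-apps and the path apps/argocd-apps are hypothetical, and it assumes both child Application manifests are stored in that directory.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: platform-apps            # hypothetical parent name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ssh://git@git.example.com/platform/k8s-infra.git
    targetRevision: main
    path: apps/argocd-apps       # hypothetical directory holding the child Application manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd            # child Application objects live in the argocd namespace
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```

Argo CD only waits between waves when it can assess the health of the child Application objects. Recent Argo CD versions do not ship that health check by default, so it may need to be restored in argocd-cm before wave 10 reliably waits on wave 5.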

Non-secret values in Git

The Git-tracked values file contains runtime shape, persistence, and references to Kubernetes Secret names.

executor: CeleryExecutor

env:
  - name: AIRFLOW__CORE__LOAD_EXAMPLES
    value: "FALSE"
  - name: AIRFLOW__WEBSERVER__DEFAULT_UI_TIMEZONE
    value: Asia/Taipei

data:
  metadataSecretName: airflow-metadata
  brokerUrlSecretName: airflow-broker-url

fernetKeySecretName: airflow-fernet-key
apiSecretKeySecretName: airflow-api-secret-key
jwtSecretName: airflow-jwt-secret
webserverSecretKeySecretName: airflow-webserver-secret-key

redis:
  passwordSecretName: airflow-redis-password

dags:
  gitSync:
    enabled: true
    repo: ssh://git@git.example.com/platform/airflow-dags.git
    branch: main
    subPath: ""
    sshKeySecret: airflow-ssh-secret
    knownHosts: |
      git.example.com ssh-ed25519 <verified-public-host-key>

logs:
  persistence:
    size: 10Gi

triggerer:
  persistence:
    size: 5Gi

workers:
  celery:
    persistence:
      size: 10Gi

The knownHosts value above is only a placeholder. In a real setup, I verify the Git server host key fingerprint before committing the public host key.

I deliberately leave a custom SSH port out of the DAG repository URL in this example. If the Git server listens on the standard SSH port, the repo URL stays clean. If a custom port is needed, that is a Git server decision, not something to bake into the Airflow chart defaults.

Secret values in Vault

The secret payload can be one YAML document in Vault. For example:

createUserJob:
  defaultUser:
    username: admin
    password: replace-with-a-generated-password
    email: admin@example.com
    firstName: platform
    lastName: admin
fernetKey: replace-with-fernet-key
apiSecretKey: replace-with-api-secret-key
jwtSecret: replace-with-jwt-secret
metadataConnection: postgresql://airflow:replace-me@<postgres-host>:5432/airflow?sslmode=disable
brokerUrl: redis://:replace-me@airflow-redis:6379/0
redisPassword: replace-with-redis-password
webserverSecretKey: replace-with-webserver-secret-key
extraSecrets:
  airflow-ssh-secret:
    data:
      gitSshKey: |
        -----BEGIN OPENSSH PRIVATE KEY-----
        <private-key-from-vault>
        -----END OPENSSH PRIVATE KEY-----

Then I write it to Vault:

vault kv put secret/airflow/config config=@airflow-values-secret.yml

The example values are intentionally fake. In a real cluster, I generate those keys and passwords once, store them in Vault, and keep them stable across chart upgrades. Rotating them is a planned operation, not something I let Helm do accidentally.

External Secrets

The ClusterSecretStore points to Vault over HTTPS with the internal CA trusted by External Secrets Operator.

apiVersion: external-secrets.io/v1
kind: ClusterSecretStore
metadata:
  name: vault-airflow
spec:
  provider:
    vault:
      server: "https://vault.example.internal:8200"
      path: "secret"
      version: "v2"
      caProvider:
        type: ConfigMap
        name: vault-ca
        namespace: external-secrets
        key: ca.crt
      auth:
        kubernetes:
          mountPath: kubernetes
          role: external-secrets-airflow
          serviceAccountRef:
            name: external-secrets
            namespace: external-secrets

I avoid plain HTTP for Vault. The Airflow payload contains high-value secrets, so the Vault connection should be HTTPS even on an internal network.

Each ExternalSecret reads the same Vault payload and templates one Kubernetes Secret from it.

apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: airflow-fernet-key
  namespace: airflow
spec:
  refreshInterval: 2m
  secretStoreRef:
    name: vault-airflow
    kind: ClusterSecretStore
  target:
    name: airflow-fernet-key
    creationPolicy: Owner
    template:
      engineVersion: v2
      data:
        fernet-key: '{{ (fromYaml .config).fernetKey }}'
  data:
    - secretKey: config
      remoteRef:
        key: airflow/config
        property: config

I use the same pattern for airflow-api-secret-key, airflow-jwt-secret, airflow-metadata, airflow-broker-url, airflow-redis-password, airflow-webserver-secret-key, airflow-default-user, and airflow-ssh-secret.
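
As an illustration of that pattern, here is what the airflow-metadata object could look like. It is a sketch derived from the fernet-key example: only the object name, the target name, and the template change. The chart reads the database URI from a Secret key named connection, and metadataConnection is the field name used in the Vault payload earlier in this post.

```yaml
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: airflow-metadata
  namespace: airflow
spec:
  refreshInterval: 2m
  secretStoreRef:
    name: vault-airflow
    kind: ClusterSecretStore
  target:
    name: airflow-metadata
    creationPolicy: Owner
    template:
      engineVersion: v2
      data:
        # The chart expects the metadata database URI under the "connection" key.
        connection: '{{ (fromYaml .config).metadataConnection }}'
  data:
    # Same Vault payload as every other Airflow ExternalSecret.
    - secretKey: config
      remoteRef:
        key: airflow/config
        property: config
```

The broker URL Secret follows the same shape, with brokerUrl as the source field.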

Vault policy

The Airflow role only needs read access to the Airflow path.

vault policy write airflow-secret-read - <<'EOF'
path "secret/data/airflow/*" {
  capabilities = ["read"]
}
EOF

vault write auth/kubernetes/role/external-secrets-airflow \
  bound_service_account_names=external-secrets \
  bound_service_account_namespaces=external-secrets \
  audience=https://kubernetes.default.svc.cluster.local \
  policies=airflow-secret-read \
  ttl=1h

This keeps Airflow’s secret access separate from example-api, example-worker, and example-admin.

Default user job

I also let the chart create the default user, or reset its password, from a generated Secret. The job script reads AIRFLOW_DEFAULT_* environment variables, which the values inject from airflow-default-user.

createUserJob:
  enabled: true
  useHelmHooks: false
  applyCustomEnv: false
  args:
    - bash
    - -ec
    - |
      airflow db check-migrations --migration-wait-timeout=300
      airflow users create \
        --username "${AIRFLOW_DEFAULT_USERNAME}" \
        --firstname "${AIRFLOW_DEFAULT_FIRST_NAME}" \
        --lastname "${AIRFLOW_DEFAULT_LAST_NAME}" \
        --role Admin \
        --email "${AIRFLOW_DEFAULT_EMAIL}" \
        --password "${AIRFLOW_DEFAULT_PASSWORD}" \
      || true
      airflow users reset-password \
        --username "${AIRFLOW_DEFAULT_USERNAME}" \
        --password "${AIRFLOW_DEFAULT_PASSWORD}"

The important part is useHelmHooks: false. I want Argo CD to see and manage the job instead of Helm hiding it behind hook behavior.
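
The script reads five AIRFLOW_DEFAULT_* variables. One way to provide them is to name the keys of the airflow-default-user Secret after those variables, so the whole Secret can be exposed to the job as environment variables. The sketch below assumes that key naming. How the Secret is attached to the job container depends on the chart version, so that part is left out.

```yaml
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: airflow-default-user
  namespace: airflow
spec:
  refreshInterval: 2m
  secretStoreRef:
    name: vault-airflow
    kind: ClusterSecretStore
  target:
    name: airflow-default-user
    creationPolicy: Owner
    template:
      engineVersion: v2
      data:
        # Key names mirror the variables used by the job script (an assumption of this sketch).
        AIRFLOW_DEFAULT_USERNAME: '{{ (fromYaml .config).createUserJob.defaultUser.username }}'
        AIRFLOW_DEFAULT_PASSWORD: '{{ (fromYaml .config).createUserJob.defaultUser.password }}'
        AIRFLOW_DEFAULT_EMAIL: '{{ (fromYaml .config).createUserJob.defaultUser.email }}'
        AIRFLOW_DEFAULT_FIRST_NAME: '{{ (fromYaml .config).createUserJob.defaultUser.firstName }}'
        AIRFLOW_DEFAULT_LAST_NAME: '{{ (fromYaml .config).createUserJob.defaultUser.lastName }}'
  data:
    - secretKey: config
      remoteRef:
        key: airflow/config
        property: config
```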

Validate

I check generated Secrets first.

kubectl get clustersecretstore vault-airflow
kubectl -n airflow get externalsecret
kubectl -n airflow get secret airflow-fernet-key airflow-metadata airflow-ssh-secret

Then I check the chart resources.

kubectl -n airflow get pods
kubectl -n airflow get jobs
kubectl -n airflow logs job/airflow-create-user --tail=120

If pods start before the Secrets exist, I sync airflow-secrets first and then refresh the Airflow Application.

Common problems

If Airflow keeps generating new keys, I check whether the chart is creating secrets instead of reading the existing Secret names.

If the DAG sync container cannot clone, I check the SSH private key Secret and the verified knownHosts value. I do not disable host key checking just to make Git sync green.
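
When the key Secret itself looks wrong, one thing worth checking is how the template reaches the nested value. In the Vault payload shown earlier, the private key sits under a field named airflow-ssh-secret, and a hyphenated name cannot be reached with dot access in a Go template. A sketch of the ExternalSecret using index instead, assuming that payload layout:

```yaml
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: airflow-ssh-secret
  namespace: airflow
spec:
  refreshInterval: 2m
  secretStoreRef:
    name: vault-airflow
    kind: ClusterSecretStore
  target:
    name: airflow-ssh-secret
    creationPolicy: Owner
    template:
      engineVersion: v2
      data:
        # "airflow-ssh-secret" contains hyphens, so walk the map with index.
        # git-sync reads the private key from the "gitSshKey" key.
        gitSshKey: '{{ index (fromYaml .config) "extraSecrets" "airflow-ssh-secret" "data" "gitSshKey" }}'
  data:
    - secretKey: config
      remoteRef:
        key: airflow/config
        property: config
```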

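If the ClusterSecretStore is ready but a Kubernetes Secret is missing, I check the Vault policy path. With KV v2, the CLI path secret/airflow/config is read through the API path secret/data/airflow/config, so the policy must cover secret/data/airflow/*.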

If the default user job fails, I check whether migrations finished before the user command ran. A retry policy helps, but it should not hide a broken database connection string.

Conclusion

Airflow fits GitOps well as long as I keep a hard line between values and secrets. Git describes the chart and Secret names. Vault stores the values. External Secrets turns those values into Kubernetes Secrets. Argo CD applies the chart after that.

That split makes the Airflow install rebuildable without putting the most sensitive pieces into the repository.