Hooks are one of the most underused features in Argo CD. They let you run Kubernetes resources, typically Jobs, at specific phases of a sync, turning GitOps from “apply YAML” into a full release workflow. After a few production scares last year, mainly schema migrations and feature flags racing each other, I doubled down on hooks. Here’s what actually delivered value.

PreSync: stop bad releases early

PreSync hooks run before Argo CD applies your manifests. I standardised on a Job named db-guardrail that checks both database connectivity and schema drift:

apiVersion: batch/v1
kind: Job
metadata:
  name: db-guardrail
  annotations:
    argocd.argoproj.io/hook: PreSync
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: eva
          image: ghcr.io/almogxsh/eva:latest
          command: ["eva", "migrate", "--check"]

If this job fails, maybe because the migration wasn’t committed or the database is unreachable, the sync aborts with a clear error. That’s cheaper than discovering the issue after I’ve already taken pods offline.

Bonus: enforce feature flag readiness

Another PreSync job calls my flag service to ensure the new flag state matches what the release expects. It prevents the common “feature enabled in production, but rollout stuck in staging” problem.
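
A flag check like this can be written as a second PreSync Job. The sketch below is illustrative only: the Job name, the flag-service URL, the flag name, and the expect parameter are all hypothetical, and it assumes the service answers with a non-2xx status when the flag is not in the expected state.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pre-flag-guardrail              # hypothetical name
  annotations:
    argocd.argoproj.io/hook: PreSync
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: flag-check
          image: curlimages/curl:latest  # any image with curl works; pin a version in practice
          # --fail makes curl exit non-zero on HTTP errors,
          # which fails the Job and aborts the sync
          command:
            - curl
            - --fail
            - --silent
            - --show-error
            - "http://flag-service.internal/api/flags/new-checkout?expect=off"  # hypothetical endpoint
```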

Sync and Sync-Wave: orchestrate migrations safely

The Sync stage is where Argo CD applies manifests. By introducing sync waves, I can enforce an order:

metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "0"   # migrations
---
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "1"   # deployment
---
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "2"   # canary service

Wave 0 runs a migration job with argocd.argoproj.io/hook: Sync. Only after it finishes successfully do I roll out the Deployment (wave 1), and finally update the Service or Gateway (wave 2). This sequence cut failed rollouts by ~70% because database changes are always in place before the application deploys.
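
For reference, a wave 0 migration hook might look like the following. This is a sketch under assumptions: the Job name is hypothetical, and eva migrate (without --check) is assumed to apply pending migrations, since only the --check form appears in this post.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate                      # hypothetical name
  annotations:
    argocd.argoproj.io/hook: Sync
    argocd.argoproj.io/sync-wave: "0"    # lower waves are applied first
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
spec:
  backoffLimit: 0                        # do not blindly retry a failed migration
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: eva
          image: ghcr.io/almogxsh/eva:latest
          command: ["eva", "migrate"]    # assumed subcommand form that applies migrations
```

Keeping backoffLimit at 0 here is a deliberate choice: a migration that failed halfway is usually something a person should look at before it runs again.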

PostSync: verify and self-heal

PostSync hooks are great for automated smoke tests. Mine run K6 tests against the public endpoint and emit metrics to Prometheus:

metadata:
  name: rollout-smoke
  annotations:
    argocd.argoproj.io/hook: PostSync
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
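
Those annotations sit on an ordinary Job. A fuller version might look like the sketch below, which assumes the k6 script is mounted from a ConfigMap; the ConfigMap name and script path are hypothetical, and the Prometheus output configuration is left out.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: rollout-smoke
  annotations:
    argocd.argoproj.io/hook: PostSync
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: k6
          image: grafana/k6:latest       # pin a version in practice
          # k6 exits non-zero when a threshold in the script fails,
          # which fails the Job and therefore the sync
          command: ["k6", "run", "/scripts/smoke.js"]
          volumeMounts:
            - name: scripts
              mountPath: /scripts
      volumes:
        - name: scripts
          configMap:
            name: rollout-smoke-script   # hypothetical ConfigMap containing smoke.js
```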

If the test fails, Argo CD marks the sync operation as Failed. I wired a Prometheus alert that pages the on-call engineer and automatically kicks off a rollback by running argocd app rollback against the most recent healthy entry in the application’s history; the command takes a history ID rather than a Git revision, and it is refused while automated sync is enabled. The rollback is not itself an Argo CD hook: it runs from automation tied to the alert, outside the sync.

Cleanup: don’t leave hooks lying around

Always set hook-delete-policy. I use:

  • HookSucceeded for checks I don’t need after a successful rollout.
  • BeforeHookCreation on PreSync checks so old jobs are deleted before a new sync starts.

Without it, finished Jobs linger: Kubernetes only cleans them up when ttlSecondsAfterFinished is set, so your namespace fills up with hook history that makes troubleshooting harder.
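
As a rough illustration, the two policies map onto annotations like these. The fragments are not complete manifests, and the ttlSecondsAfterFinished backstop at the end is an optional addition rather than part of the setup described here.

```yaml
# PreSync check: keep the last run for debugging, replace it on the next sync
metadata:
  annotations:
    argocd.argoproj.io/hook: PreSync
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
---
# PostSync smoke test: delete on success, keep on failure for inspection
metadata:
  annotations:
    argocd.argoproj.io/hook: PostSync
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
---
# Optional Kubernetes-level backstop, set on the Job spec itself
spec:
  ttlSecondsAfterFinished: 86400   # finished Job is removed after 24 hours
```

If you do add a TTL, keep it generous so the Job is not removed before Argo CD has observed its result.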

Operational tips I wish I’d known earlier

  1. Name hooks clearly. pre-db-guardrail beats job-12345. When alarms fire at 2 AM, recognisable names matter.
  2. Log from hooks to the same place as your apps. Central logs help me correlate migrations with deployment issues.
  3. Use retries wisely. For flaky checks like network reachability, a backoffLimit: 2 can save you from paging just because the DB restart overlapped your sync.
  4. Preview with the Argo CD CLI. argocd app sync app-name --dry-run previews the sync without changing the cluster, which surfaces problems before you touch production.
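
To make tips 1 and 3 concrete, here is a sketch of a clearly named reachability check that tolerates a brief database restart. It assumes a PostgreSQL database and uses pg_isready from the official postgres image; the Job name, host, and deadline are hypothetical.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pre-db-reachable                 # hypothetical, but descriptive at 2 AM
  annotations:
    argocd.argoproj.io/hook: PreSync
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
spec:
  backoffLimit: 2                        # up to two retries after the first failure
  activeDeadlineSeconds: 120             # assumed cap so retries cannot stall the sync
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: ping
          image: postgres:16             # assumes PostgreSQL; only pg_isready is used
          # -t 5 gives up on each attempt after five seconds
          command: ["pg_isready", "-h", "db.internal", "-p", "5432", "-t", "5"]  # hypothetical host
```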

Argo CD hooks turn GitOps into a full release pipeline without bolting on a separate orchestrator. Start by wrapping risky operations such as database migrations, feature flag flips, or smoke tests and you’ll quickly wonder how you ever shipped without them.