Hooks are one of the most underused features in Argo CD. They let you run Kubernetes jobs at specific stages of a deployment, turning GitOps from “apply YAML” into a full release workflow. After a few production scares last year, mainly schema migrations and feature flags racing each other, I doubled down on hooks. Here’s what actually delivered value.
PreSync: stop bad releases early
PreSync hooks run before Argo CD applies your manifests. I standardised on a Job named db-guardrail that checks both database connectivity and schema drift:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-guardrail
  annotations:
    argocd.argoproj.io/hook: PreSync
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: eva
          image: ghcr.io/almogxsh/eva:latest
          command: ["eva", "migrate", "--check"]
```
If this job fails, maybe because the migration wasn’t committed or the database is unreachable, the sync aborts with a clear error. That’s cheaper than discovering the issue after I’ve already taken pods offline.
Bonus: enforce feature flag readiness
Another PreSync job calls my flag service to ensure the new flag state matches what the release expects. It prevents the common “feature enabled in production, but rollout stuck in staging” problem.
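A minimal sketch of that flag check, assuming a hypothetical internal flag endpoint and flag name (the real service URL, flag, and image will differ in your setup):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pre-flag-guardrail
  annotations:
    argocd.argoproj.io/hook: PreSync
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: check
          image: curlimages/curl:8.7.1
          command: ["sh", "-c"]
          args:
            - |
              # Hypothetical endpoint and flag name: fail the sync unless the
              # flag service reports the state this release expects.
              state=$(curl -fsS http://flags.internal/api/flags/new-checkout)
              [ "$state" = "enabled" ] || { echo "flag not ready: $state"; exit 1; }
```

Because it runs at PreSync with backoffLimit: 0, a mismatched flag aborts the sync before any manifests change.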
Sync and Sync-Wave: orchestrate migrations safely
The Sync stage is where Argo CD applies manifests. By introducing sync waves, I can enforce an order:
```yaml
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "0"  # migrations
---
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "1"  # deployment
---
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "2"  # canary service
```
Wave 0 runs a migration job with argocd.argoproj.io/hook: Sync. Only after it finishes successfully do I roll out the Deployment (wave 1), and finally update the Service or Gateway (wave 2). This sequence cut failed rollouts by ~70% because database changes are always in place before the application deploys.
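Combining the two annotations, the wave-0 migration Job might look like this (a sketch; the image and the migrate subcommand are placeholders for your own migration tooling):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
  annotations:
    argocd.argoproj.io/hook: Sync
    argocd.argoproj.io/sync-wave: "0"  # finishes before the wave-1 Deployment applies
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: ghcr.io/almogxsh/eva:latest  # placeholder migration image
          command: ["eva", "migrate"]         # placeholder command
```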
PostSync: verify and self-heal
PostSync hooks are great for automated smoke tests. Mine run K6 tests against the public endpoint and emit metrics to Prometheus:
```yaml
metadata:
  name: rollout-smoke
  annotations:
    argocd.argoproj.io/hook: PostSync
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
```
If the test fails, Argo CD marks the sync as degraded. I wired a Prometheus alert that pages the on-call engineer and automatically kicks off a rollback by running argocd app rollback with the most recent healthy revision. That rollback is another hook, triggered through an ApplicationSet generator tied to the alert.
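For reference, the full rollout-smoke Job might look like this, assuming the k6 script is baked into the image; the script path and endpoint are illustrative:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: rollout-smoke
  annotations:
    argocd.argoproj.io/hook: PostSync
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: k6
          image: grafana/k6:latest
          command: ["k6", "run", "/scripts/smoke.js"]  # illustrative script path
          env:
            - name: BASE_URL                # the public endpoint under test
              value: "https://example.com"  # placeholder
```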
Cleanup: don’t leave hooks lying around
Always set hook-delete-policy. I use:
- HookSucceeded for checks I don’t need after a successful rollout.
- BeforeHookCreation on PreSync checks so old jobs are deleted before a new sync starts.
Without it, Kubernetes keeps finished jobs forever and your namespace fills up with hook history that makes troubleshooting harder.
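In practice that is one extra annotation per hook, for example:

```yaml
metadata:
  annotations:
    argocd.argoproj.io/hook: PreSync
    # Old guardrail Jobs are removed before each new sync creates a fresh one.
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
```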
Operational tips I wish I knew earlier
- Name hooks clearly. pre-db-guardrail beats job-12345. When alarms fire at 2 AM, recognisable names matter.
- Log from hooks to the same place as your apps. Central logs help me correlate migrations with deployment issues.
- Use retries wisely. For flaky checks like network reachability, a backoffLimit: 2 can save you from paging just because a DB restart overlapped your sync.
- Test locally with the Argo CD CLI. argocd app sync app-name --dry-run surfaces ordering issues before you touch production.
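The naming and retry tips together amount to a couple of fields on the hook Job; a sketch, with a hypothetical reachability check:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pre-db-reachability  # descriptive name that means something at 2 AM
  annotations:
    argocd.argoproj.io/hook: PreSync
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
spec:
  backoffLimit: 2  # retry a flaky check twice before failing the sync
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: check
          image: busybox:1.36
          # Hypothetical host and port; probes TCP reachability only.
          command: ["sh", "-c", "nc -z db.internal 5432"]
```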
Argo CD hooks turn GitOps into a full release pipeline without bolting on a separate orchestrator. Start by wrapping your riskiest operations, such as database migrations, feature flag flips, or smoke tests, and you’ll quickly wonder how you ever shipped without them.