Generations and rollback

A generation is a single, numbered snapshot of a deployed stack's state. Every punix service deploy writes a new generation manifest and then flips a single symlink in one atomic operation. Rollback is "atomically flip back to the previous generation." Constant time, no re-evaluation, no rebuild, no fetching.

This page shows what that looks like in practice, and which guarantees you can rely on.

The on-disk layout

For a stack named MyStack, deployed twice:

<deployments_root>/MyStack/
├── gen-001.json         ← first deploy's manifest
├── gen-002.json         ← second deploy's manifest
└── current → gen-002.json    ← the symlink, the one piece of mutable state

That's it. The current symlink is the only piece of observable state that changes during a deploy. Everything else — manifests, new config files, store paths — is staged first and only becomes "live" when the symlink flips.
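Because current is the single piece of mutable state, the live generation can be found by reading one symlink. A minimal sketch, assuming the symlink target is a manifest named gen-NNN.json as in the layout above; the function name current_generation is hypothetical and not part of punix:

```python
import os
import re
from pathlib import Path

# Manifest file names as shown in the layout above, e.g. "gen-002.json".
GEN_RE = re.compile(r"^gen-(\d+)\.json$")


def current_generation(stack_dir: Path) -> int:
    """Return the generation number that `current` points at (hypothetical helper)."""
    target = os.readlink(stack_dir / "current")
    # basename() copes with both relative and absolute symlink targets.
    match = GEN_RE.match(os.path.basename(target))
    if match is None:
        raise ValueError(f"unexpected symlink target: {target!r}")
    return int(match.group(1))
```

For the MyStack layout above, this sketch would return 2.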

The generation manifest

gen-NNN.json is what makes rollback possible without re-evaluating the original PCL. Each manifest records:

{
  "generation":   2,
  "stack_name":   "MyStack",
  "deployed_at":  "2026-05-21T14:23:34+00:00",
  "source_hash":  "<sha256 of the PCL source bytes>",
  "backend":      "systemd",
  "store_paths": [
    "/store/abc...-caddy-2.9",
    "/store/def...-forgejo-9.0"
  ],
  "services": [
    {"name": "caddy", "package": "Caddy", "package_path": "/store/...",
     "binary": "caddy", "exec_path": "/store/.../bin/caddy",
     "args": [], "depends_on": [],
     "environment": [{"key": "LOG_LEVEL", "kind": "literal", "value": "info"}]}
  ],
  "config_files": [
    {"path": "/etc/caddy/Caddyfile", "sha256": "..."},
    {"path": "/etc/systemd/system/caddy.service", "sha256": "..."}
  ]
}

The fields that matter for operators:

  • store_paths — exactly the closure that needs to exist on disk for this generation to run. The garbage collector reads this list across all live generations and unions them into the keep-set: nothing in the union is ever collected.
  • source_hash — the SHA-256 of the PCL bytes the deploy evaluated. Lets a manifest answer "what was the source at this point?" without your having to keep the source file around.
  • config_files[].sha256 — the hash of what got written to each config-file path. If /etc/caddy/Caddyfile was mutated out-of-band between deploys, you can detect the drift by comparing the disk file's hash to the manifest's record.
  • services[*].environment holds {key, kind, …} records. kind: "literal" carries a literal value; kind: "from_env" carries a name (the env-var name, NOT its resolved value). Secret values never appear in the manifest; see Secrets.

The manifest is a complete record of what the deploy did, in JSON that's grep-able and jq-able.
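The keep-set described under store_paths is a plain set union over manifests. The sketch below illustrates that idea only; it is not the garbage collector's actual code. It assumes that every gen-*.json still on disk counts as a live generation and that stacks sit directly under the deployments root; keep_set and all_manifests are hypothetical names.

```python
import json
from pathlib import Path


def all_manifests(deployments_root):
    """Every gen-*.json under every stack directory (assumed layout)."""
    return sorted(Path(deployments_root).glob("*/gen-*.json"))


def keep_set(manifest_paths):
    """Union the store_paths of the given generation manifests."""
    keep = set()
    for path in manifest_paths:
        manifest = json.loads(Path(path).read_text())
        keep.update(manifest.get("store_paths", []))
    return keep
```

Any store path that is absent from keep_set(all_manifests(root)) is referenced by no manifest, which is what would make it a candidate for collection under these assumptions.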

Why rollback is constant-time

Rollback reads the previous generation's gen-NNN.json, walks its store_paths to verify they still exist, and flips the symlink. It does not re-parse the PCL source, re-resolve any module references, re-run any recipe, or re-fetch any source.

In particular: rollback works even if you delete the PCL source. Everything rollback needs lives in the manifest, which lives next to the generation.

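The flip itself is one syscall. The only other work is one existence check per entry in store_paths; nothing is rebuilt, copied, or fetched, so rollback latency does not grow with the number of bytes in your dependency closure.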
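The steps above (read the previous manifest, verify its store_paths, flip the symlink) can be sketched in a few lines. This is an illustration under stated assumptions, not punix's implementation: it assumes the previous generation is the current number minus one, that current is a relative symlink, and it uses a hypothetical RollbackError in place of the real error type.

```python
import json
import os
import re
from pathlib import Path


class RollbackError(Exception):
    """Hypothetical error type for this sketch."""


def rollback(stack_dir: Path) -> int:
    """Flip `current` back one generation; return the new live number."""
    target = os.path.basename(os.readlink(stack_dir / "current"))
    match = re.fullmatch(r"gen-(\d+)\.json", target)
    if match is None:
        raise RollbackError(f"unexpected current target: {target!r}")
    current = int(match.group(1))
    width = len(match.group(1))  # preserve zero padding, e.g. 001

    # Assumption: "previous" means current - 1.
    prev_name = f"gen-{current - 1:0{width}d}.json"
    prev_path = stack_dir / prev_name
    if current <= 1 or not prev_path.exists():
        raise RollbackError(f"no previous generation (current = {current})")

    # Verify every pinned store path BEFORE touching the symlink.
    manifest = json.loads(prev_path.read_text())
    missing = [p for p in manifest.get("store_paths", []) if not os.path.exists(p)]
    if missing:
        raise RollbackError("pinned store path missing: " + ", ".join(missing))

    # Build the new symlink under a temporary name, then rename it over
    # `current`; the rename is the single atomic step.
    tmp = stack_dir / "current.tmp"
    if os.path.lexists(tmp):
        os.unlink(tmp)
    os.symlink(prev_name, tmp)
    os.replace(tmp, stack_dir / "current")
    return current - 1
```

Note that nothing in the sketch reads PCL source: the manifest and the store paths it names are the only inputs.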

What current points at, and why it matters

current is a symlink. It points at exactly one gen-NNN.json, and that file is the authoritative record of which deploy is running.

When a deploy succeeds, current flips to point at the new generation. When you run punix service rollback, current flips back to point at the previous one.

A deploy that crashes mid-flight (network, kill, power loss) does NOT leave current in a half-flipped state. The symlink update is one atomic rename syscall — either it happened or it didn't. If it didn't happen, you're still on the previous generation.

The atomicity guarantee in plain language

There is exactly one moment, during a deploy, when "what's running" can change: the rename of the current symlink. Before that rename: previous generation. After that rename: new generation. There is no "during."
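A common way to get this "no during" property on POSIX systems is to create the new symlink under a temporary name and rename it over the old one. The sketch below shows that general technique; whether punix does exactly this is an assumption, and flip_current and the temporary file name are hypothetical.

```python
import os
from pathlib import Path


def flip_current(stack_dir: Path, manifest_name: str) -> None:
    """Atomically point `current` at manifest_name (hypothetical helper)."""
    tmp = stack_dir / f".current.{os.getpid()}.tmp"
    try:
        os.unlink(tmp)  # clear a leftover from an earlier crashed attempt
    except FileNotFoundError:
        pass

    # Step 1: stage the new symlink. A crash here leaves `current` untouched.
    os.symlink(manifest_name, tmp)

    # Step 2: rename over `current`. On POSIX, rename() replaces the
    # destination atomically, so readers see the old or the new target.
    os.replace(tmp, stack_dir / "current")

    # Optional durability: fsync the directory so the rename survives
    # power loss (works on Linux; behaviour on other platforms may differ).
    dir_fd = os.open(stack_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

If the process dies between the two steps, the only residue is the temporary symlink, which nothing references.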

Concretely, a deploy can fail at any of these points and your services keep running as before:

  • Network error during the closure push.
  • Out-of-disk on the target.
  • The deploy process being SIGKILLed.
  • A config-file write failing (for example, /etc/foo/bar.conf couldn't be written).
  • The deploy machine losing power.

In each case, current is still the old generation. The new gen-NNN.json may exist on disk; some config files may have been written to staging paths. None of it is referenced. The incomplete deploy is invisible to running services.

Rollback's two failure modes

"No previous generation"

error: [E12] stack 'MyStack': no previous generation
(current = 1); nothing to roll back to.

current points at gen-001.json, and there is no gen-000.json. Rollback in v0.6 is one-step (going back further is a planned follow-up); the workaround for now is to re-deploy from an earlier PCL.

"Pinned store path missing"

error: [E12] stack 'MyStack' gen-001: pinned store path missing:
  /store/abc...-curl-8.5.0

A path the previous generation depends on has been removed, typically because someone force-cleaned the store with the wrong root set. Rollback surfaces this before flipping current, so you don't end up pointing at a generation whose binaries are gone.

The recovery is to rebuild the missing path. Because the path is content-addressed, the rebuild produces the same hash — so the manifest's pin still resolves after the rebuild.

See also: Deploy: rollback (operator triage and recovery).

Forward roll: deliberate by design

There is no punix service set-current STACK N command in v0.6. To roll forward after a rollback, re-run punix service deploy with the desired PCL — it creates a new generation, and the historical generations stay on disk.

This is intentional. Rollback is the dangerous direction (you might be downgrading a security fix); we make it explicit. A future set-current command will accept "go to any preserved gen" for power-user scenarios.

Generation FIFO trim

Manifest.trim_generations(stack, keep=10) removes the oldest generations once the count exceeds keep; the default keeps the ten most recent. It never removes the currently live generation. For example, if current → gen-001.json and ten newer generations are on disk (eleven in total, one over the limit), gen-001 stays and gen-002 is the trim target.

The keep count is per-stack; larger services that deploy more often can override it.
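The trim rule (oldest first, never the live generation) can be sketched as follows. This is an illustration rather than the real Manifest.trim_generations: it takes a directory path instead of a stack name, and it removes only the manifest files, which is an assumption about what "removing a generation" involves.

```python
import os
import re
from pathlib import Path

GEN_RE = re.compile(r"gen-(\d+)\.json")


def trim_generations(stack_dir: Path, keep: int = 10) -> list:
    """Delete the oldest manifests beyond `keep`, skipping the live one."""
    live = os.path.basename(os.readlink(stack_dir / "current"))

    # Collect (number, name) pairs and sort numerically, oldest first.
    gens = []
    for path in stack_dir.iterdir():
        match = GEN_RE.fullmatch(path.name)
        if match:
            gens.append((int(match.group(1)), path.name))
    gens.sort()

    removed = []
    excess = len(gens) - keep
    for _, name in gens:
        if excess <= 0:
            break
        if name == live:
            continue  # the live generation is never a trim target
        (stack_dir / name).unlink()
        removed.append(name)
        excess -= 1
    return removed
```

With eleven manifests on disk, keep=10, and current pointing at gen-001.json, this sketch removes gen-002.json and leaves gen-001.json in place.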

What rollback does NOT do

  • Re-write config files. Rollback flips the symlink; the config files written by the deploy that produced the target generation are presumed still on disk. If you hand-edited /etc/foo/bar.conf between deploys, rollback does not revert that edit. (Detecting the drift is possible via config_files[].sha256; an opt-in "rollback also rewrites configs" is on the roadmap.)
  • Restart services. v0.6's deploy/rollback flow flips current and stops. The operator still runs systemctl daemon-reload && systemctl restart … manually (or the equivalent for the chosen backend). Lifecycle wire-up is Stage 8 work.
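Since rollback does not rewrite config files, it can be worth checking for drift before relying on a rolled-back generation. A minimal sketch of the config_files[].sha256 comparison, assuming the recorded value is the lowercase hex SHA-256 of the file's bytes; sha256_of and drifted_config_files are hypothetical helpers, not punix commands.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path) -> str:
    """Hex SHA-256 of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def drifted_config_files(manifest_path):
    """Return (path, recorded, actual) for each config file that differs.

    `actual` is None when the file no longer exists on disk.
    """
    manifest = json.loads(Path(manifest_path).read_text())
    drift = []
    for entry in manifest.get("config_files", []):
        try:
            actual = sha256_of(entry["path"])
        except FileNotFoundError:
            actual = None
        if actual != entry["sha256"]:
            drift.append((entry["path"], entry["sha256"], actual))
    return drift
```

An empty result means every config file on disk still matches what the generation's deploy wrote.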