Secrets management — operator guide

This is the day-to-day reference for working with secrets in p24-infra. For the design rationale see docs/improvements/03-secrets-management.md.


What is sops + age?

sops ("secrets operations") is a CLI, originally built at Mozilla and now maintained as a CNCF project, that encrypts and decrypts structured files (yaml, json, env), leaving the keys readable and encrypting only the values. We use it with age, a modern, minimalist alternative to PGP. Together they give us:

  • Plaintext keys, encrypted values — diffs in git remain readable (you can see which secret changed, not the value).
  • Multi-recipient encryption — every file is encrypted to N age public keys (the dev’s machine, each VPS, and the GH Actions runner). Any single recipient can decrypt.
  • Auditability — every rotation is a git commit. git blame shows when and why each secret last moved.
  • No SaaS, no per-seat fees, no vendor lock-in.
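
Because keys stay in plaintext, the names in an encrypted file can be listed without any age key. The following is a minimal sketch, assuming a flat YAML file with one top-level scalar per line and the sops metadata under a top-level sops: key. The function name list_secret_names is hypothetical, not something shipped in this repo.

```shell
# list_secret_names FILE
# Print the top-level key names of a sops-encrypted YAML file without decrypting it.
# Assumption: flat YAML, one "name: ENC[...]" pair per line, plus the "sops:" metadata block.
list_secret_names() {
  # Keep unindented "key:" lines, cut the key out, then drop the metadata key itself.
  grep -E '^[A-Za-z_][A-Za-z0-9_]*:' "$1" | cut -d: -f1 | grep -vx 'sops'
}
```

Usage: list_secret_names secrets/shared.sops.yaml. Nested structures would need a real YAML parser.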

The three tiers

| Tier | Storage | Examples | Rotation |
| --- | --- | --- | --- |
| 1: OIDC | nothing stored anywhere; short-lived JWT issued per workflow run | Vercel deploys, Cloudflare DNS edits (future), Wasabi STS (future) | never (per-run) |
| 2: Auto-rotating | refresh token only | Claude Code OAuth, GitHub App installation tokens | only on account compromise / quarterly hygiene |
| 3: sops-encrypted static | secrets/*.sops.yaml in git | SMTP password, Grafana admin pw, Anthropic API key | manual, but one place, one PR, audited |

Rule: move every secret as high up this hierarchy as the upstream supports.


Repository layout

.sops.yaml                          recipient rules — which age pubkeys decrypt which file
secrets/
├── shared.sops.yaml                used on multiple targets (SMTP, Wasabi, Supabase, …)
├── vps-i1.sops.yaml                IONOS-only (Grafana admin, Cloudflare token)
├── vps-h1.sops.yaml                Hostinger-only (n8n encryption key, WAHA api key)
└── github-actions.sops.yaml        sync'd into GH Secrets for workflow consumption

scripts/git-hooks/pre-commit        rejects plaintext-secret commits
.github/workflows/secrets-sync.yml  CI: on push to main touching secrets/, decrypts and ships
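
The actual hook in scripts/git-hooks/pre-commit is not reproduced here. As an illustration of the kind of check such a hook can make, here is a sketch that assumes a file counts as encrypted when it carries a top-level sops: metadata key and at least one ENC[...] value. Both function names are hypothetical, and the heuristic is an assumption rather than the repo's real logic.

```shell
# looks_sops_encrypted
# Read a file on stdin; succeed only if it has a top-level "sops:" key
# and at least one ENC[...] value. Heuristic only.
looks_sops_encrypted() {
  awk '/^sops:/ { meta = 1 } /ENC\[/ { enc = 1 } END { exit !(meta && enc) }'
}

# check_staged_secrets
# Inspect the staged version of every added/changed secrets/*.sops.yaml file
# and fail if any of them does not look encrypted.
# Assumption: secret file names contain no whitespace.
check_staged_secrets() {
  rc=0
  for f in $(git diff --cached --name-only --diff-filter=ACM -- 'secrets/*.sops.yaml'); do
    if ! git show ":$f" | looks_sops_encrypted; then
      echo "refusing to commit plaintext secret file: $f" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

Calling check_staged_secrets from a pre-commit hook and exiting with its status would block the commit.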

Daily tasks

View a secret

sops -d secrets/shared.sops.yaml

Decrypts and prints to stdout. Requires your personal age private key at ~/.age/personal.key (Windows: C:\Users\konar\.age\personal.key). That path is not where sops looks by default (on Linux it checks ~/.config/sops/age/keys.txt), so set SOPS_AGE_KEY_FILE=~/.age/personal.key in your shell.
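
Printing the whole file is rarely needed. The sketch below points sops at the key file and prints a single value with sops's --extract option. The wrapper name show_secret is hypothetical, and it assumes the value is a top-level key in the file.

```shell
# Tell sops where the age private key lives unless the caller already set it.
export SOPS_AGE_KEY_FILE="${SOPS_AGE_KEY_FILE:-$HOME/.age/personal.key}"

# show_secret FILE KEY
# Decrypt FILE and print only the top-level value stored under KEY.
show_secret() {
  sops -d --extract "[\"$2\"]" "$1"
}
```

Usage: show_secret secrets/vps-i1.sops.yaml grafana_admin_password (the key name is taken from the rotation example in this guide and may differ in the real file).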

Add or change a secret

sops edit secrets/shared.sops.yaml

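Opens the decrypted content in $EDITOR (the edit subcommand needs a recent sops release; older ones use sops <file>). When you save and exit, sops re-encrypts the values before writing the file back. Then: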

git add secrets/shared.sops.yaml
git commit -m "feat(secrets): add pdf_service_api_key"
git push

The secrets-sync workflow picks it up on push to main and ships to the relevant VPS(es) within ~2 minutes.

Add a new recipient (developer or VPS)

  1. Generate keypair on target host:

    age-keygen -o ~/.age/<name>.key       # Linux/macOS
    # Windows PowerShell:
    age-keygen -o $env:USERPROFILE\.age\<name>.key

    The command prints Public key: age1.... Save the private key in 1Password under “p24-infra age — ”. Note the public key.

  2. Add public key to .sops.yaml in every block that should be decryptable by this recipient.

  3. Re-encrypt existing files for the new recipient:

    sops updatekeys secrets/shared.sops.yaml
    sops updatekeys secrets/vps-i1.sops.yaml
    # …etc, only the files that include this recipient in the rules

  4. Commit + push. After pulling, the new recipient can run sops -d locally.
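
Step 3 can be scripted when several files are affected. This is a sketch under the assumption that running updatekeys on a file whose recipients are already current is harmless, so looping over every file is acceptable. The function name rekey_all is hypothetical.

```shell
# rekey_all [DIR]
# Run "sops updatekeys" on every *.sops.yaml file in DIR (default: secrets).
# updatekeys re-syncs each file's recipients with the matching rule in .sops.yaml;
# -y answers the confirmation prompt automatically.
rekey_all() {
  dir="${1:-secrets}"
  for f in "$dir"/*.sops.yaml; do
    [ -e "$f" ] || continue            # the glob matched nothing
    sops updatekeys -y "$f" || return 1
  done
}
```

Review the resulting git diff before committing: only the sops metadata should change, not the encrypted values.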

Rotate a secret

sops edit secrets/vps-i1.sops.yaml     # change the value, save
git commit -am "fix(secrets): rotate grafana_admin_password"
git push

Then append a row to docs/secrets-rotation-log.md (date, secret, reason, rotator). The secrets-sync workflow auto-deploys.


Emergency response — “I think a secret leaked”

Treat any secret that touched an LLM session, screen-share, public chat, public commit, or paste-bin as compromised — rotate within 24 h, no exceptions.

Step 1 — assess blast radius

# Where is this secret used? (file:line matches, skipping .git)
grep -rn --exclude-dir=.git '<SECRET_NAME>' .
# Look at git history for accidental leaks
git log -p --all -S '<first-8-chars-of-value>'

Step 2 — rotate at the source

  • API token? Revoke in provider dashboard → generate new → paste into sops edit secrets/<file>.sops.yaml.
  • SSH key? Generate new keypair, replace authorized_keys on every host, update secrets/… with new private key.
  • Password? Change in upstream, update sops, redeploy.

Step 3 — verify deploy

gh run watch                                       # secrets-sync workflow
ssh root@<vps> "grep -c '^<KEY_NAME>=' /opt/p24-infra/monitoring/.env"   # must be 1
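
The count check can be wrapped in a small helper to run on the VPS. This is a sketch with hypothetical function names. Note that a count of 1 only shows the key is present exactly once, not that the new value arrived, so compare the file's modification time or test a login as well.

```shell
# env_key_count FILE KEY
# Print how many lines in FILE define KEY, i.e. start with "KEY=".
env_key_count() {
  grep -c "^$2=" "$1"
}

# env_has_key_once FILE KEY
# Succeed only if KEY is defined exactly once in FILE.
env_has_key_once() {
  [ "$(env_key_count "$1" "$2")" -eq 1 ]
}
```

Usage: env_has_key_once /opt/p24-infra/monitoring/.env GRAFANA_ADMIN_PASSWORD && echo ok.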

Step 4 — log

Append to docs/secrets-rotation-log.md:

| 2026-05-12 | GRAFANA_ADMIN_PASSWORD | leaked in Claude session | radieu | yes |
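
To keep rows consistently formatted, the append can be scripted. The helper below is a sketch with a hypothetical name, append_log_row. It takes the fields as given and makes no assumption about what each column means.

```shell
# append_log_row FILE FIELD...
# Append one pipe-delimited table row built from the given fields.
append_log_row() {
  file="$1"
  shift
  row="|"
  for field in "$@"; do
    row="$row $field |"
  done
  printf '%s\n' "$row" >> "$file"
}
```

Usage: append_log_row docs/secrets-rotation-log.md 2026-05-12 GRAFANA_ADMIN_PASSWORD "leaked in Claude session" radieu yes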

Step 5 — if the secret was committed in plaintext at any point

Even after rotation, assume the old value is permanently public. Rewriting history and force-pushing (git push --force) removes the value from the branch, but that only stops naïve scrapers: caches, forks, existing clones, and CI logs may still hold it. The only safe response is to rotate. History cleanup is optional and best done with git filter-repo; coordinate with the team before force-pushing main.


How secrets-sync.yml works

On push to main touching secrets/** or .sops.yaml:

  1. sync-vps-i1 — checks out repo, installs sops, decrypts shared + vps-i1 with AGE_KEY_GHA, converts yaml → KEY=value lines (uppercased), scps as claude-admin to /tmp/monitoring.env.new on vps-i1, then sudo install to /opt/p24-infra/monitoring/.env (atomic), then docker compose up -d to roll services.
  2. sync-vps-h1 — same pattern but to Hostinger. Currently a TODO until claude-admin user exists there (or we use VPS_ROOT_SSH_KEY).
  3. sync-github-secrets — decrypts github-actions.sops.yaml, iterates, calls gh secret set on radieu/p24-infra for each entry.
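
The yaml → KEY=value conversion in step 1 can be sketched as follows. This is not the workflow's actual code. It assumes the decrypted YAML is flat (one scalar per line), strips one pair of surrounding double quotes, and skips anything nested. The function name yaml_to_env is hypothetical.

```shell
# yaml_to_env
# Read flat "key: value" YAML on stdin and write KEY=value lines on stdout.
# Assumption: one scalar per line; nested maps, lists and multi-line values are skipped or unsupported.
yaml_to_env() {
  awk '
    /^[A-Za-z_][A-Za-z0-9_]*:/ {
      key = $0; sub(/:.*/, "", key)             # text before the first colon
      val = $0; sub(/^[^:]*:[ \t]*/, "", val)   # text after the first colon
      if (val == "") next                       # map header or empty value: skip
      if (val ~ /^".*"$/)                       # strip one pair of double quotes
        val = substr(val, 2, length(val) - 2)
      print toupper(key) "=" val
    }
  '
}
```

Usage: sops -d secrets/shared.sops.yaml | yaml_to_env > monitoring.env.new. Values containing spaces or shell metacharacters may need quoting, depending on what reads the .env file.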

What lives where

  • Real age private keys:

    • ~/.age/personal.key on the dev machine, backed up to 1Password.
    • /root/.age/secrets.key on each VPS (mode 0600).
    • GitHub Actions: the env var AGE_KEY, populated from the repository secret AGE_KEY_GHA at runtime. It is written to disk only as ~/.age/keys.txt inside the ephemeral runner.
  • Public keys: in .sops.yaml only — never sensitive.

  • Derived .env files on VPSes: regenerated by CI, not edited by hand. Treat them as cache.
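
Writing the key inside the runner can be sketched like this, assuming the private key text is available in the AGE_KEY environment variable as described above. The function name write_age_key is hypothetical, and the real workflow step may differ.

```shell
# write_age_key DIR
# Write the age private key held in $AGE_KEY to DIR/keys.txt with owner-only
# permissions, then print the resulting path.
write_age_key() {
  dir="$1"
  (
    umask 077                                   # new files 0600, new dirs 0700
    mkdir -p "$dir" &&
    printf '%s\n' "$AGE_KEY" > "$dir/keys.txt" &&
    chmod 600 "$dir/keys.txt"                   # also covers a pre-existing file
  ) || return 1
  printf '%s\n' "$dir/keys.txt"
}
```

Usage: export SOPS_AGE_KEY_FILE="$(write_age_key "$HOME/.age")" before calling sops -d.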


See also