Spec 13 — Hostinger runbook

Purpose

docs/runbook.md is solid but covers only IONOS. Hostinger has been running n8n + WAHA + Claude agent + GH runner since 2026-05-08 and is now production-critical for automation. No runbook → 3 AM on-call has to reconstruct from CLAUDE.md + git history.


Rulebook

  1. Runbook is a living doc. Every new alert added to Prometheus rules must have a corresponding recovery section in the relevant runbook within the same PR.
  2. One runbook per VPS. docs/runbook.md for IONOS; new docs/hostinger-runbook.md for Hostinger. (Consolidating later if we get a third VPS — pattern likely is one per host class.)
  3. Always include a copy-paste recovery command. Not just “check logs”; the exact docker compose logs ... invocation.

Implementation plan

Create docs/hostinger-runbook.md with sections mirroring IONOS:

  • ServerDown
  • ContainerCrashLooping (per-container: n8n, waha, traefik, claude-proxy)
  • LowDisk
  • HighMemory
  • HighCPU
  • GH runner offline (hstgr-srv1072950)
  • WAHAContainerDown (already alerted, no runbook)
  • N8N workflow execution failure spike
  • Disaster recovery — restore from backup (spec 01)

Each section: symptom + 3–6 line numbered recovery procedure.

Acceptance criteria

  • docs/hostinger-runbook.md exists with all 9 sections above
  • Each existing alert rule (infrastructure.yml, whatsapp_gateway group) has a corresponding runbook section
  • Cross-link from docs/runbook.md at top: “For Hostinger VPS, see …”

Cost impact

0 €.

Back-out plan

Delete the file.

Risks / open questions

None.