Documentation Audit — p24-infra

Prepared: 2026-06-14
Scope: All documentation under docs/ and docs/servers/, cross-referenced against the infrastructure state described in CLAUDE.md
Method: Static analysis — no live SSH or API calls


1. Executive Summary

Key findings:

  • docs/infrastructure-overview.md is severely stale. It describes an infrastructure snapshot from approximately May 2026, missing: bms-1 through bms-4 (all four OVH bare metal servers), the MongoDB replica set, the audit-engine, the PDF service / MCP server, the n8n → bms-4 migration, and the Wasabi key consolidation.
  • Two separate n8n workbooks exist (docs/n8n-operations.md and docs/n8n-cloud-operations.md). The cloud workbook is current, and the self-hosted workbook is largely correct. The two do not duplicate or contradict each other, but the self-hosted workbook’s compose architecture diagram still shows n8n on vps-h1 and has not been updated to reflect the pending migration to bms-4.
  • The BMS server workbooks (docs/servers/) are fresh (all inventoried 2026-06-14) and largely accurate (see §3.4 and §3.5 for the stale entries), but they follow a custom format rather than the standard workbook-template.md. The bms-2 and bms-3 workbooks lack Backup, Restore, Deployment, and Password Rotation sections entirely.
  • docs/elements.md is stale (last updated 2026-05-13). It lists only vps-i1 and vps-h1 under Servers — none of the four OVH bare metal servers appear. No bms-4 entry, no MongoDB replica set components. Traccar on bms-3 (staging) is missing.
  • The Traefik workbook references vps-h1 only, but Traefik is being migrated to bms-4. The workbook must be updated after migration.
  • Wasabi S3 is documented inconsistently across docs. The architecture diagram in docs/monitoring-stack-operations.md shows s3://ecotrans-monitoring (eu-central-1), which matches CLAUDE.md: the active bucket for the monitoring stack / Thanos is still ecotrans-monitoring (eu-central-1), with p24-infra (eu-central-2) used for service backups. The monitoring-stack doc is therefore correct. However, cloud-services-operations.md shows a Wasabi architecture with two IAM users and two regions, while CLAUDE.md describes a single p24-infra IAM user with key consolidation completed 2026-06-14. The cloud-services-operations.md Wasabi section was written before the consolidation and is now partially stale.
  • docs/p4-ovh-bms-1-ns367522-operations.md at the root of docs/ is a duplicate of docs/servers/p4-ovh-bms-1-ns367522-operations.md. Two identical files exist at different paths.
  • Several services have no dedicated workbook, including MongoDB rs0, credential-exporter, Uptime Kuma, vercel-exporter, and the GitHub Actions runners. The audit-engine workbook covers usage only, not vps-h1-specific deployment. Workbooks for openclaw and the claude-proxy exist but were not audited (see §4 and §6).
  • The docs/improvements/ specs (01–15) are substantially implemented or superseded: Loki (#02), Blackbox (#05), backups (#01), and Ansible IaC (#04) were completed. Several specs need a “completed / archived” status.
  • docs/standards/project-standards.md references services/compliance-matrix.yml as source of truth in sections 5, 6, and 7, but CLAUDE.md explicitly states that compliance-matrix.yml is deprecated and Supabase dev_r_services is the canonical source. This is a direct contradiction within the standards documents themselves.

2. Documentation Standards Analysis

2.1 Existing standard documents

| Document | Location | Status |
|---|---|---|
| Infrastructure standard (six requirements) | docs/infrastructure-standard.md | Current, authoritative |
| Project standards (process rules) | docs/standards/project-standards.md | Partially stale (see §2.3) |
| Element spec template | docs/standards/element-spec-template.md | Current |
| Workbook template | docs/standards/workbook-template.md | Current |

2.2 Observed quality tiers across existing docs

Tier A — Fully standard-conformant (all workbook sections present, current info):

  • docs/n8n-operations.md
  • docs/n8n-cloud-operations.md
  • docs/traefik-operations.md
  • docs/waha-operations.md
  • docs/pdf-service-operations.md
  • docs/monitoring-stack-operations.md
  • docs/grafana-operations.md (not read but referenced as current)
  • docs/cloud-services-operations.md

Tier B — Partially conformant (workbook exists but missing sections or has stale content):

  • docs/n8n-postgresql-operations.md — architecture section only, missing Backup/Restore/Rotation
  • docs/audit-engine-operations.md — missing Backup, Restore, Deployment (fresh install), Password Rotation sections
  • docs/vps-i1-operations.md — server runbook format, not workbook template; missing Backup/Restore
  • docs/hostinger-runbook.md — incident-response format only, not a standards workbook

Tier C — Non-conformant (custom format, content-only, or placeholder):

  • docs/servers/p4-ovh-bms-1-ns367522-operations.md — custom format; no sections for Monitoring (beyond note), Backup, Restore, Upgrade, Password Rotation, Deployment
  • docs/servers/p4-ovh-bms-2-ns3087638-operations.md — custom format; no Backup, Restore, Upgrade, Password Rotation
  • docs/servers/p4-ovh-bms-3-ns3129867-operations.md — custom format; no Backup, Restore, Upgrade, Password Rotation
  • docs/servers/p4-ovh-bms-4-ns3101999-operations.md — good (has Provisioning Log, Tasks, Migration steps) but still missing formal Backup/Restore sections for the Docker services on it

Missing workbooks (no doc exists): See §6.

2.3 Specific standards violations

CRITICAL — Contradiction in project-standards.md:

  • docs/standards/project-standards.md, sections 5, 6, and 7 (lines 73–127) instruct updating services/compliance-matrix.yml.
  • CLAUDE.md explicitly states: “Update services/compliance-matrix.yml — it is deprecated; use Supabase dev_r_services”.
  • This is an active source of confusion for anyone reading the standards document. The standards doc must be updated to remove all references to compliance-matrix.yml.

Exception documentation rule (project-standards.md §5) uses deprecated target:

  • Section 5 says to write exceptions to services/compliance-matrix.yml. The correct location per current practice is dev_r_services compliance_notes field.
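
The stale references described above could be located mechanically before editing the standards document. The following is a minimal sketch, assuming the docs are Markdown files under a local docs/ checkout; the function name find_references is hypothetical and not part of the repository.

```python
from pathlib import Path

# The deprecated path named in this audit.
DEPRECATED = "services/compliance-matrix.yml"


def find_references(root, needle=DEPRECATED):
    """Return (path, line_number, line) for every Markdown line mentioning needle."""
    hits = []
    for path in sorted(Path(root).rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((str(path), number, line.strip()))
    return hits


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for path, number, line in find_references("docs"):
        print(f"{path}:{number}: {line}")
```

A scan like this would also match documents that mention the file only to call it deprecated (this audit, for example), so each hit still needs a manual read.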

3. Server Workbooks Audit

3.1 vps-i1 (IONOS, 217.154.82.162)

Workbook: docs/vps-i1-operations.md
Format: Incident-response runbook format (not workbook template)
Last updated: Unknown — no Workbook last reviewed field

| Check | Result |
|---|---|
| IP address | Correct: 217.154.82.162 |
| OS | Correct: AlmaLinux 9.7 |
| Container list | Partially current; includes pdf-service and MCP containers, Uptime Kuma, monitoring stack |
| Missing services | credential-exporter not in architecture diagram |
| Backup section | Not present in this workbook (covered by monitoring-stack-operations.md and backup-ionos.sh) |
| Restore section | Not present as a workbook section |
| Password Rotation | Not present |
| Standards compliance | Does not follow workbook-template.md |

Note: vps-i1 is a composite host — most of its services have individual workbooks. The host-level doc could be intentionally lighter. However, it lacks SSH hardening status, disk usage, and native process management sections.

3.2 vps-h1 (Hostinger, 72.60.32.61)

Workbooks: docs/hostinger-runbook.md (primary) + docs/traefik-operations.md + docs/n8n-operations.md + docs/waha-operations.md
Stale information identified:

| Issue | File | Detail |
|---|---|---|
| n8n and Traefik migration to bms-4 not reflected | docs/hostinger-runbook.md line 9 | Still lists root-n8n-1 and root-traefik-1 as current; migration is pending |
| claude-proxy container listed | docs/hostinger-runbook.md line 9 | claude-proxy is listed but docs/traefik-operations.md architecture also mentions claude-proxy.vps-h1... route; no dedicated ops workbook exists |
| n8n architecture diagram shows vps-h1 as permanent home | docs/n8n-operations.md lines 10–22 | Should note impending migration to bms-4 |
| WAHA webhook target after migration | docs/waha-operations.md troubleshooting | Will need update post-migration: wa-router webhook URL points to n8n on vps-h1 |
| audit-engine not in hostinger-runbook service list | docs/hostinger-runbook.md line 9 | audit-engine is deployed on vps-h1 but not listed |

3.3 bms-1 (OVH, 94.23.26.113)

Workbooks: docs/servers/p4-ovh-bms-1-ns367522-operations.md AND docs/p4-ovh-bms-1-ns367522-operations.md (root docs/)
Duplicate detected: The same file appears at two paths:

  • docs/servers/p4-ovh-bms-1-ns367522-operations.md
  • docs/p4-ovh-bms-1-ns367522-operations.md

Both appear to be identical (inventoried 2026-06-14). The root-level copy is non-standard placement. The canonical location should be docs/servers/ only.
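
Whether the two copies are byte-identical can be confirmed by comparing content hashes rather than by eye. The sketch below groups Markdown files under a directory by SHA-256 digest; find_duplicates is a hypothetical helper, and the docs/ root is assumed to be a local checkout.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root):
    """Group *.md files under root by SHA-256 and return groups with more than one path."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*.md")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        by_digest[digest].append(str(path))
    return [paths for paths in by_digest.values() if len(paths) > 1]


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for group in find_duplicates("docs"):
        print("identical content:", ", ".join(group))
```

Only exact copies are reported. If the root-level file has drifted from the docs/servers/ copy by even a trailing newline, it will not appear here and a textual diff is needed instead.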

Content assessment:

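  • Disk status is shown as 85% full (354/440 GB), but CLAUDE.md states “disk 100% FULL”. The discrepancy is likely due to different measurement dates; the workbook should state the current critical status explicitly. Note that 354/440 GB is about 80% by plain division, so the 85% figure should be rechecked at the same time.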
  • Good detail on container inventory, open tasks, and access methods.
  • Missing: Backup section (workbook acknowledges “no automated off-server backup”), Restore procedure, Upgrade procedure, Password Rotation section.

3.4 bms-2 (OVH, 145.239.133.104)

Workbook: docs/servers/p4-ovh-bms-2-ns3087638-operations.md
Inventoried: 2026-06-14

| Check | Result |
|---|---|
| MongoDB role | Correct: non-voting observer |
| Claude dev label | Correct: AI-Dev-OV1 |
| Monitoring | STALE: “Not yet connected to Prometheus / node_exporter not installed” (line 109). CLAUDE.md does not confirm this was resolved; the bms-4 workbook confirms node_exporter is installed there, but the bms-2 status is unknown. |
| Missing sections | Backup, Restore, Upgrade, Password Rotation |
| rs0 member table | Lists ns3101999 (bms-4) as “Planned — MongoDB not yet installed”. STALE: bms-4 has MongoDB installed as of 2026-06-14 |
| claude-admin setup | Marked “to be set up”; status not confirmed resolved |

3.5 bms-3 (OVH, 51.68.155.224)

Workbook: docs/servers/p4-ovh-bms-3-ns3129867-operations.md
Inventoried: 2026-06-14

| Check | Result |
|---|---|
| MongoDB role | Correct: rs0 PRIMARY |
| MongoDB version | 7.0.26 (correct) |
| RAM warning | Present: “21.7 GB — watch for OOM” |
| Traccar container | Listed as running, but CLAUDE.md lists Traccar as running on vps-i1 (monitoring stack). The bms-3 traccar container appears to be a separate staging/legacy instance. No cross-reference between the two. |
| Monitoring | STALE: “Not yet connected to Prometheus / node_exporter not installed” (line 116). May or may not have been addressed since 2026-06-14. |
| Missing sections | Backup, Restore, Upgrade, Password Rotation |
| claude-admin setup | Marked “to be set up” |

3.6 bms-4 (OVH, 54.36.123.110)

Workbook: docs/servers/p4-ovh-bms-4-ns3101999-operations.md
Provisioned: 2026-06-14 — workbook is freshly written

| Check | Result |
|---|---|
| MongoDB role | Correct: arbiter |
| Docker services | Correct: Traefik + n8n-postgres + n8n + node-exporter + cadvisor |
| n8n migration checklist | Present and detailed; migration not yet executed |
| Monitoring | node_exporter installed and added to prometheus.yml; current |
| Provisioning log | Comprehensive |
| DNS | *.bms-4.infra.zintegrowana.online created |
| Missing sections | Formal Backup section (Docker services), Restore section, Password Rotation |
| Key pending items | rs.addArb (human action), n8n migration (human action), deploy docker-compose |
| CLAUDE.md discrepancy | CLAUDE.md still lists bms-4 as “Not provisioned — MongoDB not yet installed”, which is stale. The workbook is current; CLAUDE.md needs updating. |

4. Service Docs Audit

| Service | Doc exists? | Doc current? | Key issues |
|---|---|---|---|
| Monitoring stack (Prometheus/Thanos/Alertmanager/Caddy) | Yes: monitoring-stack-operations.md | Yes | Thanos bucket shows ecotrans-monitoring; correct per current config |
| Grafana | Yes: grafana-operations.md | Unknown (not audited) | Referenced as exists |
| Loki | No dedicated doc | In monitoring-stack doc | Loki section is brief; Backup says “Data loss acceptable” |
| n8n (self-hosted, vps-h1) | Yes: n8n-operations.md | Mostly yes | Still shows vps-h1 as permanent; migration to bms-4 not reflected |
| n8n Cloud | Yes: n8n-cloud-operations.md | Yes | Current as of 2026-06-11 review |
| n8n PostgreSQL | Yes: n8n-postgresql-operations.md | Architecture only | Missing Backup, Restore, Rotation sections |
| Traefik (vps-h1) | Yes: traefik-operations.md | Mostly yes | Shows vps-h1 only; bms-4 Traefik not referenced |
| WAHA | Yes: waha-operations.md | Yes | Will need update after n8n moves |
| PDF Service + MCP | Yes: pdf-service-operations.md | Yes, comprehensive | Current |
| Audit Engine | Yes: audit-engine-operations.md | Partial | Missing Backup, Restore, fresh install procedure, Password Rotation |
| Traccar | Yes: traccar-operations.md | Unknown (not audited) | Referenced in elements.md |
| Cloud services (Cloudflare/GH/Vercel/Wasabi/Mailgun) | Yes: cloud-services-operations.md | Mostly yes | Wasabi section stale post-key-consolidation |
| vps-i1 host | Yes: vps-i1-operations.md | Partial | Incident-response format, not workbook template |
| vps-h1 host | Yes: hostinger-runbook.md | Partial | Incident-response format; missing audit-engine, migration state |
| OpenClaw gateway | Yes: docs/openclaw-operations.md | Unknown (not audited) | Listed in elements.md as ❌ compliance |
| Claude proxy (vps-h1) | Yes: docs/claude-proxy-router-operations.md | Unknown | Listed in root docs/ glob |
| Supabase | Yes: docs/supabase-operations.md | Unknown | Not audited |
| SSH hardening | Yes: docs/ssh-hardening-operations.md | Unknown | Not audited |
| report-scheduler | Yes: docs/report-scheduler-operations.md | Unknown | Not audited |
| Monitoring exporters | Yes: docs/monitoring-exporters-operations.md | Unknown | Not audited |
| bms-1 (OVH) | Yes: docs/servers/p4-ovh-bms-1-ns367522-operations.md | Partial | Duplicate at root; missing Backup/Restore |
| bms-2 (OVH) | Yes: docs/servers/p4-ovh-bms-2-ns3087638-operations.md | Partial | Missing Backup/Restore/Rotation; stale monitoring status |
| bms-3 (OVH) | Yes: docs/servers/p4-ovh-bms-3-ns3129867-operations.md | Partial | Missing Backup/Restore/Rotation; stale monitoring status |
| bms-4 (OVH) | Yes: docs/servers/p4-ovh-bms-4-ns3101999-operations.md | Yes (fresh) | Missing formal Backup/Restore sections |
| MongoDB rs0 | No dedicated workbook | | No mongodb-operations.md exists |
| credential-exporter | No | | Not mentioned in monitoring-stack doc |
| Uptime Kuma | No | | Listed in elements.md as ❌ compliance |
| vercel-exporter | No dedicated doc | In monitoring-stack doc (brief) | No individual workbook |
| GitHub Actions runners (ionos, hstgr) | No | | Mentioned in elements.md but no workbook |
| claude-nightly.sh / autonomous agents | Partial: docs/claude-agent-setup.md | Unknown | Not audited |
| backup-ionos.sh / backup-hstgr.sh | In elements.md | Partial | No dedicated backup script workbook |

5. Contradictions and Stale Information

5.1 CLAUDE.md vs. docs/infrastructure-overview.md

docs/infrastructure-overview.md is substantially out of date. Specific contradictions:

| Topic | infrastructure-overview.md (stale) | CLAUDE.md (current) |
|---|---|---|
| Servers listed | Only vps-i1 and vps-h1; “Pinbox24 Dev VPS” at 51.68.155.224 | Four OVH bare metal servers (bms-1 through bms-4) documented |
| OVH VPS | Planned “Server F” at TBD IP (“P5 — OVH VPS Server F provisioning”) | bms-2 (ns3087638), bms-3 (ns3129867), bms-4 (ns3101999) are provisioned |
| MongoDB | Not mentioned anywhere | rs0 replica set with PRIMARY on bms-3, observer on bms-2, arbiter on bms-4 |
| n8n migration | n8n shown as permanent on vps-h1 | n8n being migrated to bms-4 |
| Wasabi buckets | §6.3 shows “Wasabi S3 (planned)”, “Not yet provisioned” | Wasabi is active; bucket p24-infra (eu-central-2) operational since at least 2026-05-08 |
| Monitoring stack | §8 “Status: Repo created, configs ready, deployment pending” | Monitoring stack fully deployed on vps-i1 and operational |
| Secrets rotation | §9: “ROTATION PENDING” for 6 credentials from 2026-05-06 exposure | Rotated 2026-05-08 per memory/project_credential_rotation.md |
| Hostinger node-exporter | Listed as having monitoring exporters | Both root-node-exporter-1 and root-cadvisor-1 listed; correct |
| Traccar | On IONOS VPS; correct | Correct (also duplicated on bms-3 staging, but different instance) |

Conclusion: docs/infrastructure-overview.md appears to be from approximately early May 2026, before the monitoring stack deployment, bare metal server provisioning, and MongoDB replica set setup. It should be considered a historical document, not a living reference.

5.2 CLAUDE.md vs. docs/standards/project-standards.md

| Topic | project-standards.md (stale reference) | CLAUDE.md (authoritative) |
|---|---|---|
| Compliance tracking | Sections 5, 6, 7 reference services/compliance-matrix.yml | “Update services/compliance-matrix.yml — it is deprecated; use Supabase dev_r_services” |
| Exception documentation target | “Write to services/compliance-matrix.yml” | Supabase dev_r_services compliance_notes |

5.3 docs/elements.md vs. actual infrastructure state

docs/elements.md was last updated 2026-05-13. Current gaps:

| Category | elements.md state | Actual state (2026-06-14) |
|---|---|---|
| Servers table | Lists vps-i1, vps-h1, local workstation, “vps-ovh1” (planned), “vps-p24dev” at 51.68.155.224 | Four active OVH BMS servers (bms-1 through bms-4); vps-p24dev is actually bms-3 (ns3129867, MongoDB rs0 PRIMARY + Pinbox24 staging) |
| Container Services: vps-h1 | Traefik, n8n, WAHA, node-exporter, cadvisor | Same + audit-engine (listed in audit-engine-operations.md as on vps-h1), which is missing from elements.md |
| bms-4 services | Not present | Traefik, n8n-postgres, n8n (migrating), node-exporter, cadvisor |
| MongoDB | Not present | Three-member rs0 across bms-2, bms-3, bms-4 |
| SaaS: n8n Cloud | Listed | Current |
| SaaS: bms-1 / bms-2 / bms-3 | Not listed under Servers | Active bare metal servers |
| Traefik workbook | monitoring-stack-operations.md (Caddy equivalent) | Wrong: Traefik has its own docs/traefik-operations.md |
| vps-ovh1 (planned) | planned status | Superseded by bms-2/3/4 which are actual OVH servers |
| DNS: *.bms-4.infra.* | Not present | Added 2026-06-14 |

5.4 Wasabi S3: two IAM users vs. one consolidated user

cloud-services-operations.md (§Wasabi, lines 281–312) describes two separate IAM users:

  • A “monitoring user” owning ecotrans-monitoring (eu-central-1) — key WASABI_ACCESS_KEY
  • IAM user p24-infra owning p24-infra (eu-central-2) — key P24_INFRA_WASABI_ACCESS_KEY

CLAUDE.md (§Wasabi S3, “Key consolidation 2026-06-14”) states:

“WASABI_ACCESS_KEY / WASABI_SECRET_KEY are aliases for P24_INFRA_WASABI_ACCESS_KEY / P24_INFRA_WASABI_SECRET_KEY. Both sets of GH Secrets hold the same value.”

This means the “monitoring user” and p24-infra user are now unified (or the keys are identical aliases). The cloud-services-operations.md Wasabi section still describes them as two separate IAM users with separate key rotation schedules, which is no longer accurate after the 2026-06-14 consolidation.

Also: monitoring-stack-operations.md (line 103) and the architecture diagram still reference s3://ecotrans-monitoring — CLAUDE.md confirms this bucket still exists for Thanos but ecotrans-monitoring-test is now deprecated. The test bucket is still listed in elements.md storage table.
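
The key-alias claim in CLAUDE.md could be verified wherever both secret sets are exposed together, for example as environment variables inside a CI job. The sketch below is written under that assumption; alias_problems is a hypothetical helper, and it reports only variable names, never the secret values.

```python
import os

# Alias -> canonical name, as described in CLAUDE.md per this audit.
ALIASES = {
    "WASABI_ACCESS_KEY": "P24_INFRA_WASABI_ACCESS_KEY",
    "WASABI_SECRET_KEY": "P24_INFRA_WASABI_SECRET_KEY",
}


def alias_problems(env):
    """Return human-readable problems; an empty list means every alias matches."""
    problems = []
    for alias, canonical in ALIASES.items():
        if alias not in env or canonical not in env:
            problems.append(f"{alias} / {canonical}: one or both are unset")
        elif env[alias] != env[canonical]:
            problems.append(f"{alias} differs from {canonical}")
    return problems


if __name__ == "__main__":
    # Assumes both secret sets are exported into the process environment.
    for problem in alias_problems(os.environ):
        print(problem)
```

Matching values would support the "identical aliases" reading, but would not by themselves show whether the separate monitoring IAM user still exists on the Wasabi side.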

5.5 bms-4 status in CLAUDE.md vs. bms-4 workbook

CLAUDE.md Server table for bms-4:

Role: “Planned third MongoDB rs0 member (quorum node)”
Status: “Not provisioned — MongoDB not yet installed”

bms-4 workbook (docs/servers/p4-ovh-bms-4-ns3101999-operations.md):

  • Status: “Active — provisioned 2026-06-14”
  • MongoDB 7.0.37 installed
  • Docker CE 29.5.3 installed
  • node_exporter running

CLAUDE.md has not been updated to reflect bms-4 provisioning. The workbook is the authoritative source here.

5.6 n8n version mismatch

docs/n8n-operations.md architecture diagram (line 12) shows docker.n8n.io/n8nio/n8n:2.22.0. The workbook contains an upgrade section (§ “n8n 2.x Upgrade”) describing the upgrade from 1.x to 2.22.0. However, the current branch is fix/n8n-2.26.3-compose-cleanup, which suggests the running version may now be 2.26.3. If so, the workbook image tag is four minor versions behind (2.22 → 2.26).
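
The gap between the documented image tag and the version implied by the branch name can be computed directly. The following is a small sketch using only the two strings quoted above; parse_version is a hypothetical helper, and treating the branch name as evidence of the running version is an assumption of this audit, not a confirmed fact.

```python
import re


def parse_version(text):
    """Extract the first x.y.z version found in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no x.y.z version found in {text!r}")
    return tuple(int(part) for part in match.groups())


documented = parse_version("docker.n8n.io/n8nio/n8n:2.22.0")
branch = parse_version("fix/n8n-2.26.3-compose-cleanup")

# Tuples compare element by element, so this is a proper version comparison.
print(documented < branch)        # True
print(branch[1] - documented[1])  # 4 minor versions apart
```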

5.7 Traccar on two hosts without cross-reference

docs/vps-i1-operations.md and docs/elements.md document Traccar on vps-i1.
docs/servers/p4-ovh-bms-3-ns3129867-operations.md also shows a traccar container running on bms-3 (staging).
No doc clarifies the relationship between the two Traccar instances — whether they share data, use the same GPS devices, or are fully independent. docs/traccar-operations.md likely covers only the vps-i1 instance.


6. Missing Documentation

Services and components that exist in the infrastructure but have no dedicated documentation:

Critical gaps (no doc at all)

| Missing | Why critical | Suggested doc path |
|---|---|---|
| MongoDB rs0 operations | Three-server replica set, keyFile auth, election quorum; high risk of data loss on mishandling | docs/mongodb-operations.md |
| bms-4 services workbook | Traefik + n8n deployment on bms-4 is documented only in the bms-4 server workbook; no standalone service docs for the migrated services | Extend docs/traefik-operations.md and docs/n8n-operations.md after migration |
| n8n migration (vps-h1 → bms-4) | Migration steps exist only in docs/servers/p4-ovh-bms-4-ns3101999-operations.md; no cross-reference from n8n or traefik workbooks | Add a “Migration” section to docs/n8n-operations.md |
| credential-exporter | Listed in CLAUDE.md custom exporters table but absent from monitoring-stack-operations.md custom exporters table | Add to docs/monitoring-exporters-operations.md |

High-priority gaps (element exists, compliance_workbook = ❌ in elements.md)

| Missing | Current elements.md status | Suggested path |
|---|---|---|
| Uptime Kuma | | docs/uptime-kuma-operations.md |
| queue-exporter (dedicated doc) | | Covered in monitoring-stack-operations.md §“Queue Exporter”; acceptable as subsection |
| cost-exporter | | Add to docs/monitoring-exporters-operations.md |
| backup-exporter | | Add to docs/monitoring-exporters-operations.md |
| pg-stats-exporter | | Add to docs/monitoring-exporters-operations.md |
| vercel-exporter | | Add to docs/monitoring-exporters-operations.md |
| Gotenberg | | Covered in docs/pdf-service-operations.md; acceptable |
| openclaw-gateway | | docs/openclaw-operations.md exists (not audited here) |
| node_exporter (both VPSes) | ⚠️ | Brief section in each host workbook is adequate |
| cadvisor (both VPSes) | ⚠️ | Brief section in each host workbook is adequate |

Medium-priority gaps

| Missing | Note |
|---|---|
| GitHub Actions runners (ionos, hstgr) | Mentioned in elements.md automation table; no separate workbook. May be acceptable as a section in docs/vps-i1-operations.md and docs/hostinger-runbook.md |
| OpenClaw CLI container | Listed as Exited(1) in elements.md; docs/openclaw-operations.md exists but not audited |
| Traccar on bms-3 (staging) | Undocumented; needs a note in docs/servers/p4-ovh-bms-3-ns3129867-operations.md |
| bms-2 / bms-3 as MongoDB rs0 members | The bms-4 workbook has a replica set topology table; there is no equivalent in the bms-2 or bms-3 workbooks |
| AI-Dev-OV1 agent setup on bms-2 | Workbook has a placeholder; Claude Code is not yet installed and installation steps are missing |

7. Recommendations (Prioritized)

P1 — Fix actively misleading or contradictory content

  1. Update docs/standards/project-standards.md — remove all references to services/compliance-matrix.yml (sections 5, 6, 7). Replace with dev_r_services Supabase table references. This is the highest-priority fix because it misleads anyone implementing the standard.

  2. Declare docs/infrastructure-overview.md as archived — add a prominent header: > ARCHIVED — Last accurate: May 2026. Current state: see CLAUDE.md and docs/servers/*.md. Do not attempt to update it; it would require a full rewrite and CLAUDE.md already serves this purpose.

  3. Update docs/elements.md — add the four OVH BMS servers, MongoDB rs0 components, bms-4 Docker services, and the audit-engine entry for vps-h1. Remove the stale “vps-ovh1 planned” entry. Update Traefik workbook link to traefik-operations.md. Target: 2026-06-21.

  4. Update CLAUDE.md bms-4 entry — change Status from “Not provisioned” to “Active — provisioned 2026-06-14”; update Role to reflect actual state (MongoDB arbiter + Docker host with Traefik + n8n).

  5. Remove duplicate bms-1 workbook — delete docs/p4-ovh-bms-1-ns367522-operations.md from the root docs/ directory; canonical location is docs/servers/.

  6. Update docs/cloud-services-operations.md Wasabi section — reflect the 2026-06-14 key consolidation: one IAM user (p24-infra), one set of keys (P24_INFRA_WASABI_*), with WASABI_* as deprecated aliases. Update rotation table.

P2 — Fill critical missing workbooks

  1. Create docs/mongodb-operations.md — covering: rs0 topology, keyFile auth, admin credential management, adding/removing members, checking replication lag, backup (mongoexport/mongodump to Wasabi), restore procedure, password rotation.

  2. Update docs/n8n-operations.md — add “Migration: vps-h1 → bms-4” section; update architecture diagram; bump image tag to 2.26.3 to match current branch.

  3. Update docs/traefik-operations.md — add bms-4 deployment section after migration; note both vps-h1 and bms-4 now run Traefik.

  4. Complete bms-2 and bms-3 workbooks — add minimal Backup, Restore, and Password Rotation sections per template.

P3 — Fill remaining compliance gaps

  1. Create docs/mongodb-operations.md (already in P2 — reinforced as blocking the infra_docs_check audit action).

  2. Add credential-exporter to docs/monitoring-exporters-operations.md — it appears in CLAUDE.md custom exporters table but not in monitoring-stack-operations.md.

  3. Create docs/uptime-kuma-operations.md — currently ❌ in elements.md compliance column.

  4. Update docs/audit-engine-operations.md — add Deployment (fresh install), Backup, Restore, and Password Rotation sections per workbook template.

  5. Review and archive completed improvement specs: docs/improvements/01-backups.md, 02-loki-logs.md, 04-iac-ansible.md, 05-blackbox-synthetic.md, and 07-status-page.md appear to be completed. Add a “Status: COMPLETED yyyy-mm-dd” header to each.


8. Proposed Documentation Standard (Unified Template Header)

All operations workbooks should begin with a consistent metadata block. This is an enhancement to the existing workbook-template.md — the template body is already good, but the header is inconsistently applied.

Required header (replace all {placeholders}):

# {Service Name} — Operations Workbook
 
**Host:** `{vps-label}` (`{IP}`)  
**Public URL:** `https://{hostname}.infra.zintegrowana.online` (or "internal only")  
**Compose file in repo:** `{path/to/docker-compose.yml}` (or "N/A — SaaS")  
**Workbook last reviewed:** `YYYY-MM-DD`  
**`dev_r_services` row:** `{service_name}` — compliance_workbook: `yes` / `partial` / `no`  
 
{One sentence: what this service does and why we run it.}
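
A consistent header also makes review dates machine-readable. The sketch below reads the “Workbook last reviewed” field in the format proposed above and flags workbooks that lack it or exceed an age limit. The 90-day threshold and the names last_reviewed and is_stale are illustrative assumptions, not existing policy or tooling.

```python
import re
from datetime import date

# Matches the proposed header line: **Workbook last reviewed:** `YYYY-MM-DD`
REVIEWED = re.compile(r"\*\*Workbook last reviewed:\*\*\s*`?(\d{4}-\d{2}-\d{2})`?")


def last_reviewed(markdown_text):
    """Return the review date from the header, or None if the field is absent."""
    match = REVIEWED.search(markdown_text)
    return date.fromisoformat(match.group(1)) if match else None


def is_stale(markdown_text, today, max_age_days=90):
    """Treat a workbook as stale if the field is missing or older than max_age_days."""
    reviewed = last_reviewed(markdown_text)
    return reviewed is None or (today - reviewed).days > max_age_days


sample = "# n8n Cloud\n\n**Workbook last reviewed:** `2026-06-11`  \n"
print(last_reviewed(sample))                 # 2026-06-11
print(is_stale(sample, date(2026, 6, 14)))   # False
```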

Mandatory sections (in order, must not be omitted — use “Not applicable” with justification if truly N/A):

  1. Architecture
  2. Config Management (what lives in repo vs. server, how secrets are injected)
  3. Deployment (fresh install + update)
  4. Backup
  5. Restore
  6. Upgrade
  7. Monitoring and Alerts
  8. Healthcheck
  9. Password Rotation
  10. Troubleshooting
  11. Known Limitations / Standard Exceptions (if any requirement cannot be met)
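
The mandatory-section list lends itself to an automated check. The following is a minimal sketch that looks for each of the first ten section names in a workbook's Markdown headings. Item 11 is left out because it applies only when an exception exists. The helper name missing_sections is hypothetical, and substring matching on headings is an assumption about how strict the check should be.

```python
import re

# Sections 1-10 from the list above; "Monitoring and Alerts" is matched by keyword.
MANDATORY = [
    "Architecture", "Config Management", "Deployment", "Backup", "Restore",
    "Upgrade", "Monitoring", "Healthcheck", "Password Rotation", "Troubleshooting",
]

# A Markdown ATX heading: one to six '#' characters, spaces, then the title.
HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def missing_sections(markdown_text):
    """Return mandatory section names that appear in no heading (case-insensitive)."""
    headings = [m.group(1).lower() for m in HEADING.finditer(markdown_text)]
    return [name for name in MANDATORY
            if not any(name.lower() in heading for heading in headings)]


sample = "# Demo\n## Architecture\n## Backup and Restore\n"
print(missing_sections(sample))
```

Because matching is by substring, a combined heading such as “Backup and Restore” satisfies both entries. Lines starting with # inside fenced code blocks would also be counted as headings, so the result is a hint rather than a verdict.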

Server workbooks (for physical/VPS host-level docs, not individual service workbooks) additionally require:

  • Hardware / OS / disk layout section
  • SSH access table
  • Running services inventory (can reference individual service workbooks)
  • Provisioning log (dated entries)
  • Open tasks checklist

Naming convention: docs/{service-name}-operations.md for services; docs/servers/{provider}-{label}-{hostname}-operations.md for server hosts.
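
The naming convention can be expressed as two patterns and checked against repository paths. This is a sketch under the assumption that names are lowercase, hyphen-separated tokens; the pattern details and the helper conforms are illustrative, not an agreed rule.

```python
import re

# docs/{service-name}-operations.md
SERVICE_DOC = re.compile(r"docs/[a-z0-9]+(-[a-z0-9]+)*-operations\.md")
# docs/servers/{provider}-{label}-{hostname}-operations.md (at least three tokens)
SERVER_DOC = re.compile(r"docs/servers/[a-z0-9]+(-[a-z0-9]+){2,}-operations\.md")


def conforms(path):
    """Return True if a repo-relative path matches the convention for its directory."""
    pattern = SERVER_DOC if path.startswith("docs/servers/") else SERVICE_DOC
    return pattern.fullmatch(path) is not None


for candidate in (
    "docs/n8n-operations.md",
    "docs/hostinger-runbook.md",
    "docs/servers/p4-ovh-bms-4-ns3101999-operations.md",
):
    print(candidate, conforms(candidate))
```

A pattern check cannot tell that a server workbook has been placed in the root of docs/: docs/p4-ovh-bms-1-ns367522-operations.md would pass as a service document, so misplaced files still need the duplicate check or a manual review.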


Appendix A: File inventory snapshot

Files in docs/ (root, not subdirectories) at time of audit:

  • infrastructure-overview.md — STALE (see §5.1)
  • elements.md — STALE (see §5.3), last updated 2026-05-13
  • infrastructure-standard.md — current
  • monitoring-stack-operations.md — current (Tier A)
  • n8n-operations.md — mostly current (Tier A, migration not reflected)
  • n8n-cloud-operations.md — current (Tier A)
  • n8n-postgresql-operations.md — partial (Tier B)
  • traefik-operations.md — mostly current (Tier A)
  • waha-operations.md — current (Tier A)
  • pdf-service-operations.md — current (Tier A)
  • audit-engine-operations.md — partial (Tier B)
  • cloud-services-operations.md — mostly current with Wasabi stale section (Tier A/B)
  • vps-i1-operations.md — partial non-template format (Tier B)
  • hostinger-runbook.md — partial non-template format (Tier B)
  • p4-ovh-bms-1-ns367522-operations.md — DUPLICATE (delete; canonical at docs/servers/)
  • Various other docs (traccar, grafana, supabase, openclaw, etc.) — not audited in detail

Files in docs/servers/:

  • All four BMS workbooks present, all inventoried 2026-06-14
  • bms-4 is the most complete; bms-2 and bms-3 have the most gaps

Files in docs/standards/:

  • project-standards.md — contradicts CLAUDE.md on compliance-matrix.yml
  • element-spec-template.md — current
  • workbook-template.md — current

Files in docs/improvements/:

  • 15 improvement specs from the May 2026 audit
  • Several appear completed (Loki, Blackbox, Backups, Ansible IaC) — no “COMPLETED” status markers
  • Items 13 (Hostinger runbook), 14 (n8n versioning), 09 (SSH hardening) have corresponding operational docs but no closure noted in the spec files

Files in docs/evaluation/:

  • This document is the first file.

Audit performed by Claude Code (claude-sonnet-4-6) on 2026-06-14. Static analysis only — no live infrastructure queries were made.