P24-Infra Infrastructure Evaluation — June 2026
Produced: 2026-06-14 · Claude Code autonomous evaluation (5 parallel agents)
Working directory: C:\code_2026\p24-infra
This directory contains the full infrastructure evaluation for the p24-infra / Ecotrans / Pinbox24 platform.
Documents
| # | Document | Scope | Size |
|---|---|---|---|
| 00 | Documentation Audit + Standards | Doc consistency, stale docs, gaps, proposed standard | ~8 KB |
| 01 | Service Inventory + Distribution Plan | Full service catalog, vps-h1 load, bms-4 expansion, PDF services | ~15 KB |
| 02 | BMS Servers Modernization Plan | Phased roadmap bms-1 to bms-4, 32 issues P1–P4 | ~45 KB |
| 03 | AI-Dev-BMS4 + Nightly Ops + MongoDB | Agent design, nightly checklist, MongoDB maintenance | ~68 KB |
| 04 | Pinbox24 Map + DR Audit + Workbook Audit | Infrastructure map, DR score 2/10, backup gaps | ~20 KB |
Executive Summary
Critical Findings (Act Now)
| # | Finding | Risk | Document |
|---|---|---|---|
| C1 | MongoDB backup last: February 2026 — 4+ months stale | Data loss up to 4 months if bms-3 fails | 04 |
| C2 | bms-1 disk 100% full — Pinbox24 production server | Production crash or data corruption imminent | 02 |
| C3 | Untagged Docker images on bms-1 (v41-prod, v32-prod-socket, v32-prod-reso) | If containers stop, cannot be restarted (images lost) | 04 |
| C4 | rs.addArb() for bms-4 still pending | MongoDB rs0 has only 1 voting member + observer; no quorum resilience | 02 |
| C5 | bms-1 Ubuntu 20.04 EOL since April 2025 | No security patches for 14+ months on production server | 02 |
| C6 | bms-3 MongoDB using ~21.7 GB RAM on 32 GB server | OOM kill risk; rs0 PRIMARY could be lost | 02 |
| C7 | vps-h1 critically overloaded (n8n at 1.5/2.0 vCPU) | n8n + WAHA reliability at risk; WAHA = WhatsApp incidents | 01 |
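Finding C2 (disk full on bms-1) is the kind of condition that can be checked mechanically before and after cleanup. The following is a minimal sketch using only the Python standard library; the 80% and 95% thresholds and the function names are illustrative assumptions, not values taken from the evaluation.

```python
import shutil


def disk_used_percent(path: str = "/") -> float:
    """Return the percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def classify(percent: float, warn: float = 80.0, crit: float = 95.0) -> str:
    """Map a usage percentage to a severity label (thresholds are assumptions)."""
    if percent >= crit:
        return "critical"
    if percent >= warn:
        return "warning"
    return "ok"


if __name__ == "__main__":
    pct = disk_used_percent("/")
    print(f"/ is {pct:.1f}% used: {classify(pct)}")
```

The figure may differ slightly from what `df` reports, because `df` accounts for blocks reserved for root.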
Today’s Required Human Actions
- Export untagged bms-1 images to Wasabi NOW — before any container restart/update
- Run MongoDB dump on bms-3 NOW: `mongodump --out /tmp/dump-$(date +%Y%m%d)`, then upload to Wasabi
- Disk cleanup on bms-1 — identify & remove old container layers, logs, tmp files
- Locate MongoDB admin credentials: required for `rs.addArb()` and all MongoDB maintenance
- Execute `rs.addArb("54.36.123.110:27017")` and `rs.remove("51.83.132.99:27017")` on bms-3
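The dump step in the list above can be made repeatable by deriving the dump directory and the upload destination from the date. This is a sketch under stated assumptions: the `s3://p24-infra/mongodb/` prefix comes from the Phase 1 table in this document, the helper names are hypothetical, and authentication options for `mongodump` are deliberately left out because the credentials are still to be located.

```python
from datetime import date

# Destination prefix as named in the Phase 1 roadmap of this document.
BUCKET_PREFIX = "s3://p24-infra/mongodb/"


def dump_dir(day: date, base: str = "/tmp") -> str:
    """Local dump directory, matching the dump-YYYYMMDD pattern used above."""
    return f"{base}/dump-{day:%Y%m%d}"


def mongodump_argv(day: date) -> list[str]:
    """Argument vector for mongodump; credentials must be added by the operator."""
    return ["mongodump", "--out", dump_dir(day)]


def remote_prefix(day: date) -> str:
    """Hypothetical per-day destination under the backup bucket."""
    return f"{BUCKET_PREFIX}dump-{day:%Y%m%d}/"


if __name__ == "__main__":
    today = date.today()
    print(" ".join(mongodump_argv(today)))
    print("upload to:", remote_prefix(today))
```

Keeping one directory per day makes it easy to see at a glance when the last successful dump was taken.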
Infrastructure Overview (Current State)
Servers
| Label | IP | OS | Role | Status |
|---|---|---|---|---|
| vps-i1 | 217.154.82.162 | AlmaLinux 9.7 | Monitoring stack (Prometheus+Grafana+Thanos+Alertmanager), Traccar, OpenClaw, GH Actions runner, AI-Dev-IO1 | ✅ Stable |
| vps-h1 | 72.60.32.61 | Ubuntu 24.04 | Traefik, n8n+PG, WAHA, exporters, promtail | ⚠️ Overloaded — n8n migrating to bms-4 |
| bms-1 | 94.23.26.113 | Ubuntu 20.04 EOL | Pinbox24 production (24 containers v31/v32/v41/v42) | 🔴 Critical — disk full + EOL |
| bms-2 | 145.239.133.104 | Ubuntu 24.04 | MongoDB rs0 observer (non-voting) + AI-Dev-OV1 | ✅ Good |
| bms-3 | 51.68.155.224 | Ubuntu 22.04 | MongoDB rs0 PRIMARY + Pinbox24 staging + traccar + mt5 | ⚠️ OOM risk, dual-purpose |
| bms-4 | 54.36.123.110 | Ubuntu 22.04 | MongoDB arbiter + Docker host (n8n migration target) | ✅ New — tasks pending |
SaaS
| Service | Purpose | Status |
|---|---|---|
| Supabase (mwkqmgadqnkkihjdeqsi) | et-operational-platform DB + audit engine | ✅ Active |
| Vercel | et-operational-platform (prod+staging), p24-nextjs-v2026, portal | ✅ Active |
| Cloudflare | DNS (zintegrowana.online), CF Workers (waha-router) | ✅ Active |
| Wasabi S3 (p24-infra, eu-central-2) | Thanos metrics, PDF storage, backups | ✅ Active |
| Convertio.ai | PDF→image conversion for Pinbox24 | ⚠️ External SaaS — scheduled replacement |
| AWS ECR | Pinbox24 production container registry | ✅ Active |
| Mailgun EU | Email alerts (Alertmanager) | ✅ Active |
Priority Roadmap
Phase 1 — Critical Security (This Week)
| Task | Server | Owner | Notes |
|---|---|---|---|
| Export untagged images → Wasabi | bms-1 | Human | Cannot be automated — images are untagged |
| Run MongoDB dump + upload Wasabi | bms-3 | Human | First real backup in 4+ months |
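| Disk cleanup on bms-1 | bms-1 | Claude/Human | Run only after the untagged images are exported: `docker system prune` removes stopped containers and dangling (untagged) images. Confirm with a human before touching anything used by prod containers |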
| Set up automated MongoDB backup | bms-2 | AI-Dev-OV1 | Daily `mongodump` + upload to Wasabi `s3://p24-infra/mongodb/` (rsync cannot write to S3; use an S3-capable client such as rclone or the AWS CLI) |
| Run `rs.addArb()` / `rs.remove()` | bms-3 (mongosh) | Human | Needs MongoDB admin password |
| Ubuntu 20.04 EOL migration plan | bms-1 | Plan session | Zero-downtime requires containerized migration strategy |
| MongoDB RAM alert in Prometheus | vps-i1 | AI-Dev-IO1 | Alert when bms-3 RAM < 2 GB free |
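The RAM alert in the last row could take roughly the following shape. This sketch builds a Prometheus alerting rule as a Python dictionary; `node_memory_MemAvailable_bytes` is a standard node-exporter metric, while the alert name, the `instance` label value, the 5-minute hold time, and the reading of "2 GB" as 2 GiB are all assumptions. It also presumes node-exporter is already running on bms-3.

```python
import json

GIB = 1024 ** 3


def low_memory_rule(instance: str, min_free_gib: int = 2) -> dict:
    """Build one alerting rule that fires when available memory drops below the limit."""
    threshold = min_free_gib * GIB
    return {
        "alert": "HostLowAvailableMemory",  # hypothetical alert name
        "expr": f'node_memory_MemAvailable_bytes{{instance="{instance}"}} < {threshold}',
        "for": "5m",  # assumed hold time before firing
        "labels": {"severity": "critical"},
        "annotations": {
            "summary": f"{instance} has less than {min_free_gib} GiB of available memory"
        },
    }


if __name__ == "__main__":
    # "bms-3" stands in for whatever instance label the scrape config assigns.
    group = {"groups": [{"name": "memory", "rules": [low_memory_rule("bms-3")]}]}
    print(json.dumps(group, indent=2))
```

JSON output is used here only to avoid a third-party YAML dependency; the structure would need to be written out as YAML in the actual rules file.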
Phase 2 — Stability (Next 2 Weeks)
| Task | Server | Notes |
|---|---|---|
| Add bms-2 + bms-3 to Prometheus | vps-i1 | node-exporter must be installed on bms-3 first |
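| Set MongoDB WiredTiger cache to 16 GB (`storage.wiredTiger.engineConfig.cacheSizeGB: 16` in mongod.conf, or `--wiredTigerCacheSizeGB 16`) | bms-3 | Limit cache to prevent OOM; needs mongod restart |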
| n8n migration vps-h1 → bms-4 | bms-4 | See migration checklist in bms-4 workbook |
| Deploy bms-4 docker-compose | bms-4 | scp bms-4/docker-compose.yml root@54.36.123.110:/root/ |
| WAHA migration vps-h1 → bms-4 | bms-4 | After n8n stable; update webhook URLs |
| Portainer upgrade (v1 → v2) | bms-1 | Portainer CE v1 is EOL |
| MongoDB firewall: close port 27017 externally | bms-2/3/4 | Only inter-replica and admin access needed |
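When choosing the WiredTiger cache value for bms-3, it helps to know what MongoDB would pick by default. The MongoDB manual gives the default internal cache size as the larger of 50% of (RAM minus 1 GB) and 256 MB. The sketch below applies that rule; the function name is hypothetical, and the 32 GB figure is the nominal server size quoted in this document.

```python
def default_wiredtiger_cache_gb(ram_gb: float) -> float:
    """Default WiredTiger cache size: max(50% of (RAM - 1 GB), 256 MB)."""
    return max(0.5 * (ram_gb - 1.0), 0.25)


if __name__ == "__main__":
    ram_gb = 32.0  # nominal RAM of bms-3 as stated in this document
    cache = default_wiredtiger_cache_gb(ram_gb)
    print(f"default cache on {ram_gb:.0f} GB host: {cache:.1f} GB")
    print(f"left for everything else: {ram_gb - cache:.1f} GB")
```

Under these assumptions the default on a 32 GB host is about 15.5 GB, so an explicit 16 GB setting sits close to the default. It may be worth comparing against the measured ~21.7 GB usage before settling on the final value, since mongod also uses memory outside the cache.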
Phase 3 — Hardening (This Month)
| Task | Servers | Notes |
|---|---|---|
| Install fail2ban + SSH hardening | bms-1/2/3/4 | See docs/improvements/09-ssh-hardening.md |
| Enable unattended-upgrades | bms-1/2/3/4 | Security patches only |
| Trivy CVE scanning | All | Weekly scan via GH Actions |
| Create claude-admin user on bms-3/4 | bms-3/4 | For AI agent SSH access |
| Install AI-Dev-BMS4 agent | bms-4 | See 03-nightly-ops-and-mongodb.md §1–5 |
| Nightly operations automation | bms-4/GH Actions | See 03-nightly-ops-and-mongodb.md §6–13 |
| Register all Pinbox24 services in `dev_r_services` | Supabase | See 04-pinbox24-map-dr-audit.md §14 |
Phase 4 — Modernization (Next Quarter)
| Task | Notes |
|---|---|
| bms-1 Ubuntu 20.04 → 24.04 migration | Zero-downtime: new server + DNS cutover strategy |
| bms-3 Ubuntu 22.04 → 24.04 (before April 2027 EOL) | Maintenance window required; MongoDB failover first |
| bms-4 Ubuntu 22.04 → 24.04 | Simplest — least traffic |
| Docker registry consolidation | private-registry.dev.pinbox24.com location TBD |
| PDF services on bms-4 | Gotenberg + pdf-to-jpg microservice replacing Convertio.ai |
| vps-h1 decommission | After n8n + WAHA migrate to bms-4; saves ~10€/month |
Documentation Gaps (from Audit)
- Update CLAUDE.md bms-4 entry — still says “Not provisioned”
- Update `docs/elements.md`: add all 4 BMS servers, MongoDB rs0, bms-4 services
- Delete duplicate `docs/p4-ovh-bms-1-ns367522-operations.md` (keep `docs/servers/` version)
- Fix `docs/standards/project-standards.md`: remove references to deprecated `services/compliance-matrix.yml`
- Archive `docs/infrastructure-overview.md`: severely stale, create fresh overview
Missing Workbooks (by priority)
| Priority | Service | Notes |
|---|---|---|
| P1 | MongoDB rs0 operations | Critical 3-node replica set, zero documentation |
| P1 | Pinbox24 production (bms-1) | 24-container production system with no DR runbook |
| P2 | n8n on bms-4 | After migration from vps-h1 |
| P2 | Convertio.ai replacement (pdf-to-jpg) | New service to be deployed |
| P2 | MT5 on bms-3 | Unknown purpose, no docs |
| P3 | WAHA on bms-4 | After migration |
| P3 | AI-Dev-BMS4 agent | After installation |
Service Distribution — Target Architecture
┌─────────────────────────────────────────────────────────────────────┐
│                TARGET ARCHITECTURE (post-migration)                 │
├──────────────┬──────────────────────────────────────────────────────┤
│ vps-i1       │ Monitoring (Prometheus+Thanos+Grafana+Alertmanager)  │
│ IONOS 8GB    │ Caddy TLS · Traccar GPS · OpenClaw                   │
│ AlmaLinux    │ GH Actions runner · AI-Dev-IO1                       │
├──────────────┼──────────────────────────────────────────────────────┤
│ vps-h1       │ → DECOMMISSION after WAHA migration                  │
│ Hostinger 8G │ (saves ~10€/month)                                   │
├──────────────┼──────────────────────────────────────────────────────┤
│ bms-1        │ Pinbox24 production (v31/v32/v41/v42)                │
│ OVH 32GB     │ nginx-proxy · portainer · mailgun                    │
│ → Ubuntu24   │ pdf-gen · wkhtml · git-deploy · AWS ECR              │
├──────────────┼──────────────────────────────────────────────────────┤
│ bms-2        │ MongoDB rs0 SECONDARY (observer, non-voting)         │
│ OVH 32GB     │ AI-Dev-OV1 (4 Claude agents)                         │
│ Ubuntu 24    │ MongoDB backup runner                                │
├──────────────┼──────────────────────────────────────────────────────┤
│ bms-3        │ MongoDB rs0 PRIMARY                                  │
│ OVH 32GB     │ Pinbox24 staging · traccar · mt5                     │
│ → Ubuntu24   │ (staging should move to bms-4 long-term)             │
├──────────────┼──────────────────────────────────────────────────────┤
│ bms-4        │ MongoDB rs0 ARBITER (~75 MB)                         │
│ OVH 32GB     │ Traefik TLS · n8n + PostgreSQL (migrated from h1)    │
│ 1.8TB disk   │ WAHA WhatsApp (migrated from h1)                     │
│ Ubuntu 22    │ Gotenberg + pdf-to-jpg (new, replaces Convertio)     │
│              │ AI-Dev-BMS4 (Claude agent, max 4 parallel)           │
│              │ node-exporter · cadvisor                             │
└──────────────┴──────────────────────────────────────────────────────┘
Key Risks
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| bms-1 disk full → production crash | Imminent | Critical | Emergency cleanup + disk expansion |
| bms-3 OOM → MongoDB PRIMARY lost | High | Critical | Set WiredTiger cache limit NOW |
| Untagged images lost on restart | High | High | Export to Wasabi immediately |
| MongoDB 4+ month data loss on bms-3 failure | High | Critical | Automated daily backup NOW |
| vps-h1 n8n crash takes down WAHA | Medium | High | Accelerate bms-4 migration |
| bms-1 security breach (Ubuntu 20.04 EOL) | Medium | Critical | Migration plan + WAF in front |
| rs0 quorum loss if arbiter not joined | Current | High | Human action: rs.addArb() |
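The quorum risk in the last row follows from how replica set elections count votes: a primary needs a strict majority of voting members, arbiters vote, and non-voting members such as the bms-2 observer do not. The sketch below works through that arithmetic; the function names are hypothetical, and it makes no claim about the actual rs0 configuration beyond what this document states.

```python
def majority(voting_members: int) -> int:
    """Votes required to elect a primary: a strict majority of voting members."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members: int) -> int:
    """How many voting members can be lost while a primary can still be elected."""
    return voting_members - majority(voting_members)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(
            f"{n} voting member(s): majority={majority(n)}, "
            f"tolerates {fault_tolerance(n)} failure(s)"
        )
```

By this arithmetic a set needs at least three voting members before it can survive the loss of one, which is worth keeping in mind when deciding the final membership after `rs.addArb()` and `rs.remove()`.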
Generated by Claude Code (5 parallel evaluation agents) · 2026-06-14