WAHA Incident Router — Shadow Cutover Procedure

Status: Ready for execution
Worker: waha.infra.zintegrowana.online (live)
Shadow target: n8n wa-router workflow (vps-h1)
Owner: Radek Konarski
Related: waha-incident-router.md, waha.md


Overview

The shadow cutover runs the production Worker and the legacy n8n wa-router in parallel for 7 days. WAHA forwards every incoming WhatsApp message to both receivers (dual webhook). The legacy receiver operates in log-only (shadow) mode: it must not write incidents to Supabase or send WhatsApp replies. After the 7-day validation window, and once the completion criteria in Step 4 are met, the n8n shadow webhook is removed.


Prerequisites

Before starting the dual-webhook phase, verify:

  • Worker is live and processing messages at waha.infra.zintegrowana.online
  • At least one real incident has been created via the Worker (confirming Supabase write path works)
  • Grafana WAHA Incident Router dashboard is visible and panels load without errors
  • n8n wa-router workflow is updated to shadow mode (no Supabase writes, no WA replies — log only)
  • WAHA session "default" on vps-h1 is active and authenticated
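
The session prerequisite can be checked from the shell on vps-h1. The following is a minimal sketch, assuming the session JSON returned by WAHA carries a top-level "status" field and that WORKING denotes an authenticated session. The function names session_status and session_is_working are illustrative and not part of WAHA.

```shell
# Sketch: extract the session status from WAHA session JSON read on stdin.
# Assumption: the JSON has a top-level "status" field (e.g. WORKING).
session_status() {
  python3 -c 'import json, sys; print(json.load(sys.stdin).get("status", "UNKNOWN"))'
}

# Returns 0 only when the session reports WORKING.
session_is_working() {
  [ "$(session_status)" = "WORKING" ]
}

# Usage on vps-h1 (not executed here):
#   curl -s -H "X-Api-Key: ${WAHA_API_KEY}" \
#     http://localhost:13000/api/sessions/default | session_is_working \
#     && echo "session OK" || echo "session NOT ready"
```

Any status other than WORKING is treated as not ready, so the check fails closed if the field is missing or the session needs re-authentication.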

Step 1 — Enable n8n shadow mode

Open the n8n wa-router workflow and disable any nodes that write to Supabase or send WhatsApp messages. Add a Function node at the start that logs the payload to n8n execution history and returns without further processing.

Validate by sending a test message and confirming no Supabase row is created by n8n, while the Worker processes it normally.


Step 2 — Configure dual webhook in WAHA

WAHA supports multiple webhooks per session. Add the n8n webhook URL as a secondary receiver.

# SSH to vps-h1
ssh -i C:\Users\konar\.ssh\id_ed25519 root@72.60.32.61
 
# Check current WAHA webhook config (send the API key, as the PATCH call below does)
curl -s -H "X-Api-Key: ${WAHA_API_KEY}" http://localhost:13000/api/sessions/default | python3 -m json.tool | grep -A 20 webhook
 
# Add secondary webhook (n8n endpoint).
# The JSON body is passed via an unquoted heredoc so the shell expands ${WAHA_HMAC_SECRET};
# inside a single-quoted string the variable name would be sent literally as the HMAC key.
curl -X PATCH http://localhost:13000/api/sessions/default \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: ${WAHA_API_KEY}" \
  -d @- <<EOF
{
  "config": {
    "webhooks": [
      {
        "url": "https://waha.infra.zintegrowana.online/webhook/waha",
        "events": ["message"],
        "hmac": { "key": "${WAHA_HMAC_SECRET}" }
      },
      {
        "url": "https://n8n.vps-h1.infra.zintegrowana.online/webhook/wa-router-shadow",
        "events": ["message"]
      }
    ]
  }
}
EOF

Verify both webhooks are registered:

curl -s -H "X-Api-Key: ${WAHA_API_KEY}" http://localhost:13000/api/sessions/default | python3 -m json.tool | grep -A 5 url

Expected output: two URLs listed under webhooks.
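
Reading the grep output by eye is easy to get wrong, so the same check can be scripted. This is a minimal sketch, assuming the session JSON mirrors the PATCH body and exposes the receivers under config.webhooks[].url. The helper names webhook_urls and expect_webhook_count are hypothetical.

```shell
# Sketch: list the webhook URLs found in WAHA session JSON read on stdin.
# Assumption: the URLs live under config.webhooks[].url, mirroring the PATCH body.
webhook_urls() {
  python3 -c '
import json, sys
session = json.load(sys.stdin)
for hook in (session.get("config") or {}).get("webhooks") or []:
    print(hook.get("url", ""))
'
}

# Succeeds only when exactly $1 webhooks with a non-empty URL are registered.
expect_webhook_count() {
  count=$(webhook_urls | grep -c . || true)
  [ "$count" -eq "$1" ]
}

# Usage on vps-h1 (not executed here):
#   curl -s -H "X-Api-Key: ${WAHA_API_KEY}" \
#     http://localhost:13000/api/sessions/default | expect_webhook_count 2
```

The same helper can be called with 1 after the shadow webhook is removed or after a rollback.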


Step 3 — Validation checklist (run daily, days 1–7)

Each day during the shadow period, verify:

Check                                               | Expected                                           | Pass?
----------------------------------------------------+----------------------------------------------------+------
New incidents appearing in dev_r_incidents (Worker) | Yes, for each real message                         |
No duplicate incidents from n8n                     | Zero rows with source = 'n8n' in 24h window        |
No n8n WA replies sent to group                     | Confirm in WAHA sent-messages log                  |
Worker error rate                                   | 0 errors in Grafana WAHA Incident Router dashboard |
Thread linking correct                              | Thread starters produce new dev_r_incidents row    |
MTTR values reasonable                              | Non-null for resolved incidents                    |

SQL to check for n8n-sourced incidents (the returned count should be 0):

SELECT COUNT(*) FROM dev_r_incidents
WHERE source = 'n8n'
  AND opened_at >= NOW() - INTERVAL '24 hours';

SQL to compare the Worker's hourly message counts against expected throughput:

SELECT
  date_trunc('hour', created_at) AS hour,
  COUNT(*) AS messages,
  COUNT(*) FILTER (WHERE thread_id IS NOT NULL) AS linked,
  COUNT(*) FILTER (WHERE is_thread_starter = true) AS thread_starters
FROM whatsapp_messages
WHERE created_at >= NOW() - INTERVAL '24 hours'
GROUP BY 1
ORDER BY 1;
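
Hours with no messages do not appear in this result at all, so gaps are easy to miss when scanning the output. The sketch below flags them, assuming the query is adapted to emit the hour as epoch seconds (for example extract(epoch from date_trunc('hour', created_at))::bigint) and is run with psql -At, which yields one "epoch|count" line per hour. The function name find_gaps is illustrative.

```shell
# Sketch: report gaps in an hourly series read on stdin.
# Assumed input: one "epoch_seconds|count" line per hour, sorted ascending.
find_gaps() {
  awk -F'|' '
    NR > 1 && $1 - prev > 3600 {
      printf "gap: %d missing hour(s) after epoch %s\n", ($1 - prev) / 3600 - 1, prev
    }
    { prev = $1 }
  '
}

# Example with three rows, where the hours at 7200 and 10800 are absent:
#   printf '0|5\n3600|2\n14400|1\n' | find_gaps
```

A reported gap is only a prompt to investigate: quiet hours with no real WhatsApp traffic produce the same output as dropped messages.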

Step 4 — Cutover completion criteria

All of the following must be true before removing the n8n shadow webhook:

  • 7 days elapsed since dual webhook was enabled
  • Zero n8n-sourced incidents in Supabase during shadow period
  • Zero unintended n8n WA replies sent
  • Worker processed all messages (no gaps in whatsapp_messages hourly counts)
  • MTTR data present for at least 3 resolved incidents
  • Grafana dashboard shows clean data with no datasource errors
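
The measurable criteria above can be evaluated in one place before the webhook is removed. This is a minimal sketch: the function name cutover_ready and its argument order are hypothetical, the numbers are gathered by hand or from the validation queries, and the Grafana criterion stays a manual check.

```shell
# Sketch: evaluate the measurable completion criteria.
# Arguments (all integers):
#   $1 days elapsed              $2 n8n-sourced incidents
#   $3 unintended n8n replies    $4 unexplained gap hours
#   $5 resolved incidents with MTTR data
cutover_ready() {
  unmet=""
  [ "$1" -ge 7 ] || unmet="${unmet}shadow period shorter than 7 days; "
  [ "$2" -eq 0 ] || unmet="${unmet}n8n-sourced incidents found; "
  [ "$3" -eq 0 ] || unmet="${unmet}n8n WA replies sent; "
  [ "$4" -eq 0 ] || unmet="${unmet}gaps in hourly message counts; "
  [ "$5" -ge 3 ] || unmet="${unmet}fewer than 3 resolved incidents with MTTR; "
  if [ -n "$unmet" ]; then
    echo "NOT READY: ${unmet}"
    return 1
  fi
  echo "READY"
}

# Example: 7 days elapsed, no n8n activity, no gaps, 4 resolved incidents.
#   cutover_ready 7 0 0 0 4
```

Listing every unmet criterion, rather than stopping at the first, gives a complete picture in a single run.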

Step 5 — Remove shadow webhook

Once all criteria are met:

# SSH to vps-h1
ssh -i C:\Users\konar\.ssh\id_ed25519 root@72.60.32.61
 
# Remove the n8n webhook; keep only the production Worker webhook.
# The unquoted heredoc lets the shell expand ${WAHA_HMAC_SECRET};
# inside a single-quoted string the variable name would be sent literally as the HMAC key.
curl -X PATCH http://localhost:13000/api/sessions/default \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: ${WAHA_API_KEY}" \
  -d @- <<EOF
{
  "config": {
    "webhooks": [
      {
        "url": "https://waha.infra.zintegrowana.online/webhook/waha",
        "events": ["message"],
        "hmac": { "key": "${WAHA_HMAC_SECRET}" }
      }
    ]
  }
}
EOF

# Confirm single webhook
curl -s -H "X-Api-Key: ${WAHA_API_KEY}" http://localhost:13000/api/sessions/default | python3 -m json.tool | grep -A 5 url

Step 6 — Post-cutover

  • Disable or archive the n8n wa-router workflow (do not delete it; keep it for reference for 30 days)
  • Update docs/waha.md to note n8n wa-router is archived
  • Add entry to docs/secrets-rotation-log.md noting cutover completion date
  • Close GitHub issue #113

Rollback procedure

If the Worker fails during the shadow period:

  1. Update WAHA session to remove the Worker webhook and set n8n as the sole receiver:
curl -X PATCH http://localhost:13000/api/sessions/default \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: ${WAHA_API_KEY}" \
  -d '{
    "config": {
      "webhooks": [
        {
          "url": "https://n8n.vps-h1.infra.zintegrowana.online/webhook/wa-router",
          "events": ["message"]
        }
      ]
    }
  }'
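  2. Re-enable all Supabase write nodes and WhatsApp reply nodes in the n8n wa-router workflow, and remove the log-only Function node added in Step 1.
  3. Open a GitHub issue describing the failure before investigating the Worker.
  4. Once the root cause is resolved, re-enable the Worker by restoring its webhook in the WAHA session config (reverting rollback step 1 above).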

Contacts

Role           | Name               | Contact
---------------+--------------------+------------------------
Owner          | Radek Konarski     | radieu@gmail.com
WhatsApp group | DE transport group | see WAHA session config