Watchdog scripts, security options, PHP-FPM monitoring, and troubleshooting guides. These aren't needed for a basic install — but they're here when something goes wrong or you want to harden and automate further.
An optional monitor that runs every 5 minutes. If your site or tunnel goes down, it automatically recovers through three escalating phases before rebooting the Pi as a last resort.
Set CF_WATCHDOG_ENABLED=true in config.env and run bash install.sh --watchdog. That's it.
ntfy.sh alerts fire at every stage: first failure, each phase escalation, recovery, and a "manual needed" alert when the rate limit blocks a reboot.
Phase 3 reboots are capped to once every 6 hours. If the rate limit is hit, an alert fires instead and the watchdog exits cleanly.
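The 6-hour cap could be enforced with a simple timestamp comparison. The following is a minimal sketch, not the watchdog's actual code: the function name `reboot_allowed` and the idea of passing the last reboot time as an epoch value are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a reboot rate limit; names are illustrative only.

REBOOT_MIN_INTERVAL=$((6 * 3600))   # 6 hours in seconds, matching the cap above

# reboot_allowed LAST_REBOOT_EPOCH NOW_EPOCH
# Returns 0 (allowed) if at least REBOOT_MIN_INTERVAL seconds have passed
# since the last watchdog-triggered reboot, 1 (rate limited) otherwise.
# An empty or zero LAST_REBOOT_EPOCH is treated as "never rebooted".
reboot_allowed() {
    local last="${1:-0}" now="$2"
    [ "$last" -eq 0 ] && return 0
    [ $((now - last)) -ge "$REBOOT_MIN_INTERVAL" ]
}

# Example use (the caller decides where the last timestamp is stored):
#   if reboot_allowed "$last_reboot" "$(date +%s)"; then
#       echo "rebooting"
#   else
#       echo "rate limited: send alert and exit"
#   fi
```

Keeping the decision in a pure function, separate from the reboot itself, makes the rate limit easy to check without touching the machine.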
When all PHP-FPM workers are exhausted, WordPress serves 504 errors — and WP-Cron goes silent at the worst moment. This monitor runs as a host cron (not inside Docker), so it fires regardless of PHP-FPM state.
Set FPM_AUTO_RESTART=true to automatically restart the WordPress container when saturation is detected. An ntfy notification confirms the restart. A cooldown (FPM_RESTART_COOLDOWN, default 20 min) prevents restart loops. Configure both in the Devtools plugin (Debug AI → PHP-FPM Monitor) or directly in config.env.
After a backup, the DB lock process is killed from inside the container so no orphaned SLEEP(86400) connections survive. The monitor also detects and kills any that slip through, with a 30-min alert cooldown.
The CloudScale Cyber & Devtools plugin (Debug AI tab → PHP-FPM Monitor) shows last saturation event, reason, and a pre-filled config.env snippet. Set FPM_CALLBACK_URL and FPM_CALLBACK_TOKEN to enable.
# config.env
FPM_SATURATION_THRESHOLD=3     # checks before alerting
FPM_WP_CONTAINER=pi_wordpress
FPM_DB_CONTAINER=pi_mariadb
FPM_AUTO_RESTART=true          # restart container automatically
FPM_RESTART_COOLDOWN=1200      # seconds between restarts (20 min)
FPM_ALERT_COOLDOWN=1800        # seconds between repeat alerts
Install: crontab -e → add: * * * * * /home/pi/pi2s3/fpm-saturation-monitor.sh 2>/dev/null
Five layers that catch failures before they become disasters.
Each partition image is encrypted with GPG AES-256 before it leaves your Pi. Even full S3 bucket access is useless without the passphrase, which lives only in config.env, never in S3.
# config.env
BACKUP_ENCRYPTION_PASSPHRASE="my-strong-passphrase"
Requires: sudo apt install gpg. Restore script detects encryption automatically.
Limit upload speed so nightly backups don't saturate your home or office connection. Uses pv in the upload pipeline. No AWS config changes needed.
# config.env
AWS_TRANSFER_RATE_LIMIT="2m"   # 2 MB/s
Requires: sudo apt install pv. Use 500k, 2m, or 1g format.
After every upload, the script re-lists S3 to confirm every uploaded file is non-zero bytes. Silent upload failures are caught immediately and reported via push notification.
# config.env (default: true)
BACKUP_AUTO_VERIFY=true
Verification result is included in the ntfy success notification.
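The zero-byte check can be pictured as a filter over an S3 listing. This is a sketch under assumptions, not the script's real implementation: it assumes the listing comes from `aws s3 ls --recursive`, which prints date, time, size in bytes, and key on each line, and the helper name `zero_byte_keys` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper; illustrates the "every file is non-zero bytes" check.

# zero_byte_keys: read `aws s3 ls --recursive` style lines on stdin
# (date, time, size in bytes, key) and print the key of every object
# whose size is 0. Prints nothing when all objects are non-empty.
zero_byte_keys() {
    awk 'NF >= 4 && $3 == 0 {
        # The key starts at field 4 and may itself contain spaces.
        key = $4
        for (i = 5; i <= NF; i++) key = key " " $i
        print key
    }'
}

# Example use (bucket and prefix are placeholders):
#   aws s3 ls "s3://your-bucket/your-prefix/" --recursive | zero_byte_keys
```

Any key printed by the filter would indicate an upload that completed without data and should trigger the failure notification.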
Before stopping Docker, the script checks that all containers are healthy, that free disk space exceeds the threshold, and that dmesg shows no recent I/O errors. If a check fails, it warns (or aborts, when PREFLIGHT_ABORT_ON_WARN=true) before touching your running stack.
# config.env
PREFLIGHT_MIN_FREE_MB=500
PREFLIGHT_ABORT_ON_WARN=false
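As an illustration of how the two settings might interact, here is a minimal sketch of the free-space part of the preflight check. The function names `free_mb` and `preflight_disk` are hypothetical, and the mapping of PREFLIGHT_ABORT_ON_WARN to "abort instead of warn" is an assumption based on the variable name.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the disk-space preflight check.

# free_mb PATH: print free space in MB on the filesystem that holds PATH.
free_mb() {
    df -Pm "$1" | awk 'NR == 2 { print $4 }'
}

# preflight_disk FREE_MB MIN_FREE_MB ABORT_ON_WARN
# Prints a verdict and returns 0 to continue or 1 to abort the backup.
preflight_disk() {
    local free="$1" min="$2" abort="$3"
    if [ "$free" -ge "$min" ]; then
        echo "OK: ${free} MB free (minimum ${min} MB)"
        return 0
    fi
    echo "WARN: only ${free} MB free (minimum ${min} MB)"
    # Assumed behaviour: a warning only stops the run when abort is "true".
    [ "$abort" = "true" ] && return 1
    return 0
}

# Example use:
#   preflight_disk "$(free_mb /)" "$PREFLIGHT_MIN_FREE_MB" "$PREFLIGHT_ABORT_ON_WARN" || exit 1
```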
A separate daily cron job checks S3 for a recent backup and sends a push alert if none is found within STALE_BACKUP_HOURS. Catches silent cron failures, including the Pi being offline.
# config.env (default: true)
STALE_CHECK_ENABLED=true
STALE_BACKUP_HOURS=25
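The staleness test itself reduces to an age comparison. The sketch below assumes the timestamp of the newest backup has already been obtained from S3 as an epoch value; the function name `backup_is_stale` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the stale-backup decision.

# backup_is_stale LAST_BACKUP_EPOCH NOW_EPOCH MAX_AGE_HOURS
# Returns 0 (stale, send an alert) when the newest backup is older than
# MAX_AGE_HOURS, and 1 (fresh) otherwise.
backup_is_stale() {
    local last="$1" now="$2" max_hours="$3"
    [ $((now - last)) -gt $((max_hours * 3600)) ]
}

# Example use:
#   if backup_is_stale "$newest_backup_epoch" "$(date +%s)" "$STALE_BACKUP_HOURS"; then
#       echo "no recent backup found: send push alert"
#   fi
```

With a nightly schedule, a 25-hour threshold leaves one hour of slack before a missed run is reported.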
An on_exit trap guarantees containers are restarted even if the backup script crashes mid-imaging. A separate post-backup cron job (30 min after backup) runs a second check to confirm containers came back up.
# config.env (default: true)
POST_BACKUP_CHECK_ENABLED=true
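The crash-trap pattern can be sketched with a shell EXIT trap. This is an illustration of the general technique rather than the script's own code: `stop_stack`, `start_stack`, `image_partitions`, and the `SIMULATE_FAILURE` switch are placeholders invented for the example.

```bash
#!/usr/bin/env bash
# Sketch of an EXIT trap that always restarts services. All names below
# are placeholders; the real script's functions may differ.

stop_stack()       { echo "stack stopped"; }
start_stack()      { echo "stack started"; }
# Fails when SIMULATE_FAILURE=1 so the crash path can be exercised.
image_partitions() { echo "imaging"; return "${SIMULATE_FAILURE:-0}"; }

# The body runs in a subshell (note the parentheses) so the trap is
# scoped to a single backup run.
run_backup() (
    on_exit() {
        status=$?
        start_stack          # always bring services back, success or failure
        exit "$status"       # preserve the original exit status
    }
    trap on_exit EXIT

    stop_stack || exit 1
    image_partitions || exit 1
)
```

Because the trap fires on every exit path, the restart step runs whether imaging succeeds or fails, and the caller still sees the original exit status.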
Not running Docker? Stop any service (MariaDB, nginx, php-fpm) before imaging and restart it after. Works for native WordPress, Samba, or any systemd service. The crash trap also runs POST_BACKUP_CMD on failure, so services always come back up.
# native WordPress example
PRE_BACKUP_CMD="systemctl stop nginx php8.2-fpm mariadb"
POST_BACKUP_CMD="systemctl start mariadb php8.2-fpm nginx"
For MariaDB/MySQL setups, pi2s3 issues FLUSH TABLES WITH READ LOCK instead of stopping Docker. The lock is held only while sync and drop_caches flush InnoDB dirty pages to disk, typically under 10 seconds. Once it is released, writes resume immediately and partclone continues imaging in the background. A background probe pings your site every 60 seconds to confirm it stays up.
# config.env
DB_CONTAINER="auto"        # auto-detect
PROBE_LATEST_POST=true     # probe latest WP post
Falls back to STOP_DOCKER automatically if no DB container is found or the lock fails.
Optional. Runs every 5 minutes as root. If your site or Cloudflare tunnel goes down, it recovers automatically through three escalating phases. No SSH needed.
| Phase | Trigger | Action |
|---|---|---|
| Phase 1 (0–20 min) | First failure detected | Restart cloudflared + start stopped containers |
| Phase 2 (20–40 min) | Still down after 4 attempts | Full docker compose down/up + cloudflared restart |
| Phase 3 (40+ min) | Still down after 8 attempts | Reboot Pi (rate-limited: max once per 6 hours) |
Three checks run every cycle: are any Docker containers stopped, is the local HTTP probe returning 5xx or failing to connect, and is cloudflared reporting ha_connections > 0? Any failed check starts the escalation ladder. Recovery is confirmed by re-running all checks after each action.
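The escalation ladder in the table above can be expressed as a mapping from consecutive failed cycles to a phase. This is a sketch under assumptions: the function name `watchdog_phase` is hypothetical, and the exact boundaries (attempts 1 to 4 in Phase 1, 5 to 8 in Phase 2, 9 and later in Phase 3) are one reading of the table at a 5-minute interval.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the phase selection; boundaries are an assumption.

# watchdog_phase CONSECUTIVE_FAILURES
# Maps the number of consecutive failed cycles (one cycle every 5 minutes)
# to an escalation phase. Prints 0 when the site is healthy.
watchdog_phase() {
    local failures="$1"
    if   [ "$failures" -le 0 ]; then echo 0   # healthy, nothing to do
    elif [ "$failures" -le 4 ]; then echo 1   # restart cloudflared, start stopped containers
    elif [ "$failures" -le 8 ]; then echo 2   # docker compose down/up, restart cloudflared
    else                             echo 3   # reboot, subject to the 6-hour rate limit
    fi
}
```

Resetting the failure counter to zero after a successful re-check returns the ladder to its starting point.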
Push notifications via ntfy at every stage: first failure, each phase escalation, recovery, and stuck-down alert. Pre-reboot diagnostics dumped to /var/log/pi2s3-watchdog-prediag.log.
# config.env
CF_WATCHDOG_ENABLED=true
CF_SITE_HOSTNAME="yoursite.com"
CF_HTTP_PORT=80
# Then install:
bash ~/pi2s3/install.sh --watchdog
The three errors that account for most first-run failures.
The installer tries to install pigz automatically. If it fails, install manually:
sudo apt install pigz
The backup falls back to single-threaded gzip if pigz is absent. It will still work, just slower.
The preflight check will print: Cannot reach s3://…. Verify credentials on the Pi:
aws s3 ls s3://your-bucket --region af-south-1
Confirm ~/.aws/credentials exists, or that AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are set in the environment. Check the IAM policy includes s3:ListBucket on the bucket ARN (not just the /* path).
Error: partclone: device size mismatch or filesystem not found. Check the detected device:
lsblk # confirm nvme0n1 or mmcblk0
Set BOOT_DEV=/dev/mmcblk0 explicitly in config.env if auto-detection picks the wrong device.
Pi 5 requires a 27W (5.1V / 5A) USB-C power supply. A marginal PSU or long USB-C cable causes the restore to crash 30–90 seconds in. The restore script warns at startup if voltage is low:
vcgencmd get_throttled # 0x0 = clean
Use the official Pi 5 PSU (SC1159) or any USB-C PD charger that negotiates 5V/5A. A shorter, thicker USB-C cable helps if the PSU is marginal.
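Any value other than 0x0 from `vcgencmd get_throttled` is a bit field. The sketch below decodes it using the bit meanings from the Raspberry Pi documentation; the function name `decode_throttled` is hypothetical and is not part of pi2s3.

```bash
#!/usr/bin/env bash
# Hypothetical helper for reading a get_throttled value by hand.

# decode_throttled HEX: print one line per flag set in a
# `vcgencmd get_throttled` value such as 0x50005. Prints nothing for 0x0.
decode_throttled() {
    local value=$(( $1 ))
    [ $((value & 0x1))     -ne 0 ] && echo "under-voltage now"
    [ $((value & 0x2))     -ne 0 ] && echo "ARM frequency capped now"
    [ $((value & 0x4))     -ne 0 ] && echo "throttled now"
    [ $((value & 0x8))     -ne 0 ] && echo "soft temperature limit active"
    [ $((value & 0x10000)) -ne 0 ] && echo "under-voltage has occurred"
    [ $((value & 0x20000)) -ne 0 ] && echo "ARM frequency capping has occurred"
    [ $((value & 0x40000)) -ne 0 ] && echo "throttling has occurred"
    [ $((value & 0x80000)) -ne 0 ] && echo "soft temperature limit has occurred"
    return 0
}

# Example use on the Pi (vcgencmd prints "throttled=0x..."):
#   decode_throttled "$(vcgencmd get_throttled | cut -d= -f2)"
```

The low bits describe the current state and the bits from 16 upward record events since boot, so a value with only high bits set points to a marginal supply rather than an active fault.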
Something went wrong? One command tells you exactly what broke and exactly how to fix it. Every failure prints a colour-coded verdict and a copy-paste fix command. A summary at the end lists every issue — no scrolling required.
sudo bash extras/diagnose-restore.sh
Saves full output to /var/log/pi2s3-diagnose-TIMESTAMP.log. Share that file when reporting an issue on GitHub.
If the Pi shows a solid red LED (no green ACT blink) after a restore, the SD card's cmdline.txt likely has a missing or wrong PARTUUID.
Run this on your Mac with the SD card inserted:
bash extras/recover-sd-boot.sh
Auto-detects the SD card at /Volumes/bootfs, shows the broken cmdline.txt, restores from the automatic backup if present, and prints step-by-step recovery instructions.
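To inspect the boot parameter by hand, the PARTUUID can be pulled out of cmdline.txt with standard tools. This is a minimal sketch, not what recover-sd-boot.sh does internally; the function name `cmdline_partuuid` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper for checking cmdline.txt manually.

# cmdline_partuuid FILE: print the value of root=PARTUUID=... from a
# cmdline.txt file, or return 1 if the parameter is missing.
cmdline_partuuid() {
    local uuid
    # cmdline.txt is a single line of space-separated parameters.
    uuid=$(tr ' ' '\n' < "$1" | sed -n 's/^root=PARTUUID=//p' | head -n 1)
    [ -n "$uuid" ] || return 1
    echo "$uuid"
}

# Example use on a Mac with the SD card mounted:
#   cmdline_partuuid /Volumes/bootfs/cmdline.txt
# On the Pi, compare the result with the root partition's actual value:
#   lsblk -no PARTUUID /dev/nvme0n1p2
```

A non-zero exit status means the parameter is absent, and a printed value that differs from the root partition's PARTUUID means it points at the wrong partition.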
pi2s3 restore diagnostic
Generated: Mon Apr 28 08:00:00 SAST 2026
Host: andrewninja-pi-5
Uptime: up 2 hours, 14 minutes
────────────────────────────────────────────────────────────────
1. POWER & VOLTAGE
────────────────────────────────────────────────────────────────
get_throttled = 0x0
[OK] No under-voltage
[OK] CPU not throttled
[OK] Temperature OK
[OK] No historical under-voltage
core volt=0.8688V
temp=48.3'C
────────────────────────────────────────────────────────────────
3. WIFI & NETWORK
────────────────────────────────────────────────────────────────
[OK] Connected to 'Baker' — matches config.env
[OK] 'Baker' (SSID=Baker) — password OK (len=12)
────────────────────────────────────────────────────────────────
4. INTERNET & AWS REACHABILITY
────────────────────────────────────────────────────────────────
[OK] Gateway (192.168.0.1) — 0% packet loss
[OK] Internet (8.8.8.8) — 0% packet loss
[OK] AWS S3 (af-south-1) (s3.af-south-1.amazonaws.com:443)
[OK] s3://mybucket/ accessible (profile=personal)
────────────────────────────────────────────────────────────────
8. BOOT CONFIG
────────────────────────────────────────────────────────────────
[OK] root=PARTUUID=21b5924d-0d80-4332-a32c-3b24a6bde370 → /dev/nvme0n1p2
[OK] rootdelay present in cmdline.txt
────────────────────────────────────────────────────────────────
12. CLOUDFLARE TUNNEL
────────────────────────────────────────────────────────────────
[OK] cloudflared 2026.3.0
[OK] cloudflared.service active=active enabled=enabled
[OK] Credentials file present: /etc/cloudflared/84c2c36c-....json
────────────────────────────────────────────────────────────────
SUMMARY
────────────────────────────────────────────────────────────────
Ran in 38s. Report: /var/log/pi2s3-diagnose-20260428_080000.log
All checks passed — no issues found.
────────────────────────────────────────────────────────────────
1. POWER & VOLTAGE
────────────────────────────────────────────────────────────────
get_throttled = 0x50005
[FAIL] UNDER-VOLTAGE RIGHT NOW — restore WILL crash
fix: Use official Pi 5 PSU (5.1V / 5A / 27W, SC1159). Short/thin cable also drops voltage.
[WARN] Under-voltage event(s) since boot — PSU is marginal
fix: Upgrade to official Pi 5 PSU or 27W+ USB-C PD charger
────────────────────────────────────────────────────────────────
3. WIFI & NETWORK
────────────────────────────────────────────────────────────────
[FAIL] Connected to 'C_WIFI' but config.env WIFI_SSID='Baker'
Visible networks:
Baker signal=72
C_WIFI signal=68
fix: Update WIFI_SSID in config.env to 'C_WIFI', or connect to the right network:
fix: sudo nmcli dev wifi connect "Baker" password "<password>"
────────────────────────────────────────────────────────────────
SUMMARY
────────────────────────────────────────────────────────────────
Ran in 41s. Report: /var/log/pi2s3-diagnose-20260428_081200.log
2 issue(s) found:
✗ UNDER-VOLTAGE RIGHT NOW — restore WILL crash
△ Under-voltage event(s) since boot — PSU is marginal
✗ Connected to 'C_WIFI' but config.env WIFI_SSID='Baker'