Disaster Recovery for SMBs: Minimizing Downtime

IT outages are expensive: a frequently cited estimate puts the cost at around €5,600 per minute. Whether the cause is a server failure, ransomware, a hardware defect, or a natural disaster, the question is not if but when an incident occurs. Disaster recovery (DR) defines how quickly systems are restored and how much data loss is acceptable. A well-thought-out DR plan makes the difference between a manageable incident and an existential crisis.

Two Metrics That Determine Everything

RPO — Recovery Point Objective

RPO defines the maximum tolerable data loss. An RPO of 1 hour means that, in the event of an outage, at most the last hour’s data may be lost.

  • RPO = 0: No data loss tolerated → Synchronous replication, cluster
  • RPO = 1 hour: Hourly backups or snapshots
  • RPO = 24 hours: Daily backup sufficient
  • RPO = 1 week: Weekly backup (for archive data)

RTO — Recovery Time Objective

RTO defines the maximum tolerable downtime. An RTO of 4 hours means that systems must be operational again within 4 hours of the outage.

  • RTO = 0: No downtime tolerated → High availability cluster
  • RTO = 1 hour: Hot standby systems
  • RTO = 4 hours: Restore from local backup
  • RTO = 24 hours: Restore from offsite backup
  • RTO = 48+ hours: Hardware procurement required

RPO and RTO together determine the cost of the DR strategy. The shorter both values, the more expensive the infrastructure.
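
As a rough illustration of how an RPO translates into backup frequency, the sketch below checks whether a given backup interval can meet an RPO target. The function names are hypothetical, and the model (a failure just before the next backup completes is the worst case) is a simplifying assumption.

```python
def worst_case_data_loss_hours(backup_interval_hours: float,
                               backup_duration_hours: float = 0.0) -> float:
    """Worst case: the failure hits just before the next backup completes,
    so everything since the start of the previous backup is lost."""
    return backup_interval_hours + backup_duration_hours


def meets_rpo(rpo_hours: float,
              backup_interval_hours: float,
              backup_duration_hours: float = 0.0) -> bool:
    """True if the worst-case data loss stays within the RPO."""
    loss = worst_case_data_loss_hours(backup_interval_hours, backup_duration_hours)
    return loss <= rpo_hours


print(meets_rpo(rpo_hours=24, backup_interval_hours=24))  # daily backup vs. 24 h RPO
print(meets_rpo(rpo_hours=1, backup_interval_hours=24))   # daily backup vs. 1 h RPO
```

Note that a backup that takes a noticeable time to complete eats into the RPO budget, which is one reason instant snapshots are attractive for short RPOs.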

Risks and Their Impact

Risk                         | Likelihood  | Typical Downtime | Data Loss
Hardware failure (disk, PSU) | High        | 2–8 hours        | Low (with RAID)
Ransomware                   | Medium-High | 1–7 days         | High (without offsite backup)
Software bug/update          | Medium      | 1–4 hours        | Low
Power outage                 | Medium      | 0–2 hours        | Minimal (with UPS)
Human error                  | Medium      | 1–24 hours       | Variable
Fire/water damage            | Low         | 1–4 weeks        | Total (without offsite)
Provider outage              | Low         | 2–24 hours       | None

Disaster Recovery Measures by Tier

Tier 1: Basic Protection (RPO 24h, RTO 24h)

The absolute baseline every business should implement:

Daily backup with offsite copy:

  • Proxmox Backup Server backs up all VMs and containers incrementally daily
  • TrueNAS replication transfers ZFS snapshots to a second location
  • Restore testing at least quarterly

UPS (Uninterruptible Power Supply):

  • Servers and network equipment on UPS
  • Minimum 15 minutes bridging time for clean shutdown
  • Automatic shutdown on extended outage

RAID redundancy:

  • No single-disk systems in production
  • RAID-Z2 or mirror for all drives
  • Spare drives (hot spare or in stock) available

Cost: Minimal — Proxmox Backup Server and TrueNAS are open source; hardware for a backup system starting at approximately €2,000.

Tier 2: Extended (RPO 1h, RTO 4h)

For businesses whose operations depend on IT:

Hourly snapshots:

  • ZFS snapshots on the production system (instant, space-efficient)
  • Snapshot retention: hourly for 24h, daily for 30 days, weekly for 12 months
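
The retention scheme above can be sketched as a small pruning function. This is an illustrative model only, not how TrueNAS or ZFS implement retention; the function name and the choice to keep the newest snapshot per calendar day and per ISO week are assumptions.

```python
from datetime import datetime, timedelta


def snapshots_to_keep(snapshots: list[datetime], now: datetime) -> set[datetime]:
    """Return the snapshots retained under an hourly/daily/weekly scheme."""
    keep: set[datetime] = set()
    days_seen = set()
    weeks_seen = set()
    for snap in sorted(snapshots, reverse=True):  # newest first
        age = now - snap
        if age <= timedelta(hours=24):
            keep.add(snap)  # every snapshot from the last 24 hours
        elif age <= timedelta(days=30):
            day = snap.date()
            if day not in days_seen:  # newest snapshot of each calendar day
                days_seen.add(day)
                keep.add(snap)
        elif age <= timedelta(days=365):
            week = tuple(snap.isocalendar())[:2]  # (ISO year, ISO week)
            if week not in weeks_seen:  # newest snapshot of each week
                weeks_seen.add(week)
                keep.add(snap)
    return keep


# Hypothetical data: one snapshot per hour over the last 400 days.
now = datetime(2024, 6, 1, 12, 0)
snapshots = [now - timedelta(hours=h) for h in range(400 * 24)]
kept = snapshots_to_keep(snapshots, now)
print(len(snapshots), "snapshots taken,", len(kept), "retained")
```

Thinning older snapshots this way keeps fine-grained recovery points for recent data while bounding the number of snapshots that must be stored and replicated.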

Prepared replacement server:

  • A second server with Proxmox installation on-site
  • PBS backups can be restored directly on the replacement server
  • Alternatively: Proxmox cluster with 2 nodes (plus an external quorum device, since two nodes alone cannot keep quorum when one fails)

Documentation:

  • Runbook with step-by-step instructions for each recovery scenario
  • Network topology documented
  • Credentials in encrypted password manager (offline copy)

Cost: Moderate — second server (~€3,000–8,000), no additional software budget needed.

Tier 3: High Availability (RPO ~0, RTO <1h)

For mission-critical systems with no tolerable downtime:

Proxmox HA cluster:

  • 3-node cluster with quorum
  • Automatic restart of VMs on a surviving node after a node failure
  • Shared storage (Ceph, TrueNAS iSCSI) or replicated local storage

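Near-real-time replication:

  • ZFS send/receive every few minutes between nodes
  • Proxmox storage replication between cluster nodes
  • TrueNAS replication at short intervals
  • These methods are asynchronous, so the RPO equals the replication interval; an RPO of exactly zero requires shared or synchronously replicated storage such as Ceph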

Geo-redundancy:

  • Second site with its own cluster
  • Asynchronous replication over WAN (RPO: minutes)
  • DNS failover or load balancing between sites

Cost: Significant — at least 3 servers, redundant network infrastructure, potentially a second location.
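
The three tiers can be read as a lookup from RPO/RTO requirements to the cheapest adequate setup. The following sketch encodes the tier targets from the headings above; the function name and the fallback to Tier 3 for stricter requirements are assumptions made for illustration.

```python
def recommend_tier(rpo_hours: float, rto_hours: float) -> int:
    """Return the cheapest tier whose targets are at least as strict as required."""
    # (tier, RPO target in hours, RTO target in hours), cheapest first
    tiers = [
        (1, 24.0, 24.0),  # Tier 1: Basic Protection
        (2, 1.0, 4.0),    # Tier 2: Extended
        (3, 0.0, 1.0),    # Tier 3: High Availability (RPO ~0, RTO <1h)
    ]
    for tier, tier_rpo, tier_rto in tiers:
        if tier_rpo <= rpo_hours and tier_rto <= rto_hours:
            return tier
    return 3  # stricter than any listed target: start from the highest tier


print(recommend_tier(rpo_hours=24, rto_hours=24))   # 1
print(recommend_tier(rpo_hours=24, rto_hours=4))    # 2
print(recommend_tier(rpo_hours=0.25, rto_hours=8))  # 3
```

The second call shows that the stricter of the two requirements decides the tier: a relaxed RPO does not help if the RTO demands a prepared replacement server.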

Disaster Recovery with Proxmox and TrueNAS

Proxmox Backup Server as the Backbone

PBS provides everything needed for DR:

  • Incremental backups: Only changed blocks are transferred — hourly backups are practical
  • Deduplication: Identical data blocks are stored only once — massive space savings
  • Encryption: Backups can be client-side encrypted — secure even on remote storage
  • Verify: Automatic integrity checking of all backups
  • Fast restore: Restore individual VMs or containers in minutes
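
To make the idea behind incremental, deduplicated backups concrete, here is a deliberately simplified sketch of a content-addressed chunk store. It is not PBS code: the function names, the in-memory dictionary, and the tiny 4-byte chunk size are assumptions chosen for readability.

```python
import hashlib


def store_chunks(data: bytes, store: dict[str, bytes], chunk_size: int = 4) -> list[str]:
    """Split data into fixed-size chunks and store each distinct chunk once.

    Returns the list of chunk digests (the index) needed to rebuild the data.
    """
    index = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # identical chunks are stored only once
        index.append(digest)
    return index


def restore(index: list[str], store: dict[str, bytes]) -> bytes:
    """Rebuild the original data from its index."""
    return b"".join(store[digest] for digest in index)


store: dict[str, bytes] = {}
monday = store_chunks(b"AAAABBBBCCCCDDDD", store)
tuesday = store_chunks(b"AAAABBBBXXXXDDDD", store)  # one chunk changed

print(len(store))               # 5 distinct chunks stored instead of 8
print(restore(tuesday, store))  # b'AAAABBBBXXXXDDDD'
```

Because the second backup only adds the one chunk that changed, frequent backups cost little extra space or transfer volume, which is what makes hourly schedules practical.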

TrueNAS as Offsite Target

TrueNAS with ZFS is excellently suited as an offsite backup target:

  • ZFS replication: Efficient block-level replication over SSH
  • Immutable snapshots: ZFS snapshots are read-only by design, and holds or restricted delete permissions on the target can keep ransomware on the source from removing them
  • Compression: LZ4/ZSTD saves bandwidth and storage space
  • Alerting: TrueNAS alerts on replication failures
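
TrueNAS configures replication through its own replication tasks, but underneath the mechanism is an incremental `zfs send` piped over SSH into `zfs receive`. The sketch below only builds such a command line as a string so the moving parts are visible; the helper name, dataset names, snapshot names, and host are hypothetical.

```python
import shlex


def replication_command(dataset: str, prev_snap: str, new_snap: str,
                        remote_host: str, target_dataset: str) -> str:
    """Build an incremental ZFS replication pipeline (it is not executed here)."""
    # Send only the blocks that changed between the two snapshots.
    send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    # Receive the stream into the target dataset on the remote machine.
    receive = ["ssh", remote_host, "zfs", "receive", target_dataset]
    return f"{shlex.join(send)} | {shlex.join(receive)}"


print(replication_command(
    dataset="tank/vmdata",           # hypothetical source dataset
    prev_snap="hourly-01",           # last snapshot already present offsite
    new_snap="hourly-02",            # newest local snapshot
    remote_host="backup.example.com",
    target_dataset="backup/vmdata",  # hypothetical target dataset
))
```

The incremental send requires that the previous snapshot still exists on both sides, which is one reason retention on source and target should be planned together.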

Example Setup for an SMB

Site A (Production):
├── Proxmox VE Cluster (2-3 nodes)
│   ├── Production VMs and containers
│   └── Local PBS → Hourly backups
└── TrueNAS → ZFS snapshots every 15 minutes

Site B (Offsite):
├── TrueNAS → Receives replication from Site A
│   └── Immutable snapshots (14-day retention)
└── PBS Offsite → Receives encrypted backup sync

RPO: 15 minutes (ZFS replication) to 1 hour (PBS)
RTO: 1–4 hours (restore to Proxmox at Site A or B)

The DR Plan: What It Must Include

A written DR plan should cover the following:

  1. Contact list: Who is reachable in an emergency? (IT service provider, management, provider)
  2. Escalation levels: When is the DR plan activated?
  3. Prioritization: Which systems are restored first? (ERP before wiki, email before archive)
  4. Restore instructions: Step-by-step for each server/service
  5. Backup verification: Where are the backups? How are they accessed?
  6. Hardware procurement: Where is replacement hardware ordered in an emergency?
  7. Communication: Who informs customers and employees?
  8. Test interval: When is the DR plan tested?
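
The prioritization step lends itself to a simple model: list the services with a priority and an estimated restore duration, then derive when each one is back online. The class, the function name, and all figures below are hypothetical, and the sketch assumes services are restored one after another.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    priority: int         # 1 = restore first
    restore_hours: float  # estimated restore duration (hypothetical values)


def restore_schedule(services: list[Service]) -> list[tuple[str, float]]:
    """Order services by priority and compute the elapsed hours until each is back."""
    schedule = []
    elapsed = 0.0
    for svc in sorted(services, key=lambda s: s.priority):
        elapsed += svc.restore_hours
        schedule.append((svc.name, elapsed))
    return schedule


services = [
    Service("Wiki", priority=3, restore_hours=0.5),
    Service("ERP", priority=1, restore_hours=2.0),
    Service("Archive", priority=4, restore_hours=3.0),
    Service("Email", priority=2, restore_hours=1.0),
]
print(restore_schedule(services))
# [('ERP', 2.0), ('Email', 3.0), ('Wiki', 3.5), ('Archive', 6.5)]
```

Comparing the elapsed time of each service with its RTO shows quickly whether the planned order is realistic or whether restores need to run in parallel.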

Frequently Asked Questions

What does a disaster recovery plan cost?

The planning itself is an investment of 1–3 days of consulting. The infrastructure (second server, offsite storage) typically costs €5,000–15,000 one-time. Compared to the average cost of a multi-day outage (€50,000–500,000), this is a worthwhile investment.

How often should the DR plan be tested?

Run a full restore test at least annually and a partial test (restoring a single VM) quarterly. After every major infrastructure change, update and test the plan.

Is cloud backup sufficient as a DR strategy?

Cloud backup fulfills the offsite criterion of the 3-2-1 rule (three copies of the data, on two different media, one of them offsite). However, restoring from the cloud takes hours to days depending on data volume, because bandwidth is the limiting factor. For short RTOs, local restore from PBS is significantly faster.
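
A back-of-the-envelope calculation shows why bandwidth dominates restore time. The function below is a rough estimate only: the 80% link efficiency and the example data volume and link speeds are assumptions, and real restores also depend on storage speed and protocol overhead.

```python
def restore_hours(data_gb: float, bandwidth_mbit_s: float, efficiency: float = 0.8) -> float:
    """Estimate the transfer time in hours for a restore of data_gb (decimal GB)."""
    bits = data_gb * 8e9                                   # gigabytes to bits
    usable_bits_per_s = bandwidth_mbit_s * 1e6 * efficiency
    return bits / usable_bits_per_s / 3600


# Hypothetical example: 2 TB of data to restore.
print(round(restore_hours(2000, 100), 1))   # 100 Mbit/s internet link: about 55.6 hours
print(round(restore_hours(2000, 1000), 1))  # 1 Gbit/s local network: about 5.6 hours
```

Under these assumptions the same restore takes more than two days over a typical internet link but fits within a working day on a local gigabit network.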

What’s the difference between backup and disaster recovery?

Backup secures data. Disaster recovery encompasses the entire restoration process: servers, network, services, access, communication. A backup without a DR plan is like insurance without knowing where the policy document is.


Want to create a disaster recovery plan for your business? Contact us — we analyze your infrastructure and implement a backup strategy matching your RPO/RTO requirements.
