
TrueNAS ZFS Replication: Throttle Bandwidth Without Backup Gaps


ZFS replication is one of the strongest arguments for TrueNAS: block-level, incremental, checksum-verified and schedulable at almost any frequency. In practice, rollouts rarely fail because of ZFS itself; they fail at the uplink. A 10 TB pool that generates a few hundred gigabytes of delta overnight quickly collides on a 100 Mbit/s link with VoIP, cloud backups and the first users who fire up Outlook at 6:30 a.m.

This article shows how to throttle a TrueNAS replication deliberately without blowing your RPO (Recovery Point Objective). We look at the built-in speed_limit, the classic pv, the buffering specialist mbuffer and the time-window logic of the TrueNAS scheduler, each with concrete numbers for a typical SMB scenario.

The scenario: 10 TB pool, 100 Mbit offsite

Our reference case is a mid-sized customer near Neuburg/Donau: a primary TrueNAS SCALE 25.04 with 10 TB of working data (VM images, file shares, Veeam repository) replicated to a second TrueNAS in a colocation. The offsite link delivers 100 Mbit/s symmetrical (12.5 MB/s raw), leaving roughly 11 MB/s after overhead. Daily delta: 80 to 250 GB, depending on Veeam jobs and database maintenance.
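To put those numbers in perspective, here is a minimal sketch of the transfer-time arithmetic. The helper name transfer_hours is made up for this article, and the calculation assumes decimal units (1 GB = 1000 MB) and a perfectly sustained rate, which a real link will not deliver.

```shell
# Hours needed to move a delta at a sustained rate.
# $1 = delta in GB (decimal), $2 = sustained rate in MB/s
# Hypothetical helper for illustration, not part of TrueNAS.
transfer_hours() {
  LC_ALL=C awk -v gb="$1" -v rate="$2" \
    'BEGIN { printf "%.1f\n", gb * 1000 / rate / 3600 }'
}

transfer_hours 80 11    # small daily delta at full line rate
transfer_hours 250 11   # large daily delta at full line rate
```

At the full 11 MB/s the small delta needs about two hours and the large one a little over six, so even an unthrottled eight-hour night window leaves limited headroom on heavy days.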

Without throttling, replication saturates the uplink completely. Users complain about slow cloud tools, VoIP calls drop, and the managing director shows up in the server room by day three. At the same time, replication must not be throttled so hard that the agreed four-hour RPO is breached.

Option 1: Built-in speed_limit in the TrueNAS replication task

TrueNAS SCALE has had a clean switch for this since 24.04 that covers most cases. In the web UI under Data Protection -> Replication Tasks -> Edit you’ll find the Limit (KiB/s) field. The value is passed to zettarepl and applied to the ZFS send stream before it is shipped over SSH.

Sensible defaults for our scenario:

Time window   | Limit         | Effective | Logic
06:00-18:00   | 2,048 KiB/s   | ~2 MB/s   | Business hours, VoIP priority
18:00-22:00   | 6,144 KiB/s   | ~6 MB/s   | Ramp-up, fewer users
22:00-06:00   | 0 (unlimited) | ~11 MB/s  | Full bandwidth
Weekend       | 0 (unlimited) | ~11 MB/s  | Full bandwidth

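With this staircase, replication keeps running during the day, so the RPO holds. If a nightly window drops out, the next night catches up without users noticing. Because a task carries a single limit value, each row corresponds to its own task and schedule, as described in Option 2.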

Important: speed_limit works per replication task. If you replicate three datasets in three separate tasks, the limits add up whenever those tasks run at the same time. Either bundle the datasets into one recursive task or split the total budget explicitly across the tasks.
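Splitting the budget is simple integer arithmetic. The sketch below assumes an even split and that all tasks can overlap; per_task_limit is a hypothetical helper name, and an uneven split weighted by dataset size may fit better in practice.

```shell
# Split a total bandwidth budget evenly across concurrent tasks.
# $1 = total budget in KiB/s, $2 = number of tasks that may overlap
# Hypothetical helper for illustration; rounds down to whole KiB/s.
per_task_limit() {
  total_kib="$1"
  tasks="$2"
  echo $(( total_kib / tasks ))
}

per_task_limit 2048 3   # daytime budget shared by three tasks
```

Three tasks sharing the 2,048 KiB/s daytime budget would each get 682 KiB/s, which is the value to enter in each task's limit field.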

Option 2: Three time windows instead of one limit

An elegant alternative to a single static value is to run multiple replication tasks driven by different cron expressions. The TrueNAS scheduler accepts any five-field cron expression (* * * * *), and each task can carry its own limit.

What works well in practice:

  • Task A “daytime”: every 30 minutes, limit 2,048 KiB/s, active 06:00-18:00, Mon-Fri.
  • Task B “evening”: hourly, limit 6,144 KiB/s, active 18:00-22:00.
  • Task C “night”: once at 22:30, no limit, runs until done.

This keeps the RPO under four hours even if the nightly full pull fails for some reason. Make sure all three tasks share the same snapshot namespace (same naming schema), otherwise the receiver won’t recognise the delta as continuous.
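The three windows can also be written down as a small lookup, for instance to reuse the same staircase in a custom script. This is a sketch only: limit_for is a hypothetical helper, the weekend rule mirrors the table in Option 1, and the cron expressions in the comments are one possible reading of the task descriptions above, not values taken from a real system.

```shell
# Possible cron expressions for the three tasks (assumed, adjust as needed):
#   Task A "daytime":  */30 6-17 * * 1-5
#   Task B "evening":  0 18-21 * * *
#   Task C "night":    30 22 * * *

# Return the limit in KiB/s for a given hour and weekday.
# $1 = hour (0-23), $2 = ISO weekday (1 = Monday .. 7 = Sunday)
# 0 means "unlimited", as in the TrueNAS limit field.
limit_for() {
  hour="$1"
  dow="$2"
  if [ "$dow" -ge 6 ]; then echo 0; return; fi                            # weekend
  if [ "$hour" -ge 6 ] && [ "$hour" -lt 18 ]; then echo 2048; return; fi  # daytime
  if [ "$hour" -ge 18 ] && [ "$hour" -lt 22 ]; then echo 6144; return; fi # evening
  echo 0                                                                  # night
}

# Limit that applies right now
limit_for "$(date +%H)" "$(date +%u)"
```

Keeping the staircase in one place like this makes it easier to check that the task schedules and the limits still agree after someone changes one of them.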

Option 3: pv and mbuffer for custom scripts

Not every customer replicates only through the TrueNAS wizard. If you fire ZFS send/receive from your own scripts — typical for cross-vendor setups towards Proxmox-ZFS or a plain Linux backup target — you reach for the Unix classics.

# Throttle to 2 MB/s with pv, plus 1 GB buffer via mbuffer
zfs send -I tank/data@2026-06-01_22:00 tank/data@2026-06-02_06:00 \
  | pv -L 2m -q \
  | mbuffer -q -m 1G -s 128k \
  | ssh -c aes128-gcm@openssh.com backup@offsite.example.com \
      "mbuffer -q -m 1G -s 128k | zfs receive -s -F backup/tank/data"

pv -L 2m caps throughput at 2 MiB/s (roughly 2 MB/s). mbuffer smooths out latency spikes and prevents zfs send from blocking on every network hiccup. On CPUs with hardware AES support, the aes128-gcm@openssh.com cipher on the SSH side noticeably lowers CPU load without hurting confidentiality, a key lever on smaller NAS boxes with Atom CPUs.

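If you need to cap several parallel send streams against a shared budget, trickle is an alternative to pv: it throttles the network I/O of a process (here the ssh client) rather than a pipe, and its companion daemon trickled lets several trickle-wrapped processes share one limit.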

QoS on the OPNsense as a second line of defence

Don’t rely on the NAS limit alone. If the replication daemon crashes and restarts, the last configuration may be gone — and the full pull will overrun the uplink. A shaper on the firewall catches that.

On OPNsense 25.04, you can define a pipe with a fixed bandwidth limit for SSH port 22 to the replication peer and bind it to a schedule that is only active between 06:00 and 22:00. That gives you a safety net: even if TrueNAS “forgets” to throttle, the firewall slows the flow.

A typical OPNsense configuration looks like this:

Element        | Value
Pipe Bandwidth | 6 Mbit/s
Queue Weight   | 10
Mask           | dst-ip
Schedule       | Mon-Fri 06:00-22:00
Rule Match     | proto tcp, dst 203.0.113.42, dst-port 22
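The firewall works in Mbit/s while the TrueNAS limit is given in KiB/s, so it helps to convert before comparing the two. A minimal sketch, assuming 1 KiB = 1024 bytes and 1 Mbit = 1,000,000 bits; kib_to_mbit is a hypothetical helper name.

```shell
# Convert a limit in KiB/s to Mbit/s.
# $1 = limit in KiB/s
# Hypothetical helper for illustration.
kib_to_mbit() {
  LC_ALL=C awk -v kib="$1" \
    'BEGIN { printf "%.1f\n", kib * 1024 * 8 / 1000000 }'
}

kib_to_mbit 2048   # daytime NAS limit
kib_to_mbit 6144   # evening NAS limit
```

By this conversion the NAS limits correspond to about 16.8 and 50.3 Mbit/s. A 6 Mbit/s pipe is tighter than either, so if the shaper is meant purely as a safety net, the pipe bandwidth would need to sit above the NAS limit for the same window.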

RPO check: how much throttling is safe?

A rough rule of thumb helps with sizing: divide the daily delta in GB by the available hours to get GB/h, then divide by 3.6 to get the minimum sustained bandwidth in MB/s (1 MB/s equals 3.6 GB/h). Example: 150 GB delta over 14 available hours = 10.7 GB/h = roughly 3 MB/s sustained. If your throttle is below that, the delta slides into the next night and the RPO slides with it.
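The rule of thumb fits in a one-line helper. This is a sketch under the same simplifications as the text (decimal units, evenly spread transfer); min_rate_mbs is a name invented here.

```shell
# Minimum sustained rate in MB/s needed to move a daily delta in time.
# $1 = daily delta in GB, $2 = hours available for replication
# Hypothetical helper for illustration; 1 MB/s = 3.6 GB/h.
min_rate_mbs() {
  LC_ALL=C awk -v gb="$1" -v h="$2" \
    'BEGIN { printf "%.2f\n", gb / h / 3.6 }'
}

min_rate_mbs 150 14   # the example from the text
min_rate_mbs 250 14   # worst-case daily delta from the scenario
```

The first call reproduces the roughly 3 MB/s from the example (2.98); the worst-case delta of 250 GB needs about 4.96 MB/s over the same 14 hours, which is the figure to compare against the effective average of your throttle staircase.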

Verify this in the TrueNAS web UI under Data Protection -> Replication Tasks -> Show Details: per task you see the last runtime, transferred data and effective speed. If you need more precision, export the values via the TrueNAS API into your backup reports or a Grafana dashboard.

Pitfalls from the field

Three issues that keep tripping up real-world projects:

  1. MTU mismatch: A throttled stream can mask an MTU bug for a long time. The sessions only break when the full nightly replication runs. Set a consistent 1500 or 9000 on both sides.
  2. Snapshot cleanup: Aggressive retention policies risk the sender deleting a snapshot before the delta has arrived on the other side. Keep at least 24 hours of buffer.
  3. SSH keepalive: At very low throttles (under 1 MB/s) with large records, the firewall may kill the session as “idle”. Setting ClientAliveInterval 30 in sshd_config on the receiver prevents that.

Bottom line

Throttling bandwidth on TrueNAS is not rocket science — it’s about the right combination: speed_limit for the default case, multiple staggered tasks for finer control, pv and mbuffer for custom scripts, and a QoS shaper on OPNsense as a belt-and-braces fallback. If you size the RPO math properly, you can reliably mirror 10 TB pools offsite over a 100 Mbit link without disrupting the business.

DATAZONE supports SMBs in Neuburg, Ingolstadt and across Bavaria with the design, rollout and operation of TrueNAS replication paths — from initial pool planning through bandwidth sizing to disaster-recovery testing. If you want to harden or rebuild your replication, get in touch. We bring the hands-on experience from dozens of ZFS projects and find the right throttle mix for your line.
