In nearly every Proxmox project for SMBs, the same fundamental question comes up sooner or later: Ceph or ZFS? Both solutions are mature, both are open source, both run natively on Proxmox VE 8.x. Yet the concepts differ fundamentally — and so do the requirements for hardware, network and day-2 operations. Get this wrong and you pay twice: once in hardware, once in operational overhead.

This article delivers a decision matrix for the typical SMB use case — three to five hypervisors, 10 to 50 VMs, one or two sites. We compare the architectures, show when each approach makes sense technically and economically, and highlight the pitfalls you will not find in any glossy datasheet.

Two fundamentally different concepts

ZFS and Ceph answer two different questions. ZFS is a filesystem with an integrated volume manager that runs on a single host. It combines the block layer, RAID, snapshots, compression and encryption in a tightly integrated stack. Ceph, by contrast, is a distributed object store that bundles multiple nodes into a single storage pool — with redundancy across hosts rather than just across disks.

Property	ZFS	Ceph
Architecture	Single-host, local	Scale-out, distributed
Redundancy	Across disks (mirror, RAIDZ)	Across hosts (replication, EC)
Minimum nodes	1	3 (better 4-5)
Shared storage	No (only via NFS/iSCSI export)	Yes (native)
Live migration in Proxmox	With replication every 1-15 min	Instant, zero delay
HA failure scenario	Seconds to minutes of data loss	No data loss
Single-thread performance	Very high	Medium
Distributed performance	Limited to one host	Scales linearly with nodes
Network requirement	1-10 GbE	25-100 GbE recommended
Minimum RAM per host	8-16 GB for storage	32-64 GB for storage
Operational complexity	Low	High

At first glance this table seems to clearly favour Ceph — until you look at the hardware requirements and the operational complexity.

When ZFS is the right choice

For the vast majority of SMB setups, ZFS is the economically and technically appropriate solution, in particular when:

You run one to three hypervisors and no massive growth is planned
Your workloads need single-thread performance: databases, ERP, Exchange
Your budget for storage networking is tight (10 GbE is plenty)
Your team is Linux-savvy but has no dedicated storage specialist
A few minutes of downtime in a disaster scenario are acceptable

ZFS on Proxmox is in production within an hour. A typical pool layout for a hypervisor looks like this:

# Mirror of two NVMe for VMs (high IOPS demand)
zpool create -o ashift=12 -O compression=zstd -O atime=off \
  vmpool mirror /dev/nvme0n1 /dev/nvme1n1

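# RAIDZ2 across six HDDs for bulk data and backups
# (in production, prefer stable /dev/disk/by-id/... names over sdX)
zpool create -o ashift=12 -O compression=zstd -O atime=off \
  datapool raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf

# Special VDEV (metadata + small blocks) on NVMe
zpool add datapool special mirror /dev/nvme2n1 /dev/nvme3n1
# Small blocks only land on the special VDEV once a threshold is set
# (16K is an example value, not a recommendation)
zfs set special_small_blocks=16K datapool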
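
To get a feel for what the two layouts above yield, the following sketch estimates usable capacity. The disk sizes are hypothetical, since none are given here, and the figures ignore RAIDZ padding, metadata and the space ZFS reserves for itself, so real pools come out somewhat lower.

```shell
#!/bin/sh
# Rough usable-capacity estimate for a mirror and a RAIDZ pool.
# Disk sizes below are assumed values for illustration only.

# mirror_usable_tb <disk_tb>: a mirror stores one disk's worth of data
mirror_usable_tb() {
  awk -v s="$1" 'BEGIN { printf "%.2f\n", s }'
}

# raidz_usable_tb <disks> <parity> <disk_tb>: data disks = disks - parity
raidz_usable_tb() {
  awk -v n="$1" -v p="$2" -v s="$3" 'BEGIN { printf "%.2f\n", (n - p) * s }'
}

echo "vmpool   (mirror, 2x 3.84 TB NVMe): $(mirror_usable_tb 3.84) TB"
echo "datapool (RAIDZ2, 6x 16 TB HDD):    $(raidz_usable_tb 6 2 16) TB"
```

RAIDZ2 spends two disks on parity regardless of pool width, which is why six disks are a common sweet spot between capacity and redundancy.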

For HA in a 2-node setup, Proxmox uses ZFS replication: every 1 to 15 minutes, an incremental zfs send is shipped to the second host. In a failover, the VM starts there from the last replicated snapshot, so the data loss is up to roughly one replication interval. For most SMB workloads this is acceptable, especially when the application itself (database WAL, file locking) does not strictly require RPO=0. Keep in mind that a two-node cluster needs a third quorum vote, for example a QDevice, before Proxmox HA can fail over automatically.
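
A replication job of this kind can be created from the shell with pvesr. The sketch below only prints the command instead of running it; the VM ID 100, job number 0 and target node name pve2 are placeholders, not values from this article.

```shell
#!/bin/sh
# Print (not execute) a pvesr command that sets up storage replication.
# VM ID, job number and target node are placeholders for illustration.

# replication_job_cmd <vmid> <jobnum> <target-node> <interval-minutes>
replication_job_cmd() {
  printf 'pvesr create-local-job %s-%s %s --schedule "*/%s"\n' "$1" "$2" "$3" "$4"
}

# Replicate VM 100 to node pve2 every 5 minutes
replication_job_cmd 100 0 pve2 5

# On a real cluster, 'pvesr status' then lists the jobs and their last sync.
```

The interval is the main tuning knob: a shorter schedule lowers the potential data loss but produces more frequent snapshot and network activity.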

We also use ZFS as the backend for TrueNAS — as a pure NAS appliance serving Proxmox via NFS or iSCSI. This gives a clean separation between compute and storage, without the complexity of a distributed system.

When Ceph is the right choice

Ceph plays to its strengths as soon as you need real shared storage across multiple hosts. Typical indicators:

Four or more hypervisors are planned in the cluster, with further growth on the horizon
Workloads must be live-migratable without delay (patch windows without VM stops)
You operate container platforms (Kubernetes, OpenShift) with dynamic PVCs
RPO=0 is a hard requirement — every write must be redundant
You can invest in a 25 GbE or 100 GbE cluster network
The ops team is ready to build Ceph expertise or to permanently buy in external know-how

A minimum-viable Ceph setup in a Proxmox cluster looks like this:

3x Proxmox nodes, each:
  - 2x NVMe (OS, ZFS mirror)
  - 4-8x NVMe as Ceph OSDs (each 1.92-3.84 TB enterprise SSD)
  - 64-128 GB RAM
  - 2x 25 GbE for Ceph public/cluster network, separated
  - 2x 10 GbE for VM traffic and management

The rule of thumb for Ceph RAM: 1 GB per TB of OSD capacity, plus at least 4 GB per OSD daemon. A node with 8 OSDs at 3.84 TB therefore needs roughly 63 GB for storage alone (about 31 GB for capacity plus 32 GB for the daemons), and VM memory comes on top.
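
The rule of thumb above can be written as a small calculation. This is a sizing sketch only; the function name is made up for illustration, and actual memory use depends on the cluster's configuration and workload.

```shell
#!/bin/sh
# Rule-of-thumb RAM estimate for Ceph on one node, as stated in the text:
# 1 GB per TB of OSD capacity + 4 GB per OSD daemon.

# ceph_ram_gb <osd_count> <tb_per_osd>
ceph_ram_gb() {
  awk -v n="$1" -v tb="$2" 'BEGIN { printf "%.1f\n", n * tb * 1 + n * 4 }'
}

# Node from the example: 8 OSDs at 3.84 TB each
echo "Ceph RAM per node: $(ceph_ram_gb 8 3.84) GB"
```

Running the same function for 4 OSDs per node shows how quickly the storage share of RAM shrinks on smaller nodes, which helps when sizing VM memory on top.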

Replication in Ceph is typically configured as size=3, min_size=2. This means three copies of every block on three different nodes, and the pool keeps serving I/O as long as at least two of those copies are available. Usable capacity is therefore one third of raw capacity. Erasure coding can reduce this overhead but is rarely the right choice for VM workloads in Proxmox.
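
As a quick check of the one-third rule, the sketch below computes usable capacity for the three-node example with 8 OSDs of 3.84 TB each. It does not model the free-space headroom a cluster needs for rebalancing, so plan to fill the pool well below this figure.

```shell
#!/bin/sh
# Usable capacity of a replicated Ceph pool: raw capacity / replica count.
# Free-space headroom for recovery is deliberately not modelled.

# ceph_usable_tb <nodes> <osds_per_node> <tb_per_osd> <size>
ceph_usable_tb() {
  awk -v n="$1" -v o="$2" -v tb="$3" -v r="$4" \
    'BEGIN { printf "%.2f\n", n * o * tb / r }'
}

echo "Usable with size=3: $(ceph_usable_tb 3 8 3.84 3) TB"

# On a live cluster the replica settings are applied per pool, e.g.:
#   ceph osd pool set <pool> size 3
#   ceph osd pool set <pool> min_size 2
```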

The hardware trap

The most common mistake in the SMB space is to deploy Ceph on hardware that was specced for ZFS. Symptoms only emerge under load:

Latencies above 20 ms in databases
“Slow ops” warnings in cluster status
High CPU load on nodes from RocksDB compaction
Inconsistent performance depending on VM placement

The usual culprits are consumer SSDs without power-loss protection (massive performance degradation under sync writes), shared 10 GbE for VM and Ceph traffic, too few OSDs per node (at least four are sensible), and HDDs as Ceph OSDs without a separate DB device on NVMe.

ZFS, by contrast, is happy with much more modest hardware. A hypervisor with 64 GB RAM, four enterprise NVMe in RAIDZ2 and 10 GbE is perfectly adequate for 20-30 typical SMB VMs.

Performance in comparison

The following figures come from a recent DATAZONE benchmark on identical server hardware (AMD EPYC 9354P, 256 GB RAM, 8x Kioxia CD8-V 3.84 TB NVMe per node, 25 GbE Mellanox):

Workload	ZFS Mirror (1 host)	Ceph 3-node (size=3)
4K random read, 1 thread	195,000 IOPS	28,000 IOPS
4K random read, 64 threads	410,000 IOPS	1,350,000 IOPS
4K random write, 1 thread	88,000 IOPS	11,000 IOPS
4K random write, 64 threads	195,000 IOPS	720,000 IOPS
Sequential read, 1 MB	12 GB/s	18 GB/s (aggregate)
Average write latency	0.4 ms	1.8 ms

The message is clear: for individual, latency-critical workloads, ZFS wins. As soon as many parallel workers or distributed applications come into play, Ceph wins through the aggregate bandwidth of multiple nodes.
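
To make the gap tangible, the ratios can be derived directly from the IOPS values in the table above. The helper name is arbitrary; only the numbers come from the benchmark.

```shell
#!/bin/sh
# Ratios derived from the benchmark table (IOPS values copied from it).

# ratio <a> <b>: prints a/b with one decimal place
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

echo "1 thread,   4K read:  ZFS is $(ratio 195000 28000)x Ceph"
echo "1 thread,   4K write: ZFS is $(ratio 88000 11000)x Ceph"
echo "64 threads, 4K read:  Ceph is $(ratio 1350000 410000)x ZFS"
echo "64 threads, 4K write: Ceph is $(ratio 720000 195000)x ZFS"
```

The single-thread advantage of ZFS is larger than the parallel advantage of Ceph, which matters when one database VM dominates the workload.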

Operational effort and escalation paths

In day-2 operations, the two solutions differ dramatically. ZFS maintenance generally boils down to monthly zpool scrub, the occasional disk replacement, and monitoring SMART values. A well-built ZFS pool runs for years without intervention.
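
A minimal health check for this kind of routine monitoring could look as follows. It relies on zpool status -x, which prints "all pools are healthy" when nothing needs attention; the function name and the messages are illustrative, and the live check is skipped on machines without ZFS.

```shell
#!/bin/sh
# Minimal health check built around 'zpool status -x'.

# pools_healthy: reads 'zpool status -x' output on stdin,
# returns 0 only if it is the "all pools are healthy" message
pools_healthy() {
  grep -q '^all pools are healthy$'
}

# Typical use on a host with ZFS installed (skipped if zpool is absent)
if command -v zpool >/dev/null 2>&1; then
  if zpool status -x | pools_healthy; then
    echo "ZFS: OK"
  else
    echo "ZFS: attention needed, run 'zpool status' for details"
  fi
fi

# A scrub is started manually with: zpool scrub <pool>
```

Hooked into cron or an existing monitoring agent, a check like this covers most of the routine ZFS work described above.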

Ceph, on the other hand, is a living system that requires active care: upgrade paths have to be planned (monitor before OSD before MDS), rebalancing storms after node failures need to be steered, PG counts must match cluster size, and diagnosis in failure scenarios is significantly more complex. A forgotten ceph osd set noout before a reboot can trigger an hour of rebalancing — with a noticeable hit to VM performance.
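
One way to avoid the forgotten flag is to wrap maintenance steps in a helper that sets and clears noout around them. This is a sketch: with_noout and the CEPH override are hypothetical names, and the example runs in dry-run mode so that it only prints the ceph calls.

```shell
#!/bin/sh
# Wrap a maintenance step in 'ceph osd set noout' / 'ceph osd unset noout'
# so the flag cannot be forgotten. CEPH can be overridden for a dry run.

# with_noout <command...>: runs the command while noout is set
with_noout() {
  ${CEPH:-ceph} osd set noout || return 1
  "$@"
  rc=$?
  ${CEPH:-ceph} osd unset noout
  return "$rc"
}

# Dry run: print the ceph calls instead of executing them
CEPH="echo ceph"
with_noout echo "(node maintenance or reboot would happen here)"
```

For a real reboot, the helper has to run on a different machine than the node being restarted, and the wrapped command should only return once that node is back, because otherwise the flag is cleared too early.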

Realistically, plan for at least 4 hours per month of active maintenance for a Ceph cluster, plus regular patch management and capacity planning. For ZFS, 30 minutes per month is closer to reality. This difference, multiplied by your admin’s hourly rate, is part of the total cost of ownership.
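
The effect on total cost of ownership can be estimated with the monthly hours given above. The hourly rate of 90 is an assumed figure for illustration only; substitute your own.

```shell
#!/bin/sh
# Annual admin cost from the monthly maintenance hours given in the text.
# The hourly rate is an assumption, not a figure from the article.

# annual_ops_cost <hours_per_month> <hourly_rate>
annual_ops_cost() {
  awk -v h="$1" -v r="$2" 'BEGIN { printf "%d\n", h * 12 * r }'
}

RATE=90   # assumed hourly rate
echo "Ceph: $(annual_ops_cost 4 "$RATE") per year"
echo "ZFS:  $(annual_ops_cost 0.5 "$RATE") per year"
```

This covers routine maintenance only; patch management, capacity planning and incident handling come on top for both systems.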

For customers who do not want to run Ceph themselves, we operate it as part of our virtualization services, including monitoring, patch management and escalation to Proxmox enterprise support.

Decision matrix: short and concrete

Your situation	Recommendation
1-2 hypervisors, classic SMB	Local ZFS
2 hypervisors, HA wanted, RPO of 5-15 min acceptable	ZFS + Proxmox replication
Central NAS, multiple PVE hosts	TrueNAS with ZFS, NFS/iSCSI
3+ hypervisors, RPO=0, container platform	Ceph (3 nodes minimum, better 4-5)
High-frequency OLTP database, one main server	ZFS on NVMe, dedicated DB host
Growth to 100+ VMs foreseeable	Ceph with a clear scale-out plan
Tight budget, no 25 GbE network	ZFS; Ceph makes no sense here
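
The core of the matrix can be condensed into a few lines of logic. The sketch below is a deliberate simplification that models only three inputs (node count, whether RPO=0 is required, and whether a 25 GbE network is available); the function name and the yes/no arguments are illustrative, and the NAS, OLTP and growth rows are left out.

```shell
#!/bin/sh
# Simplified encoding of the decision matrix above.
# Only three inputs are modelled; the thresholds follow the table rows.

# recommend <hypervisors> <rpo_zero: yes|no> <has_25gbe: yes|no>
recommend() {
  nodes=$1; rpo_zero=$2; fast_net=$3
  if [ "$nodes" -ge 3 ] && [ "$rpo_zero" = yes ] && [ "$fast_net" = yes ]; then
    echo "Ceph"                        # 3+ hypervisors, RPO=0, fast network
  elif [ "$nodes" -ge 2 ]; then
    echo "ZFS + Proxmox replication"   # several hosts, minutes of RPO accepted
  else
    echo "Local ZFS"                   # single host
  fi
}

recommend 1 no no
recommend 2 no yes
recommend 4 yes yes
recommend 4 yes no
```

The last call shows the budget row in action: without a 25 GbE network the sketch falls back to ZFS even when RPO=0 is requested, which signals that the requirement or the budget needs to be revisited.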

Hybrid setups are also common: Ceph as shared storage for most VMs, local ZFS for a latency-critical database VM that is explicitly pinned. The backup layout, too, benefits from ZFS on the PBS host even when production runs on Ceph.

Conclusion

There is no blanket answer to “Ceph or ZFS”, but there are clear indicators. ZFS is the right choice for 80 percent of typical SMB setups: simple to operate, predictable performance, low hardware requirements. Ceph only pays off when real scaling, container workloads or RPO=0 are required, and when the organization is prepared to invest in matching hardware and know-how.

The most expensive option is always the one built for the wrong requirement: Ceph for a 2-node setup without 25 GbE brings only headaches; ZFS for a 6-node container cluster blocks sensible workflows. The right choice saves substantially more in total than the pure hardware delta.

DATAZONE supports you with the storage architecture of your Proxmox environment, from an honest requirements analysis through hardware selection to running Ceph clusters or ZFS-based TrueNAS systems. We have been building both worlds for SMBs for years and know the pitfalls that are missing from the glossy datasheets. Get in touch: /en/kontakt/.

Ceph vs. ZFS: When to Pick Which for SMBs
