Fast storage is expensive, slow storage is cheap. The challenge: most data is rarely accessed after creation, yet it permanently occupies valuable SSD space. ZFS Dataset Tiering solves this by moving data between performance storage (SSD/NVMe) and capacity storage (HDD) based on rules — automatically, without manual intervention.
The Concept: Hot, Warm, Cold
Dataset tiering classifies data by access frequency:
| Tier | Storage Type | Access Frequency | Example Data |
|---|---|---|---|
| Hot | NVMe/SSD | Daily to hourly | Active projects, databases, VMs |
| Warm | SSD (SATA) | Weekly to monthly | Completed projects, logs |
| Cold | HDD (Mirror/RAIDZ) | Rarely to never | Archives, compliance data, old backups |
The goal: frequently used data automatically resides on the fastest storage, while seldom-accessed data migrates to affordable capacity storage. Total cost per terabyte drops without sacrificing performance for active workloads.
ZFS Fundamentals for Tiering
Pools and Datasets
ZFS organizes storage into pools (physical devices) and datasets (logical filesystems). For tiering, you need at least two pools:
# Performance pool (SSD mirror)
zpool create ssd-pool mirror /dev/nvme0n1 /dev/nvme1n1
# Capacity pool (HDD RAIDZ2)
zpool create hdd-pool raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
Why Not a Single Pool with Mixed VDEVs?
ZFS distributes writes evenly across all VDEVs in a pool. A pool with SSD and HDD VDEVs would place data randomly on both types — with no way to control placement. For true tiering, you need separate pools and logic that moves data between them.
The exception is Special VDEVs (metadata tiering), which we cover below.
Metadata Tracking: When Was Data Last Used?
The central question for tiering: when was a file last read or written? ZFS provides several timestamps:
- atime: Last read access (updated on every open() by default)
- mtime: Last data modification
- crtime: Creation timestamp
Problem: atime generates a write on every read access — reducing SSD lifespan and degrading performance. The solution:
# relatime: only update atime when older than mtime
zfs set atime=on ssd-pool/active-data
zfs set relatime=on ssd-pool/active-data
With relatime, atime is only updated when the last access is older than the last modification — significantly less write overhead while maintaining sufficient accuracy for tiering decisions.
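To verify what the tiering logic will later see, you can read the access timestamps directly. Below is a minimal sketch; the mountpoint is a placeholder, and with relatime the reported atime may lag slightly behind the most recent read:
#!/usr/bin/env python3
"""Sketch: report the newest access time below a dataset mountpoint."""
import os
from datetime import datetime

MOUNTPOINT = "/mnt/ssd-pool/active-data"  # placeholder; use the dataset's real mountpoint

newest = 0.0
for root, _dirs, files in os.walk(MOUNTPOINT):
    for name in files:
        try:
            newest = max(newest, os.stat(os.path.join(root, name)).st_atime)
        except OSError:
            continue  # file vanished or is unreadable; skip it

if newest:
    print(f"Last access below {MOUNTPOINT}: {datetime.fromtimestamp(newest):%Y-%m-%d %H:%M}")
else:
    print("No files found")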
Implementing a Tiering Policy
Approach 1: Time-Based Tiering with zfs send/receive
The simplest approach moves entire datasets based on age:
#!/usr/bin/env python3
"""ZFS Dataset Tiering: Move datasets between SSD and HDD pools."""
import subprocess
from datetime import datetime, timedelta

SSD_POOL = "ssd-pool"
HDD_POOL = "hdd-pool"
TIER_AFTER_DAYS = 90
DRY_RUN = True

def get_datasets(pool):
    result = subprocess.run(
        ["zfs", "list", "-H", "-o", "name,used,creation", "-r", pool],
        capture_output=True, text=True
    )
    datasets = []
    for line in result.stdout.strip().split("\n"):
        if line:
            parts = line.split("\t")
            datasets.append({
                "name": parts[0],
                "used": parts[1],
                "creation": parts[2]
            })
    return datasets

def get_last_access(dataset):
    """Check the most recent file modification in the dataset."""
    mountpoint = subprocess.run(
        ["zfs", "get", "-H", "-o", "value", "mountpoint", dataset],
        capture_output=True, text=True
    ).stdout.strip()
    result = subprocess.run(
        ["find", mountpoint, "-maxdepth", "2", "-type", "f",
         "-printf", "%T@\n"],
        capture_output=True, text=True
    )
    if result.stdout.strip():
        timestamps = [float(t) for t in result.stdout.strip().split("\n")]
        return datetime.fromtimestamp(max(timestamps))
    return None

def migrate_dataset(source, target_pool):
    """Migrate dataset from source to target pool using send/receive."""
    dataset_name = source.split("/", 1)[1] if "/" in source else source
    target = f"{target_pool}/{dataset_name}"
    snapshot = f"{source}@tiering-migrate-{datetime.now():%Y%m%d}"
    commands = [
        f"zfs snapshot -r {snapshot}",
        f"zfs send -R {snapshot} | zfs receive -F {target}",
        f"zfs destroy -r {source}",
    ]
    for cmd in commands:
        if DRY_RUN:
            print(f"[DRY RUN] {cmd}")
        else:
            subprocess.run(cmd, shell=True, check=True)

def main():
    cutoff = datetime.now() - timedelta(days=TIER_AFTER_DAYS)
    for dataset in get_datasets(SSD_POOL):
        if dataset["name"] == SSD_POOL:
            continue
        last_access = get_last_access(dataset["name"])
        if last_access and last_access < cutoff:
            print(f"Tiering {dataset['name']} -> {HDD_POOL} "
                  f"(last access: {last_access:%Y-%m-%d})")
            migrate_dataset(dataset["name"], HDD_POOL)

if __name__ == "__main__":
    main()
Approach 2: File-Based Tiering
For more granular tiering, you can move individual files instead of entire datasets. A simple variant relies on file access times (atime) and leaves symlinks behind so existing paths keep working:
#!/bin/bash
# Find files larger than 1 MB that have not been accessed in 90 days
find /mnt/ssd-pool/data -type f -atime +90 -size +1M -print0 | while IFS= read -r -d '' file; do
    # Calculate path relative to the hot tier
    rel_path="${file#/mnt/ssd-pool/data/}"
    target_dir="/mnt/hdd-pool/archive/$(dirname "$rel_path")"
    # Create target directory and move file
    mkdir -p "$target_dir"
    mv "$file" "$target_dir/"
    # Create symlink for transparent access
    ln -s "/mnt/hdd-pool/archive/$rel_path" "$file"
    echo "Moved: $rel_path"
done
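One operational risk of this approach is symlinks in the hot tier whose targets were later renamed or deleted on the cold tier. A small consistency check, assuming the same example paths as above, could run after each tiering pass:
#!/usr/bin/env python3
"""Sketch: find symlinks in the hot tier that no longer resolve."""
import os

HOT_DIR = "/mnt/ssd-pool/data"  # same example path as in the shell loop above

for root, _dirs, files in os.walk(HOT_DIR):
    for name in files:
        path = os.path.join(root, name)
        # A broken symlink still shows up via islink(), but exists() follows it and fails
        if os.path.islink(path) and not os.path.exists(path):
            print(f"Broken symlink: {path} -> {os.readlink(path)}")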
Approach 3: Policy-Based Tiering with Cron
Combine the approaches in a structured policy framework:
# /etc/cron.d/zfs-tiering
# Daily at 3:00 AM: run tiering
0 3 * * * root /opt/scripts/zfs-tiering.py --policy /etc/zfs-tiering/policy.json
# Weekly: generate report
0 6 * * 1 root /opt/scripts/zfs-tiering.py --report
Policy configuration:
{
  "policies": [
    {
      "name": "project-data",
      "source_pool": "ssd-pool",
      "target_pool": "hdd-pool",
      "datasets": ["ssd-pool/projects/*"],
      "tier_after_days": 90,
      "min_size_mb": 100,
      "exclude_patterns": ["*.db", "*.sqlite"]
    },
    {
      "name": "vm-backups",
      "source_pool": "ssd-pool",
      "target_pool": "hdd-pool",
      "datasets": ["ssd-pool/backups/*"],
      "tier_after_days": 30,
      "min_size_mb": 0
    }
  ]
}
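The policy fields map directly onto the script from Approach 1. The sketch below shows one possible way to evaluate a policy: the datasets entries are interpreted as glob patterns via fnmatch, min_size_mb and exclude_patterns handling is omitted for brevity, and the last-access lookup is injected as a callable (for example get_last_access() from Approach 1). The helper names are illustrative:
#!/usr/bin/env python3
"""Sketch: evaluate tiering policies from policy.json (field handling simplified)."""
import json
from datetime import datetime, timedelta
from fnmatch import fnmatch

def load_policies(path="/etc/zfs-tiering/policy.json"):
    with open(path) as f:
        return json.load(f)["policies"]

def datasets_to_tier(policy, datasets, last_access_of):
    """Return dataset names matching the policy whose last access predates the cutoff.

    datasets: list of dicts as returned by get_datasets() in Approach 1
    last_access_of: callable mapping a dataset name to a datetime
                    (e.g. get_last_access() from Approach 1)
    """
    cutoff = datetime.now() - timedelta(days=policy["tier_after_days"])
    selected = []
    for ds in datasets:
        # The "datasets" entries in policy.json are treated as glob patterns here
        if not any(fnmatch(ds["name"], pattern) for pattern in policy["datasets"]):
            continue
        last_access = last_access_of(ds["name"])
        if last_access and last_access < cutoff:
            selected.append(ds["name"])
    return selected

# Example wiring with the Approach 1 helpers:
# for policy in load_policies():
#     candidates = datasets_to_tier(policy, get_datasets(policy["source_pool"]), get_last_access)
#     print(policy["name"], candidates)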
Special VDEV: Metadata Tiering in ZFS
Since OpenZFS 0.8, ZFS offers the Special VDEV, a dedicated SSD VDEV within an HDD pool that automatically keeps metadata and the deduplication tables (DDT) on fast storage, and optionally small file blocks as well:
# Create HDD pool with Special VDEV
zpool create tank \
raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf \
special mirror /dev/nvme0n1 /dev/nvme1n1
# Store blocks up to 128K on the Special VDEV
# (note: with the default recordsize of 128K this covers most data blocks; pick a smaller value if that is not intended)
zfs set special_small_blocks=128K tank/dataset
The Special VDEV dramatically accelerates metadata operations (ls, find, stat) without requiring the entire pool to run on SSDs. For details, see our article on TrueNAS Hybrid Storage.
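Before choosing a special_small_blocks value, it helps to estimate how much data would actually qualify. The rough sketch below counts files at or below a candidate threshold; it is an approximation based on file sizes rather than actual on-disk block sizes, and the mountpoint is a placeholder:
#!/usr/bin/env python3
"""Sketch: estimate how much data a special_small_blocks threshold would catch."""
import os

MOUNTPOINT = "/mnt/tank/dataset"  # placeholder; point at the dataset in question
THRESHOLD = 128 * 1024            # candidate special_small_blocks value (128K)

small_bytes = total_bytes = small_files = total_files = 0
for root, _dirs, files in os.walk(MOUNTPOINT):
    for name in files:
        try:
            size = os.stat(os.path.join(root, name)).st_size
        except OSError:
            continue
        total_files += 1
        total_bytes += size
        if size <= THRESHOLD:
            small_files += 1
            small_bytes += size

print(f"{small_files} of {total_files} files "
      f"({small_bytes / 2**30:.1f} of {total_bytes / 2**30:.1f} GiB) are <= {THRESHOLD // 1024}K")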
Comparison with Traditional Tiering
| Feature | ZFS Dataset Tiering | Enterprise SAN Tiering | Cloud Tiering (S3) |
|---|---|---|---|
| Cost | Free (open source) | License fees ($$$$) | Pay-per-use |
| Granularity | Dataset or file | Block level | Object level |
| Automation | Script-based | Policy engine | Lifecycle rules |
| Transparency | Symlinks / mount | Transparent (LUN) | API / gateway |
| Performance | ZFS ARC cache helps | Tiered cache | Latency on retrieval |
| Complexity | Medium | High | Low |
ZFS dataset tiering does not offer the block-level transparency of an enterprise SAN, but it is free, flexible, and can be fully automated with standard Linux tools.
Use Cases
Use Case 1: Media Production
Active projects reside on the NVMe pool (fast random I/O for video editing). After project completion, they are moved to the HDD pool (high capacity for archival). The tiering script checks whether project folders have been unchanged for 60 days.
Use Case 2: Backup Retention
Daily backups land on SSD storage for fast restores. After 14 days, they are moved to HDD storage where they are retained for 12 months. SSD capacity stays free for fresh backups.
Use Case 3: Compliance Archival
Business documents with retention requirements (10 years) are automatically moved to cold storage after the active usage period. ZFS snapshots ensure integrity, checksums prevent undetected bit rot.
Monitoring and Reporting
Monitor your tiering with a simple reporting script:
#!/bin/bash
echo "=== ZFS Tiering Report ==="
echo ""
echo "SSD Pool (Hot Tier):"
zfs list -o name,used,refer,mountpoint -r ssd-pool
echo ""
echo "HDD Pool (Cold Tier):"
zfs list -o name,used,refer,mountpoint -r hdd-pool
echo ""
echo "Pool Capacity:"
zpool list -o name,size,alloc,free,cap
DATAZONE Control provides extended tiering monitoring: automatic detection of datasets eligible for tiering, historical capacity trends, and cost calculation per tier.
Frequently Asked Questions
Can I move data back from the cold tier to the hot tier?
Yes. The same zfs send/receive works in both directions. Implement a “promote” mechanism that moves frequently accessed cold data back to SSDs.
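A minimal promote sketch, assuming it is appended to the Approach 1 script so it can reuse get_datasets(), get_last_access(), and migrate_dataset(); the seven-day window and the function name are illustrative examples:
# Appended to the Approach 1 script (reuses its imports and helpers)
PROMOTE_IF_ACCESSED_WITHIN_DAYS = 7  # example value; tune to your workload

def promote_recently_used(hdd_pool=HDD_POOL, ssd_pool=SSD_POOL):
    """Promote cold-tier datasets that have seen recent activity."""
    threshold = datetime.now() - timedelta(days=PROMOTE_IF_ACCESSED_WITHIN_DAYS)
    for ds in get_datasets(hdd_pool):
        if ds["name"] == hdd_pool:
            continue  # skip the pool root dataset
        last_access = get_last_access(ds["name"])
        if last_access and last_access > threshold:
            print(f"Promoting {ds['name']} -> {ssd_pool}")
            migrate_dataset(ds["name"], ssd_pool)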
How large should the SSD pool be?
As a rule of thumb: 10-20% of total capacity is sufficient for most workloads. If your active data exceeds 30% of total volume, the SSD pool is too small.
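To check where you currently stand, you can compare allocated bytes per pool (pool names as used throughout this article):
#!/usr/bin/env python3
"""Sketch: share of allocated data currently on the hot tier."""
import subprocess

def allocated_bytes(pool):
    # zpool list -H -p prints exact byte values without headers
    out = subprocess.run(
        ["zpool", "list", "-H", "-p", "-o", "allocated", pool],
        capture_output=True, text=True, check=True
    ).stdout.strip()
    return int(out)

hot = allocated_bytes("ssd-pool")
cold = allocated_bytes("hdd-pool")
print(f"Hot tier holds {hot / (hot + cold):.0%} of all allocated data")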
Does tiering work with encryption?
Yes. ZFS native encryption is preserved during send/receive. The encryption key must be available on the target pool.
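If you prefer not to load keys on the cold tier at all, OpenZFS raw sends (zfs send -w) transfer the blocks still encrypted; the key is then only needed when the data is actually mounted. A sketch of the migration step as a variant of migrate_dataset() from Approach 1 (the function name is illustrative):
#!/usr/bin/env python3
"""Sketch: migrate an encrypted dataset with a raw send (zfs send -w)."""
import subprocess
from datetime import datetime

def migrate_encrypted(source, target_pool, dry_run=True):
    """Variant of migrate_dataset() from Approach 1: -w/--raw transfers blocks
    as stored on disk, so they stay encrypted in transit and on the target pool."""
    name = source.split("/", 1)[1] if "/" in source else source
    snapshot = f"{source}@tiering-migrate-{datetime.now():%Y%m%d}"
    commands = [
        f"zfs snapshot -r {snapshot}",
        f"zfs send -w -R {snapshot} | zfs receive -F {target_pool}/{name}",
    ]
    for cmd in commands:
        if dry_run:
            print(f"[DRY RUN] {cmd}")
        else:
            subprocess.run(cmd, shell=True, check=True)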
Planning ZFS dataset tiering for your TrueNAS infrastructure? Contact us — we design and implement a customized storage tiering solution.