ZFS Dataset Tiering: Automatically Move Data Between SSD and HDD Storage

Fast storage is expensive, slow storage is cheap. The challenge: most data is rarely accessed after creation, yet it permanently occupies valuable SSD space. ZFS Dataset Tiering solves this by moving data between performance storage (SSD/NVMe) and capacity storage (HDD) based on rules — automatically, without manual intervention.

The Concept: Hot, Warm, Cold

Dataset tiering classifies data by access frequency:

Tier | Storage Type       | Access Frequency  | Example Data
-----|--------------------|-------------------|----------------------------------------
Hot  | NVMe/SSD           | Daily to hourly   | Active projects, databases, VMs
Warm | SSD (SATA)         | Weekly to monthly | Completed projects, logs
Cold | HDD (Mirror/RAIDZ) | Rarely to never   | Archives, compliance data, old backups

The goal: frequently used data automatically resides on the fastest storage, while seldom-accessed data migrates to affordable capacity storage. Total cost per terabyte drops without sacrificing performance for active workloads.

ZFS Fundamentals for Tiering

Pools and Datasets

ZFS organizes storage into pools (physical devices) and datasets (logical filesystems). For tiering, you need at least two pools:

# Performance pool (SSD mirror)
zpool create ssd-pool mirror /dev/nvme0n1 /dev/nvme1n1

# Capacity pool (HDD RAIDZ2)
zpool create hdd-pool raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
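
After creation, a quick check confirms both pools are online and sized as intended (the device names above are examples; adjust them to your hardware):

# Verify pool health and layout
zpool status ssd-pool hdd-pool

# Compare raw capacity of both tiers
zpool list -o name,size,alloc,free,health ssd-pool hdd-pool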

Why Not a Single Pool with Mixed VDEVs?

ZFS distributes writes evenly across all VDEVs in a pool. A pool with SSD and HDD VDEVs would place data randomly on both types — with no way to control placement. For true tiering, you need separate pools and logic that moves data between them.
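
This striping is easy to observe on any pool with more than one VDEV; per-VDEV statistics show writes landing on all of them in parallel (mixed-pool is a hypothetical pool name):

# Per-VDEV I/O statistics, refreshed every 5 seconds; on a pool with
# SSD and HDD VDEVs, writes would hit both device types
zpool iostat -v mixed-pool 5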

The exception is Special VDEVs (metadata tiering), which we cover below.

Metadata Tracking: When Was Data Last Used?

The central question for tiering: when was a file last read or written? ZFS provides several timestamps:

  • atime: Last read access (updated on every read by default)
  • mtime: Last data modification
  • crtime: Creation timestamp
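
All three can be inspected with stat (the file path below is an example; the creation time appears as "Birth" and may be empty depending on kernel and coreutils versions):

# Show atime, mtime, and birth time for a file
stat /mnt/ssd-pool/active-data/report.pdf

# Just the last access time, as a Unix epoch value
stat -c '%X' /mnt/ssd-pool/active-data/report.pdf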

Problem: atime generates a write on every read access — reducing SSD lifespan and degrading performance. The solution:

# relatime: only update atime when older than mtime
zfs set atime=on ssd-pool/active-data
zfs set relatime=on ssd-pool/active-data

With relatime, atime is only updated when the previous access time is older than the last modification (or, at most, once every 24 hours): significantly less write overhead while maintaining sufficient accuracy for tiering decisions.
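
A quick check confirms the properties are active; child datasets inherit them:

# atime must be on for relatime to take effect
zfs get atime,relatime ssd-pool/active-data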

Implementing a Tiering Policy

Approach 1: Time-Based Tiering with zfs send/receive

The simplest approach moves entire datasets based on age:

#!/usr/bin/env python3
"""ZFS Dataset Tiering: Move datasets between SSD and HDD pools."""

import subprocess
from datetime import datetime, timedelta

SSD_POOL = "ssd-pool"
HDD_POOL = "hdd-pool"
TIER_AFTER_DAYS = 90
DRY_RUN = True

def get_datasets(pool):
    result = subprocess.run(
        ["zfs", "list", "-H", "-o", "name,used,creation", "-r", pool],
        capture_output=True, text=True
    )
    datasets = []
    for line in result.stdout.strip().split("\n"):
        if line:
            parts = line.split("\t")
            datasets.append({
                "name": parts[0],
                "used": parts[1],
                "creation": parts[2]
            })
    return datasets

def get_last_access(dataset):
    """Return the newest file mtime in the dataset (scans two levels deep), or None."""
    mountpoint = subprocess.run(
        ["zfs", "get", "-H", "-o", "value", "mountpoint", dataset],
        capture_output=True, text=True
    ).stdout.strip()

    # Unmounted or legacy-managed datasets cannot be scanned
    if mountpoint in ("none", "legacy", "-"):
        return None

    result = subprocess.run(
        ["find", mountpoint, "-maxdepth", "2", "-type", "f",
         "-printf", "%T@\n"],
        capture_output=True, text=True
    )
    if result.stdout.strip():
        timestamps = [float(t) for t in result.stdout.strip().split("\n")]
        return datetime.fromtimestamp(max(timestamps))
    return None

def migrate_dataset(source, target_pool):
    """Migrate dataset from source to target pool using send/receive."""
    dataset_name = source.split("/", 1)[1] if "/" in source else source
    target = f"{target_pool}/{dataset_name}"

    snapshot = f"{source}@tiering-migrate-{datetime.now():%Y%m%d}"

    commands = [
        f"zfs snapshot -r {snapshot}",
        f"zfs send -R {snapshot} | zfs receive -F {target}",
        f"zfs destroy -r {source}",
    ]

    for cmd in commands:
        if DRY_RUN:
            print(f"[DRY RUN] {cmd}")
        else:
            subprocess.run(cmd, shell=True, check=True)

def main():
    cutoff = datetime.now() - timedelta(days=TIER_AFTER_DAYS)

    for dataset in get_datasets(SSD_POOL):
        if dataset["name"] == SSD_POOL:
            continue

        last_access = get_last_access(dataset["name"])
        if last_access and last_access < cutoff:
            print(f"Tiering {dataset['name']} -> {HDD_POOL} "
                  f"(last access: {last_access:%Y-%m-%d})")
            migrate_dataset(dataset["name"], HDD_POOL)

if __name__ == "__main__":
    main()
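
With DRY_RUN = True the script only prints the planned commands; the dataset name and dates below are illustrative:

$ python3 zfs-tiering.py
Tiering ssd-pool/projects/alpha -> hdd-pool (last access: 2025-01-10)
[DRY RUN] zfs snapshot -r ssd-pool/projects/alpha@tiering-migrate-20250512
[DRY RUN] zfs send -R ssd-pool/projects/alpha@tiering-migrate-20250512 | zfs receive -F hdd-pool/projects/alpha
[DRY RUN] zfs destroy -r ssd-pool/projects/alpha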

Approach 2: File-Based Tiering

For more granular tiering, you can move individual files instead of entire datasets, leaving symlinks behind so existing paths keep working:

#!/bin/bash
# Find files not accessed in 90 days (requires atime/relatime on the dataset)
find /mnt/ssd-pool/data -type f -atime +90 -size +1M | while IFS= read -r file; do
    # Calculate relative path
    rel_path="${file#/mnt/ssd-pool/data/}"
    target_dir="/mnt/hdd-pool/archive/$(dirname "$rel_path")"

    # Create target directory and move file
    mkdir -p "$target_dir"
    mv "$file" "$target_dir/"

    # Create symlink for transparent access
    ln -s "/mnt/hdd-pool/archive/$rel_path" "$file"

    echo "Moved: $rel_path"
done

Approach 3: Policy-Based Tiering with Cron

Combine the approaches in a structured policy framework:

# /etc/cron.d/zfs-tiering
# Daily at 3:00 AM: run tiering
0 3 * * * root /opt/scripts/zfs-tiering.py --policy /etc/zfs-tiering/policy.json

# Weekly: generate report
0 6 * * 1 root /opt/scripts/zfs-tiering.py --report

Policy configuration:

{
  "policies": [
    {
      "name": "project-data",
      "source_pool": "ssd-pool",
      "target_pool": "hdd-pool",
      "datasets": ["ssd-pool/projects/*"],
      "tier_after_days": 90,
      "min_size_mb": 100,
      "exclude_patterns": ["*.db", "*.sqlite"]
    },
    {
      "name": "vm-backups",
      "source_pool": "ssd-pool",
      "target_pool": "hdd-pool",
      "datasets": ["ssd-pool/backups/*"],
      "tier_after_days": 30,
      "min_size_mb": 0
    }
  ]
}
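
A tiering script can read this policy file with standard tools. For example, jq (assuming the path from the cron job above) summarizes each policy:

#!/bin/bash
# Print one summary line per configured policy
jq -r '.policies[] | "\(.name): \(.datasets | join(", ")) -> \(.target_pool) after \(.tier_after_days) days"' \
  /etc/zfs-tiering/policy.json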

Special VDEV: Metadata Tiering in ZFS

Since OpenZFS 0.8, ZFS offers the Special VDEV — a dedicated SSD VDEV within an HDD pool that automatically keeps metadata, small files, and the DDT (deduplication table) on fast storage:

# Create HDD pool with Special VDEV
zpool create tank \
  raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf \
  special mirror /dev/nvme0n1 /dev/nvme1n1

# Set small_blocks threshold (blocks up to 128K stored on the Special VDEV)
zfs set special_small_blocks=128K tank/dataset

The Special VDEV dramatically accelerates metadata operations (ls, find, stat) without requiring the entire pool to run on SSDs. Two caveats: losing the Special VDEV loses the entire pool, so it must be redundant (hence the mirror), and with the default recordsize of 128K a special_small_blocks value of 128K sends nearly all data blocks to the SSDs. For details, see our article on TrueNAS Hybrid Storage.
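
Whether the sizing works out is easy to verify: a per-VDEV listing shows how much has been allocated to the special mirror:

# Per-VDEV capacity: the special mirror appears as its own entry
zpool list -v tank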

Comparison with Traditional Tiering

Feature      | ZFS Dataset Tiering | Enterprise SAN Tiering | Cloud Tiering (S3)
-------------|---------------------|------------------------|----------------------
Cost         | Free (open source)  | License fees ($$$$)    | Pay-per-use
Granularity  | Dataset or file     | Block level            | Object level
Automation   | Script-based        | Policy engine          | Lifecycle rules
Transparency | Symlinks / mount    | Transparent (LUN)      | API / gateway
Performance  | ZFS ARC cache helps | Tiered cache           | Latency on retrieval
Complexity   | Medium              | High                   | Low

ZFS dataset tiering does not offer the block-level transparency of an enterprise SAN, but it is free, flexible, and can be fully automated with standard Linux tools.

Use Cases

Use Case 1: Media Production

Active projects reside on the NVMe pool (fast random I/O for video editing). After project completion, they are moved to the HDD pool (high capacity for archival). The tiering script checks whether project folders have been unchanged for 60 days.
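
That check can be a simple find test per project folder (the path is an example):

# Exit status 0 if the project still has files modified in the last 60 days
if find /mnt/ssd-pool/projects/alpha -type f -mtime -60 | grep -q .; then
    echo "still active"
else
    echo "candidate for tiering"
fi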

Use Case 2: Backup Retention

Daily backups land on SSD storage for fast restores. After 14 days, they are moved to HDD storage where they are retained for 12 months. SSD capacity stays free for fresh backups.

Use Case 3: Compliance Archival

Business documents with retention requirements (10 years) are automatically moved to cold storage after the active usage period. ZFS snapshots ensure integrity, checksums prevent undetected bit rot.
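
Scheduled scrubs turn that checksum protection into an active guarantee: ZFS re-reads every block and repairs corruption from redundancy. A monthly scrub of the cold tier, in the same cron style as above:

# /etc/cron.d/zfs-scrub: monthly scrub of the cold tier
0 2 1 * * root /usr/sbin/zpool scrub hdd-pool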

Monitoring and Reporting

Monitor your tiering with a simple reporting script:

#!/bin/bash
echo "=== ZFS Tiering Report ==="
echo ""
echo "SSD Pool (Hot Tier):"
zfs list -o name,used,refer,mountpoint -r ssd-pool
echo ""
echo "HDD Pool (Cold Tier):"
zfs list -o name,used,refer,mountpoint -r hdd-pool
echo ""
echo "Pool Capacity:"
zpool list -o name,size,alloc,free,cap

DATAZONE Control provides extended tiering monitoring: automatic detection of datasets eligible for tiering, historical capacity trends, and cost calculation per tier.

Frequently Asked Questions

Can I move data back from the cold tier to the hot tier?

Yes. The same zfs send/receive works in both directions. Implement a “promote” mechanism that moves frequently accessed cold data back to SSDs.
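
A minimal promote sketch (the dataset name is an example; it reuses the same snapshot/send/receive pattern as the demotion script):

#!/bin/bash
# Promote a cold dataset back to the SSD pool
SRC="hdd-pool/projects/alpha"
DST="ssd-pool/projects/alpha"
SNAP="${SRC}@promote-$(date +%Y%m%d)"

zfs snapshot -r "$SNAP"
zfs send -R "$SNAP" | zfs receive -F "$DST"
zfs destroy -r "$SRC"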

How large should the SSD pool be?

As a rule of thumb: 10-20% of total capacity is sufficient for most workloads. If your active data exceeds 30% of total volume, the SSD pool is too small.

Does tiering work with encryption?

Yes. ZFS native encryption is preserved during send/receive. The encryption key must be available on the target pool.
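
If the data should never leave the dataset in plaintext, use a raw send: the stream stays encrypted end to end (pool and dataset names are examples):

# Raw send: blocks travel and land still encrypted
zfs send -w ssd-pool/secure@tiering-migrate-20250512 | zfs receive hdd-pool/secure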


Planning ZFS dataset tiering for your TrueNAS infrastructure? Contact us — we design and implement a customized storage tiering solution.
