
ZFS ARC and L2ARC: Cache Tuning for Maximum Storage Performance


The performance of a ZFS storage system depends heavily on its caching. The Adaptive Replacement Cache (ARC) in RAM is the central performance element of ZFS. Additionally, the Level 2 ARC (L2ARC) on an SSD can serve as a second cache tier. Understanding and properly configuring these two mechanisms is key to getting maximum performance from your ZFS pool.

How the ARC Works

The ARC is an intelligent read cache in main memory. Unlike a simple LRU (Least Recently Used) cache, the ARC's adaptive algorithm optimizes for two access patterns simultaneously:

  • MRU (Most Recently Used): Recently read data (temporal locality)
  • MFU (Most Frequently Used): Frequently read data (frequency-based locality)

The ARC dynamically adjusts the ratio between MRU and MFU based on the actual access pattern. A database server with many repeated accesses benefits from the MFU portion, while a file server with sequential access patterns favors the MRU portion.

ARC Structure in Detail

┌─────────────────────────────────────────────┐
│                    ARC (RAM)                │
├──────────────────────┬──────────────────────┤
│     MRU List         │     MFU List         │
│  (recently read)     │  (frequently read)   │
├──────────────────────┼──────────────────────┤
│   Ghost MRU List     │   Ghost MFU List     │
│ (evicted metadata)   │ (evicted metadata)   │
└──────────────────────┴──────────────────────┘

The ghost lists store metadata about recently evicted entries. If an entry from the ghost MRU list is requested again, the ARC enlarges the MRU list at the expense of the MFU list — and vice versa. This way, the cache continuously learns the optimal ratio.

ARC Tuning

arc_max: Maximum ARC Size

The most important tuning parameter is arc_max — the upper limit of the ARC in RAM. By default, ZFS uses up to 50 percent of total RAM for the ARC (on systems with more than 4 GB).

# Display current ARC size and limits
arc_summary | grep -A5 "ARC size"

# Or directly from /proc
cat /proc/spl/kstat/zfs/arcstats | grep -E "^(size|c_max|c_min)"
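The raw arcstats counters are plain byte values. A small awk filter converts them to GiB for readability; the sketch below runs against a heredoc with sample values (so it works anywhere) in place of the live /proc read:

```shell
#!/bin/sh
# Convert arcstats byte counters to GiB. The heredoc stands in for
#   grep -E "^(size|c_max|c_min)" /proc/spl/kstat/zfs/arcstats
# on a live system; the counter values below are illustrative samples.
awk '{ printf "%-6s %.1f GiB\n", $1, $3 / (1024 ^ 3) }' <<'EOF'
size 4 15246743552
c_max 4 17179869184
c_min 4 4294967296
EOF
```

The second column in arcstats is the kstat data type (4 = uint64); the third column is the actual value.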

Rule of thumb for arc_max:

Use Case                       arc_max Recommendation
Dedicated NAS/file server      75–80% of RAM
NAS + few VMs/containers       50–60% of RAM
Hypervisor with ZFS storage    30–40% of RAM
Mixed workloads                50% of RAM (default)
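Translated into the byte value the tunable expects, the rule of thumb is a one-line calculation. RAM size and percentage are example inputs (64 GiB of RAM, dedicated file server at 75%):

```shell
#!/bin/sh
# Turn a rule-of-thumb percentage into the byte value for zfs_arc_max.
# 64 GiB of RAM and 75% (dedicated NAS/file server) are example inputs.
ram_gib=64
arc_pct=75

arc_max_bytes=$(( ram_gib * 1024 * 1024 * 1024 * arc_pct / 100 ))
echo "zfs_arc_max=${arc_max_bytes}"
```

For these inputs the script prints zfs_arc_max=51539607552, i.e. exactly 48 GiB.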

Configuring arc_max

On TrueNAS (web GUI):

System > Advanced > Sysctl:
Name:  vfs.zfs.arc_max
Value: 17179869184    (16 GB in bytes)

(vfs.zfs.arc_max is the FreeBSD sysctl name used by TrueNAS CORE; TrueNAS SCALE runs Linux, where the same limit is the zfs_arc_max module parameter.)

On Linux (manual):

# Set temporarily
echo 17179869184 > /sys/module/zfs/parameters/zfs_arc_max

# Set permanently in /etc/modprobe.d/zfs.conf
echo "options zfs zfs_arc_max=17179869184" >> /etc/modprobe.d/zfs.conf

# Rebuild the initramfs so the option takes effect at the next boot
# (required when the zfs module is loaded from the initramfs)
update-initramfs -u

arc_min: Minimum ARC Size

arc_min defines the lower bound below which the ARC should not shrink. This prevents memory-hungry applications from completely evicting the ARC.

# Set arc_min to 4 GB
echo "options zfs zfs_arc_min=4294967296" >> /etc/modprobe.d/zfs.conf

A reasonable arc_min size is 25–50 percent of arc_max. Values that are too high can starve applications of needed RAM.

Prioritizing Metadata Cache

ZFS stores both data and metadata (directory entries, inode information) in the ARC. For file servers with millions of small files, prioritizing metadata in the cache can be beneficial:

# Check metadata proportion
arc_summary | grep -i meta

# Adjust the metadata limit (zfs_arc_meta_limit_percent;
# OpenZFS 2.2 replaced this tunable with zfs_arc_meta_balance)
echo "options zfs zfs_arc_meta_limit_percent=35" >> /etc/modprobe.d/zfs.conf

L2ARC: Second-Level Cache on SSD

The L2ARC extends the ARC with a second cache tier on an SSD or NVMe. Data evicted from the ARC can be written to the L2ARC and remains available there for subsequent read requests.

When Is L2ARC Worth It?

L2ARC is not always the right solution. The decision depends on several factors:

L2ARC is beneficial when:

  • Working set is larger than available RAM
  • Many random read operations
  • Slow primary storage media (HDDs)
  • Cost of additional RAM is prohibitive

L2ARC is NOT beneficial when:

  • Working set fits in the ARC (enough RAM available)
  • Predominantly sequential access (streaming, backup)
  • Predominantly write operations (L2ARC is only a read cache)
  • Less than 32 GB RAM in the system

Why 32 GB RAM as a Minimum?

The L2ARC requires RAM for its index structure. Each L2ARC entry consumes approximately 200 bytes in the ARC for metadata. For a 1 TB L2ARC SSD with an average block size of 128 KB:

1 TB / 128 KB = ~8 million entries
8,000,000 x 200 bytes = ~1.6 GB RAM for L2ARC index

This RAM is taken from the ARC itself. On systems with limited RAM, the L2ARC index can shrink the ARC to the point where overall performance decreases.
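The arithmetic above generalizes to any device and record size. A short sketch; the ~200 bytes per entry figure is the estimate used in this article, and the actual header size varies by OpenZFS version:

```shell
#!/bin/sh
# Estimate the ARC RAM consumed by the L2ARC index.
# Device size, record size, and per-entry overhead are example inputs;
# ~200 bytes/entry is the estimate used in the text above.
l2arc_gib=1024      # 1 TB L2ARC device
recordsize_kib=128  # average size of cached blocks
hdr_bytes=200       # approx. index overhead per L2ARC entry in the ARC

entries=$(( l2arc_gib * 1024 * 1024 / recordsize_kib ))
index_mib=$(( entries * hdr_bytes / 1024 / 1024 ))
echo "~${entries} entries, ~${index_mib} MiB of ARC used for the index"
```

Note that smaller records inflate the entry count: with 16 KB blocks, the same 1 TB device would need eight times as much index RAM.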

Setting Up L2ARC

# Add SSD as L2ARC device
zpool add tank cache /dev/nvme1n1

# Verify the result
zpool status tank

Example output:

  pool: tank
 state: ONLINE
config:
  NAME                    STATE
  tank                    ONLINE
    raidz1-0              ONLINE
      sda                 ONLINE
      sdb                 ONLINE
      sdc                 ONLINE
  cache
    nvme1n1               ONLINE

L2ARC Sizing

The L2ARC device size should be based on the working set:

Scenario                       L2ARC Size
File server (100 TB pool)      200–500 GB
Virtualization (50 TB pool)    100–200 GB
Database (10 TB pool)          50–100 GB

Bigger is not always better. An oversized L2ARC wastes RAM on index overhead and offers diminishing returns.

L2ARC Tuning Parameters

# Maximum write speed to L2ARC (bytes/s)
# Default: 8 MB/s (too low for NVMe)
echo "options zfs l2arc_write_max=104857600" >> /etc/modprobe.d/zfs.conf

# Boost during initial fill (bytes/s)
echo "options zfs l2arc_write_boost=209715200" >> /etc/modprobe.d/zfs.conf

# Headroom (multiplier for write speed)
echo "options zfs l2arc_headroom=8" >> /etc/modprobe.d/zfs.conf

# Persist L2ARC across reboots (since OpenZFS 2.0)
echo "options zfs l2arc_rebuild_enabled=1" >> /etc/modprobe.d/zfs.conf

Persistent L2ARC (l2arc_rebuild_enabled=1, the default since OpenZFS 2.0) is particularly important: with it disabled, or on versions before 2.0, the L2ARC loses its contents on every reboot and must be repopulated from scratch.

Monitoring with arc_summary

Basic ARC Statistics

# Full ARC summary
arc_summary

# Output (excerpt):
# ARC size (current):                    14.2 GiB
# Target size (adaptive):                16.0 GiB
# Min size (hard limit):                  4.0 GiB
# Max size (high water):                 16.0 GiB
#
# ARC efficiency:                        94.31%
# Cache hit ratio:                       94.31%
#   Demand data hit ratio:               96.12%
#   Prefetch data hit ratio:             78.45%

Key Metrics

Metric              Meaning                                  Target Value
Cache Hit Ratio     Percentage of reads served from cache    > 90%
ARC Size vs Max     Utilization of available cache           80–100%
L2ARC Hit Ratio     Hits in the L2ARC                        > 50%
Eviction Rate       Rate of cache evictions                  Low

Continuous Monitoring

# Observe ARC statistics over time
arcstat 5
# Prints one line with key metrics every 5 seconds

# Output:
#     time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  size     c
# 09:15:01  1.2K    52   4.3%    32  3.1%   20  12%     0   0%  14.2G  16.0G
# 09:15:06  3.4K   123   3.6%    89  2.8%   34  15%     0   0%  14.2G  16.0G

L2ARC Statistics

# L2ARC-specific statistics
arc_summary | grep -A20 "L2ARC"

# Or directly:
cat /proc/spl/kstat/zfs/arcstats | grep l2_

# Important values:
# l2_hits:    Hits in the L2ARC
# l2_misses:  Misses in the L2ARC
# l2_size:    Current L2ARC fill level
# l2_hdr_size: RAM consumed by L2ARC index
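The l2_hits and l2_misses counters combine into the hit ratio from the metrics table above. Sketched here against sample counter values, with a heredoc standing in for the live grep:

```shell
#!/bin/sh
# Compute the L2ARC hit ratio from arcstats counters. The heredoc
# stands in for:
#   grep -E "^l2_(hits|misses)" /proc/spl/kstat/zfs/arcstats
# and the counter values are illustrative samples.
awk '
  $1 == "l2_hits"   { hits = $3 }
  $1 == "l2_misses" { misses = $3 }
  END { printf "L2ARC hit ratio: %.1f%%\n", 100 * hits / (hits + misses) }
' <<'EOF'
l2_hits 4 750000
l2_misses 4 250000
EOF
```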

Practical Example: Tuning a TrueNAS System

Scenario: TrueNAS SCALE with 64 GB RAM, 100 TB HDD pool, mixed workloads (SMB file server + 5 VMs).

# Recommended configuration:

# ARC: 40 GB (62.5% of RAM, rest for VMs)
vfs.zfs.arc_max = 42949672960

# ARC minimum: 16 GB
vfs.zfs.arc_min = 17179869184

# Metadata limit: 35% of ARC
vfs.zfs.arc_meta_limit_percent = 35

# L2ARC: 500 GB NVMe
# l2arc_write_max: 100 MB/s
# l2arc_rebuild_enabled: 1

The 24 GB of RAM outside the ARC is available for the operating system, VMs, and containers.
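On plain Linux the same recommendations translate into a modprobe.d fragment (on TrueNAS SCALE itself the values belong in the web GUI, since manually edited files can be overwritten by updates). The sketch writes to a temporary path so it is safe to run anywhere:

```shell
#!/bin/sh
# The 64 GB scenario as a modprobe.d fragment for plain Linux.
# Written to a temporary file here; on a real system this content
# would go to /etc/modprobe.d/zfs.conf, followed by update-initramfs -u.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# 64 GB RAM: 40 GB ARC, 16 GB floor, NVMe-friendly L2ARC feed rate
options zfs zfs_arc_max=42949672960
options zfs zfs_arc_min=17179869184
options zfs l2arc_write_max=104857600
options zfs l2arc_rebuild_enabled=1
EOF
cat "$conf"   # review before copying into place
```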

SLOG vs L2ARC: Do Not Confuse Them

A common misconception: the SLOG (Separate Log Device) is not a cache. It accelerates synchronous write operations (ZIL), while the L2ARC accelerates read operations.

Device    Function                        Workload
L2ARC     Read cache (extends ARC)        Random reads
SLOG      Synchronous write log (ZIL)     Synchronous writes (NFS, iSCSI, databases)

Both can reside on the same NVMe SSD — but on separate partitions:

# Partition NVMe: 32 GB SLOG + remaining for L2ARC
sgdisk -n 1:0:+32G -n 2:0:0 /dev/nvme1n1

# Add SLOG
zpool add tank log /dev/nvme1n1p1

# Add L2ARC
zpool add tank cache /dev/nvme1n1p2

Conclusion

The ZFS ARC is the most powerful cache in the storage world — and simultaneously the most frequently misconfigured one. The most important measure is sufficient RAM: every gigabyte of RAM in the ARC delivers more benefit than a gigabyte of SSD in the L2ARC. Only when RAM cannot be expanded further does the L2ARC make sense as a second cache tier. With the right tuning parameters and regular monitoring via arc_summary and arcstat, ZFS performance can be systematically optimized.
