ZFS Snapshot Replication Between TrueNAS Systems
A practical guide to setting up automated ZFS replication between two TrueNAS systems with asymmetric retention — short on the source, long on the destination.
If you manage any data you care about, a single copy on a single system isn't enough. ZFS makes it remarkably easy to replicate datasets between systems using zfs send and zfs receive over SSH — and TrueNAS wraps this in a decent GUI.
This post walks through setting up nightly incremental replication from one TrueNAS box to another, with asymmetric retention: the source keeps snapshots for just 3 days (saving disk space), while the destination holds them for 120 days (giving you months of recovery points).
After the initial full transfer, each nightly run only ships the blocks that changed since the last snapshot. For write-once workloads like media archives or backups, daily incrementals are tiny.
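Under the hood this maps onto plain `zfs send` and `zfs receive`. A minimal sketch of what the initial and nightly runs look like, with illustrative snapshot names and `DEST_IP` as a placeholder:

```shell
# Initial run: full send of the dataset up to its first snapshot
zfs send -R tank/data@auto-2026-03-14_00-00 \
  | ssh root@DEST_IP "zfs receive -F tank/replication/data"

# Nightly run: incremental send, shipping only blocks changed between snapshots
zfs send -R -i auto-2026-03-14_00-00 tank/data@auto-2026-03-15_00-00 \
  | ssh root@DEST_IP "zfs receive tank/replication/data"
```

TrueNAS builds and schedules these pipelines for you; the point is that the nightly cost is proportional to your change rate, not your dataset size.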
Prerequisites
- Source: TrueNAS CORE 12.x+ (the system pushing snapshots)
- Destination: TrueNAS or FreeNAS 11.x+ (the system receiving snapshots)
- SSH enabled on the destination
- Network connectivity between the two boxes
- A destination dataset with `readonly=on` (covered in Step 1)
Step 1: Prepare the Destination
Enable SSH
In the destination's web GUI, go to Services and enable SSH. Verify from a shell:
```shell
sockstat -l4 | grep :22
```
You should see sshd listening on port 22.
Create the Receiving Dataset
```shell
zfs create tank/replication
zfs set readonly=on tank/replication
```
The readonly=on property prevents accidental writes to your replica. ZFS replication bypasses this flag internally, so snapshots will still be received without issue.
Verify Pool Properties
```shell
zfs get compression,atime,readonly tank tank/replication
```
I'd recommend compression=lz4 and atime=off on the parent pool if you haven't already.
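If they're not set yet, both are one-liners (assuming `tank` is your pool's root dataset; child datasets inherit the values unless they override them):

```shell
# lz4 compresses cheaply enough to be a near-universal win
zfs set compression=lz4 tank
# atime=off avoids a metadata write on every file read
zfs set atime=off tank
```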
Step 2: Create an SSH Connection on the Source
Generate an SSH Keypair
On the source, go to System > SSH Connections > ADD.
| Setting | Value |
|---|---|
| Name | Something recognizable (e.g., the destination hostname) |
| Setup Method | Manual |
| Host | Destination IP address |
| Port | 22 |
| Username | root |
| Private Key | Generate New |
| Cipher | Standard |
Save the connection.
Install the Public Key on the Destination
Extract the public key from the source's keychain and install it on the destination. From the source's shell:
```shell
# Find the keypair ID
midclt call keychaincredential.query | python3 -c "
import sys, json
for c in json.load(sys.stdin):
    print(c['name'], 'ID:', c['id'], 'Type:', c['type'])
"

# Extract public key and push it to the destination
midclt call keychaincredential.query '[["id","=",KEY_ID]]' \
  | python3 -c "
import sys, json
print(json.load(sys.stdin)[0]['attributes']['public_key'])
" | ssh root@DEST_IP \
  'mkdir -p /root/.ssh && cat >> /root/.ssh/authorized_keys && chmod 600 /root/.ssh/authorized_keys'
```
Replace KEY_ID with the ID from the first command, and DEST_IP with your destination's IP. You'll be prompted for the root password once — after that, key-based auth takes over.
Discover the Remote Host Key
Back in the source's web GUI, edit the SSH connection you just created and click Discover Remote Host Key. This populates the host key in the correct format.
Heads up: The host key must be stored without the hostname prefix. The Discover Remote Host Key button handles this automatically. If you're setting the key via the API, store only the key type and data (e.g., `ecdsa-sha2-nistp256 AAAA...`), not `192.168.1.x ecdsa-sha2-nistp256 AAAA...`. The middleware prepends the hostname on its own.
Verify Connectivity
```shell
ssh -v root@DEST_IP "zfs list"
```
This should connect without a password prompt and list the destination's datasets.
Step 3: Create the Replication Task
On the source, go to Tasks > Replication Tasks > ADD.
What and Where
| Setting | Value |
|---|---|
| Source Location | On this System |
| Source Dataset | Your dataset (e.g., tank/data) |
| Recursive | Checked |
| Destination Location | On a Different System |
| SSH Connection | The connection from Step 2 |
| Destination Dataset | tank/replication on the destination |
| SSH Transfer Security | Encryption (or No Encryption for LAN) |
When
| Setting | Value |
|---|---|
| Replication Schedule | Run On a Schedule |
| Schedule | Daily at 00:00 |
| Destination Snapshot Lifetime | Same as Source (we'll change this next) |
Click START REPLICATION. This creates both the replication task and an associated periodic snapshot task on the source.
Step 4: Adjust Retention Policies
The wizard defaults to 2-week retention everywhere. For our setup, we want minimal footprint on the source and long retention on the destination.
Source: 3-Day Retention
```shell
# List snapshot tasks
midclt call pool.snapshottask.query | python3 -c "
import sys, json
for t in json.load(sys.stdin):
    print(f'ID:{t[\"id\"]} dataset:{t[\"dataset\"]} lifetime:{t[\"lifetime_value\"]} {t[\"lifetime_unit\"]}')
"

# Update to 3 days
midclt call pool.snapshottask.update TASK_ID \
  '{"lifetime_value": 3, "lifetime_unit": "DAY"}'
```
Why 3 days? Incremental replication needs at least one common snapshot between source and destination. Three days gives you a buffer — if replication fails one night, you still have two more days before the chain breaks. Bump this higher if your connectivity is unreliable.
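To confirm the chain is still intact, check that the newest source snapshot also exists on the destination (dataset paths and `DEST_IP` are the placeholders used throughout this guide):

```shell
# Newest snapshots on the source
zfs list -H -t snapshot -o name -s creation tank/data | tail -3

# Newest snapshots on the destination; at least one name must overlap
ssh root@DEST_IP \
  "zfs list -H -t snapshot -o name -s creation tank/replication/data | tail -3"
```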
Destination: 120-Day Retention
```shell
# List replication tasks
midclt call replication.query | python3 -c "
import sys, json
for t in json.load(sys.stdin):
    print(f'ID:{t[\"id\"]} name:{t[\"name\"]} retention:{t[\"retention_policy\"]}')
"

# Update to 120-day custom retention
midclt call replication.update TASK_ID \
  '{"retention_policy": "CUSTOM", "lifetime_value": 120, "lifetime_unit": "DAY"}'
```
Retention Summary
| System | Retention | Why |
|---|---|---|
| Source | 3 days | Minimize disk usage on the working system |
| Destination | 120 days | ~4 months of daily recovery points |
Step 5: Run the Initial Sync
From Tasks > Replication Tasks, expand the task and click RUN NOW.
The first run is a full zfs send of the entire dataset. Depending on size and network speed, this could take a while. Monitor the State column for progress.
After the initial sync, every subsequent nightly run is incremental and only transfers changed blocks.
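You can preview how much a given incremental will transfer before it runs; `zfs send -n` is a dry run and `-v` prints the estimated stream size (snapshot names are illustrative):

```shell
# Estimate the size of the next incremental without sending anything
zfs send -nv -i auto-2026-03-14_00-00 tank/data@auto-2026-03-15_00-00
```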
Recovering Files from Snapshots
Single File Recovery
ZFS snapshots are accessible as read-only directories. On the destination, browse into:
```
/mnt/tank/replication/.zfs/snapshot/
```
Each snapshot is a timestamped folder. Find the date before the unwanted change and copy the file out:
```shell
# List available snapshots
ls /mnt/tank/replication/.zfs/snapshot/

# Grab a file from a specific snapshot
cp /mnt/tank/replication/.zfs/snapshot/auto-2026-03-15_00-00/path/to/file.tif \
  /tmp/recovered_file.tif
```
Full Disaster Recovery
The RESTORE button on the Replication Tasks page reverses the direction, pushing everything from the destination back to the source. Only use this if the source pool is completely lost and needs to be rebuilt.
Troubleshooting
"Invalid SSH host key" Error
The stored host key format is wrong. Edit the SSH connection and click Discover Remote Host Key again. Remember: the key must not include the hostname prefix — the middleware adds it automatically.
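If you want to sanity-check the expected value, `ssh-keyscan` prints the host-prefixed form; stripping the first field gives the format the middleware stores (`DEST_IP` is a placeholder):

```shell
# Print the destination's ECDSA host key without the hostname prefix
ssh-keyscan -t ecdsa DEST_IP 2>/dev/null | cut -d' ' -f2-
```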
Replication Stuck or Failed
Check the middleware log on the source:
```shell
grep -i "replication\|zettarepl" /var/log/middlewared.log | tail -30
```
Broken Incremental Chain
If the source's snapshots were pruned before replication ran (e.g., the destination was offline longer than the retention period), incremental replication fails because there's no common snapshot. Your options:
- Delete all snapshots on the destination dataset and re-run for a fresh full sync
- Manually create a common base snapshot
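A sketch of both options, using the dataset names from this guide (note that `-F` on the receive rolls the destination back and discards its now-orphaned snapshots):

```shell
# Option 1: clear the destination's snapshots, then re-run the task for a full sync
ssh root@DEST_IP \
  'zfs list -H -t snapshot -o name -r tank/replication/data | xargs -n1 zfs destroy'

# Option 2: create a fresh common base by force-sending a new snapshot
zfs snapshot -r tank/data@rebase
zfs send -R tank/data@rebase \
  | ssh root@DEST_IP 'zfs receive -F tank/replication/data'
```

Either way, the next transfer is a full send; once the chain is broken there's no avoiding that cost.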
Testing SSH with the Keychain Key
```shell
# Extract private key to a temp file
midclt call keychaincredential.query '[["id","=",KEY_ID]]' \
  | python3 -c "
import sys, json
print(json.load(sys.stdin)[0]['attributes']['private_key'])
" > /tmp/test_key
chmod 600 /tmp/test_key

# Test the connection
ssh -i /tmp/test_key root@DEST_IP "zfs list -r tank"

# Clean up
rm /tmp/test_key
```
Things Worth Knowing
- `readonly=on` doesn't block replication — ZFS handles this internally. Your destination stays protected from accidental writes while still receiving snapshots.
- Semi-auto SSH setup won't work across versions — If the destination runs FreeNAS 11.x and the source is TrueNAS CORE 12+, the API versions differ. Use the manual method above.
- "No Encryption" is fast but unencrypted — Skipping SSH encryption on the transfer can be significantly faster on a LAN. Only do this on a trusted network.
- Snapshot overhead is proportional to change rate — For write-once data, 120 days of daily snapshots adds minimal overhead beyond the base dataset size. For heavy-write workloads, monitor your snapshot space usage.
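A quick way to watch that usage (the `usedbysnapshots` property counts space pinned only by snapshots, not the live data):

```shell
# Space held by snapshots, per dataset
zfs get -r -o name,value usedbysnapshots tank/replication

# The ten largest individual snapshots
zfs list -t snapshot -o name,used -s used -r tank/replication | tail -10
```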