Recently I decided to improve the reliability of my file system backups by using the data replication capabilities inherent in the FreeBSD Zettabyte File System (ZFS). ZFS provides a built-in serialization feature that can send a stream representation of a ZFS file system (which ZFS refers to as a “dataset”) to standard output. Using this technique, it is possible not only to store the dataset(s) on another ZFS storage pool (“zpool”) connected to the local system, but also to send them over a network to another FreeBSD system. ZFS dataset snapshots serve as the basis for this replication, and the essential ZFS commands used for replicating the data are zfs send and zfs receive.
This post describes how I used this ZFS feature to perform replication of ZFS dataset snapshots from my home FreeBSD server to another FreeBSD machine located offsite. I’ll also discuss how I manage the quantity of snapshots stored locally and offsite, as well as a couple of options for recovering my files should it become necessary.
For purposes of example, I’ll refer to the FreeBSD system hosting the snapshots I want to send as “server”, and the offsite FreeBSD system that I will send snapshots to as “backup”. Unless otherwise noted, all steps were performed as the user root. However, a non-root user, “iceflatline”, was created on both machines and is used for many of the commands. The versions for the software used in this post were as follows:
Configure server
On server I had created a simple mirror vdev for my zpool consisting of two 2 TB disks. The mirror and the zpool were created using the following commands:
gpart create -s gpt ada1
gpart create -s gpt ada2
gpart add -t freebsd-zfs -a 1m ada1
gpart add -t freebsd-zfs -a 1m ada2
zpool create pool_0 mirror /dev/ada1p1 /dev/ada2p1
As you can see, I created one large ZFS partition (-t freebsd-zfs) on each disk. With the -a option specified, the gpart utility tries to align the start offset and size of the partition to a multiple of the alignment value; I chose 1 MiB. The advantage is that 1 MiB is a multiple of 4096 bytes (helpful for drives with 4 KiB sectors), and it leaves any leftover fraction of a mebibyte unused at the end of the drive. In the future, if I have to replace a failed drive containing a slightly different number of sectors, I’ll have some wiggle room in case the replacement drive is slightly smaller in size. After partitioning each drive I created the zpool using these partitions. I elected to use the name “pool_0” for this zpool.
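If you want to sanity-check the resulting layout and the health of the new mirror, gpart and zpool can both report it. A quick check along these lines, using the same device and pool names as above:

gpart show ada1 ada2
zpool status pool_0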
To improve overall performance and usability of any datasets that I create in this zpool, I performed the following configuration changes:
zfs set atime=off pool_0
zfs set compression=lz4 pool_0
zfs set snapdir=visible pool_0
The atime property controls whether the access time for files is updated when the files are read. Setting this property to off avoids producing write traffic when reading files, which can result in a gain in file system performance. The compression property controls the compression algorithm used for the datasets; lz4 is a high-performance replacement for the older Lempel Ziv Jeff Bonwick (lzjb) algorithm, featuring faster compression and decompression, as well as a generally higher compression ratio than lzjb. The snapdir property controls whether the directory containing my snapshots (pool_0/dataset_0/.zfs) is hidden or visible. I prefer the directory to be visible so I have another way to verify the existence of snapshots. These configuration changes were made at the zpool level so that any datasets I create in this zpool will inherit these settings; however, I could configure each dataset differently if desired.
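To confirm that these settings took effect (and will be inherited by new datasets), the properties can simply be read back with zfs get, for example:

zfs get atime,compression,snapdir pool_0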
The dataset on server that I back up offsite is called “dataset_0”, and was created using the following command:
zfs create pool_0/dataset_0
To ensure I still have some headroom if/when the zpool starts to get full, I set the size quota for this dataset to 80% of the zpool’s size (1819 GiB), or 1455 GiB:
zfs set quota=1455G pool_0/dataset_0
Since ZFS can send a stream representation of a dataset to standard output, the stream can be piped through secure shell (SSH) to send it securely over a network connection. By default, root user privileges are required to send and receive these streams, which would mean logging into the receiving system as user root. However, logging in as root over SSH is disabled by default on FreeBSD systems for security reasons. Fortunately, the necessary ZFS permissions can be delegated to a non-root user on each system. The minimum delegated ZFS permissions I needed for user iceflatline to successfully send snapshots from server were as follows:
zfs allow -u iceflatline create,destroy,hold,mount,receive,send,snapshot pool_0
In this case I delegated the permissions at the zpool level, so any datasets I create in pool_0 will inherit them. Alternatively I could have delegated permissions at the dataset level or a combination of both if desired. There’s a lot of flexibility.
I can verify which permissions have been delegated at any time using the following command as either user root or iceflatline:
zfs allow pool_0
Finally, to avoid having to enter a password each time a backup is performed, I generated an SSH key pair as user iceflatline on server and copied the public key to /usr/home/iceflatline/.ssh/authorized_keys on backup.
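For completeness, the key setup might look roughly like the following; the ed25519 key type and the use of ssh-copy-id are my choices here, and the public key can just as well be appended to authorized_keys by hand:

ssh-keygen -t ed25519
ssh-copy-id iceflatline@192.168.20.6
# or, if ssh-copy-id is not available:
cat ~/.ssh/id_ed25519.pub | ssh iceflatline@192.168.20.6 'cat >> ~/.ssh/authorized_keys'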
Configure backup
I configured backup similarly to server: a simple mirror vdev, and a zpool named pool_0 with the same configuration as the one on server. I did not create a dataset on this zpool because I will be replicating pool_0/dataset_0 on server directly to pool_0 on backup.
The minimum delegated ZFS permissions I needed for user iceflatline on backup to successfully receive these snapshots were as follows:
zfs allow -u iceflatline create,destroy,mount,mountpoint,quota,receive,send,snapdir pool_0
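With the SSH key already in place, I can verify the delegation on backup directly from server without logging in interactively; for example, as user iceflatline on server:

ssh iceflatline@192.168.20.6 zfs allow pool_0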
Using zfs send and receive
After configuring both machines it was time to test. First, I created a full snapshot of pool_0/dataset_0 on server using the following command as user iceflatline:
zfs snapshot -r pool_0/dataset_0@snap-test-0
While not strictly needed in this case, the -r option will recursively create snapshots of any child datasets that I may have created under pool_0/dataset_0.
Now I can send this newly created snapshot to backup, which was assigned the IP address 192.168.20.6. The following command is performed as user iceflatline:
zfs send pool_0/dataset_0@snap-test-0 | ssh iceflatline@192.168.20.6 zfs receive -vudF pool_0
The zfs send command creates a data stream representation of the snapshot and writes it to standard output, which is then piped through SSH to securely send the snapshot to backup. The -v option will print information about the size of the stream and the time required to perform the receive operation. The -u option prevents the file system associated with the received data stream (pool_0/dataset_0 in this case) from being mounted. This was desirable as I’m using backup simply to store the dataset_0 snapshots offsite; I don’t need to mount them on that machine. The -d option is used so that all but the pool name (pool_0) of the sent snapshot is appended to pool_0 on backup. Finally, the -F option is useful for destroying snapshots on backup that do not exist on server.
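A quick way to confirm that the snapshot arrived is to list the snapshots on backup over SSH; because of the -d option, the received snapshot should appear under pool_0/dataset_0 on backup:

ssh iceflatline@192.168.20.6 zfs list -t snapshot -r pool_0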
zfs send can also determine the difference between two snapshots and send only the differences between the two. This saves on disk space as well as network transfer time. For example, if I perform the following command as user iceflatline:
zfs snapshot pool_0/dataset_0@snap-test-1
A second snapshot, pool_0/dataset_0@snap-test-1, is created. An incremental stream between this snapshot and the previous one, pool_0/dataset_0@snap-test-0, contains only the file system changes that occurred in pool_0/dataset_0 between the times the two snapshots were created. Now, as user iceflatline, I can use zfs send with the -i option and indicate the pair of snapshots to generate an incremental stream containing only the data that has changed:
zfs send -R -i pool_0/dataset_0@snap-test-0 pool_0/dataset_0@snap-test-1 | ssh iceflatline@192.168.20.6 zfs receive -vudF pool_0
Note that sending an incremental stream will only succeed if an initial full snapshot already exists on the receiving side. I’ve also included the -R option with the zfs send command this time. This option will preserve the ZFS properties of any descendant datasets, snapshots, and clones in the stream. If the -F option is specified when this stream is received, any snapshots that exist on the receiving side but not on the sending side are destroyed.
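Before sending a large incremental stream it can be handy to preview how much data will be transferred. Recent versions of zfs send support a dry run for this; a quick sketch, assuming the -n and -v flags are available in your ZFS version:

zfs send -R -n -v -i pool_0/dataset_0@snap-test-0 pool_0/dataset_0@snap-test-1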
By the way, I can list all snapshots, including those of pool_0/dataset_0, using the following command as either user root or iceflatline:
zfs list -t snapshot
After testing to make sure that snapshots could be successfully sent to backup, I created an ugly little script that creates a daily snapshot of pool_0/dataset_0 on server; looks for yesterday’s snapshot and, if found, sends an incremental stream containing only the file system data that has changed to backup; looks for any snapshots older than 30 days and deletes them on both server and backup; and finally, logs its output to the file /home/iceflatline/cronlog:
#!/bin/sh
### BEGIN INFO
# PROVIDE:
# REQUIRE:
# KEYWORD:
# Description:
# This script is used to replicate incremental zfs snapshots daily from one pool/dataset(s) to another using ZFS send and receive.
# The number of snapshots to retain is defined in the variable retention.
# Note that an initial full snapshot must be created and sent to destination before this script can be successfully used.
# Author: iceflatline <iceflatline@gmail.com>
#
# OPTIONS:
# -R: Generate replication stream recursively
# -i: Generate incremental stream
# -v: Be verbose
# -u: Do not mount received stream
# -d: Use the full sent snapshot path without the first element (without pool name) to determine the name of the new snapshot
# -F: Destroy snapshots and file systems that do not exist on the sending side.
### END INFO

### START OF SCRIPT

# These variables are named first because they are nested in other variables.
snap_prefix=snap
retention=30

# Full paths to these utilities are needed when running the script from cron.
date=/bin/date
grep=/usr/bin/grep
mbuffer=/usr/local/bin/mbuffer
sed=/usr/bin/sed
sort=/usr/bin/sort
xargs=/usr/bin/xargs
zfs=/sbin/zfs

src_0="pool_0/dataset_0"
dst_0="pool_0"
host="iceflatline@192.168.20.6"

today="$snap_prefix-`date +%Y%m%d`"
yesterday="$snap_prefix-`date -v -1d +%Y%m%d`"
snap_today="$src_0@$today"
snap_yesterday="$src_0@$yesterday"
snap_old=`$zfs list -t snapshot -o name | $grep "$src_0@$snap_prefix*" | $sort -r | $sed 1,${retention}d | $sort | $xargs -n 1`
log=/home/iceflatline/cronlog

# Create a blank line between the previous log entry and this one.
echo >> $log

# Print the name of the script.
echo "zfsrep.sh" >> $log

# Print the current date/time.
$date >> $log
echo >> $log

# Look for today's snapshot and, if not found, create it.
if $zfs list -H -o name -t snapshot | $sort | $grep "$snap_today$" > /dev/null
then
    echo "Today's snapshot '$snap_today' already exists." >> $log
    # Uncomment if you want the script to exit when it does not create today's snapshot:
    #exit 1
else
    echo "Taking today's snapshot: $snap_today" >> $log
    $zfs snapshot -r $snap_today >> $log 2>&1
fi

echo >> $log

# Look for yesterday's snapshot and, if found, perform incremental replication, else print an error message.
if $zfs list -H -o name -t snapshot | $sort | $grep "$snap_yesterday$" > /dev/null
then
    echo "Yesterday's snapshot '$snap_yesterday' exists. Proceeding with replication..." >> $log
    $zfs send -R -i $snap_yesterday $snap_today | ssh $host $zfs receive -vudF $dst_0 >> $log 2>&1
    # For use with local snapshots:
    #$zfs send -R -i $snap_yesterday $snap_today | $zfs receive -vudF $dst_0 >> $log 2>&1
    echo >> $log
    echo "Replication complete." >> $log
else
    echo "Error: Replication not completed. Missing yesterday's snapshot." >> $log
fi

echo >> $log

# Remove snapshot(s) older than the value assigned to $retention.
echo "Attempting to destroy old snapshots..." >> $log
if [ -n "$snap_old" ]
then
    echo "Destroying the following old snapshots:" >> $log
    echo "$snap_old" >> $log
    $zfs list -t snapshot -o name | $grep "$src_0@$snap_prefix*" | $sort -r | $sed 1,${retention}d | $sort | $xargs -n 1 $zfs destroy -r >> $log 2>&1
else
    echo "Could not find any snapshots to destroy." >> $log
fi

# Mark the end of the script with a delimiter.
echo "**********" >> $log
# END OF SCRIPT
To use the script, I saved it to /home/iceflatline/bin with the name zfsrep.sh and, as user iceflatline, made it executable:
chmod +x /home/iceflatline/bin/zfsrep.sh
Then I added the following cron job to the crontab under the iceflatline user account. The script runs every day at 2300 local time:
# Run backup scripts every day at 2300
0 23 * * * /home/iceflatline/bin/zfsrep.sh
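One way to install and then verify the entry, as user iceflatline, is with the standard crontab utility:

crontab -e
crontab -l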
The script is working pretty well for me, but I soon discovered that if it missed a daily snapshot, or could not successfully send a daily snapshot to backup, say because either server or backup was offline or the connection between the two was down, then an error would occur the following day when the script attempted to send a new incremental stream. This is because backup was missing the previous day’s snapshot, so the script could not send an incremental stream. To recover from this error I needed to manually send the missing snapshots. Say, for example, I had the following snapshots on server:
pool_0/dataset_0@snap-20150620
pool_0/dataset_0@snap-20150621
pool_0/dataset_0@snap-20150622
Now say that the script was not able to create pool_0/dataset_0@snap-20150623 on server because the machine was offline for some reason. Consequently, it was not able to replicate that snapshot to backup. The next day, when server is back online, the script will successfully create another daily snapshot, pool_0/dataset_0@snap-20150624, but will not be able to send it to backup because pool_0/dataset_0@snap-20150623 is missing. To recover from this problem I’ll need to manually perform an incremental zfs send using pool_0/dataset_0@snap-20150622 and pool_0/dataset_0@snap-20150624:
zfs send -R -i pool_0/dataset_0@snap-20150622 pool_0/dataset_0@snap-20150624 | ssh iceflatline@192.168.20.6 zfs receive -vudF pool_0
Now both server and backup have the same snapshots and the script will function normally again.
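One way to identify the most recent snapshot that both machines still have in common before running such a manual send is to compare their snapshot lists; a rough sketch (the temporary file names are just for illustration):

zfs list -H -t snapshot -o name -r pool_0/dataset_0 | sort > /tmp/server_snaps
ssh iceflatline@192.168.20.6 zfs list -H -t snapshot -o name -r pool_0/dataset_0 | sort > /tmp/backup_snaps
comm -12 /tmp/server_snaps /tmp/backup_snaps | tail -n 1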
File recovery
Now that I have a way to reliably replicate the file system offsite on a daily basis, what happens if I need to recover some files? Fortunately, there are a couple of options available to me. First, because I chose to make snapshots visible on server, I can easily navigate to /pool_0/dataset_0/.zfs/snapshot and copy any files up to 30 days in the past (given the current retention value in the script). I could also mount pool_0/dataset_0 on backup and copy these same files from there using a utility like scp if desired.
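For example, pulling a single file back out of a snapshot on server is just a copy out of the hidden snapshot directory; the file name here is, of course, hypothetical:

cp /pool_0/dataset_0/.zfs/snapshot/snap-20150620/somefile.txt /home/iceflatline/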
I could also send snapshot(s) from backup back to server. To do this I would create a new dataset on pool_0 on server. In this example, the new dataset is named receive:
zfs create pool_0/receive
Why is creating a new dataset necessary? Because the dataset pool_0/dataset_0 already exists on server, so if I tried to send pool_0/dataset_0@some-snapshot from backup back to server there would be a conflict. I could have avoided this step if I had created a dataset on pool_0 on backup and replicated snapshots of pool_0/dataset_0 to that dataset instead of directly to pool_0.
Okay, now, as user iceflatline I can send the snapshot(s) I want from backup to server:
ssh iceflatline@192.168.20.6 zfs send pool_0/dataset_0@snap-20150620 | zfs receive -vduF pool_0/receive
After the stream is fully received I switch to user root and mount the dataset:
zfs mount pool_0/receive/dataset_0
This results in pool_0/dataset_0@snap-20150620, sent from backup, being mounted at /pool_0/receive/dataset_0 on server. Now I can navigate to /pool_0/receive/dataset_0 and copy the files I need to recover, or I can clone (or clone and promote) pool_0/receive/dataset_0@snap-20150620, whatever.
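If I wanted a writable copy rather than just reading files out of the mounted dataset, the clone/promote route mentioned above might look roughly like this; the name pool_0/restored is just an example, and the commands need root or the appropriate delegated permissions:

zfs clone pool_0/receive/dataset_0@snap-20150620 pool_0/restored
zfs promote pool_0/restored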
Conclusion
Well, that’s it. A long and rambling post on how I’m using the replication features in FreeBSD’s ZFS to improve the reliability and resiliency of my file system backups. So far, it’s working rather well for me, and it’s been a great learning experience. Is it the best or only way? Likely not. Are there better (or at least more professional) utilities or scripts to use? Most assuredly. But for now I’ve met my most important requirement: reliably backing up my data offsite.
References
ZFS(8)
https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/zfs.html