Updated ZFS Replication and Snapshot Rollup Script
Thanks to the efforts of Ryan Kernan, we have an updated ZFS replication and snapshot rollup script. Ryan’s OpenIndiana/Solaris/Illumos community contribution improves the script to allow more dynamic source-to-target pool replication and changes the snapshot retention method from a Grandfather-Father-Son scheme to a fixed number of snapshots.
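To illustrate what count-based retention means, here is a minimal sketch. It is not taken from the script itself: the snapshot names are made up, and it assumes the list is ordered oldest first. It only prints what would be destroyed.

```shell
# Hypothetical snapshot list, oldest first (stand-in for real `zfs list` output).
snaps='tank/data@01-01-12-00:00
tank/data@01-02-12-00:00
tank/data@01-03-12-00:00
tank/data@01-04-12-00:00
tank/data@01-05-12-00:00'
keep_snaps=3

# Count the snapshots and work out how many exceed the retention count.
total=$(printf '%s\n' "$snaps" | wc -l)
excess=$((total - keep_snaps))

# Everything beyond the newest $keep_snaps entries is a candidate for removal.
to_destroy=''
if [ "$excess" -gt 0 ]; then
    to_destroy=$(printf '%s\n' "$snaps" | head -n "$excess")
fi

for s in $to_destroy; do
    echo "would destroy: $s"
done
```

A real run would replace the echo with a destroy command, which is why the sketch stops at printing.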
Regards,
Mike
Site Contents: © 2011 Mike La Spina
If ZFS file systems with similar names exist in the input list, the replication fails.
I added an @ sign when matching snapshots to allow for similar names.
Regards,
Leif
patch:
*** zfs-replication.sh 2012-02-07 13:06:30.729053500 +0100
--- /tmp/zfs-replication.sh 2012-02-07 13:06:09.781167636 +0100
***************
*** 104,110 ****
ssh -n $dhost pfexec zfs create -p $zfspath
ssh -n $dhost pfexec zfs set mountpoint=none $zfspath
! last_snap_shost=$( pfexec zfs list -o name -t snapshot -H | grep $zfspath | tail -1 )
echo $(date) "->" $last_snap_shost Initial replication start. >> replicate.log
pfexec zfs send -v -R $last_snap_shost | ssh $dhost pfexec zfs recv -v -F -d $zfspool
echo $(date) "->" $last_snap_shost Initial replication end. >> replicate.log
--- 104,110 ----
ssh -n $dhost pfexec zfs create -p $zfspath
ssh -n $dhost pfexec zfs set mountpoint=none $zfspath
! last_snap_shost=$( pfexec zfs list -o name -t snapshot -H | grep "${zfspath}@" | tail -1 )
echo $(date) "->" $last_snap_shost Initial replication start. >> replicate.log
pfexec zfs send -v -R $last_snap_shost | ssh $dhost pfexec zfs recv -v -F -d $zfspool
echo $(date) "->" $last_snap_shost Initial replication end. >> replicate.log
***************
*** 165,171 ****
if [ "$zfspath" != "" ]
then
! pfexec zfs list -o name -t snapshot | grep $zfspath > $snap_list
while read snaps
do
--- 165,171 ----
if [ "$zfspath" != "" ]
then
! pfexec zfs list -o name -t snapshot | grep "${zfspath}@" > $snap_list
while read snaps
do
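The effect of the patch can be seen without a ZFS system at hand. The following is a small sketch using made-up snapshot names in place of real `zfs list` output; `tank/data` and `tank/data2` are hypothetical file systems with similar names.

```shell
# Hypothetical names standing in for `zfs list -o name -t snapshot -H` output.
snapshots='tank/data@02-06-12-01:00
tank/data2@02-06-12-01:00
tank/data@02-07-12-01:00
tank/data2@02-07-12-01:00'
zfspath='tank/data'

# Unpatched match: tank/data2 also matches, so tail -1 picks the wrong snapshot.
loose=$(printf '%s\n' "$snapshots" | grep "$zfspath" | tail -1)

# Patched match: the trailing @ restricts the match to snapshots of tank/data.
strict=$(printf '%s\n' "$snapshots" | grep "${zfspath}@" | tail -1)

echo "without @: $loose"
echo "with @:    $strict"
```

Note that the pattern is still not anchored at the start of the line, so a file system such as `othertank/data` would still match; that case is outside what the patch addresses.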
Hello,
Thank you for putting together such a useful script. I do have 2 questions though.
1. Has the above patch been applied to the script that is available for download? If not, how do I apply the patch? Is the patch required?
2. What does a sample input file look like?
Thank you.
Allan
Allan,
The patch is applied to the linked script and I have updated the comments to indicate what the input file format should be.
Regards,
Mike
Hi Mike
I just want to replicate/send a snapshot to a different pool on the same system. Can I do that with this script? Please excuse the ignorance, I’m new to ZFS.
thanks
Paul
Hi Paul,
This script and its associated ssh key pair setup are really intended for replication to an external host over the network. On a local system it is not required, and you can simply issue the command locally.
For example:
pfexec zfs send somepool/somefilesystem@snapname | pfexec zfs recv sameorotherpool/somefilesystem
will provide the same result without the overhead.
The script will not work unmodified.
You could remove all the ssh portions and hard code the target pool in the script and it would provide the result as well.
Regards,
Mike
Looking through the code, I notice there’s a source $zfspath, but no destination, which means the pool/tank name on both source and destination must be the same?
Would it be better to do something like this?
input file: pool1/zfsstore,host1,pool2/replicationstore,host2
This means you’d need to have a $zfspoolSrc, $zfspathSrc and corresponding $zfspoolDest and $zfspathDest, starting with function parse_rep_list().
And without even more modification, I believe the zfs send would create a $zfspoolDest/$zfspathDest/basename($zfspathSrc), e.g. if you are replicating “pool1/share1/filesystem1”, with destination “pool2/share2/filesystem2”, you’d end up with “pool2/share2/filesystem2/filesystem1”?
Hi Yu-Phing,
That would certainly enable a more dynamic capability for the script and yes it would create the zfs file system but not always the way you want.
This method can create issues if there are clones in the source ZFS file system.
I would recommend keeping the source path intact on the destination, e.g. SrcPool/SrcName -> DestPool/SrcName.
You also need to be careful with NFS mounts.
For example if the target host already contained an NFS mount point of /export/vol1 to some portion of the replicated ZFS file system then any source NFS share with the same mount point would conflict.
NFS properties will be inherited as a property of the ZFS file system.
Regards,
Mike
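The SrcPool/SrcName -> DestPool/SrcName mapping recommended above can be sketched with plain shell parameter expansion. The variable names here are illustrative and not taken from the script. As far as I understand it, this is also the naming that `zfs recv -d`, as used in the script, produces: the first element of the sent path is dropped and the rest is kept under the target pool.

```shell
# Hypothetical source path and destination pool.
src_path='SrcPool/SrcName/child'
dest_pool='DestPool'

# Split off the pool (first path element) and keep the remainder intact.
src_pool=${src_path%%/*}
rel_path=${src_path#*/}

# Re-root the unchanged relative path under the destination pool.
dest_path="$dest_pool/$rel_path"

echo "$src_pool: $src_path -> $dest_path"
```

The sketch assumes the source path always contains at least one `/`; a bare pool name would need separate handling.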
There is a typo in the file at line 79:
$$ZFS should be $ZFS
Corrected.
Thanks Nicolas!
Here’s a diff to get this script working correctly on a FreeBSD 9.0 system. /usr/ports/shells/bash is required. Rudimentary locking is in place as well.
--- /tmp/zfs-replication.sh 2012-04-22 09:02:50.000000000 -0500
+++ /root/bin/zfs-replication.sh 2012-05-18 16:06:35.000000000 -0500
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/local/bin/bash
#
# Author: Ryan Kernan
# Date: September 23, 2011
@@ -24,18 +24,23 @@
#
#
#
-
-ZFS=/usr/sbin/zfs
-DATE=/usr/gnu/bin/date
+ZFS=/sbin/zfs
+DATE=/bin/date
SSH=/usr/bin/ssh
-PFEXEC=/usr/bin/pfexec
-export PATH=/usr/gnu/bin:/usr/bin:/usr/sbin:/sbin # Set Paths.
+PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/root/bin
rep_list=$1
keep_snaps=7
snap_list=snaplist.lst
+if [ -e /tmp/zfs-rep.lock ];
+then
+ exit
+else
+ touch /tmp/zfs-rep.lock
+fi
+
#######################################################################################
####################################Function###########################################
#######################################################################################
@@ -75,8 +80,8 @@
last_snap_dhost=""
last_snap_shost=""
- last_snap_shost=$( $PFEXEC $ZFS list -o name -t snapshot | grep $zfspath | tail -1 )
- last_snap_dhost=$( $SSH -n $dhost $PFEXEC $ZFS list -H -o name -r -t snapshot | grep $zfspath | tail -1 )
+ last_snap_shost=$( $ZFS list -o name -t snapshot | grep $zfspath | tail -1 )
+ last_snap_dhost=$( $SSH -n $dhost $ZFS list -H -o name -r -t snapshot | grep $zfspath | tail -1 )
true
}
@@ -96,7 +101,7 @@
dhost_fs_exists() {
dhost_fs_name=""
- dhost_fs_name=$($SSH -n $dhost $PFEXEC $ZFS list -o name -H $zfspath | tail -1)
+ dhost_fs_name=$($SSH -n $dhost $ZFS list -o name -H $zfspath | tail -1)
if [ "$dhost_fs_name" = "" ]
then
@@ -119,11 +124,11 @@
dhost_create_fs() {
- $SSH -n $dhost $PFEXEC $ZFS create -p $zfspath
- $SSH -n $dhost $PFEXEC $ZFS set mountpoint=none $zfspath
- last_snap_shost=$( $PFEXEC $ZFS list -o name -t snapshot -H | grep "{$zfspath}\@" | tail -1 )
+ $SSH -n $dhost $ZFS create -p $zfspath
+ $SSH -n $dhost $ZFS set mountpoint=none $zfspath
+ last_snap_shost=$( $ZFS list -o name -t snapshot -H | grep "$zfspath\@" | tail -1 )
echo $($DATE) "->" $last_snap_shost Initial replication start. >> replicate.log
- $PFEXEC $ZFS send -v -R $last_snap_shost | $SSH $dhost $PFEXEC $ZFS recv -v -F -d $zfspool
+ $ZFS send -v -R $last_snap_shost | $SSH $dhost $ZFS recv -v -F -d $zfspool
echo $($DATE) "->" $last_snap_shost Initial replication end. >> replicate.log
}
@@ -141,7 +146,7 @@
snap_date="$($DATE +%m-%d-%y-%H:%M)"
echo $($DATE) "->" $zfspath@$snap_date Snapshot creation start. >> replicate.log
- $PFEXEC $ZFS snapshot $zfspath@$snap_date
+ $ZFS snapshot $zfspath@$snap_date
echo $($DATE) "->" $zfspath@$snap_date Snapshot creation end. >> replicate.log
}
@@ -160,7 +165,7 @@
incr_repl_fs() {
echo $($DATE) "->" $last_snap_dhost $last_snap_shost Incremental send start. >> replicate.log
- $PFEXEC $ZFS send -I $last_snap_dhost $last_snap_shost | $SSH $dhost $PFEXEC $ZFS recv -d -F $zfspool >> replicate.log
+ $ZFS send -I $last_snap_dhost $last_snap_shost | $SSH $dhost $ZFS recv -d -F $zfspool >> replicate.log
echo $($DATE) "->" $last_snap_dhost $last_snap_shost Incremental send end. >> replicate.log
}
@@ -182,28 +187,24 @@
if [ "$zfspath" != "" ]
then
- $PFEXEC $ZFS list -o name -t snapshot | grep "${zfspath}\@" > $snap_list
+ $ZFS list -o name -t snapshot | grep $zfspath\@ > $snap_list
while read snaps
do
- stringpos=0
- let stringpos=$(expr index "$snaps" @)+1
- SnapDate=$( $PFEXEC expr substr $snaps $stringpos 8 )
- let stringpos=$($PFEXEC expr index "$snaps" @)+10
- SnapTime=$( $PFEXEC expr substr $snaps $stringpos 5 )
+ SnapDate=`echo $snaps | cut -d @ -f 2`
SnapDay="$(echo $SnapDate | cut -d- -f2)"
SnapMonth="$(echo $SnapDate | cut -d- -f1)"
SnapYear="$(echo $SnapDate | cut -d- -f3)"
SnapDate="$SnapMonth-$SnapDay-$SnapYear"
- if [ "$($DATE +%m-%d-%y --date="$keep_snaps days ago")" = "$SnapDate" ]
+ if [ "$($DATE -v-${keep_snaps}d +%m-%d-%y)" = "$SnapDate" ]
then
echo "Destroying $SnapDate snapshot $snaps on $shost" >> replicate.log
- $PFEXEC $ZFS destroy $snaps
+ $ZFS destroy $snaps
echo "Destroying $SnapDate snapshot $snaps on $dhost" >> replicate.log
- $PFEXEC $SSH -n $dhost $PFEXEC $ZFS destroy $snaps
+ $SSH -n $dhost $ZFS destroy $snaps
fi
done > replicate.log # Get the start and stop snaps.
@@ -260,4 +260,7 @@
done < $rep_list
+rm /tmp/zfs-rep.lock
+
exit 0
+
Thanks for your community contribution!
Here is an online version of the file: http://blog.laspina.ca/misc/zfs-repl-freebsd9.sh
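The lock in the diff above is described as rudimentary: the check and the `touch` are separate steps, and an early exit leaves the lock file behind. A somewhat sturdier variant, offered only as a sketch and not part of either script, uses `mkdir` (which fails if the directory already exists) and a trap to clean up. The lock path and function names are hypothetical.

```shell
# Hypothetical lock location; any directory writable by the script would do.
lock_dir="${TMPDIR:-/tmp}/zfs-rep-example.lock"

acquire_lock() {
    # mkdir either creates the directory or fails, in a single step.
    mkdir "$lock_dir" 2>/dev/null
}

release_lock() {
    rmdir "$lock_dir" 2>/dev/null
}

if acquire_lock; then
    # Remove the lock when the script exits, including on early exits.
    trap release_lock EXIT
    echo "lock acquired, replication would run here"
else
    echo "another run appears to be active, exiting" >&2
    exit 1
fi
```

A lock left behind by a killed process (for example after `kill -9`) still has to be removed by hand.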
minor problem in the existing script:
last_snap_shost=$( $PFEXEC $ZFS list -o name -t snapshot -H | grep "{$zfspath}\@" | tail -1 )
should be:
last_snap_shost=$( $PFEXEC $ZFS list -o name -t snapshot -H | grep "${zfspath}\@" | tail -1 )
Thanks for the bug check. The online script is now updated to v1.2
I found the reason: I need to create an input file with the format zpool/fs,sourcehost,desthost.
I see you are using zfs send -R for the initial send, but why not use zfs send -R -i for the incremental send? I have a zpool/project, with the sub-project created by zfs create zpool/projectX.
The code uses -i to define a starting and an ending point in the ZFS send stream, i.e. it sends the delta between two points in time. -R sends all subsequent snapshots after a point in time, which gives an undesired result when you only require a specific range. Thus the subroutine is designed to handle a specific range, or it can be set to the current point in time.
Thank you for your work. I read that it was updated to version 1.2, but the download reads version 1.1 in the comments. Is the FreeBSD 9.0 patch also included in this version?
Thanks in advance
mccs,
Have a look at http://blog.laspina.ca/ubiquitous/updated-zfs-replication-and-snapshot-rollup-script/comment-page-1#comment-962
I am having a great deal of trouble with this input file. I guess I just do not understand what it needs to be named or where it should be placed. I also have no idea how to format the file correctly. Any help would be greatly appreciated.
There are only three parameters you need to address.
1. The pool and zfs path which are represented as:
pool/fsname
2. The source host name or ip
3. The target host name or ip
On the target host you need to create the same pool name as the source.
Do not create the fsname as it will be created for you.
The three params must be separated by commas.
pool/fsname,host1,host2
Those are the basics of the script.
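As a sketch of how one such line breaks down into its three parameters (this is an illustration, not the script's own parsing function), a comma-separated read is enough. The variable names follow the ones visible in the diffs above; deriving the pool from the first path element is an assumption.

```shell
# One line of the input file, in the format described above.
line='pool/fsname,host1,host2'

# Split on commas into the ZFS path, source host and target host.
IFS=, read -r zfspath shost dhost <<EOF
$line
EOF

# Assumption: the pool is the first element of the ZFS path.
zfspool=${zfspath%%/*}

echo "path=$zfspath pool=$zfspool source=$shost target=$dhost"
```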
I understand the format that the parameters need to be in, but I am having some issues understanding where to put them. It almost looks like the script points to a file to find the config. Is this correct? So do I just create a file with the correct parameters in it and then alter the script to point to that file?
File named: list
with the following in it pool/storage,zfs1,zfs2 assuming storage is the name of the pool and the zfs machines are named zfs1 and zfs2.
Do I then alter the script here, like this:
rep_list=list
keep_snaps=7
snap_list=snaplist.lst
Or am I completely off?
Hi Chuck,
You’re close. The variable $1 is the first parameter passed to the script. Your file name can be whatever you wish and does not become a hard-coded element in the script. For example, if your file name was zfsrepl.list you would invoke the script as follows:
pfexec ./zfs-replication.sh zfsrepl.list
The file contents become the input values within the script, with $1 as the file handle.
Regards,
Mike
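The role of $1 can be sketched as follows. This is a simplified stand-in for the script's main loop, not its actual code: the function name and the messages are made up, and it only prints what it would do. It also reports a missing or unreadable list file instead of exiting silently.

```shell
# Hypothetical helper: read a replication list whose name is passed as $1.
read_rep_list() {
    rep_list=$1
    if [ -z "$rep_list" ] || [ ! -r "$rep_list" ]; then
        echo "usage: zfs-replication.sh <list-file>" >&2
        return 1
    fi
    while IFS=, read -r zfspath shost dhost; do
        # Skip blank lines and comment lines.
        case "$zfspath" in ''|\#*) continue ;; esac
        echo "would replicate $zfspath from $shost to $dhost"
    done < "$rep_list"
}

# Usage with a temporary list file in place of zfsrepl.list.
tmp_list=$(mktemp)
printf '%s\n' 'pool/fsname,host1,host2' > "$tmp_list"
read_rep_list "$tmp_list"
read_rep_list "" || echo "missing list file detected"
rm -f "$tmp_list"
```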
Got it thanks!
Hi Mike, I’m using ZFS on Linux with Ubuntu on some production servers and, based on your script, I made a new one with several new features (I kept some credits to you in the header):
– supports running every minute, using a new snapshot name format
– supports several cron schedules at the same time, using a snapshot prefix for the name
– supports different protocols for replication: SSH, SSH+GZIP, NETCAT, SOCAT, NETCAT+SOCAT, allowing transfers of up to 250 MB/s over 10 Gb networks
– source and target datasets can have different names
– has an option to clean snapshots in the target that are not in the source
– can send mail on any error
– has an option to compare source and target by checksumming after replication (for ZFS on Linux, because it is still a release candidate)
I published it on GitHub: https://github.com/kattunga/zfs-scripts.git
At the moment it’s designed for Ubuntu Linux, but if anybody is interested it can be modified to run on BSD, OpenIndiana, etc.
I’m still developing it, but it is now working very well.
Regards
Looks great, thanks for contributing your improvements.
Regards,
Mike
I’m Back
OK finally got around to trying this out.
Here is the setup, with both machines running OpenIndiana with napp-it:
1st machine: ZFS1, 192.168.16.9
2nd machine: ZFS2, 192.168.16.8
I have a pool/ZFSFolder on ZFS1 named ZFS1/ZFS1
I created a pool on ZFS2 named ZFS1, with no ZFSFolder
I placed the script and list file (hosts.list) in /root using WinSCP and made sure they were executable.
Contents of list file are:
ZFS1/ZFS1,192.168.16.9,192.168.16.8
I then SSH’d into 192.168.16.9 and invoked the script with “pfexec ./zfs-replication.sh hosts.list”
It just goes right back to the command prompt and doesn’t run anything.
Do I need to establish a handshake with ssh certs? If so, do you have a good link explaining this process in OpenIndiana? I couldn’t find anything very clear on Google.
Did I leave anything else out?
Any help is greatly appreciated.
Hey Chuck,
You do need to provision ssh connectivity between the hosts, and you should use a dedicated account.
Follow the latter part of this blog entry, http://blog.laspina.ca/ubiquitous/encapsulating-vt-d-accelerated-zfs-storage-within-esxi, for creating the account, assigning permissions and granting RSA certificates to the accounts.
Regards,
Mike
One Step Closer. SSH is now working without a password prompt. I am just using this in a test environment so I set it up under root.
The script is still exiting without looking like it is running. Looks like it should spit out a log file but I see nothing in the folder where the script resides. Thanks again. Hate to be a pain but hopefully all my newbie questions help someone else also.
Hi all
Is the retention period for the snapshots applied only to the source, or to both the source and target hosts?
Thanks in advance 🙂
Hi Nicko,
Just the source is processed.
Regards,
Mike
Dear Mike
The FreeBSD 9 version is sadly very broken. The patch must have been applied incorrectly, as only half of check_last_snap() is there.
Any chance the FreeBSD changes could make it upstream into your script?
Hugs,
Sandra =)
Sandra,
Here is the link to the FreeBSD version, it’s untested. http://blog.laspina.ca/misc/zfs-repl-freebsd9.sh
Regards,
Mike
Dear Mike,
Try having a look at line 50 in your link, where half of a function is missing.
I patched it by hand
http://pastebin.com/HYB23rMV
The only difference is that the paths are different; no other changes were made. So wrapping a couple of if-statements around those should give FreeBSD support in your script.
Hugs,
Sandra =)
Oh my, that’s very broken … OK I have updated it temporarily. When time permits it just needs an “if statement” that sets the path variables. Setting $PFEXEC to null along with the other path changes should make it work for either platform.
Thanks for the contribution.
Regards,
Mike
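The "if statement" idea above could look roughly like the following. It is a sketch only: the paths are the ones that appear in the FreeBSD diff earlier in the thread, the `uname -s` test and the fallback to whatever `date` is on the PATH are assumptions, and the second half shows the two date syntaxes from that diff side by side.

```shell
# Pick tool paths per platform; PFEXEC is left empty where it does not exist.
case "$(uname -s)" in
    FreeBSD)
        ZFS=/sbin/zfs
        DATE=/bin/date
        PFEXEC=""
        ;;
    *)
        ZFS=/usr/sbin/zfs
        DATE=/usr/gnu/bin/date
        PFEXEC=/usr/bin/pfexec
        ;;
esac
SSH=/usr/bin/ssh

# Assumption: fall back to the date found on PATH if the configured one is missing.
[ -x "$DATE" ] || DATE=$(command -v date)

# The "N days ago" date: GNU date takes --date, FreeBSD date takes -v.
keep_snaps=7
if cutoff=$("$DATE" --date="$keep_snaps days ago" +%m-%d-%y 2>/dev/null); then
    :
else
    cutoff=$("$DATE" -v-"${keep_snaps}"d +%m-%d-%y)
fi

echo "zfs=$ZFS ssh=$SSH pfexec=${PFEXEC:-none} cutoff=$cutoff"
```

With PFEXEC empty, an unquoted `$PFEXEC $ZFS list ...` expands to just the zfs command, so the existing command lines would not need to change.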