The ability to perform file system, database and volume snapshots grants us many data protection benefits. However there are some serious problems that can occur if we do not carefully architect snapshot based storage infrastructures. This blog entry will discuss some of the issues with data noise induction and data integrity when using point in time data snapshot activities and how we can reduce the negative aspects of these data protection methods.
With the emergence of snapshot technology in the data center data noise induction is an unwanted by product and needs to addressed. Active data within a file store or raw volume will have significant amounts of temporary data like memory swaps and application temp files. This type of data is required for system operations but unfortunately it is completely useless within a point in time snapshot and simply consumes valuable storage space with no permanent value within the scope of system data protection. There are many sources of this undesirable data noise that we need to consider and define strategies to isolate and eliminate them where possible.
In some cases using raw stores eg. iSCSI, FC etc. we will have duplicate snapshot functionality points in the storage stream and this further complicates how we approach a solution to noise induction issues. One common example of snapshot functionality duplication is Microsoft Windows 2003 Volume Snapshot Service aka. VSS. If we enable VSS and an external snapshot service is employed then we are now provisioning snapshots of snapshots which of course is less than optimal because much of the delta between points in time are just redundant encapsulated data. There are some higher level advantages to allowing this to occur like provisioning self service end user restores and VSS aware file system quiescence but for the most part it is not optimal from a space consumption or performance and efficiency perspective.
If we perform snapshots at multiple points in the data storage stream using VSS we will have three points of data delta. The changed data elements on the source store of the primary files, a copy-on-write set of changed blocks of the same primary store including it’s meta data and finally the external snapshot delta and it’s meta data. As well if the two snapshot events were to occur at the same time it creates a non-integral copy of a snapshot and meta data which is just pure wasted space since it is inconsistent.
With the co-existing use of VSS we need to define what functionality is important to us. For example VSS is limited to 512 snapshots and 64 versions so if we need to exceed these limits we have to employ an external snapshot facility. Or perhaps we need to allow a user self service file restore functionality from a shared folder. In many cases this functionality is provided by the external snapshot provisioning point. OpenSolaris, EMC and NetApp are some examples of storage products that can provide such functionality. Of course my preference is Custom OpenSolaris storage servers or the S7000 series of storage product which is based on OpenSolaris and is well suited for the formally supported side of things.
Solely provisioning the snapshots externally verses native MS VSS can significantly reduce induced data noise if the external provider supports VSS features or provides tools to control VSS. VSS copy on write snapshot capability should not be active when using this strategy so as to eliminate the undesirable snapshot echo noise. Most environments will find that they require the use of snapshot services that exceed the native MS VSS capabilities. Provisioning the snapshot function directly on a shared storage system is a significantly better strategy verses allowing a distributed deployment of storage management points across your infrastructure.
OpenSolaris and ZFS provides superior depth in snapshot provisioning than Microsoft shared folder snapshot copy services. Implementing ZFS dramatically reduces space consumption and allows snapshots up to the maximum capacity of the storage system and OpenSolaris provides MS SMB client access to the snapshots which users can manage recovery as a self service. By employing ZFS as a backing store, snapshot management is simplified and snapshots are available for export to alternate share points by cloning and provisioning the point in time to data consumers performing a multitude of desirable tasks such as audits, validation, analysis and testing.
If we need to employ MS VSS snapshot services provisioned on a storage server that uses snapshot based data protection strategies then we will need prevent continuous snapshots on the storage server. We can use features like snap mirror and zfs replication to provision a replica of the data however this would need to be strictly limited in count e.g. 2 or 3 versions and timed to avoid a multiple system snapshot time collision. Older snapshots should be purged and we only allow the MS VSS snapshot provisioning to keep the data deltas.
Another common source of snapshot noise is temporary file data or memory swaps. Fortunately with this type of noise the solution is relatively easy to solve as we simply isolate this type of storage onto storage volumes or shares that are explicitly excluded from a snapshot service function. For example if we are using VMFS stores we can place vswp files on a designated VMFS volume and conversely within an operating system we can create a separate vmdk disk that maps to a VMFS volume which we also exclude from the snapshot function. This strategy requires that we ensure that any replication scheme incorporates the creation or one time replication of these volumes. Unfortunately this methodology does not play well with storage vmotion so one must ensure that the a relocation does not move the noisy vmdk’s back into the snapshot service provisioned stores.
VMware VMFS volume VM snapshots is a significant source of data noise. When a snapshot is initiated from within VMware all data writes are placed on delta file instances. These delta files will be captured on the external storage systems snapshot points and will remain there after the VM snapshot is removed. Significant amount of data delta are produced by VM based snapshots and sometimes mulitple deltas can exceed original vmdk size. An easy way to prevent this undesirable impact is to clone the VM to a store outside the snapshot provisioned stores rather than invoking snapshots.
Databases are probably the most challenging source of snapshot noise data and requires a different strategy than isolation because the data within a specific snapshot is all required to provide system integrity. For example we cannot isolate SQL log data because it is required to do a crash recovery or to roll forward etc. We can isolate temp database stores since any data in those date stores would not be valid in a crash recovery state.
One strategy that I use as both a blanket method when we are not are able to use other methods and in concert with the previously discussed isolation methods is a snapshot roll-up function. This strategy simply reduces the number of long term snapshot copies that are kept. The format is based on a Grand Father, Father and Son (GFS) retention chain of the snapshot copies and is well suited for a variety of data types. The effect is to provide a reasonable amount of data protection points to satisfy most computing environments and keep the captured noise to a manageable value. For example if we were to snapshot without any management cycle every 15 minutes we would accumulate ~35,000 delta points of data over the period of 1 year. Conversely if we employ the GFS method we will accumulate 294 delta points of data over the period of 1 year. Obviously the consumption of storage resource is so greatly reduced that we could keep many additional key points in time if we wished and still maintain a balance of recovery point verses consumption rate.
Let’s take a look at a simple real example of how snapshot noise can impact our storage system using VMware, OpenSolaris and ZFS block based iSCSI volume snapshots. In this example we have a simple Windows Vista VM that is sitting idle, in other words only the OS is loaded and it is power on and running.
First we take a ZFS snapshot of the VMFS ZFS iSCSI volume.
zfs snapshot sp1/ss1-vol0@beforevmsnap
Now we invoke a VMware based snapshot and have a look at the result.
root@ss1:~# zfs list -t snapshot
NAME USED AVAIL REFER MOUNTPOINT
sp1/ss1-vol0@beforevmsnap 228M - 79.4G -
Keep in mind that we are not modifying any data files in this VM if we were to change data files the deltas would be much larger and in many cases with multiple VMware snapshots could exceed the VMs original size if it is allowed to remain for long periods of weeks and longer. The backing store snapshot initially consumes 228MB which will continue to grow as changes occurs on the volume. A significant part of the 228MB is the VMs memory image in this case and of course it has no permanent storage value.
sp1/ss1-vol0@after1stvmsnap 1.44M - 79.5G -
After the initial VMware snapshot occurs we create a new point in time ZFS snapshot and here we observe some noise in the next snapshot and again we have not changed any data files in the last minute or so.
sp1/ss1-vol0@after2ndvmsnap 1.78M - 79.5G -
And yet another ZFS snapshot a couple of minutes later shows more snapshot noise accumulation. This is one of the many issues that are present when we allow non-discretionary placement of files and temporary storage on snapshot based systems.
Now lets see the impact of destroying a snapshot that was created after we delete the VMware based snapshot.
root@ss1:~# zfs destroy sp1/ss1-vol0@beforevmsnap
root@ss1:~# zfs list -t snapshot
NAME USED AVAIL REFER MOUNTPOINT
sp1/ss1-vol0@after1stvmsnap 19.6M - 79.2G -
sp1/ss1-vol0@after2ndvmsnap 1.78M - 79.2G -
Here we observe the reclamation of more than 200MB of storage. And this is why GFS based snapshot rollups can provide some level of noise control.
Well I hope you found this entry to be useful.
Til next time..
Site Contents: © 2009 Mike La Spina
I was conversing with William Lam about my blog entry Protecting ESX VMFS Stores with Automation and we exchanged ideas on the simple automation script that I originally posted. William is well versed in bash and has brought more functionality to the original automation script. We now have a have a rolling backup set 10 versions deep with folder augmented organization based on the host name, store alias, date label and the rolling instance number. The updated script is named vmfs-bu2 linked here.
Thanks for your contribution William!
Site Contents: © 2009 Mike La Spina
Some time ago I shared some interesting information about VMFS volumes that I found using direct analysis in my blog named Understanding VMFS volumes. This spawned some discussions on the VMware Community forums and it became apparent that an automated backup of the critical VMFS info could be useful in the event of an undesirable security event that impacts our system availability. By creating a simple backup script process we can provide the ability to recover much more quickly from such events. In this howto guide we will enable this process with a cron job using the existing /etc/cron.daily/ job location directory. We simply need to copy an automation script to this location and it will run daily. Or if your change rate is less frequent maybe the /etc/cron.weekly location is more suitable.
I have provided a simple script named vmfs-bu linked here which creates VMFS header file backups using dd for all the VMFS volumes visible on an ESX host. The script will output it’s backups to the /var/log directory. Here is a sample of what the backups look like from one my lab hosts.
-rw-r–r– 1 root root 2097152 Mar 28 21:01 vmfs-header-backup-sda.hex
-rw-r–r– 1 root root 2097152 Mar 28 21:01 vmfs-header-backup-sdb.hex
-rw-r–r– 1 root root 2097152 Mar 28 21:01 vmfs-header-backup-sdc.hex
-rw-r–r– 1 root root 2097152 Mar 28 21:01 vmfs-header-backup-sdd.hex
-r——– 1 root root 8388608 Mar 28 21:01 vmfs-metadata-sda-vh.sf.bu
-r——– 1 root root 8388608 Mar 28 21:01 vmfs-metadata-sdb-vh.sf.bu
-r——– 1 root root 8388608 Mar 28 21:01 vmfs-metadata-sdc-vh.sf.bu
-r——– 1 root root 8388608 Mar 28 21:01 vmfs-metadata-sdd-vh.sf.bu
Also be sure that the script permissions are configured as executable using the follwing.
chmod 755 /etc/cron.daily/vmfs-bu
With these simple easy steps we can create a proactive failsafe for VMFS recovery in the event of those undesirable moments we hope will never occur.
In addition to this basic method we can extend it’s functionally to recovery from undesirable VM deletions by automating the experimental vmfs-undelete tool. Before I demonstrate this functionally I must advise you that this tool may have some negative impacts to certain configurations. The vmfs-undelete tool is a set of python scripts that basically read vmks block allocations and outputs them to an archive file. In order to read the vmdks the scripts will invoke VM snapshots and this will not work if a current snapshot exists. This is not a big issue since you should not be leaving any snapshots around anyway as a best practice. However this action can negatively impact VCB or other backup agents that require snapshoting to grant vmdk access, Thus if you do not carefully time the script invocation when we have other snapshot dependant activities it could cause failures of those functions.
With this important information considered we can proceed to look at how to invoke the additional protection feature. The scripts that were originally developed by VMware are designed to be user interactive and cannot be used as originally coded therefore I have modified them in order to provision some basic automation. You can access the modified scripts named vmfs-undelete-auto-script and menuauto.py here.
The modified menuauto.py script needs to be placed within the /usr/lib/vmware/python2.2/site-packages/vmware/undeletemods directory. While I could have just modified the existing menu.py script it is subject to change so this method prevents potential conflict issues. The vmfs-undelete-auto-script script location is optional and can be placed where ever you find apropriate. I chose to place it in the /root directory. The script requires a single argument which is to direct it output location with a path and file name. Since there is a potention for conflict with other snapshot based services the script should be invoked using a cron job outside of the daily or other predefined jobs. This cron job can be implemented using the crontab facility. Here is an example of how to create it while logged in as root.
This command will create a cron entry in /var/spool/cron/root and will load vi as a editor you will need to create the following cron details to invoke the vmfs-undelete script. You can find more info on cron editing at http://en.wikipedia.org/wiki/Cron.
05 22 * * * python /root/vmfs-undelete-auto-script /var/log/vmfs-undelete-archive
This entry will execute at 10:05PM every day. Again be sure to set the security of the script file as executable as shown here.
chmod 755 /root/vmfs-undelete-auto-script
When the script executes it can take several minutes to complete.
Hope you found this information to be useful.
Site Contents: © 2009 Mike La Spina
As a follow on to my blog entry Provisioning Disaster Recovery with ZFS, iSCSI and VMware I created this snapshot rollup script to help maintain the growing snapshots and minimize disk consumption. The script is an add-on to the zfsadm account cron jobs and runs under the security privileges of the zfsadm user detailed in that blog. An input text file is used to specify what ZFS path’s will be rolled up to a Grandfather Father Son backup scheme. All out of scope snapshots are destroyed leaving the current day’s and week’s snapshots, Friday weekly snapshots of the current month, each month’s end and as well, in time the year end snapshots. The cron job needs to run at minimum on target host but it would be prudent to run it on both systems. The script is aware of the possiblity that a snapshot may be cloned and will detect and log it. To add the job is simply a matter of adding it to the zfsadm users crontab.
# crontab –e zfsadm
0 3 * * * ./zfsgfsrollup.sh zfsrollup.lst
Hint: crontab uses vi – http://www.kcomputing.com/kcvi.pdf “vi cheat sheet”
The key sequence would be hit “i” and key in the line then hit “esc :wq” and to abort “esc :q!”
The job detailed here will run once a day at 3:00 AM which may need to be extended if you have a very slow link between the servers. If you intend to use this script as shown you should follow the additional details for adding a cronjob found in the original blog, items like time zone and the likes of are discussed there.
As well the script expects the gnu based versions of date and expr.
Here are the two files that are required
Hopefully you will find it to be useful.
Site Contents: © 2008 Mike La Spina