Why Snapshots Are Good - and Why They Are Bad


Anyone using hypervisors to virtualize their server infrastructure is keenly aware of the snapshot feature. Yet most don't really understand how it works. Snapshots are "point in time" backups of the server's hard disks. I use the term backup as a reference only, as you should never use snapshots for backups. It's a bad practice. (Even though most backup software uses snapshots to perform backups - more on this later in this post.)

When you first create a new virtual machine, it has a virtual disk attached. In Windows, this would be your C: drive. This disk is your base disk. When you create a snapshot, a new disk is created. The base disk goes into read-only mode. No more data is written to that disk. All new data is written to the snapshot. You now have a "point in time" backup, so to say. Data is read from both the base disk and the snapshot disk.

This works very well when you install new updates or new software. For example, you need to install the monthly Windows updates. To start, you create a snapshot and call it "Before Windows Updates". Once the snapshot has been created, you proceed to install your updates. After installation, you find that one of your applications isn't happy about one of the Windows updates that was installed. The server is not usable as you had expected after the updates. What to do? You simply roll back to the base disk as it was prior to the snapshot and voila, you are back to where you were prior to the updates. The snapshot did what you expected it to do. It saved the day!

However, let's assume that the opposite occurred. You had no failures. Unfortunately, you left the snapshot in place and months went by. Eventually, the snapshot can grow as large as the original virtual base disk. Worse, you take another snapshot a month after the first. Now, both the base disk and the first snapshot are read-only and the new snapshot is where all new changes are stored.
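
The base disk and snapshot behaviour described above can be sketched as a toy model in Python. This is only an illustration of the idea, not how any real hypervisor stores data: the `VirtualDisk` class, its method names, and the use of a dictionary per disk are all assumptions made for the example.

```python
class VirtualDisk:
    """Toy model of a base disk plus snapshot disks (hypothetical, not a hypervisor API)."""

    def __init__(self):
        # layers[0] is the base disk; only the last layer is ever written to.
        self.layers = [{}]
        self.names = ["base"]

    def snapshot(self, name):
        # The current top layer becomes read-only; a new empty disk takes all new writes.
        self.layers.append({})
        self.names.append(name)

    def write(self, block, data):
        # All new data goes to the newest disk in the chain.
        self.layers[-1][block] = data

    def read(self, block):
        # Look in the newest disk first, then fall back to older ones.
        for layer in reversed(self.layers):
            if block in layer:
                return layer[block]
        return None

    def revert(self):
        # Roll back: throw away the newest snapshot disk and its changes.
        if len(self.layers) > 1:
            self.layers.pop()
            self.names.pop()


disk = VirtualDisk()
disk.write("C:/app.dll", "v1")
disk.snapshot("Before Windows Updates")
disk.write("C:/app.dll", "v2 (broken by update)")
print(disk.read("C:/app.dll"))  # v2 (broken by update)
disk.revert()
print(disk.read("C:/app.dll"))  # v1
```

The base disk is never touched after the snapshot is taken, which is why rolling back is as simple as discarding the newest layer.
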
After another few months you start to have performance issues with the virtual server. What's happening is that the virtual machine is now handling three separate virtual disks instead of just one: reading from three to find data and writing to one to store data.

So what happens when you delete the snapshots? Well, that all depends on which one is the active one. If you reverted to the base disk and then delete the snapshots, it does just that: removes them from the system, and you keep running with the base disk that you started with. But let's say you are using the 2nd snapshot. When you delete the snapshots (1 and 2), the system takes the base disk and merges it with the 1st snapshot (both are read-only at this point) and then merges this new base disk with the 2nd snapshot until a new base disk is created from all three. This process can take quite some time depending on how large the original base disk is and how large each subsequent snapshot has grown.

Based on industry-standard best practices, you should delete snapshots as soon as possible: ideally within 1 to 3 days, but no longer than a week. Sometimes you may need to perform several updates at the same time. It's OK to create multiple snapshots, one for each update. That way, if something does go wrong, you can roll back one snapshot at a time until you find the one that is causing the issue. Do keep in mind that once you reach 5 or more snapshots, the virtual machine's performance will be affected. As before, delete the snapshots as soon as you are sure everything is running as expected.
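
To make the read overhead and the merge-on-delete behaviour described above a bit more concrete, here is a small Python sketch. It treats each disk as a dictionary of blocks, which is an assumption for illustration only; the function names `read_cost` and `consolidate` are hypothetical and do not correspond to any hypervisor command.

```python
def read_cost(layers, block):
    """Count how many disks are checked (newest first) before the block is found."""
    for depth, layer in enumerate(reversed(layers), start=1):
        if block in layer:
            return depth
    return len(layers)


def consolidate(layers):
    """Merge a chain of disks (oldest first) into one new base disk."""
    merged = {}
    for layer in layers:
        # Blocks in newer disks overwrite the same blocks from older disks.
        merged.update(layer)
    return merged


base = {0: "os", 1: "app v1"}
snap1 = {1: "app v2"}   # changes made after the 1st snapshot
snap2 = {2: "logs"}     # changes made after the 2nd snapshot
chain = [base, snap1, snap2]

print(read_cost(chain, 0))            # 3
new_base = consolidate(chain)
print(new_base)                       # {0: 'os', 1: 'app v2', 2: 'logs'}
print(read_cost([new_base], 0))       # 1
```

The more data each disk in the chain holds, the more work `consolidate` has to do, which mirrors why deleting old, large snapshots can take so long.
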

Earlier I mentioned that you shouldn't use snapshots for backups. Hopefully, after reading this article, you'll realize why it would not be a good practice: a snapshot only holds changes and depends on the base disk, so if the base disk is lost or damaged, the snapshot is useless on its own. I did say that many backup programs use snapshots. Here's how. The backup program takes a snapshot and then immediately backs up the base disk, which is read-only at this point. This way, it can back up the virtual machine while it's running, without new data being written to the machine causing the backup to hang. Once the backup of the static base disk is complete, the backup program deletes the snapshot, which merges the base disk with any new changes that occurred during the backup window.
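
The backup sequence above (snapshot, copy the frozen disk, delete the snapshot) can be walked through with a short Python sketch. It is a simplified simulation under the same dictionary-per-disk assumption; `backup_running_vm` is a hypothetical name, not the API of any backup product, and the "writes during backup" are simply replayed in order rather than happening concurrently.

```python
def backup_running_vm(frozen_layers, writes_during_backup):
    """Simulate a snapshot-based backup.

    frozen_layers: list of disks (dicts), oldest first, as they exist at backup start.
    writes_during_backup: list of (block, data) pairs the VM writes while the backup runs.
    Returns (backup_copy, live_disk_after_cleanup).
    """
    # 1. Take a snapshot: existing disks are now read-only, new writes go to the delta.
    delta = {}

    # 2. Copy the frozen disks. Nothing can change them, so the copy is consistent.
    backup = {}
    for layer in frozen_layers:
        backup.update(layer)

    # Meanwhile the VM keeps running; its writes land in the delta only.
    for block, data in writes_during_backup:
        delta[block] = data

    # 3. Delete the snapshot: merge the delta back into a single live disk.
    live = dict(backup)
    live.update(delta)
    return backup, live


base = {"C:/data.db": "Monday's data"}
backup, live = backup_running_vm([base], [("C:/data.db", "Tuesday's data")])
print(backup["C:/data.db"])  # Monday's data
print(live["C:/data.db"])    # Tuesday's data
```

The backup reflects the machine exactly as it was when the snapshot was taken, while the running machine keeps everything written during the backup window.
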

So, use snapshots. Use them wisely and don't let them hang around for a long time.

Please note that this explanation isn't 100% technically correct and is only meant to demonstrate the process in simple terms. If you want to dig into the weeds of how snapshots work, check out this VMware article: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd...