Thursday, November 28, 2013

MariaDB Database Disk snapshot backups on Windows

Using disk snapshots to perform MariaDB backups has become more and more common, but this seems to have been limited to Linux and Cloud environments. There seems that there has been a notion that Snapshots cannot be done on native Windows, the way we can do snapshots using LVM on Linux for example. This is actually False, if you use the NTFS filesystem snapshoting is built in and has been so for a long time, the base for it is the Volume Snapshot Service (VSS) which was introduced as far back as Windows XP.

Using Snapshots to backup on Linux or Windows works well, assuming that the storage engine you use can recover on startup, the backups created this was are physically consistent but not logically, but InnoDB will recover this. Also note that any warnings about using snapshots with MyISAM also applies to the mysql database, i.e. the database metadata. Usually though, this is not an issue, but if you want to protect yourself even more then use a FLUSH TABLES WITH READ LOCK and backup the MyISAM tables separately somehow. But this is no different between Windows and Linux in this case, this is an attribute of any kind of disk snapshot backups.

VSS is a services that run on Windows and to utilize it some programming is needed, but luckily that has already been done for us, there is a simple utility called VSHADOW.EXE available from Microsoft that can be used to access the VSS service. Some version of Windows, such as Windows Server 2008, Windows Server 2012 and also with a version of Windows that is very little used, probably there are more users of Windows 3.10, than of Windows 8, come with DISKSHADOW.EXE, a utility that does the same thing, basically, but has a different interface. In Windows 7, that I tested on, there is VSSADMIN.EXE instead, but as VSHADOW.EXE can be added, I downloaded that one as it is more functional (It is part of the Windows SDK). All this mess (c) Microsoft.

So, how does this work then? As I said, this is different than on Linux (what? Windows being different than Linux? No kidding?). Using the API you can create two types of snapshots, persistent and non-persistent. The difference is that a persistent snapshot is one that remains until specifically removed, whereas a non-persistent one goes away when the program that created it exists. As we are about to take a backup here, I guess you think we want a permanent one, but that is not so, as VSHADOW can run a command file for us, before it exists.

I will use an example here, for that I have placed my MariaDB data on the E-drive in a directory called \data, which means my my.ini has this line in it:

Next question then is where the snapshot is placed? Well, the snapshot ends up as a Windows device with a name along the lines of \\?\Volume{5b3610ce-d2cb-11e2-a649-2cd05a8ad0c9}\ which is just as easy to remember and type as the novel War and Peace (Yes, Tolstoy was a Windows guy, I guess you didn't know that!). Yes, again, thank you Microsoft, Windows is SOO user friendly! Anyway, yet again VSHADOW helps us out, as it can create a command file for us that will contain the names of the devices and some other things. It can look something like this and I have called it vss-setvar.cmd:
@echo [This script is generated by VSHADOW.EXE for the shadow set {c8fcb322-ce02-409a-b609-7fc04311fad1}]

SET SHADOW_SET_ID={c8fcb322-ce02-409a-b609-7fc04311fad1}
SET SHADOW_ID_1={11243eaf-0234-4784-92e1-9bc24668b5b4}
SET SHADOW_DEVICE_1=\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy8

OK then, so this is generated for us, how do we use that? Well, what we do is create a command script that is called by VSHADOW as I said before, and this calls the generated scripts to get all the details. An example script, that I call e-backup.cmd, can look like this:
call vss-setvar.cmd
@echo off
dosdev T: %SHADOW_DEVICE_1%
robocopy T:\Data C:\Backup /MIR
dosdev /r /d T:

There are still a few missing bits here though. The call command is calling the generated file as above (you can call this anything you want, see the example command line below) and what about the command DOSDEV? What is that, that is a simple little utility you can find on the net (Google is your friend) that allows you to create a DOS device from a device descriptor, and it can also remove them. There are other ways of dealing with this, but I choose this path my my example here. Robocopy you probably know already and the /MIR option is used to tell robocopy to mirror a full directory tree.

So what the command script above will do is create a DOS device called T: for my snapshot, copy that to a directory called C:\Backup and then remove the T: device.

Now we have a command script (not two, the first one I showed was an example of a generated one) and an idea what it does, to do a shadow copy now, we just need to call vshadow, and it will do the work for us. We pass three parameters:
  • The name of a script with the device names and things to generate.
  • The name of a script to run
  • The name of the drive to create a shadow device of.
Using the example above to create a backup of the E-drive, where I have my MariaDB datafiles in the \data directory, this is the command I use:
vshadow -script=vss-setvar.cmd -exec=e-backup.cmd E:
So that's about it, now you know how to create an online snapshot backup of a MariaDB database on Windows.



Justin Swanhart said...

1) If you hold a FTWRL while you grab the snapshot, then all MyISAM data will be written to disk (at least as far as the OS cache) when the read lock is held. Because the snapshot will not miss OS cache data this means that you do not need to back up the MyISAM data separately, it is completely valid in the snapshot, and as a bonus it will be consistent with the InnoDB data.

2) FTWRL does not, however, affect InnoDB (except to block new transactions). So your concerns about cached data are valid, but only up to a point. InnoDB by default fsyncs the InnoDB transaction log at every commit. This makes sure the LOG write makes it to at all the way to platter.

3) When performing checkpoints (background writing) InnoDB uses the doublewrite buffer which is fsynced before writing background pages. This ensures that random background writes write complete pages.

Because of the combination of 2 and 3, InnoDB recovers from a snapshot exactly the same way as it would from a power loss. The logs are in the snapshot.

So there are a few requirements:
a) Your InnoDB log files must be on the same partition as the data files.

b) The FTWRL is necessary for MyISAM tables but not for InnoDB.

c) The FTWRL need only be held for the duration of the creation of the SNAPSHOT. The lock can then be released. Reads from the snapshot will see files as they were during the FTWRL.

And some warnings:
Snapshots add a LOT of overhead especially for random IO. It is wise to throttle your backup to avoid negatively penalizing the underlying volume.

Justin Swanhart said...

Just in case I wasn't clear:
1) FTWRL (wait until it completes)
2) take snapshot
3) release FTWRL
4) backup snapshot to somewhere else (throttle transfer speed if necessary)

Anders Karlsson said...


Yes, a FTWRL is needed for MyISAM, but not for InnoDB, and I focus on InnoDB here. What is sometimes forgotten is that the mysql database also uses MyISAM. If you don't have any "moving" MyISAM data and only InnoDB, it's usually a good idea to do a Flush Tables before the snapshot.

Also, thanx for the comments on performance of snapshots, this is very true (I just tried it on Windows though, and it wasn't too bad). And the whole thing requires NTFS, but I assume noone in their right mind puts database data on some other format disk on Windows these days.


Thomas pedro said...

When you have limited the rundown down to the organizations that offer you the least expensive auto credit rates, look at them on the Better Business Bureau site. In spite of the fact that the organizations that appear on the initial couple of pages of an online hunt are typically entirely respectable, it never damages to complete some additional examination to ensure they truly are a trusted organization with a decent history of client administration and solidness. Payday Loans Chicago

Teju Teju said...

It is nice blog Thank you provide important information and I am searching for the same information to save my time Big Data Hadoop Online Course India

Rainbow Training Institute said...

Thank you for providing such an awesome article and it is a very useful blog for others to read.

Oracle ICS Online Training

Rainbow Training Institute said...

Thanks for providing a useful article containing valuable information. start learning the best online software courses.

Workday Online Training