Friday, July 29, 2011

Yikes.. Backing up a sharded MongoDB is no fun!

Backing up databases has never been fun, not as fun as having a cool English Ale on the balcony on a hot summer day anyway, but MongoDB takes this one step further when it comes to annoyances.

In general, I often feel that many Open Source projects start with good intentions for what the project should do and how, but then more stuff is added as it is required, and what started out as a simple and fast application for a narrow use case suddenly turns into a bit of a mess. The issue might well be that building fast, compact software for a specialized use case, which is how these projects start out, is not the same as writing generic software with a wide range of use cases and code that can easily be maintained and enhanced as we go along. And why should it not be like that? In many cases, this is just fine: the limited use case is just what the project sets out to do, and it does it well. But sometimes this turns into something that is really annoying and useful at the same time.

I think MySQL is partly like this. There are many things in MySQL that work really well, and a small amount of code achieves so much useful stuff. And then there are things in MySQL that are just outright wrong, and yet another bunch of things that are just plain annoying. Largely, MySQL is much more developer focused than DBA focused, although this is improving (and this is also a personal opinion of mine).

And then we have MongoDB, one of the big contenders on the NoSQL side in the NoSQL vs. SQL battle (which is a silly battle, but let's ignore that for now). Now, MongoDB is supposed to be a database. One that is faster, more compact and more targeted towards general database needs than MySQL. A database that can scale and replicate and shard automatically! Brilliant. And then this boring old DBA comes around with his bitterness and boredom and ignorance of the "new" database system. And he says things like: "Can you do a backup"... Yikes! Never thought of that. And what a boooring guy that DBA is!

Yes, with MongoDB, backups are clearly an afterthought. And although this is again my personal opinion, I base it on something, namely this: Backing up a sharded Cluster. This is just plain silly. The steps are in no way consistent with each other, and some of the operations are asynchronous, which means you have to wait for them (write code. To do a friggin' backup? Who came up with THAT daft idea?). You have to back up the config servers cold, i.e. shut down. Dead. Who came up with that? And in the end, you don't even get a consistent backup. And yes, if you use replicas, you have to physically back up the replicas also! What?
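
To make the "you have to wait for them" complaint concrete, here is a minimal sketch in Python of the polling loop an unattended backup script ends up needing. Everything in it is an assumption made for illustration: `wait_until` is a hypothetical helper, and the `predicate` you pass in would be whatever check tells you the asynchronous step has finished. No actual MongoDB call is shown.

```python
import time


def wait_until(predicate, timeout_s=300.0, interval_s=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll predicate() until it returns True or timeout_s has elapsed.

    Returns True if the condition was met, False on timeout.
    sleep and clock are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout_s
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A script would call something like `wait_until(step_has_finished, timeout_s=600)` and abort the backup when it returns False, instead of someone watching the console at 3AM.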

Whoever figured out how to shard and replicate with MongoDB did a reasonable job; it actually works OK. But the person in question apparently forgot that databases are to be backed up. And before you ask: No, I am NOT going to take a mongodump of 1.5 TB of data!

This said, most aspects of MongoDB are OK, but backups are a mess. Read the page I linked to above, and you will also realize that it is not well documented, to say the least, how to back up and what happens if the steps aren't followed. What happens if I do not back up the stopped config server? Can I do a mongodump of the config server instead? Why in heaven's name can't I:
  • Flush and lock the config servers?
  • Flush all the dataservers in one go? No, you have to do it one dataserver at a time.
  • Flush and lock the config servers?
  • And yes, why can't you flush and lock the config servers.
Also, locking is weird here. You can lock several times, and then you have to unlock as many times. I really do not know why. Also, I have yet to figure out how to know if it is locked, without at the same time unlocking, but it might be that I missed something in the docs (which aren't very good I'm afraid, but this is not the first Open Source project with lacking documentation).
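
Since every lock has to be matched by exactly one unlock, and the dataservers have to be handled one at a time, an unattended script probably wants that pairing enforced by the code itself. Below is a rough Python sketch of the idea, not the script mentioned in this post. The `lock`, `unlock` and `copy_files` callables are hypothetical placeholders; in a real script they would wrap the flush-and-lock command, its unlock counterpart and the file copy, none of which are shown here.

```python
from contextlib import contextmanager


@contextmanager
def locked(server, lock, unlock):
    """Lock one server and guarantee exactly one matching unlock."""
    lock(server)
    try:
        yield server
    finally:
        # Runs even if the copy fails, so the lock count never drifts.
        unlock(server)


def backup_one_at_a_time(servers, lock, unlock, copy_files):
    """Lock, copy and unlock each server in turn; return those backed up."""
    done = []
    for server in servers:
        with locked(server, lock, unlock):
            copy_files(server)
        done.append(server)
    return done
```

The point of the `try`/`finally` is that a failed copy still releases the lock, so a broken backup at night does not also leave a dataserver blocked for writes until morning.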

And having written all this, I have now created a script that will do all this for us, in unattended fashion. Yes, backups are supposed to be able to run unattended! No, I do not want any manual checking in the midst of a backup process! I do not want to be up at 3AM!

And Yes! I want a way to verify my backups, with some ease! No, I'm not going to set up a cluster of 8 nodes with 4 TB disk, just to verify a backup! And this is usually much less of a problem with MySQL, as it is not sharded / distributed. But for anything that IS sharded / distributed, for heaven's sake, make sure there are tools to support this. In particular Backup tools!

/Karlsson

5 comments:

Thilak Nathen said...

I think you're missing the point. Replicas __ARE__ your backup.

Baron said...

Thilak Nathen, millions of kittens die every time you think that. Backups survive incidents such as a malicious person deleting every document from the primary. Wake up.

Anonymous said...

++.

There is a theory that major engineering disasters happen every 30-40 years. The basic pattern is: 1) Disaster happens 2) Engineers learn volumes from said disaster and come up with ways to avoid it 3) They have success and people become complacent - "We have licked that entropy thing!" 4) Over reliance / over confidence in knowledge leaves gaps for next disaster...especially as a new generation that hasn't had their hand bitten by disaster steps up. Wash, rinse, repeat.

http://discovermagazine.com/2007/aug/man-who-predicted-the-bridge-collapse

Comments like that make one wonder what other nuggets of safe engineering have become 'quaint' and ignorable ; )

Unknown said...

Agree completely. Care to share that backup script you wrote?

Corey said...

See this:

http://eric.lubow.org/2011/databases/mongodb/ec2-consistent-snapshot-with-mongo/