Wednesday, January 18, 2012

Database Innovation, pleeease!

I think you have heard me say it before, but in this case I think repetition is needed: We should be much more innovative in the database world. And no, I am not talking NoSQL here, not at all. For all the good things about the NoSQL technologies and the movement itself, it's not really innovative. Rather, in my mind, NoSQL largely sacrifices functionality for performance. The schema-less design of most of these technologies is probably the one aspect I would consider innovative; the rest is just RAM-based storage, sharding, key-based lookups and good, old B-Trees.

Talking about B-Trees, isn't it time we retired them soon? There should be better ways of indexing data. Look at something like Mongo. With MongoDB, you really want to have your indexes in memory, all of them; without that, performance will be awful (there are exceptions, but in general this is true). Now, a B-Tree is an index mechanism that has worked well, as its structure lends itself to good performance be it on disk or in memory. But in general, a B-Tree is built for disk-based storage with caching; for in-memory use, there are better, more efficient indexing (or access) methods. So if an index in Mongo is supposed to be in memory, why choose a disk-oriented indexing mechanism? T-Trees are right there, they are optimized for in-memory use and have been around for ages. I guess the answer is tradition.

Tell you what, tradition is a BAD BAD argument for anything in an industry that changes as fast as the IT industry. Would anyone suggest that Facebook base their hardware platform on Motorola 6800 CPUs? I think not. But the B-Tree predates the 6800 by far.

Which is not to say that the B-Tree is all that bad (or that the Motorola 6800 is either), it's not. But we have much more diverse needs these days, so there should be more diverse access methods in use; yet the B-Tree persists, despite that.

And look at SSDs. Yeah, the future, right? A largely random-access style of memory hooked onto an interface designed for electro-mechanical hard disks in the 1970s. Innovative? I think not. Apple got it right in attaching Flash to the Mobo, and PCI-based Flash is growing and coming down in price, so it seems things are moving there at least.

But in any case, Flash / SSD isn't an electromechanical disk with cylinders and sectors, despite what the SSD interface tells us. And when we say that the B-Tree works well on disk, we are talking electro-mechanical disks. Where are the access methods designed specifically to reap the benefits of directly attached Flash?

And to be honest, the SQL-based RDBMS, something which I have spent my career with, in one shape or the other, for 25-ish years, is hopelessly outdated. But that is not why I'm no big fan of the NoSQL movement; rather, my problem is that the NoSQL movement really doesn't represent something new, nor is it a disruptive technology in any way. Where, my friends, is the disruptive database technology? A technology built for (you are sitting down now, I hope) the 21st century. If you missed it, we are there now, since 11 years back actually, so start inventing.

And yes, I know about the different MySQL variations with sharding, storage engines etc. etc. But that is not terribly innovative or new. The closest we get to a disruptive technology in the database world recently is the column-based storage databases. But these are not generic enough in my mind, and also, most of them have a SQL-based interface tacked onto them. And I understand why they want SQL: they need it to be able to sell the product, as all consumers of database products (most at least) want SQL to integrate it with some tool or infrastructure. I understand this too, but it brings up a question. Where are the customers or end-users who are willing to sacrifice using a query language as old as Led Zeppelin to instead get the benefits of some new disruptive database technology?

But this is, I'm afraid, a bit of a chicken-and-egg situation. The customers aren't requesting innovative products, as the technology hardly exists, and the products aren't developed and the research isn't done, as the customers aren't there. This really has to change soon, and I am sure it will. If for no other reason, then so that I can retire in peace, knowing that my SQL skills are truly outdated and I will not have to work, because no one wants my skills!

/Karlsson
Looking forward to retirement

Tuesday, January 17, 2012

MongoDB for MySQL Folks part 4 - Sharding

Welcome to part four of this series of blog posts on MongoDB. The previous parts introduced some basic concepts when it comes to querying MongoDB and showed some simple use cases. By now you realize that MongoDB is different from MySQL, but you probably knew that already. So why would you move from MySQL to MongoDB? Well, you know the answer to that: performance and scalability. If MongoDB didn't provide pretty seamless sharding, scaling out over a large number of nodes, then MongoDB wouldn't be that interesting.

With MongoDB, performance depends on having large portions of the data in RAM. This is no different from MySQL, but it's even more true with MongoDB. If you are running on a single machine, the amount of RAM you can use is limited; there is a limit to how much RAM you can (c)RAM into a single box. This will limit performance of course, and is again not much different from MySQL. What is different is that MongoDB has a solution: transparent sharding (yes, I am aware of the transparent sharding implementations for MySQL, like Scalebase, but that is a different story that I will get into at a later stage).

Without sharding, we would not have gone into using MongoDB here at Recorded Future. Why anyone would use MongoDB in the versions before sharding was introduced is beyond me.

MongoDB sharding from above

From a high-level view, this is how sharding works in MongoDB:
  • Data is distributed across one or more mongo servers automatically. A server may be either a single MongoDB server or a replica set (more on this in a later post).
  • Data is distributed in ranges of a user-defined shard key. The shard key is, as is obvious, a unique key. It may well be the unique _id identifier that MongoDB assigns to each document, but that is not required. Each such range is called a chunk, and a chunk is some 64 MB in size by default.
  • Each MongoDB server holds a number of these chunks.
  • Balancing a sharded setup involves moving chunks between the servers and this is automatic. In a perfectly balanced MongoDB shard setup, each involved server has the same number of chunks.
So far so good. Nothing incredibly complex, right? And also useful, right? Yes, useful and workable, but you have to know what you are doing here.

There are actually three kinds of servers involved here:
  • Mongo shard servers. These are the MongoDB servers or replica sets that hold the actual sharded data. These are the same servers as for a non-sharded MongoDB setup, and there is no stopping you connecting to one of them just like that, accessing the sharded data in a non-sharded way. Also, there is no stopping you adding databases, collections and documents to such a server, so a server may well hold both sharded and non-sharded data. The latter seems like an advantage, but actually is not and should probably be avoided. The number of mongo shard servers or shard replica sets obviously determines how the data is distributed: with 2 shard servers, each server will hold about 50% of the data, etc.
  • Mongo "router" mongos. The mongos process is the process that the application connects to, and it is responsible for distributing a query between to the appropriate shard servers for a particular query. This is a different program than the other mongo servers, and it also has the role of performing the automatic balancing. The interface to it, from the point-of-view of the poor old application, is the same though, so an application that works with a non-sharded MongoDB should also work with a sharded one, but it connects to mongos instead of mongod. You may have as many mongos servers as you please.
  • The Mongo "config server". This is process that runs the usual mongod server, but it has a special role. Instead of storing the actual data, the mongoc manages the metadata, in other words, the config server data tells mongos in which mongo shard server an actual document is located, based on some query. It might seem like this server is a single point of failure, but it is not, you may have 3 of them, in which case they are replicated using a special mechanism.
To summarize this: The application talks to mongos. Mongos needs to know where the data that the application asks for is located, so it asks the config server. Once it has this information, it goes ahead and asks the mongo shard server for the actual data. You might think that the extra roundtrip that mongos has to make to the config server will slow things down, and that the config server(s) is a potential bottleneck, but this is not so, as the mongos process caches the config data. If that data gets outdated (stale in MongoDB terms), the cache is refreshed.
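
To make the transparency point concrete: from the application's point of view, talking to a sharded cluster through mongos looks exactly like talking to a single mongod. A minimal sketch (mongos_host, port 33011 and the mydb.mycoll collection are just the placeholder names used in the setup example further down):

    $ mongo --host mongos_host:33011          # connect to mongos, not directly to a mongod
    mongo> use mydb
    mongo> db.mycoll.find({name: "foo"});     // works exactly as against a single server
    mongo> db.mycoll.insert({name: "bar"});   // mongos routes reads and writes for you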

Setting up MongoDB sharding


There are a couple of weird issues when you configure MongoDB for sharding, and some of them are not so obvious. What I have described above is how the MongoDB server processes interact, and setting that up is difficult enough, although far from as difficult as it used to be.

You might be tempted to think that once the above setup is running, all data will be sharded just like that. That would be nice, and that would be the natural way for things to work, but hey, this is MongoDB, so close, but no cigar! No, you still have to tell the system that a particular database is sharded, and then enable sharding and determine the shard key for each and every individual collection in that database.

And again, as often is the case with mongo, some things are controlled from the commandline, some things are in configuration files and some things are stored in MongoDB itself. Also, in many cases, even though you have to set certain things in a configuration file, they end up in the database itself anyway. Why this is so, you will have to ask someone other than yours truly.
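
As an aside, you can actually look at a good deal of this metadata yourself, since it lives in a database called config that you can query through mongos like any other database. A sketch of some read-only poking around, once you have the setup from the steps below up and running (the collection names are, to the best of my knowledge, the ones MongoDB really uses):

    mongo> use config
    mongo> db.shards.find();             // the shards that have been added
    mongo> db.databases.find();          // which databases have sharding enabled
    mongo> db.chunks.find().limit(5);    // a few of the chunk ranges themselves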

So, to get things straight, this is how you set up sharding with Mongo, in not so few and not so easy steps:
  • Start your mongod daemons that store the actual data. Start them with the --shardsvr option enabled. Apparently this is no longer needed, but I think it's a good thing to put it there anyway (concrete command lines for these first steps are sketched right after this list).
  • Start your mongoc servers, you have 1 or 3 of these. These are mongod servers running with the --configsvr option.
  • Wait until the config servers are up and running. All of them! This is not well documented, but mongos will not start unless at least 1 config server is running, and the first time around, ALL config servers must be running!
  • Start your mongos servers (routers). Make sure that they actually start, if the config servers cannot be contacted, then mongos will just fail with a message in the logfile.
  • Now, we must tell the config servers where our shards are. Let's say we have two of them, "datasvr1" and "datasvr2", both running on port 33010. Then enter the mongo shell, connecting to the mongos host and port (in this case mongos_host and 33011 respectively). First you must be in the admin database, and then you can add the shards:
    $ mongo --host mongos_host:33011
    mongo> use admin
    mongo> db.runCommand({addShard: "datasvr1:33010"});
    mongo> db.runCommand({addShard: "datasvr2:33010"});
  • Now we can make sure that we shard the data we have, so enable sharding for the database in question; let's call the database "mydb". Enter the mongo shell and connect to one of the mongos servers, again using the admin database:
    mongo> use admin
    mongo> db.runCommand({enablesharding: "mydb"});
  • And then we have one step left: enabling sharding on the collections we want to shard, again from the mongo commandline connected to mongos. Here we specify the name of the collection and the key used for sharding. In this case, we use the always present unique _id field as the shard key, and the collection is called mycoll:
    mongo> use admin
    mongo> db.runCommand({shardcollection: "mydb.mycoll", key: {_id: 1}});
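
As promised in the first step above, here is roughly what starting the processes could look like. Treat this as a sketch rather than gospel: the --dbpath directories, the config server host names cfgsvr1 through cfgsvr3 and their port 33012 are made up for the example, and the exact options vary a bit between MongoDB versions:

    $ mongod --shardsvr --port 33010 --dbpath /data/shard    # on datasvr1 and on datasvr2
    $ mongod --configsvr --port 33012 --dbpath /data/config  # on each config server host
    $ mongos --port 33011 --configdb cfgsvr1:33012,cfgsvr2:33012,cfgsvr3:33012

Note that mongos is given all the config servers in one comma-separated list, and, as mentioned in the steps above, those config servers must all be up the first time mongos starts.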

Do the commands above that you run from mongos look weird to you? Like the key specification key: {_id: 1}? Yes, at first it looks awkward, even to me. But after a while you can start to appreciate the strict JavaScript / JSON syntax used all over the place in MongoDB; it is actually quite powerful and easy to use, once you get used to it.
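
As a side note, reasonably recent versions of the mongo shell also ship with sh.* convenience wrappers around these admin commands, which save you the use admin and some typing. A sketch of what I believe are the equivalent calls, using the same placeholder names as above; sh.status() is also a handy way to inspect the state of the cluster afterwards:

    mongo> sh.addShard("datasvr1:33010");
    mongo> sh.enableSharding("mydb");
    mongo> sh.shardCollection("mydb.mycoll", {_id: 1});
    mongo> sh.status();    // prints shards, databases and chunk distribution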

Now, we have enabled sharding for one collection here. If you have been through all this fuss to create a sharded setup, I guess you would want all collections in that setup to use the powerful sharding mechanism? Like, automatically? Nope, can't do. You have to enable sharding manually for each MongoDB database and collection you create.
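
If you really want something like that, you can of course script it yourself in the shell. A minimal sketch that shards every existing collection in mydb on _id, which is hardly ever the shard key you actually want, so treat it as an illustration only (and any collection you create later still needs the same treatment):

    mongo> use mydb
    mongo> db.getCollectionNames().forEach(function(name) {
    ...        if (name.indexOf("system.") == 0) return;  // skip internal collections
    ...        db.adminCommand({shardcollection: "mydb." + name, key: {_id: 1}});
    ...    });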

I plan to dig deeper into mongo in a later post, to show some monitoring features, how to use JavaScript, how to set up replication and some other things. But for now, this is it!

Cheers
/Karlsson