Friday, June 15, 2012

JSON Loader version 1.1 available!

Yo!

    Time for a new release of my MySQL JSON loader. I have added a bunch of features and fixed some bugs since the last release, so nothing revolutionary there. The program is still a C program, using autotools to build and set up. It has been tested on 64-bit Linux (Ubuntu) but should work on other platforms too. One bugfix in particular makes a difference: I didn't really handle auto-commit the way I should in 1.0, but that is fixed now. Among the new features I think SQL retry is a useful one, as is the ability to skip over leading JSON records and the ability to run arbitrary SQL before starting to load.
    Download from sourceforge!
    Stay tuned, as I will release 1.0 of my JSON unload / export tool real soon. I have used it for a while, and I'm in the process of writing some docs and cleaning up the code a bit. Once that is done, it's release time, which should be within a couple of days.

/Karlsson

Thursday, June 14, 2012

Is query_cache really a good idea? Maybe not...

OK, the query cache is a pretty smart idea, right? When the same query is executed many times by many threads, and the data in the relevant tables isn't touched, so the query cache isn't invalidated, then the query cache is a pretty good idea, right? And even when not used, it really shouldn't be in the way, right? It's in memory after all, so a check for a query in the cache should be quick, right?

Maybe, but far from always, and the fact is that the query cache is becoming less and less useful. Let me tell you why. We have more and more memory in our machines, and disk is less used for storage and more for persistence. Even though the databases we process are getting bigger and bigger, the data we actually access isn't growing that much; in other words, we keep storing more and more historical data, for a number of reasons.

But while the amount of RAM we have access to keeps growing, we do not get faster CPUs at the same rate. What we do get is MORE CPUs, or rather more cores. In other words, we must do more and more multithreading to keep all those cores busy and let them make use of the RAM we have. My reasoning is this: we have more and more RAM, and one of the major uses for RAM is to cache data from disk, not least in databases. This will speed up access to the data that is on disk enormously. But when we have so much RAM that the data we use is actually mostly there, in RAM, then what limits performance? The answer is: the way we access that RAM.

This is what I wanted to test: how does possibly inefficient multithreading affect performance, once the bottleneck that disk access represents is out of the way? This is my little experiment: 100 million rows in a single table. This is part of my tests of using MySQL as a key-value store, so there are no secondary indexes and no foreign keys, and as I want to test this with MySQL Cluster too, I have also removed any VARCHAR fields. As it stands, the table has one BIGINT, one INT and one DOUBLE column, and the BIGINT is the PRIMARY KEY. This is real data though; I have only removed some fields.

My first test was with a cold database, just started. No one else was using it and I had an InnoDB buffer pool of 8 GB, using half of what was in the machine. Then I had a benchmark that reads 1 million rows from this table, using the BIGINT column for lookup, and the 1 million IDs to look up are spread over 30 threads (I have 8 cores in this machine). With a cold database, it took 869 seconds to read all this data. The next round, with data in memory, it was a lot faster, but not as fast as you might think, just about half the time, 432 seconds. Now, during that last round, SHOW STATUS showed a lot of "Waiting for query cache lock", so I decided to try without the query cache. Faster or slower, what do you think? The answer is faster. A LOT faster! Some 30 seconds, as compared to 432 seconds! This calls for further investigation, but my initial take is this: if you use MySQL in such a way that your data is in memory, then the query cache isn't just useless, it is severely limiting performance! I'll have some more data on this soon, this is getting fun!
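If you want to try something similar yourself, here is a minimal sketch in C of the kind of multi-threaded point-lookup benchmark I am describing, using the MySQL C API and pthreads. The connection details, the table name (keyvalue) and the column names are made up for illustration, and the real benchmark does a bit more than this. Run it once with the query cache enabled and once with query_cache_type = 0 (or with SQL_NO_CACHE in the SELECT) to compare:
/* Minimal point-lookup benchmark sketch. Build with something like:
 * gcc -o lookup_bench lookup_bench.c $(mysql_config --cflags --libs) -lpthread */
#include <stdio.h>
#include <pthread.h>
#include <mysql.h>

#define NUM_THREADS 30
#define LOOKUPS_PER_THREAD 33333   /* Roughly 1 million lookups in total. */

static void *lookup_worker(void *arg)
{
   long threadno = (long) arg;
   MYSQL *conn;
   char sql[256];
   long i;

   conn = mysql_init(NULL);
   if (mysql_real_connect(conn, "localhost", "bench", "benchpwd",
     "benchdb", 0, NULL, 0) == NULL)
   {
      fprintf(stderr, "Connect failed: %s\n", mysql_error(conn));
      return NULL;
   }

   for (i = 0; i < LOOKUPS_PER_THREAD; i++)
   {
      MYSQL_RES *res;

      /* Primary key lookup; the id spread per thread is just an example. */
      sprintf(sql, "SELECT val1, val2 FROM keyvalue WHERE id = %ld",
        threadno * LOOKUPS_PER_THREAD + i);
      if (mysql_query(conn, sql) != 0)
      {
         fprintf(stderr, "Query failed: %s\n", mysql_error(conn));
         break;
      }
      if ((res = mysql_store_result(conn)) != NULL)
         mysql_free_result(res);
   }
   mysql_close(conn);
   return NULL;
}

int main(void)
{
   pthread_t threads[NUM_THREADS];
   long i;

   mysql_library_init(0, NULL, NULL);
   for (i = 0; i < NUM_THREADS; i++)
      pthread_create(&threads[i], NULL, lookup_worker, (void *) i);
   for (i = 0; i < NUM_THREADS; i++)
      pthread_join(threads[i], NULL);
   mysql_library_end();
   return 0;
}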

/Karlsson

A little something on the basics of UTF-8

Some years ago, in 2009, I wrote a couple of blog posts on the subject of Character sets and UNICODE and how that works. In a not unusual move, I included a bit on the history of character sets and stuff like that. And another recurring theme was that I promised a third part on collations, something that somehow didn't really happen. You can read these posts here and here.

Now, 3 years later, we might expect things to have moved on. But not really, except that UTF-8 is even more pervasive these days; you really cannot escape it, don't even try! For example, not so long ago I wrote about JSON, and a JSON string is UTF-8, there is no other choice. I will follow up on my previous posts with something on COLLATION, but not right now. Instead I'll do a slightly deeper dive into the UTF-8 waters and explain how it works in some more detail, and some of the neat side effects of it. It is pretty smart, but as it was invented by Ken Thompson (together with Rob Pike, at a restaurant), that is to be expected.

When you look at the UTF-8 encoding of UNICODE, there are some useful things you should know about it. The encoding is pretty smart in that characters 0 - 127 are the same as 7-bit ASCII, which makes UTF-8 easy for us oldtimers. But there is more to it than that. The first bits in every byte tell you the length of the character. If the first bit is 0, you know that this is a "single-byte" character that is the same as in 7-bit ASCII. But then comes the smart part.

UTF-8 representation explained in short

For any character which requires more than 7 bits to represent in UNICODE, the leading bits of the first byte of the UTF-8 encoding tell you the number of bytes required for the character in UTF-8 (i.e. not the size of the UNICODE code point, but how many bytes the character makes up when represented as UTF-8): the number of leading 1 bits in the first byte equals the number of bytes in the character. The net effect of this is that the first byte either has its first bit set to 0, in which case this is 7-bit ASCII, or has its first two bits set to 11. Or to put it differently:
  • The first byte has the pattern 0xxxxxxx - single byte UTF-8, i.e. 7-bit ASCII.
  • The first byte has the pattern 110xxxxx - Two byte UTF-8.
  • The first byte has the pattern 1110xxxx - Three byte UTF-8.
  • etc.
Now, in the case where there are 2 or more bytes in the UTF-8 representation, every byte except the first has the pattern 10xxxxxx. If you look at this way of representing UNICODE, you probably think that this is hardly the most compact way of representing UNICODE, and it's not. But it is incredibly useful, in particular when you look at the side effects (surely most of them intentional) of this representation; a small code sketch of the lead-byte rule follows below. Let's have a look at some of those side effects.
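To make the byte patterns concrete, here is a minimal C sketch (my own illustration, not from any library) that decodes the lead byte according to the table above and returns how many bytes the character occupies:
#include <stdio.h>

/* Return the number of bytes a UTF-8 character occupies, judging from its
 * first byte, or -1 if the byte is a continuation byte (10xxxxxx) or not
 * a valid lead byte. */
int utf8_char_len(unsigned char byte)
{
   if ((byte & 0x80) == 0x00)      /* 0xxxxxxx: single byte, 7-bit ASCII. */
      return 1;
   else if ((byte & 0xE0) == 0xC0) /* 110xxxxx: two-byte character. */
      return 2;
   else if ((byte & 0xF0) == 0xE0) /* 1110xxxx: three-byte character. */
      return 3;
   else if ((byte & 0xF8) == 0xF0) /* 11110xxx: four-byte character. */
      return 4;
   else
      return -1;                   /* Continuation byte or invalid lead byte. */
}

int main(void)
{
   /* 'A' is plain ASCII, 0xC3 is the lead byte of a two-byte character
    * such as the Swedish 'å' (0xC3 0xA5). */
   printf("%d %d\n", utf8_char_len('A'), utf8_char_len(0xC3));
   return 0;
}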

7-bit ASCII

As we have already seen, UTF-8 is fully compatible with 7-bit ASCII, which means that any character that can be represented as 7-bit ASCII has exactly the same representation in UTF-8. I have said this many times now, so maybe you are getting bored, but there is actually more to it than meets the eye.
If you look at the scheme above, you see that the first bit of ANY byte, not only the first byte of a character, is NEVER 0, unless this is a single-byte 7-bit ASCII character. Which means what? It means that if you pick any byte anywhere in a UTF-8 encoded string, and that byte has its first bit 0 (or in other words, a value in the range 0 - 127), you know this is a 7-bit ASCII character! It cannot be the first byte of a multi-byte UTF-8 character, nor can it be the second or later byte in a multi-byte character (as those always have the first bit set and hence a value of 128 or higher).
The control characters we IT folks play around with, like carriage return, line feed or the classic C-style end-of-string 0, are all in the 7-bit ASCII range. Which in turn means that if you want to, say, exchange all cr/lf (or \n) for a NUL (or \0) in a string, the way you do that is NO DIFFERENT with UTF-8 than with 7-bit ASCII. Sure, characters outside that range have a different representation, but we can ignore that, it will work anyway!
This is the reason that, say, strlen(), strcat() etc. still work with UTF-8. As long as you understand that strlen() will return the length of the string in bytes, not characters, it works as usual. And strcat() works exactly as before!
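Here is a minimal sketch of that bytes-versus-characters distinction: strlen() counts bytes, while counting only the bytes that are not continuation bytes (10xxxxxx) counts characters. The sample string is just an example:
#include <stdio.h>
#include <string.h>

/* Count UTF-8 characters by counting every byte that is NOT a
 * continuation byte (i.e. does not match the 10xxxxxx pattern). */
size_t utf8_strlen(const char *str)
{
   size_t nchars = 0;

   for (; *str != '\0'; str++)
   {
      if (((unsigned char) *str & 0xC0) != 0x80)
         nchars++;
   }
   return nchars;
}

int main(void)
{
   /* "på" in UTF-8: 'p' is one byte, 'å' is two bytes (0xC3 0xA5). */
   const char *str = "p\xC3\xA5";

   printf("bytes: %zu characters: %zu\n", strlen(str), utf8_strlen(str));
   return 0;
}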

Navigating a UTF-8 string

Another thing you might want to do is navigate among the bytes in a UTF-8 string. In many schemes for storing variable-length items, you would need to start from the beginning of the string to know where you are, i.e. get the first byte, figure out the length of the item, move to the next item and so on. Not so in UTF-8! The first byte of a UTF-8 character EITHER has the first bit set to 0 OR the first TWO bits set to 11! Or to put it differently, any byte that is NOT the first byte of a character has the two first bits set to 10!
So for any byte inside a string, you can always find which character it is, or is part of, by looking at the first bits (a small sketch of this follows after the list):
  • If the first bit is 0, you are done: This is a simple 7-bit ASCII character.
  • If the first two bits are 10, this byte is part of a multi-byte UTF-8 character, so to figure out what character this REALLY represents, move "back" in the string until you find a byte which is the first byte of this character, i.e. it has the two highest bits in the byte set to 11.
  • If the first two bits are 11, then this is the first byte in a multi-byte UTF-8 character.
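As a small sketch of those rules (again my own illustration), here is a C function that, given a pointer anywhere inside a UTF-8 string, backs up to the first byte of the character that position belongs to:
#include <stdio.h>

/* Given a pointer somewhere inside a UTF-8 string, return a pointer to the
 * first byte of the character it belongs to, by skipping backwards over
 * continuation bytes (10xxxxxx). 'start' is the beginning of the string,
 * so we never back up past it. */
const char *utf8_char_start(const char *start, const char *pos)
{
   while (pos > start && ((unsigned char) *pos & 0xC0) == 0x80)
      pos--;
   return pos;
}

int main(void)
{
   /* "hät": the 'ä' is two bytes (0xC3 0xA4), so the byte at offset 2 is a
    * continuation byte, and the character it is part of starts at offset 1. */
   const char *str = "h\xC3\xA4t";
   const char *first = utf8_char_start(str, str + 2);

   printf("character starts at offset %ld\n", (long) (first - str));
   return 0;
}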

In conclusion

Complex? Well, yes, but much less so than if we had had to convert all 7-bit ASCII to full UNICODE! None of the old functions would work, and our old code would break completely! The issue with UTF-8 is rather that it is a bit too smart for its own good: we IT folks get a bit lazy and care less about the cases where we really DO need to do special processing to support UTF-8, as it is so compatible and mostly just works, like 7-bit ASCII, at least for the intents and purposes of your average programming project. But maybe you know a bit more now.

Cheers
/Karlsson

Saturday, June 9, 2012

No MySQL Cluster, the table isn't bl**dy full! Not even close!

OK, to begin with, I'd really like to try MySQL Cluster as a Key-Value Store. Yes, I have already recognized that MySQL Cluster stores variable-length data in fixed-length format. And someone should think about how good an idea that really is; in my mind it's just as mindbogglingly efficient, modern and smart as storing data on punched cards. Which I assume is how the MySQL Cluster source code is managed.

Yes, I really dislike the idea of fixed-length strings, and more on this later. But think now: I have 2 fields defined as VARCHAR(256) and this is a Web application, so how much space will be allocated on disk by MySQL Cluster? As many bytes as there are in the string, or 256 bytes? Neither, actually. This is a web app, and I know it comes as a surprise to some of you, but Unicode is here to stay, and UTF-8 is the way to go. And MySQL assumes that no one will use more than a 3-byte Unicode character (which is an incorrect assumption, but I can live with it; 4-byte Unicode charsets are actually available). Anyway, as each character in that case may occupy 3 bytes, MySQL Cluster will allocate 768 bytes for a VARCHAR(256) UTF-8 encoded field. Yes. Yuck. As I have written in many blogs before, disk space isn't much of an issue these days, but writing all that data may well be. So if I put 10 7-bit ASCII characters in a VARCHAR(256) field, I'm not writing 10 bytes, I'm writing 768 bytes, and that makes a difference in performance.

All this aside, I really want to load my data into MySQL Cluster to test its claims for performance and its claims as a great Key-Value Store. And now, again, it tells me that "The table is full". That MySQL Cluster tablespaces store data in fixed-length format is bad enough; that it insists on storing it in fixed-length data files (that take forever to create) makes it even worse. But I have never been able to store more than about 60.000.000 rows in my MySQL Cluster setup. The error I constantly get is "The table '' is full". Yes, there is no table name there, the table name is empty. I have 4 data nodes. I look at my storage and I see this:
mysql> select * from ndbinfo.memoryusage;
+---------+--------------+-----------+------------+------------+-------------+
| node_id | memory_type  | used      | used_pages | total      | total_pages |
+---------+--------------+-----------+------------+------------+-------------+
|       2 | Data memory  | 497647616 |      15187 |  524288000 |       16000 |
|       2 | Index memory | 129687552 |      15831 | 1048838144 |      128032 |
|       3 | Data memory  | 498073600 |      15200 |  524288000 |       16000 |
|       3 | Index memory | 129802240 |      15845 | 1048838144 |      128032 |
|       4 | Data memory  | 497745920 |      15190 |  524288000 |       16000 |
|       4 | Index memory | 129777664 |      15842 | 1048838144 |      128032 |
|       5 | Data memory  | 497680384 |      15188 |  524288000 |       16000 |
|       5 | Index memory | 129679360 |      15830 | 1048838144 |      128032 |
+---------+--------------+-----------+------------+------------+-------------+
OK, what about my data store on disk, maybe I am running out of space there:
mysql> select SUM(FREE_EXTENTS * EXTENT_SIZE) / 1024 / 1024 As "Free MB",
    ->   SUM(INITIAL_SIZE) / 1024 / 1024 "Size MB",
    ->   (SUM(INITIAL_SIZE) - SUM(FREE_EXTENTS * EXTENT_SIZE)) / 1024 / 1024 AS "Used MB"
    ->   FROM information_schema.files WHERE FILE_TYPE = 'DATAFILE';
+-----------------+-----------------+----------------+
| Free MB         | Size MB         | Used MB        |
+-----------------+-----------------+----------------+
| 327935.00000000 | 381472.00000000 | 53537.00000000 |
+-----------------+-----------------+----------------+
Nothing there either, there seems to be plenty of space available. So what makes this table "full"? Please tell me if you know, I'd really like to test this beast!

And then, before I end this post, let me tell you one thing: for the purposes that MySQL Cluster was once designed for, telco applications, I'm sure it's great. Fact is, I know it is, and few things beat it there. Secondly, even in Web space there are great uses for MySQL Cluster, I know that from first-hand experience. But as a substitute for, say, MongoDB as a document store, it's not even close. Not necessarily only in terms of raw performance, it's just not a good fit for the use case. For one thing, a document is variable in size; for another, a simple KVS document store is just that, simple, and if there is one thing that MySQL Cluster is not, it's simple. If you want the Ferrari of databases, then MySQL Cluster might be it. Possibly. But if you want the Volvo of KVSs, then MySQL Cluster isn't it.

And now I'm close to closing, but there is one more thing: no, I'm not giving up. No way! But I just want to know why the table is "full", because MySQL Cluster is wrong, it's not (and when creating the table I told it to hold 150.000.000 rows with an average row size of just what it is in reality, so if MySQL Cluster has a problem with this table, it could have said so from the start, when I created the table. Right? If it ignores what I tell it, why is the syntax there at all?).

/Karlsson

Monday, June 4, 2012

Setting up MySQL Cluster 7.2

I decided to try the claim that MySQL Cluster is a great Key-Value store. I have been thinking about trying this for some time now, my JSON import tool for MySQL was a starting point for being able to cram some JSON data from our MongoDB Key-Value store into MySQL.

I tried this tool with an InnoDB table, and that worked great, importing 100.000.000+ records at about 10k records / s, all CPUs going at full speed. As JSON is native to MongoDB, I assumed the import would be faster and consume fewer resources there, and that was true, 22k records / s were imported into MongoDB, but I still think my tool worked well.

Now, having done all that, and having all the necessary tools ready, it was time to set up MySQL Cluster on this box. I have 8 cores and 16 GB RAM in it, so a 20 GB table shouldn't be too bad, at least not if I use Cluster disk storage. But it was some time since I last used MySQL Cluster, so after downloading Cluster 7.2.6 I decided to follow the manual. After all, I was running MongoDB in a pretty standard off-the-shelf config, and that worked fine, as did MySQL Community edition, so MySQL Cluster 7.2.6 shouldn't really have many issues here. Right?

Well, I was wrong. Or rather, the docs were wrong. And the defaults for many of the MySQL Cluster parameters still seem to indicate that this is meant to run on an Android-based cellphone rather than a mid-range home-brew server. All this not considering the incorrect syntax used in many examples.

After downloading MySQL Cluster 7.2.6 I followed the manual for installing the thing, and even at that stage I felt that I was in for a bumpy ride. The MySQL 5.5 manual, section 17.2 MySQL Cluster Installation, says: "This section describes the basics for planning, installing, configuring, and running a MySQL Cluster.". Fact is, it does not; rather, it gives an overview of some of the architectural aspects and some hardware requirements for MySQL Cluster, and covers nothing on planning or installing MySQL Cluster. This is not to say this section is bad, it's just that it's not about what it says it should be about.

OK, I'll live with that, let's move on. I downloaded a tar file, and I know how to deal with that, but just to figure out if there are any specifics, I have a look in the docs; as this is Ubuntu Linux I look in 17.2.1.1. Installing a MySQL Cluster Binary Release on Linux. Here are some of the things mentioned there that I should take to heart. Not. If I did, stuff wouldn't work. Like this little command:
tar -C /usr/local -xzvf mysql-cluster-gpl-7.1.23-linux-i686-glibc23.tar.gz
Why the manual wants me to unpack a MySQL 7.1 tar file is beyond me, I don't think that would end up being a 7.2 Cluster. The Management nodes are set up like this, according to this document:
tar -zxvf mysql-5.1.61-ndb-7.1.23-linux-i686-glibc23.tar.gz
cd mysql-5.1.61-ndb-7.1.23-linux-i686-glibc23
cp bin/ndb_mgm* /usr/local/bin
This is just wrong, again. MySQL 5.1 has nothing to do with this. And why would I necessarily want the executables in /usr/local/bin?

Again, I can live with it, but I don't much like it, as someone else, who actually CAN benefit from MySQL Cluster, would be put off by issues like that. But as for myself, I move on to section 17.2.3. Initial Configuration of MySQL Cluster. Here, the config.ini settings are again incorrect; among the issues I found were that the portnumber setting (for the TCP DEFAULT section) is now obsolete, and that for the management node the configdir setting, which is required, is missing.

Having fixed all this, and possibly a few more issues that I have forgotten about, it was time to create some tables. As I was planning to store data on disk, I had a few more steps before I could create the table to load data into: I had to create a LOGFILE GROUP and a TABLESPACE for my disk data. There is a working example in section 17.5.12.1. MySQL Cluster Disk Data Objects, which has a sample configuration where a disk-based table is set up. OK, fine. Time for setting up a LOGFILE GROUP:
CREATE LOGFILE GROUP lg_1
    ADD UNDOFILE 'undo_1.log'
    INITIAL_SIZE 16M
    UNDO_BUFFER_SIZE 2M
    ENGINE NDBCLUSTER;
Well, that didn't work at all; it seems you cannot state 16M (for 16 megabytes) in practice, although the manual says differently. OK, again I can live with it, and I hope that any other newcomers to MySQL can live with some syntax errors in the manual. 16 MB also seems very small for an undo logfile. Having fixed all that, I was now able to create the LOGFILE GROUP. The same issue applies to creating TABLESPACEs, but by now I knew what was wrong and how to fix it. I could then progress to create my table, using the usual CREATE TABLE command, but with the STORAGE DISK and TABLESPACE options. Hey, it worked, I have a MySQL Cluster disk-based table!

I started loading data into MySQL Cluster. My JSON loader failed; running it seemed to hang MySQL Cluster, and I have yet to determine what the issue is, but I'll get to that real soon. Instead, I decided to load data into MySQL Cluster using the native MySQL method, by first exporting the data from my InnoDB table with mysqldump and then doing a mysql command-line import (without the CREATE TABLE statement in the export file, of course). This is running now, and it is dead slow. I'll have to figure out why. The DataMemory and IndexMemory have been set to 2 GB and 1 GB respectively, which isn't much but should be enough. Also, I have configured MySQL Cluster without replication to speed things up. I will eventually get this data loaded, but this method is way too slow, so I have to think up something else. Now I get a "table is full" after some 2 million rows (I have 100 million rows to load). Memory usage isn't high, and the data files on disk should be big enough (32 GB), so I need to investigate what is going wrong. One thing is clear though: getting MySQL Cluster going is WAY more complicated than either MySQL or MongoDB. And note that I have done it before, although it was some time ago since I did it last.

Hence the story of this benchmark continues in another blogpost.

Cheers
/Karlsson

Friday, June 1, 2012

JSON to MySQL - mysqljsonload 1.0 Ready for download!

I have spent some free time developing this tool that loads JSON data into MySQL. And yes, I know there are some other means of doing this, but in my case this was done with a particular purpose in mind: I needed something different, something that could load a lot of JSON data real fast.


The program is a C program. It uses the Jansson library for JSON parsing (I really like this library by the way; it's available here: http://www.digip.org/jansson/), it is multi-threaded (with a configurable number of loading threads) and it uses MySQL INSERT arrays, i.e. multi-row INSERT statements.
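The actual code does a lot more, but a minimal sketch of the core idea, reading one JSON record per line, parsing it with Jansson and collecting the values into multi-row INSERTs, could look something like this. The table name, the attribute names and the connection details are made up for illustration, and a real loader would of course escape the string values (with mysql_real_escape_string()) and check for errors:
#include <stdio.h>
#include <string.h>
#include <jansson.h>
#include <mysql.h>

#define ROWS_PER_INSERT 100

int main(void)
{
   char line[65536];
   char sql[1048576];
   int nrows = 0;
   MYSQL *conn;

   conn = mysql_init(NULL);
   if (mysql_real_connect(conn, "localhost", "user", "password", "test",
     0, NULL, 0) == NULL)
      return 1;

   strcpy(sql, "INSERT INTO jsondata(id, name) VALUES ");
   while (fgets(line, sizeof(line), stdin) != NULL)
   {
      json_error_t error;
      json_t *rec, *id, *name;
      const char *namestr;
      char values[8192];

      /* Each line is assumed to hold one JSON object. */
      if ((rec = json_loads(line, 0, &error)) == NULL)
         continue;
      id = json_object_get(rec, "id");
      name = json_object_get(rec, "name");
      namestr = name != NULL ? json_string_value(name) : NULL;
      /* NOTE: a real loader must escape namestr before putting it in SQL. */
      snprintf(values, sizeof(values), "%s(%lld, '%s')",
        nrows > 0 ? "," : "",
        (long long) json_integer_value(id),
        namestr != NULL ? namestr : "");
      strcat(sql, values);
      json_decref(rec);

      /* Send one INSERT for every ROWS_PER_INSERT records. */
      if (++nrows == ROWS_PER_INSERT)
      {
         mysql_query(conn, sql);
         strcpy(sql, "INSERT INTO jsondata(id, name) VALUES ");
         nrows = 0;
      }
   }
   if (nrows > 0)
      mysql_query(conn, sql);
   mysql_close(conn);
   return 0;
}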

You want to try it? It's available for download on SourceForge. The program comes with some basic documentation, which is also available as a separate download, and uses the usual configure / make process to build.

There are a bunch of configuration options, as usual these can be either in a configuration file or on the command line, to support things like:
  • Ignoring specific columns.
  • Mapping column names, i.e. in the normal case an attribute in the JSON data is mapped to a database column with the same name, but this allows you to change that.
  • Set attribute defaults. If an attribute doesn't exist in JSON, by default it is set to NULL, but you may set it to some other value using this option.
  • Column fixed values. Sometimes you want to assign a column a fixed value, independent of what it is set to in the JSON file or whether it is set at all. That is what this option does for you.
  • NULL handling of embedded objects and arrays. In the normal case, objects and arrays embedded in the JSON data records get loaded as strings, but you may turn this off and load them as NULL instead.
There are a bunch more things this program can do, but this is the basic stuff. I tested the program by loading a 23 GB JSON file containing about 100.000.000 records into an InnoDB table. The loading went ahead at about 10500 records per second, which I think is reasonable on this mid-size machine (1 CPU, 8 cores, 16 GB RAM, MySQL running on a single 1 TB disk, the same one I was loading data from). MySQL was configured with lazy InnoDB log writes and a few other things, but nothing fancy.

So, if you are still with me, you might have a reason for that. Like a MongoDB export file (with JSON data) that you want to cram into a MySQL database? Then this tool may be for you.

Also, this is version 1.0. There is probably a whole bunch of bugs, some not so nice code that I want to clean up and some obvious features to add. Let me know what you think! And if you want a quick introduction to JSON, look at my previous post on that subject.

/Karlsson
Yes, that Karlsson not Json