Karlsson on databases and stuff: Is query_cache really a good idea? Maybe not...

Thursday, June 14, 2012

Is query_cache really a good idea? Maybe not...

OK, the query cache is a pretty smart idea, right? When the same query is executed many times by many threads, and the data in the relevant tables isn't touched so the query cache is invalidated, then the query cache is a pretty good idea, right? And even when not used, it really shouldn't be in the way, right? It's in memory after all, so a check for a query in the cache should be quick, right?

Maybe, but far from always, and fact is that the query cache is becoming less and less useful. Let me tell you why. We have more and more memory in our machines, and disk is less used for storage, and more for persistence. Even though the size of our databases we process is getting bigger and bigger, the data we actually access isn't growing that much, or in other words, we keep storing more and more historical data, for a number of reasons.

At the same time as the amount of RAM we have access to, we do not get faster CPUs at the same rate. What we do get is MORE CPUs, or rather more cores. In other words, we must do more and more multithreading to support the faster CPUs and allow them to access the RAM we have. My reasoning is this: We have more and more RAM, one of the major uses for RAM is to cache data from disks and in databases. This will speed up access to the data that is on disk enormously. But when we have so much RAM that the data we use is actually most there, in RAM, then what limits performance? And the answer is, the way we access that RAM.

This is what I wanted to test, how does the possibly inefficient multithreading affect performance, once the bottleneck that disk access represents is out of the way? This is my little experiment: 100 millions rows in a single table. This is part of my tests to use MySQL as a key value store, so there are no secondary indexes, no foreign keys and as I want to test this with MySQL Cluster, I have also removed any VARCHAR fields. As it stands, the table has one BIGINT one INT and one DOUBLE column, and the BIGINT is the PRIMARY KEY. This is real data though, but I have removed some fields only.

My first test was with a cold database, just started. Noone else was using it and I had an InnoDB cache of 8 Gb using half of what was in the machine. Then I had a benchmark that reads 1 millions rows from this table, using the BIGINT column for lookup, and the 1 IDs to look up are spread over 30 threads (I have 8 cores in this machine). With a cold database, it took 869 seconds to read all this data. The next round, with data in memory, it was a lot faster, but not as fast as you might think, just about half the time, 432 seconds. Now, during that last round, show status showed a lot of "Waiting for query cache lock", so I decided to try without the query cache? Fast or slower, what do you think. The answer is faster. A LOT faster! some 30 seconds! As compared to 432 seconds! This calls for further investigation, but my initial response to this is: If you use MySQL in a way that your data in in memory, then the query_cache isn't just useless, it is severely limiting performance! But I'll have some more data soon on this, this is getting fun!

/Karlsson

7 comments:

Sheeri K. Cabral said...: Have you tried using query_cache_type=2 (the "on demand" mode) and seeing if it has a similar performance limitation?

I tell people that due to the overhead of checking if a query is in the query cache, query_cache_type=1 isn't useful. But if you DO have a use-case for the query cache, use the DEMAND mode, and in your selects, instead of

SELECT col FROM tbl WHERE...

you do

SELECT SQL_CACHE col FROM tbl WHERE....

But it'd be interesting to see if there was some overhead to the non-query-cache queries there.; June 14, 2012 at 3:59 PM
Justin Swanhart said...: Sheeri,
Using the query_cache in DEMAND (mode 2) mode won't help.

There is a single mutex that protects the query cache, so a multithreaded application will usually encounter a LOT of contention on this hot mutex. As long as the query cache is enabled, the mutex is called for every query, even those without the SQL_CACHE hint.

Percona Server(and MySQL 5.5) have added the THD->status indicating that the mutex is being waited on so that you can see the bottleneck easier without having to use something like pt-pmp.

Percona Server (and MySQL 5.5) will completely eliminate the mutex when query_cache_type=0, but this means that you can't turn the query cache on at runtime if you disable it in your my.cnf; June 14, 2012 at 5:57 PM
Justin Swanhart said...: Very useful advice:
http://dom.as/tech/query-cache-tuner/

:); June 14, 2012 at 5:59 PM
x said...: For this reason, I have _always_ disabled the query cache at configure/build time so that the code is not compiled at all.

See my opinion from a few years ago:

http://permalink.gmane.org/gmane.comp.db.mysql.devel/31872; June 14, 2012 at 7:30 PM
Anders Karlsson said...: Anthony!

Looking at my data, that seems like a good idea. But honestly, there should be a better way of dealing with it, the query cache isn't that a bad idea, but the implementation seems to be lacking. I have an idea on how to fix it, let's see if I have the time to try that.

/Karlsson; June 14, 2012 at 7:49 PM
Garp said...: Depends on the nature of the queries. We've some complex queries that get run against a database on a frequent basis. Even on a modern 8 core fast system with the data cached in RAM it takes several seconds to piece the data together. We get some significant advantage by having the query_cache in place.

There is some argument for switching to on-demand caching for sure but I think that'd involve a long argument with the vendor/developer.; June 14, 2012 at 11:28 PM
Karlsson said...: Garp!

Yes, there are use-cases for query-cache. My case is that, because of the mutex protection scheme, it is getting less and less useful the more CPUs you have working in parallell. I don't think that piecing data together faster is that much of an advantage if you have to wait for a mutex. Agai, as I said, I'll try to test that.

/Karlsson; June 15, 2012 at 8:25 AM