Tuesday, November 23, 2010

Monitoring MySQL SQL statements the way it SHOULD be done!

You may have read a previous post of mine, back in April this year, where I wrote about using the MySQL 5.5 Audit interface to SQL Statement monitoring. There was a bunch of comments and some ideas, but not much happened. Until now that is.

Hereby I release the first version of SQLStats, which is a much enhanced version of what I described in the previous post. This is a MySQL Plugin for MySQL 5.5 that allows you to monitor the most recently executed, and the most frequently executed statements using 2 INFORMATION_SCHEMA tables. The thing is not very complex, to be honest, but it does do the job. So what was the job then? Well, looking at what this plugin does, it goes something like this:
  • Allows you to monitor ALL SQL statements executed by the server.
  • The SQL statements are "normalized", meaning that literals / constants are removed before comparison.
  • Data is saved in memory. No disk access and very little overhead.
  • Data is retrieved from INFORMATION_SCHEMA tables, just a simple SELECT and you know what is going on in the server.
I have done some basic testing of the overhead of this plugin, and it wasn't even noticable. If you have used the general query log for this stuff before, know what overhead I'm talking about. With SQLStats, the overhead is close to 0.

There is no need for MySQL Proxy or anything like that. There is no need to change something in the Client or in the Connector. To be honest, there are a couple of things I want to add to the plugin eventually, but this is a starting point at least. To use it: download it, install the tomcat / mysql monitoring server, install the monitoring agent and ... No wait, that was MySQL Enterprise Monitor, this is how you do it: download, build, install and use it. That's it.

So where can I get it, you ask (or maybe you don't, but I'm gonna tell you anyway). It's on sourceforge, and you can download it from here:
http://sourceforge.net/projects/sqlstats/

There are two files to download: A simple PDF documents with some basic usage and configuration tips, and a source-code package (which also includes the same PDF).

What would I want from you? Ideas for future development, bug reports and a few beers, that's it, not too much to ask, eh?

/Karlsson
Also, did I mention that the overhead is VERY low...

13 comments:

Antony said...

Yay! Someone started using the audit framework!

BTW, rather than testing the event_class, you can specify in the audit descriptor the bitmask of event classes you want... which in your case is MYSQL_AUDIT_GENERAL_CLASSMASK.

Site Blogger said...

Going to test this soon. From your descriptions, it's exactly what the doctor ordered.

Compiling the plugin was not too straightforward for CentOS/RHEL 5. Stuff like my_config.h are installed in "/usr/include/mysql".

Had to manually include the last line in Makefile to get it to find the my*.h files.

AM_CPPFLAGS = -I$(MYSQL_SRC)/include -I$(MYSQL_SRC)/regex -I$(MYSQL_SRC)/sql -DMYSQL_DYNAMIC_PLUGIN -I/usr/include/mysql

Site Blogger said...

Two more issues:

1) The install.sql and uninstall.sql are not part of the package.

2) There is a typo in the Sqlstate PDF.

The line which contain instruction to manually config my.cnf

"plugin-load=sqlstats=sqlstats.so;sqlstats_topsql=sqlstats.so;sqlstats_lstsql=sqlstats.so"

sqlstats_lstsql should be "sqlstats_lastsql"

Karlsson said...

Thanx for the comments. A new version will be on i's way real soon.

Karlsson said...

A new version is now available, 1.1. This fixes the distribution on the install/uninstall files, the typo and hopefully the source directory issue, at least temporarily.
The latter should also be fixable by stating --with-mysqlsrc=/usr when running configure. But I have tested on Fedora only so far, so god knows (also I am using tarballs, not RPMs)

mleith said...

Oh, so we should never time SQL statements runtimes when monitoring them.

Now I know where we've been going wrong with MEM's Query Analyzer. Silly us.

;)

Karlsson said...

Mark!

Well, no, I didn't say that. What I am saying is that the MySQL Server should be instrumented to do that. And frankly, what you see in Proxy is how long the query took to return from the server, which isn├Ąt necessarily long long it took to execute.
But once the server is instrumented to keep track of what is really going on in the server, then OK. And SQLStats can do a couple of things that MEM cannot, as it has access to the THD. But that is a different issue altogether.
Times SQL statements isn't everything either. We can do this today, without SQLStats and without MEM, using the slow-query log which is now dynamic. The more troublesome queries are those quick but frequent ones, which SQLStats DOES trap.
Come to think of it, maybe I should add ROWS_EXAMINED to SQLStats? That would be a reasonable approximation of the performance impact of a query.
And none of this is to say that the nice MEM GUI isn't useful or anything. Or the advisors. Or the history of events. But what MEM is missing is support from a properly instrumented server, something I try to overcome with SQLStats.

/Karlsson

mleith said...

Of course, I agree, to an extent.. :)

The audit interface is entirely the wrong interface to do this in the server though, so I disagree that this is how it SHOULD be done.

*This* is how it SHOULD be done from the server side:

http://forge.mysql.com/worklog/task.php?id=2515

One other note though - from an application perspective (think page view load time), the thing that people really *do* care about is how long the statement took to return to the client. So even given server based instrumentation, there is still value in recording (some) likewise data from a client perspective as well..

And don't get me wrong, I'm not knocking what you have here - it's a good intermediate step. :)

Site Blogger said...

ROWS_EXAMINED would be a great improvement in that this value directly translates to the efficiency of the query, even more important than time of execution since a bad query can be fast but a query with high rows_examined value is always bad.

To mleith's point to the planned way to do instrumentation.

I read the wl#2515 and it doesn't normalize the sql statements like sqlstate plugin. Unless I misread it but it's huge problem on a dynamic system where sys admin need see the overall picture in an avalanche of the same queries with different input values.

Karlsson said...

At your service folks, NUM_ROWS_EXAMINED, as returned in the MySQL THD, is now added to the INFORMATION_SCHEMA tables, as well as NUM_ROW_EXAMINED_TOTAL. Again, this is picked up from the THD, so there is no magic here, if the THD doesn't have the info or it is wrong, well then so be it.

Download SQLStats 1.2 for this:
http://sourceforge.net/projects/sqlstats/

/Karlsson

mleith said...

I agree that normalization is important (it's how we do it in MEM too).

Normalization can happen at a higher level than the instrumentation layer though (even though I too would like to see a summary table by normalized statement as well)..

Mark Callaghan said...

As Mark mentions, sometimes we want the client-perspective especially for response time. That combined with server-side stats lets you figure out whether changes in response time are caused by MySQL. The network and application server are frequent sources of latency that are often overlooked.

Gabi D said...

Hello Anders,
Some screenshots of the monitor would be great, just to get a feeling what it's about.
Thanks in advance,
Gabi