Sunday, March 28, 2010

More fun with the MySQL Audit Plugin API

The Audit API has more uses that you may think! When a statement is executed in the server, the notification function in this API will be called, and we can use that do some interesting things!

Like: Ever wanted to know what the most executed query in your running system is? I mean the information is in there somewhere, right, it's just q question of how to get at it? And frankly, I don't think the queries:
SELECT * FROM user_data WHERE user_id = 57;
and
SELECT * FROM user_data WHERE user_id = 113;
Should count as different queries? I mean, hey, the issue is that the application may be issuing the same query too many times, with different arguments mostly, but still the query:
SELECT * FROM user_data WHERE user_id = <somevalue>

Is executed too often? And if so, how can I figure it out? Well, using MySQL 5.5 with a combination of the Audit Plugin API and the INFORMATION_SCHEMA API, you can do this. The principle is simple:

The logging of queries
This is handled by the notification function in the Audit API. In the plugin, we have a an array of queries, and for every query we see, we "normalize it", in the sense that all literals are replaced by question marks, i.e. the query above would be represented as:
SELECT * FROM user_data WHERE user_id = ?
Then we look for this query among the queries I already know about. If the query is found, I just increment teh execution cound for it, and if it is not found, I add it. I keep track of X number of queries, and I log the time each query is added, so when I add a new query, and there already are X queries logged, I get rid of the one that was executed longest ago. And LRU list in short.
In the sample I code I prove evetually, I also log some other data, but this is the basics. And the list is stored in memory by the way.

Seeing the latest queries
For viewing the queries, I use an INFORMATION_SCHEMA plugin, that is part of the same plugin library as the one above. although this is a different type of plugin. This is again pretty simple. The plugin will expose an INFORMATION_SCHEMA table that contains the list of the queries above, and when I select from that table, I materialize the list as a table. As this will then look like any normal table, you can use just about any SQL operation on it, like ordering, filtering specific queries etc. And all this with a simple SELECT!

The issues
Will this affect performance? Yes, sure, but I don't really know to what extent right now, I havent gotten around to test that. And then there is one more issue: You need MySQL 5.5 m3 or above, which currently means you need to get it from Launchpad, there are still (March 29) only m2 binaries to be found at http://dev.mysql.com.

Example code
Now it's time for some sample code. Lets first have a look at how it works. There is an information_schema table introduced by the plugin called my_is_proclist, and this is the one we are interested in. I have a table in the test database that I issue queries on. And it looks like this then:
MySQL [test]> select last_executed, num_executes, statement from information_schema.my_is_proclist where statement like '%from t1%';
+---------------------+--------------+-------------------------------+
| last_executed | num_executes | statement |
+---------------------+--------------+-------------------------------+
| 2010-02-28 15:12:26 | 4 | select * from t1 |
| 2010-02-28 15:34:07 | 6 | select * from t1 where c1 < ? |
+---------------------+--------------+-------------------------------+
2 rows in set (0.00 sec)

MySQL [test]> select * from t1 where c1 <>
+------+------+
| c1 | c2 |
+------+------+
| 1 | 16 |
| 2 | 16 |
| 3 | 16 |
+------+------+
3 rows in set (0.00 sec)

MySQL [test]> select last_executed, num_executes, statement from information_schema.my_is_proclist where statement like '%from t1%';
+---------------------+--------------+-------------------------------+
| last_executed | num_executes | statement |
+---------------------+--------------+-------------------------------+
| 2010-02-28 15:12:26 | 4 | select * from t1 |
| 2010-02-28 15:34:12 | 7 | select * from t1 where c1 < ? |
+---------------------+--------------+-------------------------------+
2 rows in set (0.00 sec)

As you can see, the counter goes of what I execute the query in question. Nothing complicated, and as you can see, I can do all sorts of filtering on the INFORMATION_SCHEMA table.

Installing the plugin
The plugins are contained within one library. I have the sourcecode and a simple makefile downloadable here: http://www.papablues.com/src/my_proclist.tar.gz

Modify the supplied Makefile appropriately and run:
make
make install
Note that you must have rights to write to the MySQL Server plugin directory for the install to work! Once this is done, you must tell the MySQL server about the plugins:
install plugin my_proclist soname 'my_proclist.so';
install plugin my_is_proclist soname 'my_proclist.so';
And that's it! Note that there is also a status variable that shows how many statement I currently keep track of. Currently, the max number of statement I track is 30, but there is a simple
#define NUM_STMTS 30
in my_proclist.cc that you may edit to fix that.


/Karlsson

No comments: