As a follow-up to this, I wanted to test what the difference really is when we use database intensive operations. Here. procedures really should be faster, as they run inside MySQL itself. Books has already compared the performance of MySQL Compute intensive operations to MSSQL, and regretable, MySQL came out behind the Redmond thingy.
But now to my really simple database operations performance test. I'm using a procedure to INSERT data into a table, and to reduce the impact of the INSERT itself, I'm using the Blackhole Storage Engine. Then I write a simple procedure to INSERT into the table. All in all, this looks like this:
CREATE TABLE IF NOT EXISTS foo(bar INT NOT NULL PRIMARY KEY, col2 CHAR(100))
ENGINE=Blackhole;
DROP PROCEDURE IF EXISTS perf;
delimiter //
CREATE PROCEDURE perf(nOper INTEGER)
BEGIN
DECLARE i INTEGER DEFAULT 0;
WHILE i < nOper DO
INSERT INTO foo VALUES(57, 'Some data');
SET i = i + 1;
END WHILE;
END
//
Now, I need to run this puppy somehow and I want to run the same inserts from a client program, and for this I write a simple C program, like this:
#include
#include
#define MY_SOCKET "/tmp/mysql5131.sock"
int main(int argc, char *argv[])
{
MYSQL *pMySQL;
int i, nLoop;
char *pStmt;
if(argc < 3)
{
fprintf(stderr, "Usage: %s
return 0;
}
nLoop = atoi(argv[1]);
pStmt = argv[2];
pMySQL = mysql_init(NULL);
if(mysql_real_connect(pMySQL, NULL, "perf", "perf", "test",
0, MY_SOCKET, CLIENT_COMPRESS) == NULL)
{
fprintf(stderr, "Error %s connecting to MySQL.\n",
mysql_error(pMySQL));
mysql_close(pMySQL);
return 1;
}
for(i = 0; i < nLoop; i++)
{
if(mysql_query(pMySQL, pStmt) != 0)
{
fprintf(stderr, "Error %s in MySQL query.\n", mysql_error(pMySQL));
mysql_close(pMySQL);
return 1;
}
}
mysql_close(pMySQL);
return 0;
}
The C prgram is called spperf, and it take 2 arguments, a counter of how many times the statement should be run and the text of the SQL statement to run. This way I can do something remotely interesting, which is to run X number of loops in SP and X number outside, i.e. I can loop in both the SP and in the C program. I'll show what I mean real soon.
To begin with, I run 100000 INSERTs to the table foo created above. I use linux time to time the execution. Yes, I know this is cride, but it works for this simple test. So:
[root@moe spperf]# time ./spperf 100000 "INSERT INTO foo VALUES(57, 'Some data')"
real 0m6.967s
user 0m0.580s
sys 0m0.702s
As we can see, it took some 7 seconds for this execution. Now, to run the same INSERTs using the procedure, I do this:
[root@moe spperf]# time ./spperf 1 "CALL PERF(100000)"
real 0m3.439s
user 0m0.003s
sys 0m0.001s
And as we can see, this is a bit faster. Now we can try some other combinations, like running 4 statements at the time in the procedure, and calling the procedure 25000 times, which will cause the same 100000 INSERTs as in the examples above
[root@moe spperf]# time ./spperf 25000 "CALL PERF(4)"
real 0m5.609s
user 0m0.184s
sys 0m0.268s
And the procedure, even when called 25000 times, still outperforms 100000 straghtforward Client inserts.
There are more tests that can be done here, I'm going to do two more now, one where I only do 1 INSERT per Stored Procedure call, and one where I do 2.
[root@moe spperf]# time ./spperf 100000 "CALL PERF(1)"
real 0m12.703s
user 0m0.770s
sys 0m0.853s
[root@moe spperf]# time ./spperf 50000 "CALL PERF(2)"
real 0m7.775s
user 0m0.337s
sys 0m0.461s
As we can see, for a simple procedure, it's not terribly fast, but as soon as there is something slightly more to do, a procedure really isn't such a bad idea, at least if that something is not comptationally intensive that is. I have just one more thing to test now.
As Brooks has already determined, computational operations in a stored procedure really aren't fast at all. So then, my perf procedure above, when running with just 1 INSERT and I run it 100000 times, really can be simplified. So if I do that, let's see what happens. I create a new simpler procedure like this:
DROP PROCEDURE IF EXISTS perf2;
delimiter //
CREATE PROCEDURE perf2()
BEGIN
INSERT INTO foo VALUES(57, 'Some data');
END
//
And then I run it and compare it with running the first procedure with the argument 1 and see what happens:
[root@moe spperf]# time ./spperf 100000 "CALL PERF(1)"
real 0m12.744s
user 0m0.698s
sys 0m0.779s
[root@moe spperf]# time ./spperf 100000 "CALL PERF2()"
real 0m7.630s
user 0m0.491s
sys 0m0.779s
Ouch! That was some difference from looping just 1 time in the first procedure. Lesson: Do as little compuational stuff as possible in any procedure.
/Karlsson
2 comments:
"Do as little computational stuff as possible in any MySQL procedure"
There, fixed that for ya ;)
Ok - jokes aside. I don't care so much about the procedures, but isn't this the axe for mysql stored functions? Argueably the point of those is computation. Is there anything that can be done to fix this? Beat Vontobel had a really smart and straighforward idea to use a kind of 'macro-optimization' to improve stored function performance, is there anyone interested to build this?
Actually, I was thinging of trying a few options for the loop in this procedure. I have some ideas for that. Instead of the simple increment maybe there is another option to do this that might be faster? I think we have a simple framework for some basic testing here.
/Karlsson
Post a Comment