Saturday, August 27, 2011

Percona Live getting closer and some odd notes

Percona Live MySQL Conference, London, Oct 24th and 25th, 2011Percona Live in London is getting closer, and Oct 24 will soon be with us. Before that I hope to finish my delayed honeymoon in S.t Petersburg, move house and Travel to New York (I am glad I am NOT doing that today, Hurricane Irene doesn't seem like a nice lady).

I have done some more work on my presentation, and I hope this will end up being pretty interesting. At the same time, Recorded Future is in the middle of some major rewamping on the backend, so things are a bit hectic, to say the least.

The other week we had an issue with one of our slaves falling behind and having problems to catch up. I did not have my accelerator working on that machine, so I started it, and immediately the slave was catching up way faster than before, so again I have proven to myself that this puppy really helps. You can read more on the accelerator here: Slavereadahead 1.3 released.

This fall I am testing a bunch of alternative database technologies, some based on MySQL, some complementary to MySQL and some not MySQL-related at all, and I'll be talking a bit on some of them in London.

Special offer: Use the discount code RecFPLUK when registering to get a special discount.

See you in London
/Karlsson

Wednesday, August 17, 2011

Speaking at Percona Live in London October 24-25

I'll be speaking about MySQL in the Cloud at Percona Live in London October 24-25 and this will be talk centered on our experiences of running a database setup, based on MySQL but with serveral complementary technologies, in Mamzon EC2.

I will talk about our experiences so far, we have a lot of data at Recorded Future and we are also write intensive, which is a slight difference from what you may be used to from other Web-based systems. The amount of data we have right now is in the multi Tb range, and we are growing fast. Is a could environment really useful for something like this? What does work and what doesn't?

Also, I will talk a bit on what I think the technological issues are here when using traditional servers such as MySQL and MongoDB, and how I would like it to work instead. Finally, I'll take a shot at looking at some alternatives that I have been testing recently? I like MySQL, and I have used it for many years and I support it now, but I am also a pragmatist+ Are there options and why are they interesting?

See you in London!
/Karlsson

Friday, August 12, 2011

MySQL Global status difference using MySQL procedures / functions

As a MySQL DBA, I guess you use the SHOW GLOBAL STATUS command or the corresponding INFORMATION_SCHEMA.GLOBAL STATUS table to show current mySQL status quite often. And many of us use different tools to collect that data and view it.

But sometimes we use same command from the SQL prompt, because we have to, because it's the only option or just because that is the most convenient way. And often you kick of two such command in a row, a few seconds apart, just to see how things are moving:
SHOW GLOBAL STATUS LIKE 'innodb%rows%';
... DBA picks his or her nose for a few seconds ...
SHOW GLOBAL STATUS LIKE 'innodb%rows%';

And then you can see how things are advancing. But how much? To figure of how much, you have to calculate the differnce between the values returned by those two statements. And then there is another issue. How much is the difference per second? To know what the difference is per second, we have to includ the UPTIME_SINCE_FLUSH_STATUS status value in those statements. So we do this then:

SHOW GLOBAL STATUS WHERE variable_name like '%rows%' OR variable_name = 'Uptime_since_flush_status';
... Yet another chance for the DBA to pick the nose ...
SHOW GLOBAL STATUS WHERE variable_name like '%rows%' OR variable_name = 'Uptime_since_flush_status';

Now we have the data we need to work with, but we still need to do some math. And math is fine of course, in and of itself, but wouldn't it be nice to have some kind of contraption to do that for us? Like a machine? Let's call it a computer! Whao!

No, joking aside here, this is perfect job for a stored procedure or something. What we need is a stored procedure to to the jobs for us, and somewhere to store the previous value inbetween status runs, and here I will use a MySQL variable for that. Before I show you the code, let me show you what running it looks like:

MySQL> call getstat('%rows%', FALSE);
+--------------------------+-------------+
| variable_name | current |
+--------------------------+-------------+
| INNODB_ROWS_DELETED | 171085420 |
| INNODB_ROWS_INSERTED | 4940247881 |
| INNODB_ROWS_READ | 16041373517 |
| INNODB_ROWS_UPDATED | 90534033 |
| NOT_FLUSHED_DELAYED_ROWS | 0 |
| SORT_ROWS | 266732753 |
+--------------------------+-------------+
6 rows in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

MySQL> call getstat('%rows%', FALSE);
+--------------------------+-------------+-------------+------------+--------------------+
| variable_name | previous | current | difference | difference_per_sec |
+--------------------------+-------------+-------------+------------+--------------------+
| INNODB_ROWS_DELETED | 171085420 | 171085420 | 0 | 0.00 (2 s) |
| INNODB_ROWS_INSERTED | 4940247881 | 4940249044 | 1163 | 581.50 (2 s) |
| INNODB_ROWS_READ | 16041373517 | 16041373597 | 80 | 40.00 (2 s) |
| INNODB_ROWS_UPDATED | 90534033 | 90534044 | 11 | 5.50 (2 s) |
| NOT_FLUSHED_DELAYED_ROWS | 0 | 0 | 0 | 0.00 (2 s) |
| SORT_ROWS | 266732753 | 266732753 | 0 | 0.00 (2 s) |
+--------------------------+-------------+-------------+------------+--------------------+
6 rows in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

The first parameter to the getstat procedure is the parameter LIKE-string you want to use, the second is a boolean if you want to rerun the base for the saved parameters. In other words, the first time the procedure is run, it collects relevant data, but it only shows current values, as that is what it has. The second and subsequent runs, it shows the difference from the first run, unless you set the other parameter to TRUE, which causes a new set of values to compare with to be set. If you subsequently again run it with the second parameter set to TRUE, the procedure will only show the difference between the two runs.

The nice thing with this procedure is that is uses no temp tables, which means no disk IO or anything, just access to the INFORMATION_SCHEMA GLOBAL_STATUS table, that's it.

So, now you can wait to get the procedure and use it yourself? OK, here we go, enjoy! (And there is actually the procedure and a simple supporting function).

/Karlsson

DROP PROCEDURE IF EXISTS getstat;
delimiter //
CREATE PROCEDURE getstat(p_like VARCHAR(64), p_gennew BOOL)
BEGIN
DECLARE v_stat TEXT;
DECLARE v_name VARCHAR(64);
DECLARE v_value VARCHAR(1024);
DECLARE v_nodata INTEGER;
DECLARE v_uptime_current INTEGER;
DECLARE v_uptime_last INTEGER;
DECLARE c1 CURSOR FOR SELECT variable_name, variable_value
FROM information_schema.global_status;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET v_nodata = 1;

SET v_nodata = 0;
-- Make sure we always generate new stats the first time we run.
IF @savedstat IS NULL THEN
SET p_gennew = TRUE;
END IF;

IF p_gennew THEN
SET v_stat = '';
-- Get the current status now, and loop through it.
OPEN c1;
l1: LOOP
FETCH c1 INTO v_name, v_value;
IF v_nodata THEN
LEAVE l1;
END IF;

-- Add name and value to saved statuses.
SET v_stat = CONCAT(IF(v_stat = '', '', CONCAT(v_stat, ',')), v_name, ':', v_value);

-- Get current uptime.
IF v_name = 'UPTIME_SINCE_FLUSH_STATUS' THEN
SET v_uptime_current = v_value;
END IF;
END LOOP;
CLOSE c1;
ELSE
SELECT variable_value INTO v_uptime_current
FROM information_schema.global_status
WHERE variable_name = 'UPTIME_SINCE_FLUSH_STATUS';
END IF;

SET v_uptime_last = getstatvalue('UPTIME_SINCE_FLUSH_STATUS');
-- If there was a previous value to show, then show difference.
IF v_uptime_last != '' THEN
SELECT variable_name, getstatvalue(variable_name) AS previous,
variable_value AS current,
IF(getstatvalue(variable_name) = '', '',
variable_value - getstatvalue(variable_name)) AS difference,
IF(getstatvalue(variable_name) = '', '',
CONCAT(TRUNCATE((variable_value - getstatvalue(variable_name))
/ (v_uptime_current - v_uptime_last), 2), ' (',
v_uptime_current - v_uptime_last, ' s)')) AS difference_per_sec
FROM information_schema.global_status
WHERE variable_name LIKE p_like;
ELSE
SELECT variable_name, variable_value AS current
FROM information_schema.global_status
WHERE variable_name LIKE p_like;
END IF;

IF p_gennew THEN
SET @savedstat = v_stat;
END IF;
END//
delimiter ;

DROP FUNCTION IF EXISTS getstatvalue;
delimiter //
CREATE FUNCTION getstatvalue(p_name VARCHAR(64))
RETURNS VARCHAR(1025)
NO SQL
BEGIN
DECLARE v_pos INTEGER;
DECLARE v_end INTEGER;

-- If no saved status exists, then return nothing.
IF @savedstat = '' THEN
RETURN '';
END IF;

-- Find the value.
SET v_pos = INSTR(@savedstat, CONCAT(',', p_name, ':'));

-- Try to find the value name at the beginning of the string.
IF v_pos = 0 THEN
IF LEFT(@savedstat, LENGTH(p_name) + 1) = CONCAT(p_name, ':') THEN
SET v_pos = 0;
ELSE
RETURN '';
END IF;
END IF;

SET v_pos = v_pos + LENGTH(p_name) + 2;
-- Now find the value in the string, following the name and a colon.
SET v_end = LOCATE(',', @savedstat, v_pos);
IF v_end = 0 THEN
SET v_end = LENGTH(@savedstat) + 1;
END IF;

-- Extract the value and return it.
RETURN SUBSTR(@savedstat, v_pos, v_end - v_pos);
END//
delimiter ;