Tuesday, October 9, 2012

Too many or too big open MyISAM tables??

MySQL tries to be smart and to cram as much performance out of available hardware, and one thing that most MySQLers knows is that opening a table (ie. opening the frm file, and depending on the storage engine, any additional files related to the table) is expensive from a performance point of view. When we see the opened_tables status variable increasing, we tend to increase the table_open_cache setting and keep as many tables as possible open and avoid reopening them.

When it comes to MyISAM though, this has a nasty side-effect. When the server crashes and there are MyISAM tables open, these might well need to be REPAIRed. And if you have big MyISAM tables, this is an issue. Let's say that your application use several MyISAM tables, with the same content, and that you create new tables after a certain time, to keep the size of each individual table down? There are other reasons why you have this effect also, say a large MyISAM table that this rarely updated or read from, but suddenly it is, and then it is kept alone again? When a crash occurs, you might be stuck with several MyISAM tables that are open, but have not been accessed in a long time so might not need to be open, but still there is a risk that these will require a long and boring REPAIR.

There is a Feature request for this; 67142, but I was thinking that instead of having low level server code do this, this would be a good thing to implement using EVENTs. So let's give it a shot.

To being with, I need to figure out when a table was last touch. MySQL doesn't record when a table was last read from, but if we are OK with flushing tables that haven't been written to in a specific time, then the update_time column in the information_schema table does the trick. To figure out what tables to flush, I need to select table and database names from this table for all tables that haven't been modified within a certain time. Also, I must filter so I only get MyISAM tables, and make sure that I don't hit the MyISAM tables in the mysql database. An appropriate SELECT may look like this:
SELECT table_schema, table_name
  FROM information_schema.tables
  WHERE engine = 'MyISAM' AND table_schema != 'mysql'
    AND UNIX_TIMESTAMP(NOW()) - UNIX_TIMESTAMP(update_time) > <seconds>

That's not enough of course, all this gets me are the table and database names, we need to do something more than this. But let's start from the beginning. An EVENT will run an SQL statement, and in my world, there is mostly one statement it should run, which is a CALL to a stored procedure. And in this case, this procedure should use the SQL above to figure out what tables to flush, and then do the actual flushing. As the FLUSH command, when using inside a MySQL Routine, will not take any parameters, we have to run this as a PREPARED statement. All in all, when we end up with is something like this:
   DECLARE v_database VARCHAR(64);
   DECLARE v_table VARCHAR(64);
   DECLARE cur1 CURSOR FOR SELECT table_schema, table_name
     FROM information_schema.tables
     WHERE engine = 'MyISAM' AND table_schema != 'mysql'
       AND UNIX_TIMESTAMP(NOW()) - UNIX_TIMESTAMP(update_time) >


   OPEN cur1;

-- Loop for all MyISAM-tables that haven't been updated

-- for p_sec seconds.
   FETCH cur1 INTO v_database, v_table;
   WHILE NOT v_no_data DO
-- Create the FLUSH statement.
      SET @tmpstmt = CONCAT('FLUSH TABLES ', v_database, '.', v_table);

-- Prepare and execute the statement.
      PREPARE stmt1 FROM @tmpstmt;
      EXECUTE stmt1;

-- Get the next table / database.
      FETCH cur1 INTO v_database, v_table;
   CLOSE cur1;

-- This is to overcome a bug that causes a warning,

-- even though the warning was caught.
   SELECT COUNT(*) INTO v_dummy FROM mysql.user;

Note that noop SELECT at the end? In some versions of MySQL, when a handler is run, like in this case the CONTINUE HANDLER FOR NOT FOUND warnings will still be around here, which means the execution of the procedure will finish with a warning. To get rid of that, I issue an SQL statement that always runs.I'm not sure why this is happeniing, and for it might be that I have a mistake somewhere, but I do not think so.

So far so good, now all we need is an event, in this case I will run every 15 minutes and flush tables that hasn't been used in the last 15 minutes, but you can set this to anything:
CREATE EVENT myisamflush
  DO CALL myisamflush(900);

Was this enough? Maybe, but to get this working check if the event scheduler is running:
mysql> SHOW VARIABLES LIKE 'event%';
| Variable_name   | Value |
| event_scheduler | OFF   |

In this case, it wasn't running so we should start it (and we should also modify the config file so that the event scheduler is running when the server is restarted, but that is a different story):
mysql> set global event_scheduler=1;

Before we finish up, note that everything above assumes that you are running with root privileges or similar.


1 comment:

Simon J Mudd said...

Nice to see you mention my feature request and a solution which should almost work without too much effort.

flush_time was also mentioned to me as a contender but ideally "last access" is really what we want. I'm happy if we get that exposed via P_S or some other system table to use that rather than "in-built" stuff. The event scheduler is after all quite useful for stuff like this.