Wednesday, December 24, 2008

Using triggers for performance

Can you use TRIGGERs to increase performance? Really? Isn't so that a TRIGGER on a table will reduce performance, just as any FOREIGN KEYs will? Right?

Nope, none of those statements is necessarily very true. In the case of reads, for example, neither TRIGGERs, nor FOREIGN KEYs has any negative impact at all. Then, assuming we have a read-intensive system, such as some website, then performance isn't much of an argument for not using triggers or foreign keys.

You already know I'm quite a fan of foreign keys. Let me reiterate why I like them: They keep my data integrity intact, that's why. The argument that if I have transactions, then the only thing that might cause data integrity to fail is if there is a bug in the application. There are two things I have to say against that:

  • Many, if not most, users use more than one application. For example, if nothing else, the mysql and mysqldump commandline tools are used. To me, if I use foreign keys properly, I can safely mock around manually in the database with SQL, without causing data integrity issues.
  • Foreign keys are declarative. This means that the rather strict things they enforce are easy to look up and maintain, without looking into numerous pieces of code spread all over the place.


Now, I have made my point with Foreign keys, so what about the original subject of tjis article, i.e. Triggers to increase performance, all I have said so far is that they do not at least decrease performance for reads, but can they increase performance? Yepp. Let me give you an example.

Even with semi-complex data structures, you sometimes get into a situation where even a moderately normalized RDBMS structure cause some queries to be unnecessarily complex. On the other hand, denormalizing it usually has other issues. For example a de-normalized design is often best (performance wise) for SELECTs, whereas a normalized design keeps my data integrity better maintained, in particular with INSERT, UPDATE and DELETE statements. And this anomality is where triggers help!

This is how I think: Start with a pretty normalized structure. Identify the areas where a SELECT would be unnecessarily complex, then add the "de-normalizing" columns to the tables, without removing the normalized column structure or foreugn keys. This means I will have duplicate data, but it also means that SELECTs will be performaing much better. INSERTs, DELETEs and UPDATEs will be more complex and less performant, but this is no big issue as:

  • INSERT, UPDATE and DELETE are less frequent! I said this was a mostly read-only system, right!
  • And here is the magic sauce: I can use triggers to maintain the "denormalized" columns. My INSERTs, DELETEs and UPDATEs will not change at all compared to the normalized structure, as this is still in place as it always has been.


I had this issue the other day. I run a website on my spare time where I, among other things, keep track of bluesbands and gigs with them. I figured out that the bands usually were a combination of artists, i.e. I had "Fat Foo" on one gig, and on another I had "Fat Foo and the Bar". So I ended with having the latter being a lineup and the former an artist. In an ideal world, this would have been a great candidate for a hierarcical CONNECT BY, scenario, but not so now, without that particular feature in MySQL.

All the same, I realized, for a number of reasons, that this would be a bit sluggish. Every time I looked up a gig (in the gig table), got get the name of the artist I needed at least 3 lookups: The lineup, the lineup->artist table and the artist table. A gig always references just a lineup, so every artist is part of at least one lineup, the one with just one member:
Artist "Fat Foo" is the only artist in the Lineup "Fat Foo"
but
Artist "Fat Foo" is also one the members in the Lineup "Fat Foo and the Bar", the other artist in that lineup being "The Bar".

A lot of table lookups, just to get the name of a band. So I created not a trigger here (I'll soon tell you why), but a stored function that would update the corresponding lineups everytime an artist changed. The function would simply sync the lineup, having been given an artist ID.

Now it was to to create the trigger that would call the function above. So why didn't I embed the code in the function right into the trigger? Because I also wanted to make sure that I could always redo all the lineup names, if I decided I wanted to change the scheme I use for this. If I did that, I would use the function easily for that, with a simple SELECT.

So, now you ask, how does it work in practice? And what about the case when "Fat Foo" and "The Bar" keeps their names, but the combo is renamed to "The Fat Bar". I just have two columns in the lineup table, one which contains a "generated name" by the function above, and one which is manually entered. I use the manually entered name if it exists, else I use the generated one, a simple IFNULL will do for this, no extra lookups or anything!

Look at the result on PapaBlues.

And by the way, folks, Merry Christmas to you all
/Karlsson

4 comments:

Roland Bouman said...

Hi!

nice post. What I am interested in is, can you show us the figures that prove you actually did increase performance. I mean, I'll take your word for it, but it would be even better if you'd show us ;)

kind regards and happy holidays,

Roland

Karlsson said...

Roland!

That's a good point. That performance IS increased in clear, the only table I am querying now, I would have to query anyway. With at least 2 rows less to query, it would be cool to try this out.
I'll do this after the weekends, possibly using some dummy rows. As for numbers, I have some 30 QPS in the system, and keep track of about 3000 gigs, if I include the historic ones. Then we have to look at bands and lineups, but I'll do another blog posting in a few weeks and provide some real numbers. By then the site should have stabilized.

cathleen2014 said...

I just came out here to see if something like this has been suggested and to make the suggestion if not already made. This scenario has come up for me many times in the past. I fully understand that either triggers or defaults can be used to get some of this done; however, defaults alone don't do the whole job. click here

shahbaz said...

Awesome blog about gift idea. This is exactly what I was looking for! thank you so much for posting.recreation tips