The technology Virident uses to improve performance is a third level of memory storage based on Flash. But it goes way beyond just adding SSD disks. To put things in perspective, look at how the resources in an average server have developed in the last 20 years or so. We now have something like 1000 times more memory and 1000 times more CPU performance, but disk performance has increased very little, maybe 5 times, and that is an optimistic number. Note that this is about disk performance; available disk capacity has also increased 1000 times or so.
What does this mean then? Well, disk I/O is an issue, probably the main issue for database performance. Now, databases have still gotten faster, a lot faster, as we have more memory and can hence cache A LOT more data, which speeds things up enormously. That performance comes from the fact that we can avoid disk I/O.
There are a couple of issues here though:
- For writes, I still need to go to the disk. Independent of how much RAM I have, a disk I/O will still need to happen, to the database or to a logfile, but it must happen. The reason is simple: if I only put my written and committed transaction in a log buffer in memory, my transaction will not be persisted.
- Caching of databases only helps so much. Once you have cached, say, 20 % of the data in the database, further caching will not improve performance as much. The reason is of course that data access patterns are skewed; they are not evenly spread across the total size of the database.
But if we go back 20 years in time again, when we were compensating for slow disks by caching data in RAM, there were compromises being made: fast, random RAM access as opposed to slow, block-level disk access. What has happened since is that the gap between the size of RAM and disk performance has become even bigger. So can we not fill that gap?
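The diminishing return from caching can be sketched with a little arithmetic. This is only an illustration under an assumption of mine: that page popularity follows a Zipf-like law, where the page at popularity rank k is accessed with weight 1/k^skew. Real workloads differ, and the function name and parameters below are made up for the example.

```python
def hit_ratio(num_pages, cached_fraction, skew=1.0):
    """Fraction of accesses served from cache when the hottest pages are cached.

    Assumption (not from any real workload): the page at popularity
    rank k is accessed with weight 1 / k**skew.
    """
    weights = [1.0 / (rank ** skew) for rank in range(1, num_pages + 1)]
    cached_pages = int(num_pages * cached_fraction)
    # The list is sorted hottest-first, so the cache holds the first entries.
    return sum(weights[:cached_pages]) / sum(weights)


if __name__ == "__main__":
    for fraction in (0.1, 0.2, 0.4, 0.6, 0.8, 1.0):
        ratio = hit_ratio(100_000, fraction)
        print(f"cache {fraction:4.0%} of the pages -> hit ratio {ratio:.3f}")
```

With a skew like this, the first 20 % of the pages absorb the large majority of the accesses, and each further 20 % adds less than the one before it.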
Looking at the attributes of the two types of storage we have considered so far, in the case of RAM:
- Is fast and randomly accessed.
- But RAM is also not persistent. It is this point that makes disks still so important. Having all the database in RAM is actually possible in many cases these days, but this is not useful, as that data will not be persistent.
And in the case of disk:
- Is persistent and has higher capacity.
- But disks are also slow and use block-level I/O.
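To make the persistence point concrete, here is a minimal sketch of what "it must reach the disk" means for a committed write. This is not InnoDB's actual log code; the function name and the log format are hypothetical, and the only real calls are Python's `os.open`, `os.write`, `os.fsync` and `os.close`.

```python
import os


def durable_append(path, record: bytes) -> None:
    """Append a record and return only after asking the OS to flush it to the device.

    Hypothetical helper for illustration; a real database log is far more involved.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        view = memoryview(record)
        # os.write may write fewer bytes than requested, so loop until done.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        # Without this call the data may still sit in OS buffers in RAM
        # and be lost on a crash or power failure.
        os.fsync(fd)
    finally:
        os.close(fd)
```

The `os.fsync` call is the expensive part: no amount of RAM removes it, which is why write performance stays tied to the speed of the persistent device.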
I want to note that there are other ways of solving this problem. One is to do what MySQL Cluster is doing, which is “semi persisting” RAM by synchronous replication between nodes.
As anyone can realize, applications really need to be aware of this “third storage media” that Virident provides to work properly. Virident has a special version of the InnoDB plug-in to handle this. And the known scalability issues with InnoDB are not really present here either, or at least to a much lesser extent than in “normal” InnoDB, as this is the InnoDB Plug-in with a lot of fixes for this same problem.
And it doesn’t end there. As I wrote above, for the developer this Flash memory has similar attributes to RAM, i.e. it is not a block-level device but random access, and there is no context switching needed! These are the two features that make this technology stand apart from just plugging SSD disks into any server!
All in all, I’m excited about this; there is a lot of performance potential to gain from this setup. Being able to scale write performance on a single server to a new, higher level means that technologies that are good in and of themselves, like sharding, might not be needed as much anymore. Also, any distributed technology to solve this problem, like MySQL Cluster, has limitations, such as cache invalidation and distributed locking, neither of which makes for high scalability. Maybe Virident technology will be a standard component in any high-end MySQL server eventually?
/Karlsson