Showing posts with label mysql.

Monday, August 8, 2016

MyQuery 3.5.6 released

I released version 3.5.6 of MyQuery, and there are quite a number of new features and fixes in there. The #1 bugfix is that the annoying access warnings that popped up from Windows when saving to the registry are gone, as I have now moved the registry entries to a location that Windows 10 finds more acceptable. Among the new features are:
  • JSON format output when saving results.
  • More flexible CSV format output with many new options.
  • Ability to save Dyncol as JSON in CSV and JSON output.
  • Nicer formatting of numbers in status dialogs.
  • Auto refresh of status dialogs.
As this is a version with many new features, I still consider this a Beta. I have built it on Windows 10 and tested it on Windows 10 and 7, 64-bit, although MyQuery itself is still a 32-bit Windows application.

Happy SQL'ing
/Karlsson

Tuesday, February 16, 2016

Loading JSON into MariaDB or even MySQL - mysqljsonimport 2.0 is available

It was a long time since I updated mysqljsonimport or mysqljsonexport, and there were a few things I wanted to do with them. This release is significant enough for me to bump it up to 2.0, and the same is in the works for mysqljsonexport. The one big thing that is now implemented is reasonably advanced support for MariaDB Dynamic Columns, and it is actually pretty flexible, allowing you to load a nested JSON object into a MariaDB Dynamic Column. But don't worry, it will still link and run with MySQL if that is what you want to do (but then you will not have the dynamic column features, for obvious reasons).
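To give a feel for what a nested JSON object looks like once it has landed in a MariaDB Dynamic Column, here is a small hand-written example using MariaDB's dynamic column functions. The table and data are made up purely for illustration; mysqljsonimport would create and populate something along these lines for you:

-- Hypothetical table; mysqljsonimport would populate something similar.
CREATE TABLE json_docs (
  id  INT PRIMARY KEY,
  doc BLOB                 -- dynamic column storage
);

-- A nested JSON object becomes a dynamic column inside a dynamic column
-- (nested dynamic columns are supported in MariaDB 10).
INSERT INTO json_docs VALUES
  (1, COLUMN_CREATE('name', 'Karlsson',
                    'address', COLUMN_CREATE('city', 'Stockholm', 'zip', '11122')));

-- And COLUMN_JSON() turns it back into JSON text.
SELECT id, COLUMN_JSON(doc) FROM json_docs;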

Download from Sourceforge as usual (yes, I know I am old-fashioned and that I should have put it on GitHub). As usual, the documentation in PDF format is also downloadable separately.

/Karlsson

Monday, December 14, 2015

Getting started with MariaDB on IBM POWER 8

IBM POWER 8 is the latest generation of the IBM POWER series, and it's a hot one. Above all, for you reading this, POWER 8 is the most Linux friendly so far and IBM really wants you to try it out. Several Linux distributions support POWER 8 now, and MariaDB is of course the database of choice. Some cool things with the POWER 8 architecture are the support for CAPI (google for more details) and the fact that POWER 8 machines, due to a vastly superior memory architecture, can grow to very large memory sizes, which in general is good news, but if you want your own POWER 8 this makes them a bit expensive (although maybe not when you consider the performance you get). IBM has fixed that recently and has announced the LC series of servers, which start at $6,600 (see more here: http://www-03.ibm.com/systems/power/hardware/linux-lc.html).

So, what about MariaDB then? Well, MariaDB is a standard component in the Linux distributions that support IBM POWER 8, but we here at MariaDB didn't stop there. We have made numerous fixes to MariaDB to make it perform at its best and to increase stability even more. To get at those nice additions and enhancements though, you have to run with the latest MariaDB versions and use the binary builds we provide, and here I'll show you how to do that.

To begin with, you have to register with MariaDB.com, which is free if you want to try MariaDB Enterprise. So surf to MariaDB.com and you should get something like this:
On the top right, as indicated above, are "Login" and "Sign up" links. Select the latter and follow the procedure. Then you can revisit this page and log in, and the links at the top right now say "Logout", "My Portal" and "Profile". Click on the "My Portal" link and you get to a page that looks like this:
Above is indicated the "Downloads" tab, click on this and you will be taken to a page with many different download sections. In this case, let's assume you are on Ubuntu, then select this:
Now it is time to do the actual installation. Just follow the steps listed on the page; as I write this, that means I will run:
wget https://downloads.mariadb.com/enterprise/dnae-wefq/generate/10.0/mariadb-enterprise-repository.deb
dpkg -i mariadb-enterprise-repository.deb
sudo apt-get update
sudo apt-get install mariadb-server
Following all this, MariaDB is now up and running, but I see you asking yourself: what about POWER 8? How do I install MariaDB on that? And fact is, that is exactly what we have done here. Or on Intel x86 for that matter; the procedure is exactly the same, the differences are handled behind the scenes.
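If you want to convince yourself that you really got a POWER build, a quick check from the mysql client is enough (this is a generic check, nothing specific to the Enterprise packages):

-- The architecture the server binary was built for: expect something like
-- ppc64 / ppc64le on POWER 8, x86_64 on Intel.
SHOW VARIABLES LIKE 'version_compile_machine';
SELECT VERSION();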

Happy power hacking! I will get back soon with a writeup on MaxScale on POWER 8.
/Karlsson

Friday, August 7, 2015

Oracle dump utility version 1.1

Today I released version 1.1 of myoradump for download from sourceforge. If you don't know what myoradump is, this is a utility for exporting data from an Oracle database in some relevant text format so that it can be imported to some other database.

The main thing in version 1.1 is that I have added a whole bunch of new output formats, to make it even easier to get your data out of expensive Oracle and into something more effective. The new formats supported are:
  • MySQL - The format of this is a bunch of INSERT statements, like the ones you get when you use mysqldump for example, and it is useful for import into MariaDB (and MySQL). INSERT arrays are supported, as are a bunch of further options (see the illustrative snippet after this list).
  • JSON - This format is rather obvious, the output is a file consisting of one JSON object per row. To support binary data, which is a no-no in JSON, base64 encoding of binary data is also supported.
  • JSON Array - The format is similar to JSON, but instead of separate objects per row, this format consists of one or more JSON arrays of JSON objects.
  • HTML - This format will produce a valid HTML TABLE. This is sometimes useful when you want to view output data that includes UTF8 characters for example.
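To give an idea of the first of these, mysqldump-style output is essentially a stream of INSERT statements. The snippet below is purely illustrative; the exact statements depend on the table and on the options you pass to myoradump:

-- Plain INSERT statements, one per row (illustrative only).
INSERT INTO orders VALUES (1, 17, 100.50);
INSERT INTO orders VALUES (2, 17, 64.00);
-- With INSERT arrays enabled, rows are grouped into multi-row INSERTs.
INSERT INTO orders VALUES (1, 17, 100.50), (2, 17, 64.00);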
In addition, this version of myoradump includes a bunch of new features and bug fixes. I will follow up this post with one that includes some specific examples of using myoradump eventually.

So, don't touch that dial!
/Karlsson

Friday, June 26, 2015

MariaDB with Galera available on the IBM Power8 platform

It was a very long time since I wrote something in this blog, but I have been very busy this spring, mostly with MariaDB on Power. This has been a lot of work, but also a lot of fun. So, what is this MariaDB on Power thing all about? Well, I wrote an introduction to the Power platform late last year. Since then a lot of things have happened though.

One thing is that several service providers out there have adopted Power8 as a platform. To be honest, this really isn't sexy, but it is useful, and as a user of one of these services you will just see the same old Linux you are used to, but potentially it is more powerful and reliable. One such provider is OVH, whose service is better known as RunAbove. If you want to try it, you can do so for free for 7 days, just go there and off you go.

Another important thing is that MariaDB is now available on Power8 running RedHat, SUSE or Ubuntu Linux. To get access to this, pop by MariaDB.com and if you are not yet signed up, then do this now, then go to "My Portal", further to "Downloads" and then select "MariaDB Enterprise and MariaDB Enterprise Cluster". You are now ready to install using the operating system of your choice, but on Power you are, as I said before, limited to SUSE, RedHat and Ubuntu, and if you want to test MariaDB Enterprise Cluster, i.e. MariaDB with Galera, you have to go with Ubuntu.

Installing MariaDB Enterprise Cluster on Power8 is no more complex than on Intel. There are a few things to adjust before you can get started, after having installed the software. The first node has, as usual, to be configured with wsrep_cluster_address set to gcomm:// to ensure that this first node will bootstrap without having to connect to a cluster. Once the cluster is up and running though, this variable is set to the cluster addresses. In my case, this is what the Galera settings look like in /etc/mysql/my.cnf, which is the location of this file on Ubuntu:
# Galera
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="GaleraPower1"
wsrep_cluster_address=gcomm://92.127.22.124
wsrep_node_address=92.127.22.121
wsrep_node_name=galera3
binlog_format=ROW

Note in particular the binlog_format setting. This MUST be set to ROW for Galera to work. But fact is that these settings are not particular to MariaDB on Power, they are the same even on Intel.
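Once all nodes are up, it is easy to verify from SQL that the cluster has actually formed; these are standard Galera status variables, nothing POWER specific:

-- Number of nodes currently in the cluster, and whether this node is
-- synced and ready to accept queries.
SHOW STATUS LIKE 'wsrep_cluster_size';
SHOW STATUS LIKE 'wsrep_ready';
SHOW STATUS LIKE 'wsrep_local_state_comment';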

If this isn't enough to convince you about the advantages of running MariaDB on IBM Power, then see what Foedus in Italy has to say about this combination in this video:


There is more to say about running MariaDB on Power and there is more to come here: I'll look at some performance data, we'll have a look at MaxScale on Power (this is not official yet, but that isn't stopping me), and there will be a blog on how to run a Power8 emulation on Intel, which I have promised before.

So, don't touch that dial!

/Karlsson

Thursday, October 2, 2014

OraMySQL 1.0 Alpha released - Replication from Oracle to MariaDB and MySQL!!

Now it's time to release something useful! At least I hope so. I have been going through how I came up with this idea and how I came up with the implementation in a series of blog posts:
But now it's time for the real deal, the software itself. This is an Alpha 1.0 release but it should work OK in the more basic setups. It's available for download from sourceforge, and there you find the source package, which uses GNU autotools to build. The manual is part of this download, but is also available separately.

You do need the MariaDB libraries and include files to build it, but no Oracle specific libraries are needed, only Oracle itself of course. There are build instructions in the documentation.

Cheers and happy replication folks

/Karlsson

Wednesday, October 1, 2014

Replication from Oracle to MariaDB the simple way - Part 4

Now it's time to get serious about replicating to MariaDB from Oracle, and we are real close now, right? What I needed was a means of keeping track of what happens in a transaction, such as a LOG table of some kind, and then a way of applying this log to MariaDB when there is a COMMIT in Oracle. And the thing is, these two don't have to be related. So I can have a table that I write to, a Materialized View on it that is refreshed ON COMMIT, and a log table of some kind. And when the Materialized View is refreshed, as there is a COMMIT, the log can be applied. From a schematic point of view, it looks something like this:
This looks more complex than it is, actually; all that is needed is some smart PL/SQL and this will work. I have not done much testing beyond checking that the basics work, but the PL/SQL needed I have done for you, and the orders table triggers and whatnot are also created for you by a shell script that can do this for any table.

As for the DUMMY table that I have to use to get a trigger on COMMIT, this doesn't have to have that many rows; I actually just INSERT into it once per transaction, and what I INSERT is the transaction id, which I get from Oracle. This table will have some junk in it after a while, as all the transactions that were started and COMMITted will have an entry here, but in my code for this I have included a simple job that purges inactive transactions from this table.
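To make this a bit more concrete, here is a rough sketch of the pieces. All object names are made up for illustration, and the shell script mentioned above generates more careful versions of this per table, but the principle is the same:

-- Log table: one row per statement to replicate, tagged with the Oracle
-- transaction id so the statements can be applied when that transaction commits.
CREATE SEQUENCE repl_seq;
CREATE TABLE repl_log (
  tx_id    VARCHAR2(64),
  seq_no   NUMBER,
  sql_text VARCHAR2(4000)
);

-- The DUMMY table: gets the transaction id INSERTed once per transaction.
CREATE TABLE repl_tx (
  tx_id VARCHAR2(64)
);

-- Materialized view on the DUMMY table, refreshed ON COMMIT, which is what
-- gives us a trigger that only fires when the transaction actually commits.
CREATE MATERIALIZED VIEW repl_tx_mv
  REFRESH ON COMMIT
  AS SELECT tx_id FROM repl_tx;

-- The per-table trigger no longer talks to MariaDB directly, it just logs the
-- statement together with the current transaction id (INSERT shown, UPDATE
-- and DELETE work the same way).
CREATE OR REPLACE TRIGGER orders_log_insert
AFTER INSERT ON orders
FOR EACH ROW
DECLARE
  l_tx VARCHAR2(64) := dbms_transaction.local_transaction_id;
BEGIN
  INSERT INTO repl_tx
    SELECT l_tx FROM dual
    WHERE NOT EXISTS (SELECT 1 FROM repl_tx WHERE tx_id = l_tx);
  INSERT INTO repl_log VALUES (l_tx, repl_seq.NEXTVAL,
    'INSERT INTO orders VALUES ('||:new.order_id||', '||:new.customer_id||
    ', '||:new.amount||')');
END;
/

-- At COMMIT the materialized view is refreshed and this trigger fires, picking
-- up the logged statements for the committing transaction and shipping them
-- off to MariaDB through the oraudp extproc from the earlier posts. The real
-- code also purges old entries from repl_tx, as mentioned above.
CREATE OR REPLACE TRIGGER repl_tx_mv_commit
AFTER INSERT ON repl_tx_mv
FOR EACH ROW
BEGIN
  FOR r IN (SELECT sql_text FROM repl_log
            WHERE tx_id = :new.tx_id ORDER BY seq_no) LOOP
    oraudp(r.sql_text);
  END LOOP;
  DELETE FROM repl_log WHERE tx_id = :new.tx_id;
END;
/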

Best of all is that this works even with Oracle Express, so no need to pay for "Advanced Replication", not that I consider it really advanced or anything. I'd really like to know what you think about these ideas. Would it work? I know it's not perfect, far from it, but for the intent of having a MariaDB table reasonably well synchronized with an Oracle table, this should work. Or? The solution is on one hand simple and lightweight, but I have given up on a number of features and possibly the design also affects performance a bit. But it should be good enough for many uses, I think?

Let me hear what you think, I'm just about to release this puppy!

/Karlsson

Monday, September 29, 2014

Replication from Oracle to MariaDB the simple way - Part 3

In this third installment in this series, I'll explain why the smart solution I described in the previous post actually wasn't that good, and then I go on to explain how to fix it, and why that fix wasn't such a smart thing after all. So, this was the design we ended up with last time:
We have Oracle replicating to a Materialized View, to ensure that we can run triggers when there is a commit, and then triggers on this Materialized View update MariaDB by sending a UDP message to a server that in turn is connected to MariaDB.

The issue with the above thingy was that a Materialized View by default is refreshed in its entirety when there is a refresh, so if the table has 10,000 rows and 1 is inserted, then there will be 20,001 messages sent to MariaDB (10,000 rows deleted, 10,001 inserted). Not fun. So it seems that Materialized Views in Oracle aren't so smart, but I was sure they couldn't be this dumbed down; if they were, no one would be using them. So I rush for the Oracle documentation, yiihaa!

The default way of updating a Materialized View is not that fast, but there is a supposedly fast alternative method, appropriately named FAST (why the default method isn't called something like AWFULLY CRAZY SLOW is beyond me). So the materialized view using FAST REFRESH for the orders table should really be created like this:
CREATE MATERIALIZED VIEW orders_mv
  REFRESH FAST ON COMMIT
  AS SELECT *
  FROM orders;

But this gives an error from Oracle:
ORA-23413: table "SYSTEM"."ORDERS" does not have a materialized view log
Well then, let's create a MATERIALIZED VIEW LOG for table ORDERS then, that's no big deal:
CREATE MATERIALIZED VIEW LOG ON orders;
But again I get an error, and this time indicating that old Larry has run out of gas in his MIG-21 and needs my money to fill it up again, so he can fly off to his yacht:
ORA-00439: feature not enabled: Advanced replication

Grrnn (this is a sound I make when I get a bit upset)! Yes, if you want Materialized Views to work properly, the way they were designed, you need to part with some $$$, and as the cheap person I am, I run Oracle Express instead of the SE or EE editions, as I'd rather spend my hard earned money on expensive beer than on Larry's stupid MIG-21. So, as they say, "Close, but no cigar".

But I'm not one to give up easily, as you probably know. And fact is, I don't need the whole Materialized View thing, all I want is a TRIGGER to execute on COMMIT. Hmm this requires a huge box of prescription drugs to fix, and I am already on the case. Sorry Larry, but you'll have to park your MIG-21 and have someone else buy you some gas.

More details on how I tricked Larry in the next part of this series on replication from Oracle to MariaDB.

/Karlsson

Replication from Oracle to MariaDB the simple way - Part 2

The theme for this series of posts is, as indicated in the previous post, "Try and try again", and there will be more of this now when I start to make this work by playing with Oracle, with PL/SQL and with the restrictions of Oracle Express (which is the version I have available).

So, what we have right now is a way of "sending" SQL statements from Oracle to MariaDB; the question is when and how to send them from Oracle. The idea was to use triggers on the Oracle tables to send the data to MariaDB, like this, assuming we are trying to replicate the orders table from Oracle to MariaDB:
In Oracle, and assuming that the extproc I have created to send UDP messages is called oraudp, I would do something like this:
CREATE OR REPLACE TRIGGER orders_insert
AFTER INSERT ON orders
FOR EACH ROW
BEGIN
   oraudp('INSERT INTO orders VALUES ('||
     :new.order_id||', '||:new.customer_id||', '||
     :new.amount||')');
END;
/

CREATE OR REPLACE TRIGGER orders_update
AFTER UPDATE ON orders
FOR EACH ROW
BEGIN
   oraudp('UPDATE orders SET order_id = '||:new.order_id||
     ', customer_id = '||:new.customer_id||', amount = '||:new.amount||
     ' WHERE order_id = '||:old.order_id);
END;
/

CREATE OR REPLACE TRIGGER orders_delete
AFTER DELETE ON orders
FOR EACH ROW
BEGIN
    oraudp('DELETE FROM orders WHERE order_id = '||:old.order_id);
END;
/
As I noted in my previous post though, DML TRIGGERs execute when the DML statement they are defined for executes, not when that DML is committed, which means that if I replicate to MariaDB in a trigger and then roll back, the data will still be replicated to MariaDB. So the above design wasn't such a good one after all. But the theme is, as I said, try and try again, so off to do some thinking and some googling.

Google helped me find a workaround, which is to use Oracle MATERIALIZED VIEWs. And I know what you are thinking, you think "What has dear old Karlsson been smoking this time, I bet it's not legal", how is all this related to MATERIALIZED VIEWs? Not much, actually, and this is a bit of a kludge, but it does actually work, and that is the nature of a kludge (I've done my fair share of these over the years). There are two attributes of Oracle MATERIALIZED VIEWs that make this work for us:
  • A MATERIALIZED VIEW can be updated ON COMMIT, which is one of the two ways a MATERIALIZED VIEW is updated (the other is ON DEMAND). So when there is a COMMIT that affects the tables underlying the MATERIALIZED VIEW, we have something happening.
  • And the above would be useless if you couldn't attach a TRIGGER to a MATERIALIZED VIEW, but you can!
So, to make this work, I can define a MATERIALIZED VIEW, one that I don't really use for any other purpose, on the table I want to replicate. The design then becomes something like this, again using the orders table as an example:
In practice, if we continue to use the orders table, this is what I would do in Oracle:
CREATE MATERIALIZED VIEW orders_mv
  REFRESH ON COMMIT
AS SELECT order_id, customer_id, amount
FROM orders;

CREATE OR REPLACE TRIGGER orders_mv_insert
AFTER INSERT ON orders_mv
FOR EACH ROW
BEGIN
   oraudp('INSERT INTO orders VALUES ('||
     :new.order_id||', '||:new.customer_id||', '||
     :new.amount||')');
END;
/

CREATE OR REPLACE TRIGGER orders_mv_update
AFTER UPDATE ON orders_mv
FOR EACH ROW
BEGIN
   oraudp('UPDATE orders SET order_id = '||:new.order_id||
     ', customer_id = '||:new.customer_id||', amount = '||:new.amount||
     ' WHERE order_id = '||:old.order_id);
END;
/

CREATE OR REPLACE TRIGGER orders_mv_delete
AFTER DELETE ON orders_mv
FOR EACH ROW
BEGIN
    oraudp('DELETE FROM orders WHERE order_id = '||:old.order_id);
END;
/

Now we have a solution that actually works, and it also works with Oracle transactions: nothing will be sent to MariaDB if a transaction is rolled back. There is a slight issue with this though; you might think it's no big deal, but fact is that performance sucks when we start adding data to the orders table, and the reason for this is that the default way for Oracle to refresh a materialized view is to delete all rows and then add the new ones. In short, Oracle doesn't keep track of the changes to the base table (except writing them to the base table just as usual), so when the MATERIALIZED VIEW is to be refreshed, all rows in the materialized view will first be deleted, and then all the rows in the base table inserted. So although this works, it is painfully slow.

So it's back to the drawing board, but we are getting closer, now it works at least, we just have a small performance problem to fix. I'll show you how I did that in the next post in this series, so see you all soon. And don't you dare skip a class...

/Karlsson

Saturday, September 27, 2014

Replication from Oracle to MariaDB the simple way - Part 1

Yes, there is a simple way to do this. Although it might not be so simple unless you know how to do it, so let me show you how this can be done. It's actually pretty cool. But I'll do this over a number of blog posts, and this is just an introductory blog, covering some of the core concepts and components.

But getting this to work wasn't easy, I had to try several things before I got it right, and it's not really obvious how you make it work at first, so this is a story along the lines of "If at first you don't succeed mr Kidd" "Try and try again, mr Wint" from my favorite villains in the Bond movie "Diamonds are forever":
So, I had an idea of how to achieve replication from Oracle to MySQL and I had an idea on how to implement it, and it was rather simple, so why not try it.

So, part 1 then. Oracle has the ability to let you add a UDF (User Defined Function) just like MariaDB (and MySQL), but Oracle calls them extprocs. So my first idea was an extproc that would send a message over UDP, any generic message. In this specific case though, I was going to have a server that received that UDP message and then sent it to MariaDB as a query. So this server was connected to MariaDB, and a message came in through UDP, and this message would of course be some DML statement. Simple and efficient, right? Of course UDP is not a reliable protocol, but for this usage it should be OK, right?

All that was needed after this was a set of triggers on the Oracle tables that would send a message to the server which would then send these to MySQL, and this message would be based on the changed data, as provided by the trigger, like a corresponding INSERT, UPDATE or DELETE statement. This should work. OK, let's get to work.

My first attempt at the server meant that a lot of UDP messages were lost. The reason for this was that there were just too many messages coming in bursts (this is how extprocs work in Oracle, nothing wrong with Oracle here, it's just the nature of the beast). The solution was to make the server multithreaded, with one tight thread receiving messages on the UDP port, and a queue and another thread processing the queue and sending the messages to MariaDB.

Well, this didn't work out either. The reason was that the means I used for synchronization between the threads was mutexes, and this also slowed things down. It worked OK for a while, but a burst could easily mean that UDP messages were lost, and really not much was necessary in terms of load for this to happen.

So, what do we do now? I had two choices. The first was to skip UDP and go for a connected TCP solution instead. The issue with this was partly the extra overhead it would cause on the Oracle end of things, and partly that it would increase the complexity of my extproc and server, and above all of the API to all this. The other option was to go for a lock-less queue. Some googling and some testing allowed me to figure out how to do this with gcc and I could move on and test it. Fact is, it worked fine. And as these UDP messages aren't routed anywhere in this setup, using UDP in and of itself made things a bit more secure.

But then there was the issue with the TRIGGERs calling the extproc. A trigger is called on a DML operation, but that operation might later be rolled back, and in that case I would break the consistency of data. And for some reason, for all the enhancement to standard SQL that Oracle provides, (including DDL triggers), there are no ON COMMIT triggers. But there just had to be a way around that one, right?

So, try and try again, Mr. Wint. Yes, I will solve all this, but the story is to be continued in another blog post. Also, I will eventually provide the code for all this: GPL code that you can use out of the box to replicate from Oracle to MariaDB. But not just yet.

So don't touch that dial!

/Karlsson

Monday, August 4, 2014

What is HandlerSocket? And why would you use it? Part 1

HandlerSocket is included with MariaDB and acts like a simple NoSQL interface to InnoDB, XtraDB and Spider and I will describe it a bit more in this and a few upcoming blogs.

So, what is HandlerSocket? Adam Donnison wrote a great blog on how to get started with it, but if you are developing MariaDB applications using C, C++, PHP or Java what good does HandlerSocket do you?

HandlerSocket in itself is a MariaDB plugin, of a type that is not that common, as it is a daemon plugin. Adam shows in his blog how to enable it and install it, so I will not cover that here. Instead I will describe what it does, and doesn't do.

A daemon plugin is a process that runs "inside" MariaDB. A daemon plugin can implement anything really, as long as it is relevant to MariaDB. One daemon plugin is for example the Job Queue Daemon, and another is HandlerSocket. A MariaDB plugin has access to the MariaDB internals, so there is a lot that can be implemented in a daemon plugin, even if it is a reasonably simple concept.

Inside MariaDB there is an internal "handler" API which is used as an interface to the MariaDB Storage Engines, although not all engines support this interface. MariaDB then has a means to "bypass" the SQL layer and access the handler interface directly, and this is done by using the HANDLER commands in MariaDB. Note that we are not talking about a different API to use the handler commands; instead they are executed using the same SQL interface as you are used to, it's just that you use a different set of commands. One limitation of these commands is that they only allow reads, even though the internal handler interface supports writes as well as reads.

When you use the HANDLER commands, you have to know not only the name of the table that you access, but also the index you will be using when you access that table (assuming you want to use an index, but you probably do). A simple example of using the HANDLER commands follows here:

# Open the orders table, using the alias o
HANDLER orders OPEN AS o;

# Read an order record using the primary key.
HANDLER o READ `PRIMARY` = (156);

# Close the order table.
HANDLER o CLOSE;

In the above example, I guess this is just overcomplicating something basically simple, as all this does is the same as
SELECT * FROM orders WHERE id = 156;

The advantage of using the handler interface though is performance, for large datasets using the Handler interface is much faster.
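Reads through a secondary index work the same way. Assuming, purely for illustration, that the orders table has an index called ix_customer on the customer id, the following would fetch up to ten orders for one customer:

# Open the table, read through the hypothetical ix_customer index, close again.
HANDLER orders OPEN AS o;
HANDLER o READ ix_customer = (42) LIMIT 10;
HANDLER o CLOSE;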

All this brings up three questions:
  • First, if we are bypassing the SQL layer by using the HANDLER commands, would it not be faster to bypass the SQL level protocol altogether, and just use a much simpler protocol?
  • Secondly, while we are at it, why don't we allow writes, as this is the biggest issue; we can always speed up reads by scaling out anyway?
  • And last, how much faster is this, really?
The answer to the first two questions is the HandlerSocket plugin, which was the whole deal with this blog, as it uses a separate, very simple protocol and allows writes! For the third question, this is where I come in: I have done some simple INSERT style benchmarks using HandlerSocket, and I have some results for you in my next blog. So don't touch that dial, I'll be right back!

/Karlsson

Friday, July 11, 2014

MariaDB Replication, MaxScale and the need for a binlog server

Introduction

This is an introduction to MariaDB Replication, to why we need a binlog server and to what that is. The first part is an introduction to replication basics, and if you know this already, then you may want to skip past the first section or two.

MariaDB Replication

MySQL and MariaDB have a simple but very effective replication system built in. The replication system is asynchronous and is based on a pull, rather than a push, model. What this means in short is that the master keeps track of the DML operations and other things that might change the state of the master database, and this is stored in what is called the binlog. The slave on the other hand is responsible for getting the relevant information from the master to keep up to speed. The binlog consists of a number of files that the master generates, and the traditional way of dealing with slaves is to point them to the master, specifying a starting point in the binlogs consisting of a filename and a position.

When a slave is started it gets the data from the binlogs, one record at a time, from the given position in the master binlogs and in the process updates its current binlog file and position. So the master keeps track of the transactions and the slave follows behind as fast as it can. The slave has two types of threads: the IO thread that gets data from the master and writes it to a separate relay log on the slave, and an SQL thread that applies the data from the relay log to the slave database.
This really is less complicated than it sounds, in a way, but the implementation of it on the other hand is probably more complicated than one might think. There are also some issues with this setup, some of which are fixed by the recent GTID implementation in MySQL 5.6 and more significantly in MariaDB 10.

In a simple setup with just 1 master and a few slaves, this is all there is to it: take a backup from the master and take care to keep track of the binlog position when this is done, then restore this backup to a slave and set the slave's starting position to the position recorded when the backup was taken. Now the slave will catch up with the operations that have happened since the backup was run, and eventually it will catch up with the master and then poll for any new events.
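In SQL, this boils down to something like the following on the new slave; the host, credentials and binlog coordinates below are just placeholders, the real values being whatever was recorded when the backup was taken:

CHANGE MASTER TO
  MASTER_HOST = '192.0.2.10',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'secret',
  MASTER_LOG_FILE = 'mariadb-bin.000042',
  MASTER_LOG_POS = 1234;
START SLAVE;

-- Check that the IO and SQL threads are running and how far behind we are.
SHOW SLAVE STATUS\G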

One thing that is not always the case with other database systems is that the master and the slave only differ in terms of configuration. Apart from this, these servers run the same software and the same operations can be applied to them, so running DML on a slave is no different than on a master, but if there is a collision between some data that was entered manually on the slave and some data that arrives through replication, say there is a duplicate primary key, then replication will stop.

What do we use replication for

Replication is typically used for one or more of three purposes:
•    Read Scale-out - In this case the slaves are used for serving data that is read, whereas all writes go to the master. As most web applications have a much larger amount of reads than writes, this makes for good scalability: we have fewer writes to be handled by the single master, whereas the many reads can be served by one or more slaves.
•    Backup - Using a slave for backup is usually a pretty good idea. This even allows for cold backups, as if we shut down the slave to back it up, it will catch up with the master once restarted after the backup is done. Having a full database setup on a slave for backup also means that recovery, if we need to do that, is fast, and it also allows for partial recovery if necessary.
•    High Availability - As the master and slave are kept in sync, at least with some delay, one can use the slave to fail over to should the master fail. The asynchronous nature of replication does mean though that failover is a bit more complex than one might think. There might be data in the slave relay log that has not yet been applied and might cause issues when the slave is treated as a master. Also, there might be data in the binlogs on the master, which means that when the master is again brought on-line, it might be out of sync with the slave: some data that never was picked up by the slave might be on the master, and data that entered the system after the slave was switched to a master is not on the old master.

The nature of replicated data

In the old days, the way replication worked was by just sending any statement that modified data on the master to the slave. This way of working is still available, but over time it was realized that this was a bit difficult, in particular with some storage engines.

This led to the introduction of Row-based Replication (RBR) where the data to be replicated is transferred not as a SQL statement but as a binary representation of the data to be modified.

Replicating the SQL statement is called Statement Based Replication (SBR). An example of a statement that can cause issues when using SBR is:
DELETE FROM test.tab1 WHERE id > 10 LIMIT 5
In this case RBR will work whereas when using statement based replication we cannot determine which rows will be deleted. There are more examples of such non-deterministic SQL statements where SBR fails but RBR works.

A third replication format mode is available, MIXED, where MariaDB decides on a statement-by-statement basis which replication format is best.
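The format is controlled with the binlog_format variable, which can be set in the configuration file or, as a quick example, at runtime:

-- ROW handles non-deterministic statements like the DELETE ... LIMIT above.
SET GLOBAL binlog_format = 'ROW';
SHOW GLOBAL VARIABLES LIKE 'binlog_format';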

Scaling replication

Eventually many users ended up having many slaves attached to that single master. For a while this was not a big issue; the asynchronous nature of things means that the load replication puts on the master is limited, but with enough nodes this eventually turned into an issue.

The solution then was to introduce an "intermediate master". This is a slave that is also a master to other slaves, and it is configured with log_slave_updates on, which means that data that is applied on the slave from the relay log is also written to the slave's binlog.

This is a pretty good idea, but there are some issues also. To begin with, on the intermediate master, data has to be written several times, once in the relay log, once in the database (and if InnoDB is used, a transaction log is also written) and then we have to write it to the binlog.

Another issue that is in effect here is the single threaded nature of replication (this is different in MariaDB 10 and MySQL 5.7 and up), which means that a slave of a master that runs many threads might get into a situation where the slave can't keep up with the master, even though the slave is similarly configured to the master. Also, a long-running statement on the master will hold up replication for as long as that statement runs on the slave, and if we have an intermediate master, then the delay will be doubled (once on the intermediate master and once on the actual slave).

The combined effect of the duplication of the delay and the requirement to write data so many times leads to the result that an intermediate master maybe isn't such a good idea after all.
As for the replication use-cases, intermediate masters are sometimes used as alternative masters when failing over. This might seem like a good idea, but the issue is that the binlogs on the intermediate master don't look the same as the binlogs on the actual master. This is fixed by using Global Transaction IDs, but these have different issues, and unless you are running MariaDB 10 or MySQL 5.6, this isn't really an option (and even with MySQL 5.6, there are big issues with this).

What we need then is something else. Something that is a real intermediate master. Something that looks like a slave to the master and like a master to the slaves, but that doesn't have to write data three times and doesn't have to apply all the replication data itself, so it doesn't introduce delays into the replication chain.

The slave that attaches to this server should see the same replication files as it would see if it connected to the real master.

MaxScale and the Plugin architecture

So let's introduce MaxScale then, and the plugin architecture. MaxScale has been described before, but one thing that might not be fully clear is the role of the plugins. MaxScale relies much more on plugins than most other architectures; fact is, without the plugins MaxScale can't do anything, everything is a plugin!

The MaxScale core is a multi-threaded epoll based kernel with 5 different types of plugins (note that there might be more than one plugin of each type, and this is mostly the case actually):
•    Protocols - These implement communication protocols, including debugging and monitoring protocols. From this you realize that without appropriate protocol plugins, MaxScale cannot be accessed at all, so these modules are key. Among the current protocols are the MariaDB / MySQL client and server protocols.
•    Authentication - This type of plugin authenticates users connecting to MaxScale. Currently MariaDB / MySQL authentication is supported.
•    Router - This is a key type of module that determines how SQL traffic is routed and managed.
•    Filter - This is an optional type of plugin where the SQL traffic can be modified, checked or rejected.
•    Monitor - This type of module is there to monitor the servers that MaxScale connects to, and this data is used by the routing module.

Before we end this discussion on MaxScale, note that there might be several configurations in one single MaxScale setup, so MaxScale can listen on one port for one set of servers and routing setup, and on another port for a different setup.

With this we have an idea of how MaxScale works, so let's see if we can tie it all together.

MaxScale as a Binlog server

As can be seen from the description of MaxScale, a lot of what is needed to create a binlog server to use as an intermediate server for slaves is already there. What is needed is a router module that acts as a slave to the assigned master and downloads the binlogs from there, using the usual MariaDB / MySQL replication protocol. This routing plugin also needs to serve the slaves with the downloaded binlog files. In theory, and also in practice, a slave will not know whether it is connected to the real master or to MaxScale.

Using MaxScale this way as an intermediate master, a slave that connects to MaxScale can work from the same binlog files and positions as when connected directly to the master, as the files are the same for all intents and purposes. There will be no extra delays for long running SQL statements as these aren't applied on MaxScale; the replication data is just copied from the master, plain and simple. As for parallel slaves, this should work better when using MaxScale as a binlog server, but this is yet to be tested.

So there should be many advantages to using MaxScale as a binlog server compared to using an intermediate MariaDB / MySQL server. On the other hand, this solution is not for everyone; many just don't drive replication so hard that the load on the master becomes an issue and an intermediate master is required. On the other hand, many use an intermediate master also for HA, and in this case it would be advantageous to use MaxScale instead of that intermediate master, which could still serve the role of a fail-over HA server.

Now, there is one issue with all of this that many of you might have spotted: that cool binlog server plugin module for MaxScale doesn't exist. Well, I am happy to say that you are wrong, it does exist and it works. A pilot for such a module has been developed by SkySQL together with Booking.com, who had just this need for an intermediate server that wasn't just yet another MariaDB / MySQL server. For the details on the specific use case, see the blog by Jean-François Gagné.