Monday, December 4, 2023

MariaDBTop in action

    MariaDBTop is a tool I have come to use quite often but it is a bit hard to configure and set up. In this post I will show you have to set it up by using a realistic example. So, what I want to show is monitoring of a simple application that emulated Telco style CDRs (Call Data Records). A CDR is a very common data structure in Telco and network applications where a CDR is a representation of an event in the network such as a call being started, a message is being sent etc. CDRs are never updated, they are just inserted.

    The CDR structure is not standardized but similar across the many places where it is used. In this case I have an application that inserts CDRs over multiple threads. The database that these records are recorded in is replicated and we want to keep an eye on the processing on both the primary and the replica.

Main configuration section

    To begin with we have to configure the main section of the MariaDB configuration. To begin with let's use a decent colour screen. For this type of applications I like green text on a black background and for alerts I want a bright red text colour. We are using the default configuration file mariadbtop.cnf, all the configuration data goes into that in this example, and the first part of this file then has the mariadbtop section then looks like this

[mariadbtop]
default-colour=green
default-bgcolour=black
alert-attribute=normal
alert-colour=black
alert-bgcolour=red

    We then provide connection information to the MariaDB Primary server

user=monitoruser
password=monitorpwd

    With this in place we need to define the replica server that we are connecting to as well as the pages that are to be displayed, the actual attributes of these are defined later. We have one server to add, the replica, and three pages

pages=dashboard,replication,innodbstatus
servers=replica_server

    With this in place I also want to refresh the status bit more frequently than usual and write alerts to a file

refresh-time=1000
alert-time=10
alert-file=alert.log

    The above configuration will refresh the screen every second (1000 ms) and will write a record to the altert log file when there is an alert and every 10 seconds for ongoing alerts.

    The above should be pretty much self explanatory

Defining the dashboard page

Defining the replica server attributes

    Before we go on from here we need to define how to connect to the replica server. The name of this is, as defined above, replica_server which means we need a section called just that with information on how to connect to it. In my case it looks like this

[replica_server]
user=monitoruser
password=monitorpwd
host=repl1


The above should be pretty much self explanatory

Defining the dashboard page

   Now we get to the interesting bit, the actual data items that we are to monitor and the first is the dashboard page.

[dashboard]
pagetitle=Dashboard
pagekey=d

    This above defines the title displayed at the top of the dashboard page and the key used to navigate to it. The next thing we want to do is to display the number of rows in the table we work with, this is called cdr and it also resides in the cdr database and for this we could do a SELECT COUNT(*) but I decide against that and that will slow down the actual processing going on, so instead I run a SHOW TABLE STATUS command which gives a less exact result but doesn't intrude into the normal processing as much. I end up with this setting

sql1=SHOW TABLE STATUS IN cdr LIKE 'cdr'
vertical1
hide1
name-prefix1=primary_ts_

    This will run the command above, the vertical setting will cause a row/value pair for each column and the hide attrbute will ensure that this data is fetched but not displayed, we will look at that later on. Finally the name_prefix attribute will prefix each column name fetched with primary_ts_. Following this I need to get the same data from the replica server, like this

sql2=SHOW TABLE STATUS IN cdr LIKE 'cdr'
vertical2
hide2
server2=replica_server
name-prefix2=replica_ts_

    The only attribute different here is the server, as we are getting this from the replica. Also, we run the same comman so we need to use a different prefix so we don't userwrite the key / value pairs gotten from the previous command. We have now defined how to get the data we want to display, but we haven't actually displayed it. What we want to show is three things, the number of rows on the Primary and the Replica and the difference between the two. This is how this is done

expr10=this.primary_ts_Rows
name10=Primary_rows
expr11=this.replica_ts_Rows
name11=Replica_rows
expr12=this.primary_ts_Rows - this.replica_ts_Rows
name12=Replica rows behind

    The difference may also be calculated like this

expr12=this.Primary_rows - this.Replica_rows
name12=Replica rows behind

The Replication status page

     Now it is time to display the replication status, page, and we start by the page specific data

[replication]
pagetitle=replication
pagekey=e

    With this in place we are ready to define how to get the data, on the replica we run a SHOW REPLICA STATUS command, and this has to be vertical as we want all the columns from that command, and on the master all we want at this point is the BINLOG GTID. This looks like this

sql1=SHOW REPLICA STATUS
vertical1
server1=replica_server
name-prefix1=replica_
hide1
sql2=SELECT @@GLOBAL.GTID_BINLOG_POS
name2=master_gtid_pos
hide2

    Again, we have so far only gotten data that is not displayed, the data that we do want to dipslay now is the GTID pos on the master and the replica and the Seconds_Behind_Master column from the SHOW REPLICA STATUS command. This looks like this

expr10=this.replica_Seconds_Behind_Master
name10=Seconds behind
expr11=this.replica_Gtid_IO_Pos
name11=Replica GTID
expr12=this.master_gtid_pos
name12=Master GTID

    And that completes the replication page.

The InnoDB status page

    We start the InnoDB status page with the usual page details and then getting  InnoDB status using SHOW GLOBAL STATUS.

[innodbstatus]
pagetitle=InnoDB Status
pagekey=i
sql1=SHOW GLOBAL STATUS LIKE 'Innodb%'
value-col1=Value
name-col1=Variable_name
name-prefix1=pgs_
hide1
sql2=SHOW GLOBAL STATUS LIKE 'Innodb%'
value-col2=Value
name-col2=Variable_name
name-prefix2=r1gs_
server2=replica_server
hide2

    Now, we want to process the data we got above to something we can display. Let start with the InnoDB cache hit ratio, which is an expression based on two other values returned from SHOW GLOBAL STATUS, so for this we use these two

expr3=round(this.pgs_Innodb_buffer_pool_read_requests / \ (this.pgs_Innodb_buffer_pool_reads + \ this.pgs_Innodb_buffer_pool_read_requests) * 100, 2) \
name3=Primary InnoDB Cache hit ratio
expr4=round(this.r1gs_Innodb_buffer_pool_read_requests / \ (this.r1gs_Innodb_buffer_pool_reads + \ this.r1gs_Innodb_buffer_pool_read_requests) * 100, 2)
name4=Replica InnoDB Cache hit ratio

    In addition to this we also like to get some other InnoDB data displayed, like this

expr5=this.pgs_Innodb_rows_read
name5=Primary Rows read
expr6=this.r1gs_Innodb_rows_read
name6=Replica Rows read
expr7=this.pgs_Innodb_rows_inserted
name7=Primary Rows inserted
expr8=this.r1gs_Innodb_rows_inserted
name8=Replica Rows inserted
expr9=this.pgs_Innodb_rows_updated
name9=Primary Rows updated
expr10=this.r1gs_Innodb_rows_updated
name10=ReplicaRows updated
expr11=this.pgs_Innodb_rows_deleted
name11=Primary Rows deleted
expr12=this.r1gs_Innodb_rows_deleted
name12=Replica Rows deleted

    And that completes the configuration, we now have three pages of data

The result

    With that in place the dashboard looks like this




 


And the InnoDB page like this


Configuration summary

In total, this is what the configuration looks like

[mariadbtop]
default-colour=green
default-bgcolour=black
alert-attribute=normal
alert-colour=black
alert-bgcolour=red
user=monitoruser
password=monitorpwd
pages=dashboard,replication,innodbstatus
servers=replica_server
refresh-time=1000
alert-time=10
alert-file=alert.log
 
[replica_server]
user=monitoruser
password=monitorpwd
host=repl1
 
[dashboard]
pagetitle=Dashboard
pagekey=d
sql1=SHOW TABLE STATUS IN cdr LIKE 'cdr'
vertical1
hide1
name-prefix1=primary_ts_
sql2=SHOW TABLE STATUS IN cdr LIKE 'cdr'
vertical2
hide2
server2=replica_server
name-prefix2=replica_ts_
 
[replication]
pagetitle=replication
pagekey=e
sql1=SHOW REPLICA STATUS
vertical1
server1=replica_server
name-prefix1=replica_
hide1
sql2=SELECT @@GLOBAL.GTID_BINLOG_POS
name2=master_gtid_pos
hide2
expr10=this.replica_Seconds_Behind_Master
name10=Seconds behind
expr11=this.replica_Gtid_IO_Pos
name11=Replica GTID
expr12=this.master_gtid_pos
name12=Master GTID
  
[innodbstatus]
pagetitle=InnoDB Status
pagekey=i
sql1=SHOW GLOBAL STATUS LIKE 'Innodb%'
value-col1=Value
name-col1=Variable_name
name-prefix1=pgs_
hide1
sql2=SHOW GLOBAL STATUS LIKE 'Innodb%'
value-col2=Value
name-col2=Variable_name
name-prefix2=r1gs_
server2=replica_server
hide2
expr3=round(this.pgs_Innodb_buffer_pool_read_requests / \ (this.pgs_Innodb_buffer_pool_reads + \ this.pgs_Innodb_buffer_pool_read_requests) * 100, 2) \
name3=Primary InnoDB Cache hit ratio
expr4=round(this.r1gs_Innodb_buffer_pool_read_requests / \ (this.r1gs_Innodb_buffer_pool_reads + \ this.r1gs_Innodb_buffer_pool_read_requests) * 100, 2)
name4=Replica InnoDB Cache hit ratio
expr5=this.pgs_Innodb_rows_read
name5=Primary Rows read
expr6=this.r1gs_Innodb_rows_read
name6=Replica Rows read
expr7=this.pgs_Innodb_rows_inserted
name7=Primary Rows inserted
expr8=this.r1gs_Innodb_rows_inserted
name8=Replica Rows inserted
expr9=this.pgs_Innodb_rows_updated
name9=Primary Rows updated
expr10=this.r1gs_Innodb_rows_updated
name10=ReplicaRows updated
expr11=this.pgs_Innodb_rows_deleted
name11=Primary Rows deleted
expr12=this.r1gs_Innodb_rows_deleted
name12=Replica Rows deleted

With that, i leave you to try MariaDB Top

Happy SQL'ing
/Karlsson