Friday, September 16, 2016

Replication from Oracle to MariaDB - Part 2

In part 1 on this series of blogs on how to replcate from Oracle to MariaDB, we looked at how to get data out from Oracle, and in an effort to look more modern than I really am, let's say that what we were attempting was CDC (Change Data Capture). Hey, I feel like I'm 18 again! Modern, cool! Maybe I should start to wear a baseball cap backwards and have my jeans no higher than my knees. Or again, maybe not. But CDC it is.

When I left you in the previous post we have just started to get some data. reluctantly, out of Oracle. But the SQL statement we got from the LogMiner was hardly useful. But fact is, we can fix that. Before we do that we need to look at a few other functions in LogMiner. The two functions we want to loot at are called DBMS_LOGMNR.COLUMN_PRESENT and DBMS_LOGMNR.MINE_VALUE. These two functions are simple enough, they both take two arguments, first a reference to a redo log record, which is the value of the column redo_value in the V$LOGMNR_CONTENTS table, and the other argument is the fully qualified name of the table in question. The table name of the redo log record is, surprisingly, in the table_name column. The reason we need two functions for this is that the second of these functions can return NULL, both when the column is question doesn't exist and well as when the value is NULL.

I will for the rest of this post assume that there exists an Oracle user called anders with the password anders and that the same thing exists in MariaDB, where there is also a database called anders (you already guess that, right?) where the user anders has full access.

So with the table_name of the value of the involved columns, we are ready to go. Sure. let's look at an insert.  First we need a table to test this on, and let's use the classic Oracle EMP sample table for this, but I have simplified this a bit to make my example easier to grasp without getting into too many details. So this is what it looks on the Oracle side of things, and let's call this script cretabora.sql:
CREATE TABLE anders.emp(
  empno NUMBER NOT NULL PRIMARY KEY,
  ename VARCHAR2(10) NOT NULL,
  job VARCHAR2(9) NOT NULL,
  sal NUMBER(7,2) NOT NULL,
  deptno NUMBER NOT NULL) TABLESPACE users;
What we also need is some way to keep track of processing, in this case we want to keep track of the SCN that was last used, so we know where to start when we get the next chunk of CDCs (He, I used it again. Boy am I young and hot today). And add this to the script above:
CREATE TABLE scnlog(
  table_name VARCHAR2(30) NOT NULL PRIMARY KEY,
  scn NUMBER) TABLESPACE users;
INSERT INTO anders.scnlog
  SELECT 'EMP', current_scn FROM v$database;
I also allocated space for the EMP table and set the SCN to the current database SCN. As I am using the V$DATABASE virtual table, I have to run the above as a privileged user.
$ sqlplus / as sysdba @cretabora.sql

OK, now I have some tables to work with. Before to go on let's also create the same objects in a MariaDB database. The minimum thing we need is something is, let's call this script cretabmaria.sql:
CREATE TABLE emp(
  empno INTEGER NOT NULL PRIMARY KEY,
  ename VARCHAR(10) NOT NULL,
  job VARCHAR(9) NOT NULL,
  sal DECIMAL(7,2) NOT NULL,
  deptno INTEGER NOT NULL);
I will not explain this in detail, you get it already I think:
$ mysql -u root
MariaDB [(none)]> CREATE USER 'anders'@'%' IDENTIFIED BY 'anders';
MariaDB [(none)]> CREATE DATABASE anders;
MariaDB [(none)]> GRANT ALL ON anders.* TO 'anders'@'%';
MariaDB [(none)]> exit
$ mysql -u anders -panders anders < cretabmaria.sql
The above really shows that MariaDB is easier to deal with than Oracle, right? But that wasn't what we were looking for here, and we're getting closer now, aren't you excited?

Let's insert a single row of data in the EMP table, just to get started:
$ sqlplus anders/anders
SQL> INSERT INTO emp VALUES(1, 'Anders', 'Sales', 10000, 10);
SQL> COMMIT;

Now we are ready to extract data from the redo log. For this we will run a small script that runs LogMiner, extracts rows from the redo log and converts that to valid MariaDB SQL syntax. Create a script called cdcemp.sql with this content:
SET serveroutput ON
SET linesize 4000
SET feedback off
DECLARE
   v_scn NUMBER;
   v_scnstart NUMBER;
   v_scnend NUMBER;
   v_redo_value NUMBER;
   v_sqlstmt VARCHAR2(4000);
   CURSOR cur1 IS
     SELECT scn, redo_value
       FROM v$logmnr_contents
       WHERE seg_owner = 'ANDERS' AND table_name = 'EMP'
         AND operation = 'INSERT';
BEGIN
-- Start LogMiner.
   SELECT NVL(MIN(l.scn) + 1, MIN(e.ora_rowscn)) INTO v_scnstart
     FROM anders.emp e, scnlog l
     WHERE l.table_name = 'EMP';
   SELECT MAX(e.ora_rowscn) INTO v_scnend
     FROM anders.emp e;
   DBMS_OUTPUT.PUT_LINE('-- SCN Range: ' || v_scnstart || ' - ' || v_scnend);
   DBMS_LOGMNR.START_LOGMNR(STARTSCN => v_scnstart,
     ENDSCN => v_scnend,
     OPTIONS => DBMS_LOGMNR.COMMITTED_DATA_ONLY +
       DBMS_LOGMNR.CONTINUOUS_MINE +
       DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG);
-- Dummy update to insert a more recent SCN to ensure
-- that we don't have to wait for the next update in the
-- following select.
   UPDATE scnlog SET table_name = table_name WHERE table_name = 'EMP';
   COMMIT;

-- Open cursor to get data from LogMiner.
   OPEN cur1;

-- Loop for all the rows in the redo log since the last time we ran this.
   LOOP
      FETCH cur1 INTO v_scn, v_redo_value;
      EXIT WHEN cur1%NOTFOUND;
      v_sqlstmt := 'INSERT INTO emp(EMPNO, ENAME, JOB, SAL, DEPTNO)'
        || 'VALUES(';
      v_sqlstmt := v_sqlstmt || DBMS_LOGMNR.MINE_VALUE(v_redo_value,
         'ANDERS.EMP.EMPNO') || ', ';
      v_sqlstmt := v_sqlstmt || '''' || DBMS_LOGMNR.MINE_VALUE(v_redo_value,
        'ANDERS.EMP.ENAME') || ''', ';
      v_sqlstmt := v_sqlstmt || '''' || DBMS_LOGMNR.MINE_VALUE(v_redo_value,
        'ANDERS.EMP.JOB') || ''', ';
      v_sqlstmt := v_sqlstmt || DBMS_LOGMNR.MINE_VALUE(v_redo_value,
        'ANDERS.EMP.SAL') || ', ';
      v_sqlstmt := v_sqlstmt || DBMS_LOGMNR.MINE_VALUE(v_redo_value,
        'ANDERS.EMP.DEPTNO') || ')';

      DBMS_OUTPUT.PUT_LINE('-- SCN = ' || v_scn);
      DBMS_OUTPUT.PUT_LINE(v_sqlstmt || ';');
   END LOOP;
   IF v_scn IS NOT NULL THEN
      UPDATE scnlog SET scn = v_scn WHERE table_name = 'EMP';
      COMMIT;
   END IF;
   CLOSE cur1;
END;
/
EXIT

The first three lines in this script are SQL*Plus specific settings. Then we declare some variables and a cursor for the V$LOGMNR_CONTENTS table. Following that we get the starting and ending SCN that we want and use that to start LogMiner and output some information on what we are doing. Then comes a wacko UPDATE. This seems to be necessary to ensure that the redo log is  progressed past the last log record so that we can get the data. Without this the SELECT from the cursor would wait until someone else generated some redo log content. This is a kludge, I know, but it seems to work. Also, Oracle isn't smart enough to skip these kinds of dummy updates.

Following this the cursor is opened and we get the rows from it and build a suitable SQL statement for MariaDB. Finally when the loop is over I update the SCN log table so I know where to start the next run. This script can be done to generate incremental updates from Oracle to be inserted into MariaDB. And with all that, let's now finally run the script, and here I will run it and extract the data to MariaDB:
$ sqlplus -S / as sysdba @cdcemp.sql | mysql -u anders -panders anders

Before we finish, let's check if this worked:
$ mysql -u anders -panders anders
MariaDB [anders]> SELECT * FROM emp;
+-------+--------+-------+----------+--------+
| empno | ename  | job   | sal      | deptno |
+-------+--------+-------+----------+--------+
|     1 | Anders | Sales | 10000.00 |     10 |
+-------+--------+-------+----------+--------+
1 row in set (0.00 sec)

That's it for this time. There are few few more things to look into though, like managing transactions on the MariaDB side for example. And UPDATEs and DELETEs, those require some special care. But we are getting there now, right? So stay tuned for the third post in this series.

Happy SQL'ing
/Karlsson



1 comment:

shahbaz said...

it seems like we shouldn't have to send a questionnaire to our soon-to-be representatives in regards to these basic issuesDrug Card