Parallel Replication with TokuDB & MariaDB 10.1

I) What is parallel replication?

Parallel replication is a nice feature introduced in MariaDB 10.0 that allows a slave to apply some transactions in parallel. By default (the conservative mode), it only runs in parallel transactions that cannot conflict with each other. Note that you have to enable GTID to use this feature.
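
For reference, switching an existing slave to GTID could look like the sketch below. It assumes the slave already replicates from its master using binlog file/position coordinates and that both servers run a MariaDB version with GTID support; adapt it to your own setup before running it.

```sql
-- Assumption: replication is already configured and running with
-- binlog file/position coordinates.
STOP SLAVE;
-- slave_pos tells the slave to resume from the last GTID it applied.
CHANGE MASTER TO MASTER_USE_GTID = slave_pos;
START SLAVE;
-- The Using_Gtid field of the output should no longer read "No".
-- (\G is a mysql client terminator that prints one field per line.)
SHOW SLAVE STATUS\G
```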

Starting with MariaDB 10.1.3, new parallel modes have been introduced: optimistic and aggressive. These modes try to increase parallelism by executing more transactions in parallel, and handle any conflict with a rollback and retry.

This feature is known to work well on InnoDB, but what happens with an engine like TokuDB?

II) Let's try it on TokuDB!

To use parallel replication, you need GTID enabled on your slave. If that is not yet the case, enabling GTID is straightforward. Once this is done, stop the slave and set the slave_parallel_threads variable (in this example, 4 threads in parallel):

STOP SLAVE;
SET GLOBAL slave_parallel_threads = 4;
START SLAVE;

Once this is done, the slave runs in the default conservative mode.
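
As a quick sanity check, the settings can also be read back from the server. This is only a sketch; the exact list of variables returned depends on your MariaDB version.

```sql
-- List the parallel replication settings currently in effect.
-- On MariaDB 10.1.3 and later this includes slave_parallel_threads
-- and slave_parallel_mode.
SHOW GLOBAL VARIABLES LIKE 'slave_parallel%';
```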

You can check this in the SHOW PROCESSLIST output:

MariaDB [(none)]> show processlist\G

*************************** 1. row ***************************
      Id: 671496
    User: system user
    Host: 
      db: NULL
 Command: Connect
    Time: 2073111
   State: Waiting for master to send event
    Info: NULL
Progress: 0.000
*************************** 2. row ***************************
      Id: 671497
    User: system user
    Host: 
      db: NULL
 Command: Connect
    Time: 9
   State: Waiting for work from SQL thread
    Info: NULL
Progress: 0.000
*************************** 3. row ***************************
      Id: 671498
    User: system user
    Host: 
      db: NULL
 Command: Connect
    Time: 1
   State: Waiting for work from SQL thread
    Info: NULL
Progress: 0.000
*************************** 4. row ***************************
      Id: 671499
    User: system user
    Host: 
      db: NULL
 Command: Connect
    Time: 0
   State: Waiting for work from SQL thread
    Info: NULL
Progress: 0.000
*************************** 5. row ***************************
      Id: 671500
    User: system user
    Host: 
      db: NULL
 Command: Connect
    Time: 16
   State: Waiting for work from SQL thread
    Info: NULL
Progress: 0.000
*************************** 6. row ***************************
      Id: 671501
    User: system user
    Host: 
      db: NULL
 Command: Connect
    Time: 5724
   State: Slave has read all relay log; waiting for the slave I/O thread to update it
    Info: NULL
Progress: 0.000
*************************** 7. row ***************************

Here you can see the 4 "system user" threads in the "Waiting for work from SQL thread" state.
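
On a busy server the full process list can be long. A possible way to show only the replication threads is to query information_schema instead; this is a sketch, and the column selection is just one reasonable choice.

```sql
-- Replication threads run as the internal "system user" account,
-- so filtering on USER keeps only the I/O, SQL and worker threads.
SELECT ID, TIME, STATE
FROM information_schema.PROCESSLIST
WHERE USER = 'system user'
ORDER BY ID;
```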

And the good news is that TokuDB works perfectly fine in conservative mode, with no replication issue!

Since I'm using MariaDB 10.1.16, let's try switching to optimistic mode:

STOP SLAVE;
SET GLOBAL slave_parallel_threads = 4;
SET GLOBAL slave_parallel_mode="optimistic";
START SLAVE;

Bad news: something is not working properly in optimistic mode. According to SHOW SLAVE STATUS, replication is now broken with the error "Lock wait timeout exceeded; try restarting transaction".
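
If you hit the same problem, one possible way back to a working state is to return to conservative mode. The sketch below is an assumption about how to recover, not necessarily the exact steps used here, and whether the slave resumes cleanly depends on the state it stopped in.

```sql
-- Inspect the failure: check Slave_SQL_Running, Last_SQL_Errno and
-- Last_SQL_Error in the output.
-- (\G is a mysql client terminator that prints one field per line.)
SHOW SLAVE STATUS\G

-- Fall back to the default mode, which worked with TokuDB above.
STOP SLAVE;
SET GLOBAL slave_parallel_mode = 'conservative';
START SLAVE;

-- Confirm that the SQL thread is running again.
SHOW SLAVE STATUS\G
```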

After I reported the issue to Kristian Nielsen and Rich Prohaska, it turned out that TokuDB does not yet implement all the logic required to handle the optimistic (and aggressive) replication modes.

III) Conclusion

Parallel replication works well on TokuDB, but only in conservative mode. The logic needed to make it work in optimistic and aggressive modes is not yet implemented. However, Kristian Nielsen and Rich Prohaska are currently working on it, and they are making good progress (you can check the current WIP branch).

I'm confident we could have it working properly in the next version of MariaDB!