Mysql – MariaDB galera cluster fail when dumping or optimizing database

clusterdumpgaleramariadbMySQL

It seems that a Galera cluster deadlocks every time when I run a mysqldump or table optimize.

I've ran a "mysqlcheck" and a "mysqldump" on my MariaDB 10.1 database server several times. (which runs in a Galera cluster with two other servers)

I've noticed that both tasks stop and don't show any progress after running for a short time.

For example the mysqldump stops after creating an 14,0 MB (14.760.912 bytes) dump file and doesn't proceed.

The mysqlcheck to repair and optimize tables also hangs.

In both situations the cluster starts to have issues and the only way to get it to work normally again is by taking the server that executed the job offline and also taking another server offline. I then take them back online on by one and the cluster works normally again.

I'm not sure what is causing these problems. I haven't found any errors in the syslog, although during the shutdown of the servers I noticed the following:

Jan 10 20:43:46 france mysqld[1015]: 2016-01-10 20:43:46 140096330258176 [Warning] WSREP: TO isolation failed for: 3, schema: mysql, sql: OPTIMIZE TABLE proc. Check wsrep connection state and retry the query.

Jan 10 21:58:47 france mysqld[1034]: 2016-01-10 21:58:47 139691511322368 [Warning] WSREP: TO isolation failed for: 3, schema: smf, sql: OPTIMIZE TABLE smf_categories. Check wsrep connection state and retry the query.
Jan 10 21:58:47 france mysqld[1034]: 2016-01-10 21:58:47 139691511322368 [Warning] Aborted connection 24 to db: 'smf' user: 'maintenance' host: 'localhost' (Unknown error)
Jan 10 21:58:47 france mysqld[1034]: 2016-01-10 21:58:47 139691509827328 [Warning] WSREP: TO isolation failed for: 3, schema: (null), sql: SELECT 1 FROM mysql.user LIMIT 1. Check wsrep connection state and retry the query.

I've found the issue to be with Galera. When I take the server out of the cluster the optimize and dump job run much quicker and complete correctly.

Best Answer

I was pointed into the right direction by Tom on the MariaDB KB: https://mariadb.com/kb/en/mariadb/galera-cluster-fail-during-dump-or-optimize/#comment_1911

The issue appears to be caused by flow control. I've resolved it by tuning the flow control settings.

I did this by adding the following wsrep_provider_options: gcs.fc_limit=500; gcs.fc_master_slave=YES; gcs.fc_factor=1.0

However it also created a new problem, the job now finishes correctly but after that the cluster still go's down.

Related Solutions

Fail to set up mariadb-galera cluster

It seems likely that these are the important lines in the log:

 150128 14:29:02 [ERROR] WSREP: gcs/src/gcs_core.c:gcs_core_open():202: Failed to open backend connection: -110 (Connection timed out)
 150128 14:29:02 [ERROR] WSREP: gcs/src/gcs.c:gcs_open():1291: Failed to open channel 'my_wsrep_cluster' at 'gcomm://<IP1>,<IP2>': -110 (Connection timed out)
 150128 14:29:02 [ERROR] WSREP: gcs connect failed: Connection timed out
 150128 14:29:02 [ERROR] WSREP: wsrep::connect() failed: 7

Typically this points to a network issue, more often that not firewall. Make sure you have tcp ports 3306,4444,4567 & 4568 open between the nodes.

MariaDB Galera Cluster, Force Sync

I was not able to find an official way to handle this, so I went with the idea of dumping the tables individually and reimporting them. Not wanting to do it by hand, I whipped a PHP script to do it for me. I'm posting it here in case anyone else finds this useful.

/*
 * InnoDB Convert
 * Converts existing non-InnoDB tables to InnoDB, then re-imports the
 * data so that it's replicated across the cluster.
 */

// Configuration
    $_config['db'] = array(
        'type'     => 'mysql',
        'host'     => 'localhost',
        'username' => 'user',
        'password' => 'password'
    );

// Establish database connection
    try {
        $pdo = new PDO(
            $_config['db']['type'] . ':host=' . $_config['db']['host'],
            $_config['db']['username'],
            $_config['db']['password']
        );
    } catch ( PDOException $e ) {
        echo 'Connection failed: ' . $e->getMessage();
    }

// Get list of databases
    $db_query = <<<SQL
  SHOW DATABASES
SQL;
    $db_result = $pdo->prepare( $db_query );
    $db_result->execute();

    while ( $db_row = $db_result->fetch( PDO::FETCH_ASSOC )) {
    // Look through databases, but ignores the ones that come with a
    // MySQL install and shouldn't be part of the cluster
        if ( !in_array( $db_row['Database'], array( 'information_schema', 'mysql', 'performance_schema', 'testdb' ))) {
            $pdo->exec( "USE {$db_row['Database']}" );

            $table_query = <<<SQL
  SHOW TABLES
SQL;
            $table_result = $pdo->prepare( $table_query );
            $table_result->execute();

            while ( $table_row = $table_result->fetch( PDO::FETCH_ASSOC )) {
            // Loop through all tables
                $table = $table_row["Tables_in_{$db_row['Database']}"];

                $engine_query = <<<SQL
  SHOW TABLE STATUS WHERE Name = :table
SQL;
                $engine_result = $pdo->prepare( $engine_query );
                $engine_result->execute( array(
                    ':table' => $table
                ));

                $engine_row = $engine_result->fetch( PDO::FETCH_ASSOC );

                if ( $engine_row['Engine'] != 'InnoDB' ) {
                // Engine is not equal to InnoDB, let's convert it
                    echo "Converting '$table' on '{$db_row['Database']}' from '{$engine_row['Engine']}' to InnoDB:\n";

                    echo     "Modifying engine...";

                    $change_query = <<<SQL
  ALTER TABLE $table ENGINE=InnoDB
SQL;
                    $change_result = $pdo->prepare( $change_query );
                    $change_result->execute();

                    echo "done!\n";

                    echo "    Exporting table...";

                    exec( "mysqldump -h {$_config['db']['host']} -u {$_config['db']['username']} -p{$_config['db']['password']} {$db_row['Database']} $table > /tmp/dump-file.sql" );

                    echo "done!\n";

                    echo "    Re-importing table...";

                    exec( "mysql -h {$_config['db']['host']} -u {$_config['db']['username']} -p{$_config['db']['password']} {$db_row['Database']} < /tmp/dump-file.sql" );

                    echo "done!\n";

                    unlink( '/tmp/dump-file.sql' );

                    echo "done!\n";
                }
            }
        }
    }

I successfully used it to convert hundreds of tables across a couple dozen databases in about two minutes.

Best Answer

Related Solutions

Fail to set up mariadb-galera cluster

MariaDB Galera Cluster, Force Sync

Related Topic