MariaDB Galera Cluster Cannot Start Up

galera, mariadb, MySQL

I built a MariaDB Galera cluster on CentOS 7. The node information is below:

10.200.67.27    MariaDB-Node1
10.200.67.29    MariaDB-Node2
10.200.67.26    MariaDB-Node3

However, MariaDB-Node2 and MariaDB-Node3 stopped unexpectedly. I tried to restart the mysql service on both servers, but neither started. I then removed the wsrep_on=1 setting and restarted mysql on MariaDB-Node2, which produced the output below:

[xiaofang@sd-vm-0003929 ~]$ sudo systemctl start mysql
[xiaofang@sd-vm-0003929 ~]$ 
[xiaofang@sd-vm-0003929 ~]$ 
[xiaofang@sd-vm-0003929 ~]$ sudo systemctl status mysql
● mariadb.service - MariaDB 10.6.2 database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/mariadb.service.d
           └─migrated-from-my.cnf-settings.conf
   Active: activating (start) since Fri 2022-05-27 16:34:16 CST; 402ms ago
     Docs: man:mariadbd(8)
           https://mariadb.com/kb/en/library/systemd/
  Process: 13569 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
  Process: 13580 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= ||   VAR=`cd /usr/bin/..; /usr/bin/galera_recovery`; [ $? -eq 0 ]   && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
  Process: 13578 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
 Main PID: 13606 (mariadbd)
   CGroup: /system.slice/mariadb.service
           └─13606 /usr/sbin/mariadbd

May 27 16:34:16 sd-vm-0003929.novalocal systemd[1]: Starting MariaDB 10.6.2 database server...
May 27 16:34:16 sd-vm-0003929.novalocal mariadbd[13606]: 2022-05-27 16:34:16 0 [Note] /usr/sbin/mariadbd (mysqld 10.6.2-MariaDB) starting as process 13606 ...
May 27 16:34:16 sd-vm-0003929.novalocal mariadbd[13606]: 2022-05-27 16:34:16 0 [Warning] You need to use --log-bin to make --binlog-format work.
May 27 16:34:16 sd-vm-0003929.novalocal mariadbd[13606]: 2022-05-27 16:34:16 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
May 27 16:34:16 sd-vm-0003929.novalocal mariadbd[13606]: 2022-05-27 16:34:16 0 [Note] InnoDB: Number of pools: 1
May 27 16:34:16 sd-vm-0003929.novalocal mariadbd[13606]: 2022-05-27 16:34:16 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
May 27 16:34:16 sd-vm-0003929.novalocal mariadbd[13606]: 2022-05-27 16:34:16 0 [Note] InnoDB: Using Linux native AIO
May 27 16:34:16 sd-vm-0003929.novalocal mariadbd[13606]: 2022-05-27 16:34:16 0 [Note] InnoDB: Initializing buffer pool, total size = 134217728, chunk size = 134217728
May 27 16:34:16 sd-vm-0003929.novalocal mariadbd[13606]: 2022-05-27 16:34:16 0 [Note] InnoDB: Completed initialization of buffer pool
May 27 16:34:16 sd-vm-0003929.novalocal mariadbd[13606]: 2022-05-27 16:34:16 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=34426674408,34426674408

So how can I start the mysql service again?

Best Answer

If Node2 and Node3 both stopped, then unless you changed the quorum weights, Node1 has most likely lost quorum and will not be in good shape either. I'd recommend the following (assuming Node3 is still down):

  • Stop Node2
  • Stop Node1
  • Run as root on Node1: "galera_new_cluster"
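
Before bootstrapping, it can help to look at the state Galera saved on Node1. The sketch below reads one field from a grastate.dat file; the helper name grastate_value is hypothetical, and the path /var/lib/mysql/grastate.dat used in the comments assumes the default data directory.

```shell
# Print the value of one field (e.g. seqno, safe_to_bootstrap) from a
# Galera saved-state file. Hypothetical helper, not part of MariaDB.
#   $1 = path to grastate.dat
#   $2 = field name
grastate_value() {
    awk -F': *' -v key="$2" '$1 == key { print $2 }' "$1"
}

# Possible use on Node1, as root (path and service name are assumptions):
#   systemctl stop mariadb
#   grastate_value /var/lib/mysql/grastate.dat safe_to_bootstrap
#   galera_new_cluster
```

If the flag prints 0, Galera will refuse to bootstrap from that node until the flag is changed by hand; whether forcing it is appropriate depends on which node holds the most recent data.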

At this point you should verify that Node1 is OK and has formed a cluster by itself.
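
One way to do that check is to look at the wsrep status variables. The sketch below is a hypothetical helper (check_primary is not a MariaDB tool) that reads tab-separated status lines on stdin and succeeds only if the node reports a Primary component of the expected size. It assumes the mysql client can log in as shown in the comment.

```shell
# Succeed (exit status 0) only if the status lines on stdin report
# wsrep_cluster_status = Primary and wsrep_cluster_size = $1.
# Input format: "Variable_name<TAB>Value", one per line.
check_primary() {
    awk -F'\t' -v want="$1" '
        $1 == "wsrep_cluster_status" { status = $2 }
        $1 == "wsrep_cluster_size"   { size = $2 }
        END {
            if (status == "Primary" && size == want) exit 0
            exit 1
        }
    '
}

# Possible use on Node1 right after bootstrapping (expected size 1):
#   mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster%'" | check_primary 1 \
#       && echo "Node1 is Primary"
```

Once another node has joined, the same check with an expected size of 2 should pass.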

If that is OK, try starting Node3 and see whether it joins the cluster.

As for Node2, to be on the safe side I would wipe it and let it rejoin the cluster from an empty database, after you set wsrep_on back to 1.
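
A cautious way to "wipe" Node2 is to move the old files aside rather than delete them. The sketch below is one such approach under stated assumptions: set_aside_datadir is a hypothetical helper, /var/lib/mysql is assumed to be the data directory, and the node is assumed to request a full state transfer when it starts with an empty data directory.

```shell
# Move everything inside a data directory into a timestamped backup
# directory next to it, leaving the original directory empty but with its
# ownership and permissions untouched. Prints the backup path.
#   $1 = data directory
set_aside_datadir() {
    datadir="${1%/}"
    backup="${datadir}.bak.$(date +%Y%m%d%H%M%S)"
    [ -d "$datadir" ] || { echo "no such directory: $datadir" >&2; return 1; }
    mkdir "$backup" || return 1
    # -mindepth/-maxdepth 1 selects every top-level entry, dotfiles included.
    find "$datadir" -mindepth 1 -maxdepth 1 -exec mv {} "$backup"/ \; || return 1
    echo "$backup"
}

# Possible use on Node2, as root, with wsrep_on re-enabled in the config:
#   systemctl stop mariadb
#   set_aside_datadir /var/lib/mysql
#   systemctl start mariadb
```

The backup directory can be removed once Node2 has rejoined and its data has been checked.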
