Why don't Redis slaves take over after the master fails in Redis Cluster?

cluster failovercluster master-slave redis replication

I have a Redis Cluster with 2 masters and 4 slaves (2 slaves for each master). After I manually crash a master (with redis-cli -p 6379 debug segfault), the slaves don't do anything. They detect that something is wrong with the master, but they take no action (I waited for 20 minutes).

Here is the cluster nodes output (from redis-cli cluster nodes):

08dfd1bdd470a8831b33b7b0409a40bf45ee22d0 192.168.0.15:6379 myself,slave 55787eb63780365a0c7d4a0ed72cac4b97a55ed0 0 0 1 connected
7fedf234aba8d906dca5a4725a54d1cc5c979c18 192.168.0.18:6379 slave a739cfbcd9b804345808bb3a78b6a00b2d6050f9 0 1477865886164 2 connected
a739cfbcd9b804345808bb3a78b6a00b2d6050f9 192.168.0.14:6379 master,fail? - 1477865551940 1477865548392 0 disconnected 8192-16383
5dcc0a0a3f13ea9343171a13fbf0ec7054dfc2ab 192.168.0.19:6379 slave a739cfbcd9b804345808bb3a78b6a00b2d6050f9 0 1477865884135 5 connected
55787eb63780365a0c7d4a0ed72cac4b97a55ed0 192.168.0.16:6379 master - 0 1477865885150 2 connected 0-8191
601a5e0dd9d40d8c01119714e89be63eaee87900 192.168.0.17:6379 slave 55787eb63780365a0c7d4a0ed72cac4b97a55ed0 0 1477865882100 3 connected

As shown above, the failed master node 192.168.0.14:6379 is marked as master,fail? and I don't know why the flag has a question mark. I have waited 20 minutes and nothing has changed. Why don't the slaves take over from the master?

Best Answer

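You need at least 3 masters to form a Redis Cluster with working automatic failover. A slave is promoted only after a majority of the masters agree that the failed master is down and vote for that slave. With 2 masters, a majority is 2, so once one master crashes the single remaining master can never reach it. That is why the crashed node stays flagged fail? (PFAIL, "possibly failing": only the reporting node suspects it) and is never upgraded to fail, so no slave starts an election. More generally, if a majority of masters become unavailable at the same time, the cluster becomes unusable and failover does not happen.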
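
The majority arithmetic can be checked against the cluster nodes output from the question. The following is only a minimal sketch: the helper name quorum_report is made up for illustration, it assumes the usual space-separated cluster nodes line layout (flags in the third field, slot ranges from the ninth field on), and it assumes that only masters serving slots count toward the majority.

```python
# Sample lines copied from the question (two masters, one of the slaves).
SAMPLE = """\
7fedf234aba8d906dca5a4725a54d1cc5c979c18 192.168.0.18:6379 slave a739cfbcd9b804345808bb3a78b6a00b2d6050f9 0 1477865886164 2 connected
a739cfbcd9b804345808bb3a78b6a00b2d6050f9 192.168.0.14:6379 master,fail? - 1477865551940 1477865548392 0 disconnected 8192-16383
55787eb63780365a0c7d4a0ed72cac4b97a55ed0 192.168.0.16:6379 master - 0 1477865885150 2 connected 0-8191
"""


def quorum_report(cluster_nodes_output):
    """Hypothetical helper: count masters and check for a majority.

    Returns (masters, reachable, needed, has_majority) as seen from the
    node that produced the output.
    """
    masters = 0
    reachable = 0
    for line in cluster_nodes_output.strip().splitlines():
        fields = line.split()
        flags = fields[2].split(",")
        # Assumption: only masters that hold slots (a 9th field exists) count.
        if "master" not in flags or len(fields) < 9:
            continue
        masters += 1
        # "fail?" is PFAIL (suspected), "fail" is a confirmed failure.
        if "fail" not in flags and "fail?" not in flags:
            reachable += 1
    needed = masters // 2 + 1  # strict majority of masters
    return masters, reachable, needed, reachable >= needed


masters, reachable, needed, ok = quorum_report(SAMPLE)
print(f"masters={masters} reachable={reachable} needed={needed} majority={ok}")
```

For the output in the question this reports 2 masters, 1 reachable and 2 needed, so there is no majority. With 3 masters and one down, 2 would be reachable and 2 needed, which is why the third master makes failover possible.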