After a network failure, both servers running keepalived become MASTER.
When the network is reestablished, both keep the MASTER state.
What could be causing it?
Edit: One more piece of information that might be relevant: each server has two NICs.
Here is the virtual instance configuration:
    vrrp_instance VGAPP {
        interface eth0
        virtual_router_id 61
        state BACKUP
        nopreempt
        priority 50
        advert_int 3
        virtual_ipaddress {
            10.26.57.61/24
        }
        track_interface {
            eth0
        }
        track_script {
            jboss_check
            #tomcat_check
            #interface_check
            #interface_check02
        }
        notify_master "/opt/keepalived/scripts/set_state.sh MASTER"
        notify_backup "/opt/keepalived/scripts/set_state.sh BACKUP"
        notify_fault "/opt/keepalived/scripts/set_state.sh FAULT"
        notify_stop "/opt/keepalived/scripts/set_state.sh STOPPED"
    }
Best Answer
This can actually be caused by a bug. I know because I've had to fix it myself.
According to the RFC, when priorities are equal on both nodes, the tie is broken by comparing the nodes' primary IP addresses.
So, whoever has the highest IP address wins.
In keepalived, the way this is done is basically wrong: endianness is not handled properly when doing this comparison.
Let's imagine we have two routers, (A) 10.1.1.200 and (B) 10.1.1.201.
The code should perform the following comparison.
On A:
On B:
However, because the endianness is not correctly handled, the following comparison is made instead.
On A:
On B:
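To make the effect concrete, here is a small Python sketch of the tie-break for routers A and B. It is only an illustration, not keepalived's C code and not the patch mentioned below. It assumes the bug amounts to comparing the received address in correct integer order against a local address whose network-order bytes are read as a little-endian integer; the function names are made up for this example.

```python
import ipaddress


def host_order(ip: str) -> int:
    """Integer value of the address with its bytes in the correct order."""
    return int(ipaddress.IPv4Address(ip))


def byte_swapped(ip: str) -> int:
    """Value you get when network-order bytes are read as little-endian."""
    return int.from_bytes(ipaddress.IPv4Address(ip).packed, "little")


def should_yield_correct(local_ip: str, remote_ip: str) -> bool:
    # RFC-style tie-break: go to BACKUP if the sender's address is greater.
    return host_order(remote_ip) > host_order(local_ip)


def should_yield_buggy(local_ip: str, remote_ip: str) -> bool:
    # Assumed form of the bug: only the local side is byte-swapped.
    return host_order(remote_ip) > byte_swapped(local_ip)


A = "10.1.1.200"
B = "10.1.1.201"

# True means "this node steps down to BACKUP".
correct = {"A": should_yield_correct(A, B), "B": should_yield_correct(B, A)}
buggy = {"A": should_yield_buggy(A, B), "B": should_yield_buggy(B, A)}

print(correct)  # {'A': True, 'B': False}
print(buggy)    # {'A': False, 'B': False}
```

With the correct comparison, A steps down and B stays MASTER. Under the assumed buggy comparison, neither node ever sees a "greater" peer, so both keep the MASTER state, which matches the behaviour described in the question.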
This patch should work, but I've remade it from my original patch and have not tested it. I haven't even checked that it compiles! So no refunds!