Pacemaker 2 floatingIP

high-availabilitypacemaker

I'm triying to config a cluster with two servers, each of them with two interfaces, and i want to config two floatings ips, one private and other public.
The idea is, if some of two interface fail in one server fails, cluster swicths to the other server.

I'm using pacemaker but until now, i've just made it work with one floating IP. With two, until now, no way.
I'm configuring two resources (2 float ip) and two rings.

Have somebody tried this? Or somebody can guide me?

Thanks in advance!

Best Answer

Should be fairly simple. Just configure up another IPaddr2 primitive. If you need to specify which interface the virtual IP binds to this can be done fairly easily via the nic= parameter. Naturally, the interface names on both nodes will need to match if using nic= parameter. for example:

primitive p_ip_private IPaddr2 \
    params ip=192.168.35.5 cidr_netmask=255.255.255.0 nic=eth1 \
    op start interval=0 timeout=20 \
    op stop interval=0 timeout=20s \
    op monitor interval=20 timeout=20

You need not use the nic= parameter though. The IPaddr2 resource agent should be smart enough to properly choose the right interface based upon the network and subnet already assigned to the interface.

Related Solutions

LDAP (389 Directory Service) and Pacemaker with Multi-Master

Thanks to the Pacemaker mailing list, we have a solution. The problem is that the LSB script for 389 doesn't understand the concept of master/slave. The easiest solution is to use a simple clone, rather than a master/slave clone. New Pacemaker configuration looks like the following:

property stonith-enabled=false
property no-quorum-policy=ignore

rsc_defaults resource-stickiness=100

primitive elastic_ip lsb:elastic-ip op monitor interval="10s"
primitive dirsrv lsb:dirsrv op monitor interval="15s" role="Slave" timeout="10s" op monitor interval="16s" role="Master" timeout="10s"
clone ldap-clone dirsrv
order ldap-after-eip inf: elastic_ip ldap-clone
colocation ldap-with-eip inf: elastic_ip ldap-clone

Monitoring Varnish with Heartbeat and Pacemaker

Your cluster architecture confuses me, as it seems you are running services that should be cluster-managed (like Varnish) standalone on two nodes at the same time and let the cluster resource manager (CRM) just juggle IP addresses around.

What is it you want to achieve with your cluster setup? Fault tolerance? Load balancing? Both? Mind you, I am talking about the cluster resources (Varnish, IP addresses, etc), not the backend servers to which Varnish distributes the load.

To me it sounds like you want an active-passive two-node cluster, which provides fault tolerance. One node is active and runs Varnish, the virtual IP addresses and possibly other resources, and the other node is passive and does nothing until the cluster resource manager moves resources over to the passive node, at which point it becomes active. This is a tried-and-true architecture that is as old as time itself. But for it to work you need to give the CRM full control over the resources. I recommend following Clusters from Scratch and modelling your cluster after that.

Edit after your updated question: your CIB looks good, and once you patched the Varnish init script so that repeated calls to "start" return 0 you should be able to add the following primitive (adjust the timeouts and intervals to your liking):

primitive p_varnish lsb:varnish \
    op monitor interval="10s" timeout="15s" \
    op start interval="0" timeout="10s" \
    op stop interval="0" timeout="10s"

Don't forget to add it to the balancer group (the last element in the list):

group balancer eth0_gateway eth1_iceman_slider eth1_iceman_slider_ts \
    eth1_iceman_slider_pm eth1_iceman_slider_jy eth1_iceman eth1_slider \
    eth1_viper eth1_jester p_varnish

Edit 2: To decrease the migration threshold add a resource defaults section at the end of your CIB and set the migration-threshold property to a low number. Setting it to 1 means the resource will be migrated after a single failure. It is also a good idea to set resource stickiness so that a resource that has been migrated because of node failure (reboot or shutdown) does not automatically get migrated back later when the node is available again.

rsc_defaults $id="rsc-options" \
    resource-stickiness="100" \
    migration-threshold="1"

Best Answer

Related Solutions

LDAP (389 Directory Service) and Pacemaker with Multi-Master

Monitoring Varnish with Heartbeat and Pacemaker

Related Topic