Ubuntu – Pacemaker/Corosync resource cleanup causes dependent resource restart (any Ubuntu version)

Tags: cluster, corosync, crm, pacemaker, ubuntu

I'm having an issue with a two-node Pacemaker/Corosync cluster on Ubuntu (12.04, 14.04, 16.04 and 18.04) and couldn't find anyone else describing it.
There are two resources: res_ip (a virtual IP) and res_apache (apache2).
These are just examples; the issue appears with any kind of ordered/colocated resources.

res_apache is colocated with res_ip so that Apache always runs on the "active" server, which is reachable via the virtual IP.
There are cases where res_ip fails and restarts, which makes res_apache restart as well (as expected), leaving a failcount behind.
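For reference, a minimal crm configuration along these lines might look as follows (a sketch only: the IP address, netmask and Apache config path are placeholders, not taken from the original setup):

    crm configure primitive res_ip ocf:heartbeat:IPaddr2 \
        params ip=192.168.1.100 cidr_netmask=24 \
        op monitor interval=10s
    crm configure primitive res_apache ocf:heartbeat:apache \
        params configfile=/etc/apache2/apache2.conf \
        op monitor interval=30s
    # keep apache on the node holding the virtual IP, and start the IP first
    crm configure colocation col_apache_with_ip inf: res_apache res_ip
    crm configure order ord_ip_before_apache inf: res_ip res_apache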

The issue:
Trying to clean up the resource res_ip (crm resource cleanup res_ip) causes res_apache (which depends on res_ip) to restart, and I don't know why.

The same command on CentOS doesn't cause any interruption in the operation of the web application; it solely cleans up the failcount.
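To illustrate, the failcount can be inspected before and after the cleanup with the standard Pacemaker tooling:

    # show current failcounts for all resources (one-shot, no interactive view)
    crm_mon --one-shot --failcounts
    # clear the operation history (and failcount) of res_ip only
    crm resource cleanup res_ip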

The attached node1_corosync.log.extract.txt shows that res_ip is recognized as stopped (line 951) and hence the dependent res_apache is restarted.
The cleanup command was run around that time (15:18:17), so I assume the check of whether the resource is running is initiated by the cleanup command.
The resource just shouldn't end up in the 'stopped' state and therefore shouldn't restart the dependent resource res_apache.

Again, I need to point out that I see this issue on all Ubuntu releases but not on CentOS, and that the kind of resource does not matter.

Does anyone have any idea why this happens (and why it only happens on Ubuntu)?

Logfiles and configuration: https://1drv.ms/u/s!Av4S568oLfJmgZtQ6pcE40FOKN8yDg?e=IOHKX8

Best Answer

(You should be on the mailing list of any software you use, really, and be looking for and offering help there first. Have you tried that?) The following is excerpted from the ClusterLabs mailing list; the answers are from Red Hat developers (although behaviour may vary from cluster version to version).

When I call “pcs resource cleanup Res1”, this results in an interruption of service on the side of Res2 (i.e. Res2 is stopped …). My – unconfirmed – assumption was that Pacemaker would first detect the current state of the resource(s) by calling monitor and then decide whether any actions need to be performed. But from reading the logfiles I would interpret that Res1 is temporarily removed from the CIB and re-inserted, and that this results in stopping Res2 until Res1 has confirmed state “started”.

Correct, removing the resource's operation history is how pacemaker triggers a re-probe of the current status.

As I interpret the documentation, it would be possible to avoid this behaviour by configuring the order constraint with kind=Optional. But I am not sure if this would result in any other undesired side effects (e.g. on the reverse order when stopping).

kind=Optional constraints only apply when both actions need to be done in the same transition. I.e. if a single cluster check finds that both Res1 and Res2 need to be started, Res1 will be started before Res2. But it is entirely possible that Res2 can be started in an earlier transition, with Res1 still stopped, and a later transition starts Res1. Similarly when stopping, Res2 will be stopped first, if both need to be stopped.
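For the record, such an ordering constraint would look like this in pcs (Res1/Res2 are the names used in this thread):

    # ordering is honoured only within a single transition; no hard dependency
    pcs constraint order start Res1 then start Res2 kind=Optional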

In your original scenario, if your master/slave resource will only bind to the IP after it is up, kind=Optional won't be reliable. But if the master/slave resource binds to the wildcard IP, then the order really doesn't matter -- you could keep the colocation constraint and drop the ordering.
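A sketch of that alternative with pcs (the order constraint ID below is hypothetical; list the real one first):

    # show all constraints with their IDs
    pcs constraint list --full
    # drop the ordering, keep (or add) the colocation
    pcs constraint remove order-Res1-Res2-mandatory
    pcs constraint colocation add Res2 with Res1 INFINITY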

Another workaround seems to be setting the dependent resource to unmanaged, performing the cleanup, and then setting it back to managed.

This is what I would recommend if you have to keep the mandatory ordering.
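In pcs terms, the sequence would look roughly like this (resource names as in the thread):

    # tell Pacemaker to stop acting on Res2 while the cleanup runs
    pcs resource unmanage Res2
    pcs resource cleanup Res1
    # hand control back once Res1's state has been re-probed
    pcs resource manage Res2

It may be worth confirming via pcs status that Res1 is Started again before re-managing Res2.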

And I wonder if “pcs resource failcount reset” would do the trick WITHOUT any actions being performed if no change in state is necessary. But I seem to remember that we already tried this now and then, and sometimes such a failed resource was not started after the failcount reset. (But I am not sure and have not yet had time to try to reproduce it.)

No, in newer pacemaker versions, crm_failcount --delete is equivalent to a crm_resource --cleanup. (pcs calls these to actually perform the work)
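For reference, the two lower-level commands that pcs wraps:

    # equivalent in newer Pacemaker versions:
    crm_failcount --delete --resource Res1
    crm_resource --cleanup --resource Res1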

Is there any deeper insight which might help with a sound understanding of this issue?

It's a side effect of the current CIB implementation. Pacemaker's policy engine determines the current state of a resource by checking its operation history in the CIB. Cleanups remove the operation history, thus making the current state unknown, forcing a re-probe. As a side effect, any dependencies no longer have their constraints satisfied until the re-probe completes.
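That operation history can be inspected directly in the live CIB; a sketch, using the resource name from the question (the XPath is illustrative):

    # dump the stored operation history for res_ip
    cibadmin --query --xpath "//lrm_resource[@id='res_ip']"

A cleanup removes exactly these entries, so the state is unknown until the re-probe writes new ones.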

It would be theoretically possible to implement a "cleanup old failures" option that would clear a resource's fail count and remove only its operation history entries for failed operations, as long as doing so does not change the current state determination. But that would be quite complicated, and setting the resource unmanaged is an easy workaround.
