CentOS – Apache Failed to Start in Pacemaker

apache-2.4 centos high-availability pacemaker

I am using Pacemaker with Corosync to set up a basic Apache HA cluster on three nodes running CentOS. For some reason, I cannot get the apache resource to start in pcs.

Cluster IP: 192.168.200.40

# pcs resource show ClusterIP
     Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
      Attributes: cidr_netmask=24 ip=192.168.200.40
      Operations: monitor interval=20s (ClusterIP-monitor-interval-20s)
                  start interval=0s timeout=20s (ClusterIP-start-interval-0s)
                  stop interval=0s timeout=20s (ClusterIP-stop-interval-0s)



# pcs resource show WebServer
 Resource: WebServer (class=ocf provider=heartbeat type=apache)
  Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://localhost/server-status
  Operations: monitor interval=1min (WebServer-monitor-interval-1min)
              start interval=0s timeout=40s (WebServer-start-interval-0s)
              stop interval=0s timeout=60s (WebServer-stop-interval-0s)



# pcs status
Cluster name: 
WARNING: corosync and pacemaker node names do not match (IPs used in setup?)
Stack: corosync
Current DC: server3.example.com (version 1.1.18-11.el7_5.2-2b07d5c5a9) - partition with quorum
Last updated: Thu Jun  7 21:59:09 2018
Last change: Thu Jun  7 21:45:23 2018 by root via cibadmin on server1.example.com

3 nodes configured
2 resources configured

Online: [ server1.example.com server2.example.com server3.example.com ]

Full list of resources:

 ClusterIP  (ocf::heartbeat:IPaddr2):   Started server2.example.com
 WebServer  (ocf::heartbeat:apache):    Stopped

Failed Actions:
* WebServer_start_0 on server3.example.com 'unknown error' (1): call=49, status=Timed Out, exitreason='',
    last-rc-change='Thu Jun  7 21:46:03 2018', queued=0ms, exec=40002ms
* WebServer_start_0 on server1.example.com 'unknown error' (1): call=53, status=Timed Out, exitreason='',
    last-rc-change='Thu Jun  7 21:45:23 2018', queued=0ms, exec=40003ms
* WebServer_start_0 on server2.example.com 'unknown error' (1): call=47, status=Timed Out, exitreason='',
    last-rc-change='Thu Jun  7 21:46:43 2018', queued=1ms, exec=40002ms


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

The httpd instance is enabled and running on all three nodes. The web page is reachable through both the cluster IP and the individual node IPs, and failover of the ClusterIP resource works well. What could be going wrong with the apache resource in this case?

Thank you very much!

Update:

Here is more information from the debug output. Apache appears to be unable to bind to port 80, but the Apache log shows no error, and systemctl status httpd shows no problems on any node. I can still open web pages via the cluster IP and the node IPs, and ClusterIP failover works fine too. Any idea why the apache resource doesn't work with Pacemaker?

# pcs resource debug-start WebServer --full
Operation start for WebServer (ocf:heartbeat:apache) failed: 'Timed Out' (2)
 >  stderr: ERROR: (98)Address already in use: AH00072: make_sock: could not bind to address [::]:80 (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80 no listening sockets available, shutting down AH00015: Unable to open logs
 >  stderr: INFO: apache not running
 >  stderr: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
 >  stderr: INFO: apache not running
 >  stderr: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
 >  stderr: INFO: apache not running
 >  stderr: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
 >  stderr: INFO: apache not running

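The long stderr line in the debug output above packs several Apache messages together. As a reading aid, here is a minimal sketch that pulls out the addresses Apache reported it could not bind to. The function name bind_conflicts is hypothetical and not part of pcs; it only filters text you pipe into it.

```shell
#!/bin/sh
# Hypothetical helper (not part of pcs): read debug-start output on stdin and
# print each address Apache reported it could not bind to, one per line.
bind_conflicts() {
    grep -o 'could not bind to address [^ ]*' \
        | awk '{ print $NF }' \
        | LC_ALL=C sort -u
}

# Example using the stderr line from the question.
printf '%s\n' 'ERROR: (98)Address already in use: AH00072: make_sock: could not bind to address [::]:80 (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80 no listening sockets available, shutting down' \
    | bind_conflicts
```

On a live node, running ss -ltnp 'sport = :80' as root shows which process currently holds the port. One plausible reading, which the thread does not confirm, is that the httpd instance already running on each node is that process.
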
Best Answer

On CentOS 8, I created the resource like this:

pcs resource create httpd_monitor ocf:heartbeat:apache \
configfile="/etc/httpd/conf/httpd.conf" \
statusurl="http://127.0.0.1/server-status" --group apache

The file /etc/httpd/conf/httpd.conf is checked for the PidFile directive. It is not defined there, but the value defaults to /var/run/httpd/httpd.pid.

[root@hanode1 ~]# pcs resource
  * Resource Group: apache:
    * httpd_fs  (ocf::heartbeat:Filesystem):     Started hanode1.lab.local
    * httpd_vip (ocf::heartbeat:IPaddr2):        Started hanode1.lab.local
    * apache_service    (service:httpd):         Started hanode1.lab.local
    * httpd_monitor     (ocf::heartbeat:apache):         Stopped

You then get these messages:

Feb 02 17:39:21 INFO: apache not running
Feb 02 17:39:21 INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up

So define this explicitly in /etc/httpd/conf/httpd.conf:

# this is the default but is required by pcs to be defined
PidFile /var/run/httpd/httpd.pid

The resource then starts fine, as below:

[root@hanode1 ~]# pcs resource debug-start httpd_monitor
Operation start for httpd_monitor (ocf:heartbeat:apache) returned: 'ok' (0)
Feb 02 17:39:57 INFO: apache already running (pid 88022)

Then clear the recorded failures with pcs resource cleanup httpd_monitor, after which the resource shows as started:

# pcs resource
  * Resource Group: apache:
    * httpd_fs  (ocf::heartbeat:Filesystem):     Started hanode1.lab.local
    * httpd_vip (ocf::heartbeat:IPaddr2):        Started hanode1.lab.local
    * apache_service    (service:httpd):         Started hanode1.lab.local
    * httpd_monitor     (ocf::heartbeat:apache):         Started hanode1.lab.local
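
Since the fix depends on PidFile being set explicitly, it may help to check each node's config before running the cleanup. The following is a minimal sketch, assuming the config path used in this answer; has_pidfile is a hypothetical helper name, and it only inspects the one file given, without following Include directives.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given Apache config file contains an
# uncommented PidFile directive (Apache directive names are case-insensitive).
# Only the named file is inspected; Include'd files are not followed.
has_pidfile() {
    conf="${1:-/etc/httpd/conf/httpd.conf}"
    grep -Eiq '^[[:space:]]*PidFile[[:space:]]+[^[:space:]]' "$conf"
}
```

For example, on each node: has_pidfile && echo "PidFile is set" || echo "PidFile is missing". A commented-out line such as "# PidFile ..." counts as missing.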

Kudos to @cleverpig.
