Linux – What gets reset or changed after a reboot in Linux? Very strange issue: LDAPS stops working only after a reboot

ldap, linux, service, ssl, start-stop-daemon

I have a CentOS 6.3 VirtualBox VM snapshot with LDAPS (OpenLDAP) set up. I set it up a few days ago by following tips from various sources, and wrote everything down afterwards. But when I try to repeat the installation by following my own instructions, it fails: the SSL handshake is dropped as if the server's SSL certificates were completely misconfigured, as though I had pointed the server config at a non-existent certificate file. I'm running all checks locally (using ldapsearch and "openssl s_client").

To make things even worse, LDAPS on my snapshot stops working with the same issue after a reboot. Restarting the slapd/nslcd/nscd services without rebooting does not break it 0_o. Copying the exact configs and certs to a clean snapshot of the same VM (one without LDAPS set up) and configuring it properly does not work either. That's why the issue does not seem to be related to configuration or certificates. I've spent more than 10 hours digging, but I still very much want to understand the cause.
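The local handshake check described above could also be scripted, so that the same probe can be repeated before and after a reboot. This is only a minimal sketch: `probe_tls` is a hypothetical helper name, and certificate verification is deliberately switched off because the point is to see whether the handshake completes at all, not whether the certificate is trusted.

```python
import socket
import ssl


def probe_tls(host, port, timeout=5.0):
    """Attempt a TLS handshake and return (ok, detail).

    On success, detail is the negotiated protocol version.
    On failure, detail is the exception type and message.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Verification is disabled on purpose (see the note above the block).
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return True, tls.version()
    except OSError as exc:  # ssl.SSLError is a subclass of OSError
        return False, f"{type(exc).__name__}: {exc}"
```

For example, `probe_tls("localhost", 636)` would target a server on the standard LDAPS port. A recent Python/OpenSSL build may refuse the older protocol versions that a CentOS 6 server offers, so a failure here is a hint rather than proof of a certificate problem.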

For educational reasons, it's important to me to understand why this issue occurs only after a reboot and not after a service restart. Please feel free to post any ideas about things that are reset to defaults, wiped, or otherwise changed when a Linux host reboots. In other words, how can a system reboot differ from a service restart, from the point of view of a single service captured in a VM snapshot?

I've already checked:

  • Of course, logs/netstat/ps
  • the tmp directory (it's cleaned on every reboot, but does not contain any related files)
  • environment variables
  • date (in the snapshot, the date is wrong; fixing the date and restarting the services does not break LDAPS)
  • hostname/IP (I'm using a manually assigned IP for this instance. After rebooting and restoring the network settings, I tried restarting the services, with no success)
  • service arguments and the slapd.args file in the /var/run directory
  • writing garbage into a service's config files and restarting it to confirm that exactly this file is used
  • /etc/env / .bashrc / .bash_aliases files have NOT been modified and should not interfere

Maybe SELinux is the cause (it may have been disabled in the snapshot; I'll check tomorrow at work).
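One way to compare "before reboot" with "after reboot" more systematically is to record a few pieces of host state and diff them. The sketch below is an illustration under assumptions: `collect_state` and `diff_state` are hypothetical names, and the facts gathered (SELinux mode via `getenforce`, hostname, UTC date, mounted filesystems) are just the ones discussed here, not a complete list of what a reboot can change.

```python
import socket
import subprocess
import time


def run(cmd):
    """Return the stripped stdout of cmd, or a marker if it cannot be run."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.SubprocessError) as exc:
        return f"<unavailable: {type(exc).__name__}>"
    return result.stdout.strip()


def read(path):
    """Return the stripped contents of path, or a marker if unreadable."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError as exc:
        return f"<unavailable: {type(exc).__name__}>"


def collect_state():
    """Gather a small, illustrative set of facts about the running host."""
    return {
        "selinux": run(["getenforce"]),  # Enforcing / Permissive / Disabled
        "hostname": socket.gethostname(),
        "utc_date": time.strftime("%Y-%m-%d", time.gmtime()),
        "mounts": read("/proc/mounts"),
    }


def diff_state(before, after):
    """Return {key: (before, after)} for every key whose value changed."""
    keys = sorted(set(before) | set(after))
    return {
        key: (before.get(key), after.get(key))
        for key in keys
        if before.get(key) != after.get(key)
    }
```

The result of `collect_state()` could be saved with `json.dump` while LDAPS still works, then loaded after the reboot and passed to `diff_state` together with a fresh `collect_state()`.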

Any other suggestions? Too tired to fight further today…

Best Answer

Failing SSL connections are often caused by the clock being out of sync. VMs tend to drift, so make sure you run ntpd on all your VMs and that ntpdate is run at boot before ntpd starts.
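To see whether the clock is actually the problem, the system time can be compared with the certificate's validity window: a certificate is rejected if the current time is before its notBefore date or after its notAfter date. The following is a minimal sketch; `clock_within_validity` is a hypothetical helper, and the date strings are assumed to be in the format printed by `openssl x509 -noout -dates`.

```python
import ssl
import time


def clock_within_validity(not_before, not_after, now=None):
    """Return True if `now` lies inside the certificate's validity window.

    not_before / not_after: strings such as "Jan  5 09:34:43 2018 GMT".
    now: seconds since the epoch; defaults to the current system time.
    """
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return start <= now <= end
```

If this returns False on a freshly booted VM but True after the clock has been corrected, the time skew described above is a likely culprit.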
