For about a year we've been using openldap on Ubuntu Server 10.04 LTS to authenticate about 20 IT users, and everything has been running fine (the operations on the LDAP server were basically limited to creating/removing users using Apache Directory Studio).
More recently (6 months ago) we've also started implementing openldap (openldap-2.4.21/debian) as an external authentication system for our website, which is being migrated from an external CMS to a new platform we're developing in house using Drupal. We have a 45K-user database and things haven't been going smoothly at all. The issues we've had are:
- ldap crashing after a backup restore, needing to be recovered.
- the ldap recovery tool being unable to recover the ldap database on some occasions.
- slapd consuming 100% CPU while there is no authentication activity on the website.
Due to lack of resources and knowledge internally, all we've done so far is to find ways of keeping LDAP running without really investigating any of these issues (use monit to restart it when it crashes, db_recover to recover the db if needed, and slapcat to recreate the db from scratch when db_recover fails).
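The recovery routine described above can be sketched roughly as follows. This is only an illustration under assumptions, not the setup actually in use: the function name `recover_ldap_db`, the `.damaged` directory and the paths are hypothetical; the Berkeley DB tool may be installed under a versioned name instead of `db_recover`; the LDIF file is assumed to come from an earlier `slapcat -l` dump and is loaded back with `slapadd`; and slapd is assumed to be stopped before any of this runs.

```python
import shutil
import subprocess
from pathlib import Path


def recover_ldap_db(db_dir, ldif_backup, run=subprocess.run):
    """Try db_recover first; if it fails, reload the database from an LDIF dump.

    Returns "recovered" or "reloaded". slapd must already be stopped.
    `run` is injectable so the flow can be exercised without touching a real server.
    """
    db_dir = Path(db_dir)

    # Step 1: let Berkeley DB replay its logs in the database directory.
    result = run(["db_recover", "-h", str(db_dir)])
    if result.returncode == 0:
        return "recovered"

    # Step 2: recovery failed. Move the damaged files aside (keeping DB_CONFIG)
    # rather than deleting them, so they can still be inspected later.
    damaged = db_dir.with_name(db_dir.name + ".damaged")  # hypothetical location
    damaged.mkdir(exist_ok=True)
    for path in list(db_dir.iterdir()):
        if path.is_file() and path.name != "DB_CONFIG":
            shutil.move(str(path), str(damaged / path.name))

    # Step 3: rebuild the database from the LDIF dump.
    # check=True makes subprocess.run raise CalledProcessError if slapadd fails.
    run(["slapadd", "-l", str(ldif_backup)], check=True)
    return "reloaded"
```

If something like this runs as root, the rebuilt files will probably need their ownership changed back to the account slapd runs as before the server is restarted.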
Recently we've had a round of interviews to hire a senior infrastructure engineer to assist us with the various infrastructure issues we're running into. Several candidates confirmed they've either had or heard about issues with openldap in large production environments and never managed to come up with a single stable standalone openldap server, but instead had to come up with redundant deployments (replication, load balancing, auto-recovery/restart routines) to keep ldap running. Some candidates even said that openldap just wasn't fit for production environments and that using alternatives such as Novell eDirectory was necessary.
Q: If you have experience dealing with ldap in production environments with thousands of users, do you have facts to share which tend to prove that openldap is indeed unstable for such setups and that using another ldap server is indeed recommended?
Best Answer
I use OpenLDAP supporting a user-base of about 10,000 active users who rely on it throughout the day for everything. Problems are rare. Many services rely on it, for authentication and other things.
However, we have 4 read-only replicas (slaves/consumers) behind a load-balancer, a hidden master and a hot standby master. Used to be 2 front-end servers, but we had load problems during certain peak times (when 4,000 or so of those users were desperately trying to hit it at the same second). All write access to LDAP is via our code.
That equipment and OS is all old and we're working on replacing it with a new setup that will go back to only 2 replicas (that aren't doing as many other things) and "mirror mode" replication between a pair of masters in an HA configuration. Again, problems are rare.
We used to have some problems with replication failing, but that's mostly from when we were using slurpd instead of syncrepl. Also, unclean shutdowns of a server can corrupt the data.
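One way to notice failing replication early is to compare the `contextCSN` value on the provider with the one on each consumer. The sketch below only does the comparison; it is not part of my setup. The function names are made up for illustration, it assumes the 2.4-style CSN format (`YYYYmmddHHMMSS.uuuuuuZ#...`), and it assumes you have already fetched one `contextCSN` value per server yourself, for example with a base-scope `ldapsearch` on the suffix entry. With multiple masters there is one value per server ID, which this does not handle.

```python
from datetime import datetime


def csn_timestamp(csn):
    """Extract the timestamp part of a contextCSN value as a datetime."""
    # The CSN looks like "20120301120000.000000Z#000000#000#000000";
    # only the part before the first "#" is the timestamp.
    stamp = csn.split("#", 1)[0]
    return datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ")


def replication_lag_seconds(provider_csn, consumer_csn):
    """Seconds by which the consumer's contextCSN trails the provider's."""
    delta = csn_timestamp(provider_csn) - csn_timestamp(consumer_csn)
    return delta.total_seconds()
```

A lag that keeps growing while writes are happening on the master suggests the consumer has stopped syncing; what counts as too much lag depends on your write rate.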
Keys to running OpenLDAP in a large-scale production environment, in my experience:
Basically, though, if it's a key part of your infrastructure, somebody on your team should really understand it well.
Addendum: By request, the DB_CONFIG file from my openldap DB directory. Look at http://docs.oracle.com/cd/E17076_02/html/api_reference/C/configuration_reference.html for details.