Best Practices – Handling Large Number of Configuration Files

Tags: configuration, configuration-management, properties

Imagine a system that has a large number of servers. Each of them has a number of settings:

  • Some specific to the server
  • Some specific to the region
  • Some common across all of them
  • Possibly some custom groupings, for example a group of servers that is read-only
  • etc.

The current practice I have in mind is a simple property structure that supports overriding.

Let's take Google servers as an example. Each of them has a list of settings files to load.

For example, the London server may have:

rootsettings.properties, europesettings.properties, londonsettings.properties, searchengine.properties, etc.

Each file contains a set of properties, and the loading sequence lets files loaded later override properties set by files loaded earlier.

For example, rootsettings.properties may set accessible=false as a default, which searchengine.properties then overrides with accessible=true.
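
A minimal sketch of this override scheme in Python, assuming plain key=value files. The file contents are inlined so the example is self-contained, and every property other than accessible is made up for illustration:

```python
def parse_properties(text):
    """Parse simple key=value lines; blank lines and # comments are skipped."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def load_layers(layers):
    """Merge layers in loading order; later layers override earlier ones."""
    merged = {}
    for text in layers:
        merged.update(parse_properties(text))
    return merged


# Hypothetical contents of the four files named above.
rootsettings = "accessible=false\ntimeout=30"
europesettings = "region=europe"
londonsettings = "city=london"
searchengine = "accessible=true"

london = load_layers([rootsettings, europesettings, londonsettings, searchengine])
print(london["accessible"])  # value from searchengine wins over the root default
```

Note that nothing in this scheme restricts which keys a given layer may define.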


The problem I have with this structure is that it gets out of control very easily. It is not structured at all: you can define any property at any level, and many entries can become obsolete.

Furthermore, changing a middle level becomes practically impossible as the network grows, because the change now affects a very large number of servers.

Last but not least, each individual instance may need one special property, so the tree ends up with a config file for each server anyway, which makes this far from an optimal solution.

I would greatly appreciate any suggestions or ideas for a better configuration management architecture.

Best Answer

I think you need to ask yourself some questions first, and clarify some points, then you can better decide how to solve your problem.

First: who shall have control over the servers?

  • Is it a single administrator who is going to control hundreds of servers? Then you need to centralize the configuration as much as possible.

  • Or is each server potentially under the control of an individual admin who does not want their settings overruled or controlled by a centralized configuration? Then you should focus on decentralized configuration. If each admin has at most a handful of servers to manage, this is still manageable manually.

Second: do you really need tons of configuration options, or can you keep the number down to a few? Instead of making everything configurable "just in case", restrict yourself to the options you know your system really needs. This can be done, for example, by

  • making your software a little bit smarter (for example, what can the program determine automatically by querying its environment?)

  • following "convention over configuration" rigidly - for example, by establishing certain naming conventions, or by deriving some options as a default from other options
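
To illustrate both points, here is a rough Python sketch. It assumes a hypothetical host naming convention of the form role-city-region-index; the setting names and the rule that only "search" hosts are accessible are invented for the example:

```python
import os
import socket


def derive_defaults(hostname):
    """Derive settings from a hypothetical convention: <role>-<city>-<region>-<index>."""
    role, city, region, index = hostname.split("-")
    return {
        "role": role,
        "city": city,
        "region": region,
        # Default derived from another option (illustrative rule only).
        "accessible": "true" if role == "search" else "false",
        # Determined by asking the environment instead of configuring it per server.
        "worker_threads": str(os.cpu_count() or 1),
    }


def defaults_for_this_host():
    """On a real server, the name would come from the environment."""
    return derive_defaults(socket.gethostname())


settings = derive_defaults("search-london-europe-01")
print(settings["region"], settings["accessible"])
```

With a convention like this, the region and role no longer need their own entries in any file; only genuine exceptions have to be written down.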

Third: do you really want a configuration hierarchy hardcoded into the software? Imagine you don't know beforehand how many servers there will be, whether a tree-like hierarchy really is the best structure for them, or how many levels the tree needs to have. The simplest solution I can think of is to provide no centralized configuration at all, only one config file per server, and to let the responsible server administrators decide for themselves how to manage the configurations of multiple servers.

For example, the admins might write generator scripts that distribute a central config file to a group of servers and make some minor modifications to each copy. That way, you do not have to make any assumptions about the "server distribution topology" beforehand; the topology can be adjusted at any time to real-world requirements. The drawback is that you need administrators who know how to write scripts in a language like Perl, Python, Bash or PowerShell.
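
Such a generator script could look roughly like the following Python sketch. The server names, properties and output location are placeholders, and actually copying the generated files to the servers is left out:

```python
import tempfile
from pathlib import Path


def render_properties(props):
    """Render a dict as key=value lines, sorted for stable output."""
    return "".join(f"{key}={value}\n" for key, value in sorted(props.items()))


def generate_configs(central, overrides_by_server, out_dir):
    """Write one complete config file per server: central settings plus overrides."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for server, overrides in overrides_by_server.items():
        props = {**central, **overrides}  # per-server values win
        path = out_dir / f"{server}.properties"
        path.write_text(render_properties(props))
        written.append(path)
    return written


# Hypothetical central config and per-server modifications.
central = {"accessible": "false", "timeout": "30"}
overrides_by_server = {
    "london": {"accessible": "true"},
    "paris": {},
}

out_dir = tempfile.mkdtemp()
paths = generate_configs(central, overrides_by_server, out_dir)
for path in paths:
    print(path.name)
```

Each server still reads exactly one flat file, so the software itself needs no knowledge of how the admins group their servers.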