File Handling – Where to Load and Store Settings from a File?

Tags: design, file handling, language-agnostic

I think this question applies to most programs that load settings from a file. I'm asking from a programming point of view: how should the loading of settings from a file be handled in terms of classes and accessibility? For instance:

If a program had a simple settings.ini file, should its contents be loaded in a load() method of a class, or perhaps the constructor?
Should the values be stored in public static variables, or should there be static methods to get and set properties?
What should happen in the event of the file not existing or not being readable? How would you let the rest of the program know it can't get those properties?
etc.

I'm hoping I'm asking this in the right place here. I wanted to make the question as language agnostic as possible, but I'm mainly focusing on languages that have things like inheritance – especially Java and C#.NET.

Best Answer

This is actually a really important question, and it is often done badly because it is not given enough attention, even though it is a core part of pretty much every application. Here are my guidelines:

Your config class, which contains all the settings, should be just a plain old data type (a struct or class):

class Config {
    int prop1;
    float prop2;
    SubConfig subConfig;
}

It should not need to have methods and should not involve inheritance (unless it is the only choice you have in your language for implementing a variant field - see next paragraph). It can and should use composition to group the settings into smaller specific configuration classes (e.g. subConfig above). If you do it this way it will be ideal to pass around in unit tests and the application in general as it will have minimal dependencies.
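As a rough sketch of that shape in Python, using dataclasses: only `prop1`, `prop2` and the sub-config come from the example above, while `host`, `port` and `describe` are made-up names for illustration.

```python
from dataclasses import dataclass


@dataclass
class SubConfig:
    # Hypothetical group of related settings.
    host: str
    port: int


@dataclass
class Config:
    prop1: int
    prop2: float
    sub_config: SubConfig  # composition, not inheritance


def describe(sub: SubConfig) -> str:
    # A component that depends only on the sub-config it needs.
    return f"{sub.host}:{sub.port}"


# In a unit test the config is built directly; no file is involved.
test_conf = Config(prop1=1, prop2=0.5, sub_config=SubConfig("localhost", 8080))
print(describe(test_conf.sub_config))
```

Because the classes carry no behaviour and no file access, a test can construct exactly the values it needs in one line.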

You'll likely need to use variant types, in case configs for different setups are heterogeneous in structure. It is accepted that you'll need to put a dynamic cast in at some point when you read the value to cast it to the right (sub-)configuration class, and no doubt this will be dependent on another config setting.
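One possible reading of this, sketched in Python: a storage section that has one of two different shapes, selected by a `kind` setting. All names here (`FileStorage`, `DbStorage`, `kind`, `parse_storage`, `open_storage`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class FileStorage:
    path: str


@dataclass(frozen=True)
class DbStorage:
    url: str
    pool_size: int


# The "variant": a storage section is one of these two shapes.
Storage = Union[FileStorage, DbStorage]


def parse_storage(raw: dict) -> Storage:
    # The "kind" setting decides which sub-config class the rest maps to.
    kind = raw["kind"]
    if kind == "file":
        return FileStorage(path=raw["path"])
    if kind == "db":
        return DbStorage(url=raw["url"], pool_size=int(raw["pool_size"]))
    raise ValueError(f"unknown storage kind: {kind!r}")


def open_storage(storage: Storage) -> str:
    # The one place where the variant is narrowed to a concrete type.
    if isinstance(storage, FileStorage):
        return f"file at {storage.path}"
    return f"database at {storage.url} ({storage.pool_size} connections)"
```

The `isinstance` check plays the role of the dynamic cast: it happens once, at the point where the value is consumed.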

You should not be lazy about typing in all the settings as fields by just doing this:

class Config {
    Dictionary<string, string> values;
};

This is tempting as it means you can write a generalised serialisation class that doesn't need to know with what fields it's dealing, but it is wrong and I'll explain why in a moment.

Serialisation of the config is done in a completely separate class. Whatever API or library you use to do this, the body of your serialisation function should contain entries that basically amount to a map from the path/key in the file to the field on the object. Some languages provide good introspection and can do this for you out of the box; in others you'll have to write the mapping explicitly, but the key thing is that you should only have to write the mapping once. E.g. consider this extract I adapted from the C++ Boost program options parser documentation:

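#include <boost/program_options.hpp>
namespace po = boost::program_options;

struct Config {
   int opt;
} conf;
po::options_description desc("Allowed options");
desc.add_options()
    // "optimization" is stored in conf.opt and must parse as an int
    ("optimization", po::value<int>(&conf.opt)->default_value(10));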

Note that the last line basically says that "optimization" maps to Config::opt, and that it also declares the type you expect. You want reading the config to fail if the type is not what you expect: if the parameter in the file is not really a float or an int, or doesn't exist. In other words, failure should occur when you read the file, because the problem is with the format/validation of the file, and you should throw an exception or return an error code that reports the exact problem. You should not delay this until later in the program. That's why you should not be tempted by the catch-all Dictionary-style Config mentioned above: it won't fail when the file is read, because casting is delayed until the value is needed.
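For a settings.ini file like the one in the question, a fail-fast reader might look roughly like this in Python with the standard `configparser` module. The `[main]` section, the `ratio` key and the `ConfigError` and `load_config` names are assumptions made up for the sketch.

```python
import configparser
from dataclasses import dataclass


class ConfigError(Exception):
    """Raised when the settings file is missing, incomplete or malformed."""


@dataclass(frozen=True)
class Config:
    opt: int
    ratio: float


def load_config(path: str) -> Config:
    parser = configparser.ConfigParser()
    try:
        # read() silently skips files it cannot open, so check its result.
        if not parser.read(path):
            raise ConfigError(f"cannot read settings file: {path}")
        section = parser["main"]
        # The single place where file keys are mapped to typed fields.
        return Config(
            opt=section.getint("optimization", fallback=10),  # optional
            ratio=float(section["ratio"]),  # required: KeyError if absent
        )
    except configparser.Error as exc:
        raise ConfigError(f"{path}: not a valid ini file: {exc}") from exc
    except KeyError as exc:
        raise ConfigError(f"{path}: missing section or key {exc}") from exc
    except ValueError as exc:
        raise ConfigError(f"{path}: {exc}") from exc
```

A caller would run `load_config("settings.ini")` once at start-up and handle `ConfigError` there, so a bad value such as `optimization = fast` is reported before any other part of the program runs.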

You should make the Config class read-only in some fashion - setting the contents of the class once when you create it and initialise it from file. If you need to have dynamic settings in your application that change, as well as const ones that don't, you should have a separate class to handle the dynamic ones rather than trying to allow bits of your config class to be not read-only.
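In Python, one way to get this split is a frozen dataclass for the fixed settings and an ordinary class for the ones that change. The field names and the `RuntimeSettings` class below are invented for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Config:
    # Set once at start-up, never changed afterwards.
    data_dir: str
    max_workers: int


class RuntimeSettings:
    """Hypothetical holder for values that may change while the program runs."""

    def __init__(self, log_level: str) -> None:
        self.log_level = log_level


conf = Config(data_dir="/var/data", max_workers=4)
runtime = RuntimeSettings(log_level="INFO")

runtime.log_level = "DEBUG"  # fine: dynamic settings are meant to change

try:
    conf.max_workers = 8  # rejected: the config is frozen
except FrozenInstanceError as exc:
    print(f"refused: {exc}")
```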

Ideally you read in the file in one place in your program i.e. you only have one instance of a "ConfigReader". However, if you're struggling with getting the Config instance passed around to where you need it, it is better to have a second ConfigReader than it is to introduce a global config (which I'm guessing is what the OP means by "static"), which brings me to my next point:

Avoid the seductive siren song of the singleton: "I'll save you having to pass that class around, all your constructors will be lovely and clean. Go on, it'll be so easy." The truth is that with a well-designed, testable architecture you'll hardly need to pass the Config class, or parts of it, down through that many classes of your application. What you'll find is that in your top-level class, your main() function or whatever, you unravel the conf into individual values, which you provide as arguments to your component classes, which you then put back together (manual dependency injection). A singleton/global/static conf will make unit testing your application much harder to implement and understand, e.g. it will confuse new developers on your team who will not know they have to set global state to test stuff. It will also prevent you from running unit tests in parallel.
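A small Python sketch of that manual dependency injection: every class and field name here (`Database`, `Cache`, `App`, `build_app`, `db_url` and so on) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    db_url: str
    cache_size: int
    greeting: str


class Database:
    def __init__(self, url: str) -> None:
        self.url = url


class Cache:
    def __init__(self, size: int) -> None:
        self.size = size


class App:
    # Components receive plain values or other components, never the Config.
    def __init__(self, db: Database, cache: Cache, greeting: str) -> None:
        self.db = db
        self.cache = cache
        self.greeting = greeting


def build_app(conf: Config) -> App:
    # The only place that knows about Config: unravel it and wire things up.
    return App(Database(conf.db_url), Cache(conf.cache_size), conf.greeting)


def main() -> None:
    # In a real program this Config would come from the config reader.
    conf = Config(db_url="sqlite:///app.db", cache_size=128, greeting="hello")
    app = build_app(conf)
    print(app.greeting, app.db.url, app.cache.size)


if __name__ == "__main__":
    main()
```

A test for `Cache` only needs `Cache(16)`; it never has to construct a `Config` or touch any global state, which is also what keeps tests safe to run in parallel.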

If your language supports properties you should use them for this purpose. The reason is it means it will be very easy to add 'derived' configuration settings that depend on one or more other settings. e.g.

int Prop1 { get; }
int Prop2 { get; }
int Prop3 { get { return Prop1 * Prop2; } }

If your language doesn't natively support the property idiom, it may have a workaround to achieve the same effect, or you can simply create a wrapper class that provides the bonus settings. If you can't get the benefit of properties any other way, it is a waste of time to write and use getters/setters by hand simply to please some OO-god. You'll be better off with a plain old field.
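For comparison, the same derived setting sketched in Python, where `@property` is the equivalent idiom. The snake_case names mirror `Prop1` to `Prop3` above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    prop1: int
    prop2: int

    @property
    def prop3(self) -> int:
        # Derived setting: computed from the stored ones, never stored itself.
        return self.prop1 * self.prop2


conf = Config(prop1=3, prop2=4)
print(conf.prop3)  # prints 12
```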

You might need a system to merge multiple configs taken from different places in order of precedence. That order of precedence should be well defined and understood by all developers and users, e.g. consider the Windows registry's HKEY_CURRENT_USER/HKEY_LOCAL_MACHINE. You should do this in a functional style so that you can keep your configs read-only, i.e.:

final_conf = merge(user_conf, machine_conf)

rather than:

conf.update(user_conf)
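A minimal sketch of such a merge in Python, under the assumption that each source is represented by a frozen dataclass whose unset values are `None`. `PartialConf`, `theme` and `font_size` are illustrative names.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class PartialConf:
    # None means "this source does not set the value".
    theme: Optional[str] = None
    font_size: Optional[int] = None


def merge(high: PartialConf, low: PartialConf) -> PartialConf:
    """Return a new config; values set in `high` win over those in `low`."""
    overrides = {
        f.name: getattr(high, f.name)
        for f in fields(high)
        if getattr(high, f.name) is not None
    }
    # replace() builds a new instance and leaves both inputs untouched.
    return replace(low, **overrides)


machine_conf = PartialConf(theme="light", font_size=10)
user_conf = PartialConf(theme="dark")

final_conf = merge(user_conf, machine_conf)
print(final_conf)
```

Here the user's `theme` overrides the machine-wide one while `font_size` falls through from the machine config, and neither input object is modified.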

I should finally add that of course if your chosen framework/language provides its own built-in, well-known configuration mechanisms you should consider the benefits of using that instead of rolling your own.

So. Lots of aspects to consider - get it right and it will profoundly affect your application architecture, reducing bugs, making things easily testable and sort of forcing you to use good design elsewhere.
