Java – Elegant Ways to Avoid Hard Coding CSV File Format

coding-style, java

I know this is a trivial issue, but I just feel this can be more elegant.

So I need to write/read data files for my program; let's say they are CSV for now. I can implement the format as I see fit, but I may need to change that format later. The simple thing to do is something like

out.write(foo.getValue()+","+bar.getMinValue()+","+fi.toString());

This is easy to write, but obviously is guilty of hard coding and the general 'magic number' issue. The format is hard-coded, requires parsing of the code to figure out the file format, and changing the format requires changing multiple methods.

I could instead have constants specifying the location where I want each variable to be saved in the CSV file, to remove some of the 'magic numbers', and then save/load into an array at the location specified by the constants:

// Column positions of each value within one CSV line
int FOO_LOCATION=0;
int BAR_MIN_VAL_LOCATION=1;
int FI_LOCATION=2;
int NUM_ARGUMENTS=3;

String[] outputArguments=new String[NUM_ARGUMENTS];
outputArguments[FOO_LOCATION] = foo.getValue();
outputArguments[BAR_MIN_VAL_LOCATION] = bar.getMinValue();
outputArguments[FI_LOCATION] = fi.toString();

writeAsCSV(outputArguments);

But this is…extremely verbose and still a bit ugly. It makes it easy to see the format of the existing CSV and to swap the location of variables within the file. However, if I decide to add an extra value to the CSV I need to not only add a new constant, but also modify the read and write methods to add the logic that actually saves/reads the argument from the array; I still have to hunt down every method using these variables and change them by hand!

If I use Java enums I can clean this up slightly, but the real issue is still present. Short of some sort of functional programming (and Java's inner classes are too ugly to be considered functional) I still have no obvious way of clearly expressing which variable is associated with each constant, other than writing (and maintaining) it in the read/write methods. For instance, I still need to write somewhere that FOO_LOCATION specifies the location of foo.getValue().

It seems as if there should be a prettier, easier-to-maintain way of approaching this?

Incidentally, I'm working in Java at the moment; however, I am interested conceptually in the design approach regardless of language. Some library in Java that does all the work for me is definitely welcome (though it may prove more hassle to get permission to add it to the codebase than to just write something by hand quickly), but what I'm really asking is more about how to write elegant code if you had to do this by hand.

Best Answer

Regardless of language, if you don't want to hard code the values, then you will need some sort of metadata that describes how the data will be mapped, formatted, and output.

In this example, it could be called a mapper or formatter.

Map/format files (XML, JSON, or similar) would describe how the data would be formatted and written. Your application would read in the map/format file and use it to create the output.

Then your application could format the CSV any way you see fit, without a programming change. One could also extend this to flat or fixed formats and XML as well. Your code is then generic, because it uses the mapping metadata to create the CSV file.
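As a rough illustration of the idea, here is a minimal sketch that reads the column order and delimiter from a format description at run time. Everything in it is an assumption made for the example: the `Sample` class stands in for the foo/bar/fi values from the question, the `fields` and `delimiter` keys are made-up names, and `java.util.Properties` is used only because it ships with the JDK (an XML or JSON map file would work the same way). In a real program the format text would come from a file rather than a string.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.function.Function;

// Hypothetical domain object standing in for the foo/bar/fi values in the question.
final class Sample {
    final int foo;
    final double barMin;
    final String fi;

    Sample(int foo, double barMin, String fi) {
        this.foo = foo;
        this.barMin = barMin;
        this.fi = fi;
    }
}

final class FormatFileSketch {
    // The one place where a field name is tied to the code that produces its value.
    static final Map<String, Function<Sample, String>> EXTRACTORS = new LinkedHashMap<>();
    static {
        EXTRACTORS.put("foo", s -> Integer.toString(s.foo));
        EXTRACTORS.put("barMin", s -> Double.toString(s.barMin));
        EXTRACTORS.put("fi", s -> s.fi);
    }

    // Renders one line using the column order and delimiter given in the format text.
    static String render(String formatText, Sample sample) {
        Properties format = new Properties();
        try {
            format.load(new StringReader(formatText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String fields = format.getProperty("fields");
        if (fields == null) {
            throw new IllegalArgumentException("Format must define 'fields'");
        }
        String delimiter = format.getProperty("delimiter", ",");
        List<String> cells = new ArrayList<>();
        for (String name : fields.split(",")) {
            Function<Sample, String> extractor = EXTRACTORS.get(name.trim());
            if (extractor == null) {
                throw new IllegalArgumentException("Unknown field in format: " + name);
            }
            cells.add(extractor.apply(sample));
        }
        return String.join(delimiter, cells);
    }

    public static void main(String[] args) {
        Sample sample = new Sample(7, 1.5, "fi-value");
        System.out.println(render("fields=foo,barMin,fi", sample));        // 7,1.5,fi-value
        System.out.println(render("fields=fi,foo\ndelimiter=;", sample));  // fi-value;7
    }
}
```

Reordering or dropping columns now means editing the format text only. Adding a brand-new value still needs one new entry in the extractor map, but that is a single place rather than every read and write method.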

For the CSV case, at a high level one would need to describe:

  • Name and Order of fields
  • Delimiter (Sometimes comma is not used)
  • Whether or not to include a header
  • Whether or not to include quotes around the data
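
To make those four items concrete, here is one possible shape for such a descriptor, written by hand with no library. The `CsvFormat` class and its member names are hypothetical, records are simplified to maps from field name to an already-formatted string value, and the sketch assumes Java 9 or later for `List.of` and `Map.of`. Treat it as a sketch of the idea rather than a complete CSV writer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical descriptor holding the four items listed above.
final class CsvFormat {
    final List<String> fieldNames;  // name and order of fields
    final String delimiter;         // "," or ";" or "\t", for example
    final boolean includeHeader;    // whether to emit a header line
    final boolean quoteValues;      // whether to wrap every cell in double quotes

    CsvFormat(List<String> fieldNames, String delimiter,
              boolean includeHeader, boolean quoteValues) {
        this.fieldNames = fieldNames;
        this.delimiter = delimiter;
        this.includeHeader = includeHeader;
        this.quoteValues = quoteValues;
    }

    // Formats records (field name -> value) as text, one line per record.
    String format(List<Map<String, String>> records) {
        StringBuilder out = new StringBuilder();
        if (includeHeader) {
            out.append(joinCells(fieldNames)).append('\n');
        }
        for (Map<String, String> record : records) {
            List<String> cells = new ArrayList<>();
            for (String name : fieldNames) {
                // A field missing from the record becomes an empty cell.
                cells.add(record.getOrDefault(name, ""));
            }
            out.append(joinCells(cells)).append('\n');
        }
        return out.toString();
    }

    // Applies quoting (doubling embedded quotes) and joins with the delimiter.
    // With quoting off, values containing the delimiter are NOT escaped.
    private String joinCells(List<String> cells) {
        List<String> rendered = new ArrayList<>();
        for (String cell : cells) {
            rendered.add(quoteValues ? "\"" + cell.replace("\"", "\"\"") + "\"" : cell);
        }
        return String.join(delimiter, rendered);
    }

    public static void main(String[] args) {
        Map<String, String> record = Map.of("foo", "7", "barMin", "1.5", "fi", "fi-value");
        CsvFormat quoted = new CsvFormat(List.of("foo", "barMin", "fi"), ";", true, true);
        CsvFormat plain = new CsvFormat(List.of("fi", "foo"), ",", false, false);
        System.out.print(quoted.format(List.of(record)));
        System.out.print(plain.format(List.of(record)));
    }
}
```

The same record produces two different layouts here purely because the descriptor differs. Populating a `CsvFormat` from a map file instead of constructor arguments is what removes the need for a code change.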

Just as a side note, the two approaches differ in development time. Hardcoded field values are much faster to develop, although, as you have pointed out, less elegant. If you need to get something done quickly, that approach is OK.

Developing something more generic would take more time up front, but if you're producing a lot of CSV files in different formats, in the long run you would get a return on that investment.

With this approach, one can also write some sort of nice "GUI" for business analysts to use to create the map files, so developers will be less involved in the overall process.
