Java – Tradeoff between clean code, duplicate code and code efficiency in Java

clean-code coding-style java

I have a question on writing clean code. I’m trying to refactor the following method:

private static Map<String, String> createMapOfAttributes(
        final String Id,
        final String attributes, 
        final Map<String, String> invalidLines
) {          
    final String[] arrayOfAttributes = attributes.split(";");
    final int numberOfAttributes = arrayOfAttributes.length;
    final Map<String, String> mapOfAttributes = 
        new HashMap<String, String>(numberOfAttributes);

    for (int i = 0; i < numberOfAttributes; i++) {
        final String attributeEntry = arrayOfAttributes[i];
        if (attributeEntry == null || attributeEntry.isEmpty()) {
            continue;
        }

        //extract family and attribute
        final int attributeEntryDelimitPosition =  
            attributeEntry.indexOf("=");

        final String family = 
            attributeEntry.substring(0, attributeEntryDelimitPosition).trim();

        final String attribute =
            attributeEntry.substring(attributeEntryDelimitPosition + 1).trim();

        final String familyAndAttribute = family + '=' + attribute;

        final String previousFamilyAndAttribute = 
            mapOfAttributes.put(family,familyAndAttribute);

        if (previousFamilyAndAttribute != null) {
            invalidLines.put(Id, family);
        }
    }
    return mapOfAttributes;
}

So the first two arguments are inputs, the last argument is an output argument that the method modifies, and the method also returns a value.

One guideline for writing clean code is that a method should do only one thing, and this method violates that guideline.

When I try to separate the things the method does, I run into a problem: the last four statements of the for loop both fill mapOfAttributes and test whether an entry already existed in the map; if it did, the entry is collected in the invalidLines map.

When I try to separate those two things, I come up with the following: one method returns the mapOfAttributes and a second method returns the invalidLines map. In the second method I would need to test for each entry whether it is a duplicate, possibly by adding it to a map again and thereby doing the same work as the first method (leading to duplicate code and duplicate computation). Furthermore, I would need the code that extracts family and attribute in both methods, which also leads to duplicate code.

So my question is, what would be your take on this? How would you refactor the method?
And also, in more general terms, are readable code, efficient code and avoiding code duplication conflicting goals that sometimes cannot all be satisfied at the same time? (Which in this case might mean that there is no satisfying solution?)

Best Answer

I think you're taking the "do one thing" advice too literally. In my view, this method does do only one thing: it parses input in a text format into an internal data structure. That your data structure has two parts (the valid data and the list of invalid items) is, as I see it, irrelevant. The parsing process is a single, atomic concern.

Things may be clearer if you collect the resulting map of valid items and the list of invalid ones in a class, so that you can return them both as one object. Output parameters are a design smell; the results of a function should be in its return value wherever possible.
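As a rough sketch of what such a result class might look like (the class name ParsedAttributes, its accessors and the parse method are made up for illustration, and skipping entries that contain no '=' is an assumption; the original method would throw on them):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical result type: bundles both outcomes of a single parsing pass.
final class ParsedAttributes {
    private final Map<String, String> attributes;
    private final List<String> duplicateFamilies;

    private ParsedAttributes(Map<String, String> attributes,
                             List<String> duplicateFamilies) {
        this.attributes = Collections.unmodifiableMap(attributes);
        this.duplicateFamilies = Collections.unmodifiableList(duplicateFamilies);
    }

    Map<String, String> getAttributes() {
        return attributes;
    }

    List<String> getDuplicateFamilies() {
        return duplicateFamilies;
    }

    boolean isValid() {
        return duplicateFamilies.isEmpty();
    }

    // Parses "family=attribute;family=attribute;..." in one pass,
    // with no output parameter.
    static ParsedAttributes parse(final String attributes) {
        final Map<String, String> mapOfAttributes = new HashMap<>();
        final List<String> duplicateFamilies = new ArrayList<>();

        for (final String attributeEntry : attributes.split(";")) {
            final int delimiterPosition = attributeEntry.indexOf('=');
            if (delimiterPosition < 0) {
                // Assumption: entries without '=' (including empty ones) are skipped.
                continue;
            }
            final String family =
                attributeEntry.substring(0, delimiterPosition).trim();
            final String attribute =
                attributeEntry.substring(delimiterPosition + 1).trim();

            // put() returns the previous value, so a non-null result
            // means this family was already seen.
            if (mapOfAttributes.put(family, family + '=' + attribute) != null) {
                duplicateFamilies.add(family);
            }
        }
        return new ParsedAttributes(mapOfAttributes, duplicateFamilies);
    }
}
```

The caller, which knows the line's id, could then decide what to do with getDuplicateFamilies(), for example record them in its own invalidLines map. The parsing and the duplicate check still happen in one loop, so nothing is computed twice.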
