The Don’t Repeat Yourself (DRY) principle in documentation


Dave Thomas, one of the authors of the Don't Repeat Yourself principle, said:

DRY says that every piece of system knowledge should have one
authoritative, unambiguous representation. Every piece of knowledge in
the development of something should have a single representation. A
system's knowledge is far broader than just its code. It refers to
database schemas, test plans, the build system, even documentation.

I have difficulty understanding how it applies to the documentation of program code.

The Java API's java.util.Arrays class has a copyOf method with eight overloads for primitive array types. These eight methods can be documented in two ways that convey the same information. The first way describes all the overloads in a single description and highlights the differences between them. The second way gives each overload its own description. Below are both ways – I bolded the differences in the content. The second way was copied literally from the API documentation of the java.util.Arrays class.

Is the second way compliant with the DRY principle? In my opinion it isn't, because it contains text that was copied multiple times, with minor modifications then made to each copy. The same information can be provided in one text that is almost eight times shorter. But I am not sure I understand the definition of the DRY principle correctly. What are the "pieces of system knowledge" in this case? Do they have a single, unambiguous, authoritative representation in the quoted documentation?

Documentation of the overloaded copyOf() methods – 1st way

public static byte[]    copyOf(byte[]    original, int newLength)
public static short[]   copyOf(short[]   original, int newLength)
public static int[]     copyOf(int[]     original, int newLength)
public static long[]    copyOf(long[]    original, int newLength)
public static float[]   copyOf(float[]   original, int newLength)
public static double[]  copyOf(double[]  original, int newLength)
public static char[]    copyOf(char[]    original, int newLength)
public static boolean[] copyOf(boolean[] original, int newLength)

Copies the specified array, truncating or padding with the default
values (if necessary) so the copy has the specified length. For all
indices that are valid in both the original array and the copy, the
two arrays will contain identical values. For any indices that are
valid in the copy but not the original, the copy will contain the
default value. Such indices will exist if and only if the specified
length is greater than that of the original array.

The following table lists the default value for each array element type:

type                     default value
byte, short, int, long   0
float, double            0.0
char                     null character ('\u0000')
boolean                  false

Parameters:

  • original – the array to be copied
  • newLength – the length of the copy to be returned

Returns: a copy of the original array, truncated or padded with the
default values to obtain the specified length

Throws:

  • NegativeArraySizeException – if newLength is negative
  • NullPointerException – if original is null

Since: 1.6
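To make the documented behavior concrete, here is a minimal sketch of what the description above says about truncation and padding. The class name CopyOfDemo and its helper methods are hypothetical and exist only for this illustration; the only library call used is java.util.Arrays.copyOf itself.

```java
import java.util.Arrays;

public class CopyOfDemo {
    // newLength > original length: extra slots hold the element type's default value (0 for int).
    public static int[] padded() {
        return Arrays.copyOf(new int[] {1, 2, 3}, 5);
    }

    // newLength < original length: the copy is truncated.
    public static int[] truncated() {
        return Arrays.copyOf(new int[] {1, 2, 3}, 2);
    }

    // Same overload family, different element type: padding uses false for boolean.
    public static boolean[] paddedFlags() {
        return Arrays.copyOf(new boolean[] {true}, 3);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(padded()));      // [1, 2, 3, 0, 0]
        System.out.println(Arrays.toString(truncated()));   // [1, 2]
        System.out.println(Arrays.toString(paddedFlags())); // [true, false, false]
    }
}
```

The behavior is the same for every primitive overload; only the padding value depends on the element type, which is exactly the part that differs between the documentation texts.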

Documentation of the overloaded copyOf() methods – 2nd way

public static byte[] copyOf(byte[] original, int newLength)

Copies the specified array, truncating or padding with zeros (if
necessary) so the copy has the specified length. For all indices that
are valid in both the original array and the copy, the two arrays will
contain identical values. For any indices that are valid in the copy
but not the original, the copy will contain (byte)0. Such indices
will exist if and only if the specified length is greater than that of
the original array.

Parameters:

  • original – the array to be copied
  • newLength – the length of the copy to be returned

Returns: a copy of the original array, truncated or padded with
zeros to obtain the specified length

Throws:

  • NegativeArraySizeException – if newLength is negative
  • NullPointerException – if original is null

Since: 1.6

public static short[] copyOf(short[] original, int newLength)

Copies the specified array, truncating or padding with zeros (if
necessary) so the copy has the specified length. For all indices that
are valid in both the original array and the copy, the two arrays will
contain identical values. For any indices that are valid in the copy
but not the original, the copy will contain (short)0. Such indices
will exist if and only if the specified length is greater than that of
the original array.

Parameters:

  • original – the array to be copied
  • newLength – the length of the copy to be returned

Returns: a copy of the original array, truncated or padded with
zeros to obtain the specified length

Throws:

  • NegativeArraySizeException – if newLength is negative
  • NullPointerException – if original is null

Since: 1.6

and so on for: int, long, float, double, char, boolean. It is exactly the same text, eight times, where only the bold parts are different.

Best Answer

About this specific problem:
From a technical perspective, the documentation of Arrays.copyOf is DRY; from my perspective as a developer, it is not.

The relevant information for each method is the input parameters, the output, the special behavior for different lengths, and the possible errors.
This information is nearly identical across all the overloaded implementations. From a technical perspective, each overloaded method could in theory behave differently. That means that even if this is not the case now, it could be in the future. So the information is the same, but not identical. Therefore "repeating" it is absolutely correct and does not violate the DRY principle.

On the other hand, the intention behind the whole family of methods is to provide exactly the same functionality for different data types. That means the intention is for those functions to evolve together. In that case the information is not just the same, but identical. From this perspective, therefore, the documentation violates the DRY principle.

When we are in such a situation ourselves and have to decide how to write documentation, and whether or not to follow DRY:

The question is what we want to achieve. DRY is not a goal. It's a strategy to achieve a goal.

If you want to avoid having to change much in the documentation when your code changes, then use DRY.

But, as in code, this comes at a cost. It means you will have to write more "generic" documentation. Generic text is in most cases more complex, which means harder to understand. Hmm, does that violate the KISS (keep it simple, stupid) principle now? Yep :-(

Therefore the first question is what you want to achieve. Then you have to balance the principles.

For example, if I have some code that will change very rarely, and people will often need some of its functions but rarely all of them, then I would prefer to describe each function independently. It makes understanding easier, and the additional maintenance effort is not that important, because I expect there will not be many changes.

If my code changes often, and/or most people will need the whole thing and not only parts, then a more condensed, generic style would be appropriate.
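As a rough illustration of the trade-off, the sketch below keeps one authoritative wording and derives a per-overload description from it, so readers still get a separate text for each method while the writer maintains a single template. Everything here is an assumption made for the example: the class name CopyOfDocTemplate, its methods, and the shortened template text are hypothetical and are not how the JDK documentation is produced. The padding values for byte and short come from the quoted documentation; the others follow the default-value table in the question rather than the exact JDK wording.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CopyOfDocTemplate {
    // Single source of the shared wording; %1$s is the element type, %2$s the padding value.
    static final String TEMPLATE =
        "public static %1$s[] copyOf(%1$s[] original, int newLength)\n\n"
        + "For any indices that are valid in the copy but not the original, "
        + "the copy will contain %2$s.";

    // The only knowledge that really differs between the overloads.
    public static Map<String, String> padValues() {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("byte", "(byte)0");
        values.put("short", "(short)0");
        values.put("int", "0");
        values.put("long", "0");
        values.put("float", "0.0");
        values.put("double", "0.0");
        values.put("char", "the null character");
        values.put("boolean", "false");
        return values;
    }

    // Renders the description for one overload from the shared template.
    public static String describe(String elementType) {
        String pad = padValues().get(elementType);
        if (pad == null) {
            throw new IllegalArgumentException("unknown element type: " + elementType);
        }
        return String.format(TEMPLATE, elementType, pad);
    }

    public static void main(String[] args) {
        for (String type : padValues().keySet()) {
            System.out.println(describe(type));
            System.out.println();
        }
    }
}
```

Whether this is worth it depends on the goal discussed above: the template is harder to read than any single rendered description, but a change to the shared wording is made in exactly one place.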

Therefore, in my eyes, we should know those principles and never completely ignore them. But it's not about following a principle blindly; it's about realising which principle best supports me, in the current context, in achieving my specific goals.
In my experience, the hardest part is determining good, precise goals. But that's another topic. :-)

Updated (added the initial part) because the question was sharpened. Thanks @iwis for the question.
