How to organize localization string resources

internationalization, localization, naming, resources

We're developing a large application, consisting of many small packages. Each package has its own set of resource files for localization.

What's the best approach to organizing and naming the localization strings?

Here are my thoughts so far:

Handling duplicates

The same text (say, "Zip code") may occur multiple times within a given package. Programming instinct (DRY) tells me to create a single string resource shared by all occurrences.

Then again, a translator may want to choose a long translation ("Postleitzahl") in some places and a shorter one ("PLZ") in places with less space. Or we may decide to append a colon to some occurrences ("Zip code:"), but not to others. Or we may require a different capitalization ("zip code") in some places. All these arguments point to creating one resource per usage, even if their contents are identical.
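To make the trade-off concrete, here is a minimal sketch in Python of one resource per usage. The key names, the dictionary layout, and the lookup helper are all made up for illustration and are not taken from any particular localization framework. The two English entries are identical, yet the German translator is free to pick a long form for one and a short form for the other:

```python
# Hypothetical per-usage resource tables; all key names are illustrative only.
RESOURCES = {
    "en": {
        "customerFormZipCodeLabel": "Zip code",
        "customerTableZipCodeColumn": "Zip code",
    },
    "de": {
        "customerFormZipCodeLabel": "Postleitzahl",  # enough room: long form
        "customerTableZipCodeColumn": "PLZ",         # narrow column: short form
    },
}


def lookup(locale: str, key: str) -> str:
    """Return the string for key in locale, falling back to English."""
    return RESOURCES.get(locale, {}).get(key, RESOURCES["en"][key])


form_label = lookup("de", "customerFormZipCodeLabel")
column_header = lookup("de", "customerTableZipCodeColumn")
```

With a single shared resource, the German table would have to settle on one of the two forms for both places.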

Naming

If we aim to eliminate duplicates, it makes sense to name resources by content, maybe hinting at the kind of usage via prefix. So we may have labelOK = "OK", messageFileTooLarge = "The file exceeds the maximum file size.", and labelZipCode = "Zip code".

Naming by content has the advantage of handling format arguments naturally: The resource messageFileHas_0_MBWhileMaximumIs_1_MB clearly takes two formatting arguments, the actual file size and the maximum file size.
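As a rough sketch of that idea in Python: the English text, the helper names, and the convention of reading the argument count off the name are assumptions for illustration, not an existing API.

```python
import re

# Hypothetical resource named by content; {0} and {1} mirror _0_ and _1_ in the name.
RESOURCES = {
    "messageFileHas_0_MBWhileMaximumIs_1_MB":
        "The file has {0} MB, while the maximum is {1} MB.",
}


def expected_arg_count(key: str) -> int:
    """Count the _N_ placeholders that the resource name advertises."""
    return len(re.findall(r"_(\d+)_", key))


def format_resource(key: str, *args: object) -> str:
    """Format a resource, rejecting calls whose argument count contradicts the name."""
    if len(args) != expected_arg_count(key):
        raise ValueError(f"{key} expects {expected_arg_count(key)} arguments")
    return RESOURCES[key].format(*args)


message = format_resource("messageFileHas_0_MBWhileMaximumIs_1_MB", 12, 10)
```

Because the placeholders are indexed, a translation can reorder `{0}` and `{1}` without any change to the calling code.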

If we allow duplicates, however, naming by content alone doesn't make sense. In order to get unique resource names, we must somehow include the place of usage in the resource name. That works for graphical controls, although the identifiers tend to get a bit long: fileSelectionConfirmationButtonText = "OK", customerDetailsTableColumnZipCode = "Zip Code". However, for non-visual code files, it gets harder. How do you name a specific usage of a string if you don't know where it will eventually be displayed? By code file and function name? Seems rather clumsy and brittle to me.

All in all, I'm leaning toward allowing duplicates, but I'm struggling to find a consistent naming scheme that supports this.

Edit: This question has two aspects: How to organize resources (DRY vs. duplicates) and how to name them. So far, the answers have concentrated on the first aspect. I'd appreciate some feedback regarding naming conventions!

Best Answer

I would accept duplication whenever you cannot be absolutely sure that the meaning is exactly the same everywhere a given string is used.

Even if two labels always contain the same string in English (or your native tongue), they will not necessarily be the same in all languages. Accepting duplication may give you (or rather the translators) the flexibility needed to handle such situations.

As an example, consider a label "Condition", which, depending on context, might be translated as "Zustand" or "Bedingung" in German (among many other possible translations).
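One way to picture this is a lookup keyed by context as well as by source text. The following Python sketch is only an illustration: the context names "itemState" and "filterRule" and the translate helper are invented here, and real frameworks offer their own mechanisms for disambiguating identical source strings.

```python
# Hypothetical German catalog keyed by (context, English text).
# The context names are made up for this sketch.
GERMAN = {
    ("itemState", "Condition"): "Zustand",     # the state something is in
    ("filterRule", "Condition"): "Bedingung",  # a requirement that must hold
}


def translate(context: str, text: str) -> str:
    """Look up text within context; return the English text if no entry exists."""
    return GERMAN.get((context, text), text)


state_label = translate("itemState", "Condition")
rule_label = translate("filterRule", "Condition")
```

If both usages shared one resource, only one of the two German words could be stored, and one of the screens would read wrong.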
