Should putting HTML tags into localization files be avoided?

html localization

We are localizing all parts of our website into many languages, using XML localization files. I think this scenario is so common that there should be a standard solution, but I couldn't find any good advice, and every developer here has a different opinion about it, so I'm asking you.

Consider the following example:

If you have question, please ask our <a href="blabla" target="_blank" title="Our nice Customer Support">Customer support</a> or write an email to <a href="mailto:XXX">Jane Doe </a> our specialist.

Or a long, formatted text:

<p>A very long, marketing blah-blah. A very long, marketing blah-blah. A very long, marketing blah-blah concluding in a list:</p>
<ul>
  <li>1</li>
  <li>2</li>
  <li>3</li>
</ul>

1) Would you put HTML tags into the localization XML file?

My concern is that the view is then split across two files: the page and the localization file. People will forget to check the localization file. There's also logic and style embedded in the HTML (see target="_blank", or the fact that the list above is unordered…)

2) Or split it into smaller parts?

<msg id="IfYouHaveQuestion">If you have question, please ask our</msg>

<msg id="CustomerSupport">Customer support</msg>

<msg id="OrWrite">or write an email to</msg>

Now the view can contain all the markup and style. It's flexible and easy to change.

But.

There's absolutely no guarantee that the word order will be the same in all languages. It would also make the translator's work a nightmare, turning each sentence into a puzzle.

3) Or introduce BBCode-style markup?

<msg id="HaveQuestion">If you have question, please ask our [link url="{customersupportlink}" title="{CustomerSupportTitle}"]Customer support[/link] or write an email …</msg>

But this is probably over-complicating the issue, and we would have to write our own parser for it (though I think that wouldn't be so hard). And it probably doesn't solve the long, formatted text problem.
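To give an idea of the size of such a parser, here is a minimal sketch in Python. It assumes a [link url="{name}" title="{name}"]text[/link] syntax with single-brace placeholders; the function name render_message and the params dictionary are hypothetical, not part of any existing library. It handles only the link tag, so it does nothing for paragraphs or lists.

```python
import html
import re

# Assumed tag syntax: [link url="{name}" title="{name}"]text[/link]
LINK_RE = re.compile(r'\[link url="\{(\w+)\}" title="\{(\w+)\}"\](.*?)\[/link\]')


def render_message(message, params):
    """Turn a translated message containing [link] tags into HTML.

    `params` maps placeholder names to values supplied by the view,
    so URLs and titles never live in the translated sentence itself.
    """
    parts = []
    pos = 0
    for m in LINK_RE.finditer(message):
        # Text outside the tags is escaped, so a translator cannot inject HTML.
        parts.append(html.escape(message[pos:m.start()]))
        url = html.escape(params[m.group(1)])
        title = html.escape(params[m.group(2)])
        text = html.escape(m.group(3))
        parts.append(f'<a href="{url}" title="{title}">{text}</a>')
        pos = m.end()
    parts.append(html.escape(message[pos:]))
    return "".join(parts)
```

The translator can move the whole [link]...[/link] group anywhere in the sentence, which addresses the word-order problem of option 2, while attributes such as target stay in the view code.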

4) ??? (Your golden solution here) 🙂

Best Answer

It seems there is no mainstream choice, so here is my suggestion:

Localization files could be treated more like semantic data than plain text strings. It seems reasonable to expect that identifying a list, or tagging a paragraph, a name, or part of a phrase, is part of the localization team's work. So the files could contain semantic (but no logical) HTML tags, including semantic span tags (like <span id="seo-name">). Note: I suggest span here because it is a valid HTML tag and so can easily be manipulated as a DOM element, but nothing stops you from using your own tags and parsing them.

That way, when your view logic extracts the text from the localization file, it can identify the seo-name span and add the HTML link tags properly.

Since some earlier answers raised a security concern about letting potentially unknown people write HTML code on your website, you could even add a security parser which checks that only a limited set of safe tags (<p>, <ul>, <ol>, <li>, <span id="blabla">, ...) is present in the localization files.
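As a rough illustration of such a check, here is a sketch in Python built on the standard library's html.parser. The allow-list below and the function name find_disallowed_markup are my own assumptions; this only reports unexpected tags and attributes and is not a complete HTML sanitizer.

```python
from html.parser import HTMLParser

# Assumed allow-list: the tags named above, with only `id` permitted on span.
ALLOWED_TAGS = {"p", "ul", "ol", "li", "span"}
ALLOWED_ATTRS = {"span": {"id"}}


class _TagAuditor(HTMLParser):
    """Collects every tag or attribute that is not on the allow-list."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag not in ALLOWED_TAGS:
            self.problems.append(f"tag <{tag}> is not allowed")
            return
        for name, _value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, set()):
                self.problems.append(f"attribute {name!r} is not allowed on <{tag}>")


def find_disallowed_markup(text):
    """Return a list of problems found in one localized string (empty if clean)."""
    auditor = _TagAuditor()
    auditor.feed(text)
    auditor.close()
    return auditor.problems
```

Running this over every entry when a localization file is submitted would reject, for example, a stray <a> or an onclick attribute before it ever reaches a page.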

An example to illustrate:

If you have question, please ask our <a href="blabla" target="_blank" title="Our nice Customer Support + boilerplate SEO bullshit">Customer support</a> or write an email to <a href="mailto:XXX">Jane Doe </a> our specialist.

could become, in a file following these conventions:

If you have question, please ask our <span id="customer-support">Customer support</span><span id ="customer-support-description">Our nice Customer Support + boilerplate SEO bullshit</span> or write an email to <span id="specialist-name">Jane Doe </span> our specialist.

Which (I think) isn't really difficult to localize, for example into French (a poor example, since the two languages are closely related):

Si vous avez des questions, n'hésitez pas à rencontrer notre <span id="customer-support">Service client</span><span id ="customer-support-description">Notre super service client + habituel blabla commercial</span> ou à contacter par email notre spécialiste <span id="specialist-name">Jane Doe </span>.

And an ugly controller pseudocode example:

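// Load the localized paragraph, still containing its semantic spans.
$customerServiceParagraph = getLocalizedText("customerServiceContact", $lang);
// Pull the description span out of the paragraph so it can serve as the link title.
$customerSupportDescription = getTextContentByElementIdThenDelete($customerServiceParagraph, "customer-support-description");
// Turn the remaining spans into real links.
linkify($customerServiceParagraph, "customer-support", $customerSupportDescription, "blabla", "_blank");
mailto_ify($customerServiceParagraph, "specialist-name", "XXX");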
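For something closer to runnable, here is a rough Python sketch of the same idea. The helper names pop_span_text and linkify are hypothetical, and the regular expression assumes each span holds plain text with no nested tags, which a real implementation would more likely handle with a DOM parser.

```python
import html
import re


def _span_re(span_id):
    # Matches <span id="..."> (or <span id ="...">) around plain-text content.
    return re.compile(r'<span id\s*=\s*"%s">(.*?)</span>' % re.escape(span_id), re.S)


def pop_span_text(paragraph, span_id):
    """Return (paragraph with the span removed, text that was inside the span)."""
    m = _span_re(span_id).search(paragraph)
    if m is None:
        return paragraph, ""
    return paragraph[:m.start()] + paragraph[m.end():], m.group(1)


def linkify(paragraph, span_id, href, title=None, target=None):
    """Replace the span with an <a> element whose attributes come from the view."""
    def build(m):
        attrs = f'href="{html.escape(href)}"'
        if title:
            attrs += f' title="{html.escape(title)}"'
        if target:
            attrs += f' target="{html.escape(target)}"'
        # The span content is kept as-is; it is assumed to be validated already.
        return f"<a {attrs}>{m.group(1)}</a>"
    return _span_re(span_id).sub(build, paragraph)
```

With the English entry above, the controller would call pop_span_text(text, "customer-support-description") first, then linkify(text, "customer-support", "blabla", title=description, target="_blank"), and finally linkify(text, "specialist-name", "mailto:XXX").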

I'm far from being a localization expert, but I think the choice between an ordered and an unordered list is a matter of cultural convention, and so is also part of the localization team's work, even if I agree that those tags are a legacy of the dirty old mix of styling and semantic tags from earlier HTML versions.