Tagging content system with i18n

database, internationalization, tagging

The idea is to have a tagging system between users and content (images, videos, posts),
much like the tagging system for questions here on SO.

I like the achievements system on SO, where after earning a certain number of points
a user can start creating their own tags. I want the same idea in my system.

My current table design looks like this:

Tag           UserTag       User
---           -------       ----
tag_id        user_id       user_id
tag_name      tag_id        username
usage_count                 .... 

This brings me to my question.

Q: How can you have a tagging system for content in different languages, and

  • still be able to search for the same content with tags in different languages?
  • have auto-complete in different languages for the same tag?

When I use autocomplete, I search for tag names that match the characters the user is typing.

E.g. I have a tag named "nightclub" in English,

yet a French user tagging the same content would use the translation "discothèque".


Or is there no way of doing this, and should I just let people create tags in different languages?

Best Answer

Yes, you can. But be aware that a word in one language may have several translations in another.

You could have a languages table, a tags table containing only a tag_id, and a many-to-many table with language_id, tag_id, and tag_name columns.
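A minimal sketch of that three-table layout, using Python's built-in sqlite3 module so it can be run as-is. The table names (language, tag, tag_translation), the language codes, and the autocomplete helper are assumptions made for illustration; only the column names language_id, tag_id and tag_name come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE language (
    language_id INTEGER PRIMARY KEY,
    code        TEXT NOT NULL UNIQUE          -- assumed: short codes such as 'en', 'fr'
);
CREATE TABLE tag (
    tag_id INTEGER PRIMARY KEY                -- language-neutral identity of the tag
);
CREATE TABLE tag_translation (                -- the many-to-many table
    language_id INTEGER NOT NULL REFERENCES language(language_id),
    tag_id      INTEGER NOT NULL REFERENCES tag(tag_id),
    tag_name    TEXT    NOT NULL,
    PRIMARY KEY (language_id, tag_id, tag_name)
);
""")

conn.executemany("INSERT INTO language VALUES (?, ?)", [(1, "en"), (2, "fr")])
conn.execute("INSERT INTO tag VALUES (1)")
conn.executemany(
    "INSERT INTO tag_translation VALUES (?, ?, ?)",
    [(1, 1, "nightclub"), (2, 1, "discothèque")],
)

def autocomplete(prefix, lang_code):
    """Return (tag_id, tag_name) pairs in one language whose name starts with prefix.

    Note: LIKE wildcards (% and _) typed by the user are not escaped here.
    """
    return conn.execute(
        """
        SELECT t.tag_id, t.tag_name
        FROM tag_translation AS t
        JOIN language AS l ON l.language_id = t.language_id
        WHERE l.code = ? AND t.tag_name LIKE ?
        ORDER BY t.tag_name
        """,
        (lang_code, prefix + "%"),
    ).fetchall()

print(autocomplete("night", "en"))
print(autocomplete("disco", "fr"))
```

Both lookups resolve to the same tag_id, so content linked to that tag_id can be found from either language, while autocomplete only ever searches names in the user's own language.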

As I said above, you may run into problems when people want to make distinctions that their own language allows but other languages do not. To stay with the French example, when talking about bread you may have 'baguette', 'flûte', 'recuit', 'demi-recuit', etc. as tags, whereas English would merely have a 'bread' tag. The mapping between the tags is then significantly more complicated. But that is a general translation problem, not one specific to programming.


Regarding your comment: a compromise would be to add a "tag_related_to_tag" table that lets you couple tags together. Users could indicate which tag is related to which other tag in a different language. This would give the maximum flexibility with the minimum complexity, but it would need some administration (otherwise malicious users could create very unexpected relationships between tags, undermining the usefulness of the system).
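A rough sketch of such a coupling table, again with Python's sqlite3 module. Everything beyond the table name tag_related_to_tag is an assumption: here each tag row carries its own language, pairs are stored once in a fixed order, and an approved flag stands in for the administration step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tag (
    tag_id   INTEGER PRIMARY KEY,
    language TEXT NOT NULL,                   -- assumed: one language per tag
    tag_name TEXT NOT NULL,
    UNIQUE (language, tag_name)
);
CREATE TABLE tag_related_to_tag (
    tag_id         INTEGER NOT NULL REFERENCES tag(tag_id),
    related_tag_id INTEGER NOT NULL REFERENCES tag(tag_id),
    approved       INTEGER NOT NULL DEFAULT 0, -- assumed moderation flag
    PRIMARY KEY (tag_id, related_tag_id),
    CHECK (tag_id < related_tag_id)            -- store each pair only once
);
""")
conn.executemany(
    "INSERT INTO tag VALUES (?, ?, ?)",
    [(1, "en", "bread"), (2, "fr", "baguette"), (3, "fr", "flûte")],
)

def relate(a, b):
    """A user proposes that tags a and b are related (unapproved by default)."""
    low, high = sorted((a, b))
    conn.execute(
        "INSERT OR IGNORE INTO tag_related_to_tag (tag_id, related_tag_id) VALUES (?, ?)",
        (low, high),
    )

def approve(a, b):
    """An administrator confirms the proposed relationship."""
    low, high = sorted((a, b))
    conn.execute(
        "UPDATE tag_related_to_tag SET approved = 1 WHERE tag_id = ? AND related_tag_id = ?",
        (low, high),
    )

def related(tag_id):
    """Return (language, tag_name) for every approved tag related to tag_id."""
    return conn.execute(
        """
        SELECT t.language, t.tag_name
        FROM tag_related_to_tag AS r
        JOIN tag AS t
          ON t.tag_id = CASE WHEN r.tag_id = ? THEN r.related_tag_id ELSE r.tag_id END
        WHERE r.approved = 1 AND (r.tag_id = ? OR r.related_tag_id = ?)
        ORDER BY t.tag_name
        """,
        (tag_id, tag_id, tag_id),
    ).fetchall()

relate(2, 1)   # baguette <-> bread
relate(1, 3)   # bread <-> flûte
approve(1, 2)  # only the first pair has been reviewed so far
print(related(1))
```

Keeping unapproved pairs out of search results is one possible way to limit the damage from bad relationships; a voting or reputation threshold would be another.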

That's something I was actually planning to implement for a website with a very narrow field (stoic philosophy) and target audience. If the field is too broad, it might be very ineffective.
