Database Design – Approach to Multilingual Database Design

Architecturedatabase-designdesign

My boss is planning on a new db and wants to support multilingual data in this manner:

LocalizedDescs (Guid / LanguageGuid being the primary key)

  • Cluster
  • Guid
  • LanguageGuid
  • Desc

ProductCategories

  • Cluster
  • Guid
  • (…)

Products

  • Cluster
  • Guid
  • CategoryGuid
  • DetailedDescGuid
  • (…)

The way it works is that every table is having a Guid field, used as the primary key. The LocalizedDesc table's Guid field, in turn, corresponds to any guid used in tables throughout the db, making it a parent table to every table in the system.

In rare cases where a table record needs another localized resource, an additional field is used in the table that will also point to a LocalizedDesc record. As an example, the Products table has a DetailedDescGuid that is meant to contain a throughout, longer description of a product. This way we have both a summary description and a detailed description, both of which can be localized to different languages.

Originally, we were supposed to have a LocalizedDescGuid field in each table needing a description. But my boss claims the db indexes will be smaller if we do this the other way.

Design-wise, what is this solution's worth? Was it better when it used an additional field in each table? Or are we doing this all wrong?

Best Answer

I would suggest the following structure due to my experiences in some other applications.

First of all I would build a language table:

language_id (PK)
iso_country_code
iso_language_code
codepage
translation_id (FK)

The language_id would represent the primary key. Each available language is added to this table. iso_country_code can contain the ISO country code (GB - Britain, DE - Germany, ...). iso_language_code can be used to cover different located languages (e.g. en_US, en_GB,...) codepage is the codepage which will be sent out. translation_id more on this later on.

The second thing should be a translation table. The translation table should hold back all translations for every translatable term.

translation_id PK
language_id PK (FK -> language)
term

The table consists of an combined primary key over translation_id and language_id which will prevent double insertion. language_id will refer to the language table. term itself will just be the translated term (e.g.: Table in german -> Tabelle)

The next thing will happen on each table which holds translatable items. For example a Table which holds your products for example called products.

product_id PK
product_information
product_price
...
translation_id (FK)

Only the translation_id is needed. It will refer to the translation table and retrieve the correct translation. The application can give or user session can be joined into the statement to filter the proper language for the logged in user.

The other positive thing on this solution is that if you have multiple translations in different tables which all means the same but in a different context and all can have the same translation_id, they already can share the same translation_id which will reduce your data weight and will improve the translation.

Hopefully this will give you a good hint and help you to improve your solution.

Related Topic