My boss is planning on a new db and wants to support multilingual data in this manner:
LocalizedDescs (Guid / LanguageGuid being the primary key)
- Cluster
- Guid
- LanguageGuid
- Desc
ProductCategories
- Cluster
- Guid
- (…)
Products
- Cluster
- Guid
- CategoryGuid
- DetailedDescGuid
- (…)
The way it works is that every table is having a Guid
field, used as the primary key. The LocalizedDesc
table's Guid
field, in turn, corresponds to any guid used in tables throughout the db, making it a parent table to every table in the system.
In rare cases where a table record needs another localized resource, an additional field is used in the table that will also point to a LocalizedDesc
record. As an example, the Products
table has a DetailedDescGuid
that is meant to contain a throughout, longer description of a product. This way we have both a summary description and a detailed description, both of which can be localized to different languages.
Originally, we were supposed to have a LocalizedDescGuid
field in each table needing a description. But my boss claims the db indexes will be smaller if we do this the other way.
Design-wise, what is this solution's worth? Was it better when it used an additional field in each table? Or are we doing this all wrong?
Best Answer
I would suggest the following structure due to my experiences in some other applications.
First of all I would build a language table:
The
language_id
would represent the primary key. Each available language is added to this table.iso_country_code
can contain the ISO country code (GB - Britain, DE - Germany, ...).iso_language_code
can be used to cover different located languages (e.g. en_US, en_GB,...)codepage
is the codepage which will be sent out.translation_id
more on this later on.The second thing should be a translation table. The translation table should hold back all translations for every translatable term.
The table consists of an combined primary key over translation_id and language_id which will prevent double insertion.
language_id
will refer to the language table.term
itself will just be the translated term (e.g.: Table in german -> Tabelle)The next thing will happen on each table which holds translatable items. For example a Table which holds your products for example called products.
Only the
translation_id
is needed. It will refer to the translation table and retrieve the correct translation. The application can give or user session can be joined into the statement to filter the proper language for the logged in user.The other positive thing on this solution is that if you have multiple translations in different tables which all means the same but in a different context and all can have the same
translation_id
, they already can share the sametranslation_id
which will reduce your data weight and will improve the translation.Hopefully this will give you a good hint and help you to improve your solution.