Make an Interest table, and then another table (UserInterest) holding the foreign keys of both User and Interest
What you're describing is a many-to-many relationship between User and Interest. If you work through database normalization on your data model, you will see that holding a comma-separated list of values fails on two points:
- Each column should hold only one data item, but a comma-separated list is multiple items.
- Each Interest is a repeated data item across Users. If you needed to change something about it (correcting a spelling error, say, or making the name more descriptive) while holding the Interests as strings in the User table, you'd need to run a massive update statement on User.
Holding everything in User as comma-separated lists compounds the second issue: if you did need to change the name of an Interest, a simple update wouldn't be enough. You'd have to do some sort of string manipulation to find every instance of the Interest and modify it.
With the structure I have suggested, if you ever need to change the name of an Interest, you will update one row in the Interest table.
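To make that concrete, here's a minimal sketch of the suggested schema (table and column names are illustrative), using SQLite from Python:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE Interest (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- Junction table: one row per (user, interest) pair.
CREATE TABLE UserInterest (
    user_id     INTEGER NOT NULL REFERENCES User(id),
    interest_id INTEGER NOT NULL REFERENCES Interest(id),
    PRIMARY KEY (user_id, interest_id)
);
""")

conn.execute("INSERT INTO Interest (id, name) VALUES (1, 'paper crafts')")
# Renaming an Interest touches exactly one row, no matter how many users share it:
conn.execute("UPDATE Interest SET name = 'paper craft' WHERE id = 1")
```

However many users hold that interest, the rename is one row in one table.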
Also, this way means you'll never need to do string manipulation to match up interests between two people. If you had the interests of Person A as "fishing, baking, paper craft" and Person B as "tobogganing, paper craft, fishing" you'd have to do some unpleasant data-wrangling to realise they shared two-thirds of their interests. Instead, you can just do a set of joins - and with aggregation functions in SQL you could even do things like figuring out the number of matched interests, percentage of matched interests, etc.
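With that structure, the Person A / Person B example above becomes a self-join on the link table rather than string wrangling (a sketch with hypothetical names, SQLite from Python):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Interest (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE UserInterest (
    user_id     INTEGER NOT NULL,
    interest_id INTEGER NOT NULL REFERENCES Interest(id),
    PRIMARY KEY (user_id, interest_id)
);
INSERT INTO Interest (id, name) VALUES
    (1, 'fishing'), (2, 'baking'), (3, 'paper craft'), (4, 'tobogganing');
-- Person A (user 1): fishing, baking, paper craft
-- Person B (user 2): tobogganing, paper craft, fishing
INSERT INTO UserInterest VALUES (1, 1), (1, 2), (1, 3), (2, 4), (2, 3), (2, 1);
""")

# Interests shared by users 1 and 2 -- a join, not string manipulation.
shared = conn.execute("""
    SELECT i.name
    FROM UserInterest a
    JOIN UserInterest b ON b.interest_id = a.interest_id AND b.user_id = 2
    JOIN Interest i     ON i.id = a.interest_id
    WHERE a.user_id = 1
    ORDER BY i.name
""").fetchall()
print([row[0] for row in shared])  # → ['fishing', 'paper craft']
```

Swap `SELECT i.name` for `SELECT COUNT(*)` and you have the number of matched interests; divide by each user's total and you have the percentage.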
Finally, you'll thank yourself later if you ever need to build any reporting, business intelligence, or other down-stream systems over this data. Or you'll remain in the good books of whoever it is who does need to build those things.
The obvious downside is that if your database doesn't support querying over JSON, and especially indexing over JSON, you may see lower performance for any operation other than "fetch this blob of translations" - especially "fetch only the EE translations" or "tell me what doesn't have an EE translation".
The key advantage of storing JSON is that you can serve it much faster: key lookup, read column, dump to browser. No need to futz with encoding it to JSON on the fly - you get the same advantage that you do serving pre-compressed static content to clients rather than on-the-fly gzip compression.
You can also balance the two: use the structured model with the link table to preserve query capabilities, but store JSON data to minimize encoding and decoding costs when you send.
Ultimately, it depends on what queries are most and least common in your problem space.
(Oh, and don't forget to think about what happens if two languages get uploaded at the same time, if you pack everything into the one column.)