Proper Implementation of a Dictionary in Java

dictionary, java, json

The problem:

Let's say I want to create a Java app (currently using Java 8) that works like a dictionary. I want the app to let the user add and search for words. Every word must contain a name, a meaning and zero or more examples. In addition, I want the user to be able to add more examples to a word that is already stored in the dictionary, and to add multiple meanings to the same word. For this, the dictionary must detect when a user tries to add a duplicate word and, instead of creating a new entry, add the new information to the already existing word. Finally, I need the app to store all of this information so it doesn't lose it whenever the user exits the application.

My approach:

Obviously, I'm not asking you to implement this app, so here's my idea. I know Java has a Dictionary class, so I could create a HashMap<String, Word> to manage all the dictionary entries, where the key would be the 'name' of the word, and Word would be an object containing the meanings and the examples as two lists of strings. That way I could easily search for a word, and add or retrieve information from the Word object. Also, this way, if the user tries to add a word that the HashMap already contains, it will not create a new Word object, but add the info to the existing one (kind of what you would do with a Flyweight).
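A minimal sketch of the approach described above, just to make the idea concrete. The names `Word`, `WordDictionary`, `add` and `lookup` are hypothetical, and lower-casing the key so that "Run" and "run" share one entry is an assumption, not something stated in the question.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical Word type: a name plus two lists of strings.
class Word {
    private final String name;
    private final List<String> meanings = new ArrayList<>();
    private final List<String> examples = new ArrayList<>();

    Word(String name) { this.name = name; }

    String getName() { return name; }
    List<String> getMeanings() { return meanings; }
    List<String> getExamples() { return examples; }
}

// Hypothetical wrapper around the HashMap<String, Word>.
class WordDictionary {
    private final Map<String, Word> entries = new HashMap<>();

    // Assumption: names are compared case-insensitively.
    private static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    // Adds a meaning and any examples. If the word already exists,
    // the new information is appended to the existing Word object.
    void add(String name, String meaning, String... examples) {
        Word word = entries.computeIfAbsent(normalize(name), Word::new);
        word.getMeanings().add(meaning);
        for (String example : examples) {
            word.getExamples().add(example);
        }
    }

    // Returns the stored Word, or an empty Optional if it is unknown.
    Optional<Word> lookup(String name) {
        return Optional.ofNullable(entries.get(normalize(name)));
    }

    int size() { return entries.size(); }
}
```

With this sketch, calling `add("run", ...)` twice leaves one entry holding two meanings, because `computeIfAbsent` only creates a `Word` when the key is missing.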

Now, for storing the dictionary, given that my idea is to work with Word objects, I was thinking of using a JSON file to store the dictionary as an array of words.

The question:

Now, as I said, I'm not asking for an implementation of this app. What I want to know is if my approach is a good idea, and maybe discuss some alternative implementations. So, is a HashMap a good structure to maintain a dictionary? Is it a good idea to use a Word object? Is using a JSON file a good way to store this kind of data?

Bonus: What if I want to keep the words in alphabetical order?
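For the ordering question, one option within a map-based design is `java.util.TreeMap`, which keeps its keys sorted. The sketch below is only an illustration: the class `SortedWordIndex` and its methods are hypothetical, and it stores plain meaning strings to stay short. Using `String.CASE_INSENSITIVE_ORDER` as the comparator is an assumption about how words should be compared.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical index that iterates its words in alphabetical order.
class SortedWordIndex {
    // TreeMap keeps keys sorted by the comparator; with
    // CASE_INSENSITIVE_ORDER, "Apple" and "apple" are the same key.
    private final Map<String, List<String>> meaningsByName =
            new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    // Appends a meaning, creating the entry only if the word is new.
    void add(String name, String meaning) {
        meaningsByName.computeIfAbsent(name, k -> new ArrayList<>()).add(meaning);
    }

    // Returns a copy of the word names in sorted order.
    List<String> names() {
        return new ArrayList<>(meaningsByName.keySet());
    }

    // Returns a copy of the meanings for a word, or an empty list if unknown.
    List<String> meaningsOf(String name) {
        List<String> found = meaningsByName.get(name);
        return found == null ? new ArrayList<>() : new ArrayList<>(found);
    }
}
```

The trade-off is that `TreeMap` lookups and insertions take O(log n) time, whereas `HashMap` offers constant time on average but no ordering guarantee.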

Best Answer

Your design has multiple entities with rigid relationships between them. You have a set of words, each of which has a set of meanings, a set of related words (e.g. different tenses of verbs, plurals of nouns, etc.), and so on. Meanings have definitions and citations. You may have cross-references between entities. All of these things are well-defined relations between entities of finite types with well-known attributes that can be determined ahead of time. Furthermore, you do not have entities nesting inside themselves but instead have a flat, predefined hierarchy of statically nested types. You also want persistence, but your data is large enough that loading and saving it in its entirety could cause performance problems, so you need fast access to subsets of the data to work with.

All of these factors suggest using a relational database rather than a recursively defined structure like JSON, which is likely to be harder to work with and less performant.
