Relational Database – Is It Ever Okay to Use Lists?

databasedatabase-designrelational-databasesql

I've been trying to design a database to go with a project concept and ran into what seems like a hotly debated issue. I've read a few articles and some Stack Overflow answers that state it's never (or almost never) okay to store a list of IDs or the like in a field — all data should be relational, etc.

The problem I'm running into, though, is that I'm trying to make a task assigner. People will create tasks, assign them to multiple people, and it will save to the database.

Of course, if I save these tasks individually in "Person", I'll have to have dozens of dummy "TaskID" columns and micro-manage them because there can be 0 to 100 tasks assigned to one person, say.

Then again, if I save the tasks in a "Tasks" table, I'll have to have dozens of dummy "PersonID" columns and micro-manage them — same problem as before.

For a problem like this, is it okay to save a list of IDs taking one form or another or am I just not thinking of another way this is achievable without breaking principles?

Best Answer

The key word and key concept you need to investigate is database normalization.

What you would do, is rather than adding info about the assignments to the person or tasks tables, is you add a new table with that assignment info, with relevant relationships.

Example, you have the following tables:

Persons:

+−−−−+−−−−−−−−−−−+
| ID |    Name   |
+====+===========+
|  1 |  Alfred   |
|  2 |  Jebediah |
|  3 |  Jacob    |
|  4 |  Ezekiel  |
+−−−−+−−−−−−−−−−−+

Tasks:

+−−−−+−−−−−−−−−−−−−−−−−−−−+
| ID |        Name        |
+====+====================+
|  1 |  Feed the Chickens |
|  2 |  Plow              |
|  3 |  Milking Cows      |
|  4 |  Raise a barn      |
+−−−−+−−−−−−−−−−−−−−−−−−−−+

You would then create a third table with Assignments. This table would model the relationship between the people and the tasks:

+−−−−+−−−−−−−−−−−+−−−−−−−−−+
| ID |  PersonId |  TaskId |
+====+===========+=========+
|  1 |         1 |       3 |
|  2 |         3 |       2 |
|  3 |         2 |       1 |
|  4 |         1 |       4 |
+−−−−+−−−−−−−−−−−+−−−−−−−−−+

We would then have a Foreign Key constraint, such that the database will enforce that the PersonId and TaskIds have to be valid IDs for those foreign items. For the first row, we can see PersonId is 1, so Alfred, is assigned to TaskId 3, Milking cows.

What you should be able to see here is that you could have as few or as many assignments per task or per person as you want. In this example, Ezekiel isn't assigned any tasks, and Alfred is assigned 2. If you have one task with 100 people, doing SELECT PersonId from Assignments WHERE TaskId=<whatever>; will yield 100 rows, with a variety of different Persons assigned. You can WHERE on the PersonId to find all of the tasks assigned to that person.

If you want to return queries replacing the Ids with the Names and the tasks, then you get to learn how to JOIN tables.