A bit about relational databases
One of the most powerful features of relational databases is the ability to connect sets of data through common points. In order to do this efficiently, a database should follow the rules of normalization. To sum those rules up, a database should have:
- No repeating elements or groups of elements
- No partial dependencies on a concatenated key
- No dependencies on non-key attributes
With these rules in mind, an example table for the comments could look like this:
--------------------------------------------------------------------
| comments                                                         |
|------------------------------------------------------------------|
| comment_id (key, auto-increment) | comment   | post_id | user_id |
|----------------------------------|-----------|---------|---------|
| 1                                | <text...> | 1       | 123452  |
--------------------------------------------------------------------
Here, the comment_id is the 'key,' or unique identifier, of the table. It's also been set to 'auto increment,' which means that it will automatically increase its value as 'records,' or rows, are added to the table. As shown, the table containing information related to comments only knows what it needs to know. How, then, do you relate user-specific information to a comment? This is where the 'relational' part of 'relational database' comes into play:
------------------------------------------------------------------------------
| users                                                                      |
|----------------------------------------------------------------------------|
| user_id (key, auto-increment) | avatar    | additional fields not shown... |
|-------------------------------|-----------|--------------------------------|
| 123452                        | <img_url> |                                |
------------------------------------------------------------------------------
Note that the user_id column contains the same data for both the comments table and the users table. This way, you can 'join' data from the two tables. For example, to get all of the comments made by a user, you could run the MySQL query:
SELECT comment_id, comment, post_id FROM comments NATURAL JOIN users WHERE user_id=123452;
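To make that concrete, here is a minimal sketch of the two tables and the join, using Python's built-in sqlite3 module as a stand-in for MySQL so it runs without a server. The sample rows, the avatar URL, and the helper name comments_with_avatar are made up for illustration. I've also written the join with an explicit ON clause instead of NATURAL JOIN; that is my substitution, chosen because it keeps working if the tables later gain other identically named columns.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the schema mirrors the two tables above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY AUTOINCREMENT,
        avatar  TEXT
    );
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY AUTOINCREMENT,
        comment    TEXT NOT NULL,
        post_id    INTEGER NOT NULL,
        user_id    INTEGER NOT NULL REFERENCES users(user_id)
    );
""")

# Sample rows (hypothetical data, for illustration only).
conn.executemany(
    "INSERT INTO users (user_id, avatar) VALUES (?, ?)",
    [(123452, "https://example.com/avatar.png"), (7, "https://example.com/other.png")],
)
conn.executemany(
    "INSERT INTO comments (comment, post_id, user_id) VALUES (?, ?, ?)",
    [("First!", 1, 123452), ("Nice post", 2, 123452), ("Hello", 1, 7)],
)


def comments_with_avatar(conn, user_id):
    """Return (comment_id, comment, post_id, avatar) for one user's comments."""
    return conn.execute(
        """
        SELECT c.comment_id, c.comment, c.post_id, u.avatar
        FROM comments AS c
        JOIN users AS u ON u.user_id = c.user_id
        WHERE c.user_id = ?
        ORDER BY c.comment_id
        """,
        (user_id,),
    ).fetchall()


rows = comments_with_avatar(conn, 123452)
```

The avatar is stored once, in users, and is attached to each comment only at query time.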
This method also answers the question of how the user should keep track of all of their posts, comments, and favorites. The users table should not keep track of such information itself; rather, the respective tables should each contain a reference to a globally unique user ID.
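As a rough sketch of that layout (again using Python's sqlite3 rather than MySQL), each activity table carries its own user_id column and the users table stores no lists at all. The posts and favorites columns, the sample rows, and the helper activity_counts are assumptions made up for this example; only the idea of referencing the user ID comes from the discussion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        avatar  TEXT
    );
    -- Every activity table points back at users; users points at nothing.
    CREATE TABLE posts (
        post_id INTEGER PRIMARY KEY,
        title   TEXT,
        user_id INTEGER NOT NULL REFERENCES users(user_id)
    );
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,
        comment    TEXT,
        post_id    INTEGER NOT NULL REFERENCES posts(post_id),
        user_id    INTEGER NOT NULL REFERENCES users(user_id)
    );
    CREATE TABLE favorites (
        favorite_id INTEGER PRIMARY KEY,
        post_id     INTEGER NOT NULL REFERENCES posts(post_id),
        user_id     INTEGER NOT NULL REFERENCES users(user_id)
    );
""")

# Hypothetical sample data.
conn.execute("INSERT INTO users VALUES (123452, 'avatar.png')")
conn.execute("INSERT INTO posts (title, user_id) VALUES ('Hello', 123452)")
conn.executemany(
    "INSERT INTO comments (comment, post_id, user_id) VALUES (?, ?, ?)",
    [("Nice", 1, 123452), ("Thanks", 1, 123452)],
)
conn.execute("INSERT INTO favorites (post_id, user_id) VALUES (1, 123452)")


def activity_counts(conn, user_id):
    """Count a user's rows in each activity table by filtering on user_id."""
    counts = {}
    # Table names come from this fixed tuple, never from user input.
    for table in ("posts", "comments", "favorites"):
        (counts[table],) = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE user_id = ?", (user_id,)
        ).fetchone()
    return counts


summary = activity_counts(conn, 123452)
```

Adding a new kind of activity later means adding a new table with a user_id column; the users table never has to change.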
Almost there
So basically, you were on the right track. The only real change from the model you specified was to move the user's avatar information into the table concerning users, rather than keeping it in the table concerning comments.
As you seem to be leaning away from using a raw SQL database, you could consider the tables to be classes, and use the rules of normalization as a design guideline.
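For instance, the two tables could be mirrored as plain Python classes. This is only a sketch of the idea: the class names, the in-memory collections, and the helpers comments_by and avatar_for are hypothetical and not tied to any particular backend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: int
    avatar: str


@dataclass(frozen=True)
class Comment:
    comment_id: int
    comment: str
    post_id: int
    user_id: int  # a reference to User.user_id, not an embedded User


# Hypothetical in-memory "tables".
users = {123452: User(123452, "https://example.com/avatar.png")}
comments = [
    Comment(1, "First!", 1, 123452),
    Comment(2, "Nice post", 2, 123452),
    Comment(3, "Hello", 1, 7),
]


def comments_by(user_id, comments):
    """All comments written by the given user (the 'join' done by hand)."""
    return [c for c in comments if c.user_id == user_id]


def avatar_for(comment, users):
    """Follow the comment's user_id to the single place the avatar is stored."""
    return users[comment.user_id].avatar


mine = comments_by(123452, comments)
```

Because Comment holds only a user_id, changing a user's avatar touches one User object and every comment picks up the change.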
Finally, the Pointers vs. Arrays thing: both of those are very specific to the Parse backend (and admittedly, neither is explained very well). The best comparison I can come up with is that Pointers would be like Lists (as in Java or C#), and Arrays would be like, well, Arrays. The analogy is loose: a Parse Pointer references a single object, but any number of objects can point at the same target, whereas an Array lives inside one object and is best suited to a small, bounded amount of data. As with Lists, the amount you can relate through Pointers is unlimited in theory, but in practice it is determined by the space available (the heap, in the case of a List). For more information about the difference between Lists and Arrays, see this question. If you are planning to use Parse, I would recommend using Pointers in conjunction with Join Tables (which are essentially wrappers around the SQL method I described above), as those are the options offering the most flexibility.
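To illustrate the join-table idea without touching Parse's SDK (none of its API is used or implied here), the sketch below models a join table in plain Python. The Favorite class, the sample data, and both helper functions are hypothetical; each Favorite plays the role of one row holding two references.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Favorite:
    """One row of a join table: 'this user favorited this post'."""
    user_id: int
    post_id: int


# Hypothetical rows. A set prevents the same (user, post) pair appearing twice.
favorites = {
    Favorite(123452, 1),
    Favorite(123452, 2),
    Favorite(7, 1),
}


def posts_favorited_by(user_id, favorites):
    """Post IDs a user has favorited, found by filtering the join rows."""
    return sorted(f.post_id for f in favorites if f.user_id == user_id)


def users_who_favorited(post_id, favorites):
    """The reverse lookup: user IDs that favorited a given post."""
    return sorted(f.user_id for f in favorites if f.post_id == post_id)


my_favorites = posts_favorited_by(123452, favorites)
fans_of_post_1 = users_who_favorited(1, favorites)
```

Neither the user nor the post stores a list of the other, so the relationship can grow in both directions without changing either class.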
Best Answer
My first instinct would be to reuse an existing user authorization mechanism, rather than writing one. Then I would skip the rest until I had firm requirements. Premature generalization is just as bad as premature optimization.