This is not about security as such. I have a table that records user information such as username, password and postcode, one record per person. There is also a table called post that records the posts users have made. Setting up a relation to identify the user who wrote a post is easy. But consider this situation: how should I store which posts a user has read? If I use a many-to-many relation, the table will grow very large. There are also many other things to record; this is just one example. I call this personal information. How should I store it properly?
Database Security – How to Store Personal Information Properly
database, MySQL
Related Solutions
If information such as how many comments, up-votes or favorites a post has is used frequently, you are better off storing these counts somewhere. Querying the DB every time for the number of comments on a post, the number of followers a user has, etc. may quite soon become your performance bottleneck.
Two common approaches exist:
- Add a column/field to Posts in the database that stores the comment count. When a new comment is added, increase that counter by one (synchronously), or refresh the count every hour/day (asynchronously).
- Use a cache (memcached, Redis) to store those numbers and reduce database queries. An in-memory cache is usually required for high-traffic sites anyway.
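The first approach (a counter column updated synchronously) could be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the `comment_count` column and `add_comment` helper are hypothetical names, not something from the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
    CREATE TABLE posts (
        post_id       INTEGER PRIMARY KEY,
        title         TEXT NOT NULL,
        comment_count INTEGER NOT NULL DEFAULT 0  -- denormalized counter (assumed name)
    );
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY AUTOINCREMENT,
        comment    TEXT NOT NULL,
        post_id    INTEGER NOT NULL REFERENCES posts(post_id),
        user_id    INTEGER NOT NULL
    );
""")
conn.execute("INSERT INTO posts (post_id, title) VALUES (1, 'Hello')")
conn.commit()


def add_comment(conn, post_id, user_id, text):
    """Insert a comment and bump the post's counter in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO comments (comment, post_id, user_id) VALUES (?, ?, ?)",
            (text, post_id, user_id),
        )
        conn.execute(
            "UPDATE posts SET comment_count = comment_count + 1 WHERE post_id = ?",
            (post_id,),
        )


add_comment(conn, 1, 123452, "First!")
add_comment(conn, 1, 123452, "Second.")

# Reading the count no longer needs COUNT(*) over the comments table.
count = conn.execute(
    "SELECT comment_count FROM posts WHERE post_id = 1"
).fetchone()[0]
```

Doing both statements in one transaction keeps the counter from drifting away from the real number of rows; the asynchronous variant would instead recompute the counter from `COUNT(*)` on a schedule.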
Either way, you can weave it into your ORM or abstraction layer, so there won't be too much additional code involved. As for the advantages and disadvantages, there are plenty of discussions and debates about this issue on the net. Google it for thorough and detailed explanations :)
A bit about relational databases
One of the most powerful features of relational databases is the ability to connect sets of data through common points. In order to do this efficiently, a database should follow the rules of normalization. To sum those rules up, a database should have:
- No repeating elements or groups of elements
- No partial dependencies on a concatenated key
- No dependencies on non-key attributes
With these rules in mind, an example table for the comments could look like this:
----------------------------------------------------------------------
| comments                                                           |
|--------------------------------------------------------------------|
| comment_id (key, auto-increment) | comment   | post_id | user_id   |
|----------------------------------|-----------|---------|-----------|
| 1                                | <text...> | 1       | 123452    |
----------------------------------------------------------------------
Here, the comment_id is the 'key,' or unique identifier, of the table. It's also been set to 'auto increment,' which means that it will automatically increase its value as 'records,' or rows, are added to the table. As shown, the table containing information related to comments only knows what it needs to know. How, then, do you relate user-specific information to a comment? This is where the 'relational' part of 'relational database' comes into play:
------------------------------------------------------------------------------
| users                                                                      |
|----------------------------------------------------------------------------|
| user_id (key, auto-increment) | avatar    | additional fields not shown... |
|-------------------------------|-----------|--------------------------------|
| 123452                        | <img_url> |                                |
------------------------------------------------------------------------------
Note that the user_id column contains the same data in both the comments table and the users table. This way, you can 'join' data from the two tables. For example, to get all of the comments made by a user, you could run the MySQL query:
SELECT comment_id, comment, post_id FROM comments NATURAL JOIN users WHERE user_id=123452;
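As a runnable illustration of the same idea, the sketch below builds the two tables in an in-memory SQLite database (a stand-in for MySQL) and runs the query with an explicit `JOIN ... ON`. The explicit form is a substitution on my part: it names the join column instead of relying on NATURAL JOIN to match every identically named column. The sample rows and avatar values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY AUTOINCREMENT,
        avatar  TEXT
    );
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY AUTOINCREMENT,
        comment    TEXT NOT NULL,
        post_id    INTEGER NOT NULL,
        user_id    INTEGER NOT NULL REFERENCES users(user_id)
    );
    -- Made-up sample data for the illustration
    INSERT INTO users (user_id, avatar) VALUES (123452, 'avatar.png'), (7, 'other.png');
    INSERT INTO comments (comment, post_id, user_id) VALUES
        ('Nice post', 1, 123452),
        ('Me too', 1, 7),
        ('Follow-up', 2, 123452);
""")

# All comments by one user, with the avatar pulled from the users table.
rows = conn.execute(
    """
    SELECT c.comment_id, c.comment, c.post_id, u.avatar
    FROM comments AS c
    JOIN users AS u ON u.user_id = c.user_id
    WHERE c.user_id = ?
    ORDER BY c.comment_id
    """,
    (123452,),
).fetchall()
```

The avatar is stored once per user, yet every comment row can still be shown with it.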
This method also answers the question of how the user should keep track of all of their posts, comments, and favorites: the users table should not keep track of such information itself; rather, the respective tables should contain references to a globally unique user ID.
Almost there
So basically, you were on the right track. The only real change from the model you specified was to move the user's avatar information to the table concerning users, instead of in the table concerning comments.
As you seem to be leaning away from using a raw SQL database, you could consider the tables to be classes, and use the rules of normalization as a design guideline.
Finally, the Pointers vs. Arrays thing: both of those are very specific to the Parse backend (and admittedly, neither is explained very well). The best comparison I can come up with is that Pointers would be like Lists (as in Java or C#), and Arrays would be like, well, Arrays. The difference between the two is that Arrays can only store a predetermined amount of data, while Lists (or Pointers, in Parse's case) can store an unspecified amount. In theory, the amount would be infinite, but in practice it is limited by the space available in the heap. For more information about the difference between Lists and Arrays, see this question. If you are planning to use Parse, I would recommend using Pointers in conjunction with Join Tables (which are essentially wrappers around the SQL method I described above), as those are the options offering the most flexibility.
Best Answer
To solve your read-posts issue, use a mapping table that links a post and a user, and insert a record for each post read by each user. Creating a mapping table is the basic strategy for resolving many-to-many relationships.
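A minimal sketch of such a mapping table is shown below, again using Python's sqlite3 as a stand-in for MySQL. The table name `read_posts`, its columns, and the helper functions are assumptions made for illustration. The composite primary key means each (user, post) pair is stored at most once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT NOT NULL);
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Mapping table: one row per (user, post) pair that has been read
    CREATE TABLE read_posts (
        user_id INTEGER NOT NULL REFERENCES users(user_id),
        post_id INTEGER NOT NULL REFERENCES posts(post_id),
        read_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        PRIMARY KEY (user_id, post_id)
    );
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO posts VALUES (10, 'First'), (11, 'Second'), (12, 'Third');
""")


def mark_read(conn, user_id, post_id):
    """Record that a user has read a post; repeated reads are ignored."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO read_posts (user_id, post_id) VALUES (?, ?)",
            (user_id, post_id),
        )


def unread_post_ids(conn, user_id):
    """Return the ids of posts the user has no read_posts row for."""
    rows = conn.execute(
        """
        SELECT p.post_id
        FROM posts AS p
        LEFT JOIN read_posts AS r
               ON r.post_id = p.post_id AND r.user_id = ?
        WHERE r.post_id IS NULL
        ORDER BY p.post_id
        """,
        (user_id,),
    ).fetchall()
    return [row[0] for row in rows]


mark_read(conn, 1, 10)
mark_read(conn, 1, 10)  # duplicate, ignored thanks to the primary key
mark_read(conn, 1, 12)
unread_for_alice = unread_post_ids(conn, 1)
unread_for_bob = unread_post_ids(conn, 2)
```

In MySQL, the counterpart of `INSERT OR IGNORE` is `INSERT IGNORE`. Each row holds only two integers and a timestamp, which is why such a table stays manageable even with many rows.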
Databases can grow large, but if you design properly you can limit repeated data and use the smallest necessary data types to conserve disk space. Disk space is also one of the cheapest parts of hosting a website, so a large database isn't a huge problem. You can also implement policies that limit the size of your database, such as deleting a user after X months of no activity, or only storing history for the previous X months.
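The "only keep the previous X months" policy could look something like the sketch below. It assumes a hypothetical `read_posts` table with a `read_at` timestamp column and a hypothetical `prune_read_history` helper, and it uses SQLite date functions; in MySQL the cutoff would be written along the lines of `NOW() - INTERVAL 6 MONTH`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
    CREATE TABLE read_posts (
        user_id INTEGER NOT NULL,
        post_id INTEGER NOT NULL,
        read_at TEXT NOT NULL,
        PRIMARY KEY (user_id, post_id)
    );
    -- Made-up history: one old row, two recent ones
    INSERT INTO read_posts VALUES
        (1, 10, datetime('now', '-14 months')),
        (1, 11, datetime('now', '-2 months')),
        (2, 10, datetime('now'));
""")


def prune_read_history(conn, keep_months):
    """Delete read-history rows older than keep_months; return rows removed."""
    with conn:
        cur = conn.execute(
            "DELETE FROM read_posts WHERE read_at < datetime('now', ?)",
            (f"-{int(keep_months)} months",),
        )
    return cur.rowcount


deleted = prune_read_history(conn, 6)
remaining = conn.execute(
    "SELECT user_id, post_id FROM read_posts ORDER BY user_id, post_id"
).fetchall()
```

A job like this would typically run on a schedule (for example nightly), so the table size is bounded by recent activity and does not grow without limit.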
If you want to know how to design tables to properly store all this information, read up on database normalization. If you are wondering how to store the information securely, read up on encryption and hashing, and store only hashed or encrypted values for sensitive information such as passwords.
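For passwords specifically, a salted, deliberately slow hash is the usual choice. The sketch below uses PBKDF2 from Python's standard library; the iteration count and the function names are assumptions for illustration, and a real application might prefer a dedicated password-hashing library.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); store both in the user's row, never the password."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, expected)


salt, stored = hash_password("correct horse")
ok = verify_password("correct horse", salt, stored)
bad = verify_password("wrong guess", salt, stored)
```

Because the salt is random, two users with the same password end up with different stored digests.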