Web Development – Storing and Posting Blog Entries in a Database

databaseweb-development

I'm trying to make a simple blog from scratch, but it occurs to me that I probably shouldn't have a bunch of static html pages that I link to, especially since I want the home page to be a cascade of previews of each post. So I was thinking I would store the posts in a database, and I would just retrieve parts for the home page and then the whole thing for each individual post's page. But as a beginner, I have two problems.

The first is that I don't know if that's good practice or what anyone actually does. Every time I see examples or tutorials of how to use databases, it always has very short data like name and address and number of kids. Never an entire blog post. And also not the pictures that would go along inside the post. So is there another way to store all that information? Do you maybe store file names in the database? I'm particularly concerned about images, I have no clue how that works.

The second issue is once it's all set up and online, I don't know how I would continue to update it. Would I make a POST route that only I can access? That sounds like a security risk to me.

Forgive my ignorance, I guess it's just such a simple question that I can't find anything that answers it explicitly. I don't know the right nouns to ask well enough, so I'm just looking for some best practices.

Best Answer

You seem to be creating a Content Management System (CMS). Many such systems already exist, and the majority stores the content within a database.

The fact that the content may be rather large is not an issue. The database can handle large amounts of plain text or binary data per entry without problem. The important point to remember is that it doesn't make sense to enforce a maximum size for blog posts, so in an SQL database we can't declare the column with type VARCHAR(10000) or something like that. Instead, you probably want an unrestricted-size TEXT or BLOB type. For TEXT types, you'll have to pick a suitable text encoding – many databases are not quite Unicode-compatible.

Some CMSes handle static resources like images separately from page content. E.g. Wordpress lets user upload images and other files into the /wp-content/ folder. Requests for these resources can be handled directly by the HTTP server. That is quicker than routing the request through the CMS.

A CMS always has a management interface where you can log in, configure the system, and create and edit content. This should require a login (e.g. OpenAuth or password over HTTPS). The CMS backend checks each request that would affect settings or content for proper authorization: the user must be logged in and must have the necessary rights, if the CMS includes a rights management system. That can be done securely, but it is easy to make fatal mistakes especially if you haven't created a web application before. The most common problems are SQL injection vulnerabilities or bugs in the authorization check for each protected action. If the site allows content to be generated by untrusted users (comments, a search bar, or a 404 page that displays the missing URL), then XSS attacks are also possible.

In recent years, static blogging has become increasingly popular. Here, no CMS runs in the web server. Since there is no code, there can be no security problems in our code. Instead, all HTML files were created on a different computer and then uploaded to the web server. The tool that creates the HTML files from the content can be arbitrarily complex. I've written my own static blogging engine imitating Jekyll, and it supports a tag system, post excerpts on the home page, RSS feeds, theming, …. The only features that a static system cannot have are dynamic features like user comments or a search bar, at least not without using 3rd-party providers.

Related Solutions

PHP Database – How to Integrate and Store Sessions in a Database

Sessions are based on the session identifier, a string that uniquely identifies a browser session via a cookie¹ set on the client when session_start() is called. Since they are based on cookies, you should treat them as user input and you should never absolutely trust user input. Cookies can be stolen via XSS, and PHP cookies are not encrypted by default.

Sessions are stored in the %TEMP% directory, whichever that may be for your system. The save location can be changed from the session.save_path directive or by the session_save_path during runtime, but on a typical hosting environment², it's generally considered easier to gain access to the filesystem than a database.

The various security concerns are addressed on the manual:

The session module cannot guarantee that the information you store in a session is only viewed by the user who created the session.

...

Assess the importance of the data carried by your sessions and deploy additional protections -- this usually comes at a price, reduced convenience for the user.

...

The session module cannot guarantee that the information you store in a session is only viewed by the user who created the session.

Session fixation is the commonly quoted buzzword, you can find a little more info on the Session Fixation Vulnerability in Web-based Applications article that's referenced on the manual.

The session_set_save_handler() function allows you to transparently choose where sessions are saved:

session_set_save_handler() sets the user-level session storage functions which are used for storing and retrieving data associated with a session. This is most useful when a storage method other than those supplied by PHP sessions is preferred. i.e. Storing the session data in a local database.

After you build and set your handler, you can continue using sessions as usual. There are quite a few examples on the comments to the manual page on storing sessions to a database. Storing sessions in a database has an important disadvantage: if for any reason the database goes awol, then you've lost everything you've built around sessions too.

_{¹ You can propagate the session id via a URL parameter, but that's more trouble than it's worth.}

_{² By typical I mean cheap shared hosting.}

Database – How to store “fuzzy dates” into a database

Store all dates in normal DATE field in the database and have additional accuracy field how accurate DATE field actually is.

date_created DATE,
date_created_accuracy INTEGER,

date_created_accuracy: 1 = exact date, 2 = month, 3 = year.

If your date is fuzzy (e.g May 1980) store it at start of period (e.g. May 1st. 1980). Or if your date is accurate to year (e.g. 1980) store it as January 1st. 1980 with corresponding accuracy value.

This way can easily query in a somewhat natural way and still have notion how accurate dates are. For example this allows you to query dates between Jan 1st 1980 and Feb 28th 1981, and get fuzzy dates 1980 and May 1980.

Best Answer

Related Solutions

PHP Database – How to Integrate and Store Sessions in a Database

Database – How to store “fuzzy dates” into a database

Related Topic