Database Cleaning: Application Logic vs DB Triggers

application-designdatabase-designMySQLPHP

When cleaning out old data from a set of database tables, is it safer to have the logic in the application or in a database trigger?

I am upgrading an application that I wrote a while back (and in a hurried fashion) and one of the things I want to clean up first is ensuring that data which is no longer being used is propmtly deleted. Currently this is being done in my application itself via a number of SQL calls to various tables. Some of these queries take a bit of time to run and as the application is PHP driven, I don't want users having to wait too long.

I was therefore thinking about creating a database trigger on one table that would merrily clean up all the other tables and then allow the single row delete to occur from my master table.

I am considering these pros and cons at the moment:

  • Database trigger
  • (+) Invisible to the application
  • (+) Simplifies application logic – new objects can be simplified
  • (-) Adds complexity to overall maintenance – have to maintain code and triggers during table changes etc.

  • Application Logic

  • (+) Keeps all maintenance tasks in one place
  • (-) Longer page loads on certain tasks
  • (-) More chance of a request timing out – orphaning data

If you have any further insight, considerations I haven't thought of here, experience with one vs the other or can point me towards some reading on the matter I would love to hear it.

Edit: I originally thought that my pages would run quicker, but if I am creating a Before Delete trigger, will the page loads be the same – as in the application sends a tiny query through but it won't be actioned until the entire set of statements in the trigger complete?

Best Answer

In the case of archiving or deleting old or obsolete data, would you consider a third option, i.e. a scheduled batch job, which detects and deletes old data. The job could then be scheduled e.g. once per day at a relatively quiet time, thus having minimal impact on your application. The job could be part of other standard database jobs, e.g. OPTIMIZE / reindexing etc.

The one exception to this could be if. e.g. you are updating a 2 tables with a 1 : many relationship (e.g. Invoice and InvoiceLineItem), whereby you've decided to delete all existing invoice line items before inserting the new ones. Because the deletion is deterministic (you know exactly what you need to delete), and because you also want the deletion to be part of a larger transaction which includes the subsequent inserts, then I would say that this would be a good candidate for the application tier.

However, I cannot see the purpose of doing this in a trigger at all - this could cause unnecessary performance loss for your inserts / updates while your trigger searches for stale data, and then locks unrelated rows elsewhere after deleting them.

Related Topic