Database Synchronization – Share Database or Sync with Web Services?

Tags: coupling, database, enterprise-architecture, soa, web-services

We have decided to build separate applications for user authentication and user management. The reason is that the former is a "this just needs to always work, and do so instantly" style service, while the latter includes a much larger user-interface component with many requirements and features.

Unfortunately, while white-boarding, prototyping, and getting more information about required features, we discovered that they need to share several types of information. For example, both apps need to know whether a user is locked out or needs to reset their password.

We are considering connecting both applications to the same database. We also considered using web services for communication, but that has a higher up-front cost, is more complex, and will perform worse (assuming the same query would be made in either case). With web services, each application would keep its own database, so we'd also have to keep some pieces of information in sync across two database instances, which is obviously not an optimal setup.
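To make the trade-off concrete, here is a minimal sketch of the same "is this user locked out?" check done both ways. Everything here is hypothetical (the users.db file, the users table layout, and the /users/{id}/status endpoint are all invented for illustration):

    import json
    import sqlite3
    import urllib.request

    # Option 1: both applications query the shared database directly.
    def is_locked_out_shared_db(user_id: int) -> bool:
        conn = sqlite3.connect("users.db")  # hypothetical shared database file
        try:
            row = conn.execute(
                "SELECT locked_out FROM users WHERE id = ?", (user_id,)
            ).fetchone()
            return bool(row and row[0])
        finally:
            conn.close()

    # Option 2: the management app asks the auth application over HTTP.
    def is_locked_out_web_service(user_id: int) -> bool:
        # hypothetical endpoint exposed by the authentication service
        url = f"http://auth.internal/users/{user_id}/status"
        with urllib.request.urlopen(url) as resp:
            return json.load(resp)["locked_out"]

The first version is a single cheap query but couples both codebases to the table; the second keeps the table private to the auth service at the cost of an extra network hop and another interface to version and maintain.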

On the other hand, sharing the database instance would require the ORM's entity schemas to stay in sync to some degree (even without an ORM, we'd still need to make sure our queries kept working in both apps whenever the database schema changed). It creates a shared dependency. The database also becomes a single point of failure and a potential performance bottleneck. Finally, I'm not sure it would be possible to hide irrelevant data from either app, so encapsulation is reduced.
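As a sketch of what that shared dependency looks like in practice (the table and column names are made up), each application ends up carrying its own copy of the same row mapping, and a column rename in the database breaks both copies independently:

    from dataclasses import dataclass
    from typing import Optional
    import sqlite3

    # Both applications define their own copy of this mapping; it must match
    # the shared users table exactly, or queries start failing in both apps.
    @dataclass
    class UserRecord:
        id: int
        email: str
        locked_out: bool
        must_reset_password: bool

    def load_user(conn: sqlite3.Connection, user_id: int) -> Optional[UserRecord]:
        row = conn.execute(
            "SELECT id, email, locked_out, must_reset_password "
            "FROM users WHERE id = ?",
            (user_id,),
        ).fetchone()
        return UserRecord(*row) if row else None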

Has anyone here worked through this dilemma before and gained any insights? I'm sure there are other considerations I haven't thought of yet. It's not exactly unusual to have different applications connect to a central database.

Edit (much later):

The outcome of this story almost two years later is that we ended up merging the applications into one (which thankfully was easy because they were both just modules using the same framework). The reasoning for splitting them into two made sense at the time, but in retrospect they shared too many responsibilities to make it practical.

Best Answer

Long story short: all in one.

If you can treat one of these as the "master" and the other as an augmented replica, with updates replicated one way only, and you can survive large latencies in synchronisation (and I'll let you define "large"), then two separate DBs will work. For anything else I'd suggest a single DB.
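As a rough illustration of that one-way case (table layout, file names, and the five-minute interval are all assumptions), the "synchronisation" can be as dumb as a periodic copy job from master to replica, provided the consuming application can tolerate that much staleness. The upsert syntax below needs a reasonably recent SQLite:

    import sqlite3
    import time

    def replicate_users(master_path: str, replica_path: str) -> None:
        """Copy user flags one way: auth master -> management replica."""
        master = sqlite3.connect(master_path)
        replica = sqlite3.connect(replica_path)
        try:
            rows = master.execute(
                "SELECT id, locked_out, must_reset_password FROM users"
            ).fetchall()
            with replica:  # one transaction on the replica
                replica.executemany(
                    "INSERT INTO users (id, locked_out, must_reset_password) "
                    "VALUES (?, ?, ?) "
                    "ON CONFLICT(id) DO UPDATE SET "
                    "locked_out = excluded.locked_out, "
                    "must_reset_password = excluded.must_reset_password",
                    rows,
                )
        finally:
            master.close()
            replica.close()

    if __name__ == "__main__":
        # Run forever; the replica is at most one interval behind the master.
        while True:
            replicate_users("auth_master.db", "management_replica.db")
            time.sleep(300)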

If you have two-way updates, will they be synchronous or asynchronous? If synchronous, you have closely coupled the two DBs. If asynchronous, you must have conflict-resolution measures in place, adding complexity. With a single DB you have the whole wealth of concurrency technology to support you.
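With a single database the concurrency problem is the DBMS's, not yours. A minimal sketch (column names assumed) of either application bumping the same failed-login counter: the whole thing happens in one transaction, so there is never a divergent copy to reconcile afterwards.

    import sqlite3

    def record_failed_login(conn: sqlite3.Connection, user_id: int,
                            max_attempts: int = 5) -> None:
        """Atomically bump the counter and lock the account at the limit."""
        with conn:  # one transaction; the DBMS serialises concurrent callers
            conn.execute(
                "UPDATE users SET failed_attempts = failed_attempts + 1 "
                "WHERE id = ?",
                (user_id,),
            )
            conn.execute(
                "UPDATE users SET locked_out = 1 "
                "WHERE id = ? AND failed_attempts >= ?",
                (user_id, max_attempts),
            )

With two databases, the same sequence would be a local write, a replication hop, and a conflict-resolution rule for when both sides touched the row.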

With any copy process there will be latency, even if it is only milli- or micro-seconds. That's all it takes to introduce inconsistency between the source's data and the replica's, and then you have the whole reconciliation and resolution thing again.

How will you handle DR? How will you ensure a common sync point across two DBs, potentially on different servers (or data centres)? How do you ensure you get the two matching tapes back from the off-site vault first time, every time?

Any schema change which adds tables or columns will be transparent to the other application. (You're not writing SELECT *, are you? ARE YOU?) Removing objects should only affect modules which use them, which you'd want to change anyway. (Two DBs with a redundant table in one but removed from the other smells like a potential source of great confusion to me.) Changing the schema because business rules have changed may cause some duplication of work; but it seems to me that if one application isn't implementing that business rule, it had no call to be in that part of the DB anyway.
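The SELECT * point in code form (table and columns assumed): an explicit column list keeps a column added for the other application invisible here, whereas SELECT * silently changes shape under every caller in both apps.

    import sqlite3

    def fetch_user_brittle(conn: sqlite3.Connection, user_id: int):
        # SELECT * returns whatever columns exist today; adding a column to
        # the shared table changes this result's shape for every caller.
        return conn.execute(
            "SELECT * FROM users WHERE id = ?", (user_id,)
        ).fetchone()

    def fetch_user_stable(conn: sqlite3.Connection, user_id: int):
        # Naming the columns means columns added for the other application
        # are simply never seen here.
        return conn.execute(
            "SELECT id, email, locked_out FROM users WHERE id = ?", (user_id,)
        ).fetchone()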

Could you store the ORM-implementing code files in a way that makes them common to both applications? Then schema changes would affect only one set of code.
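In practice that might mean packaging the entity mapping from the earlier sketch as a small shared library that both applications install, so a schema change is a change to one codebase. The package and module names here are invented:

    # shared_user_schema/models.py -- installed as a package by both apps
    from dataclasses import dataclass

    @dataclass
    class UserRecord:
        id: int
        email: str
        locked_out: bool
        must_reset_password: bool

    # Single authoritative column list; both applications build their
    # queries from it instead of repeating the names.
    USER_COLUMNS = "id, email, locked_out, must_reset_password"

    # In either application:
    #   from shared_user_schema.models import UserRecord, USER_COLUMNS
    #   row = conn.execute(
    #       f"SELECT {USER_COLUMNS} FROM users WHERE id = ?", (user_id,)
    #   ).fetchone()
    #   user = UserRecord(*row) if row else None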

"Database as single point of failure." Well, fair point. But then it has been on absolutely every application I've ever worked on since they've all had a DB. That's why we have cluster/ mirror/ high-availability built into the DBMS products. If you have two DBs and one fails, what then? Back to the reconciliation/ conflict resolution cycle. It is also a single point of tuning; fix it one place and it's fixed everywhere!

I would have thought views, stored procedures, the ORM itself, and limiting the items in any SELECT query would have been able to isolate any part of the application from data in which it has no interest.
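Views are the classic way to do that hiding inside the database itself. A sketch (all names assumed) where the user-management application only ever reads a narrow view, so the authentication-only columns never reach it:

    import sqlite3

    conn = sqlite3.connect("users.db")  # hypothetical shared database
    conn.executescript("""
        -- Full table, conceptually owned by the authentication application.
        CREATE TABLE IF NOT EXISTS users (
            id                  INTEGER PRIMARY KEY,
            email               TEXT NOT NULL,
            password_hash       TEXT NOT NULL,   -- auth-only detail
            failed_attempts     INTEGER DEFAULT 0,
            locked_out          INTEGER DEFAULT 0,
            must_reset_password INTEGER DEFAULT 0
        );

        -- Narrow view for the user-management application: no password
        -- hash, no attempt counter, just the shared flags.
        CREATE VIEW IF NOT EXISTS management_users AS
            SELECT id, email, locked_out, must_reset_password
            FROM users;
    """)

    row = conn.execute(
        "SELECT id, email, locked_out FROM management_users WHERE id = ?", (1,)
    ).fetchone()

In a server DBMS you would go one step further and grant the management application's login access to the view only, which is where the real encapsulation comes from.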

My reading of your question is that your two applications are actually quite closely coupled. My response is coloured by that. I've written plenty of systems where we bulk copy or two-way replicate data and deal with the consequences.
