SQL Server 2005/8 Replication Transaction ID

replicationsql serversql-server-2005

I have a scenario where I'm using transactional replication to replicate multiple SQL Server 2005 databases (same instance) into a single remote database (different instance on a separate physical machine).

I am then performing some processing on the replicated data for reporting purposes. I'm using table level triggers to identify changes which actions my post processing code.

Up to this point everything is fine.

However, what I'd like to know is, where certain tables are created, updated or deleted in the same transaction, is it possible to identify some sort of transaction ID from replication (or anywhere) so then I don't perform the same post processing multiple times for a single transaction.

Basic Example: I have a TUser Table and TAddress table. If I was to create both in a single transaction, they would be replicated across in a single transaction too. However, there would be two triggers fired in the replicated database – which at present causes my post processing code to be run twice. What I'd really like to identify is that these two changes arrived in the replicated in the same transaction.

Is this possible in any way? Does an identifier as I've describe exist and is it accessible?

Best Answer

Short answer is no, there is nothing of the sort that you can rely on. Long answer in summary would be that yes it exists, but it would not be recommended in any way to be used for anything.

Given that replication is transactionally consistent, one approach you could consider would be pushing an identifier for the primary record (in this case TUser, since an TAddress is related to TUser) onto a queue (using something like Service Broker ideally or potentially a user-defined queue) and then perform the post-processing by popping data off the queue and processing separately.

Another possibility would be simply batch processing every 'x' amount of time by polling for new/updated records from the primary tables and post-processing in that manner - you'd need to track id's, rowversions, or timestamps of some sort that you've processed for each primary table as meta-data and pull anything that hasn't yet been processed during each batch run.

Just a few thoughts, hope that helps.