Sql-server – Random record duplications in database (unintentional)

sql server

Apologies ahead of time for the length; I just want to make sure I have all the facts covered so a proper diagnosis can be reached.

I am just absolutely stumped on this issue and hoping someone has come across this issue before or can at least provide some insight on the matter.

For the past few weeks now, our application has been experiencing an issue where seemingly random records from seemingly random tables will duplicate themselves. There does not seem to be any rhyme or reason to it at all and I don't believe it's a coding issue. Sometimes the records duplicate 3 times, sometimes they duplicate 20+ times, and everything in between.

I have a datetime field on virtually every table in my database, I add them habitually, and one thing I find very odd about the records is that each time the duplication happens, the real record will have a correct datetime stamp such as 2010-10-28 16:28:26.903 (note the milliseconds are accurate and what you would expect) – but the duplicate records all have a datetime stamp like 2010-10-28 16:28:27.000.

This is one of the reasons I say it's not a coding issue. I would expect that, if there were some junky erroneous looping going on that was inserting a bunch of records randomly in some rogue code block somewhere, that the inserted record's datetime stamps would be several milliseconds apart – but that is not the case here. All duplicated records have the milliseconds drop and are rounded up or down to the nearest second.

All other information is absolutely identical with the exception of the PK/identity column.

This has happened to about 4 different tables now in the application, to my knowledge, however it might have happened to more… I haven't checked yet (the db has a lot of tables). I know it's not happening to all of the tables. It seems to have stopped in the table where the problem first occurred, and now "spread" to other tables.

I took a copy of the database for pre-version release staging/testing and there are duplicates in that database as well. That database is on the same box as the live database which is experiencing this issue. However, we have a database on our development box which does not seem to have this issue at all. I would think if it were a coding issue, we'd see the problem in all three databases.

In addition to that, one of the tables that it affected is more of a back-end table. It's only got two records in it but one of them got duplicated 10 times the other day. There is no code in our system anywhere that inserts records or updates records or interacts in any way what so ever with this particular table. There are no stored procedures written for it, no views, no triggers, absolutely nothing… and yet one of its records was duplicated. I could see a coding issue affecting the other tables, but in light of this particular table having duplicates while there is no code interaction with it what so ever – I find that extremely peculiar. I should also mention that the record here that got duplicated was inserted manually months ago.

These duplications also do not happen at the time that the real record is inserted. I did some testing when it first started happening – thinking that it was a coding issue and attempting to hunt down the rogue section of code. All my testing turned up fine and I was not able to reproduce the result. I tested out the area(s) in question, checked the database, and none of the newly inserted records had duplicates. I checked the next morning out of curiosity and no duplicates were inserted. I checked later that afternoon and sure enough, 24 duplicate records had been inserted – nearly 24 hours after the fact.

Does anyone know of any database process that could perhaps glitch and cause this to happen? Something stuck in memory when a record is inserted that just pumps out the record a number of times some time later?

I know it's not your run of the mill issue, or at least I don't think it is. First time I've ever heard of it happening. Any ideas? Even a shot in the dark would be appreciated at this point.

I know one "solution" is to put constraints on the affected tables to make duplicate records impossible; but I'd really like to get to the bottom of the issue or at least have some sort of inclination as to what is causing it (if possible) before I go and do something like that. I'd like to know if I maybe haven't set something up correctly and can know to avoid the issue in the future. If there's a problem, I'd rather fix it rather than mask it and have it come back and haunt me later. Murphy's law, you know.

Thanks guys; and if you need any more information I'll be happy to provide.

EDIT Sorry, I should have mentioned I'm using SQL Server 2005 on the production database server, SQL Server 2008 on the development box, and they're both hooked up to an ASP.net application

Best Answer

Putting constraints on tables isn't a "solution" (in a figurative sense). Rather, it's the right thing to do. By definition a database schema should be designed to cause the RDBMS to enforce constraints that prevent invalid data from being represented in the database.

You should have constraints on your tables to prevent "duplicate data" from being inserted as part of the basic design of the database. If you were using natural primary keys you'd get this "for free". Since you're using artifical primary keys ("identity" columns) you need uniqueness constraints on the columns that make up the natural primary key.

It's almost assured that this is happening somewhere in your code. SQL Server has been beat-on pretty heavily by very, very large users over the years. A bug in the SQL Server engine itself of this type would've been sorted out a long time ago.

Once you've got uniqueness constraints in place errors will be thrown when these duplicate records are inserted and you'll be able to trace them back to their source. If your code doesn't have the equivalent of an "On Error Resume Next" type of construct in it you should start getting errors (unhandled exceptions, etc) when the offending code fails to perform the inserts on "duplicate" data.

The whole "rounded off time" symptom you describe makes me think that you have a "utility" function buried somewhere in the code that is "helping" by truncating the time (since truncating datetime values is fairly trivial).

Make SQL Server throw contraint violation errors and you'll be able to track down the offending code. Leave the constraints on the database even after you've fixed the offending code because excluding invalid data is part of the database's job.