Sql-server – Select Distinct/Unique Rows from a Huge Table in SQL Server Without Timeout Errors

sqlsql servertimeout

I have a table in SQL Server with around 400,000 rows. Unfortunately, this table has many duplicate rows, as there isn't a primary key. I'm using SELECT DISTINCT to get all unique rows, but as there's so many of them, I keep getting timeout errors. What's the best way to approach this problem?

Best Answer

Good question. Improove timeout - raise the value. Or throw in more power. Technically 400.000 rows is not a lot, so it would go very fast for a decent server. Otherwise - if you have IO problems because your "server" has only one (pair of) hard discs.... then your IO Will kill you.

Basically, DISTINCT triggers a temporary result creation in the tempdb - so it is tempdb heavy.

But at the end, I would increase timeout, especially if this is a one time operation only.

Btw., to put that into perspective - where is the HUGH table you talk about? 400.000 is trivial in size.