I have a table in SQL Server with around 400,000 rows. Unfortunately, this table has many duplicate rows, as there isn't a primary key. I'm using SELECT DISTINCT
to get all unique rows, but as there's so many of them, I keep getting timeout errors. What's the best way to approach this problem?
Sql-server – Select Distinct/Unique Rows from a Huge Table in SQL Server Without Timeout Errors
sqlsql servertimeout
Related Topic
- Sql-server – SQL Import to update existing records
- SQL Server; how to import data from another DB with different collation; conversion errors
- Sql-server – Repairing inconsistent pages in database
- Sql-server – Change next auto-number assignment without recreating entire table, MS SQL Server 2000
- SQL Server 2008 Timeout Expired
- Sql-server – MSSQL: “Could not allocate space for object ‘dbo.SORT temporary run storage in database because ‘PRIMARY’ filegroup is full
- SQL Server – Splitting MDF File and Partitioning Table Data
Best Answer
Good question. Improove timeout - raise the value. Or throw in more power. Technically 400.000 rows is not a lot, so it would go very fast for a decent server. Otherwise - if you have IO problems because your "server" has only one (pair of) hard discs.... then your IO Will kill you.
Basically, DISTINCT triggers a temporary result creation in the tempdb - so it is tempdb heavy.
But at the end, I would increase timeout, especially if this is a one time operation only.
Btw., to put that into perspective - where is the HUGH table you talk about? 400.000 is trivial in size.