Sql – How to get distinct rows faster from a huge table

distinctgroup-byoracleperformancesql

I have a huge table which contains about 250 million rows. I have only SELECT privilege to this table. My intention is to query distinct records from a specific columns. I'm using a query

select var1, count(*)
from hr.hugetable
group by var1

This query takes about 15 minutes to complete. There is no index on var1 and I'm not able to add it. Is there a way to refine this query to fetch results faster? This query would also do it (I do not need count the record, only distinct values), but I think it is not faster.

select distinct Var1
from hr.hugetable

Best Answer

You should talk to your DBA about your options, including indexing the column, and potentially more involved possibilities like a materialized view, if it's a frequently-executed query.

One possibility to consider is to parallelise the query, but be aware that this could actually slow it down depending on where the bottleneck is. There's a white paper from Oracle on parallelisation.

In principle you could add the parallel hint:

select /*+ parallel */ distinct var1
from hr.hugetable

You should discuss that with your DBA too, particularly the degree of parallelism (DOP) to use, and whether automatic DOP is appropriate. Also read up on what it does in the documentation, and compare the explain plans and timings - including at different DOP - to see what's appropriate. You don't want to risk impacting other users with what you're doing, so approach with caution, and with the DBA's involvement.