Sql – How to get distinct rows faster from a huge table

distinct, group-by, oracle, performance, sql

I have a huge table that contains about 250 million rows, and I have only the SELECT privilege on it. My intention is to query the distinct values of a specific column. I'm using this query:

select var1, count(*)
from hr.hugetable
group by var1

This query takes about 15 minutes to complete. There is no index on var1 and I'm not able to add one. Is there a way to refine this query to fetch results faster? The following query would also do (I do not need to count the records, only the distinct values), but I don't think it is any faster.

select distinct Var1
from hr.hugetable

Best Answer

You should talk to your DBA about your options, including indexing the column, and potentially more involved possibilities like a materialized view, if it's a frequently-executed query.
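If the DBA does go the materialized view route, the definition might look roughly like the sketch below. The name hugetable_var1_mv and the refresh policy are assumptions made up for illustration, not something from the question. Creating it needs more than the SELECT privilege, so this is something for the DBA to run.

```sql
-- Hypothetical sketch: the view name and refresh settings are assumptions.
-- Needs the CREATE MATERIALIZED VIEW privilege, so the DBA would run this.
CREATE MATERIALIZED VIEW hr.hugetable_var1_mv
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
AS
SELECT var1, COUNT(*) AS row_count
FROM   hr.hugetable
GROUP  BY var1;

-- Readers then query the small precomputed result instead of the big table.
SELECT var1
FROM   hr.hugetable_var1_mv;
```

With an on-demand refresh the result is only as fresh as the last refresh, so whether this fits depends on how current the distinct values need to be.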

One possibility to consider is to parallelise the query, but be aware that this could actually slow it down depending on where the bottleneck is. There's a white paper from Oracle on parallelisation.

In principle you could add the parallel hint:

select /*+ parallel */ distinct var1
from hr.hugetable

You should discuss that with your DBA too, particularly the degree of parallelism (DOP) to use, and whether automatic DOP is appropriate. Also read up on what it does in the documentation, and compare the explain plans and timings - including at different DOP - to see what's appropriate. You don't want to risk impacting other users with what you're doing, so approach with caution, and with the DBA's involvement.
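As a rough sketch of that comparison, you could generate the plan for the serial query and for a hinted one and look at them side by side. This assumes a plan table is available to your account and a release that accepts a statement-level PARALLEL(n) hint (11g Release 2 or later). The DOP of 4 is an arbitrary example value, not a recommendation.

```sql
-- Plan for the serial query
EXPLAIN PLAN FOR
SELECT DISTINCT var1
FROM   hr.hugetable;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Plan with an explicit statement-level DOP (4 is only an example value)
EXPLAIN PLAN FOR
SELECT /*+ PARALLEL(4) */ DISTINCT var1
FROM   hr.hugetable;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

The plans only show what the optimizer intends to do. The actual timings still have to be measured by running the queries, ideally at a time agreed with the DBA.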

Related Solutions

Sql – How to concatenate text from multiple rows into a single text string in SQL Server

If you are on SQL Server 2017 or Azure, see Mathieu Renda's answer.

I had a similar issue when I was trying to join two tables with a one-to-many relationship. In SQL Server 2005 I found that the XML PATH method can handle the concatenation of the rows very easily.

Suppose there is a table called STUDENTS:

SubjectID       StudentName
----------      -------------
1               Mary
1               John
1               Sam
2               Alaina
2               Edward

The result I expected was:

SubjectID       StudentName
----------      -------------
1               Mary, John, Sam
2               Alaina, Edward

I used the following T-SQL:

SELECT Main.SubjectID,
       LEFT(Main.Students,Len(Main.Students)-1) As "Students"
FROM
    (
        SELECT DISTINCT ST2.SubjectID, 
            (
                SELECT ST1.StudentName + ',' AS [text()]
                FROM dbo.Students ST1
                WHERE ST1.SubjectID = ST2.SubjectID
                ORDER BY ST1.SubjectID
                FOR XML PATH ('')
            ) [Students]
        FROM dbo.Students ST2
    ) [Main]

You can do the same thing more compactly by putting the comma at the beginning of each name and using SUBSTRING to skip the first one, so you don't need the outer sub-query (note that the 1000 passed to SUBSTRING caps the length of the result):

SELECT DISTINCT ST2.SubjectID, 
    SUBSTRING(
        (
            SELECT ','+ST1.StudentName  AS [text()]
            FROM dbo.Students ST1
            WHERE ST1.SubjectID = ST2.SubjectID
            ORDER BY ST1.SubjectID
            FOR XML PATH ('')
        ), 2, 1000) [Students]
FROM dbo.Students ST2
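
For comparison, on SQL Server 2017 or later (or Azure SQL Database) the built-in STRING_AGG function does this directly, which is presumably what the answer mentioned at the top refers to. A minimal sketch against the same Students table:

```sql
-- Assumes SQL Server 2017 or later (or Azure SQL Database), where STRING_AGG exists.
SELECT SubjectID,
       STRING_AGG(StudentName, ', ') AS Students
FROM dbo.Students
GROUP BY SubjectID;
```

The order of the names within each row is not guaranteed here. If it matters, STRING_AGG accepts a WITHIN GROUP (ORDER BY ...) clause.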

Sql – How to limit the number of rows returned by an Oracle query after ordering

You can use a subquery for this, like:

select *
from  
( select * 
  from emp 
  order by sal desc ) 
where ROWNUM <= 5;

Also have a look at the article "On ROWNUM and limiting results" at Oracle/AskTom for more information.

Update: To limit the result with both lower and upper bounds, things get a bit more bloated:

select * from 
( select a.*, ROWNUM rnum from 
  ( <your_query_goes_here, with order by> ) a 
  where ROWNUM <= :MAX_ROW_TO_FETCH )
where rnum  >= :MIN_ROW_TO_FETCH;

(Copied from the AskTom article mentioned above.)

Update 2: Starting with Oracle 12c (12.1) there is a syntax available to limit rows or start at offsets.

SELECT * 
FROM   sometable
ORDER BY name
OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY;

See this answer for more examples. Thanks to Krumia for the hint.
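
As a small illustration, the first ROWNUM query above (the five highest salaries in emp) could be written with this syntax roughly as follows, assuming Oracle 12c (12.1) or later:

```sql
-- Assumes Oracle 12c (12.1) or later.
-- Top five rows of emp by salary, without the ROWNUM subquery.
SELECT *
FROM   emp
ORDER  BY sal DESC
FETCH FIRST 5 ROWS ONLY;
```

If rows that tie with the fifth salary should be included as well, ROWS ONLY can be replaced with ROWS WITH TIES.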
