SQL Performance – IN() vs OR: Which is Faster?

speedsql

I was working with a query I wrote today had to change the code from the WHERE clause to use a IN(list of stuff) filter instead of using something like

item_desc = 'item 1'
OR item_desc = 'item 2'
OR item_desc = 'item 3'
OR item_desc = 'item 4'

The above ran for 15 minutes and returned nothing, yet the following gave me my result set in 1.5 minutes

item_desc IN (
'item 1'
,'item 2'
,'item 3'
,'item 4'
)

I did this in SQL and am wondering why the IN(list of items) performed so much faster then the OR statement.

— EDIT —
SQL Server 2008, I do apologize for not putting this bit of info in in the first place.

Here is the Query in its entirety using the OR statements:

DECLARE @SD DATETIME
DECLARE @ED DATETIME
SET @SD = '2013-06-01';
SET @ED = '2013-06-15';

-- COLUMN SELECTION
SELECT PV.PtNo_Num AS 'VISIT ID'
, PV.Med_Rec_No AS 'MRN'
, PV.vst_start_dtime AS 'ADMIT'
, PV.vst_end_dtime AS 'DISC'
, PV.Days_Stay AS 'LOS'
, PV.pt_type AS 'PT TYPE'
, PV.hosp_svc AS 'HOSP SVC'
, SO.ord_no AS 'ORDER NUMBER'
--, SO.ent_dtime AS 'ORDER ENTRY TIME'
--, DATEDIFF(HOUR,PV.vst_start_dtime,SO.ent_dtime) AS 'ADM TO ENTRY HOURS'
, SO.svc_desc AS 'ORDER DESCRIPTION'
, OSM.ord_sts AS 'ORDER STATUS'
, SOS.prcs_dtime AS 'ORDER STATUS TIME'
, DATEDIFF(DAY,PV.vst_start_dtime,SOS.prcs_dtime) AS 'ADM TO ORD STS IN DAYS'

-- DB(S) USED
FROM smsdss.BMH_PLM_PtAcct_V PV
JOIN smsmir.sr_ord SO
ON PV.PtNo_Num = SO.episode_no
JOIN smsmir.sr_ord_sts_hist SOS
ON SO.ord_no = SOS.ord_no
JOIN smsmir.ord_sts_modf_mstr OSM
ON SOS.hist_sts = OSM.ord_sts_modf_cd

-- FILTER(S)
WHERE PV.Adm_Date BETWEEN @SD AND @ED
AND SO.svc_cd = 'PCO_REMFOLEY'
OR SO.svc_cd = 'PCO_INSRTFOLEY'
OR SO.svc_cd = 'PCO_INSTFOLEY'
OR SO.svc_cd = 'PCO_URIMETER'

AND SO.ord_no NOT IN (
    SELECT SO.ord_no
    FRROM smsdss.BMH_PLM_PtAcct_V PV
    JOIN smsmir.sr_ord SO
    ON PV.PtNo_Num = SO.episode_no
    JOIN smsmir.sr_ord_sts_hist SOS
    ON SO.ord_no = SOS.ord_no
    JOIN smsmir.ord_sts_modf_mstr OSM
    ON SOS.hist_sts = OSM.ord_sts_modf_cd
    WHERE OSM.ord_sts = 'DISCONTINUE'
    AND SO.svc_cd = 'PCO_REMFOLEY'
    OR SO.svc_cd = 'PCO_INSRTFOLEY'
    OR SO.svc_cd = 'PCO_INSTFOLEY'
    OR SO.svc_cd = 'PCO_URIMETER'
)
ORDER BY PV.PtNo_Num, SO.ord_no, SOS.prcs_dtime

Thank you,

Best Answer

Oleski's answer is incorrect. For SQL Server 2008, an IN list gets refactored to a series of OR statements. It may be different in say MySQL.

I'm fairly certain that if you generated actual execution plans for both your queries they would be identical.

In all likelihood the second query ran faster because you ran it second, and the first query had already pulled all the data pages from the database and paid the IO cost. The second query was able to read all the data from memory and execute a lot faster.

Update

The actual source of the variance is likely that the queries are not equivalent. You have two different OR lists below:

WHERE PV.Adm_Date BETWEEN @SD AND @ED
AND SO.svc_cd = 'PCO_REMFOLEY'
OR SO.svc_cd = 'PCO_INSRTFOLEY'
OR SO.svc_cd = 'PCO_INSTFOLEY'
OR SO.svc_cd = 'PCO_URIMETER'

and later

 WHERE OSM.ord_sts = 'DISCONTINUE'
    AND SO.svc_cd = 'PCO_REMFOLEY'
    OR SO.svc_cd = 'PCO_INSRTFOLEY'
    OR SO.svc_cd = 'PCO_INSTFOLEY'
    OR SO.svc_cd = 'PCO_URIMETER'

In both those WHERE clauses, operator precendence (where AND is handled before OR) means that the actual logic run by the engine is:

WHERE (ConditionA AND ConditionB)
OR ConditionC
OR ConditionD
OR ConditionE

If you replace the OR lists with an IN expression, the logic will be:

WHERE ConditionA
AND (ConditionB OR ConditionC OR ConditionD OR ConditionE)

Which is radically different.

Related Topic