Sql-server – SQL Server multiple INNER JOINs very slow

query-optimizationsql server

I'm pretty much an idiot when it come to databases, I can write the query to do what I want without too many problems, but when I hit a performance issue I really have no idea what to do, so any help would be gratefully received.

I have three tables:

Bill

  • Bill_Id – BIGINT – Primary key
  • BillDate – DATE

BillDetail

  • BillDetail_Id – BIGINT – Primary key
  • Bill_Id – BIGINT – Foreign key for Bill, indexed
  • BillDetailType_Id – INT – Foreign key for BillDetailType, indexed
  • Charge – MONEY

BillDetailType

  • BillDetailType_Id – INT – Primary key
  • TypeName – NVARCHAR(20)

Each Bill has multiple BillDetails, which are basically the individual items on a bill. Every BillDetail has a BillDetailType, which is what kind of bill item it is (e.g. electricity, internet, tax).

I have also created a view like this:

CREATE VIEW BillSubtotal
AS
SELECT b.*,
        (SELECT SUM(bd.Charge) FROM BillDetail AS bd INNER JOIN BillDetailType AS bdt ON bd.BillDetailType_Id = bdt.BillDetailType_Id
            WHERE (bdt.TypeName = 'Tax') AND (bd.Bill_Id = b.Bill_id)) AS Tax,
        (SELECT SUM(bd.Charge) FROM BillDetail AS bd INNER JOIN BillDetailType AS bdt ON bd.BillDetailType_Id = bdt.BillDetailType_Id
            WHERE (bdt.TypeName <> 'Tax') AND (bd.Bill_Id = b.Bill_id)) AS NonTaxTotal
        FROM Bill AS b

Running that view takes about 14 seconds with the current dev database, which has about 60000 Bills and 700000 BillDetails. There are 26 different BillDetailTypes. I'd like to add some more subtotals once I get this working, but for now that's all I have.

Now I'm trying to do a join like this:

SELECT bs.BillDate, bs.Tax, bs.NonTaxTotal, bd.Charge, bdt.TypeName FROM
BillDetail bd
INNER JOIN BillSubtotal bs ON bs.Bill_Id = bd.Bill_Id
INNER JOIN BillDetailType bdt ON bdt.BillDetailType_Id = bd.BillDetailType_Id

I would like to calculate what percentage of a pre-tax Bill a particular BillDetail is and some other things, so I will eventually have something like bd.Charge/bs.NonTaxTotal*100, but at the moment this query takes 14 hours to run and I really don't understand why.

If I remove either of the INNER JOINs, the query speeds up dramatically:

SELECT bs.BillDate, bs.Tax, bs.NonTaxTotal, bd.Charge FROM
BillDetail bd
INNER JOIN BillSubtotal bs ON bs.Bill_Id = bd.Bill_Id

Takes about 1.5 minutes to run.

SELECT bd.Charge, bdt.TypeName FROM
BillDetail bd
INNER JOIN BillDetailType bdt ON bdt.BillDetailType_Id = bd.BillDetailType_Id

Takes about 12 seconds.

I don't understand why either of the joins by themselves runs in such a short time, but when I do the joins together it takes hours. Maybe it's something very obvious, but because I don't really understand how the queries are being evaluated I'm missing it. I looked at the execution plan, but I can't glean anything useful from it and I'm kind of at a dead end. I've tried various ways of switching things around, moving one of the joins to a subquery and other things I thought might help, but nothing I've done has changed the performance.

Thanks for any help.

Best Answer

I would suggest not using a view at all. I did a bunch of this several years ago, but they just became too difficult to manage over a long time. If you add a column to one of the tables, you should update the view. It just becomes too laborious. That being said, you can add indexes to views.

I would also suggest using the Group By strategy. In my experience, this can be a whole lot faster. I've used it in several cases and found remarkable improvements in speed. Something like this:

 SELECT Bill_Id, 
     SUM(BillDetail.Charge), 
     CASE
         WHEN BillDetailType.TypeName = 'Tax'
             THEN 'Tax'
             ELSE 'Not Tax'
     END AS TypeName            
 FROM BillDetail
     INNER JOIN BillDetailType 
         ON BillDetail.BillDetailType_Id = BillDetailType.BillDetailType_Id
 GROUP BY Bill_Id, TypeName

You could just use this query and join to that rather than creating a view. This would leverage that indexes on the tables themselves.

Finally, you may want to try running whatever query you end up with through Sql Server Profiler tool.

I have a blog post about SQL Query Optimization, which recounts various techniques I've learned over the past 7 years.