Entity Framework – Nesting Linq-to-Objects Query within Linq-to-Entities

entity-frameworklinq

   var numbers = new int[] { 1, 2, 3, 4, 5 };

   var contacts = from c in context.Contacts
                  where c.ContactID == numbers.Max() | c.ContactID == numbers.FirstOrDefault()
                  select c;

    foreach (var item in contacts) Console.WriteLine(item.ContactID); 

Linq-to-Entities query is first translated into Linq expression tree, which is then converted by Object Services into command tree. And if Linq-to-Entities query nests Linq-to-Objects query, then this nested query also gets translated into an expression tree.

a) I assume none of the operators of the nested Linq-to-Objects query actually get executed, but instead data provider for particular DB (or perhaps Object Services) knows how to transform the logic of Linq-to-Objects operators into appropriate SQL statements?

b) Data provider knows how to create equivalent SQL statements only for some of the Linq-to-Objects operators?

c) Similarly, data provider knows how to create equivalent SQL statements only for some of the non-Linq methods in the Net Framework class library?

EDIT:

I know only some Sql so I can't be completely sure, but reading Sql query generated for the above code it seems data provider didn't actually execute numbers.Max method, but instead just somehow figured out that numbers.Max should return the maximum value and then proceed to include in generated Sql query a call to TSQL's build-in MAX function. It also put all the values held by numbers array into a Sql query.

 SELECT CASE
         WHEN (([Project1].[C1] = 1)
               AND ([Project1].[C1] IS NOT NULL)) THEN '0X0X'
         ELSE '0X1X'
       END                      AS [C1],
       [Extent1].[ContactID]    AS [ContactID],
       [Extent1].[FirstName]    AS [FirstName],
       [Extent1].[LastName]     AS [LastName],
       [Extent1].[Title]        AS [Title],
       [Extent1].[AddDate]      AS [AddDate],
       [Extent1].[ModifiedDate] AS [ModifiedDate],
       [Extent1].[RowVersion]   AS [RowVersion],
       CASE
         WHEN (([Project1].[C1] = 1)
               AND ([Project1].[C1] IS NOT NULL)) THEN [Project1].[CustomerTypeID]
       END                      AS [C2],
       CASE
         WHEN (([Project1].[C1] = 1)
               AND ([Project1].[C1] IS NOT NULL)) THEN [Project1].[InitialDate]
       END                      AS [C3],
       CASE
         WHEN (([Project1].[C1] = 1)
               AND ([Project1].[C1] IS NOT NULL)) THEN [Project1].[PrimaryDesintation]
       END                      AS [C4],
       CASE
         WHEN (([Project1].[C1] = 1)
               AND ([Project1].[C1] IS NOT NULL)) THEN [Project1].[SecondaryDestination]
       END                      AS [C5],
       CASE
         WHEN (([Project1].[C1] = 1)
               AND ([Project1].[C1] IS NOT NULL)) THEN [Project1].[PrimaryActivity]
       END                      AS [C6],
       CASE
         WHEN (([Project1].[C1] = 1)
               AND ([Project1].[C1] IS NOT NULL)) THEN [Project1].[SecondaryActivity]
       END                      AS [C7],
       CASE
         WHEN (([Project1].[C1] = 1)
               AND ([Project1].[C1] IS NOT NULL)) THEN [Project1].[Notes]
       END                      AS [C8],
       CASE
         WHEN (([Project1].[C1] = 1)
               AND ([Project1].[C1] IS NOT NULL)) THEN [Project1].[RowVersion]
       END                      AS [C9],
       CASE
         WHEN (([Project1].[C1] = 1)
               AND ([Project1].[C1] IS NOT NULL)) THEN [Project1].[BirthDate]
       END                      AS [C10],
       CASE
         WHEN (([Project1].[C1] = 1)
               AND ([Project1].[C1] IS NOT NULL)) THEN [Project1].[HeightInches]
       END                      AS [C11],
       CASE
         WHEN (([Project1].[C1] = 1)
               AND ([Project1].[C1] IS NOT NULL)) THEN [Project1].[WeightPounds]
       END                      AS [C12],
       CASE
         WHEN (([Project1].[C1] = 1)
               AND ([Project1].[C1] IS NOT NULL)) THEN [Project1].[DietaryRestrictions]
       END                      AS [C13]
FROM   [dbo].[Contact] AS [Extent1]
       LEFT OUTER JOIN (SELECT [Extent2].[ContactID]            AS [ContactID],
                               [Extent2].[BirthDate]            AS [BirthDate],
                               [Extent2].[HeightInches]         AS [HeightInches],
                               [Extent2].[WeightPounds]         AS [WeightPounds],
                               [Extent2].[DietaryRestrictions]  AS [DietaryRestrictions],
                               [Extent3].[CustomerTypeID]       AS [CustomerTypeID],
                               [Extent3].[InitialDate]          AS [InitialDate],
                               [Extent3].[PrimaryDesintation]   AS [PrimaryDesintation],
                               [Extent3].[SecondaryDestination] AS [SecondaryDestination],
                               [Extent3].[PrimaryActivity]      AS [PrimaryActivity],
                               [Extent3].[SecondaryActivity]    AS [SecondaryActivity],
                               [Extent3].[Notes]                AS [Notes],
                               [Extent3].[RowVersion]           AS [RowVersion],
                               cast(1 as bit)                   AS [C1]
                        FROM   [dbo].[ContactPersonalInfo] AS [Extent2]
                               INNER JOIN [dbo].[Customers] AS [Extent3]
                                 ON [Extent2].[ContactID] = [Extent3].[ContactID]) AS [Project1]
         ON [Extent1].[ContactID] = [Project1].[ContactID]
       LEFT OUTER JOIN (SELECT TOP (1) [c].[C1] AS [C1]
                        FROM   (SELECT [UnionAll3].[C1] AS [C1]
                                FROM   (SELECT [UnionAll2].[C1] AS [C1]
                                        FROM   (SELECT [UnionAll1].[C1] AS [C1]
                                                FROM   (SELECT 1 AS [C1]
                                                        FROM   (SELECT 1 AS X) AS [SingleRowTable1]
                                                        UNION ALL


                                                        SELECT 2 AS [C1]
                                                        FROM   (SELECT 1 AS X) AS [SingleRowTable2]) AS [UnionAll1]
                                                UNION ALL


                                                SELECT 3 AS [C1]
                                                FROM   (SELECT 1 AS X) AS [SingleRowTable3]) AS [UnionAll2]
                                        UNION ALL


                                        SELECT 4 AS [C1]
                                        FROM   (SELECT 1 AS X) AS [SingleRowTable4]) AS [UnionAll3]
                                UNION ALL


                                SELECT 5 AS [C1]
                                FROM   (SELECT 1 AS X) AS [SingleRowTable5]) AS [c]) AS [Limit1]
         ON 1 = 1
       LEFT OUTER JOIN (SELECT TOP (1) [c].[C1] AS [C1]
                        FROM   (SELECT [UnionAll7].[C1] AS [C1]
                                FROM   (SELECT [UnionAll6].[C1] AS [C1]
                                        FROM   (SELECT [UnionAll5].[C1] AS [C1]
                                                FROM   (SELECT 1 AS [C1]
                                                        FROM   (SELECT 1 AS X) AS [SingleRowTable6]
                                                        UNION ALL


                                                        SELECT 2 AS [C1]
                                                        FROM   (SELECT 1 AS X) AS [SingleRowTable7]) AS [UnionAll5]
                                                UNION ALL


                                                SELECT 3 AS [C1]
                                                FROM   (SELECT 1 AS X) AS [SingleRowTable8]) AS [UnionAll6]
                                        UNION ALL


                                        SELECT 4 AS [C1]
                                        FROM   (SELECT 1 AS X) AS [SingleRowTable9]) AS [UnionAll7]
                                UNION ALL


                                SELECT 5 AS [C1]
                                FROM   (SELECT 1 AS X) AS [SingleRowTable10]) AS [c]) AS [Limit2]
         ON 1 = 1
       CROSS JOIN (SELECT MAX([UnionAll12].[C1]) AS [A1]
                   FROM   (SELECT [UnionAll11].[C1] AS [C1]
                           FROM   (SELECT [UnionAll10].[C1] AS [C1]
                                   FROM   (SELECT [UnionAll9].[C1] AS [C1]
                                           FROM   (SELECT 1 AS [C1]
                                                   FROM   (SELECT 1 AS X) AS [SingleRowTable11]
                                                   UNION ALL


                                                   SELECT 2 AS [C1]
                                                   FROM   (SELECT 1 AS X) AS [SingleRowTable12]) AS [UnionAll9]
                                           UNION ALL


                                           SELECT 3 AS [C1]
                                           FROM   (SELECT 1 AS X) AS [SingleRowTable13]) AS [UnionAll10]
                                   UNION ALL


                                   SELECT 4 AS [C1]
                                   FROM   (SELECT 1 AS X) AS [SingleRowTable14]) AS [UnionAll11]
                           UNION ALL


                           SELECT 5 AS [C1]
                           FROM   (SELECT 1 AS X) AS [SingleRowTable15]) AS [UnionAll12]) AS [GroupBy1]
WHERE  [Extent1].[ContactID] IN ([GroupBy1].[A1], (CASE
                                                     WHEN ([Limit1].[C1] IS NULL) THEN 0
                                                     ELSE [Limit2].[C1]
                                                   END))

Based on this, is it possible that Linq2Entities provider indeed doesn't execute non-Linq and Linq-to-Object methods, but instead creates equivalent SQL statements for some of them ( and for others it throws an exception )?

Thank you in advance

Best Answer

Presumably the data provider looks at each sub-expression and if it can be computed it just does a compile and execute while it's constructing the query. This would be no different than saying:

where c.ContactID == (4 + 5)

That's because numbers in your example is an int[] type.

Interestingly this is only evaluated once before the query is generated, so it's only calculated once. If you did something like this:

where c.ContactID == myObject.NextID()

...then it evaluates myObject.NextID() during the construction of the query, so it only calls it once. I have witnessed, in Linq2NHibernate, that not only does it only call it once during the creating of the query, but the data provider has built-in query caching, and it won't re-evaluate it the next time you execute the same query. YMMV.

Also remember it's lazy evaluated, so if you never execute the query, the subexpression is never compiled and executed. Also, if you modify numbers after your statement but before executing the foreach then the query will be built using your modified value.


Edit:

Here's an article called Walkthrough: Creating an IQueryable LINQ Provider. Check the section on "Adding the Expression Evaluator". Specifically the Nominator class:

/// <summary> 
/// Performs bottom-up analysis to determine which nodes can possibly 
/// be part of an evaluated sub-tree. 
/// </summary> 

Basically it's searching the expression tree for anything that can be evaluated, and compiling and executing it, then replacing that sub-expression with a constant expression.

Here's the function that determines if an expression can be evaluated:

private static bool CanBeEvaluatedLocally(Expression expression)
{
    return expression.NodeType != ExpressionType.Parameter;
}

So as long as no sub-expression contains a parameter of the expression, then it can be evaluated.


Edit 2:

Based on your edit, it does look like it's putting the values from the array into a table in SQL and selecting MAX from that. I honestly have no idea why it would bother doing that when none of the elements of the sub-expression numbers.Max() are parameters. I certainly know that you can write an expression like numbers.Contains(c.ID) and that will actually create an IN clause such as c.ID IN (@p0, @p1, @p2, @p3, @p4) where @p0 = 1, @p1 = 2, etc.

So, to answer your question: "is it possible that Linq2Entities provider indeed doesn't execute non-Linq and Linq-to-Object methods, but instead creates equivalent SQL statements for some of them"

Yes, it's possible.

Related Topic