Entity Framework: Queries involving many to many relationship tables

Entity framework handles tables participating in Many to Many relationship in a nice manner. If the junction table (sometimes called bridge table, association table, link table, etc) only consists of the foreign keys and no other columns, then that table is abstracted by EF and the two sides get a navigational property exposing a collection of the other side. If the junction table contains other fields as well (e.g. the table has its own primary key column), then that junction table is included in the model with many-to-one relationships with both the participating tables. This post will present some common LINQ queries involving the many to many relationship tables in both scenarios: when the junction table is abstracted by EF, and when it is exposed. Read the rest of this entry »

LINQ: Where are my extension methods?

The powerful Extension Methods due to LINQ have made our development really simple and easy. However in some cases these methods are not available as depicted in the following screenshot.

Here are a few scenarios in which we may find our extension methods missing:

  • Iterating through all the controls in a Windows Form (or any other container control) via their Controls collection
  • Iterating through items in our classic non-generic collections (e.g. ArrayList) under System.Collections namespace
  • Iterating through Columns in a DataGridView control
  • Iterating through Matches while working with Regular expressions
  • Retrieving enumeration values via Enum.GetValues

Read the rest of this entry »

LINQ: How to write queries with complex join conditions

This small post is a follow-up of my previous post in which I discussed some common scenarios and how we can handle them in our LINQ queries. This time, we will see how to write queries that have not-so-simple conditions as join predicates. Note that LINQ directly supports only equi-joins, that is, joins based on equal conditions. Hence, for simple primary / foreign key fields, we can simply use the syntax:

from c in Customers
join o in Orders on c.CustomerID equals o.CustomerID

For composite primary / foreign keys, we need to create an anonymous type with same data type and member names in the join predicate:

from c in Customers
join o in Orders on new {c.CompanyID, c.CustomerID} equals {o.CompanyID, o.CustomerID}

However, sometimes, we have join predicates that do not check equality, e.g. our join condition is based on a between check as in the following T-SQL query.

Select ..
 from Order o
 join AccountingPeriod ap
  on o.OrderDate between ap.PeriodStart and ap.PeriodEnd

Such situation can be handled in LINQ by using an always true join condition and transferring the join predicate to the where part. Here’s the LINQ translation for the above SQL query:

from o in dc.Orders
join ap in dc.AccountingPeriods on true equals true
where o.OrderDate >= ap.PeriodStart && o.OrderDate <= ap.PeriodEnd
select …

Ane even better version as pointed by zn00p in the comments section is:

from o in dc.Orders
from ap in dc.AccountingPeriods 
where o.OrderDate >= ap.PeriodStart && o.OrderDate <= ap.PeriodEnd
select …

Notice that this concept is valid for inner joins only. Since for inner joins, the join predicate can be moved to the where part of the TSQL query. The above trick is valid for both LINQ to SQL and LINQ to Entities (Entity Framework).

LINQ: How to build complex queries utilizing deferred execution and anonymous types

While working with LINQ to SQL and LINQ to Entities (Entitiy Framework), I really appreciate and utilize the deferred execution capabilities, i.e. the queries are not materialized (sent to server) until they are needed. This great feature can help us to construct our query in several steps and then send a single optimized TSQL version to the database for processing. This approach is similar to our application of CTE (Common Table Expressions) in T-SQL, with which we can build our T-SQL queries in small manageable steps and then finally run a single optimized query on the database. This post will present some scenarios to show how we can utilize the deferred execution capabilities and anonymous types to build complex queries in LINQ.

Read the rest of this entry »

LINQ: How to get the latest/last record with a group by clause

Sometimes, we need to select a record that has the maximum or minimum value from a group. Some examples are:

  • We have a PersonOrders table and we want to find the last Order for each Person
  • We have a UserOperations table and we need to find the first Operation for each User
  • We have a Persons table and we need to find the person with maximum salary for each Department

Such scenarios are beautifully handled by TSQL’s Ranking_functions over partition by clauses. Here’s a typical example that uses Row_Number() function to assign a rank to each row per Person.

;With PersonOrderWithRank 
as
(
	Select *, Rnk = ROW_NUMBER() over (partition by PersonID order by OrderDate desc)
	from PersonOrders
)

Select * 
from PersonOrderWithRank
where Rnk=1

In LINQ, similar result can be achieved by using the let keyword. Here’s an example:

from p in PersonOrders
//where conditions or joins with other tables to be included here
group p by p.PersonID into grp
let MaxOrderDatePerPerson = grp.Max ( g=>g.OrderDate )

from p in grp
where p.OrderDate == MaxOrderDatePerPerson
select p

Another more compact method would be to retrieve the first record in the grouping like this:

from p in PersonOrders
//where conditions or joins with other tables to be included here
group p by p.PersonID into grp
select grp.OrderByDescending(g=>g.OrderDate).First()

The above LINQ approaches can be used for both LINQ To SQL as well as Entity Framework. Although the SQL and the LINQ approaches I described above are not exactly identical (since I used Row_Number() and not Rank() or Dense_Rank() ) but the purpose of this post is to provide an starting point to write similar queries using LINQ.

LINQ to SQL: Visual Studio designer failed to autogenerate .designer.cs data classes

I had a really strange observation today. I opened one of my LINQ to SQL dbml files, made some changes, and then saved it back to have my designer generated data classes updated. But instead of reflecting my changes in the .designer.cs class, Visual studio deleted that designer generated file. I tried several times but every time LINQ designer was deleting my autogenerated data classes. I googled and found an amazing answer by Marc Gravell at this stackoverflow question. The position of using statements was the source of problem!!! I had extended the LINQ generated partial classes and the first statement in that file was “using System”. I removed that and the VS designer was happy again.

So for anyone else that experiences the same problem, try moving your using statements after the namespace declaration. For example, if you have extended your data context or any other data class like this:

using System;
namespace MyNamespace
{
    partial class MyDataContext
    {
        ...
    }
    ...
}

Try rearranging the declarations like this: 

namespace MyNamespace
{
    using System;

    partial class MyDataContext
    {
        ...
    }
    ...
}

Hope this post helps someone else as well.

C#: Left outer joins with LINQ

I always considered Left Outer Join in LINQ to be complex until today when I had to use it in my application. I googled and the first result gave a very nice explanation. The only difference between ordinary joins (inner joins) and left joins in LINQ is the use of “join into” and “DefaultIfEmpty()” expressions.

Consider this very simple query (Assuming a scenario that not all the TimesheetLines are associated with a Job)

Select TL.EntryDate, TL.Hours, J.JobName
From TimeSheetLines TL
Left Join Jobs J on TL.JobNo=J.JobNo

A LINQ query using inner join is

var lines =
    from tl in db.TimeSheetLines
    join j  in db.Jobs on tl.JobNo equals j.JobNo
    where tl.ResourceNo == resourceNo

    select new
    {
        EntryDate = tl.EntryDate,
        Hours = tl.Hours,
        Job = j.JobName
    };

And a LINQ query performing left join is

var lines =
    from tl in db.TimeSheetLines
    join j  in db.Jobs on tl.JobNo equals j.JobNo into tl_j
    where tl.ResourceNo == resourceNo

    from j in tl_j.DefaultIfEmpty()
    select new
    {
        EntryDate = tl.EntryDate,
        Hours = tl.Hours,
        Job = j.JobName
    };

Notice that the only difference is the use of “into” with the join statement followed by reselecting the result using “DefaultIfEmpty()” expression. And here’s the generated SQL for the above LINQ expression.

SELECT [t0].[EntryDate] as [EntryDate], [t0].[Hours] as [Hours], [t1].[JobName] AS [Job]
FROM [dbo].[TimeSheetLine] AS [t0]
LEFT OUTER JOIN [dbo].[Jobs] AS [t1] ON [t0].[JobNo] = [t1].[JobNo]
WHERE [t0].[ResourceNo] = @p0

Another LINQ version which is more compact is:

var lines =
    from tl in db.TimeSheetLines
    from j in db.Jobs.Where(j=>j.JobNo == tl.JobNo).DefaultIfEmpty()
    select new
    {
        EntryDate = tl.EntryDate,
        Hours = tl.Hours,
        Job = j.JobName
    };

Similarly, this concept can be expanded for multiple left joins. Assuming that a TimeSheetLine will either have a JobNo or an IndirectCode, consider this SQL query:

Select TL.EntryDate, TL.Hours, J.JobName, I.IndirectName
From TimeSheetLines TL
Left Join Jobs J on TL.JobNo=J.JobNo
Left Join Indirects I on TL.IndirectCode=I.IndirectCode

The equivalent LINQ query is:

var lines =
    from tl in db.TimeSheetLines
    join j in db.Jobs      on tl.JobNo        equals j.JobNo         into tl_j
    join i in db.Indirects on tl.IndirectCode equals i.IndirectCode  into tl_i
    where tl.ResourceNo == resourceNo

    from j in tl_j.DefaultIfEmpty()
    from i in tl_i.DefaultIfEmpty()
    select new
    {
        EntryDate = tl.EntryDate,
        Hours = tl.Hours,
        Job = j.JobName,
        Indirect = i.IndirectName,
    };

And the generated SQL is:

SELECT [t0].[EntryDate] as [EntryDate], [t0].[Hours] as [Hours], [t1].[JobName] AS [Job], [t2].[IndirectName] As [Indirect]
LEFT OUTER JOIN [dbo].[Jobs] AS [t1] ON [t0].[JobNo] = [t1].[JobNo]
LEFT OUTER JOIN [dbo].[Indirects] AS [t2] ON [t0].[IndirectCode] = [t2].[IndirectCode]
WHERE [t0].[ResourceNo] = @p0

That’s all, left outer joins in LINQ are as easy as in T-SQL. Happy joining.

Update:
Notice that this post describes the approach to perform a Left Outer Join in LINQ To SQL as well as Entity Framework (version 4). The same is not true for Entity Framework version 3.5 since it does not support the DefaultIfEmpty keyword. To perform Left Outer Joins with Entity Framework 3.5, we need to create appropriate relationships (e.g 0..1 to 0..Many) in our Entity Model and they will be automatically translated into TSQL’s Left Join clause.