The earlier sections described the basic LINQ commands that you might expect to use regularly, but there’s much more to LINQ than these simple queries. The following sections describe some of the more advanced LINQ commands that are less intuitive and that you probably won’t need to use as often.
The Join keyword selects data from multiple data sources matching up corresponding fields. The following pseudo-code shows the Join command’s syntax:
From variable1 In data source1
Join variable2 In data source2
On variable1.field1 Equals variable2.field2
For example, the following query selects objects from the all_customers list. For each object it finds, it also selects objects from the all_orders list where the two records have the same CustId value.
Dim query = From cust As Customer In all_customers
Join ord In all_orders
On cust.CustId Equals ord.CustId
A LINQ Join is similar to a SQL join except the On clause only allows you to select objects where fields are equal and the Equals keyword is required.
The following query selects a similar set of objects without using the Join keyword. Here the Where clause makes the link between the all_customers and all_orders lists:
Dim query = From cust As Customer In all_customers, ord In all_orders
Where cust.CustId = ord.CustId
This is slightly more flexible because the Where clause can make tests that are more complicated than the Join statement’s Equals clause.
The Group Join statement selects data much as a Join statement does, but it returns the results differently. The Join statement returns an IEnumerable object that holds whatever is selected by the query (the cust and ord objects in this example).
The Group Join statement returns the same objects but in a different arrangement. Each item in the IEnumerable result contains an object of the first type (cust in this example) plus another IEnumerable that holds the corresponding objects of the second type (ord in this example).
For example, the following query selects customers and their corresponding orders much as the earlier examples do. The new clause Into CustomerOrders means the IEnumerable containing the orders for each customer should be called CustomerOrders. The = Group part means CustomerOrders should contain the results of the grouping.
Dim query =
From cust In all_customers
Group Join ord In all_orders
On cust.CustId Equals ord.CustId
Into CustomerOrders = Group
The following code shows how a program might display these results:
For Each c In query
' Display the customer.
Debug.WriteLine(c.cust.ToString())
' Display the customer's orders.
For Each o In c.CustomerOrders
Debug.WriteLine(Space$(4) & "OrderId: " & o.OrderId &
", Date: " & o.OrderDate & vbCrLf
Next o
Next c
Each item in the main IEnumerable contains a cust object and an IEnumerable named CustomerOrders. Each CustomerOrders object contains ord objects corresponding to the cust object.
This code loops through the query’s results. Each time through the loop, it displays the cust object’s information and then loops through its CustomerOrders, displaying each ord object’s information indented.
Example program JoinExamples, which is available for download on the book’s website, demonstrates these types of Join queries.
Like the Group Join clause, the Group By clause lets a program select data from a flat, relational style format and build a hierarchical arrangement of objects. It also returns an IEnumerable that holds objects, each containing another IEnumerable.
The following code shows the basic Group By syntax:
From variable1 In datasource1
Group items By value Into groupname = Group
Here, items is a list of items whose properties you want selected into the group. In other words, the properties of the items variables are added to the objects in the nested IEnumerable.
If you omit the items parameter, the query places the objects selected by the rest of the query into the nested IEnumerable.
The value property tells LINQ on what field to group objects. This value is also stored in the top-level IEnumerable values.
The groupname parameter gives a name for the group. The objects contained in the top-level IEnumerable get a property with this name that is an IEnumerable containing the grouped values.
Finally, the = Group clause indicates that the group should contain the fields selected by the query.
If this definition seems a bit confusing, an example should help. The following query selects objects from the all_orders list. The Group By statement makes the query group orders with the same CustId value.
Dim query1 = From ord In all_orders
Order By ord.CustId, ord.OrderId
Group ord By ord.CustId Into CustOrders = Group
The result is an IEnumerable that contains objects with two fields. The first field is the CustId value used to define the groups. The second field is an IEnumerable named CustOrders that contains the group of order objects for each CustId value.
The following code shows how a program might display the results in a TreeView control:
Dim root1 As TreeNode = trvResults.Nodes.Add("Orders grouped by CustId")
For Each obj In query1
' Display the customer id.
Dim cust_node As TreeNode = root1.Nodes.Add("Cust Id: " & obj.CustId)
' List this customer's orders.
For Each ord In obj.CustOrders cust_node.Nodes.Add("OrderId: " & ord.OrderId &
", Date: " & ord.OrderDate)
Next ord
Next obj
The code loops through the top-level IEnumerable. Each time through the loop, it displays the group’s CustId and then loops through the group’s CustOrders IEnumerable displaying each order’s ID and date.
Example program SimpleGroupBy, which is available for download on the book’s website, demonstrates this type of Group By statement.
Another common type of query uses the Group By clause to apply some aggregate function to the items selected in a group. The following query selects order and order item objects, grouping each order’s items and displaying each order’s total price:
Dim query1 = From ord In all_orders, ord_item In all_order_items
Order By ord.CustId, ord.OrderId
Where ord.OrderId = ord_item.OrderId
Group ord_item By ord Into
TotalPrice = Sum(ord_item.Quantity * ord_item.UnitPrice),
OrderItems = Group
The query selects objects from the all_orders and all_order_items lists using a Where clause to join them.
The Group ord_item piece places the fields of the ord_item object in the group. The By ord piece makes each group hold items for a particular ord object.
The Into clause selects two values. The first is a sum over all of the group’s ord_item objects adding up the ord_items’ Quantity times UnitPrice fields. The second value selected is the group named OrderItems.
The following code shows how a program might display the results in a TreeView control named trvResults:
Dim root1 As TreeNode = trvResults.Nodes.Add("Orders")
For Each obj In query1
' Display the order id.
Dim cust_node As TreeNode =
root1.Nodes.Add("Order Id: " & obj.ord.OrderId &
", Total Price: " & FormatCurrency(obj.TotalPrice))
' List this order's items.
For Each ord_item In obj.OrderItems
cust_node.Nodes.Add(ord_item.Description & ": " &
ord_item.Quantity & " @ " & FormatCurrency(ord_item.UnitPrice))
Next ord_item
Next obj
Each loop through the query results represents an order. For each order, the program creates a tree node showing the order’s ID and the TotalPrice value that the query calculated for it.
Next, the code loops through the order’s items stored in the OrderItems group. For each item, it creates a tree node showing the item’s Description, Quantity, and TotalPrice fields.
Example program GroupByWithTotals, which is available for download on the book’s website, demonstrates this Group By statement.
The preceding section explained how a Group By query can use the Sum aggregate function. LINQ also supports the reasonably self-explanatory aggregate functions Average, Count, LongCount, Max, and Min.
The following query selects order objects and their corresponding order items. It uses a Group By clause to calculate aggregates for each of the orders’ items.
Dim query1 = From ord In all_orders, ord_item In all_order_items
Order By ord.CustId, ord.OrderId
Where ord.OrderId = ord_item.OrderId
Group ord_item By ord Into
TheAverage = Average(ord_item.UnitPrice * ord_item.Quantity),
TheCount = Count(ord_item.UnitPrice * ord_item.Quantity),
TheLongCount = LongCount(ord_item.UnitPrice * ord_item.Quantity),
TheMax = Max(ord_item.UnitPrice * ord_item.Quantity),
TheMin = Min(ord_item.UnitPrice * ord_item.Quantity),
TheSum = Sum(ord_item.Quantity * ord_item.UnitPrice)
The following code loops through the query’s results and adds each order’s aggregate values to a string named txt. It displays the final results in a text box named txtResults.
For Each obj In query1
' Display the order info.
txt &= "Order " & obj.ord.OrderId &
", Average: " & obj.TheAverage &
", Count: " & obj.TheCount &
", LongCount: " & obj.TheLongCount &
", Max: " & obj.TheMax &
", Min: " & obj.TheMin &
", Sum: " & obj.TheSum &
vbCrLf
Next obj
txtResults.Text = txt
If you add the Distinct keyword to a query, LINQ keeps only one instance of each value selected. For example, the following query returns a list of IDs for customers who placed an order before 4/15/2012:
Dim query = From ord In all_orders
Where ord.OrderDate < #4/15/2012#
Select ord.CustId
Distinct
The code examines objects in the all_orders list with OrderDate fields before 4/15/2012. It selects those objects’ CustId fields and uses the Distinct keyword to remove duplicates. If a particular customer placed several orders before 4/15/2012, this query lists that customer’s ID only once.
LINQ also provides Union, Intersection, and Except extension methods, but they are not supported by Visual Basic’s LINQ syntax. See the section “LINQ Functions” later in this chapter for more information.
Example program SetExamples, which is available for download on the book’s website, demonstrates these set operations.
LINQ includes several keywords for limiting the results returned by a query.
The following code demonstrates each of these commands:
Dim q1 = From cust In all_customers Take 5
Dim q2 = From cust In all_customers Take While cust.FirstName.Contains("n")
Dim q3 = From cust In all_customers Skip 3
Dim q4 = From cust In all_customers Skip While cust.FirstName.Contains("n")
The first query selects the first five customers and ignores the rest.
The second query selects customers as long as the FirstName field contains the letter “n.” It then discards any remaining results, even if a later customer’s FirstName contains an “n.”
The third query discards the first three customers and then selects the rest.
The final query skips customers as long as their FirstName values contain the letter “n” and then keeps the rest.
Example program LimitingExamples, which is available for download on the book’s website, demonstrates these commands.