LINQ changes the way you write code. This is something that you will often hear when talking about LINQ. It provides the ability to query different data sources and take advantage of the same syntax, background compiler, IntelliSense, and Common Language Runtime (CLR) control with its intuitive programming model. LINQ’s syntax is so straightforward that you become familiar with the technology quickly. In this chapter you learn about LINQ to Objects, which is the starting point of every LINQ discussion. You see how easy it is to query .NET objects, such as in-memory collections. In the next chapters, you’ll find out how you can use the same approach against different data sources and LINQ providers. Take your time to read this chapter; it’s important, especially the second part, which is about standard query operators. It explains concepts that pervade every LINQ provider and will not be explained again in other chapters, except where expressly required.
In the previous chapter, I provided an overview of the LINQ technology and told you that it provides a unified programming model for querying various types of data sources using the same syntax constructs. You got a few examples of LINQ syntax, but now you’ll get to see LINQ in action in different scenarios and with more examples. This chapter is about LINQ to Objects, which is the standard provider for querying in-memory objects. This definition considers collections, arrays, and any other object that implements the IEnumerable
or IEnumerable(Of T)
interface (or interfaces deriving from them). LINQ to Objects can be considered the root of LINQ providers, and it’s important to understand how it works because the approach is essentially the same in accessing databases and XML documents. We focus on code rather than on lengthy discussions of the provider, so let’s start.
The purpose of LINQ to Objects is to query in-memory collections in a strongly typed fashion, using keywords that recall SQL syntax and that are integrated into the Visual Basic language. Because queries are part of the language, the compiler can check them at compile time. Before querying data, you need a data source. For example, imagine you have the following Product
class that represents some food products of your company:
Class Product
Property ProductID As Integer
Property ProductName As String
Property UnitPrice As Decimal
Property UnitsInStock As Integer
Property Discontinued As Boolean
End Class
Coding Tip: Using Object Initializers
In this chapter and the next ones dedicated to LINQ, you will notice that in most cases classes do not implement an explicit constructor. Object initializers are convenient in both normal code and LINQ query expressions, which is the reason custom classes have no explicit constructors here. This is not a mandatory rule; it is an approach common in LINQ code that you need to be familiar with.
At this point consider the following products, as a demonstration:
Dim prod1 As New Product With {.ProductID = 0,
.ProductName = "Pasta",
.UnitPrice = 0.5D,
.UnitsInStock = 10,
.Discontinued = False}
Dim prod2 As New Product With {.ProductID = 1,
.ProductName = "Mozzarella",
.UnitPrice = 1D,
.UnitsInStock = 50,
.Discontinued = False}
Dim prod3 As New Product With {.ProductID = 2,
.ProductName = "Crabs",
.UnitPrice = 7D,
.UnitsInStock = 20,
.Discontinued = True}
Dim prod4 As New Product With {.ProductID = 3,
.ProductName = "Tofu",
.UnitPrice = 3.5D,
.UnitsInStock = 40,
.Discontinued = False}
The code is simple; it just creates several instances of the Product
class populating its properties with some food names and characteristics. Usually, you collect instances of your products in a typed collection. The following code accomplishes this, taking advantage of collection initializers:
Dim products As New List(Of Product) From {prod1,
prod2,
prod3,
prod4}
Because the List(Of T)
collection implements IEnumerable(Of T)
, it can be queried with LINQ. The following query shows how you can retrieve all non-discontinued products in which the UnitsInStock
property value is greater than 10:
Dim query = From prod In products
Where prod.UnitsInStock > 10 _
And prod.Discontinued = False
Order By prod.UnitPrice
Select prod
Coding Tip: Implicit Line Continuation
In LINQ queries, you can take advantage of the feature known as implicit line continuation that avoids the need of writing the underscore character at the end of a line. An exception in LINQ queries is when you use logical operators, as in the preceding code snippet, in which the underscore is required. The code provides an easy view of this necessity. As an alternative, you can place the logical operator (And
, in the example) on the preceding line to avoid the underscore.
This kind of LINQ query is also known as query expression. The From
keyword points to the data source; the prod
identifier represents one product in the products list. The Where
keyword allows filtering data in which the specified condition is evaluated to True
; the Order By
keywords allow sorting data according to the specified property. Select
pushes each item that matches the specified Where
conditions into an IEnumerable(Of T)
result. Notice how local type inference avoids the need of specifying the query result type that is inferred by the compiler as IEnumerable(Of Product)
. Later in this chapter you will see why type inference is important in LINQ queries for anonymous types’ collections. At this point, you can work with the result of your query; for example, you can iterate the preceding query
variable to get information on the retrieved products:
For Each prod In query
Console.WriteLine("Product name: {0}, Unit price: {1}",
prod.ProductName, prod.UnitPrice)
Next
This code produces on your screen a list of products that are not discontinued and where there is a minimum of 11 units in stock. For your convenience, Listing 23.1 shows the complete code for this example, providing a function that returns the query result via the Return
instruction. Iteration is executed later within the caller.
Module Module1
Sub Main()
Dim result = QueryingObjectsDemo1()
For Each prod In result
Console.WriteLine("Product name: {0}, Unit price: {1}",
prod.ProductName, prod.UnitPrice)
Next
Console.ReadLine()
End Sub
Function QueryingObjectsDemo1() As IEnumerable(Of Product)
Dim prod1 As New Product With {.ProductID = 0,
.ProductName = "Pasta",
.UnitPrice = 0.5D,
.UnitsInStock = 10,
.Discontinued = False}
Dim prod2 As New Product With {.ProductID = 1,
.ProductName = "Mozzarella",
.UnitPrice = 1D,
.UnitsInStock = 50,
.Discontinued = False}
Dim prod3 As New Product With {.ProductID = 2,
.ProductName = "Crabs",
.UnitPrice = 7D,
.UnitsInStock = 20,
.Discontinued = True}
Dim prod4 As New Product With {.ProductID = 3,
.ProductName = "Tofu",
.UnitPrice = 3.5D,
.UnitsInStock = 40,
.Discontinued = False}
Dim products As New List(Of Product) From {prod1,
prod2,
prod3,
prod4}
Dim query = From prod In products
Where prod.UnitsInStock > 10 _
And prod.Discontinued = False
Order By prod.UnitPrice
Select prod
Return query
End Function
End Module
Class Product
Property ProductID As Integer
Property ProductName As String
Property UnitPrice As Decimal
Property UnitsInStock As Integer
Property Discontinued As Boolean
End Class
If you run the code in Listing 23.1, you get the following result:
Product name: Mozzarella, Unit price: 1
Product name: Tofu, Unit price: 3.5
Such a result contains only those products that are not discontinued and that are available in more than 10 units. You can perform complex queries with LINQ to Objects.
In Chapter 16, “Working with Collections and Iterators,” you learned about the new feature of Iterators in Visual Basic 2012 and the Yield
keyword. LINQ queries are the perfect candidates for using iterator functions because, in most cases, you will write For..Each
loops against query results. You can rewrite the sample shown in Listing 23.1 by decorating the QueryingObjectsDemo1
method with the Iterator modifier and then replace the Return
statement with the following code:
For Each item In query
Yield item
Next
In this way, items are handed to the caller one at a time as they are requested, which can make your code more responsive in case of very large in-memory collections.
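As a sketch of the approach just described, the following self-contained example rewrites the query method as an iterator function. The method name AvailableProducts, the module name, and the trimmed-down Product class are assumptions made for illustration; they are not part of Listing 23.1.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

'Trimmed-down version of the chapter's Product class
Class Product
    Property ProductName As String
    Property UnitPrice As Decimal
    Property UnitsInStock As Integer
    Property Discontinued As Boolean
End Class

Module IteratorDemo
    'Hypothetical iterator version of the query method
    Iterator Function AvailableProducts(products As IEnumerable(Of Product)) As IEnumerable(Of Product)
        Dim query = From prod In products
                    Where prod.UnitsInStock > 10 AndAlso Not prod.Discontinued
                    Order By prod.UnitPrice
                    Select prod

        'Each item is handed to the caller when the caller asks for it
        For Each item In query
            Yield item
        Next
    End Function

    Sub Main()
        Dim products As New List(Of Product) From {
            New Product With {.ProductName = "Pasta", .UnitPrice = 0.5D, .UnitsInStock = 10},
            New Product With {.ProductName = "Mozzarella", .UnitPrice = 1D, .UnitsInStock = 50},
            New Product With {.ProductName = "Crabs", .UnitPrice = 7D, .UnitsInStock = 20, .Discontinued = True},
            New Product With {.ProductName = "Tofu", .UnitPrice = 3.5D, .UnitsInStock = 40}
        }

        'Lists Mozzarella and Tofu, ordered by unit price
        For Each prod In AvailableProducts(products)
            Console.WriteLine("Product name: {0}", prod.ProductName)
        Next
    End Sub
End Module
```

Keep in mind that Order By still has to read the whole source before the first item can be yielded, so the benefit depends on the shape of the query.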
The following example provides a LINQ to Objects representation of what you get with relational databases and LINQ to SQL or LINQ to Entities. Consider the following class that must be added to the project:
Class ShippingPlan
Property ProductID As Integer
Property ShipDate As Date
End Class
The purpose of the ShippingPlan
is storing the ship date for each product, represented by an ID. Both the ShippingPlan
and Product
classes expose a ProductID
property that provides a basic relationship. Now consider the following code that creates four instances of the ShippingPlan
class, one for each product and a collection of items:
Dim shipPlan1 As New ShippingPlan With {.ProductID = 0,
.ShipDate = New Date(2012, 1, 1)}
Dim shipPlan2 As New ShippingPlan With {.ProductID = 1,
.ShipDate = New Date(2012, 2, 1)}
Dim shipPlan3 As New ShippingPlan With {.ProductID = 2,
.ShipDate = New Date(2012, 3, 1)}
Dim shipPlan4 As New ShippingPlan With {.ProductID = 3,
.ShipDate = New Date(2012, 4, 1)}
Dim shipPlans As New List(Of ShippingPlan) From {
shipPlan1,
shipPlan2,
shipPlan3,
shipPlan4}
At this point, the goal is to retrieve a list of product names and the related ship date. This can be accomplished as follows:
Dim queryPlans = From prod In products
Join plan In shipPlans On plan.ProductID Equals prod.ProductID
Select New With {.ProductName = prod.ProductName,
.ShipDate = plan.ShipDate}
As you can see, the Join
clause enables the joining of data from two different data sources having in common one property. This works similarly to the JOIN
SQL instruction. Notice how you can take advantage of anonymous types to generate a new type on the fly that stores only the necessary information, without creating a custom class for it. This raises a different problem, though. If you use such a query result within the method body that defines it, there is no issue: the compiler knows which members the anonymous type exposes, so you can use them in a strongly typed way. The problem arises when you need to return the query result from a function. You cannot declare a function as returning IEnumerable(Of anonymous type)
, so you would have to return a nongeneric IEnumerable
or an IEnumerable(Of Object)
. In that case, you cannot invoke members of the anonymous type unless you resort to late binding. This makes sense because the members of an anonymous type are visible only within the member that defines them. To solve this problem, you need to define a custom class holding query results. For example, consider the following class:
Class CustomProduct
Property ProductName As String
Property ShipDate As Date
End Class
It stores information from both Product
and ShippingPlan
classes. Now consider the following query:
Dim queryPlans = From prod In products
Join plan In shipPlans On plan.ProductID Equals prod.ProductID
Select New CustomProduct With {.ProductName = prod.ProductName,
.ShipDate = plan.ShipDate}
It creates an instance of the CustomProduct
class each time an object matching the condition is encountered. In this way, you can return the query result as the result of a function returning IEnumerable(Of CustomProduct)
. This scenario is represented for your convenience in Listing 23.2.
Function QueryObjectsDemo2() As IEnumerable(Of CustomProduct)
Dim prod1 As New Product With {.ProductID = 0,
.ProductName = "Pasta",
.UnitPrice = 0.5D,
.UnitsInStock = 10,
.Discontinued = False}
Dim prod2 As New Product With {.ProductID = 1,
.ProductName = "Mozzarella",
.UnitPrice = 1D,
.UnitsInStock = 50,
.Discontinued = False}
Dim prod3 As New Product With {.ProductID = 2,
.ProductName = "Crabs",
.UnitPrice = 7D,
.UnitsInStock = 20,
.Discontinued = True}
Dim prod4 As New Product With {.ProductID = 3,
.ProductName = "Tofu",
.UnitPrice = 3.5D,
.UnitsInStock = 40,
.Discontinued = False}
Dim products As New List(Of Product) From {prod1,
prod2,
prod3,
prod4}
Dim shipPlan1 As New ShippingPlan With {.ProductID = 0,
.ShipDate = New Date(2012, 1, 1)}
Dim shipPlan2 As New ShippingPlan With {.ProductID = 1,
.ShipDate = New Date(2012, 2, 1)}
Dim shipPlan3 As New ShippingPlan With {.ProductID = 2,
.ShipDate = New Date(2012, 3, 1)}
Dim shipPlan4 As New ShippingPlan With {.ProductID = 3,
.ShipDate = New Date(2012, 4, 1)}
Dim shipPlans As New List(Of ShippingPlan) From {
shipPlan1,
shipPlan2,
shipPlan3,
shipPlan4}
Dim queryPlans = From prod In products
Join plan In shipPlans On _
plan.ProductID Equals prod.ProductID
Select New CustomProduct _
With {.ProductName = prod.ProductName,
.ShipDate = plan.ShipDate}
Return queryPlans
End Function
Sub QueryPlans()
Dim plans = QueryObjectsDemo2()
For Each plan In plans
Console.WriteLine("Product name: {0} will be shipped on {1}",
plan.ProductName, plan.ShipDate)
Next
Console.ReadLine()
End Sub
'If you invoke the QueryPlans method, you get the following output:
'Product name: Pasta will be shipped on 01/01/2012
'Product name: Mozzarella will be shipped on 02/01/2012
'Product name: Crabs will be shipped on 03/01/2012
'Product name: Tofu will be shipped on 04/01/2012
You would obtain the same result with anonymous types if the iteration were performed within the method body and not outside the method itself. This approach is always useful and becomes necessary in scenarios such as LINQ to XML on Silverlight applications. LINQ to Objects offers a large number of operators, known as standard query operators; these are discussed in this chapter. An important thing you need to consider is that you can also perform LINQ queries via extension methods. Language keywords for LINQ have extension methods counterparts that can be used with lambda expressions for performing queries. The following snippet provides an example of how the previous query expression can be rewritten invoking extension methods:
Dim query = products.Where(Function(p) p.UnitsInStock > 10 And _
p.Discontinued = False).
OrderBy(Function(p) p.UnitPrice).
Select(Function(p) p)
Notice how extension methods are invoked as if they were instance methods of the data source you are querying. Each method receives a lambda expression that is applied to each instance of the Product
class: a Boolean condition for Where, a sort key for OrderBy, and the value to project for Select. Before studying operators, there is an important concept that you must understand; it’s related to the actual moment when queries are executed.
When the Common Language Runtime (CLR) encounters a LINQ query, the query is not executed immediately. LINQ queries are executed only when they are effectively used. This concept is known as deferred execution and is part of all LINQ providers you encounter, both standard and custom ones. For example, consider the query that is an example in the previous discussion:
Dim query = From prod In products
Where prod.UnitsInStock > 10 _
And prod.Discontinued = False
Order By prod.UnitPrice
Select prod
This query is not executed until you effectively use its result. If the query is defined within a function and is returned as the result of the method, the Return
instruction only hands the query over to the caller; execution still waits until the caller enumerates the result:
Dim query = From prod In products
Where prod.UnitsInStock > 10 _
And prod.Discontinued = False
Order By prod.UnitPrice
Select prod
'The query is not executed here; it runs when the caller iterates it
Return query
Iterating the result causes the query to be executed:
Dim query = From prod In products
Where prod.UnitsInStock > 10 _
And prod.Discontinued = False
Order By prod.UnitPrice
Select prod
'The query is executed here, the Enumerator is invoked
For Each prod In query
Console.WriteLine("Product name: {0}, Unit price: {1}",
prod.ProductName, prod.UnitPrice)
Next
A query is also executed when you invoke a method that needs the actual data, such as Count, as in the following example:
Console.WriteLine(query.Count)
You can also force queries to be executed when declared, invoking methods on the query itself. For example, converting a query result into a collection causes the query to be executed, as demonstrated here:
Dim query = (From prod In products
Where prod.UnitsInStock > 10 _
And prod.Discontinued = False
Order By prod.UnitPrice
Select prod).ToList 'the query is executed here
This is also important at debugging time. For example, consider the case where you have a query and want to examine its result while debugging, taking advantage of Visual Studio’s data tips. If you place a breakpoint on the line of code immediately after the query and then pass the mouse pointer over the variable that receives the query result, the data tips feature pops up a message saying that the variable is an in-memory query and that clicking to expand the result processes the collection. This is shown in Figure 23.1.
At this point, the debugger executes the query in memory. After the query has run, you can inspect its result before it is passed to the code that follows. Data tips enable you to examine every item in the collection. Figure 23.2 demonstrates this.
The preceding discussion is also valid if you are adding query result variables to the Watch window.
Deferred execution is a key topic in LINQ development, and you always need to keep in mind how it works to avoid problems in architecting your code.
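To make the effect of deferred execution concrete, here is a minimal sketch that changes the data source after the query has been defined. The module and function names are hypothetical, and a plain list of integers stands in for the products collection.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Module DeferredExecutionDemo
    'Returns {count seen by the deferred query, count stored in the snapshot}
    Function CompareCounts() As Integer()
        Dim numbers As New List(Of Integer) From {1, 2, 3}

        'Defining the query does not run it
        Dim deferred = From n In numbers
                       Where n > 1
                       Select n

        'ToList runs the query immediately and stores the result
        Dim snapshot = deferred.ToList()

        'The source changes after the query has been defined
        numbers.Add(4)

        'Count enumerates the query again, so it sees the new item
        Return {deferred.Count(), snapshot.Count}
    End Function

    Sub Main()
        Dim counts = CompareCounts()
        Console.WriteLine("Deferred query: {0} items", counts(0)) 'Deferred query: 3 items
        Console.WriteLine("Snapshot list: {0} items", counts(1))  'Snapshot list: 2 items
    End Sub
End Module
```

The deferred query reflects the item added later because it is evaluated again each time it is enumerated, whereas the list produced by ToList is a copy taken at that moment.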
Querying objects with LINQ (as well as XML documents and ADO.NET models) is accomplished via the standard query operators, a set of Visual Basic keywords that let you perform both simple and complex tasks within LINQ queries. This chapter covers the standard query operators and illustrates their purpose. This topic is important because you need the same operators with other LINQ providers as well. Standard query operators also have extension method counterparts that you can use instead.
Coding Tips
The following examples consistently use local type inference and array literals. These features speed up writing LINQ queries considerably, so you should try them in your own code. Nothing prevents you from using old-fashioned coding techniques, but showing the power of Visual Basic 2012 is one of this book’s purposes.
LINQ query expressions extract values from data sources and then push results into a sequence of elements. This operation of pushing items into a sequence is known as projection. Select
is the projection operator for LINQ queries. Continuing the example of products shown in the previous section, the following query creates a new sequence containing all product names:
Dim productNames = From prod In products
Select prod.ProductName
The query result is now an IEnumerable(Of String)
. If you need to create a sequence of objects, you can select each single item as follows:
Dim productSequence = From prod In products
Select prod
This returns an IEnumerable(Of Product)
. You can also pick more than one member for an item:
Dim productSequence = From prod In products
Select prod.ProductID, prod.ProductName
This returns an IEnumerable(Of Anonymous type)
. Of course, for an anonymous type, you can use the extended syntax that also allows specifying custom properties names:
Dim productSequence = From prod In products
Select ID = prod.ProductID,
Name = prod.ProductName
This is the equivalent of the following:
Dim productSequence = From prod In products
Select New With {.ID = prod.ProductID,
.Name = prod.ProductName}
An extension method counterpart exists for Select
, also named Select
. It receives a lambda expression that specifies the object or member to project into the sequence, and it works like this:
'IEnumerable(Of String)
Dim prodSequence = products.Select(Function(p) p.ProductName)
Query expressions can require more complex projections, especially if you work on different sources. Another projection operator, SelectMany
, is an extension method with no dedicated keyword, but you can get the same behavior in query expressions, too. To continue with the next examples, refer to the “Querying In-Memory Objects” section and reuse the ShippingPlan
and Product
classes and the code that populates collections of such objects. After you’ve done this, consider the following query:
Dim query = From prod In products
From ship In shipPlans
Where prod.ProductID = ship.ProductID
Select prod.ProductName, ship.ShipDate
With nested From
clauses, you can query different data sources, and the Select
clause picks data from both collections. Such a query behaves like a call to SelectMany
, the extension method that accepts lambda expressions as arguments pointing to the desired data sources.
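The nested From query above corresponds roughly to the following SelectMany call. This is a sketch under stated assumptions: the module and function names are hypothetical, the classes are reduced to the members the example needs, and the result is projected to a string so that it can be returned from a function.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Class Product
    Property ProductID As Integer
    Property ProductName As String
End Class

Class ShippingPlan
    Property ProductID As Integer
    Property ShipDate As Date
End Class

Module SelectManyDemo
    Function ProductShipments() As List(Of String)
        Dim products = {New Product With {.ProductID = 0, .ProductName = "Pasta"},
                        New Product With {.ProductID = 1, .ProductName = "Mozzarella"}}
        Dim shipPlans = {New ShippingPlan With {.ProductID = 0, .ShipDate = New Date(2012, 1, 1)},
                         New ShippingPlan With {.ProductID = 1, .ShipDate = New Date(2012, 2, 1)}}

        'First lambda: the plans that belong to each product
        'Second lambda: what to project for each product/plan pair
        Dim query = products.SelectMany(
            Function(prod) shipPlans.Where(Function(ship) ship.ProductID = prod.ProductID),
            Function(prod, ship) String.Format("{0} ships in month {1}",
                                               prod.ProductName, ship.ShipDate.Month))

        Return query.ToList()
    End Function

    Sub Main()
        For Each line In ProductShipments()
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

The first lambda returns a sequence for every product, and SelectMany flattens all those sequences into a single result.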
LINQ offers an operator named Where
that allows filtering query results according to the specified condition. For example, continuing with the previous examples of a collection of products, the following code returns only non-discontinued products:
Dim query = From prod In products
Where prod.Discontinued = False
Select prod
The same result can be accomplished by invoking a same-named extension method that works as follows:
Dim result = products.Where(Function(p) p.Discontinued = False).
Select(Function(p) p)
Where
supports many operators within its condition; they are summarized in Table 23.1.
The following code provides a more complex example of filtering using Where
and logical operators to retrieve the list of executable files that were last accessed between two dates:
Dim fileList = From item In My.Computer.FileSystem.
GetDirectoryInfo("C:\").GetFiles
Where item.LastAccessTime < Date.Today _
AndAlso item.LastAccessTime > New Date(2009, 9, 10) _
AndAlso item.FullName Like "*.exe"
Select item
The preceding code uses AndAlso
to ensure that the three conditions are True
. Using AndAlso
short-circuiting makes the evaluation more efficient because the remaining conditions are skipped as soon as one of them is False. Notice how Like
is used to provide a pattern comparison with the filename. You find a lot of examples about Where
in this book, so let’s discuss other operators.
Aggregation operators perform simple mathematical calculations on the items of a sequence using the Aggregate
and Into
clauses. The combination of such clauses can invoke the following methods:
• Sum
, which returns the sum of values of the specified property for each item in the collection
• Average
, which returns the average calculation of values of the specified property for each item in the collection
• Count
and LongCount
, which return the number of items within a collection, respectively as Integer
and Long
types
• Min
, which returns the lowest value for the specified sequence
• Max
, which returns the highest value for the specified sequence
For example, you can get the sum of unit prices for your products as follows:
'Returns the sum of product unit prices
Dim totalAmount = Aggregate prod In products
Into Sum(prod.UnitPrice)
You need to specify the object property that the calculation applies to (UnitPrice
in this example); the other aggregation operators that take a value work the same way. The following code shows how you can retrieve the average price:
'Returns the average price
Dim averagePrice = Aggregate prod In products
Into Average(prod.UnitPrice)
The following snippet shows how you can retrieve the number of products in both Integer
and Long
formats:
'Returns the number of products
Dim numberOfItems = Aggregate prod In products
Into Count()
'Returns the number of products as Long
Dim longNumberOfItems = Aggregate prod In products
Into LongCount()
The following code shows how you can retrieve the lowest and highest prices for your products:
'Returns the lowest value for the specified
'sequence
Dim minimumPrice = Aggregate prod In products
Into Min(prod.UnitPrice)
'Returns the highest value for the specified
'sequence
Dim maximumPrice = Aggregate prod In products
Into Max(prod.UnitPrice)
All the preceding aggregation operators have extension method counterparts that work similarly. For example, you can compute the minimum unit price as follows:
Dim minimum = products.Min(Function(p) p.UnitPrice)
Such extension methods require you to specify a lambda expression pointing to the member you want to be part of the calculation. Other extension methods work the same way.
Note
When using aggregation operators, you do not get back an IEnumerable
type. You instead get a single value, such as a Decimal or an Integer.
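Several aggregations can also be computed in one Aggregate clause by listing them after Into. The following is a minimal sketch; the module name, the function name, and the field names Total, AveragePrice, and ItemCount are assumptions chosen for this example, and the result is copied into a Tuple so that it can leave the method.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Class Product
    Property ProductName As String
    Property UnitPrice As Decimal
End Class

Module AggregateDemo
    'Returns (sum of prices, average price, number of items)
    Function PriceStats(products As IEnumerable(Of Product)) As Tuple(Of Decimal, Decimal, Integer)
        'The result is an anonymous type with one member per aggregation
        Dim stats = Aggregate prod In products
                    Into Total = Sum(prod.UnitPrice),
                         AveragePrice = Average(prod.UnitPrice),
                         ItemCount = Count()

        Return Tuple.Create(stats.Total, stats.AveragePrice, stats.ItemCount)
    End Function

    Sub Main()
        Dim products = {New Product With {.ProductName = "Pasta", .UnitPrice = 0.5D},
                        New Product With {.ProductName = "Mozzarella", .UnitPrice = 1D},
                        New Product With {.ProductName = "Crabs", .UnitPrice = 7D},
                        New Product With {.ProductName = "Tofu", .UnitPrice = 3.5D}}

        Dim stats = PriceStats(products)
        Console.WriteLine("Total: {0}, average: {1}, items: {2}",
                          stats.Item1, stats.Item2, stats.Item3)
    End Sub
End Module
```

Note that Average throws an InvalidOperationException when the sequence is empty, so real code may need to check for that case first.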
The Visual Basic syntax offers a keyword named Let
that can be used for defining temporary identifiers within query expressions. The following code shows how you can query Windows Forms controls to get the names of all the text boxes:
Dim query = From ctrl In Me.Controls _
Where TypeOf (ctrl) Is TextBox _
Let txtBox = DirectCast(ctrl, TextBox) _
Select txtBox.Name
The Let
keyword defines a temporary identifier so that you can perform multiple operations on each item of the sequence and then refer to the item by its temporary identifier.
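Let is also handy with plain in-memory objects when a computed value is needed in more than one clause. The sketch below is an illustration under assumptions: the stockValue identifier, the threshold of 10, and the module and function names are invented for this example.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Class Product
    Property ProductName As String
    Property UnitPrice As Decimal
    Property UnitsInStock As Integer
End Class

Module LetDemo
    'Names of products whose stock is worth more than 10, most valuable first
    Function ValuableProducts(products As IEnumerable(Of Product)) As List(Of String)
        'stockValue is computed once per item and reused by Where and Order By
        Dim query = From prod In products
                    Let stockValue = prod.UnitPrice * prod.UnitsInStock
                    Where stockValue > 10D
                    Order By stockValue Descending
                    Select prod.ProductName

        Return query.ToList()
    End Function

    Sub Main()
        Dim products = {New Product With {.ProductName = "Pasta", .UnitPrice = 0.5D, .UnitsInStock = 10},
                        New Product With {.ProductName = "Mozzarella", .UnitPrice = 1D, .UnitsInStock = 50},
                        New Product With {.ProductName = "Crabs", .UnitPrice = 7D, .UnitsInStock = 20}}

        'Lists Crabs (140) and then Mozzarella (50); Pasta (5) is filtered out
        For Each productName In ValuableProducts(products)
            Console.WriteLine(productName)
        Next
    End Sub
End Module
```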
LINQ query results are returned as IEnumerable(Of T)
(or IQueryable(Of T)
, as you see in the next chapters), but you often need to convert this type into a more appropriate one. For example, you cannot add or remove items in a query result unless you convert it into a typed collection. To do this, LINQ offers some extension methods whose job is converting query results into other .NET types such as arrays or collections. Let’s consider the products
collection of the previous section’s examples and first perform a query expression that retrieves all products that are not discontinued:
Dim query = From prod In products
Where prod.Discontinued = False
Select prod
The result of this query is IEnumerable(Of Product)
. The result can easily be converted into other .NET types. First, you can convert it into an array of Product
invoking the ToArray
extension method:
'Returns Product()
Dim productArray = query.ToArray
Similarly, you can convert the query result into a List(Of T)
invoking ToList
. The following code returns a List(Of Product)
collection:
'Returns List(Of Product)
Dim productList = query.ToList
You can perform a more complex conversion with ToDictionary
and ToLookup
. ToDictionary
generates a Dictionary(Of TKey, TValue)
and, in its simplest overload, receives a single argument that, via a lambda expression, specifies the key for the dictionary. The value part of the key/value pair is then the type of the query (Product
in this example). This is an example:
'Returns Dictionary(Of Integer, Product)
Dim productDictionary = query.ToDictionary(Function(p) _
p.ProductID)
Because the value is a typed object, the Value
property of each KeyValuePair
in the Dictionary
is an instance of your type; therefore, you can access members from Value
. The following snippet demonstrates this:
For Each prod In productDictionary
Console.WriteLine("Product ID: {0}, name: {1}", prod.Key,
prod.Value.ProductName)
Next
The next operator is ToLookup
that returns an ILookup(Of TKey, TElement)
, where TKey
indicates a key similarly to a Dictionary
and TElement
represents the type of the elements associated with each key. Such a type can be used for mapping one-to-many relationships between a key and a group of elements. Continuing the example of the Product class, we can provide an elegant way for getting the product name based on the ID. Consider the following code snippet that returns an ILookup(Of Integer, String)
:
Dim productLookup = query.ToLookup(Function(p) p.ProductID, _
Function(p) p.ProductName & " has " & _
p.UnitsInStock & " units in stock")
We can now query the products sequence based on the ProductID
property and extract data such as product name and units in stock. Because of the particular structure of ILookup
, a nested For..Each
loop is required and works like the following:
For Each prod In productLookup
Console.WriteLine("Product ID: {0}", prod.Key)
For Each item In prod
Console.WriteLine(" {0}", item)
Next
Next
Here, prod
is of type IGrouping(Of Integer, String)
, a type that characterizes a single item in an ILookup
. If you run this code, you get the following result:
Product ID: 0
Pasta has 10 units in stock
Product ID: 1
Mozzarella has 50 units in stock
Product ID: 3
Tofu has 40 units in stock
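In the output above every key maps to exactly one element because product IDs are unique. A lookup becomes more interesting when several items share a key. The following sketch, with hypothetical module and function names, groups product names by the Discontinued flag and then reads one group directly through the key.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Class Product
    Property ProductName As String
    Property Discontinued As Boolean
End Class

Module LookupDemo
    'Key: the Discontinued flag; elements: the product names sharing that flag
    Function NamesByStatus(products As IEnumerable(Of Product)) As ILookup(Of Boolean, String)
        Return products.ToLookup(Function(p) p.Discontinued,
                                 Function(p) p.ProductName)
    End Function

    Sub Main()
        Dim products = {New Product With {.ProductName = "Pasta"},
                        New Product With {.ProductName = "Mozzarella"},
                        New Product With {.ProductName = "Crabs", .Discontinued = True},
                        New Product With {.ProductName = "Tofu"}}

        Dim lookup = NamesByStatus(products)

        'Indexing by key returns the whole group for that key
        Console.WriteLine("Available products:")
        For Each productName In lookup(False)
            Console.WriteLine(" {0}", productName)
        Next
    End Sub
End Module
```

Unlike a Dictionary, indexing an ILookup with a key that is not present returns an empty sequence instead of throwing an exception.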
Opposite from operators that convert into typed collections or arrays, two methods convert from typed collections into IEnumerable
or IQueryable
. They are named AsEnumerable
and AsQueryable
, and their usage is pretty simple:
Dim newList As New List(Of Product)
'Populate your collection here..
Dim anEnumerable = newList.AsEnumerable
Dim aQueryable = newList.AsQueryable
Another operator filters a sequence and retrieves only items of the specified type, generating a new sequence of that type. It is named OfType
and is considered a conversion operator. The following code provides an example in which from an array of mixed types only Integer
types are extracted and pushed into an IEnumerable(Of Integer)
:
Dim mixed() As Object = {"String1", 1, "String2", 2}
Dim onlyInt = mixed.OfType(Of Integer)()
This is the equivalent of using the TypeOf
operator in a query expression. The following query returns the same items, although they are typed as Object rather than Integer:
Dim onlyInt = From item In mixed
Where TypeOf item Is Integer
Select item
It is not uncommon to need to immediately convert a query result into a collection, so you can invoke conversion operators directly in the expression as follows:
Dim query = (From prod In products
Where prod.Discontinued = False
Select prod).ToList
Remember that invoking conversion operators causes the query to be executed.
Most LINQ members are offered by the System.Linq.Enumerable
class. This also exposes two shared methods, Range
and Repeat
, which provide the capability to generate sequences of elements. Range
allows generating a sequence of integer numbers, as shown in the following code:
'The sequence will contain 100 numbers
'The first number is 40
Dim numbers = Enumerable.Range(40, 100)
It returns IEnumerable(Of Integer)
, so you can then query the generated sequence using LINQ. Repeat
enables you to generate a sequence where the specified item repeats the given number of times. Repeat
is generic in that you need to specify the item’s type first. For example, the following code generates a sequence of 10 Boolean values and True
is repeated 10 times:
Dim booleanSequence = Enumerable.Repeat(Of Boolean)(True, 10)
Repeat returns IEnumerable(Of T)
, where T
is the type specified for the method itself.
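Because both methods return ordinary sequences, their results can be queried like any other collection. The following is a small sketch with hypothetical names. Keep in mind that the second argument of Range is the number of items, not the last value, so Enumerable.Range(40, 100) yields the numbers from 40 to 139.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Module GenerationDemo
    'Squares of the even numbers between 1 and 10
    Function EvenSquares() As List(Of Integer)
        Dim query = From n In Enumerable.Range(1, 10)
                    Where n Mod 2 = 0
                    Select n * n

        Return query.ToList()
    End Function

    'Builds a separator line by repeating the same string
    Function Separator(length As Integer) As String
        Return String.Join("", Enumerable.Repeat("-", length).ToArray())
    End Function

    Sub Main()
        Console.WriteLine(Separator(5)) '-----
        For Each value In EvenSquares()
            Console.WriteLine(value) '4, 16, 36, 64, 100 on separate lines
        Next
    End Sub
End Module
```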
Ordering operators allow sorting query results according to the given condition. Within LINQ queries, this is accomplished via the Order By
clause. This clause allows ordering query results in both ascending and descending order, where ascending is the default. The following example sorts the query result so that products are ordered from the one that has the lowest unit price to the one having the highest unit price:
Dim query = From prod In products
Order By prod.UnitPrice
Select prod
To get a result ordered from the highest value to the lowest, you use the Descending
keyword as follows:
Dim queryDescending = From prod In products
Order By prod.UnitPrice Descending
Select prod
This sorts the query result in the opposite order from the first example. You can also specify more than one sort key, separated by commas, in a single Order By
clause to get subsequent ordering options. Using extension methods provides a bit more granularity in ordering results. For example, you can use the OrderBy
and ThenBy
extension methods for providing multiple ordering options, as demonstrated in the following code:
Dim query = products.OrderBy(Function(p) p.UnitPrice).ThenBy(Function(p) p.ProductName)
As usual, both methods take lambdas as arguments. In addition, the OrderByDescending
and ThenByDescending
extension methods order the result from the highest value to the lowest. The last ordering method is Reverse
, which reverses the order of the query result and that you can use as follows:
Dim reversedQuery = query.Reverse()
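As an illustration of sorting on more than one key, the sketch below writes the same ordering twice: once with a comma-separated Order By clause and once with the OrderBy and ThenByDescending extension methods. The module and function names are hypothetical, and the Product class is reduced to the members the example needs.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Class Product
    Property ProductName As String
    Property UnitPrice As Decimal
    Property Discontinued As Boolean
End Class

Module OrderingDemo
    'Available products first, then by unit price from highest to lowest
    Function SortedNames(products As IEnumerable(Of Product)) As List(Of String)
        Dim query = From prod In products
                    Order By prod.Discontinued, prod.UnitPrice Descending
                    Select prod.ProductName

        Return query.ToList()
    End Function

    'The same ordering expressed with extension methods
    Function SortedNamesWithMethods(products As IEnumerable(Of Product)) As List(Of String)
        Return products.OrderBy(Function(p) p.Discontinued).
                        ThenByDescending(Function(p) p.UnitPrice).
                        Select(Function(p) p.ProductName).
                        ToList()
    End Function

    Sub Main()
        Dim products = {New Product With {.ProductName = "Pasta", .UnitPrice = 0.5D},
                        New Product With {.ProductName = "Mozzarella", .UnitPrice = 1D},
                        New Product With {.ProductName = "Crabs", .UnitPrice = 7D, .Discontinued = True},
                        New Product With {.ProductName = "Tofu", .UnitPrice = 3.5D}}

        'Lists Tofu, Mozzarella, Pasta, and finally Crabs
        For Each productName In SortedNames(products)
            Console.WriteLine(productName)
        Next
    End Sub
End Module
```

Boolean keys sort with False before True, which is why the discontinued product ends up last.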
Set operators let you remove duplicates, merge sequences, and exclude specified elements. For instance, you could have duplicate items within a sequence or collection; you can remove duplicates using the Distinct
operator. The following provides an example on a simple array of integers:
Dim someInt = {1, 2, 3, 3, 2, 4}
'Returns {1, 2, 3, 4}
Dim result = From number In someInt Distinct
Select number
The result is a new IEnumerable(Of Integer)
. In real scenarios, you could find this operator useful in LINQ to SQL or the Entity Framework for filtering duplicate records out of the data returned from a database table. The next operator is Union
, which is an extension method and merges two sequences into a new one, excluding duplicates. The following is an example:
Dim someInt = {1, 2, 3, 4}
Dim otherInt = {4, 3, 2, 1}
Dim result = someInt.Union(otherInt)
The preceding code returns an IEnumerable(Of Integer)
containing 1, 2, 3, 4, because Union performs a set union and therefore excludes duplicate values. Items are returned in the order in which they are first encountered, starting with the collection that you invoke Union
on. The next operator is Intersect
, which is another extension method. This method creates a new sequence with elements that two other sequences have in common. The following code demonstrates this:
Dim someInt = {1, 2, 3, 4}
Dim otherInt = {1, 2, 5, 6}
Dim result = someInt.Intersect(otherInt)
The new sequence is an IEnumerable(Of Integer)
containing only 1 and 2 because they are the only values that both original sequences have in common. The last set operator is Except
, which generates a new sequence containing only those values of the first sequence that do not appear in the second one. The following code is an example, which then requires a further explanation:
Dim someInt = {1, 2, 3, 4}
Dim otherInt = {1, 2, 5, 6}
Dim result = someInt.Except(otherInt)
Surprisingly, this code returns a new IEnumerable(Of Integer)
containing only 3 and 4, although 5 and 6 are also values that the two sequences do not have in common. This is because the comparison starts from the sequence that you invoke Except
on: only its elements can appear in the result, and those that are also found in the other sequence are removed.
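The behavior of the set operators can be checked with a short sketch such as the following. It reuses the two arrays from the Except example; the module and function names are hypothetical, and NotInCommon is only one possible way (an assumption, not the chapter's code) to obtain all the values that the two sequences do not share.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Linq

Module SetOperatorsSketch
    Private ReadOnly someInt As Integer() = {1, 2, 3, 4}
    Private ReadOnly otherInt As Integer() = {1, 2, 5, 6}

    ' Distinct values from both sequences: 1, 2, 3, 4, 5, 6
    Function UnionResult() As Integer()
        Return someInt.Union(otherInt).ToArray()
    End Function

    ' Values present in both sequences: 1, 2
    Function IntersectResult() As Integer()
        Return someInt.Intersect(otherInt).ToArray()
    End Function

    ' Values of someInt that are missing from otherInt: 3, 4
    Function ExceptResult() As Integer()
        Return someInt.Except(otherInt).ToArray()
    End Function

    ' Values found in exactly one of the two sequences: 3, 4, 5, 6
    Function NotInCommon() As Integer()
        Return someInt.Except(otherInt).
            Union(otherInt.Except(someInt)).
            ToArray()
    End Function

    ' Formats a sequence of integers as a comma-separated string.
    Private Function Show(values As Integer()) As String
        Return String.Join(", ", values.Select(Function(v) v.ToString()))
    End Function

    Sub Main()
        Console.WriteLine("Union:         " & Show(UnionResult()))
        Console.WriteLine("Intersect:     " & Show(IntersectResult()))
        Console.WriteLine("Except:        " & Show(ExceptResult()))
        Console.WriteLine("Not in common: " & Show(NotInCommon()))
    End Sub
End Module
```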
The grouping concept is something that you already know if you ever worked with data. Given a products collection, dividing products into categories would provide a better organization of information. For example, consider the following Category
class:
Class Category
Property CategoryID As Integer
Property CategoryName As String
End Class
Now consider the following revised version of the Product
class, with a new CategoryID
property:
Class Product
Property ProductID As Integer
Property ProductName As String
Property UnitPrice As Decimal
Property UnitsInStock As Integer
Property Discontinued As Boolean
Property CategoryID As Integer
End Class
At this point we can write code that creates instances of both classes and populates appropriate collections, as in the following snippet:
Sub GroupByDemo()
Dim cat1 As New Category With {.CategoryID = 1,
.CategoryName = "Food"}
Dim cat2 As New Category With {.CategoryID = 2,
.CategoryName = "Beverages"}
Dim categories As New List(Of Category) From {cat1,
cat2}
Dim prod1 As New Product With {.ProductID = 0,
.ProductName = "Pasta",
.UnitPrice = 0.5D,
.UnitsInStock = 10,
.Discontinued = False,
.CategoryID = 1}
Dim prod2 As New Product With {.ProductID = 1,
.ProductName = "Wine",
.UnitPrice = 1D,
.UnitsInStock = 50,
.Discontinued = False,
.CategoryID = 2}
Dim prod3 As New Product With {.ProductID = 2,
.ProductName = "Water",
.UnitPrice = 0.5D,
.UnitsInStock = 20,
.Discontinued = False,
.CategoryID = 2}
Dim prod4 As New Product With {.ProductID = 3,
.ProductName = "Tofu",
.UnitPrice = 3.5D,
.UnitsInStock = 40,
.Discontinued = True,
.CategoryID = 1}
Dim products As New List(Of Product) From {prod1,
prod2,
prod3,
prod4}
To make things easier to understand, only two categories have been created. Notice also how each product now belongs to a specific category. To group foods into the Food category and beverages into the Beverages category, you use the Group By
operator. This is the closing code of the preceding method, which is explained just after you write it:
Dim query = From prod In products
Group prod By ID = prod.CategoryID
Into Group
Select CategoryID = ID,
ProductsList = Group
' "prod" is inferred as anonymous type
For Each prod In query
Console.WriteLine("Category {0}", prod.CategoryID)
' "p" is inferred as Product
For Each p In prod.ProductsList
Console.WriteLine(" Product {0}, Discontinued: {1}",
p.ProductName, p.Discontinued)
Next
Next
End Sub
The code produces the following result:
Category 1
Product Pasta, Discontinued: False
Product Tofu, Discontinued: True
Category 2
Product Wine, Discontinued: False
Product Water, Discontinued: False
Group By
requires you to specify a key for grouping. This key is a property of the type composing the collection you are querying. The result of the grouping is sent to a new IEnumerable(Of T)
sequence represented by the Into Group
statement. Finally, you invoke Select
to pick up the key and items grouped according to the key; the projection generates an IEnumerable(Of anonymous type)
. Notice how you need a nested For..Each
loop. This is because each item in the query result is composed of two objects: the key and a sequence of objects (in this case a sequence of Product
) grouped based on the key. A similar result can be accomplished using the GroupBy extension method. In the following call, the second lambda is an element selector, so each group contains product names instead of whole Product instances:
Dim query = products.GroupBy(Function(prod) prod.CategoryID,
Function(prod) prod.ProductName)
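To show how the result of GroupBy is consumed, here is a minimal sketch that groups by key only and then reads each group's Key property. The Product class is a trimmed-down version of the chapter's class, and the module and function names are hypothetical.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Module GroupBySketch
    ' Trimmed-down version of the chapter's Product class.
    Class Product
        Property ProductName As String
        Property CategoryID As Integer
    End Class

    Function SampleProducts() As List(Of Product)
        Return New List(Of Product) From {
            New Product With {.ProductName = "Pasta", .CategoryID = 1},
            New Product With {.ProductName = "Wine", .CategoryID = 2},
            New Product With {.ProductName = "Water", .CategoryID = 2},
            New Product With {.ProductName = "Tofu", .CategoryID = 1}}
    End Function

    ' Returns one line per group, such as "1: Pasta, Tofu".
    Function DescribeGroups() As List(Of String)
        ' With a key selector only, each group is an
        ' IGrouping(Of Integer, Product) exposing a Key property.
        Dim groups = SampleProducts().GroupBy(Function(p) p.CategoryID)
        Dim lines As New List(Of String)
        For Each g In groups
            Dim names = g.Select(Function(p) p.ProductName)
            lines.Add(g.Key.ToString() & ": " & String.Join(", ", names))
        Next
        Return lines
    End Function

    Sub Main()
        For Each line In DescribeGroups()
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

As with the query expression, a nested step is needed: the outer loop walks the groups and the inner projection walks the items of each group.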
You often need to create sequences or collections with items taken from different data sources. If you consider the example in the previous “Grouping Operators” section, it would be interesting to create a collection of objects in which the category name is also available so that the result can be more human-readable. This is possible in LINQ using union operators, better known as join operators (not to be confused with the Union
set operator), which perform the operations that you know as joining. To complete the following steps, recall the previously provided implementation of the Product
and Category
classes and the code that populates new collections of products and categories. The goal of the first example is to create a new sequence of products in which the category name is also available. This can be accomplished as follows:
Dim query = From prod In products
Join cat In categories On _
prod.CategoryID Equals cat.CategoryID
Select CategoryName = cat.CategoryName,
ProductName = prod.ProductName
The code is quite simple to understand. Both products and categories collections are queried, and a new sequence is generated to keep products and categories whose CategoryID
is equal. This is accomplished via the Join
keyword in which the On
operator requires the condition to be evaluated as True
. Notice that Join
does not accept the equality operator (=), but it does require the Equals
keyword. In this case the query result is an IEnumerable(Of Anonymous type)
. You could, however, create a helper class exposing properties to store the result. You can then iterate the result to get information on your products, as in the following snippet:
For Each obj In query
Console.WriteLine("Category: {0}, Product name: {1}",
obj.CategoryName, obj.ProductName)
Next
The code produces the following output:
Category: Food, Product name: Pasta
Category: Beverages, Product name: Wine
Category: Beverages, Product name: Water
Category: Food, Product name: Tofu
This is the simplest joining example and is known here as Cross Join (in SQL terms it behaves like an inner join, because only products with a matching category are returned), but you are not limited to this. For example, you might want to group items based on the specified key, which is known as Group Join. This allows you to rewrite the example from the “Grouping Operators” section, taking advantage of joining to get the category name as well. This is accomplished as follows:
Dim query = From cat In categories
Group Join prod In products On _
prod.CategoryID Equals cat.CategoryID
Into Group
Select NewCategory = cat,
NewProducts = Group
Notice that now the main data source is categories
. The query generates a new sequence in which each category is paired with the group of products that belong to it. This is evident if you take a look at the Select
clause, which picks the whole category object and a sequence of products instead of single properties. The following iteration gives a clearer idea of how you access information from the query result:
For Each obj In query
Console.WriteLine("Category: {0}", obj.NewCategory.CategoryName)
For Each prod In obj.NewProducts
Console.WriteLine(" Product name: {0}, Discontinued: {1}",
prod.ProductName, prod.Discontinued)
Next
Next
Such nested iteration produces the following output:
Category: Food
Product name: Pasta, Discontinued: False
Product name: Tofu, Discontinued: True
Category: Beverages
Product name: Wine, Discontinued: False
Product name: Water, Discontinued: False
The Cross Join with Group Join technique is similar. The following code shows how you can perform a cross group join to provide a simplified version of the previous query result:
Dim query = From cat In categories
Group Join prod In products On _
prod.CategoryID Equals cat.CategoryID
Into Group
From p In Group
Select CategoryName = cat.CategoryName,
ProductName = p.ProductName
Notice that by providing a nested From
clause pointing to the group, you can easily select what you need from both sequences—for example, the category name and the product name. The result, which is still a sequence of anonymous types, can be iterated as follows:
For Each item In query
Console.WriteLine("Product {0} belongs to {1}",
item.ProductName,
item.CategoryName)
Next
It produces the following output:
Product Pasta belongs to Food
Product Tofu belongs to Food
Product Wine belongs to Beverages
Product Water belongs to Beverages
The last join operator is known as Left Outer Join. It is similar to the cross group join, but it differs in that items from the main data source are returned even when no item is available for the specified key, in which case you can provide a default value. Consider the following code:
Dim query = From cat In categories
Group Join prod In products On _
prod.CategoryID Equals cat.CategoryID
Into Group
From p In Group.DefaultIfEmpty
Select CategoryName = cat.CategoryName,
ProductName = If(p IsNot Nothing,
p.ProductName, "No available product")
Notice the invocation of the Group.DefaultIfEmpty
extension method, which yields a single Nothing item when a category has no products; the If
ternary operator then replaces that item with a default description. You can then retrieve information from the query result as in the cross group join sample.
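The effect of the left outer join only becomes visible when a category has no products, which is not the case in the chapter's sample data. The following sketch therefore adds a hypothetical third category named Condiments; the trimmed-down classes, the module, and the JoinedLines function are assumptions made for this illustration.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Module LeftOuterJoinSketch
    ' Trimmed-down versions of the chapter's classes.
    Class Category
        Property CategoryID As Integer
        Property CategoryName As String
    End Class

    Class Product
        Property ProductName As String
        Property CategoryID As Integer
    End Class

    Function JoinedLines() As List(Of String)
        Dim categories As New List(Of Category) From {
            New Category With {.CategoryID = 1, .CategoryName = "Food"},
            New Category With {.CategoryID = 2, .CategoryName = "Beverages"},
            New Category With {.CategoryID = 3, .CategoryName = "Condiments"}}

        Dim products As New List(Of Product) From {
            New Product With {.ProductName = "Pasta", .CategoryID = 1},
            New Product With {.ProductName = "Wine", .CategoryID = 2}}

        ' DefaultIfEmpty supplies Nothing for the category without products,
        ' so that category still appears in the result.
        Dim query = From cat In categories
                    Group Join prod In products
                        On cat.CategoryID Equals prod.CategoryID
                        Into Group
                    From p In Group.DefaultIfEmpty()
                    Select Description = cat.CategoryName & ": " &
                        If(p IsNot Nothing, p.ProductName, "No available product")

        Return query.ToList()
    End Function

    Sub Main()
        For Each line In JoinedLines()
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

Without the call to DefaultIfEmpty, the nested From clause would simply skip the Condiments category because its group is empty.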
You might want to compare two sequences to check whether they are perfectly equal. The SequenceEqual
extension method performs this kind of comparison. It checks that both sequences contain the same items in the same order and returns a Boolean value. The following code returns True
because both sequences contain the same items in the same order:
Dim first = {"Visual", "Basic", "2010"}
Dim second = {"Visual", "Basic", "2010"}
'Returns True
Dim comparison = first.SequenceEqual(second)
The following code instead returns False
because, although both sequences contain the same items, they are ordered differently:
Dim first = {"Visual", "Basic", "2010"}
Dim second = {"Visual", "2010", "Basic"}
'Returns False
Dim comparison = first.SequenceEqual(second)
LINQ offers two interesting extension methods for sequences, Any
and All
. Any
checks whether at least one item in the sequence satisfies the specified condition. For example, the following code checks whether at least one product name contains the letters “of”:
Dim result = products.Any(Function(p) p.ProductName.Contains("of"))
The method receives a lambda as an argument that specifies the condition and returns True
if the condition is matched. All
checks whether all members in a sequence match the specified condition. For example, the following code checks whether all products are discontinued:
Dim result = products.All(Function(p) p.Discontinued = True)
As previously noted, the lambda argument specifies the condition to be matched.
Improving Code Performance with Any
Very often you need to know whether a collection contains at least one element (that is, whether it is not empty). A common approach is the following syntax:
If myCollection.Count > 0 Then ...
This was the usual technique prior to LINQ (and lots of Visual Basic 6 users migrating to .NET still use it), but with LINQ you can write:
If myCollection.Any Then ...
The difference can be significant when the source is a lazily evaluated sequence, such as the result of a query: if you use Count
, the runtime moves the iterator through all the elements. But if you use Any
, the runtime moves the iterator just one position, which is enough to return True or False. This can noticeably improve performance, especially if the sequence has hundreds or thousands of elements. For arrays and collections such as List(Of T), Count returns a stored value, so the gain is negligible.
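The difference between the two methods can be observed with a sketch like the following, which counts how many elements a lazily evaluated query inspects. The module, the visited counter, and the function names are hypothetical and exist only to make the behavior visible.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Module AnyVersusCountSketch
    ' Number of elements inspected since the last reset.
    Private visited As Integer

    ' A lazily evaluated query; the filter records every element it inspects.
    Function TrackedSequence() As IEnumerable(Of Integer)
        Return Enumerable.Range(1, 1000).Where(Function(n)
                                                   visited += 1
                                                   Return True
                                               End Function)
    End Function

    ' Elements inspected when asking whether the sequence has any item.
    Function VisitedByAny() As Integer
        visited = 0
        Dim hasItems As Boolean = TrackedSequence().Any()
        Return visited
    End Function

    ' Elements inspected when counting the whole sequence.
    Function VisitedByCount() As Integer
        visited = 0
        Dim total As Integer = TrackedSequence().Count()
        Return visited
    End Function

    Sub Main()
        Console.WriteLine("Any inspected {0} element(s)", VisitedByAny())
        Console.WriteLine("Count inspected {0} element(s)", VisitedByCount())
    End Sub
End Module
```

Under these assumptions, Any stops after the first element, whereas Count has to evaluate the filter for all 1,000 elements before it can answer.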
Sequences (that is, IEnumerable(Of T)
objects) expose a method named Concat
that enables the creation of a new sequence containing items from two sequences. The following code shows an example in which a new sequence of strings is created from two existing arrays of strings:
Dim firstSequence = {"One", "Two", "Three"}
Dim secondSequence = {"Four", "Five", "Six"}
Dim concatSequence = firstSequence.Concat(secondSequence)
The result produced by this code is that the concatSequence
variable contains the following items: “One”, “Two”, “Three”, “Four”, “Five”, and “Six”. The first items in the new sequence are taken from the one you invoke the Concat
method on. Unlike Union, Concat keeps duplicate items.
Some extension methods enable you to get the instance of a specified item in a sequence. The first one is Single
, and it gets the instance of the only item that matches the specified condition. The following code gets the instance of the only product whose product name is Mozzarella
:
Try
Dim uniqueElement = products.Single(Function(p) p.ProductName = "Mozzarella")
Catch ex As InvalidOperationException
'The item does not exist
End Try
Single
takes a lambda expression as an argument in which you can specify the condition that the item must match. It throws an InvalidOperationException
if the item does not exist in the sequence (or if more than one element matches the condition). As an alternative, you can invoke SingleOrDefault
, which returns a default value instead of throwing an exception if the item does not exist; it still throws if more than one element matches. The following code returns Nothing
because the product name does not exist:
Dim uniqueElement = products.SingleOrDefault(Function(p) p.ProductName = "Mozzarell")
The next method is First
. It can return either the first item in a sequence or the first item that matches a condition. You can use it as follows:
'Gets the first product in the list
Dim firstAbsolute = products.First
Try
'Gets the first product where product name starts with P
Dim firstElement = products.First(Function(p) p.ProductName.StartsWith("P"))
Catch ex As InvalidOperationException
'No item available
End Try
The previous example is self-explanatory: If multiple products have their names starting with the letter P, First
returns just the first one in the sequence or throws an InvalidOperationException
if no item is available. Additionally, a FirstOrDefault
method returns a default value, such as Nothing
, if no item is available. Last
and LastOrDefault
return the last item in a sequence and work like the methods just illustrated.
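As a compact comparison of the methods that return a default value, consider the following sketch. It works on a plain array of product names instead of Product instances to stay short; the module and function names, and the "(ambiguous)" marker, are hypothetical.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Linq

Module ElementOperatorsSketch
    Private ReadOnly names As String() = {"Pasta", "Wine", "Water", "Tofu"}

    ' First matching name, or Nothing when no name matches.
    Function FirstStartingWith(prefix As String) As String
        Return names.FirstOrDefault(Function(n) n.StartsWith(prefix))
    End Function

    ' Last matching name, or Nothing when no name matches.
    Function LastStartingWith(prefix As String) As String
        Return names.LastOrDefault(Function(n) n.StartsWith(prefix))
    End Function

    ' The only matching name, Nothing when there is none,
    ' or a marker string when more than one name matches.
    Function OnlyOneStartingWith(prefix As String) As String
        Try
            Return names.SingleOrDefault(Function(n) n.StartsWith(prefix))
        Catch ex As InvalidOperationException
            ' SingleOrDefault still throws when several items match.
            Return "(ambiguous)"
        End Try
    End Function

    Sub Main()
        Console.WriteLine(FirstStartingWith("W"))            ' Wine
        Console.WriteLine(LastStartingWith("W"))             ' Water
        Console.WriteLine(OnlyOneStartingWith("T"))          ' Tofu
        Console.WriteLine(OnlyOneStartingWith("W"))          ' (ambiguous)
        Console.WriteLine(FirstStartingWith("X") Is Nothing) ' True
    End Sub
End Module
```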
Partitioning operators enable you to use a technique known as paging, which is common in data access scenarios. There are two main operators in LINQ: Skip
and Take
. Skip
bypasses the specified number of elements at the beginning of a sequence, and Take
returns the specified number of elements from the beginning of what remains. The code in Listing 23.3 shows an example of paging implementation using the two operators.
Module Partitioning
Private pageCount As Integer
Private Products As List(Of Product)
Sub PopulateProducts()
Dim prod1 As New Product With {.ProductID = 0,
.ProductName = "Pasta",
.UnitPrice = 0.5D,
.UnitsInStock = 10,
.Discontinued = False}
Dim prod2 As New Product With {.ProductID = 1,
.ProductName = "Mozzarella",
.UnitPrice = 1D,
.UnitsInStock = 50,
.Discontinued = False}
Dim prod3 As New Product With {.ProductID = 2,
.ProductName = "Crabs",
.UnitPrice = 7D,
.UnitsInStock = 20,
.Discontinued = True}
Dim prod4 As New Product With {.ProductID = 3,
.ProductName = "Tofu",
.UnitPrice = 3.5D,
.UnitsInStock = 40,
.Discontinued = False}
Products = New List(Of Product) From {prod1,
prod2,
prod3,
prod4}
End Sub
Function QueryProducts() As IEnumerable(Of Product)
Dim query As IEnumerable(Of Product)
'If pageCount = 0 we need to retrieve the first 10 products
If pageCount = 0 Then
query = From prod In Products _
Order By prod.ProductID _
Take 10
Else
'Skips the already shown products
'and takes next 10
query = From prod In Products _
Order By prod.ProductID _
Skip pageCount Take 10
End If
'In real applications ensure that query is not null
Return query
End Function
End Module
The private field pageCount
acts as a counter. According to its value, the query skips the number of elements already visited, represented by the value of pageCount
. If no elements were visited, the query skips nothing. The code invoking QueryProducts
increases or decreases the pageCount
value by 10 units depending on whether you want to move forward or backward through the collection items.
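The same paging idea can be expressed as a small reusable helper based on the Skip and Take extension methods. This is only a sketch under stated assumptions: GetPage, pageIndex, and pageSize are hypothetical names, and the helper uses a zero-based page index instead of the pageCount counter of the listing.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Module PagingSketch
    ' Returns the page with the given zero-based index.
    ' Pages after the end of the sequence are empty.
    Function GetPage(Of T)(source As IEnumerable(Of T),
                           pageIndex As Integer,
                           pageSize As Integer) As List(Of T)
        Return source.Skip(pageIndex * pageSize).Take(pageSize).ToList()
    End Function

    Sub Main()
        Dim numbers = Enumerable.Range(1, 25)

        ' Third page of 10 items: only 21 to 25 are left.
        For Each n In GetPage(numbers, 2, 10)
            Console.WriteLine(n)
        Next
    End Sub
End Module
```

Because Skip accepts zero, the helper needs no special case for the first page.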
In this chapter, you got a high-level overview of LINQ key concepts. In this particular discussion, you got information about LINQ to Objects as the built-in provider for querying in-memory collections. We showed LINQ in action via specific Visual Basic keywords that recall the SQL syntax, such as From
, Select
, Where
, and Join
. You can build LINQ queries while writing Visual Basic code, taking advantage of the background compiler, IntelliSense, and CLR control. Such queries written in the code editor are known as query expressions. Query expressions return an IEnumerable(Of T)
but are not executed immediately. According to the key concept of deferred execution, LINQ queries are executed only when they are effectively utilized. This is something that you also find in the LINQ providers covered in the following chapters. With LINQ, you can build complex query expressions to query your data sources via the standard query operators, which were covered in the last part of the chapter. LINQ to Objects is the basis of LINQ, and most of the concepts shown in this chapter will be revisited in the next ones.
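Deferred execution can be summarized with one last sketch. The module and function names are hypothetical; the code shows that a query expression is evaluated when it is enumerated, not when it is declared.

```vb
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Module DeferredExecutionSketch
    ' Returns the number of even values seen by a snapshot taken before
    ' the source changes and by the query enumerated after the change.
    Function CountsBeforeAndAfter() As Integer()
        Dim numbers As New List(Of Integer) From {1, 2, 3}

        ' Declaring the query does not execute it.
        Dim evens = From n In numbers Where n Mod 2 = 0 Select n

        ' ToList enumerates the query now and stores the result.
        Dim snapshot = evens.ToList()

        numbers.Add(4)

        ' Enumerating the query again runs it against the updated list.
        Return New Integer() {snapshot.Count, evens.Count()}
    End Function

    Sub Main()
        Dim counts = CountsBeforeAndAfter()
        Console.WriteLine("Snapshot: {0}, query after the change: {1}",
                          counts(0), counts(1))
    End Sub
End Module
```

Calling ToList (or ToArray) is the usual way to force execution when you need a stable copy of the results.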