Chapter 20
LINQ

Wrox.com Code Downloads for this Chapter

You can find the wrox.com code downloads for this chapter at www.wrox.com/go/beginningvisualc#2015programming on the Download Code tab. The code is in the Chapter 20 download and individually named according to the names throughout the chapter.

This chapter introduces Language INtegrated Query (LINQ). LINQ is an extension to the C# language that integrates data query directly into the programming language itself.

Before LINQ, querying data required writing a lot of looping code, and additional processing such as sorting or grouping the found objects required even more code that would differ depending on the data source. LINQ provides a portable, consistent way of querying, sorting, and grouping many different kinds of data (XML, JSON, SQL databases, collections of objects, web services, corporate directories, and more).

First you'll build on the previous chapter by learning the additional capabilities that the System.Xml.Linq namespace adds for creating XML. Then you'll get into the heart of LINQ by using query syntax, method syntax, lambda expressions, sorting, grouping, and joining related results.

LINQ is large enough that complete coverage of all its facilities and methods is beyond the scope of a beginning book. However, you will see examples of each of the different types of statements and operators you are likely to need as a user of LINQ, and you will be pointed to resources for more in-depth coverage as appropriate.

LINQ to XML

LINQ to XML is an alternate set of classes for XML that enables the use of LINQ for XML data and also makes certain operations with XML easier even if you are not using LINQ. We will look at a couple of specific cases where LINQ to XML has advantages over the XML DOM (Document Object Model) introduced in the previous chapter.

LINQ to XML Functional Constructors

While you can create XML documents in code with the XML DOM, LINQ to XML provides an easier way to create XML documents called functional construction. In functional construction the constructor calls can be nested in a way that naturally reflects the structure of the XML document. In the following Try It Out, you use functional constructors to make a simple XML document containing customers and orders.
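As a rough sketch of what nested constructor calls look like (the element and attribute names here are illustrative assumptions, not necessarily the ones used in the Try It Out):

```csharp
using System.Xml.Linq;
using static System.Console;

static class FunctionalConstructionSketch
{
    // The nesting of the constructor calls mirrors the nesting
    // of the elements in the resulting XML document.
    public static XDocument BuildCustomers()
    {
        return new XDocument(
            new XElement("customers",
                new XElement("customer",
                    new XAttribute("ID", "A"),
                    new XAttribute("City", "New York"),
                    new XElement("order",
                        new XAttribute("Item", "Widget"),
                        new XAttribute("Price", 100))),
                new XElement("customer",
                    new XAttribute("ID", "B"),
                    new XAttribute("City", "Mumbai"),
                    new XElement("order",
                        new XAttribute("Item", "Tire"),
                        new XAttribute("Price", 200)))));
    }

    static void Main()
    {
        // Writing an XDocument to the console prints its XML text.
        WriteLine(BuildCustomers());
    }
}
```

Reading the code from the outside in gives the same shape as the document: one customers root, two customer children, each with one order.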

Working with XML Fragments

Unlike the XML DOM, LINQ to XML works with XML fragments (partial or incomplete XML documents) in very much the same way as complete XML documents. When working with a fragment, you simply work with XElement as the top-level XML object instead of XDocument.

In the following Try It Out, you load, save, and manipulate an XML element and its child nodes, just as you did for an XML document.
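A minimal sketch of the fragment workflow, assuming a hypothetical file name (fragment.xml in the temp folder) and made-up element names:

```csharp
using System.IO;
using System.Xml.Linq;
using static System.Console;

static class FragmentSketch
{
    // Saves a fragment to disk, loads it back, and adds a child element.
    public static XElement SaveLoadAndExtend(string path)
    {
        // XElement, not XDocument, is the top-level object here.
        var customer = new XElement("customer",
            new XAttribute("ID", "A"),
            new XElement("order", new XAttribute("Item", "Widget")));

        customer.Save(path);
        XElement loaded = XElement.Load(path);

        // Manipulate the loaded fragment just as you would a document.
        loaded.Add(new XElement("order", new XAttribute("Item", "Tire")));
        return loaded;
    }

    static void Main()
    {
        // Hypothetical location for the saved fragment.
        string path = Path.Combine(Path.GetTempPath(), "fragment.xml");
        WriteLine(SaveLoadAndExtend(path));
    }
}
```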

LINQ Providers

LINQ to XML is just one example of a LINQ provider. Visual Studio 2015 and the .NET Framework 4.5 come with a number of built-in LINQ providers that provide query solutions for different types of data:

  • LINQ to Objects — Provides queries on any kind of C# in-memory object, such as arrays, lists, and other collection types. Most of the examples in this chapter use LINQ to Objects. However, you can use the techniques you learn in this chapter with all of the varieties of LINQ.
  • LINQ to XML — As you have just seen, this provides creation and manipulation of XML documents using the same syntax and general query mechanism as the other LINQ varieties.
  • LINQ to Entities — The Entity Framework is the newest set of data interface classes in .NET 4, recommended by Microsoft for new development. In this chapter you will add an ADO.NET Entity Framework data source to your Visual C# project, then query it using LINQ to Entities.
  • LINQ to DataSet — The DataSet object was introduced in the first version of the .NET Framework. This variety of LINQ enables legacy .NET data to be queried easily with LINQ.
  • LINQ to SQL — This is an alternative LINQ interface that has been superseded by LINQ to Entities.
  • PLINQ — PLINQ, or Parallel LINQ, extends LINQ to Objects with a parallel programming library that can split up a query to execute simultaneously on a multicore processor.
  • LINQ to JSON — Included in the Newtonsoft package you used in the previous chapter, this library supports creation and manipulation of JSON documents using the same syntax and general query mechanism as the other LINQ varieties.

With so many varieties of LINQ, it is impossible to cover them all in a beginning book, but the syntax and methods you will see apply to all. Let's next look at the LINQ query syntax using the LINQ to Objects provider.

LINQ Query Syntax

In the following Try It Out, you use LINQ to create a query to find some data in a simple in-memory array of objects and print it to the console.

Declaring a Variable for Results Using the var Keyword

The LINQ query starts by declaring a variable to hold the results of the query, usually with the var keyword:

var queryResults =

var is a C# keyword that declares an implicitly typed variable, which makes it ideal for holding the results of LINQ queries. The var keyword tells the C# compiler to infer the type of the result based on the query. That way, you don't have to declare ahead of time what type of objects will be returned from the LINQ query; the compiler takes care of it for you. If the query can return multiple items, then it acts like a collection of the objects in the query data source (technically, it is not a collection; it just looks that way).

By the way, the name queryResults is arbitrary; you can name the result anything you want. It could be namesBeginningWithS or anything else that makes sense in your program.
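To make the type inference concrete, here is a small sketch (the class and method names are invented for illustration). For a query over a string array, the type the compiler infers for the var variable is compatible with IEnumerable<string>, which is why the second assignment compiles:

```csharp
using System.Collections.Generic;
using System.Linq;
using static System.Console;

static class VarSketch
{
    public static IEnumerable<string> NamesStartingWithS(string[] names)
    {
        // Implicitly typed: the compiler infers the type from the query.
        var queryResults =
            from n in names
            where n.StartsWith("S")
            select n;

        // This assignment compiles only because the inferred type
        // is compatible with IEnumerable<string>.
        IEnumerable<string> typedResults = queryResults;
        return typedResults;
    }

    static void Main()
    {
        string[] names = { "Alonso", "Zheng", "Smith", "Jones", "Smythe" };
        foreach (var item in NamesStartingWithS(names))
        {
            WriteLine(item);
        }
    }
}
```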

Specifying the Data Source: from Clause

The next part of the LINQ query is the from clause, which specifies the data you are querying:

    from n in names

Your data source in this case is names, the array of strings declared earlier. The variable n is just a stand-in for an individual element in the data source, similar to the variable name following a foreach statement. By specifying from, you are indicating that you are going to query a subset of the collection, rather than iterate through all the elements.

Speaking of iteration, a LINQ data source must be enumerable — that is, it must be an array or collection of items from which you can pick one or more elements to iterate through.

The data source cannot be a single value or object, such as a single int variable. You already have such a single item, so there is no point in querying it!

Specifying the Condition: where Clause

In the next part of the LINQ query, you specify the condition for your query using the where clause, which looks like this:

    where n.StartsWith("S")

Any Boolean (true or false) expression that can be applied to the items in the data source can be specified in the where clause. The where clause is optional, but in almost all cases you will want to specify a where condition to limit the results to only the data you want. The where clause is called a restriction operator in LINQ because it restricts the results of the query.

Here, you specify that the name string starts with the letter S, but you could specify anything else about the string instead — for example, a length greater than 10 (where n.Length > 10) or containing a Q (where n.Contains("Q")).
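The following sketch tries two alternative conditions on the chapter's names array. The class and method names are invented for illustration, and the length threshold of 5 is chosen only so that this particular array produces some matches:

```csharp
using System.Collections.Generic;
using System.Linq;
using static System.Console;

static class WhereSketch
{
    static readonly string[] Names = {
        "Alonso", "Zheng", "Smith", "Jones", "Smythe", "Small",
        "Ruiz", "Hsieh", "Jorgenson", "Ilyich", "Singh", "Samba", "Fatimah" };

    // A condition on the length of each name.
    public static IEnumerable<string> LongerThan(int length) =>
        from n in Names
        where n.Length > length
        select n;

    // Any Boolean expression works, including combined conditions.
    public static IEnumerable<string> StartsWithSAndLongerThan(int length) =>
        from n in Names
        where n.StartsWith("S") && n.Length > length
        select n;

    static void Main()
    {
        // Prints Alonso, Smythe, Jorgenson, Ilyich, Fatimah
        WriteLine(string.Join(", ", LongerThan(5)));
        // Prints Smythe
        WriteLine(string.Join(", ", StartsWithSAndLongerThan(5)));
    }
}
```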

Selecting Items: select Clause

Finally, the select clause specifies which items appear in the result set. The select clause looks like this:

      select n

The select clause is required because you must specify which items from your query appear in the result set. For this set of data, it is not very interesting because you have only one item, the name, in each element of the result set. You'll look at some examples with more complex objects in the result set where the usefulness of the select clause will be more apparent, but first, you need to finish the example.

Finishing Up: Using the foreach Loop

Now you print out the results of the query. Like the array used as the data source, the results of a LINQ query like this are enumerable, meaning you can iterate through the results with a foreach statement:

WriteLine("Names beginning with S:");
foreach (var item in queryResults) {
       WriteLine(item);
}

In this case, you matched five names — Smith, Smythe, Small, Singh, and Samba — so that is what you display in the foreach loop.

Deferred Query Execution

You may be thinking that the foreach loop really isn't part of LINQ itself — it's only looping through your results. While it's true that the foreach construct is not itself part of LINQ, nevertheless, it is the part of your code that actually executes the LINQ query! The assignment of the query results variable only saves a plan for executing the query; with LINQ, the data itself is not retrieved until the results are accessed. This is called deferred query execution or lazy evaluation of queries. Execution will be deferred for any query that produces a sequence — that is, a list — of results.
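One way to observe deferred execution is to change the data source after the query has been defined but before the results are read. The sketch below uses a shortened, made-up array and the standard ToList() method to force the query to run:

```csharp
using System.Collections.Generic;
using System.Linq;
using static System.Console;

static class DeferredSketch
{
    public static List<string> Run()
    {
        string[] names = { "Smith", "Jones", "Small" };

        // Only the plan is stored here; nothing has been filtered yet.
        var queryResults =
            from n in names
            where n.StartsWith("S")
            select n;

        // Change the source after the query has been defined.
        names[1] = "Singh";

        // ToList() enumerates the query, so execution happens now
        // and sees the modified array.
        return queryResults.ToList();
    }

    static void Main()
    {
        // Prints Smith, Singh, Small
        WriteLine(string.Join(", ", Run()));
    }
}
```

If the query had run at the point of assignment, Singh could not appear in the results, because the array still contained Jones at that time.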

Now, back to the code. You've printed out the results; it's time to finish the program:

Write("Program finished, press Enter/Return to continue:");
ReadLine();

These lines just ensure that the results of the console program stay on the screen until you press Enter, even if you press F5 instead of Ctrl+F5. You'll use this construct in most of the other LINQ examples as well.

LINQ Method Syntax

There are multiple ways of doing the same thing with LINQ, as is often the case in programming. As noted, the previous example was written using the LINQ query syntax; in the next example, you will write the same program using LINQ's method syntax (also called explicit syntax, but the term method syntax is used here).

LINQ Extension Methods

LINQ is implemented as a series of extension methods to collections, arrays, query results, and any other object that implements the IEnumerable<T> interface. You can see these methods with the Visual Studio IntelliSense feature. For example, in Visual Studio 2015, open the Program.cs file in the FirstLINQquery program you just completed and type in a new reference to the names array just below it:

string[] names = { "Alonso", "Zheng", "Smith", "Jones", "Smythe", "Small",
"Ruiz", "Hsieh", "Jorgenson", "Ilyich", "Singh", "Samba", "Fatimah" };
names.

Just as you type the period following names, you will see the methods available for names listed by the Visual Studio IntelliSense feature.

The Where<T> method and most of the other available methods are extension methods (as shown in the documentation appearing to the right of the Where<T> method, it begins with extension). You can see that they are LINQ extensions by commenting out the using System.Linq directive at the top; you will find that Where<T>, Union<T>, Take<T>, and most of the other methods in the list no longer appear. The from…where…select query expression you used in the previous example is translated by the C# compiler into a series of calls to these methods. When using the LINQ method syntax, you call these methods directly.
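A short sketch can show that the two forms are equivalent for the chapter's query. The class and method names are invented for illustration; the argument passed to Where is a lambda expression, which is covered in the Lambda Expressions section:

```csharp
using System.Collections.Generic;
using System.Linq;
using static System.Console;

static class SyntaxComparisonSketch
{
    static readonly string[] Names = {
        "Alonso", "Zheng", "Smith", "Jones", "Smythe", "Small",
        "Ruiz", "Hsieh", "Jorgenson", "Ilyich", "Singh", "Samba", "Fatimah" };

    // Query syntax: the compiler turns this into extension method calls.
    public static IEnumerable<string> WithQuerySyntax() =>
        from n in Names
        where n.StartsWith("S")
        select n;

    // Method syntax: the extension method is called directly.
    public static IEnumerable<string> WithMethodSyntax() =>
        Names.Where(n => n.StartsWith("S"));

    static void Main()
    {
        // Both forms yield the same names in the same order.
        WriteLine(WithQuerySyntax().SequenceEqual(WithMethodSyntax()));  // True
    }
}
```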

Query Syntax versus Method Syntax

The query syntax is the preferred way of programming queries in LINQ, as it is generally easier to read and is simpler to use for the most common queries. However, it is important to have a basic understanding of the method syntax because some LINQ capabilities either are not available in the query syntax, or are just easier to use in the method syntax.

In this chapter, you will mostly use the query syntax, but the method syntax is pointed out in situations where it is needed, and you'll learn how to use the method syntax to solve the problem.

Most of the LINQ methods that use the method syntax require that you pass a method or function to evaluate the query expression. The method/function parameter is passed in the form of a delegate, which typically references an anonymous method.

Luckily, LINQ makes doing this much easier than it sounds! You create the method/function by using a lambda expression, which encapsulates the delegate in an elegant manner.

Lambda Expressions

A lambda expression is a simple way to create a method on-the-fly for use in your LINQ query. It uses the => operator, which declares the parameters for your method followed by the method logic all on a single line!

For example, consider the lambda expression:

n => n < 0

This declares a method with a single parameter named n. The method returns true if n is less than zero, otherwise false. It's dead simple. You don't have to come up with a method name, put in a return statement, or wrap any code with curly braces.

Returning a true/false value like this is typical for lambdas used in LINQ, but it is not required. For example, here is a lambda that creates a method that returns the sum of two variables. This lambda uses multiple parameters:

(a, b) => a + b

This declares a method with two parameters named a and b. The method logic returns the sum of a and b. You don't have to declare what type a and b are. They can be int or double or string. The C# compiler infers the types from the context in which the lambda is used.
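To see where the inferred types come from, you can assign a lambda to a delegate variable. In this sketch (the field names are invented for illustration), the standard Func delegate types supply the parameter and return types, so the same lambda text works for both int and string:

```csharp
using System;
using static System.Console;

static class LambdaSketch
{
    // Func<int, bool>: one int parameter, bool result.
    public static readonly Func<int, bool> IsNegative = n => n < 0;

    // The same lambda text, typed by two different delegate types.
    public static readonly Func<int, int, int> AddInts = (a, b) => a + b;
    public static readonly Func<string, string, string> Concat = (a, b) => a + b;

    static void Main()
    {
        WriteLine(IsNegative(-5));      // True
        WriteLine(AddInts(2, 3));       // 5
        WriteLine(Concat("LI", "NQ"));  // LINQ
    }
}
```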

Finally, consider this lambda expression:

n => n.StartsWith("S")

This method returns true if n starts with the letter S, otherwise false. Try this out in an actual program to see this more clearly.

Ordering Query Results

Once you have located some data of interest with a where clause (or Where() method invocation), LINQ makes it easy to perform further processing — such as reordering the results — on the resulting data. In the following Try It Out, you put the results from your first query in alphabetical order.

Understanding the orderby Clause

The orderby clause looks like this:

orderby n

Like the where clause, the orderby clause is optional. Just by adding one line, you can order the results of any arbitrary query. Without LINQ, this would require at least several lines of additional code, and probably additional methods or collections to hold the reordered results, depending on the sorting algorithm you chose to implement. If multiple types needed to be sorted, you would have to implement a set of ordering methods for each one. With LINQ, you don't need to worry about any of that; just add one additional clause in the query statement and you're done.

By default, orderby orders in ascending order (A to Z), but you can specify descending order (from Z to A) simply by adding the descending keyword:

orderby n descending

This orders the example results as follows:

Smythe
Smith
Small
Singh
Samba

Plus, you can order by any arbitrary expression without having to rewrite the query; for example, to order by the last letter in the name instead of normal alphabetical order, you just change the orderby clause to the following:

    orderby n.Substring(n.Length - 1)

This results in the following output:

Samba
Smythe
Smith
Singh
Small
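The three orderings discussed above can be placed side by side in one sketch. The class and method names are invented for illustration; the queries themselves use the clauses shown in the text:

```csharp
using System.Collections.Generic;
using System.Linq;
using static System.Console;

static class OrderBySketch
{
    static readonly string[] Names = {
        "Alonso", "Zheng", "Smith", "Jones", "Smythe", "Small",
        "Ruiz", "Hsieh", "Jorgenson", "Ilyich", "Singh", "Samba", "Fatimah" };

    // Default ordering is ascending (A to Z).
    public static IEnumerable<string> Ascending() =>
        from n in Names
        where n.StartsWith("S")
        orderby n
        select n;

    // The descending keyword reverses the order (Z to A).
    public static IEnumerable<string> Descending() =>
        from n in Names
        where n.StartsWith("S")
        orderby n descending
        select n;

    // Any expression can be the sort key, here the last letter.
    public static IEnumerable<string> ByLastLetter() =>
        from n in Names
        where n.StartsWith("S")
        orderby n.Substring(n.Length - 1)
        select n;

    static void Main()
    {
        WriteLine(string.Join(", ", Ascending()));
        WriteLine(string.Join(", ", Descending()));
        WriteLine(string.Join(", ", ByLastLetter()));
    }
}
```

In the last query, Smith and Singh share the same key (h). The ordering is stable, so they keep their original relative order from the source array.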

Querying a Large Data Set

All this LINQ syntax is well and good, you may be saying, but what is the point? You can see the expected results clearly just by looking at the source array, so why go to all this trouble to query something that is obvious by just looking? As mentioned earlier, sometimes the results of a query are not so obvious. In the following Try It Out, you create a very large array of numbers and query it using LINQ.
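As a rough idea of what such a program might look like, the sketch below fills an array with pseudo-random integers and queries it. The helper name GenerateNumbers, the array size, and the seed are assumptions made for illustration and are not taken from the Try It Out:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using static System.Console;

static class LargeNumberSketch
{
    // Hypothetical helper: fills an array with pseudo-random
    // non-negative integers from a seeded generator.
    public static int[] GenerateNumbers(int count, int seed)
    {
        var generator = new Random(seed);
        var result = new int[count];
        for (int i = 0; i < count; i++)
        {
            result[i] = generator.Next();
        }
        return result;
    }

    // The query has the same shape as the names query.
    public static IEnumerable<int> Below(int[] numbers, int limit) =>
        from n in numbers
        where n < limit
        select n;

    static void Main()
    {
        int[] numbers = GenerateNumbers(10000000, 0);

        WriteLine("Numbers less than 1000:");
        foreach (var item in Below(numbers, 1000))
        {
            WriteLine(item);
        }
    }
}
```

With this many values spread over the whole int range, only a handful (possibly none) fall below 1,000, and you could not spot them by reading the array yourself.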

Using Aggregate Operators

Often, a query returns more results than you might expect. For example, if you were to change the condition of the large-number query program you just created to list the numbers greater than 1,000, rather than the numbers less than 1,000, there would be so many query results that the numbers would not stop printing!

Luckily, LINQ provides a set of aggregate operators that enable you to analyze the results of a query without having to loop through them all. Table 20.1 shows the most commonly used aggregate operators for a set of numeric results such as those from the large-number query. These may be familiar to you if you have used a database query language such as SQL.

Table 20.1 Aggregate Operators for Numeric Results

Operator     Description
Count()      Count of results
Min()        Minimum value in results
Max()        Maximum value in results
Average()    Average value of numeric results
Sum()        Total of all numeric results

There are more aggregate operators, such as Aggregate(), for executing arbitrary code in a manner that enables you to code your own aggregate function. However, those are for advanced users and therefore beyond the scope of this book.
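The operators in Table 20.1 are easiest to check on a small array where you can verify the answers by hand. This sketch uses made-up numbers and invented class and method names:

```csharp
using System.Collections.Generic;
using System.Linq;
using static System.Console;

static class AggregateSketch
{
    static readonly int[] Numbers = { 5, 12, 7, 30, 1 };

    public static IEnumerable<int> GreaterThan(int limit) =>
        from n in Numbers
        where n > limit
        select n;

    static void Main()
    {
        var queryResults = GreaterThan(4);   // matches 5, 12, 7, 30

        WriteLine(queryResults.Count());     // 4
        WriteLine(queryResults.Min());       // 5
        WriteLine(queryResults.Max());       // 30
        // The average is 13.5 (54 divided by 4).
        WriteLine(queryResults.Average());
        // Widening each value to long avoids overflow when
        // summing many large int values.
        WriteLine(queryResults.Sum(n => (long)n));  // 54
    }
}
```

Because of deferred execution, each aggregate call runs the query again. For a very large data source, that cost is worth keeping in mind.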

In the following Try It Out, you modify the large-number query and use aggregate operators to explore the result set from the greater-than version of the large-number query using LINQ.

Using the Select Distinct Query

Another type of query that those of you familiar with the SQL data query language will recognize is the SELECT DISTINCT query, in which you search for the unique values in your data — that is, the query removes any repeated values from the result set. This is a fairly common need when working with queries.

Suppose you need to find the distinct regions in the customer data used in the previous examples. There is no separate region list in the data you just used, so you need to find the unique, nonrepeating list of regions from the customer list itself. LINQ provides a Distinct() method that makes it easy to find this data. You'll use it in the following Try It Out.
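As a rough sketch of the idea (not the Try It Out program itself), the following code assumes a simplified, hypothetical Customer class and made-up sample data. Distinct() has no query-syntax keyword, so the sketch uses the method syntax:

```csharp
using System.Collections.Generic;
using System.Linq;
using static System.Console;

// Hypothetical, simplified customer type for illustration.
class Customer
{
    public string ID { get; set; }
    public string Region { get; set; }
}

static class DistinctSketch
{
    // Made-up sample data.
    static readonly List<Customer> Customers = new List<Customer>
    {
        new Customer { ID = "A", Region = "North America" },
        new Customer { ID = "B", Region = "Asia" },
        new Customer { ID = "C", Region = "Asia" },
        new Customer { ID = "D", Region = "Europe" },
        new Customer { ID = "E", Region = "North America" }
    };

    // Project each customer to its region, then drop repeated values.
    public static IEnumerable<string> DistinctRegions() =>
        Customers.Select(c => c.Region).Distinct();

    static void Main()
    {
        foreach (var region in DistinctRegions())
        {
            WriteLine(region);
        }
    }
}
```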

Ordering by Multiple Levels

Now that you are dealing with objects with multiple properties, you might be able to envision a situation where ordering the query results by a single field is not enough. What if you wanted to query your customers and order the results alphabetically by region, but then order alphabetically by country or city name within a region? LINQ makes this very easy, as you will see in the following Try It Out.
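A minimal sketch of a multilevel ordering, again assuming a hypothetical Customer class and made-up data. The sort keys are simply listed in the orderby clause, separated by commas, from the outermost level to the innermost:

```csharp
using System.Collections.Generic;
using System.Linq;
using static System.Console;

// Hypothetical, simplified customer type for illustration.
class Customer
{
    public string City { get; set; }
    public string Country { get; set; }
    public string Region { get; set; }
}

static class MultiLevelOrderSketch
{
    // Made-up sample data.
    static readonly List<Customer> Customers = new List<Customer>
    {
        new Customer { City = "Mumbai", Country = "India", Region = "Asia" },
        new Customer { City = "New York", Country = "USA", Region = "North America" },
        new Customer { City = "Shanghai", Country = "China", Region = "Asia" },
        new Customer { City = "Toronto", Country = "Canada", Region = "North America" },
        new Customer { City = "Delhi", Country = "India", Region = "Asia" }
    };

    // Order by region first, then country, then city.
    public static IEnumerable<Customer> OrderedCustomers() =>
        from c in Customers
        orderby c.Region, c.Country, c.City
        select c;

    static void Main()
    {
        foreach (var c in OrderedCustomers())
        {
            WriteLine($"{c.Region} / {c.Country} / {c.City}");
        }
    }
}
```

With this data the cities come out as Shanghai, Delhi, Mumbai, Toronto, New York.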

Using Group Queries

A group query divides the data into groups and enables you to sort, calculate aggregates, and compare by group. These are often the most interesting queries in a business context (the ones that really drive decision-making). For example, you might want to compare sales by country or by region to decide where to open another store or hire more staff. You'll do that in the next Try It Out.
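The following sketch totals sales by region. The Customer class, the sample figures, and the choice to return a dictionary are assumptions made for illustration:

```csharp
using System.Collections.Generic;
using System.Linq;
using static System.Console;

// Hypothetical, simplified customer type for illustration.
class Customer
{
    public string ID { get; set; }
    public string Region { get; set; }
    public decimal Sales { get; set; }
}

static class GroupSketch
{
    // Made-up sample data.
    static readonly List<Customer> Customers = new List<Customer>
    {
        new Customer { ID = "A", Region = "North America", Sales = 1000 },
        new Customer { ID = "B", Region = "Asia", Sales = 2000 },
        new Customer { ID = "C", Region = "Asia", Sales = 500 },
        new Customer { ID = "D", Region = "Europe", Sales = 750 },
        new Customer { ID = "E", Region = "North America", Sales = 250 }
    };

    public static Dictionary<string, decimal> SalesByRegion()
    {
        // Each group (cg) holds the customers that share one region;
        // cg.Key is the region name the group was formed on.
        var queryResults =
            from c in Customers
            group c by c.Region into cg
            select new { Region = cg.Key, TotalSales = cg.Sum(c => c.Sales) };

        return queryResults.ToDictionary(r => r.Region, r => r.TotalSales);
    }

    static void Main()
    {
        foreach (var entry in SalesByRegion())
        {
            WriteLine($"{entry.Key}: {entry.Value}");
        }
    }
}
```

The select clause builds an anonymous type, which is one of the cases where declaring the result with var is required rather than merely convenient.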

Using Joins

A data set such as the customers and orders list you just created, with a shared key field (ID), enables a join query, whereby you can query related data in both lists with a single query, joining the results together with the key field. This is similar to the JOIN operation in the SQL data query language; and as you might expect, LINQ provides a join command in the query syntax, which you will use in the following Try It Out.
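A minimal sketch of a join on a shared ID field, assuming hypothetical Customer and Order classes and made-up data:

```csharp
using System.Collections.Generic;
using System.Linq;
using static System.Console;

// Hypothetical, simplified types for illustration.
class Customer
{
    public string ID { get; set; }
    public string City { get; set; }
}

class Order
{
    public string ID { get; set; }   // ID of the customer who placed the order
    public int Amount { get; set; }
}

static class JoinSketch
{
    // Made-up sample data.
    static readonly List<Customer> Customers = new List<Customer>
    {
        new Customer { ID = "A", City = "New York" },
        new Customer { ID = "B", City = "Mumbai" },
        new Customer { ID = "C", City = "Karachi" }
    };

    static readonly List<Order> Orders = new List<Order>
    {
        new Order { ID = "A", Amount = 100 },
        new Order { ID = "B", Amount = 200 },
        new Order { ID = "A", Amount = 300 }
    };

    // Pair each customer with each of that customer's orders.
    public static IEnumerable<string> CustomerOrders() =>
        from c in Customers
        join o in Orders on c.ID equals o.ID
        select $"{c.ID} {c.City} {o.Amount}";

    static void Main()
    {
        foreach (var line in CustomerOrders())
        {
            WriteLine(line);
        }
    }
}
```

Customer C has no orders and therefore does not appear in the results; like an inner JOIN in SQL, the join clause returns only the pairs whose keys match.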

What You Learned in This Chapter

Topic                                   Key Concepts
What LINQ is and when to use it         LINQ is a query language built into C#. Use LINQ to query data from large collections of objects, XML, or databases.
Parts of a LINQ query                   A LINQ query includes the from, where, select, and orderby clauses.
How to get the results of a LINQ query  Use the foreach statement to iterate through the results of a LINQ query.
Deferred execution                      LINQ query execution is deferred until the foreach statement is executed.
Method syntax and query syntax          Use the query syntax for most LINQ queries and the method syntax when required. For any given query, the query syntax or the method syntax will give the same result.
Lambda expressions                      Lambda expressions let you declare a method on-the-fly for use in a LINQ query using the method syntax.
Aggregate operators                     Use LINQ aggregate operators to obtain information about a large data set without having to iterate through every result.
Group queries                           Use group queries to divide data into groups, then sort, calculate aggregates, and compare by group.
Ordering                                Use the orderby operator to order the results of a query.
Joins                                   Use the join operator to query related data in multiple collections with a single query.