Lessons 34 and 35 explain how you can use Visual Studio's wizards to build simple database programs. They show one of many ways to connect a program to a data source.
Language-Integrated Query (LINQ) provides another method for bridging the gap between a program and data. Instead of simply providing another way to access data in a database, however, LINQ can help a program access data stored in many places. LINQ lets a program use the same techniques to access data stored in databases, arrays, collections, or files.
LINQ provides four basic technologies that give you access to data stored in various places:
In this lesson you learn how to use LINQ to Objects. You learn how to extract data from lists, collections, and arrays and how to process the results.
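Using LINQ to process data takes three steps:
1. Create the data source.
2. Build a query to select data from the data source.
3. Execute the query and process the result.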
You might expect the third step to be two separate steps, “Execute the query” and “Process the result.” In practice, however, LINQ doesn't actually execute the query until it must—when the program tries to access the results. This is called deferred execution.
For example, the following code displays the even numbers between 0 and 99:
// Display the even numbers between 0 and 99.
private void Form1_Load(object sender, EventArgs e)
{
    // 1. Create the data source.
    int[] numbers = new int[100];
    for (int i = 0; i < 100; i++) numbers[i] = i;

    // 2. Build a query to select data from the data source.
    var evenQuery =
        from int num in numbers
        where (num % 2 == 0)
        select num;

    // 3. Execute the query and process the result.
    foreach (int num in evenQuery) Console.WriteLine(num.ToString());
}
The program starts by creating the data source: an array containing the numbers 0 through 99. In this example the data source is quite simple, but in other programs it could be much more complex. Instead of an array of numbers, it could be a list of Customer objects or an array of Order objects that contain lists of OrderItem objects.
Next the program builds a query to select the even numbers from the list. I explain queries in more detail later, but the following list describes the key pieces of this query:
var: This is the data type of whatever is returned by the query. In this example the result will be an IEnumerable<int>, but in general the results of LINQ queries can have some very strange data types. Rather than trying to figure out what a query will return, most developers use the implicit data type var. The var keyword tells the C# compiler to figure out what the data type is and use that so you don't need to use a specific data type.
evenQuery: This is the name the code is giving to the query. You can think of it as a variable that represents the result that LINQ will later produce.
from int num in numbers: This means the query will select data from the numbers array. It will use the int variable num to range over the values in the array. Because num ranges over the values, it is called the query's range variable. (If you omit the int data type, the compiler will implicitly figure out its data type.)
where (num % 2 == 0): This is the query's where clause. It determines which items are selected from the array. This example selects the even numbers (where num mod 2 is 0).
select num: This tells the query what to return. In this case the query returns whatever is in the range variable num for the values that are selected. Often you will want to return the value of the range variable, but you could return something else such as 2 * num or a new object created with a constructor that takes num as a parameter.
In the final step of performing the query, the code loops through the result produced by LINQ. The code displays each int value in the Console window. It's only when the program tries to iterate over the results of the query that the query is actually executed.
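Deferred execution is easy to observe in a small experiment. The following is a minimal sketch, not part of the lesson's downloads, and the class and method names are made up for illustration. It changes the array after the query is defined but before the results are read, so the results reflect the change.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; not part of the lesson's example programs.
static class DeferredExecutionDemo
{
    // Returns the even values the query finds when it is finally enumerated.
    public static List<int> Run()
    {
        int[] numbers = { 1, 2, 3, 4 };

        // Defining the query does not examine the array's values yet.
        var evenQuery =
            from num in numbers
            where num % 2 == 0
            select num;

        // Change the data source after the query is defined.
        numbers[0] = 10;

        // The query executes here, so it sees the modified array.
        return evenQuery.ToList();
    }

    static void Main()
    {
        // Displays 10, 2, and 4 on separate lines.
        foreach (int num in Run()) Console.WriteLine(num);
    }
}
```

If the query had run when it was defined, the value 10 would not appear in the output.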
The following sections provide more detailed descriptions of some of the key pieces of a LINQ query: where clauses, order by clauses, and select clauses.
Probably the most common reason to use LINQ is to filter the data with a where clause. The where clause can include normal boolean expressions that use &&, ||, >, and other boolean operators. It can use the range variable and any properties or methods that it provides (if it's an object). It can even perform calculations and invoke functions.
For example, the following query is similar to the earlier one that selects even numbers, except this one's where clause uses the IsPrime method to select only prime numbers. (How the IsPrime function works isn't important to this discussion, so it isn't shown here. You can see it in the Find Primes program in this lesson's download.)
var primeQuery =
    from int num in numbers
    where (IsPrime(num))
    select num;
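If you want to try this query without the download, the following sketch supplies a stand-in for IsPrime. It uses simple trial division and is only an assumption about what such a method might look like; it is not the Find Primes program's actual code, and the class and the PrimesBelow helper are hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical sketch; the IsPrime method in the lesson's download may differ.
static class FindPrimesSketch
{
    // Return true if value is prime, using trial division.
    public static bool IsPrime(int value)
    {
        if (value < 2) return false;
        for (int divisor = 2; divisor * divisor <= value; divisor++)
        {
            if (value % divisor == 0) return false;
        }
        return true;
    }

    // Return the primes between 0 and limit - 1.
    public static int[] PrimesBelow(int limit)
    {
        // Create the data source the same way the lesson's first example does.
        int[] numbers = new int[limit];
        for (int i = 0; i < limit; i++) numbers[i] = i;

        // The where clause invokes a method for each value.
        var primeQuery =
            from int num in numbers
            where (IsPrime(num))
            select num;

        return primeQuery.ToArray();
    }

    static void Main()
    {
        // Displays 2, 3, 5, 7, 11, 13, 17, and 19 on separate lines.
        foreach (int prime in PrimesBelow(20)) Console.WriteLine(prime);
    }
}
```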
The Find Customers example program shown in Figure 36.1 (and available in this lesson's code download on the website) demonstrates several where clauses.
The following code shows the Customer class used by the Find Customers program. It includes some auto-implemented properties and an overridden ToString method that displays the Customer's values:
class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public decimal Balance { get; set; }
    public DateTime DueDate { get; set; }

    public override string ToString()
    {
        return FirstName + " " + LastName + " " +
            Balance.ToString("C") + " " + DueDate.ToString("d");
    }
}
The following code shows how the Find Customers program displays the same customer data selected with different where clauses:
// Display customers selected in various ways.
private void Form1_Load(object sender, EventArgs e)
{
    DateTime today = new DateTime(2020, 4, 1);
    //DateTime today = DateTime.Today;
    this.Text = "Find Customers (" + today.ToString("d") + ")";

    // Make the customers.
    Customer[] customers =
    {
        new Customer() { FirstName="Ann", LastName="Ashler",
            Balance = 100, DueDate = new DateTime(2020, 3, 10)},
        new Customer() { FirstName="Bob", LastName="Boggart",
            Balance = 150, DueDate = new DateTime(2020, 2, 5)},
        // … Other Customers omitted …
    };

    // Display all customers.
    allListBox.DataSource = customers;

    // Display customers with negative balances.
    var negativeQuery =
        from Customer cust in customers
        where cust.Balance < 0
        select cust;
    negativeListBox.DataSource = negativeQuery.ToArray();

    // Display customers who owe at least $50.
    var owes50Query =
        from Customer cust in customers
        where cust.Balance <= -50
        select cust;
    owes50listBox.DataSource = owes50Query.ToArray();

    // Display customers who owe at least $50
    // and are overdue at least 30 days.
    // Compare against today (not DateTime.Now) so the result
    // matches the date shown in the title bar.
    var overdueQuery =
        from Customer cust in customers
        where (cust.Balance <= -50) &&
            (today.Subtract(cust.DueDate).TotalDays > 30)
        select cust;
    overdueListBox.DataSource = overdueQuery.ToArray();
}
The program starts by creating a DateTime named today and setting it equal to April 1, 2020. In a real application you would probably use the current date (commented out), but this program uses that specific date so it works well with the sample data. The program then displays the date in its title bar (so you can compare it to the Customers' due dates) and creates an array of Customer objects.
Next the code sets the allListBox control's DataSource property to the array so that ListBox displays all of the Customer objects. The Customer class's overridden ToString method makes it display each Customer's name, balance, and due date.
The program then executes the following LINQ query:
// Display customers with negative balances.
var negativeQuery =
    from Customer cust in customers
    where cust.Balance < 0
    select cust;
negativeListBox.DataSource = negativeQuery.ToArray();
This query's where clause selects Customers with Balance properties less than 0. The query returns an IEnumerable<Customer>, but a ListBox's DataSource property requires an IList or IListSource, and the query result doesn't implement either of those interfaces. To handle that problem, the program calls the result's ToArray method to convert it into an array that the DataSource property can handle.
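Calling ToArray has a second effect worth knowing about: it executes the query immediately and stores a snapshot of the results. The following minimal sketch, with made-up data and a hypothetical class name, compares the snapshot to the query after the data source changes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; the balances are sample values for illustration.
static class ToArrayDemo
{
    // Returns { snapshot length, number of items the query finds later }.
    public static int[] Run()
    {
        List<decimal> balances = new List<decimal> { 100m, -25m, -75m };

        var negativeQuery =
            from balance in balances
            where balance < 0
            select balance;

        // ToArray runs the query now and copies the results into an array.
        decimal[] snapshot = negativeQuery.ToArray();

        // Add another negative balance after taking the snapshot.
        balances.Add(-10m);

        // The array is unchanged, but the query runs again when Count enumerates it.
        return new int[] { snapshot.Length, negativeQuery.Count() };
    }

    static void Main()
    {
        int[] counts = Run();
        Console.WriteLine("Snapshot: " + counts[0]);   // Snapshot: 2
        Console.WriteLine("Query: " + counts[1]);      // Query: 3
    }
}
```

A control bound to the array therefore keeps showing the values that were selected at the moment ToArray was called.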
After displaying this result, the program executes two other LINQ queries and displays their results similarly. The first query selects Customers who owe at least $50. The final query selects Customers who owe at least $50 and who have a DueDate more than 30 days in the past.
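The date test may be easier to follow in isolation. The following minimal sketch assumes the comparison is made against the program's fixed date of April 1, 2020, and it uses the two due dates shown in the sample data. The class name is hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical demo class that isolates the overdue test.
static class OverdueDemo
{
    // Returns the number of days overdue for dates more than 30 days past due.
    public static int[] Run()
    {
        DateTime today = new DateTime(2020, 4, 1);
        DateTime[] dueDates =
        {
            new DateTime(2020, 3, 10),   // 22 days before today
            new DateTime(2020, 2, 5),    // 56 days before today
        };

        // Subtract returns a TimeSpan; TotalDays gives its length in days.
        var overdueQuery =
            from dueDate in dueDates
            where today.Subtract(dueDate).TotalDays > 30
            select (int)today.Subtract(dueDate).TotalDays;

        return overdueQuery.ToArray();
    }

    static void Main()
    {
        // Displays 56.
        foreach (int days in Run()) Console.WriteLine(days);
    }
}
```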
Often the result of a query is easier to read if you sort the selected values. You can do this by inserting an order by clause between the where clause and the select clause.
The order by clause begins with the keyword orderby followed by one or more values separated by commas that determine how the results are ordered.
Optionally you can follow a value by the keyword ascending (the default) or descending to determine whether the results are ordered in ascending (1-2-3 or A-B-C) or descending (3-2-1 or C-B-A) order.
For example, the following query selects Customers with negative balances and orders them so those with the smallest (most negative) values come first:
var negativeQuery =
    from Customer cust in customers
    where cust.Balance < 0
    orderby cust.Balance ascending
    select cust;
The following version orders the results first by balance and then, if two customers have the same balance, by last name:
var negativeQuery =
    from Customer cust in customers
    where cust.Balance < 0
    orderby cust.Balance, cust.LastName
    select cust;
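Each value in an order by clause can have its own direction. The following minimal sketch uses a made-up Account class and sample data (neither is part of the Find Customers program) to sort by balance in descending order and then by last name in ascending order.

```csharp
using System;
using System.Linq;

// Hypothetical demo; the Account class and its data are for illustration only.
static class OrderByDemo
{
    class Account
    {
        public string LastName { get; set; }
        public decimal Balance { get; set; }
    }

    // Returns last names ordered by balance (descending), then by name.
    public static string[] Run()
    {
        Account[] accounts =
        {
            new Account { LastName = "Cole", Balance = -20m },
            new Account { LastName = "Adams", Balance = -75m },
            new Account { LastName = "Baker", Balance = -20m },
        };

        var query =
            from acct in accounts
            where acct.Balance < 0
            orderby acct.Balance descending, acct.LastName
            select acct.LastName;

        return query.ToArray();
    }

    static void Main()
    {
        // Displays Baker, Cole, and Adams on separate lines.
        foreach (string name in Run()) Console.WriteLine(name);
    }
}
```

Baker and Cole tie at -20, which is larger than -75, so they come first and the tie is broken by last name.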
The select clause determines what data is pulled from the data source and stored in the result. All of the previous examples select the data over which they are ranging. For example, the Find Customers example program ranges over an array of Customer objects and selects certain Customer objects.
Instead of selecting the objects in the query's range, a program can select only some properties of those objects, a result calculated from those properties, or even completely new objects. Selecting a new kind of data from the existing data is called transforming or projecting the data.
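As a minimal sketch of projection, the following code ranges over integers but selects calculated values and strings instead. The class and method names are hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical demo class showing two simple projections.
static class ProjectionDemo
{
    // Select a value calculated from the range variable.
    public static int[] Doubled()
    {
        int[] numbers = { 1, 2, 3 };
        var doubledQuery =
            from num in numbers
            select 2 * num;
        return doubledQuery.ToArray();
    }

    // Select a different data type (string) than the one being ranged over (int).
    public static string[] Labels()
    {
        int[] numbers = { 1, 2, 3 };
        var labelQuery =
            from num in numbers
            select "Item " + num;
        return labelQuery.ToArray();
    }

    static void Main()
    {
        foreach (int value in Doubled()) Console.WriteLine(value);    // 2, 4, 6
        foreach (string label in Labels()) Console.WriteLine(label);  // Item 1, Item 2, Item 3
    }
}
```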
The Find Students example program shown in Figure 36.2 (and available in this lesson's code download on the website) uses the following simple Student class:
class Student
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public List<int> TestScores { get; set; }
}
The program uses the following query to select all of the students' names and test averages ordered by name:
// Select all students and their test averages ordered by name.
var allStudents =
    from Student student in students
    orderby student.LastName, student.FirstName
    select String.Format("{0} {1} {2:0.00}",
        student.FirstName, student.LastName,
        student.TestScores.Average());
allListBox.DataSource = allStudents.ToArray();
This query's select clause does not select the range variable student. Instead it selects a string that holds the student's first and last names and the student's test score average. (Notice how the code calls the TestScores list's Average method to get the average of the test scores.) The result of the query is an IEnumerable<string> instead of an IEnumerable<Student>.
The program next uses the following code to list the students who have averages of at least 60, giving them passing grades:
// Select passing students ordered by name.
var passingStudents =
    from Student student in students
    orderby student.LastName, student.FirstName
    where student.TestScores.Average() >= 60
    select student.FirstName + " " + student.LastName;
passingListBox.DataSource = passingStudents.ToArray();
This code again selects a string instead of a Student object. The code that selects failing students is similar, so it isn't shown here.
The program uses the following code to select students with averages below the class average:
// Select all scores and compute a class average.
var allAverages =
    from Student student in students
    select student.TestScores.Average();
double classAverage = allAverages.Average();

// Display the average.
this.Text = "FindStudents: Class Average = " +
    classAverage.ToString("0.00");

// Select students with average below the class average ordered by average.
var belowAverageStudents =
    from Student student in students
    orderby student.TestScores.Average()
    where student.TestScores.Average() < classAverage
    select new {Name = student.FirstName + " " + student.LastName,
        Average = student.TestScores.Average()};
foreach (var info in belowAverageStudents)
    belowAverageListBox.Items.Add(info.Name + " " + info.Average);
This snippet starts by selecting all of the students' test score averages. This returns an IEnumerable<double>. The program calls that result's Average method to get the class average.
Next the code queries the student data again, this time selecting students with averages below the class average.
This query demonstrates a new kind of select clause that creates a sequence of new objects. The new objects have two properties, Name and Average, that are given values by the select clause. The data type of these new objects is created automatically and isn't given an explicit name, so this is known as an anonymous type.
After creating the query, the code loops through its results, using each object's Name and Average property to display the below average students in a ListBox. Notice that the code gives the looping variable info the implicit data type var so it doesn't need to figure out what data type it really has.
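The following self-contained sketch shows the same anonymous-type technique outside of a form. The student names and scores are made-up sample data, the class BelowAverageDemo is hypothetical, and the results are written to the Console window instead of a ListBox.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Student
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public List<int> TestScores { get; set; }
}

// Hypothetical demo class; the sample data is for illustration only.
static class BelowAverageDemo
{
    // Returns the names of students below the class average, lowest average first.
    public static List<string> Run()
    {
        Student[] students =
        {
            new Student { FirstName = "Ann", LastName = "Archer",
                TestScores = new List<int> { 90, 100 } },   // Average 95
            new Student { FirstName = "Bob", LastName = "Baker",
                TestScores = new List<int> { 50, 60 } },    // Average 55
            new Student { FirstName = "Cat", LastName = "Carter",
                TestScores = new List<int> { 70, 80 } },    // Average 75
            new Student { FirstName = "Dan", LastName = "Dunn",
                TestScores = new List<int> { 60, 70 } },    // Average 65
        };

        // The class average is (95 + 55 + 75 + 65) / 4 = 72.5.
        var allAverages =
            from student in students
            select student.TestScores.Average();
        double classAverage = allAverages.Average();

        // Each result is an anonymous object with Name and Average properties.
        var belowAverageStudents =
            from student in students
            where student.TestScores.Average() < classAverage
            orderby student.TestScores.Average()
            select new { Name = student.FirstName + " " + student.LastName,
                Average = student.TestScores.Average() };

        List<string> names = new List<string>();
        foreach (var info in belowAverageStudents)
        {
            Console.WriteLine(info.Name + " " + info.Average);
            names.Add(info.Name);
        }
        return names;
    }

    static void Main()
    {
        Run();
    }
}
```

Because the anonymous type has no name, the method returns a list of strings instead of the anonymous objects themselves.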
LINQ provides plenty of other features that won't fit in this lesson. It lets you:
Use aggregate functions such as Average (which you've already seen), Count, Min, Max, and Sum
Microsoft's “Language-Integrated Query (LINQ)” page at msdn.microsoft.com/library/bb397926.aspx provides a good starting point for learning more about LINQ.
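As a brief illustration of the aggregate methods named above, the following minimal sketch applies them to the result of a query. The scores and the class name are made up for the example.

```csharp
using System;
using System.Linq;

// Hypothetical demo class; the scores are sample values.
static class AggregateDemo
{
    // Returns { Count, Min, Max, Sum } for the scores of at least 70.
    public static int[] Run()
    {
        int[] scores = { 72, 85, 91, 60 };

        var passingQuery =
            from score in scores
            where score >= 70
            select score;

        // Each aggregate method executes the query when it is called.
        return new int[]
        {
            passingQuery.Count(),   // 3
            passingQuery.Min(),     // 72
            passingQuery.Max(),     // 91
            passingQuery.Sum()      // 248
        };
    }

    static void Main()
    {
        foreach (int value in Run()) Console.WriteLine(value);
    }
}
```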
In Lesson 29's Try It, you built a program that used the DirectoryInfo class's GetFiles method to search for files matching a pattern and containing a target string. For example, the program could search the directory hierarchy starting at C:\C#Projects to find files with the .cs extension and containing the string “DirectoryInfo.”
In this Try It, you modify that program to perform the same search with LINQ. Instead of writing code to loop through the files returned by GetFiles and examining each, you make LINQ examine the files for you.
In this lesson, you:
Use the DirectoryInfo object's GetFiles method in the query's from clause.
In the query's where clause, use the File class's ReadAllText method to get the file's contents. Convert it to lowercase and use Contains to see if the file holds the target string.
Modify the Search button's Click event handler so it looks like the following. The lines in bold show the modified code:
// Search for files matching the pattern
// and containing the target string.
private void searchButton_Click(object sender, EventArgs e)
{
    // Get the file pattern and target string.
    string pattern = patternComboBox.Text;
    string target = targetTextBox.Text.ToLower();

    // Search for the files.
    DirectoryInfo dirinfo =
        new DirectoryInfo(directoryTextBox.Text);
    var fileQuery =
        from FileInfo fileinfo
            in dirinfo.GetFiles(pattern,
                SearchOption.AllDirectories)
        where
            File.ReadAllText(fileinfo.FullName).ToLower().Contains(target)
        select fileinfo.FullName;

    // Display the result.
    fileListBox.DataSource = fileQuery.ToArray();
}
If you compare this code to the version used by the Try It in Lesson 29, you'll see that this version is much shorter.
Enumerable class's Range method to initialize the source data.)
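As a minimal sketch of how Enumerable.Range can stand in for the loop that fills an array, the following code builds the numbers 0 through 99 and selects the even ones. Note that the method's second argument is the number of values to generate, not the last value. The class name is hypothetical, and this is not a solution to any particular exercise.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class showing Enumerable.Range as a data source.
static class RangeDemo
{
    public static int[] EvenNumbers()
    {
        // Range(start, count): 100 values starting at 0, so 0 through 99.
        IEnumerable<int> numbers = Enumerable.Range(0, 100);

        var evenQuery =
            from num in numbers
            where num % 2 == 0
            select num;

        return evenQuery.ToArray();
    }

    static void Main()
    {
        int[] evens = EvenNumbers();
        Console.WriteLine(evens.Length);             // 50
        Console.WriteLine(evens[0]);                 // 0
        Console.WriteLine(evens[evens.Length - 1]);  // 98
    }
}
```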
For Exercises 4 through 8 download the Customer Orders program. This program defines the following classes:
class Person
{
    public string Name { get; set; }
}

class OrderItem
{
    public string Description { get; set; }
    public int Quantity { get; set; }
    public decimal UnitPrice { get; set; }
}

class Order
{
    public int OrderId { get; set; }
    public Person Customer { get; set; }
    public List<OrderItem> OrderItems { get; set; }
}
The program's Form_Load event handler creates an array of Order objects. The program's buttons, which are shown in Figure 36.3, let the user display the data in various ways, although initially they don't contain any code. In Exercises 4 through 8, you add that code to give the program its features.
Order objects, but it doesn't fill in those objects' TotalCost properties. Use LINQ to do that. (Hints: Use a foreach loop to loop through the objects. For each object, use a LINQ query to go through the order's OrderItems list and select each OrderItem's UnitPrice times its Quantity. After you define the query, call its Sum function to get the total cost for the order.)
resultListBox by setting that control's DataSource property to the query.
ComboBox. (If no name is selected, don't do anything.)
TextBox.