CHAPTER 9

image

Advanced WHERE Clauses

In Chapter 3, you learned how to write queries with SELECT, FROM, WHERE, and ORDER BY clauses. Chapter 3 covered a lot of options for filtering results with the WHERE clause, however, there is still more to learn. This chapter will discuss using the comparison operator LIKE and matching against a partial value or a pattern, full-text search, and WHERE clauses with more than two predicates and the PATINDEX function.

Pattern Matching

Sometimes you know only part of the value that will match the data stored in the table. For example, you may need to search for one word within a description. You can perform searches with pattern matching using wildcards to find one value within another value.

Using LIKE

Pattern matching is possible by using the LIKE operator in the expression instead of equal to (=) or one of the other operators. Most of the time, the percent sign (%) is used as a wildcard along with LIKE to represent zero or more characters. You will also see the underscore (_) used as a wildcard to replace exactly one character, but it’s not used as often. Type in and run the code from Listing 9-1 to learn how to use LIKE.

Listing 9-1. Using LIKE with the Percent Sign

--1
SELECT DISTINCT LastName
FROM Person.Person
WHERE LastName LIKE 'Sand%';

--2
SELECT DISTINCT LastName
FROM Person.Person
WHERE LastName NOT LIKE 'Sand%';

--3
SELECT DISTINCT LastName
FROM Person.Person
WHERE LastName LIKE '%Z%';

--4
SELECT DISTINCT LastName
FROM Person.Person
WHERE LastName LIKE 'Bec_';

The queries in Listing 9-1 contain the keyword DISTINCT to eliminate duplicates in the results shown in Figure 9-1. Query 1 returns all LastName values where the last name starts with Sand. Query 2 returns the opposite—it returns all the LastName values not returned by query 1, which are those rows where the last name does not start with Sand. Query 3 returns all LastName values that contain a Z anywhere in the last name. Query 4 will return only the last name Beck or any last name starting with Bec and one more character, but not the last name Becker because the underscore matches one and only one character.

9781484200476_Fig09-01.jpg

Figure 9-1. The partial results of queries with LIKE

Restricting the Characters in Pattern Matches

The value matching a wildcard may be restricted to a list or range of characters. To do this, surround the possible values or range by square brackets ([]). Alternately, include the caret (^) symbol to list characters or the range of characters that you don’t want to use as replacements. Here is the syntax for using brackets as the wildcard:

SELECT <column1>,<column2>
FROM <schema>.<table>
WHERE <column> LIKE 'value[a-c]';

SELECT <column1>,<column2>
FROM <schema>.<table>
WHERE <column> LIKE 'value[abc]';

SELECT <column1>,<column2>
FROM <schema>.<table>
WHERE <column> LIKE 'value[^d]';

Type in and execute the code from Listing 9-2, which shows some examples. You will probably not encounter the square bracket technique very often, but you should be familiar with the syntax in case you run into it.

Listing 9-2. Using Square Brackets with LIKE

--1
SELECT DISTINCT LastName
FROM Person.Person
WHERE LastName LIKE 'Cho[i-k]';

--2
SELECT DISTINCT LastName
FROM Person.Person
WHERE LastName LIKE 'Cho[ijk]';

--3
SELECT DISTINCT LastName
FROM Person.Person
WHERE LastName LIKE 'Cho[^i]';

Figure 9-2 displays the results of running Listing 9-2. Queries 1 and 2 could return unique rows with a last name of Choi, Choj, or Chok because the pattern specifies the range i to k. Only Choi is actually found in the data. Query 1 specifies the range of values, while query 2 explicitly lists the allowable values that may be replaced. Query 3 returns unique rows that have a last name beginning with Cho and ending with any character except for i.

9781484200476_Fig09-02.jpg

Figure 9-2. The results of queries restricting characters in matches

Searching for Wildcards

Occasionally you will need to search for the percent sign, the underscore, or one of the brackets in a string. If you just include them in your search term, the database engine will ignore them. To get around this, always escape these symbols by including them within brackets when you are actually trying to return values that contain them. Table 9-1 shows the strings to use when searching for one of these symbols within a string.

Table 9-1. Symbols Used in Search Patterns

Symbol

Search Pattern

Percent %

'%[%]%'

Underscore _

'%[_]%'

Left Bracket [

'%[[]%'

Right Bracket ]

'%[]]%'

Combining Wildcards

You may combine wildcards to create even more elaborate patterns. Remember that the percent sign replaces any number of characters, the underscore replaces one character, and the square brackets replace one character based on the values within the brackets. Listing 9-3 demonstrates some examples. Type in and execute the code to see how this works.

Listing 9-3. Combining Wildcards in One Pattern

--1
SELECT LastName
FROM Person.Person
WHERE LastName LIKE 'Ber[rg]%';

--2
SELECT LastName
FROM Person.Person
WHERE LastName LIKE 'Ber[^r]%';

--3
SELECT LastName
FROM Person.Person
WHERE LastName LIKE 'Be%n_';

View Figure 9-3 to see the results of running this code. Query 1 returns all rows with a last name beginning with Ber followed by either r or g (which is signified by the characters within the brackets) and then by any number of characters. Query 2 returns all rows with a last name beginning with Ber followed by any letter except for r and then by any number of characters. Query 3 returns all rows with a last name beginning with Be followed by any number of characters, except that the next-to-last character must be an n.

9781484200476_Fig09-03.jpg

Figure 9-3. The results of queries with multiple wildcards

You will probably find LIKE used frequently in queries, so it’s important to understand how it works. Practice the skills you have just learned by completing Exercise 9-1.

EXERCISE 9-1

Use the AdventureWorks database to complete this exercise. Follow the steps in this exercise to test your knowledge of pattern matching and wildcard queries. You can find the solutions at the end of the chapter.

  1. Write a query that displays the product ID and name for each product from the Production.Product table with a name starting with Chain.
  2. Write a query like the one in question 1 that displays the products with Paint in the name.
  3. Change the last query so that the products without Paint in the name are displayed.
  4. Write a query that displays the business entity ID number, first name, middle name, and last name from the Person.Person table for only those rows that include E or B in the middle name column.
  5. Explain the difference between the following two queries:
    SELECT FirstName
    FROM Person.Person
    WHERE LastName LIKE 'Ja%es';

    SELECT FirstName
    FROM Person.Person
    WHERE LastName LIKE 'Ja_es';

Using PATINDEX

In Chapter 4 you learned to use the CHARINDEX function that returns the position of one string inside another. There is also a function called PATINDEX that works similarly but includes the use of wildcards in the search condition. Here is the syntax:

PATINDEX(‘%pattern%’,expression)

You can use both the percent sign and the underscore and the other patterns used with LIKE. If the pattern is found in the expression, the position where the expression is found is returned. If the pattern is not found, then the function returns 0. Listing 9-4 demonstrates how to use PATINDEX in the SELECT list and the WHERE clause.

Listing 9-4. Using PATINDEX

--1
SELECT LastName, PATINDEX('Ber[rg]%', LastName) AS Position
FROM Person.Person
WHERE PATINDEX('Ber[r,g]%', LastName) > 0;

--2
SELECT LastName, PATINDEX('%r%',LastName) Position
FROM Person.Person
WHERE PATINDEX('%[r]%',LastName) > 0;

Figure 9-4 shows the partial results of running this code. Query 1 returns any rows where the LastName column starts with Ber, either r or g, and then followed by any number of characters. Query 2 returns all rows where the LastName contains the letter R.

9781484200476_Fig09-04.jpg

Figure 9-4. Using PATINDEX

Using WHERE Clauses with Three or More Predicates

In Chapter 3, you learned how to write a WHERE clause with one or two predicates. A WHERE clause can contain more than two predicates combined by the logical operators AND and OR. If a WHERE clause contains more than two predicates using both AND and OR, you must be careful to ensure that the query returns the expected results. Type in and execute the code in Listing 9-5 to see how the order of the predicates affects the results and how to use parentheses to enforce the correct logic.

Listing 9-5. WHERE Clauses with Three Predicates

--1
SELECT BusinessEntityID,FirstName,MiddleName,LastName
FROM Person.Person
WHERE FirstName = 'Ken' AND LastName = 'Myer'
    OR LastName = 'Meyer';

--2
SELECT BusinessEntityID,FirstName,MiddleName,LastName
FROM Person.Person
WHERE LastName = 'Myer' OR LastName = 'Meyer'
    AND FirstName = 'Ken';

--3
SELECT BusinessEntityID,FirstName,MiddleName,LastName
FROM Person.Person
WHERE LastName = 'Meyer'
    AND FirstName = 'Ken' OR LastName = 'Myer';

--4
SELECT BusinessEntityID,FirstName,MiddleName,LastName
FROM Person.Person
WHERE FirstName = 'Ken' AND (LastName = 'Myer'
    OR LastName = 'Meyer'),

You can see the results of Listing 9-5 in Figure 9-5. Once both logical operators AND and OR are used in the WHERE clause, things can get complicated. The logical operator AND takes precedence over OR; therefore, the database engine evaluates AND first. For example, suppose you want to find a name in the Person.Person table, Ken Meyer, but you can’t remember the spelling of the last name. It could be Myer. Listing 9-5 shows four attempts to solve this problem, but only the last one is logically correct.

9781484200476_Fig09-05.jpg

Figure 9-5. The results of queries that force precedence to ensure the correct results

Query 1 returns the rows with the name Ken Myer but also returns any row with the last name Meyer. Queries 2 and 3 return identical results—the row with Ken Meyer and any rows with the last name Myer. Finally, by using the parentheses, query 4 returns the expected results.

When using multiple conditions, you must be very careful about the precedence, or order, that the expressions are evaluated. The database engine evaluates the conditions in the WHERE clause from left to right, but AND takes precedence over OR. Rearranging the terms can produce different but possibly still invalid results as in the previous example. To guarantee that the query is correct, always use parentheses to enforce the logic once the logical operator OR is added to the WHERE clause. The code is also more readable with the parentheses in the case where they are not needed. Later when you look at the code, you will know exactly what you had intended.

Using NOT with Parentheses

Another interesting twist when using parentheses is that you can negate the meaning of the expression within them by specifying the keyword NOT. For example, you could try to find the rows where the first name is Ken and the last name can’t be Myer or Meyer. Type in and execute Listing 9-6 to see two ways to write the query.

Listing 9-6. Using NOT with Parentheses

--1
SELECT BusinessEntityID,FirstName,MiddleName,LastName
FROM Person.Person
WHERE FirstName='Ken' AND LastName <> 'Myer'
    AND LastName <> 'Meyer';

--2
SELECT BusinessEntityID,FirstName,MiddleName,LastName
FROM Person.Person
WHERE FirstName='Ken'
    AND NOT (LastName = 'Myer' OR LastName = 'Meyer'),

Often multiple ways exist to solve the same problem, as in this case. Query 1 contains three expressions. One expression restricts the rows to those where FirstName is Ken. The other two expressions compare LastName to a value using not equal to (<>). In query 2, the expressions within the parentheses are evaluated first. Next, that result is negated by the NOT operator to find all last names that are not Myer or Meyer. Finally, only the rows that also have the first name Ken are returned. You can see the results in Figure 9-6.

9781484200476_Fig09-06.jpg

Figure 9-6. The identical results of two queries with different techniques

As a best practice, always employ parentheses whether they are needed or not to enforce precedence when the WHERE clause includes the logical operator OR. Not only will this decrease the possibility of an incorrect WHERE clause, but it will also increase the readability of the query.

Complete Exercise 9-2 to practice the use of these skills.

EXERCISE 9-2

Use the AdventureWorks database to complete this exercise. The solution can be found at the end of the chapter.

  1. Write a query that returns any orders that were placed in 2006. The orders must have a total due over $1,000 or have a non-NULL credit card. Display the order ID, order date, total due, and credit card information.
  2. Write a query that returns a list of the products where the subcategory or the product name contains the word Mountain. Display only those rows where the color is silver. Return the subcategory and product names, the product ID, and the color information.

Performing a Full-Text Search

You have learned how to use LIKE and PATINDEX to find a character match in data. Full-text search provides the ability to search for words or phrases within string or binary data columns similar to a web search such as Google or Bing. You can only use LIKE for pattern matching and not for searching binary data. Full-text search has support for multiple languages and other features such as synonym searches. Full-text search is especially beneficial for documents stored as binary data in the database.

Full-text search must be installed by running the SQL Server installation either during the initial set up or added later. A special full-text index needs to be created on the table. It is beyond the scope of this book to teach you how to set up and manage full-text search, but here you will see how to write some of the basic queries. For more information about full-text search and some of the features not covered here, like CONTAINSTABLE, see the book Pro Full-Text Search in SQL Server 2008 by Hilary Cotter and Michael Coles (Apress, 2008). There are a number of enhancements to full-text searching introduced with SQL Server 2012 that allow you to search the meta-data, for example, the document author, as well. The AdventureWorks database ships with three full-text indexes already in place. Table 9-2 lists the columns with full-text indexes included by default in AdventureWorks.

Table 9-2. AdventureWorks Columns with Full-Text Indexes

Table Name

Column Name

Data Type

Production.ProductReview

Comments

NVARCHAR(3850)

Production.Document

DocumentSummary

NVARCHAR(MAX)

Production.Document

Document

VARBINARY(MAX)

HumanResources.JobCandidate

Resume

XML

Using CONTAINS

CONTAINS is one of the functions used to search full-text indexes. The simplest way to use CONTAINS is to search a column for a particular word or phrase. Here is the syntax for CONTAINS:

SELECT <column1>,<column2>
FROM <schema>.<tablename>
WHERE CONTAINS(<indexed column>,<searchterm>);

Listing 9-7 shows how to use CONTAINS. Notice that the second query has a regular predicate in the WHERE clause as well. Be sure to type in and execute the code to learn how to use CONTAINS.

Listing 9-7. Using CONTAINS

--1
SELECT FileName
FROM Production.Document
WHERE CONTAINS(Document,'important'),

--2
SELECT FileName
FROM Production.Document
WHERE CONTAINS(Document,' "service guidelines" ')
    AND DocumentLevel = 2;

Figure 9-7 displays the results of running this code. Notice how double quotes are used within single quotes to designate a phrase in query 2. Query 2 also demonstrates that both a full-text predicate and a regular predicate can be used in the same query. You may be wondering why the Document column is not part of the results since that is the search term. The document is actually a binary file, such as a Microsoft Word document, that must be opened by the appropriate application.

9781484200476_Fig09-07.jpg

Figure 9-7. The results of a full-text search operation

Using Multiple Terms with CONTAINS

You can use CONTAINS to find words in data that are not even next to each other by using AND, OR, and NEAR. You can use the operator AND NOT to find results with one term and not another. Listing 9-8 demonstrates this technique.

Listing 9-8. Multiple Terms in CONTAINS

--1
SELECT FileName, DocumentSummary
FROM Production.Document
WHERE CONTAINS(DocumentSummary,'bicycle AND reflectors'),

--2
SELECT FileName, DocumentSummary
FROM Production.Document
WHERE CONTAINS(DocumentSummary,'bicycle AND NOT reflectors'),

--3
SELECT FileName, DocumentSummary
FROM Production.Document
WHERE CONTAINS(DocumentSummary,'maintain NEAR bicycle AND NOT reflectors'),

Figure 9-8 shows the results of running this code. In this case, a regular string data column, DocumentSummary, is searched so you can verify the results.

9781484200476_Fig09-08.jpg

Figure 9-8. The results from using multiple search terms

Searching Multiple Columns

You can search multiple columns or all full-text indexed columns at once without multiple CONTAINS predicates in the WHERE clause. Use the asterisk to specify that all possible columns are searched, or use a comma-delimited list in parentheses to specify a list of columns. Type in and execute the code in Listing 9-9, which demonstrates these techniques.

Listing 9-9. Using Multiple Columns

--1
SELECT FileName, DocumentSummary
FROM Production.Document
WHERE CONTAINS((DocumentSummary,Document),'maintain'),

--2
SELECT FileName, DocumentSummary
FROM Production.Document
WHERE CONTAINS((DocumentSummary),'maintain')
        OR CONTAINS((Document),'maintain')

--3
SELECT FileName, DocumentSummary
FROM Production.Document
WHERE CONTAINS(*,'maintain'),

The list of columns to be searched in query 1 is explicitly listed and contained within an inner set of parentheses. Query 2 is equivalent to query 1 by using two CONTAINS expressions, each searching a different column for the same term. By using the asterisk in query 3 within the CONTAINS expression, all columns with a full-text index are searched.

Using FREETEXT

FREETEXT is similar to CONTAINS except that it returns rows that don’t exactly match. It will return rows that have terms with similar meanings to your search terms by using a thesaurus. FREETEXT is less precise than CONTAINS, and it is less flexible. The keywords AND, OR, and NEAR can’t be used with CONTAINS. Avoid using double quotes that specify an exact phrase with FREETEXT, because then SQL Server won’t use the thesaurus and will search only for the exact phrase. The same rules about multiple columns apply. Type in and execute the code in Listing 9-10, which compares FREETEXT to LIKE.

Listing 9-10. Using FREETEXT

--1
SELECT FileName, DocumentSummary
FROM Production.Document
WHERE FREETEXT((DocumentSummary),'provides'),

--2
SELECT FileName, DocumentSummary
FROM Production.Document
WHERE DocumentSummary LIKE '%provides%'

Figure 9-9 displays the results from Listing 9-10. The DocumentSummary values in the rows returned from query 1 do not contain the word provides. Query 1 returns the rows anyway because FREETEXT will find similar words as well as exact matches. In each case, the word provided can be found in the data. Query 2 doesn’t return the rows because the LIKE operator is looking for an exact match.

9781484200476_Fig09-09.jpg

Figure 9-9. The results from using FREETEXT

Full-text search operations can get much more complicated than the information provided here. This was meant to be an overview of the basic syntax. Be sure to see the book Pro Full-Text Search in SQL Server 2008 by Hilary Cotter and Michael Coles (Apress, 2008) to learn more about full-text search. Practice what you have just learned about full-text search by completing Exercise 9-3.

EXERCISE 9-3

Use the AdventureWorks database to complete the following tasks. Be sure to take advantage of the full-text indexes in place when writing the queries. You can find the solutions at the end of the chapter.

  1. Write a query using the Production.ProductReview table. Use CONTAINS to find all the rows that have the word socks in the Comments column. Return the ProductID and Comments columns.
  2. Write a query using the Production.Document table. Use CONTAINS to find all the rows that have the word reflector in any column that is indexed with full-text search. Display the Title and FileName columns.
  3. Change the query in question 2 so that the rows containing seat are not returned in the results.
  4. Write a query that returns the rows containing any forms of the word replaced from the Production.Document table. Return the title, file name, and document summary columns.

Thinking About Performance

You learned in Chapter 4 that using a function on a column in the WHERE clause can decrease performance, especially if that column is the key column of an index that could have been used for the query. If all the columns required for the query are part of a nonclustered index, either as a key column or included column, it is still possible that the optimizer will choose to scan the nonclustered index instead of scanning the table. You may be wondering what the difference is. The nonclustered index will almost always be a smaller structure than the table and will be less work to scan. Imagine that you had to find a particular phrase in a book. It would be easier for you to read the index instead of reading the entire book because the index is smaller. Type and execute Listing 9-11 to see some examples.

Listing 9-11. Comparing LIKE with CHARINDEX

--1
SET STATISTICS IO ON;
GO
SELECT Name
FROM Production.Product
WHERE CHARINDEX('Bear',Name) = 1;

--2
SELECT Name
FROM Production.Product
WHERE Name LIKE 'Bear%';

--3
SELECT Name, Color
FROM Production.Product
WHERE CHARINDEX('B',Color) = 1;

--4
SELECT Name, Color
FROM Production.Product
WHERE Color LIKE 'B%';

Query 1 uses the CHARINDEX function to find all rows where the Name value starts with Bear. It will return the exact same results as query 2, which uses the LIKE operator. If you take a look at the Messages tab to view the Logical reads, you will see that query 1 takes six logical reads compared to only two logical reads from query 2. The reason for this is that query 1 is scanning the nonclustered index while query 2 is doing a seek against the same index. In this case, a seek of the index is less work.

Queries 3 and 4 do the same test against a non-indexed column, Color. In this case, there is no difference between using a function and LIKE. Both queries take 15 logical reads, the work to scan the entire table. Figure 9-10 shows the Statistics IO information for the four queries.

9781484200476_Fig09-10.jpg

Figure 9-10. The logical reads comparison

A term you may see that describes the condition where the optimizer can take full advantage of an index is sargable. The term sarg is short for search argument, which is then turned into the adjective sargable describing the search predicate. Of the four queries in this section, only the predicate in query 2 is sargable.

Summary

This chapter introduced several of the more advanced topics you need to learn to fully take advantage of complex WHERE clauses. In addition, you learned about an optional feature of SQL Server, the full-text index. These indexes are used most often with data containing paragraphs or documents, not the keys and limited values of most databases.

Now that you have learned about all the topics involved with selecting data, it is time to learn how to manipulate data by inserting, updating, and deleting rows in Chapter 10.

Answers to the Exercises

This section provides solutions to the exercises found on writing queries with advanced feature WHERE clauses.

Solutions to Exercise 9-1: Using LIKE

Use the AdventureWorks database to complete this exercise.

  1. Write a query that displays the product ID and name for each product from the Production.Product table with a name starting with Chain.
    SELECT ProductID, Name
    FROM Production.Product
    WHERE Name LIKE 'Chain%';
  2. Write a query like the one in question 1 that displays the products with Paint in the name.
    SELECT ProductID, Name
    FROM Production.Product
    WHERE Name LIKE '%Paint%';
  3. Change the last query so that the products without Paint in the name are displayed.
    SELECT ProductID, Name
    FROM Production.Product
    WHERE Name NOT LIKE '%Paint%';
  4. Write a query that displays the business entity ID number, first name, middle name, and last name from the Person.Person table for only those rows that include E or B in the middle name column.
    SELECT BusinessEntityID,
         FirstName, MiddleName,
         LastName
    FROM Person.Person
    WHERE MiddleName LIKE '%[EB]%';
  5. Explain the difference between the following two queries:
    SELECT FirstName
    FROM Person.Person
    WHERE LastName LIKE 'Ja%es';

    SELECT FirstName
    FROM Person.Person
    WHERE LastName LIKE 'Ja_es';

The first query will find any rows with a last name that starts with Ja and ends with es. There can be any number of characters in between. The second query allows only one character in between Ja and es.

Solution to Exercise 9-2: Using WHERE Clauses with Three or More Predicates

Use the AdventureWorks database to complete this exercise.

  1. Write a query that returns any orders that were placed in 2006. The orders must have a total due over $1,000 or have a non-NULL credit card. Display the order ID, order date, total due, and credit card information.
    SELECT SalesOrderID, OrderDate, TotalDue, CreditCardID
    FROM Sales.SalesOrderHeader
    WHERE OrderDate >= '2006/01/01'
        AND OrderDate < '2007/01/01'
        AND (TotalDue > 1000 OR CreditCardID IS NOT NULL);
  2. Write a query that returns a list of the products where the subcategory or the product name contains the word mountain. Display only those rows where the color is silver. Return the subcategory and product names, the product ID, and the color information.
    SELECT SUB.Name AS [SubCategory Name],
        P.Name AS [Product Name], ProductID, Color
    FROM Production.Product P
    JOIN Production.ProductSubcategory SUB
        ON P.ProductSubcategoryID = SUB.ProductSubcategoryID
    WHERE (SUB.Name LIKE '%mountain%' OR P.name like '%mountain%')
        AND Color = 'Silver';

Solution to Exercise 9-3: Performing a Full-Text Search

Use the AdventureWorks database to solve the exercises.

  1. Write a query using the Production.ProductReview table. Use CONTAINS to find all the rows that have the word socks in the Comments column. Return the ProductID and Comments columns.
    SELECT ProductID, Comments
    FROM Production.ProductReview
    WHERE CONTAINS(Comments,'socks'),
  2. Write a query using the Production.Document table. Use CONTAINS to find all the rows that have the word reflector in any column that is indexed with full-text search. Display the Title and FileName columns.
    SELECT Title, FileName
    FROM Production.Document
    WHERE CONTAINS(*,'reflector'),
  3. Change the query in question 2 so that the rows containing seat are not returned in the results.
    SELECT Title, FileName
    FROM Production.Document
    WHERE CONTAINS(*,'reflector AND NOT seat'),
  4. Write a query that returns the rows containing any forms of the word replaced from the Production.Document table. Return the title, file name, and document summary columns.
    SELECT Title, FileName, DocumentSummary
    FROM Production.Document
    WHERE FREETEXT(*,'replaced'),
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset