Constructing a proper attribute query syntax

The construction of property attribute queries is critical to your success in creating geoprocessing scripts that query data from feature classes and tables. All attribute queries that you execute against feature classes and tables will need to have the correct SQL syntax and also follow various rules depending upon the data type that you execute the queries against.

Getting ready

Creating the syntax for attribute queries is one of the most difficult and time-consuming tasks that you'll need to master when creating Python scripts that incorporate the use of the Select by Attributes tool. These queries are basically SQL statements along with a few idiosyncrasies that you'll need to master. If you already have a good understanding of creating queries in ArcMap or perhaps experience in creating SQL statements in other programming languages, then this will be a little easier for you. In addition to creating valid SQL statements, you also need to be aware of some specific Python syntax requirements and some data type differences that will result in a slightly altered formatting of your statements for some data types. In this recipe, you'll learn how to construct valid query syntax and understand the nuances of how different data types alter the syntax as well as some Python-specific constructs.

How to do it…

Initially, we're going to take a look at how queries are constructed in ArcMap, so that you can get a feel of how they are structured.

  1. In ArcMap, open C:ArcpyBookCh7Crime_Ch7.mxd.
  2. Right-click on the Burglaries in 2009 layer and select Open Attribute Table. You should see an attribute table similar to the following screenshot. We're going to be querying the SVCAREA field:
    How to do it…
  3. With the attribute table open, select the Table Options button and then Select by Attributes to display a dialog box that will allow you to construct an attribute query.

    Notice the Select * FROM Burglary WHERE: statement on the query dialog box (shown in the following screenshot). This is a basic SQL statement that will return all the columns from the attribute table for Burglary that meet the condition that we define through the query builder. The asterisk (*) simply indicates that all fields will be returned:

    How to do it…
  4. Make sure that Create a new selection is the selected item in the Method dropdown list. This will create a new selection set.
  5. Double-click on SVCAREA from the list of fields to add the field to the SQL statement builder, as follows:
    How to do it…
  6. Click on the = button.
  7. Click on the Get Unique Values button.
  8. From the list of values generated, double-click on 'North' to complete the SQL statement, as shown in the following screenshot:
    How to do it…
  9. Click on the Apply button to execute the query. This should select 7520 records.

    Many people mistakenly assume that you can simply take a query that has been generated in this fashion and paste it into a Python script. That is not the case. There are some important differences that we'll cover next.

  10. Close the Select by Attributes window and the Burglaries in 2009 table.
  11. Clear the selected feature set by navigating to Selection | Clear Selected Features.
  12. Open the Python window and add the code to import arcpy:
    import arcpy
  13. Create a new variable to hold the query and add the same statement that you created earlier:
    qry = "SVCAREA" = 'North'
  14. Press Enter on your keyboard and you should see an error message similar to the following:
    Runtime error SyntaxError: can't assign to literal (<string>, line 1)
    

    Python interprets SVCAREA and North as strings, but the equal to sign between the two is not part of the string used to set the qry variable. There are several things we need to do to generate a syntactically correct statement for the Python interpreter.

    One important thing has already been taken care of though. Each field name used in a query needs to be surrounded by double quotes. In this case, SVCAREA is the only field used in the query and it has already been enclosed by double quotes. This will always be the case when you're working with shapefiles, file geodatabases, or ArcSDE geodatabases. Here is where it gets a little confusing though. If you're working with data from a personal geodatabase, the field names will need to be enclosed by square brackets instead of double quotes, as shown in the following code example. This can certainly leads to confusion for script developers:

    qry = [SVCAREA] = 'North'

    Now, we need to deal with the single quotes surrounding 'North'. When querying data from fields that have a text data type, the string being evaluated must be enclosed by quotes. If you examine the original query, you'll notice that we have in fact already enclosed North with quotes, so everything should be fine, right? Unfortunately, it's not that simple with Python. Quotes along with a number of other characters must be escaped with a forward slash followed by the character being escaped. In this case, the escape sequence would be ' as shown in the following steps:

  15. Alter your query syntax to incorporate the escape sequence:
    qry = "SVCAREA" = 'North'
  16. Finally, the entire query statement should be enclosed with quotes:
    qry = '"SVCAREA" = 'North''

In addition to the = sign, which tests for equality, there are a number of additional operators that you can use with strings and numeric data, including not equal (< >), greater than (>), greater than or equal to (>=), less than (<), and less than or equal to (<=).

Wildcard characters, including % and _, can also be used for shapefiles, file geodatabases, and ArcSDE geodatabases. These include % that represents any number of characters. The LIKE operator is often used with wildcard characters to perform partial string matching. For example, the following query would find all records with a service area that begins with N and has any number of characters after it:

qry = '"SVCAREA" LIKE 'N%''

The (_) underscore character can be used to represent a single character. For personal geodatabases, the (*) asterisk is used to represent a wildcard character for any number of characters, while (?) represents a single character.

You can also query for the absence of data, also known as NULL values. A NULL value is often mistaken for a value of zero, but this does not always hold true. The NULL values indicate the absence of data, which is different from a value of zero. Null operators include IS NULL and IS NOT NULL. The following code example will find all the records where the SVCAREA field contains no data:

qry = '"SVCAREA" IS NULL'

The final topic that we'll cover in this section are operators used to combine expressions where multiple query conditions need to be met. The AND operator requires that both query conditions be met for the query result to be true, resulting in selected records. The OR operator requires that at least one of the conditions be met.

How it works…

The creation of syntactically correct queries is one of the most challenging aspects of programming ArcGIS with Python. However, once you understand some basic rules, it gets a little easier. In this section, we'll summarize these rules. One of the more important things to keep in mind is that field names must be enclosed with double quotes for all datasets, with the exception of personal geodatabases, which require braces surrounding field names.

There is also an AddFieldDelimiters() function that you can use to add the correct delimiter to a field based on the datasource supplied as a parameter to the function. The syntax for this function is as follows:

AddFieldDelimiters(dataSource,field)

Additionally, most people, especially those new to programming with Python, struggle with the issue of adding single quotes to string values being evaluated by the query. In Python, quotes have to be escaped with a single forward slash followed by the quote. Using this escape sequence will ensure that Python does in fact see this as a quote rather than the end of the string.

Finally, take some time to familiarize yourself with the wildcard characters. For datasets other than personal geodatabases, you'll use the % character for multiple characters and an underscore character for a single character. If you're using a personal geodatabase, the * character is used to match multiple characters and the ? character is used to match a single character. Obviously, the syntax differences between personal geodatabases and all other types of datasets can lead to some confusion.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset