Chapter 8. Using the ArcPy Data Access Module with Feature Classes and Tables

In this chapter, we will cover the following recipes:

  • Retrieving features from a feature class with SearchCursor
  • Filtering records with a where clause
  • Improving cursor performance with geometry tokens
  • Inserting rows with InsertCursor
  • Updating rows with UpdateCursor
  • Deleting rows with UpdateCursor
  • Inserting and updating rows inside an edit session
  • Reading geometry from a feature class
  • Using Walk() to navigate directories

Introduction

We'll start this chapter with a basic question. What are cursors? Cursors are in-memory objects containing one or more rows of data from a table or feature class. Each row contains attributes from each field in a data source along with the geometry for each feature. Cursors allow you to search, add, insert, update, and delete data from tables and feature classes.

The arcpy data access module or arcpy.da was introduced in ArcGIS 10.1 and contains methods that allow you to iterate through each row in a cursor. Various types of cursors can be created depending on the needs of developers. For example, search cursors can be created to read values from rows. Update cursors can be created to update values in rows or delete rows, and insert cursors can be created to insert new rows.

There are a number of cursor improvements that have been introduced with the arcpy data access module. Prior to the development of ArcGIS 10.1, cursor performance was notoriously slow. Now, cursors are significantly faster. Esri has estimated that SearchCursors are up to 30 times faster, while InsertCursors are up to 12 times faster. In addition to these general performance improvements, the data access module also provides a number of new options that allow programmers to speed up processing. Rather than returning all the fields in a cursor, you can now specify that a subset of fields be returned. This increases the performance as less data needs to be returned. The same applies to geometry. Traditionally, when accessing the geometry of a feature, the entire geometric definition would be returned. You can now use geometry tokens to return a portion of the geometry rather than the full geometry of the feature. You can also use lists and tuples rather than using rows. There are also other new features, such as edit sessions and the ability to work with versions, domains, and subtypes.

There are three cursor functions in arcpy.da. Each returns a cursor object with the same name as the function. SearchCursor() creates a read-only SearchCursor object containing rows from a table or feature class. InsertCursor() creates an InsertCursor object that can be used to insert new records into a table or feature class. UpdateCursor() returns a cursor object that can be used to edit or delete records from a table or feature class. Each of these cursor objects has methods to access rows in the cursor. You can see the relationship between the cursor functions, the objects they create, and how they are used, as follows:

Function

Object created

Usage

SearchCursor()

SearchCursor

This is a read-only view of data from a table or feature class

InsertCursor()

InsertCursor

This adds rows to a table or feature class

UpdateCursor()

UpdateCursor

This edits or deletes rows in a table or feature class

The SearchCursor() function is used to return a SearchCursor object. This object can only be used to iterate through a set of rows returned for read-only purposes. No insertions, deletions, or updates can occur through this object. An optional where clause can be set to limit the rows returned.

Once you've obtained a cursor instance, it is common to iterate the records, particularly with SearchCursor or UpdateCursor. There are some peculiarities that you need to understand when navigating the records in a cursor. Cursor navigation is forward-moving only. When a cursor is created, the pointer of the cursor sits just above the first row in the cursor. The first call to next() will move the pointer to the first row. Rather than calling the next() method, you can also use a for loop to process each of the records without the need to call the next() method. After performing whatever processing you need to do with this row, a subsequent call to next() will move the pointer to row 2. This process continues as long as you need to access additional rows. However, after a row has been visited, you can't go back a single record at a time. For instance, if the current row is row 3, you can't programmatically back up to row 2. You can only go forward. To revisit rows 1 and 2, you would need to either call the reset() method or recreate the cursor and move back through the object. As I mentioned earlier, cursors are often navigated through the use of for loops as well. In fact, this is a more common way to iterate a cursor and a more efficient way to code your scripts. Cursor navigation is illustrated in the following diagram:

Introduction

The InsertCursor() function is used to create an InsertCursor object that allows you to programmatically add new records to feature classes and tables. To insert rows, call the insertRow() method on this object. You can also retrieve a read-only tuple containing the field names in use by the cursor through the fields property. A lock is placed on the table or feature class being accessed through the cursor. Therefore, it is important to always design your script in a way that releases the cursor when you are done.

The UpdateCursor() function can be used to create an UpdateCursor object that can update and delete rows in a table or feature class. As is the case with InsertCursor, this function places a lock on the data while it's being edited or deleted. If the cursor is used inside a Python's with statement, the lock will automatically be freed after the data has been processed. This hasn't always been the case. Prior to ArcGIS 10.1, cursors were required to be manually released using the Python del statement. Once an instance of UpdateCursor has been obtained, you can then call the updateCursor() method to update records in tables or feature classes and the deleteRow() method to delete a row.

The subject of data locks requires a little more explanation. The insert and update cursors must obtain a lock on the data source they reference. This means that no other application can concurrently access this data source. Locks are a way of preventing multiple users from changing data at the same time and thus, corrupting the data. When the InsertCursor() and UpdateCursor() methods are called in your code, Python attempts to acquire a lock on the data. This lock must be specifically released after the cursor has finished processing so that the running applications of other users, such as ArcMap or ArcCatalog, can access the data sources. If this isn't done, no other application will be able to access the data. Prior to ArcGIS 10.1 and the with statement, cursors had to be specifically unlocked through Python's del statement. Similarly, ArcMap and ArcCatalog also acquire data locks when updating or deleting data. If a data source has been locked by either of these applications, your Python code will not be able to access the data. Therefore, the best practice is to close ArcMap and ArcCatalog before running any standalone Python scripts that use insert or update cursors.

In this chapter, we're going to cover the use of cursors to access and edit tables and feature classes. However, many of the cursor concepts that existed before ArcGIS 10.1 still apply.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset