Chapter 11. ADO.NET

Topics in This Chapter

  • ADO.NET Architecture: ADO.NET provides access to data using custom or generic data providers. This section looks at the classes a provider must supply that enable an application to connect to a data source and interact with it using SQL commands.

  • Introduction to Using ADO.NET: ADO.NET supports data access using a connected or disconnected connectivity model. An introduction and comparison of the two architectures is provided.

  • The Connected Model: Use of the connected model requires an understanding of the role that Connection and Command classes have in retrieving data. Examples illustrate how to create a connection and use the Command class to issue SQL commands, invoke stored procedures, and manage multi-command transactions.

  • The Disconnected Model: Disconnected data is stored in an in-memory DataTable or DataSet. The latter is usually made up of multiple DataTables that serve as a local relational data store for an application. A DataAdapter is typically used to load this data and then apply updates to the original data source. Techniques are introduced for handling synchronization issues that arise when updating disconnected data.

  • XML Data Access: Although it does not provide intrinsic XML classes, ADO.NET supports XML integration through properties and methods of the DataSet class. Examples show how to use WriteXml and ReadXml to create an XML file from a DataSet and populate a DataSet from an XML file, respectively.

ADO.NET is based on a flexible set of classes that allow data to be accessed from within the managed environment of .NET. These classes are used to access a variety of data sources including relational databases, XML files, spreadsheets, and text files. Data access is through an API, known as a managed data provider. This provider may be written specifically for a database, or may be a more generic provider such as OLE DB or ODBC (Open DataBase Connectivity). Provider classes expose a connection object that binds an application to a data source, and a command object that supports the use of standard SQL commands to fetch, add, update, or delete data.

ADO.NET supports two broad models for accessing data: disconnected and connected. The disconnected model downloads data to a client's machine where it is encapsulated as an in-memory DataSet that can be accessed like a local relational database. The connected model relies on record-by-record access that requires an open and sustained connection to the data source. Recognizing the most appropriate model to use in an application is at the heart of understanding ADO.NET. This chapter examines both models—offering code examples that demonstrate the classes used to implement each.

Overview of the ADO.NET Architecture

The ADO.NET architecture is designed to make life easier for both the application developer and the database provider. To the developer, it presents a set of abstract classes that define a common set of methods and properties that can be used to access any data source. The data source is treated as an abstract entity, much like a drawing surface is to the GDI+ classes. Figure 11-1 depicts this concept.

ADO.NET data access options

Figure 11-1. ADO.NET data access options

For database providers, ADO.NET serves as a blueprint that describes the base API classes and interface specifications providers must supply with their product. Beneath the surface, the vendor implements the custom code for connecting to their database, processing SQL commands, and returning the results. Many database products, such as MySQL and Oracle, have custom .NET data provider implementations available. In addition, they have generic OLE DB versions. The .NET data provider should always be the first choice because it offers better performance and often supports added custom features. Let's look at both the OLE DB and native .NET data providers

OLE DB Data Provider in .NET

An OLE DB provider is the code that sits between the data consumer and the native API of a data source. It maps the generic OLE DB API to the data source's native APIs. It is a COM-based solution in which the data consumer and provider are COM objects that communicate through COM interfaces. Database vendors and third parties have written OLE DB providers for just about every significant data source. In contrast, far fewer .NET data providers exist. To provide a bridge to these preexisting OLE DB interfaces, .NET includes an OleDB data provider that functions as a thin wrapper to route calls into the native OLE DB. Because interoperability with COM requires switching between managed and unmanaged code, performance can be severely degraded.[1]

As we see in the next section, writing code to use OLE DB is essentially the same as working with a .NET data provider. In fact, new .NET classes provide a “factory” that can dynamically produce code for a selected provider. Consequently, responding to a vendor's upgrade from OLE DB to a custom provider should have no appreciable effect on program logic.

.NET Data Provider

The .NET data provider provides the same basic service to the client as the OLE DB provider: exposing a data source's API to a client. Its advantage is that it can directly access the native API of the data source, rather than relying on an intermediate data access bridge. Native providers may also include additional features to manipulate data types native to the data source and improve performance. For example, the Oracle provider, ODP.NET, includes adjustable settings to control connection pooling, the number of rows to be pre-fetched, and the size of non-scalar data being fetched.

Data Provider Objects for Accessing Data

A managed data provider exposes four classes that enable a data consumer to access the provider's data source. Although these classes are specific to the provider, they derive from abstract ADO.NET classes:

  • DbConnectionEstablishes a connection to the data source.

  • DbCommandUsed to query or send a command to the data source.

  • DbDataReaderProvides read-only and forward-only access to the data source.

  • DBDataAdapterServes as a channel through which a DataSet connects to a provider.

Because these are abstract classes, the developer is responsible for specifying the vendor's specific implementation within the code. As we see next, the object names can be hard coded or provided generically by a provider factory class.

Provider Factories

Each data provider registers a ProviderFactory class and a provider string in the machine.config file. The available providers can be listed using the static GetFactoryClasses method of the DbProviderFactories class. As this code shows, the method returns a DataTable containing four columns of information about the provider.

DataTable tb = DbProviderFactories.GetFactoryClasses();
foreach (DataRow drow in tb.Rows )
{
   StringBuilder sb = new StringBuilder("");
   for (int i=0; i<tb.Columns.Count; i++)
   {
      sb.Append((i+1).ToString()).Append(drow[i].ToString());
      sb.Append("
");
   }
   Console.WriteLine(sb.ToString());
}

Running this code for ADO.NET 2.0 lists four Microsoft written providers: Odbc, OleDb, OracleClient, and SqlClient. Figure 11-2 shows output for the SqlClient provider.

Data provider information returned by GetFactoryClasses()

Figure 11-2. Data provider information returned by GetFactoryClasses()

To use these providers, your code must create objects specific to the provider. For example, the connection object for each would be an OdbcConnection, OleDbConnection, OracleConnection, or SqlConnection type. You can create the objects supplied by the providers directly:

SqlConnection conn = new SqlConnection();
SqlCommand cmd     = new SqlCommand();
SqlDataReader dr   = cmd.ExecuteReader();

However, suppose your application has to support multiple data sources. A switch/case construct could be used, but a better—and more flexible—approach is to use a class factory to create these objects dynamically based on the provider selected. ADO.NET provides just that—a DbProviderFactory class that is used to return objects required by a specific data provider. It works quite simply. A string containing the provider name is passed to the GetFactory method of the DbProviderFactories class. The method returns a factory object that is used to create the specific objects required by the provider. Listing 11-1 demonstrates using a factory.

Example 11-1. Using the DbProviderFactory Class

// System.Data.Common namespace is required
DbProviderFactory factory ;
string provider = "System.Data.SqlClient";  // data provider
string connstr = "Data Source=MYSERVER;Initial Catalog=films;
           User Id=filmsadmin;Password=bogart;";
// Get factory object for SQL Server
factory = DbProviderFactories.GetFactory(provider);
// Get connection object. using ensures connection is closed.
using (DbConnection  conn = factory.CreateConnection())
{
   conn.ConnectionString = connstr;
   try
   {
      conn.Open();
      DbCommand cmd = factory.CreateCommand(); // Command object
      cmd.CommandText = "SELECT * FROM movies WHERE movie_ID=8" ;
      cmd.Connection = conn;
      DbDataReader dr;
      dr = cmd.ExecuteReader();
      dr.Read();
      MessageBox.Show((string)dr["movie_Title"]);
      conn.Close();
   }
   catch (DbException ex)
   { MessageBox.Show(ex.ToString()); }
   catch (Exception ex)
   { MessageBox.Show(ex.ToString()); }
   finally { conn.Close(); }
}

This approach requires only that a provider and connection string be provided. For example, we can easily switch this to an ODBC provider by changing two statements:

string provider= "System.Data.Odbc";
// The DSN (Data Source Name) is defined using an ODBC utility
string connstr = "DSN=movies;Database=films";

Note that the factory class provides a series of Create methods that returns the objects specific to the data provider. These methods include CreateCommand, CreateConnection, and CreateDataAdapter.

Data Access Models: Connected and Disconnected

This section offers an overview of using ADO.NET to access data stored in relational tables. Through simple examples, it presents the classes and concepts that distinguish the connected and disconnected access models.

All examples in this section—as well as the entire chapter—use data from the Films database defined in Figure 11-3. It consists of a movies table containing the top 100 movies as selected by the American Film Institute (AFI) in 1996, an actors table that lists the principal actors who performed in the movies, and an actor-movie helper table that links the two. The data is downloadable as a Microsoft Access (.mdb) file and an XML text (.xml) file.

Films database tables

Figure 11-3. Films database tables

Connected Model

In the ADO.NET connected mode, an active connection is maintained between an application's DataReader object and a data source. A row of data is returned from the data source each time the object's Read method is executed. The most important characteristic of the connected model is that it reads data from a resultset (records returned by a SQL command) one record at a time in a forward-only, read-only manner. It provides no direct way to update or add data. Figure 11-4 depicts the relation ship between the DataReader, Command, and Connection classes that comprise the connected model.

DataReader is used in ADO.NET connected mode

Figure 11-4. DataReader is used in ADO.NET connected mode

Working with the DataReader typically involves four steps:

  1. The connection object is created by passing a connection string to its constructor.

  2. A string variable is assigned the SQL command that specifies the data to fetch.

  3. A command object is created. Its overloads accept a connection object, a query string, and a transaction object (for executing a group of commands).

  4. The DataReader object is created by executing the Command.ExecuteReader() method. This object is then used to read the query results one line at a time over the active data connection.

The following code segment illustrates these steps with a SqlClient data provider. The code reads movie titles from the database and displays them in a ListBox control. Note that the DataReader, Command, and Connection objects are described in detail later in this chapter.

//System.Data.SqlClient namespace is required
// (1) Create Connection
SqlConnection conn = new SqlConnection(connstr);
conn.Open();
// (2) Query string
string sql = "SELECT movie_Title FROM movies ORDER BY
movie_Year";
// (3) Create Command object
SqlCommand cmd = new SqlCommand(sql, conn);
DbDataReader rdr;
// (4) Create DataReader
rdr = cmd.ExecuteReader(CommandBehavior.CloseConnection);
while (rdr.Read())
{
   listBox1.Items.Add(rdr["movie_Title"]);  // Fill ListBox
}
rdr.Close();  // Always close datareader

The parameter to ExecuteReader specifies that the connection is closed when the data reader object is closed.

Disconnected Model

The concept behind the disconnected model is quite simple: Data is loaded—using a SQL command—from an external source into a memory cache on the client's machine; the resultset is manipulated on the local machine; and any updates are passed from data in-memory back to the data source.

The model is “disconnected” because the connection is only open long enough to read data from the source and make updates. By placing data on the client's machine, server resources—data connections, memory, processing time—are freed that would otherwise be required to manipulate the data. The drawback is the time required to load the resultset, and the memory used to store it.

As Figure 11-5 illustrates, the key components of the disconnected model are the DataApdapter and DataSet. The DataAdapter serves as a bridge between the data source and the DataSet, retrieving data into the tables that comprise the DataSet and pushing changes back to the data source. A DataSet object functions as an in-memory relational database that contains one or more DataTables, along with optional relationships that bind the tables. A DataTable contains rows and columns of data that usually derive from a table in the source database.

DataAdapter is used in ADO.NET disconnected mode

Figure 11-5. DataAdapter is used in ADO.NET disconnected mode

Among the numerous methods and properties exposed by the DataAdapter class, the Fill and Update methods are the two most important. Fill passes a query to a database and stores the returned set of data in a selected DataTable; Update performs a deletion, insertion, or update operation based on changes within the DataSet. The actual update commands are exposed as DataAdapter properties. The DataAdapter is presented in much more detail in Section 11.4, “DataSets, DataTables, and the Disconnected Model.”

Core Note

Core Note

Each data provider supplies its own data adapter. Thus, if you look through the System.Data child namespaces (SqlClient, OracleClient, Oledb), you'll find a SqlDataAdapter, OracleDataAdapter, and OleDbDataAdapter, among others. An easy way to acquire the desired adapter in your application is to call the DbProviderFactory.CreateDataAdapter method to return an instance of it.

As a simple introduction to how a DataAdapter and DataSet work together, Listing 11-2 shows how to create a DataTable, fill it with data from a database, and add it to a DataSet.

Example 11-2. Using a DataAdapter to Load Data from a Database

string sql = "SELECT movie_Title, movie_Year FROM movies";
string connStr = " Data Source=MYSERVER;Initial Catalog=films;
                   User Id=filmsadmin;Password=bogart;";
// (1) Create data adapter object
SqlDataAdapter da = new SqlDataAdapter(sql,connStr);
// (2) Create dataset
DataSet ds = new DataSet();
// (3) Create table in dataset and fill with data
da.Fill(ds, "movies");  // Fill table with query results
DataTable dt = ds.Tables["movies"];
// (4) Add movie titles to list box
for (int i=0; i< dt.Rows.Count;i++)
{
   DataRow row = dt.Rows[i];
   listBox1.Items.Add(row["movie_Title"]);
}

The first step is to create an instance of a SqlDataAdapter by passing the select command and the connection string to its constructor. The data adapter takes care of creating the Connection object and opening and closing the connection as needed. After an empty DataSet is created, the DataAdapter's Fill method creates a table movies in the DataSet and fills it with rows of data returned by the SQL command. Each column of the table corresponds to a column in the source data table. Behind the scenes, the data transfer is performed by creating a SqlDataReader that is closed after the transfer is complete.

The data in the table is then used to populate a list box by looping through the rows of the table. As we see in the next chapter, we could achieve the same effect by binding the list control to the table—a mechanism for automatically filling a control with data from a bound source.

ADO.NET Connected Model

As described earlier, the connected model is based on establishing a connection to a database and then using commands to update, delete, or read data on the connected source. The distinguishing characteristic of this model is that commands are issued directly to the data source over a live connection—which remains open until the operations are complete. Whether working with a connected or disconnected model, the first step in accessing a data source is to create a connection object to serve as a communications pathway between the application and database.

Connection Classes

There are multiple connection classes in ADO.NET—each specific to a data provider. These include SqlConnection, OracleConnection, OleDBConnection, and OdbcConnection. Although each may include custom features, ADO.NET compatibility requires that a connector class implement the IDbConnection interface. Table 11-1 summarizes the members defined by this interface.

Table 11-1. Members of the IDbConnection Interface

Category

Name

Description

Property

ConnectionString

Gets or sets the string used to connect to a data source.

Property

ConnectionTimeout

The number of seconds to wait while trying to establish a connection to a data source before timing out.

Property

Database

Name of the database associated with the current connection.

Property

State

Current state of the connection. Returns a ConnectionState enumeration name: Broken, Closed, Connecting, Executing, Fetching, or Open.

Method

Open Close

Opens a connection. Rolls back any pending operations and closes the connection—returning it to a connection pool, if one is used.

Method

BeginTransaction

Initiates a database transaction.

Method

ChangeDatabase

Changes the current database for an open connection. The new database name is passed as string to the method.

Method

CreateCommand

Creates a command object to be associated with connection.

Core Note

Core Note

Even though connection classes implement the IDbConnection interface, they do not necessarily have to provide meaningful functionality. For example, the OracleConnection class does not support the ConnectionTimeOut, Database, or ChangeDatabase members.

Connection String

The connection string specifies the data source and necessary information required to access the data source, such as password and ID. In addition to this basic information, the string can include values for fields specific to a data provider. For example, a SQL Server connection string can include values for Connection Timeout and Packet Size (size of network packet).

Table 11-2 offers a representative list of commonly used connection strings.

Table 11-2. Connection String Examples

Connection Type

Description and Use

SqlConnection

Using SQL Server authentication.

"server=MYSERVER;
uid=filmsadmin;
pwd=bogart;
database=films;"

Or

"Data Source=MYSERVER;User ID=filmsadmin;
password=bogart;Initial Catalog=films;"

SqlConnection

Using Windows authentication.

"server=MYSERVER;
database=films;
Trusted_Connection=yes"

OleDbConnection

Connects to a Microsoft Access database.

"Provider=Microsoft.Jet.OLEDB.4.0;
Data Source=c:\movies.mdb;"

For Internet applications, you may not be able to specify a physical path. Use MapPath to convert a virtual path into a physical path:

string path=
   Server.MapPath("/data/movies.mdb");
Data Source="+path+";"

ODBC (DSN)

"DSN=movies;".

The connection string is used to create the connection object. This is typically done by passing the string to the constructor of the connection object.

string cn= "Data Source=MYSERVER;Initial Catalog=films;
            User Id=filmsadmin;Password=bogart;";
SqlConnection conn = new SqlConnection(cn);
conn.Open();   // Open connection

A connection string can also be built using a safer, object-oriented manner using one of the ConnectionStringBuilder classes supplied by a managed data provider.[2] As this code demonstrates, the values comprising the connection string are assigned to properties of this class. Internally, the object constructs a string from these properties and exposes it as a ConnectionString property.

SqlConnectionStringBuilder sqlBldr = new
      SqlConnectionStringBuilder();
scb.DataSource    = "MYSERVER";
// Or scp["Data Source"] = "MYSERVER";
sqlBldr.Password  = "bogart";
sqlBldr.UserID    = "filmsadmin";
sqlBldr.InitialCatalog = "films";
SqlConnection conn = new
SqlConnection(sqlBldr.ConnectionString);
conn.Open();

The ConnectionStringBuilder object is also useful for applications that input the connection string from a configuration file or other source. Setting the ConnectionString property to the connection string value exposes members that control the behavior of the connection. Table 11-3 lists selected properties of the SqlConnectionStringBuilder class.

Table 11-3. Selected Properties of the SqlConnectionStringBuilder Class

Method

Description

AsynchronousProcessing

Boolean value that indicates whether asynchronous process is permitted on the connection. The command object is responsible for making asynchronous requests.

ConnectionTimeout

Corresponds to the ConnectionTimeout property of the Connection object.

DataSource

Name or address of the SQL Server to connect to.

MaxPoolSize MinPoolSize

Sets or returns the maximum and minimum number of connections in the connection pool for a specific connection string.

Password

Password for accessing SQL Server account.

Pooling

A boolean value that indicates whether connection pooling is used.

UserID

User ID required to access a SQL Server account.

Core Note

Core Note

For demonstration purposes, the connection strings in these examples are shown as cleartext within the application's code. In reality, a connection string should be stored outside of an application's assembly. For Web applications, the Web.Config file is often a reasonable choice. As described in Chapter 17, “The ASP.NET Application Environment,” .NET includes a special configuration section to hold connection strings and supports techniques to encrypt the configuration information.

Desktop applications that access a central database can store the information on the client's machine (in the registry or a configuration file) or have it downloaded as part of an application's startup. The latter approach provides better scalability and security, particularly if the server returns a connection object rather than the string.

Connection Pooling

Creating a connection is a time-consuming process—in some cases taking longer than the subsequent commands take to execute. To eliminate this overhead, ADO.NET creates a pool of identical connections for each unique connection string request it receives. This enables future requests with that connection string to be satisfied from the pool, rather than by reconnecting to the server and performing the overhead to validate the connection.

There are several rules governing connection pooling that you should be aware of:

  • Connection pooling is turned on by default. It can be disabled for a SqlConnection by including "Pooling=false" in the connection string; an OleDbConnection requires "OLE DB Services=-4".

  • Each connection pool is associated with a distinct connection string. When a connection is requested, the pool handler compares the connection string with those of existing pools. If it matches, a connection is allocated from the pool.

  • If all connections in a pool are in use when a request is made, the request is queued until a connection becomes free. Connections are freed when the Close or Dispose method on a connection is called.

  • The connection pool is closed when all connections in it are released by their owners and have timed out.

Under SQL Server, you control the behavior of connection pooling by including key-value pairs in the connection string. These keywords can be used to set minimum and maximum numbers of connections in the pool, and to specify whether a connection is reset when it is taken from the pool. Of particular note is the Lifetime keyword that specifies how long a connection may live until it is destroyed. This value is checked when a connection is returned to the pool. If the connection has been open longer than its Lifetime value, it is destroyed.

This code fragment demonstrates the use of these keywords for SqlClient:

cnString = "Server=MYSERVER;Trusted_Connection=yes;
      database=films;"          +
      "connection reset=false;" +
      "connection Lifetime=60;" +  // Seconds
      "min pool size=1;"        +
      "max pool size=50";           // Default=100
SqlConnection conn = new SqlConnection(cnString);

The Command Object

After a connection object is created, the next step in accessing a database—for the connected model—is to create a command object that submits a query or action command to a data source. Command classes are made available by data providers and must implement the IDbCommand interface.

Creating a Command Object

You can use one of its several constructors to create a command object directly, or use the ProviderFactory approach mentioned in Section 11.1.

This segment demonstrates how to create a command object and explicitly set its properties:

SqlConnection conn = new SqlConnection(connstr);
Conn.open();
string sql = "insert into movies(movie_Title,movie_Year,
        movie_Director) values(@title,@yr,@bestpicture)";
SqlCommand cmd = new SqlCommand();
// Assign connection object and sql query to command object
cmd.Connection = conn;
cmd.commandText = sql;
// Fill in parameter values in query
// This is recommended over concatenation in a query string
cmd.Parameters.AddWithValue ("@title", "Schindler's List");
cmd.Parameters.AddWithValue ("@yr", "1993");
cmd.Parameters.AddWithValue ("@bestpic", "Y");

In situations where multiple data providers may be used, a provider factory provides a more flexible approach. The factory is created by passing its constructor a string containing the data provider. The factory's CreateCommand method is used to return a command object.

string provider = "System.Data.SqlClient";
DBProviderFactory factory =
   DbProviderFactories.GetFactory(provider);
DbCommand cmd   = factory.CreateCommand();
cmd.CommandText = sql;     // Query or command
cmd.Connection  = conn;    // Connection object

Note that DbCommand is an abstract class that implements the IDbCommand interface. It assumes the role of a generic command object. This can eliminate the need to cast the returned command object to a specific provider's command object such as SqlCommand. However, casting is required if you need to access custom features of a provider's command class—for example, only SqlCommand has an ExecuteXmlReader method.

Executing a Command

The SQL command assigned to the CommandText property is executed using one of the four command methods in Table 11-4.

Table 11-4. Command Executexxx Methods

Method

Description

ExecuteNonQuery

Executes an action query and returns the number of rows affected:

cmd.CommandText = "DELETE movies WHERE movie_ID=220";
int ct = cmd.ExecuteNonQuery();

ExecuteReader

Executes a query and returns a DataReader object that provides access to the query's resultset. This method accepts an optional CommandBehavior object that can improve execution efficiency.

cmd.CommandText="SELECT * FROM movies
   WHERE movie_year > '1945';
SqlDataReader rdr= cmd.ExecuteReader();

ExecuteScalar

Executes a query and returns the value of the first column in the first row of the resultset as a scalar value.

cmd.CommandText="SELECT COUNT(movie_title)
   FROM movies";
int movieCt = (int) cmd.ExecuteScalar();

ExecuteXmlReader

Available for SQL Server data provider only. Returns an XmlReader object that is used to access the resultset. XmlReader is discussed in Chapter 10, “Working with XML in .NET.”

The ExecuteReader method is the most important of these methods. It returns a DataReader object that exposes the rows returned by the query. The behavior of this method can be modified by using its overload that accepts a CommandBehavior type parameter. As an example, the following statement specifies that a single row of data is to be returned:

rdr = cmd.ExecuteReader(sql, CommandBehavior.SingleResult);

Some data providers take advantage of this parameter to optimize query execution. The list of values for the CommandBehavior enumeration includes the following:

  • SingleRowIndicates that the query should return one row. Default behavior is to return multiple resultsets.

  • SingleResultThe query is expected to return a single scalar value.

  • KeyInfoReturns column and primary key information. It is used with a data reader's GetSchema method to fetch column schema information.

  • SchemaOnlyUsed to retrieve column names for the resultset. Example:

    dr=cmd.ExecuteReader(CommandBehavior.SchemaOnly);
    string col1= dr.GetName(0);  // First column name
    
  • SequentialAccessPermits data in the returned row to be accessed sequentially by column. This is used with large binary (BLOB) or text fields.

  • CloseConnectionClose connection when reader is closed.

Executing Stored Procedures with the Command Object

A stored procedure is a set of SQL code stored in a database that can be executed as a script. It's a powerful feature that enables logic to be encapsulated, shared, and reused among applications. ADO.NET supports the execution of stored procedures for OleDb , SqlClient, ODBC, and OracleClient data providers.

Executing a stored procedure is quite simple: set the SqlCommand.CommandText property to the name of the procedure; set the CommandType property to the enumeration CommandType.StoredProcedure; and then call the ExecuteNonQuery method.

cmd.CommandText = "SP_AddMovie";  // Stored procedure name
cmd.CommandType = CommandType.StoredProcedure;
cmd.ExecuteNonQuery();

When a stored procedure contains input or output parameters, they must be added to the command object's Parameters collection before the procedure is executed. To demonstrate, let's execute the stored procedure shown in Listing 11-3. This procedure allows records to be fetched from the movies table as pages containing 10 rows of data. Input to the procedure is the desired page; the output parameter is the total number of pages available. This code fragment illustrates how parameters are set and how the procedure is invoked to return the first page:

SqlCommand cmd  = new SqlCommand();
cmd.CommandText = "SPMOVIES_LIST";  // Stored procedure name
cmd.CommandType = CommandType.StoredProcedure;
cmd.Parameters.Add(@PageRequest", SqlDbType.Int);
cmd.Parameters.Add(@TotalPages", SqlDbType.Int);
cmd.Parameters[0].Direction= ParameterDirection.Input;
cmd.Parameters[0].Value= 1;        // Retrieve first page
cmd.Parameters[1].Direction=ParameterDirection.Output;
cmd.CommandTimeout=10;  // Give command 10 seconds to execute
SqlDataReader rdr = cmd.ExecuteReader();
while (rdr.Read()){
//   do something with results
}
rdr.Close();  // Must close before reading parameters
int totpages=  cmd.Parameters[1].Value;

This example uses the SqlClient data provider. With a couple of changes, OleDb can be used just as easily. The primary difference is in the way they handle parameters. SqlClient requires that the parameter names match the names in the stored procedure; OleDb passes parameters based on position, so the name is irrelevant. If the procedure sends back a return code, OleDB must designate the first parameter in the list to handle it. SqlClient simply adds a parameter—the name is unimportant—that has its direction set to ReturnValue.

Example 11-3. Stored SQL Server Procedure to Return a Page of Records

CREATE PROCEDURE SPMOVIES_LIST
   @PageRequest int,
   @TotalPages int output
AS
   /*
      Procedure to return a resultset of movies ordered
      by title.
      Resultset contains 10 movies for the specified page.
   */
   SET NOCOUNT ON
   select @TotalPages = CEILING(COUNT(*)/10) from movies
   if @PageRequest = 1 or @PageRequest <1
   begin
      select top 10 * from movies order by movie_Title
      set @PageRequest = 1
      return 0
   end
   begin
      if @PageRequest > @TotalPages
         set @PageRequest = @TotalPages
      declare @RowCount int
      set @RowCount = (@PageRequest * 10)
      exec ('SELECT * FROM
      (SELECT TOP 10 a.* FROM
      (SELECT TOP ' + @RowCount + ' * FROM movies ORDER BY
          movie_Title) a
       ORDER BY movie_Title desc) b
       ORDER BY Movie_Title')

       return 0
   end

Using Parameterized Commands Without Stored Procedures

An earlier example (see “Creating a Command Object” on page 511) used this statement to create a SQL command to store a movie in the Films database:

string sql = "insert into movies(movie_Title,movie_Year,
      bestpicture) values(@title,@yr,@bestpic)";
// Parameters set values to be stored
cmd.Parameters.AddWithValue ("@title", "Schindler's List");
cmd.Parameters.AddWithValue ("@yr", "1993");
cmd.Parameters.AddWithValue ("@bestpic", "Y");

The alternative, which uses concatenation, looks like this:

string title = "Schindler''s List";  // Two single quotes needed
string yr = "1993";
string pic = "Y";
sql = "insert into movies(movie_Title,movie_Year,
      bestpicture) values";
sql += "('"+title+"',"+yr+",'"+pic+"') ";

Not only is the parameterized version more readable and less prone to syntactical error, but it also provides a significant benefit: It automatically handles the problem of placing double single quotes ('') in a SQL command. This problem occurs when attempting to store a value such as O'Quinn, which has an embedded quote that conflicts with SQL syntax. Parameters eliminate the usual approach to search each string and replace an embedded single quote with a pair of single quotes.

DataReader Object

As we have seen in several examples, a DataReader exposes the rows and columns of data returned as the result of executing a query. Row access is defined by the IDataReader interface that each DataReader must implement; column access is defined by the IDataRecord interface. We'll look at the most important members defined by these interfaces as well as some custom features added by data providers.

Accessing Rows with DataReader

A DataReader returns a single row from a resultset each time its Read method is executed. If no rows remain, the method returns false. The reader should be closed after row processing is completed in order to free system resources. You can check the DataReader.IsClosed property to determine if a reader is closed.

Although a DataReader is associated with a single command, the command may contain multiple queries that return multiple resultsets. This code fragment demonstrates how a DataReader processes the rows returned by two queries.

string q1 = "SELECT * FROM movies WHERE movie_Year < 1940";
string q2 = "SELECT * FROM movies WHERE movie_Year > 1980";
cmd.CommandText = q1 + ";" + q2;
DbDataReader rdr = cmd.ExecuteReader();
bool readNext = true;
while (readNext)
{
   while (rdr.Read())
   {
      MessageBox.Show(rdr.GetString(1));
   }
   readNext = rdr.NextResult(); // Another resultset?
}
rdr.Close();
conn.Close();

The two things to note are the construction of the CommandString with multiple queries and the use of the NextResult method to determine if results from another query are present.

Core Note

Core Note

The DataReader has no property or method that provides the number of rows returned in its resultset. Because data is received one row at a time, the resultset could be altered by additions and deletions to the database as records are read. However, there is a HasRows property that returns true if the data reader contains one or more rows.

Accessing Column Values with DataReader

There are numerous ways to access data contained in the columns of the current DataReader row: as an array with column number (zero-based) or name used as an index; using the GetValue method by passing it a column number; and using one of the strongly typed Getxxx methods that include GetString, GetInt32, GetDateTime, and GetDouble. The following code segment contains an example of each technique:

cmd.CommandText="SELECT movie_ID, movie_Title FROM movies";
rdr = cmd.ExecuteReader();
rdr.Read();
string title;
// Multiple ways to access data in a column
title = rdr.GetString(1);
title = (string)rdr.GetSqlString(1); // SqlClient provider
title = (string)rdr.GetValue(1);
title = (string)rdr["movie_Title"];  // Implicit item
title = (string)rdr[1];              // Implicit item

The GetString method has the advantage of mapping the database contents to a native .NET data type. The other approaches return object types that require casting. For this reason, use of the Get methods is recommended. Note that although GetString does not require casting, it does not perform any conversion; thus, if the data is not of the type expected, an exception is thrown.

Many applications rely on a separate data access layer to provide a DataReader. In such cases, the application may require metadata to identify column names, data types, and other columnar information. Column names, which are useful for generating report headings, are readily available through the GetName method:

// List column names for a DataReader
DbDataReader rdr = GetReader();        // Get a DataReader
for (int k = 0; k < rdr.FieldCount; k++)
{
   Console.WriteLine(rdr.GetName(k));  // Column name
}
rdr.Close();

Complete column schema information is available through the GetSchemaTable method. It returns a DataTable in which there is one row for each field (column) in the resultset. The columns in the table represent schema information. This code segment demonstrates how to access all the column information for a resultset. For brevity, only three of the 24 columns of information are shown:

DataTable schemaTable = rdr.GetSchemaTable();
int ict = 0;
foreach (DataRow r in schemaTable.Rows)
{
   foreach (DataColumn c in schemaTable.Columns){
      Console.WriteLine(ict.ToString()+"
                        "+c.ColumnName + ": "+r[c]);
      ict++;
   }
}
// Selected Output:
//       0 ColumnName: movie_ID
//       1 ColumnOrdinal: 0
//       12 DataType: System.Int32

DataSets, DataTables, and the Disconnected Model

The ADO.NET disconnected model is based on using a DataSet object as an in-memory cache. A DataAdapter serves as the intermediary between the DataSet and the data source that loads the cache with data. After it has completed its task, the DataAdapter returns the connection object to the pool, thus disconnecting the data from the data source. Interestingly, the DataAdapter is actually a wrapper around a data provider's DataReader, which performs the actual data loading.

The DataSet Class

In many ways, a DataSet object plays the role of an in-memory database. Its Tables property exposes a collection of DataTables that contain data and a data schema describing the data. The Relations property returns a collection of DataRelation objects that define how tables are interrelated. In addition, DataSet methods are available to Copy, Merge, and Clear the contents of the DataSet.

Keep in mind that the DataSet and DataTable are core parts of ADO.NET and—unlike the Connection, DataReader, and DataAdapter—they are not tied to a specific data provider. An application can create, define, and populate a DataSet with data from any source.

Besides tables and their relations, a DataSet can also contain custom information defined by the application. A look at Figure 11-6 shows the major collection classes in the DataSet hierarchy. Among these is PropertyCollection, which is a set of custom properties stored in a hash table and exposed through the DataSet.ExtendedProperties property. It is often used to hold a time stamp or descriptive information such as column validation requirements for tables in the data set.

DataSet class hierarchy

Figure 11-6. DataSet class hierarchy

The discussion of the DataSet class begins with its most important member—the DataTable collection.

DataTables

One step below the DataSet in the disconnected model hierarchy is the DataTable collection. This collection—accessed through the DataSet.Tables property—stores data in a row-column format that mimics tables in a relational database. The DataTable class has a rich set of properties and methods that make it useful as a stand-alone data source or as part of a table collection in a DataSet. The most important of these are the Columns and Rows properties, which define the layout and content of a table.

DataColumns

The DataTable.Columns property exposes a collection of DataColumn objects that represent each data field in the DataTable. Taken together, the column properties produce the data schema for the table. Table 11-5 summarizes the most important properties.

Table 11-5. Properties of the DataColumn Class

Method

Description

ColumnName

Name of the column.

DataType

Type of data contained in this column.

Example: col1.DataType = System.Type.GetType("System.String")

MaxLength

Maximum length of a text column. -1 if there is no maximum length.

ReadOnly

Indicates whether the values in the column can be modified.

AllowDBNull

Boolean value that indicates whether the column may contain null values.

Unique

Boolean value that indicates whether the column may contain duplicate values.

Expression

An expression defining how the value of a column is calculated.

Example: colTax.Expression = "colSales * .085";

Caption

The caption displayed in the user interface.

DataTable

The name of the DataTable containing this column.

DataTable columns are created automatically when the table is filled with the results of a database query or from reading an XML file. However, for applications that fill a table dynamically—such as from user input or real-time data acquisition—it may be necessary to write the code that defines the table structure. It's a worthwhile exercise in its own right that enhances a developer's understanding of the DataSet hierarchy.

The following segment creates a DataTable object, creates DataColumn objects, assigns property values to the columns, and adds them to the DataTable. To make things interesting, a calculated column is included.

DataTable tb = new DataTable("Order");
DataColumn dCol = new DataColumn("ID",
      Type.GetType("System.Int16"));
dCol.Unique = true;  // ID must be unique for each data row
dCol.AllowDBNull = false;
tb.Columns.Add(dCol);
dCol= new DataColumn("Price", Type.GetType("System.Decimal"));
tb.Columns.Add(dCol);
dCol=new DataColumn("Quan",Type.GetType("System.Int16"));
tb.Columns.Add(dCol);
dCol= new DataColumn("Total",Type.GetType("System.Decimal"));
dCol.Expression= "Price * Quan";
tb.Columns.Add(dCol);
// List column names and data type
foreach (DataColumn dc in tb.Columns)
{
   Console.WriteLine(dc.ColumnName);
   Console.WriteLine(dc.DataType.ToString());
}

Note that the ID column is defined to contain unique values. This constraint qualifies the column to be used as a key field in establishing a parent-child relationship with another table in a DataSet. To qualify, the key must be unique—as in this case—or defined as a primary key for the table. You assign a primary key to a table by setting its PrimaryKey field to the value of an array containing the column(s) to be used as the key. Here is an example that specifies the ID field a primary key:

DataColumn[] col = {tb.Columns["ID"]};
tb.PrimaryKey = col;

We'll see how to use a primary key to create table relationships and merge data later in this section.

Core Note

Core Note

If a primary key consists of more than one column—such as a first and last name—you can enforce a unique constraint on these columns in three steps: by creating an array to hold the columns, creating a UniqueConstraint object by passing the array to its constructor; and adding the constraint to the data table's Constraints collection:

DataColumn[] cols = {tb.Columns["fname"]
                     tb.Columns["lname"]};
tb.Constraints.Add(new UniqueConstraint("nameKey", cols));

DataRows

Data is added to a table by creating a new DataRow object, filling it with column data, and adding the row to the table's DataRow collection. Here is an example that places data in the table created in the preceding example.

DataRow row;
row = tb.NewRow();      // Create DataRow
row["Title"] = "Casablanca";
row["Price"] = 22.95;
row["Quan"] = 2;
row["ID"] = 12001;
tb.Rows.Add(row);       // Add row to Rows collection
Console.WriteLine(tb.Rows[0]["Total"].ToString()); // 45.90

A DataTable has methods that allow it to commit and roll back changes made to the table. In order to do this, it keeps the status of each row in the DataRow.RowState property. This property is set to one of five DataRowState enumeration values: Added, Deleted, Detached, Modifed, or Unchanged. Let's extend the preceding example to demonstrate how these values are set:

tb.Rows.Add(row);                    // Added
tb.AcceptChanges();                  // ...Commit changes
Console.Write(row.RowState);         // Unchanged
tb.Rows[0].Delete();                 // Deleted
// Undo deletion
tb.RejectChanges();                  // ...Roll back
Console.Write(tb.Rows[0].RowState);  // Unchanged
DataRow myRow;
MyRow = tb.NewRow();                 // Detached

The two DataTable methods AcceptChanges and RejectChanges are equivalent to the commit and rollback operations in a database. These apply to all changes made from the time the table was loaded or since AcceptChanges was previously invoked. In this example, we are able to restore a deleted row because the deletion is not committed before RejectChanges is called. Note that the changes are to the data table—not the original data source.

For each column value in a row, ADO.NET maintains a current and original value. When RejectChanges is called, the current values are set to the original values. The opposite occurs if AcceptChanges is called. The two sets of values can be accessed concurrently through the DataRowVersion enumerations Current and Original:

DataRow r = tb.Rows[0];
r["Price"]= 14.95;
r.AcceptChanges();
r["Price"]= 16.95;
Console.WriteLine("Current: {0} Original: {1} ",
      r["Price",DataRowVersion.Current],
      r["Price",DataRowVersion.Original]);
// output:  Current: 16.95  Original: 14.95

Keeping track of table row changes takes on added importance when the purpose is to update an underlying data source. We'll see later in this section how the DataAdapter updates database tables with changes made to DataTable rows.

Loading Data into a DataSet

Now that we have seen how to construct a DataTable and punch data into it row-by-row, let's look at how data and a data schema can be automatically loaded from a relational database into tables in a DataSet. For details on loading XML data, refer to Section 11.5, “XML and ADO.NET,” on page 533.

Using the DataReader to Load Data into a DataSet

A DataReader object can be used in conjunction with a DataSet or DataTable to fill a table with the rows generated by a query. This requires creating a DataReader object and passing it as a parameter to the DataTable.Load method:

cmd.CommandText = "SELECT * FROM movies WHERE movie_Year < 1945";
DBDataReader rdr =
      cmd.ExecuteReader(CommandBehavior.CloseConnection);
DataTable dt = new DataTable("movies");
dt.Load(rdr);   // Load data and schema into table
Console.WriteLine(rdr.IsClosed);   // True

The DataReader is closed automatically after all of the rows have been loaded. The CloseConnection parameter ensures that the connection is also closed.

If the table already contains data, the Load method merges the new data with the existing rows of data. Merging occurs only if rows share a primary key. If no primary key is defined, rows are appended. An overloaded version of Load takes a second parameter that defines how rows are combined. This parameter is a LoadOption enumeration type having one of three values: OverwriteRow, PreserveCurrentValues, or UpdateCurrentValues. These options specify whether the merge operation overwrites the entire row, original values only, or current values only. This code segment illustrates how data is merged with existing rows to overwrite the current column values:

cmd.CommandText = "SELECT * FROM movies WHERE movie_Year < 1945";
DBDataReader rdr = cmd.ExecuteReader( );
DataTable dt = new DataTable("movies");
dt.Load(rdr);                    // Load rows into table
Console.Write(dt.Rows[0]["movie_Title"]);  // Casablanca

// Assign primary key so rows can be merged
DataColumn[] col = new DataColumn[1];
col[0] = dt.Columns["movie_ID"];
dt.PrimaryKey = col;
DataRow r = dt.Rows[0];           // Get first row of data
r["movie_Title"] = "new title";   // Change current column value
// Since reader is closed, must fill reader again
rdr = cmd.ExecuteReader(CommandBehavior.CloseConnection);
// Merge data with current rows. Overwrites current values
dt.Load(rdr, LoadOption.UpdateCurrentValues );
// Updated value has been overwritten
Console.Write(dt.Rows[0]["movie_Title"]);  // Casablanca

Using the DataAdapter to Load Data into a DataSet

A DataAdapter object can be used to fill an existing table, or create and fill a new table, with the results from a query. The first step in this process is to create an instance of the DataAdapter for a specific data provider. As the following code shows, several constructor overloads are available:

// (1) The easiest: a query and connection string as arguments
String sql = "SELECT * FROM movies";
SqlDataAdapter da = new SqlDataAdapter(sql, connStr);

// (2) Assign a command object to the SelectCommand property
SqlDataAdapter da = new SqlDataAdapter();
SqlConnection conn = new SqlConnection(connStr);
da.SelectCommand = new SqlCommand(sql,conn);

// (3) Pass in a query string and connection object
SqlConnection conn = new SqlConnection(connStr);
SqlDataAdapter da = new SqlDataAdapter(sql, conn);

Of these, the first version is the simplest. It accepts two strings containing the query and connection. From these, it constructs a SqlCommand object that is assigned internally to its SelectCommand property. Unlike the other constructors, there is no need to write code that explicitly creates a SqlCommand or SqlConnection object.

In the overloads that accept a connection object as a parameter, the opening and closing of the connection is left to the DataAdapter. If you add a statement to explicitly open the connection, you must also include code to close it. Otherwise, the DataAdapter leaves it open, which locks the data in the database.

After the DataAdapter object is created, its Fill method is executed to load data into a new or existing table. In this example, a new table is created and assigned the default name Table:

DataSet ds = new DataSet();
// Create DataTable, load data, and add to DataSet
// Could use da.Fill(ds,"movies") to specify table name.
int numrecs = da.Fill(ds);  // Returns number of records loaded

For an existing table, the behavior of the Fill command depends on whether the table has a primary key. If it does, those rows having a key that matches the key of the incoming data are replaced. Incoming rows that do not match an existing row are appended to the DataTable.

Using the DataAdapter to Update a Database

After a DataAdapter has loaded data into a table, the underlying connection is closed, and subsequent changes made to the data are reflected only in the DataSet—not the underlying data source. To apply changes to the data source, a DataAdapter is used to restore the connection and send the changed rows to the database. The same DataAdapter used to fill the DataSet can be used to perform this task.

The DataAdapter has three properties—InsertCommand, DeleteCommand, and UpdateCommand—that are assigned the actual SQL commands to perform the tasks that correspond to the property name. These commands are executed when the Upate method of the DataAdapter is invoked. The challenge lies in creating the SQL commands that post the changes and assigning them to the appropriate DataAdapter properties. Fortunately, each data provider implements a CommandBuilder class that can be used to handle this task automatically.

The CommandBuilder Object

A CommandBuilder object generates the commands necessary to update a data source with changes made to a DataSet. It's amazingly self-sufficient. You create an instance of it by passing the related DataAdapter object to its constructor; then, when the DataAdapter.Update method is called, the SQL commands are generated and executed. The following segment shows how changes to a DataTable are flushed to the database associated with the DataAdapter:

DataTable dt= ds.Tables["movies"]; // Shortcut to reference table

// (1) Use command builder to generate update commands
SqlCommandBuilder sb = new SqlCommandBuilder(da);

// (2) Add movie to table
DataRow drow = dt.NewRow();
drow["movie_Title"] = "Taxi Driver";
drow["movie_Year"] = "1976";
dt.Rows.Add(drow);

// (3) Delete row from table
dt.Rows[4].Delete();

// (4) Edit Column value
dt.Rows[5]["movie_Year"] = "1944";

// (5) Update underlying Sql Server table
int updates = da.Update(ds, "movies");
MessageBox.Show("Rows Changed: " +updates.ToString());  // 3

There are a couple of restrictions to be aware of when using the CommandBuilder: The Select command associated with the DataAdapter must refer to a single table, and the source table in the database must include a primary key or a column that contains unique values. This column (or columns) must be included in the original Select command.

Core Note

Core Note

You can create your own update commands without using a CommandBuilder. Although it can be a lengthy process, it can also yield more efficient commands. For applications that require considerable database updates, you may want to consult an ADO.NET book for details on coding update logic directly.

Synchronizing the DataSet and the DataBase

As demonstrated in this example, the use of a DataAdapter simplifies and automates the process of updating a database—or any data store. However, there is a rock in this snowball: the problem of multi-user updates. The disconnected model is based on optimistic concurrency, an approach in which the rows of the underlying data source are not locked between the time they are read and the time updates are applied to the data source. During this interval, another user may update the data source. Fortunately, the Update method recognizes if changes have occurred since the previous read and fails to apply changes to a row that has been altered.

There are two basic strategies for dealing with a concurrency error when multiple updates are being applied: roll back all changes if a violation occurs, or apply the updates that do not cause an error and identify the ones that do so they can be reprocessed.

Using Transactions to Roll Back Multiple Updates

When the DataAdapter.ContinueUpdateonErrors property is set to false, an exception is thrown when a row update cannot be completed. This prevents subsequent updates from being attempted, but does not affect updates that occurred prior to the exception. Because updates may be interdependent, applications often require an all-or-none strategy. The easiest way to implement this strategy is to create a .NET transaction in which all of the update commands execute. To do so, create a SqlTransaction object and associate it with the SqlDataAdapater.SelectCommand by passing it to its constructor. If an exception occurs, the transaction's Rollback method is used to undo any changes; if no exceptions occur, the Commit method is executed to apply all the update commands. Listing 11-4 is an example that wraps the updates inside a transaction.

Example 11-4. Using Transaction to Roll Back Database Updates

SqlDataAdapter da = new SqlDataAdapter();
SqlCommandBuilder sb = new SqlCommandBuilder(da);
SqlTransaction tran;
SqlConnection conn = new SqlConnection(connStr);
conn.Open();      // Must open to use with transaction
// (1) Create a transaction
SqlTransaction tran = conn.BeginTransaction();
// (2) Associate the SelectCommand with the transaction
da.SelectCommand = new SqlCommand(sql, conn, tran);
DataSet ds = new DataSet();
da.Fill(ds, "movies");
//
// Code in this section makes updates to DataSet rows
try
{
   int updates = da.Update(ds, "movies");
   MessageBox.Show("Updates: "+updates.ToString());
}
// (3) If exception occurs, roll back all updates in transaction
catch (Exception ex)
{
   MessageBox.Show(ex.Message);   // Error updating
   if (tran != null)
   {
      tran.Rollback();      // Roll back any updates
      tran = null;
      MessageBox.Show("All updates rolled back.");
   }
}
finally
{
// (4) If no errors, commit all updates
   if (tran != null)
   {
       tran.Commit();
       MessageBox.Show("All updates successful. ");
       tran = null;
   }
}
conn.Close();

Identifying Rows That Cause Update Errors

When DataAdapter.ContinueUpdateonErrors is set to true, processing does not halt if a row cannot be updated. Instead, the DataAdapter updates all rows that do not cause an error. It is then up to the programmer to identify the rows that failed and determine how to reprocess them.

Rows that fail to update are easily identified by their DataRowState property (discussed earlier in the description of DataRows). Rows whose update succeeds have a value of Unchanged; rows that fail have their original Added, Deleted, or Modified value. A simple code segment demonstrates how to loop through the rows and identify those that are not updated (see Listing 11-5).

Example 11-5. Identify Attempts to Update a Database That Fails

// SqlDataAdapter da loads movies table
da.ContinueUpdateOnError = true;
DataSet ds = new DataSet();
try
{
   da.Fill(ds, "movies");
   DataTable dt = ds.Tables["movies"];
   SqlCommandBuilder sb = new SqlCommandBuilder(da);
   // ... Sample Update operations
   dt.Rows[29].Delete();                      // Delete
   dt.Rows[30]["movie_Year"] = "1933";        // Update
   dt.Rows[30]["movie_Title"] = "King Kong";  // Update
   dt.Rows[31]["movie_Title"] = "Fantasia";   // Update
   DataRow drow = dt.NewRow();
   drow["movie_Title"] = "M*A*S*H";
   drow["movie_Year"] = "1970";
   dt.Rows.Add(drow);                         // insert
   // Submit updates
   int updates = da.Update(ds, "movies");
   // Following is true if any update failed
   if (ds.HasChanges())
   {
      // Load rows that failed into a DataSet
      DataSet failures = ds.GetChanges();
      int rowsFailed = failures.Rows.Count;
      Console.WriteLine("Update Failures: "+rowsFailed);
      foreach (DataRow r in failures.Tables[0].Rows )
      {
         string state = r.RowState.ToString());
         // Have to reject changes to show deleted row
         if (r.RowState == DataRowState.Deleted)
               r.RejectChanges();
         string ID= ((int)r["movie_ID"]).ToString();
         string msg= state + " Movie ID: "+ID;
         Console.WriteLine(msg);
      }
   }

Note that even though the delete occurs first, it does not affect the other operations. The SQL statement that deletes or updates a row is based on a row's primary key value—not relative position. Also, be aware that updates on the same row are combined and counted as a single row update by the Update method. In this example, updates to row 30 count as one update.

Handling concurrency issues is not a simple task. After you identify the failures, the next step—how to respond to the failures—is less clear, and depends on the application. Often times, it is necessary to re-read the rows from the database and compare them with the rows that failed in order to determine how to respond. The ability to recognize RowState and the current and original values of rows is the key to developing code that resolves update conflicts.

Defining Relationships Between Tables in a DataSet

A DataRelation is a parent/child relationship between two DataTables. It is defined on matching columns in the two tables. The columns must be the same DataType, and the column in the parent table must have unique values. The syntax for its constructor is

public DataRelation(
   string relationName,
   DataColumn parentColumn,
   DataColumn childColumn)

A DataSet has a Relations property that provides access to the collection of DataRelations defined for tables contained in the DataSet. Use the Relations.Add method to place relations in the collection. Listing 11-6 illustrates these ideas. It contains code to set up a parent/child relationship between the directors and movies tables in order to list movies by each director.

Example 11-6. Create a Relationship Between the Directors and Movies Tables

DataSet ds = new DataSet();
// (1) Fill table with movies
string sql = "SELECT movie_ID,movie_Title,movie_DirectorID,
      movie_Year FROM movies";
SqlConnection conn = new SqlConnection(connStr);
SqlCommand cmd = new SqlCommand();
SqlDataAdapter da = new SqlDataAdapter(sql, conn);
da.Fill(ds, "movies");
// (2) Fill table with directors
sql = "SELECT director_id,(first_name + ' '+ last_name) AS
      fullname FROM directors";
da.SelectCommand.CommandText = sql;
da.Fill(ds, "directors");
// (3) Define relationship between directors and movies
DataTable parent = ds.Tables["directors"];
DataTable child = ds.Tables["movies"];
DataRelation relation = new DataRelation("directormovies",
      parent.Columns["director_ID"],
      child.Columns["movie_DirectorID"]);
// (4) Add relation to DataSet
ds.Relations.Add(relation);
// (5) List each director and his or her movies
foreach (DataRow r in parent.Rows)
{
   Console.WriteLine(r["fullname"];      // Director name
   foreach (DataRow rc in
      r.GetChildRows("directormovies"))
   {
      Console.WriteLine("  "+rc["movie_title"]);
   }
}
/*
   Sample Output:
      David Lean
         Lawrence of Arabia
         Bridge on the River Kwai, The
      Victor Fleming
         Gone with the Wind
         Wizard of Oz, The
*/

Relations and Constraints

When a relationship is defined between two tables, it has the effect of adding a ForeignKeyConstraint to the Constraints collections of the child DataTable. This constraint determines how the child table is affected when rows in a parent table are changed or deleted. In practical terms, this means that if you delete a row in the parent table, you can have the related child row(s) deleted—or optionally, have their key value set to null. Similarly, if a key value is changed in the parent table, the related rows in the child can have their key value changed or set to null.

The rule in effect is determined by the value of the DeleteRule and UpdateRule properties of the constraint. These can take one of four Rule enumeration values:

  • CascadeDeletes or updates related rows in child table. This is the default.

  • NoneTakes no action.

  • SetDefaultSets key values in child rows to column's default value.

  • SetNullSets key values in child rows to null.

This code segment illustrates how constraints affect the capability to add a row to a child table and delete or change a row in the parent table. The tables from the preceding example are used.

// (1) Try to add row with new key to child table
DataRow row = child.NewRow();
row["movie_directorID"] = 999;
child.Rows.Add(row);   // Fails – 999 does not exist in parent
// (2) Delete row in parent table
row = parent.Rows[0];
row.Delete();      // Deletes rows in child having this key
// (3) Relax constraints and retry adding row
ds.EnforceConstraints = false;
row["movie_directorID"] = 999;
child.Rows.Add(row);   // Succeeds
ds.EnforceConstraints = true;    // Turn back on
// (4) Change constraint to set rows to null if parent changed
((ForeignKeyConstraint)child.Constraints[0]).DeleteRule =
       Rule.SetNull ;

Note that setting the EnforceConstraints property to false turns off all constraints—which in database terms eliminates the check for referential integrity. [3] This allows a movie to be added even though its movie_DirectorID column (foreign key) does not have a corresponding row in the directors table. It also permits a director to be deleted even though a movie by that director exists in the movies table. This clearly compromises the integrity of the database and should be used only when testing or populating individual tables in a database.

Choosing Between the Connected and Disconnected Model

The DataReader and DataSet offer different approaches to processing data—each with its advantages and disadvantages. The DataReader provides forward-only, read-only access to data. By processing a row at a time, it minimizes memory requirements. A DataSet, on the other hand, offers read/write access to data, but requires enough memory to hold a copy of the data fetched from a data source. From this, you can derive a couple of general rules: If the application does not require the capability to update the data source and is used merely for display and selection purposes, a DataReader should be the first consideration; if the application requires updating data, a DataSet should be considered.

Of course, the general rules have to be weighed against other factors. If the data source contains a large number of records, a DataSet may require too many resources; or if the data requires only a few updates, the combination of DataReader and Command object to execute updates may make more sense. Despite the gray areas, there are many situations where one is clearly preferable to the other.

A DataSet is a good choice when the following apply:

  • Data need to be serialized and/or sent over the wire using HTTP.

  • Multiple read-only controls on a Windows Form are bound to the data source.

  • A Windows Form control such as a GridView or DataView is bound to an updatable data source.

  • A desktop application must edit, add, and delete rows of data.

A DataReader is a good choice when the following apply:

  • A large number of records must be handled so that the memory requirements and time to load make a DataSet impracticable.

  • The data is read-only and bound to a Windows or Web Form list control.

  • The database is highly volatile, and the contents of a DataSet might be updated often.

XML and ADO.NET

Just as relational data has a schema that defines its tables, columns, and relationships, XML uses a Schema Definition language (XSD) to define the layout of an XML document. Its main use is to validate the content of XML data. See Section 10.1, “Working with XML,” for a discussion of XML Schema.

The XML classes reside in the System.Xml namespace hierarchy and are not part of ADO.NET. However, the ADO.NET DataSet class provides a bridge between the two with a set of methods that interacts with XML data and schemas:

  • ReadXMLLoads XML data into a DatSet.

  • WriteXml and GetXml. Writes the DataSet's contents to an XML formatted stream.

  • WriteXmlSchema and GetXmlSchemaGenerates an XML Schema from the DataSet schema.

  • ReadXmlSchemaReads an XML Schema file and creates a database schema.

  • InferXmlSchemaCreates a DataSet schema from XML data.

  • GetXml and GetXmlSchema, Returns a string containing the XML representation of the data or the XSD schema for XML representation.

We'll first look at examples that show how to write XML from a DataSet. This XML output is then used as input in subsequent examples that create a DataSet from XML.

Using a DataSet to Create XML Data and Schema Files

When working with XML, the DataSet is used as an intermediary to convert between XML and relational data. For this to work, the XML data should be structured so that it can be represented by the relationships and row-column layout that characterizes relational data.

The following code segment illustrates how easy it is to create an XML data file and schema from a DataSet's contents. A DataAdapter is used to populate a DataSet with a subset of the movies table. The WriteXml and WriteXmlSchema methods are then used to translate this to XML output.

DataSet ds = new DataSet("films");
DataTable dt = ds.Tables.Add("movies");
string sql = "SELECT * FROM movies WHERE bestPicture='Y'";
SqlDataAdapter da = new SqlDataAdapter(sql, conn);
da.Fill(dt);
// Write Schema representing DataTable to a file
ds.WriteXmlSchema("c:\oscars.xsd");  // create schema
// Write Table data to an XML file
ds.WriteXml("c:\oscarwinners.xml");  // data in xml format
/* To place schema inline with XML data in same file:
   ds.WriteXml(("c:\oscarwinners.xml",
              XmlWriteMode.WriteSchema);
*/

The schema output shown in Listing 11-7 defines the permissible content of an XML document (file). If you compare this with Figure 11-3 on page 502, you can get a general feel for how it works. For example, each field in the movies table is represented by an element containing the permissible field name and type.

Example 11-7. XML Schema from Movies Table—oscars.xsd

<?xml version="1.0" encoding="utf-16"?>
<xs:schema id="films" xmlns=""
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <xs:element name="films" msdata:IsDataSet="true">
    <xs:complexType>
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element name="movies">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="movie_ID" type="xs:int"
                        minOccurs="0" />
              <xs:element name="movie_Title" type="xs:string"
                        minOccurs="0" />
              <xs:element name="movie_Year" type="xs:int"
                        minOccurs="0" />
              <xs:element name="movie_DirectorID" type="xs:int"
                        minOccurs="0" />
              <xs:element name="AFIRank" type="xs:int"
                        minOccurs="0" />
              <xs:element name="bestPicture" type="xs:string"
                        minOccurs="0" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>

Listing 11-8 displays an abridged listing of the XML version of the relational data. The name of the DataSet is the root element. Each row in the table is represented by a child element (movies) containing elements that correspond to the columns in the data table.

Example 11-8. Movies Data as an XML Document—oscarwinners.xml

<?xml version="1.0" encoding="utf-16"?>
<?xml version="1.0" standalone="yes"?>
<films>
  <movies>
    <movie_ID>5</movie_ID>
    <movie_Title>Citizen Kane </movie_Title>
    <movie_Year>1941</movie_Year>
    <movie_DirectorID>1</movie_Director>
    <AFIRank>1</AFIRank>
    <bestPicture>Y</bestPicture>
  </movies>
  <movies>
    <movie_ID>6</movie_ID>
    <movie_Title>Casablanca </movie_Title>
    <movie_Year>1942</movie_Year>
    <movie_DirectorID>2</movie_Director>
    <AFIRank>2</AFIRank>
    <bestPicture>Y</bestPicture>
  </movies>
  ...
</films>

Creating a DataSet Schema from XML

Each ADO.NET DataSet has a schema that defines the tables, table columns, and table relationships that comprise the DataSet. As we saw in the preceding example, this schema can be translated into an XML schema using the WriteXmlSchema method. ReadXmlSchema mirrors the process—adding tables and relationships to a DataSet. In this example, the XML schema for the movies table (refer to Listing 11-7) is used to create a DataSet schema:

DataSet ds = new DataSet();
ds.ReadXmlSchema("c:\oscars.xsd");
DataTable tb = ds.Tables[0];
// List Columns for table
string colList = tb.TableName +": ";
for (int i = 0; i < tb.Columns.Count; i++)
   { colList += tb.Columns[i].Caption + "  "; }
Console.WriteLine(colList);
/* output is:
movies: movie_ID  movie_Title  movie_Year movie_DirectorID
        bestpicture AFIRank
*/

It is also possible to create a schema by inferring its structure from the XML data or using a DataAdapter to configure the schema:

// (1) Create schema by inferring it from XML data
ds.Tables.Clear();   // Remove tables from DataSet
ds.InferXmlSchema("c:\oscarwinners.xml",null);

// (2) Create schema using Data Adapter
ds.Tables.Clear();
string sql = "SELECT * FROM movies";
SqlDataAdapter da = new SqlDataAdapter(sql, connStr);
// Creates DataTable named "movies"
da.FillSchema(ds, SchemaType.Source, "movies");

Core Note

Core Note

By creating the DataSet schema(s) in a separate step from reading in XML data, you can control the data that is read from the source XML file. Only data for the columns defined by the schema are read in. Conversely, if the schema defines more columns than are in the XML file, these columns are empty in the DataSet.

Reading XML Data into a DataSet

The DataSet.ReadXml method provides a way to read either data only or both the data and schema into a DataSet. The method has several overloads that determine whether a schema is also created. The two overloads used with files are

XmlReadMode ReadXml(string XMLfilename);
XmlReadMode ReadXml(string XMLfilename, XmlReadMode mode);

Parameters:

XMLfilename

Name of file (.xml) containing XML data.

mode

One of the XmlReadMode enumeration values.

The XmlReadMode parameter merits special attention. Its value specifies how a schema is derived for the table(s) in a DataSet. It can specify three sources for the schema: from a schema contained (inline) in the XML file, from the schema already associated with the DataSet, or by inferring a schema from the contents of the XML file. Table 11-6 summarizes how selected enumeration members specify the schema source. The numbers in the table indicate the order in which a schema is selected. For example, ReadSchema specifies that the inline schema is the first choice; if it does not exist, the schema associated with the DataSet is used; if neither exists, a data table is not built.

Table 11-6. XmlReadMode Values Determine How a Schema Is Derived for a DataSet

 

Schema Source

 

XmlReadMode

Inline

DataSet

Infer

Comment

Auto

1

2

3

The default when no XmlReadMode is provided.

IgnoreSchema

 

1

 

Uses only the DataSet's schema. Data in the file that is not in the schema is ignored.

InferSchema

  

1

Ignores inline schema, and builds tables by inference from XML file. Error occurs if DataSet already contains conflicting schema.

ReadSchema

1

2

 

If tables created from inline schema already exist in DataSet, an exception is thrown.

The code segment in Listing 11-9 loads an XML file into a DataSet and then calls a method to display the contents of each row in the table created. Because the DataSet does not have a predefined schema, and the file does not include an inline schema, ReadXml infers it from the contents of the file.

Example 11-9. Using ReadXml to Load XML Data into a DataSet

//  Load XML data into dataset and create schema if one does
//  not exist
DataSet ds = new DataSet();
ds.ReadXml("c:\oscarwinners.xml");
// Save source of data in dataset
ds.ExtendedProperties.Add("source", "c:\oscarwinners.xml");
ShowTable(ds.Tables[0]);

// Display each row in table
private void ShowTable(DataTable t)
{
   foreach(DataRow dr in t.Rows)
   {
      StringBuilder sb = new StringBuilder("Table: ");
      sb.Append(t.TableName).Append("
");
      foreach(DataColumn c in t.Columns)
      {
          sb.Append(c.Caption).Append(": ");
          sb.Append(dr[c.ColumnName].ToString()).Append("
");
      }
      Console.WriteLine(sb.ToString());

}

Note the use of ExtendedProperties to store the name of the data source in the data set. Because this collection of custom properties is implemented as a Hashtable, it is accessed using that syntax:

string src = (string)ds.ExtendedProperties["source"];
ds.ExtendedProperties.Clear(); // clear hash table

Using ReadXml with Nested XML Data

The XML file used in the preceding example has a simple structure that is easily transformed into a single table: The <movies> tag (refer to Listing 11-8) represents a row in a table, and the elements contained within it become column values. Most XML is more complex than this example and requires multiple tables to represent it. Although ReadXml has limitations (it cannot handle attributes), it can recognize nested XML and render multiple tables from it. As an example, let's alter the oscarwinners.xml file to include a <director> tag within each <movies> block.

<films>
  <movies>
    <movie_ID>5</movie_ID>
    <movie_Title>Citizen Kane </movie_Title>
    <movie_Year>1941</movie_Year>
    <director>
      <first_name>Orson</first_name>
      <last_name>Welles</last_name>
    </director>
    <bestPicture>Y</bestPicture>
    <AFIRank>1</AFIRank>
  </movies>
  ... more movies here
</films>

Next, run this code to display the contents of the table(s) created:

DataSet ds = new DataSet();
ds.ReadXml("c:\oscarwinnersv2.xml");
foreach (DataTable dt in ds.Tables)
   ShowTable(dt);

Figure 11-7 depicts the DataSet tables created from reading the XML file. It creates two tables, automatically generates a movies_ID key for each table, and assigns values to this key, which link a row in the movies table to an associated row in the director table.

DataSet tables and relationship created from XML

Figure 11-7. DataSet tables and relationship created from XML

Summary

ADO.NET supports two database connectivity models: connected and disconnected. The connected model remains connected to the underlying database while traversing a resultset in a forward-only read-only manner; the disconnected model can retrieve a resultset into an in-memory cache and disconnect itself from the source data. Two distinctly separate data storage objects are available for implementing these models: the DataReader and the DataSet. The DataReader serves up data a row at a time; the DataSet functions as an in-memory relational database. Changes to the contents of the DataSet can be posted back to the original data source using the DataAdapter object. This object includes properties and methods designed to address synchronization issues that arise when disconnected data is used to update a database.

Although XML classes are not part of the ADO.NET namespaces, a level of interaction between relational and XML data is provided through the ADO.NET DataSet class. This class includes WriteXmlSchema and WriteXml methods that are used to create an XML schema and document. The versatile DataSet.ReadXml method has several overloads that are used to construct a DataSet from an XML data file or schema.

Test Your Understanding

1:

What four classes must a .NET data provider supply?

2:

Given the following code:

SqlCommand cmd = new SqlCommand();
cmd.Connection = conn;
string s = "insert into movies(movie_title, movie_year,
            bestpicture)values(&p1,&p2,&p3); "
cmd.CommandText= s;

which of the following code segments completes the SQL query string to insert the movie row?

  1. string p1="Star Wars";
    int p2 = 1977;
    string p3= "N";
    
  2. cmd.Parameters.AddWithValue ("@p1","Star Wars");
    cmd.Parameters.AddWithValue ("@p2", "1977");
    cmd.Parameters.AddWithValue ("@p3", "N");
    
  3. cmd.Parameters.Add("@p1", SqlDbType.NVarChar, 100);
    cmd.Parameters.Add("@p2", SqlDbType.NVarChar, 4);
    cmd.Parameters.Add("@p3", SqlDbType.NVarChar, 1);
    p1.Value = "Star Wars";
    p2.Value = "1977";
    p3.Value = "N";
    

3:

Describe the purpose of these three command object methods:

ExecuteNonQuery
ExecuteReader
ExecuteScalar

4:

Compare the role of the DataReader and the DataAdapter.

5:

What is the difference between a DataSet and a DataTable?

6:

The DataRowState property maintains the status of a row. Which of these is not a valid DataRowState value?

  1. Added
    
  2. Deleted
    
  3. Rejected
    
  4. Changed
    

7:

You have an XML file and want to create a DataSet that has rows and columns that match the layout of the XML file. The XML file does not have a schema (.xsd) file associated with it. What DataSet method is used to create the DataSet schema?



[1] Test results for .NET 1.1 have shown the SQL Client provider to be up to 10 times faster than OLE DB.

[2] SqlClient, Oracle, OleDB, and ODBC implementations are available with ADO.NET 2.0.

[3] The foreign key in any referencing table must always refer to a valid row in the referenced table.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset