Planning for Your SQL Server 2005 Deployment

As a SQL Server 2005 administrator or developer, you’ll fill several different roles, including database designer and database architect. The organization where you work may have dedicated database designers and database architects, but so much has changed in SQL Server that it is critical that you understand the new configuration and setup options before deploying SQL servers.

Building the Server System for Performance

As with SQL Server 2000, you have many basic options for deploying SQL Server 2005. You need to choose an edition of SQL Server and the version of Windows on which SQL Server will run. After you make this decision, you should spend some time thinking about the system configuration. In Chapter 1, you learned some key guidelines, but do not overlook the importance of the I/O subsystem.

The I/O subsystem is one of the most fundamental components of the server system and you should give considerable thought to its configuration. Start by choosing drives or storage systems that provide the appropriate level of performance. There really is a substantial difference in speed and performance between various drive specifications. When given a choice for a SQL server’s internal drives, look closely at both SATA II or higher and Ultra SCSI (preferable Ultra320 SCSI or higher).

Consider not only the capacity of the drive but also its rotational speed and average seek time. The rotational speed is a measurement of how fast the disk spins. The average seek time is a measurement of how long it takes to seek between disk tracks during sequential I/O operations. Generally speaking, when comparing drives that conform to the same specification, such as SATA II or Ultra320 SCSI, the higher the rotational speed (measured in thousands of rotations per minute) and the lower the average seek time (measured in milliseconds) the better. As an example, a drive with a rotational speed of 15,000 RPM will give you 45 percent to 50 percent more I/O per second than the average 10,000-RPM drive, all other things being equal. A drive with a seek time of 3.5 msec will give you a 25 percent to 30 percent response time improvement over a drive with a seek time of 4.7 msec.

Other factors to consider include the maximum sustained data transfer rate and the mean time to failure (MTTF). Most drives of comparable quality will have similar transfer rates and MTTF. For example, if you compare Ultra320 SCSI drives with a 15,000-RPM rotational speed, you will probably find similar transfer rates and MTTF. As an example, the Maxtor Atlas 15K II has a maximum sustained data transfer rate of up to 98 MBps. The Seagate Cheetah 15K.4 has a maximum sustained data transfer rate of up to 96 MBps. Both have an MTTF of 1.4 million hours.

Transfer rates can also be expressed in gigabits per second. 1.5 gigabits per second is equivalent to a data rate of 187 MBps. 3.0 gigabits per second is equivalent to 374 MBps. Sometimes you’ll see a maximum external transfer rate (per the specification to which the drive complies) and an average sustained transfer rate. The average sustained transfer rate is the most important factor. The Seagate Barracuda 7200 SATA II drive has a rotational speed of 7,200 RPM and an average sustained transfer rate of 58 MBps. With an average seek time of 8.5 msec and an MTTF of 1 million hours, the drive performs comparably to other 7,200-RPM SATA II drives. However, most Ultra320 SCSI drives perform better.

Real World

Real World

Temperature is another important factor to consider when you are selecting a drive—but it is a factor few administrators take into account. Typically, the faster a drive rotates, the hotter it will run. This is not always the case, but it is certainly something you should consider when making your choice. For example, 15K drives tend to run hot, and you must be sure to carefully regulate temperature. Both the Maxtor Atlas 15K II and the Seagate Cheetah 15K.4 can become nonoperational at temperatures of 70 °C or higher (as would most other drives).

Configuring the I/O Subsystem

When configuring your server system, you will typically have to make a choice between hardware RAID and software RAID for the server’s internal disk drives. You must make this choice, in most cases, even if your server will use external storage. Cost and performance are the two key issues to consider for internal RAID.

Hardware RAID is more expensive than software RAID because it requires RAID controller cards. The expense of hardware RAID, however, is offset by the performance boost it offers. With software RAID, the server’s operating system manages the RAID implementation, which requires system resources: CPU processing power, memory, and so on. With hardware RAID, the server’s RAID controllers manage the RAID implementation.

Hardware RAID may also give you additional fault tolerance options. For example, Windows Server 2003 supports software RAID levels 0 (disk striping), 1 (disk mirroring), and 5 (disk striping with parity). With hardware RAID, you may have additional options, such as RAID 0 + 1 (which is also referred to as RAID 10, and combines disk striping and mirroring).

The operating system drive of a SQL Server system is often configured with RAID 1, as are drives used for SQL Server’s transaction logs. RAID 1 provides a full duplicate (or mirror) of a drive that can be used in case of failure of a primary drive. Because all data writes must go to two drives, disk mirroring doesn’t have the best write performance. Read performance is improved over that of a single disk because seeks typically can be split over both disks in the set. This means you could essentially get twice as many reads as with a single disk.

Note

Note

RAID can be configured in many ways. Sometimes it is more efficient to use both hardware and software RAID. For example, you could use hardware RAID controllers to do parity calculations and software RAID to stripe across the disks. Sometimes, you’ll want to use two drive controllers with mirroring (a technique referred to as disk duplexing). Disk duplexing has the same write performance as a single disk.

With RAID 1, failure recovery is easier and quicker than with other RAID options because you have a full duplicate disk. This is also why RAID 1 is recommended for the operating system drive. RAID 1 is recommended for drives containing transaction logs because transaction logs are sequentially written and only read in the case of a rollback operation. Thus, when you put a transaction log on its own mirrored drive, you can achieve good performance and have fault tolerance.

Drives containing SQL Server’s data files are often configured with RAID 5 or RAID 0 + 1. RAID 5 provides fault tolerance by striping data across multiple disks and storing parity information as data is written. Sections of data and parity information are written to each disk in the set in turn. In the case of disk failure, the parity information can be used to re-create the data on any lost disk. It is important to point out that this parity information can only be used to recover from the loss of a single drive in the array. If multiple drives fail simultaneously, the entire array will fail.

RAID 5 has advantages and disadvantages. With RAID 1, you can mirror a 150-GB drive onto another 150-GB drive. When you do this, there is a 50 percent overhead requirement, meaning that you use double the number of disks and gain no additional storage space. With a three-disk RAID 5 array, the amount of overhead required is about one-third (33 percent) of the total disk space. As you add volumes to a RAID 5 array, the overhead requirement decreases. Because reads are performed across multiple drives, RAID 5 offers better read performance than RAID 1. Essentially, you can perform as many reads as with a single disk times the number of disks in the array, meaning an array with five disks would have a read capacity of five times that of a single disk.

RAID 5 has poorer write performance than RAID 1 because whenever data is written to a RAID 5 array, four I/O operations are required: two reads and two writes. The target disk stripe and the parity stripe must be read first. The parity is then calculated. Then the target stripe and the parity stripe are written to disk.

RAID 0 + 1 is a combination of disk striping and mirroring. With RAID 0 + 1, you mirror a disk stripe, ensuring that there is a duplicate for each striped disk while gaining the performance of pure disk striping. As with RAID 1, each RAID 0 + 1 write operation requires two I/O operations: a write to each disk in the mirror (as with RAID 1). Read operations typically are spread across multiple disks, offering high performance (as with RAID 0 or RAID 5).

RAID 0 + 1 offers very high fault tolerance. Unlike RAID 1 and 5 the array can continue to operate in many cases even if more than one disk fails. In fact, all the disks on one side of the mirror could fail and the array would continue to operate. Failure of both sides of the mirror would result in a complete failure of the set, however.

Note

Note

A disadvantage of RAID 0 + 1 is the number of disks required. You need twice as many disks as you would need with a striped set. To mirror a 450-GB stripe set, you need another 450-GB stripe set, but the total capacity of the mirror does not change. It remains 450 GB.

When choosing between RAID 5 and RAID 0 + 1, and without considering the comparative cost, the key factor should be the way the disks will be used. RAID 5 works well when there is a high percentage of reads and few writes. RAID 0 + 1 offers better performance compared to RAID 5 as the amount of write operations increase. Specifically, with 90 percent reads and 10 percent writes, RAID 5 is the better choice. As the ratio of writes to reads increases, you will see improved performance if you select RAID 0 + 1.

Tip

Tip

When using RAID 1, 5, and 0 + 1, be sure that the disks have a battery-backed write cache. A battery-backed write cache can help protect data because it still can be written to disk even in the event of power interruption or failure. This is important when the same data must be written to multiple disks, as with RAID 1 and 0 + 1, and when parity information must be written accurately to ensure fault tolerance.

Ensuring Availability and Scalability

Not long ago, your options for ensuring availability and scalability were limited. This is no longer the case. You have many options—and most of these options do not require expensive storage subsystems or storage area networks (SANs).

To ensure availability, you can use log shipping to establish a standby server that you have to manually bring online if the primary server fails. You can use Microsoft Cluster service to create a failover server—one that could automatically come online if the primary server fails. For scalability, you can use distributed partition views to horizontally distribute tables across multiple servers. To improve read-ahead performance, you can use indexed views.

The key drawback to server clustering is that it is expensive, both in terms of required equipment and in resources required for setup. SQL Server 2005 introduces an extended form of log shipping called database mirroring, which works on standard server hardware and requires no special storage or controllers. Database mirroring allows you to continuously stream the transaction log from a source server to a destination server. If the source server fails, applications can reconnect to the database on the secondary server within a matter of seconds. Unlike server clustering, transaction logs can be fully synchronized between the servers. This allows changes to be synchronized in both directions.

Database mirroring requires three servers running SQL Server 2005:

  • A source server, also referred to as the principal. The principal server is the one to which applications connect and where transactions are processed.

  • A destination server, also referred to as the mirror. The mirroring server is the target of the shipped transaction logs and it operates in a standby state that does not allow read operations.

  • A tracking server, also referred to as the witness. The witness server tracks which server currently is acting as the principal and which is acting as the mirror. It is used when automatic failover is needed. Whenever there is contention between which server has which role, the witness makes a decision.

As transaction log records are generated on the principal, they are replayed either synchronously (at the same time) or asynchronously (at different times, such as after a short delay) on the mirror. This ensures that the mirror server is exactly in sync or very close to being in sync with the principal server. For example, there may be no write lag between the two servers, or there may be one or more transaction write lags between the two.

From the client’s point of view, failover from the principal to the mirror is automatic and nearly instantaneous. If the principal goes offline, the application fails over to the mirror. The mirror then becomes the principal. When the failed server comes back online, it becomes the mirror and receives transaction log records.

Note

Note

SQL Server replication can also be used to create copies of a database. You can use replication to distribute data across multiple databases. SQL Server supports several types of replication including snapshot replication, transactional replication, and merge replication. For more information on replication, see Chapter 12.

Ensuring Connectivity and Data Access

SQL Server 2005 introduces two features that can help ensure consistent connectivity and data access:

  • Dedicated administrator connection. Designed to ensure that administrators can get consistent access to SQL Server.

  • Multiple active result sets. Designed to ensure that users accessing the database have consistent access to SQL Server.

Unlike previous editions of SQL Server, in which administrators could be locked out if SQL Server became unresponsive, SQL Server 2005 uses dedicated administrator connections to provide a way for administrators to access a server that is not responding or is otherwise unavailable. With this feature, administrators are able to establish a connection that can be used to troubleshoot and resolve problems.

Any administrator who is a member of the sysadmin fixed server role can establish a dedicated server connection using the SQLCMD command prompt utility with the -A parameter. Consider the following example:

sqlcmd –U wrstanek –P moreFunPlease -S corpdbsvr05 -A

Here, the user wrstanek, who is a member of the sysadmin fixed server role, is connecting to the default instance on CorpDBSvr05. You could also connect to a named instance, such as:

sqlcmd –U wrstanek –P moreFunPlease -S corpdbsvr05webapp05 -A

where webapp05 is the name of the SQL Server instance.

Multiple active results sets (MARS) have improved SQL Server connectivity markedly for users as well. With SQL Server 2000, you could have at most one pending request in a given situation. Although server-side cursors and other techniques can be used to work around this limitation, you still do not have a direct way to handle multiple result sets in a single session. MARS corrects this problem by providing the programming interfaces necessary to represent a connection and a request executed under that connection separately. As an example, with Open Database Connectivity (ODBC) you represent connections and executed requests within connections using handles:

  • The SQL_HANDLE_DBC type represents connection handles.

  • The SQL_HANDLE_STMT type represents executed statements within connections.

The SQLODBC and SQLOLEDB drivers included in the SQL Native Client Installation for SQL Server 2005 are MARS-enabled, as is the SqlClient .NET Data Provider including in the Microsoft .NET Framework, version 2.0 or later. By default, these drivers establish connections and handle requests using MARS. Technically, execution requests can be a single T-SQL statement, a batch of T-SQL statements, or the name of a stored procedure or function to run along with any appropriate parameter values. Regardless, SQL Server sequentially executes the statements as it iterates through them, and the statements may or may not produce results. Thus, you can have more than one pending request under a given connection and more than one default result set.

Tip

Tip

Native drivers for SQL Server 2000 or earlier do not support MARS. MARS works by interleaving execution of multiple requests and not by parallel execution. MARS allows a statement, batch, or procedure to run and within the execution allows other requests to run. Interleaving works with SELECT, FETCH, READTEXT, RECEIVE, and BULK INSERT. It also works with asynchronous cursor population.

In contrast to SQL Server 2000, in which implicit spawning of connections under OLEDB and additional requests under ODBC are not allowed, SQL Server 2005 allows both to occur. This means that if a session has an active transaction, all new requests run under the transaction. When there is no active transaction, batches run in autocommit mode, in which each statement is executed under its own transaction.

The SqlClient .NET Provider has separate SqlConnection, SqlCommand, and SqlTransaction objects. SqlConnection objects represent connections established to a server. SqlCommand objects represent commands (requests) executed under the connection. SqlTransaction objects represent active transactions. When you begin a transaction within the context of a specific connection, a SqlTransaction object is returned to represent this transaction.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset