In this section, you will learn about the techniques and concepts you need to work with table values. Whether you want to create a new table or modify an existing one, the techniques and concepts you need to understand are similar.
Native data types are those built into SQL Server and supported directly. All data types have a length value, which is either fixed or variable. The length for a numeric or binary data type is the number of bytes used to store the number. The length for a character data type is the number of characters. Most numeric data types also have precision and scale. Precision is the total number of digits in a number. Scale is the number of digits to the right of the decimal point in a number. For example, the number 8714.235 has a precision of seven and a scale of three.
Table 9-1 summarizes native data types that work with numbers and money. The first column shows the general data type or data type synonym for SQL-92 compatibility. The second column shows the SQL Server data type. The third column shows the amount of storage space used.
Table 9-1. Native Data Types for Numbers and Money
SQL-92 Name—Type | SQL Server Name | Range—Description | Storage Size |
---|---|---|---|
Integers | |||
Bit | bit | 0,1, or NULL | 1 byte (for each 1- to 8-bit column) |
Big integer | bigint | -2^63 through 2^63 - 1 | 8 bytes |
Integer | int | -2^31 (-2,147,483,648) through 2^31 - 1 (2,147,483,647) | 4 bytes |
small integer | smallint | 2^15 (-32,768) through 2^15 - 1 (32,767) | 2 bytes |
tiny integer | tinyint | 0 through 255 | 1 byte |
Money | |||
Money | money | -922,337,203,685,477.5808 through +922,337,203,685,477.5807 | 8 bytes |
small money | smallmoney | -214,748.3648 through +214,748.3647 | 4 bytes |
Decimal | |||
dec, decimal | decimal | -10^38 + 1 through 10^38 - 1 | 5 to 17 bytes |
Numeric | decimal | -10^38 + 1 through 10^38 - 1 | 5 to 17 bytes |
Approximate Numeric | |||
Double precision | float | -1.79E + 308 through 1.79E + 308 | 4 to 8 bytes |
Float | float | -1.79E + 308 through 1.79E + 308. float[(n)] for n = 1 - 53 | 4 to 8 bytes |
Float | real | -1.18E - 38, 0 and -1.18E - 38 through 3.40E + 38. float[(n)] for n = 1 - 24 | 4 bytes |
Other Numerics | |||
Cursor | cursor | A reference to a cursor | Varies |
Rowversion | rowversion, timestamp | A database-wide unique number that indicates the sequence in which modifications took place in the database (Rowversion is a synonym for timestamp.) | 8 bytes |
SQL Variant | sql_variant | A special data type that allows a single column to store multiple data types [except text, ntext, sql_variant, image, timestamp, xml, varchar(max), varbinary(max), nvarchar(max) and .NET CLR user-defined types] | Varies |
Table | table | A special data type that is used to store a result set temporarily for processing; can be used only to define local variables and as the return type for user-defined functions | Varies |
Uniqueidentifier | uniqueidentifier | A globally unique identifier (GUID) | 16 bytes |
Xml | xml | A special data type that allows you to store XML data; XML markup is defined using standard text characters. | Varies |
Table 9-2. summarizes native data types for dates, characters, and binary data. Again, the first column shows the general data type or data type synonym for SQL-92 compatibility. The second column shows the SQL Server data type. The third column shows the amount of storage space used.
Table 9-2. Native Data Types for Dates, Characters, and Binary Values
SQL-92 Name—Type | SQL Server Name | Range—Description | Storage Size |
---|---|---|---|
Date | |||
Datetime | datetime | January 1, 1753, to December 31, 9999; accuracy of three-hundredths of a second | Two 4-byte integers |
small datetime | smalldatetime | January 1, 1900, through June 6, 2079; accuracy of one minute | Two 2-byte integers |
Character | |||
Character | char | Fixed-length, non-Unicode character data with a maximum length of 8,000 characters | 1 byte per character |
character varying | varchar | Variable-length, non-Unicode data with a maximum length of 8,000 characters | 1 byte per character |
varchar(max) | Variable-length data to 2^31 - 1 (2,147,483,647) characters | 1byte per character + 2-byte pointer | |
Text | text | Variable-length, non-Unicode data with a maximum length of 2^31 - 1 (2,147,483,647) characters | 1 byte per character |
national character | nchar | Fixed-length, Unicode data with a maximum length of 4,000 characters | 2 bytes per character |
national char varying | nvarchar | Variable-length, Unicode data with a maximum length of 4,000 characters | 2 bytes per character |
nvarchar(max) | Variable-length, Unicode data with a maximum length of 2^31 - 1 (1,073,741,823) characters | 2 bytes per character plus 2-byte pointer | |
national text | ntext | Variable-length, Unicode data with a maximum length of 2^30 - 1 (1,073,741,823) characters | 2 bytes per character |
Binary | |||
binary | binary | Fixed-length, binary data with a maximum length of 8,000 bytes | Size of data in bytes |
binary varying | varbinary | Variable-length, binary data with a maximum length of 8,000 bytes | Size of data in bytes |
varbinary(max) | Variable-length binary data to 2^31 - 1 (2,147,483,647) bytes | Size of data in bytes + 2-byte pointer | |
Image | image | Variable-length, binary data with a maximum length of 2^31 – 1 (2,147,483,647) bytes | Size of data in bytes |
When you create or modify a table in SQL Server Management Studio, you assign a native data type by clicking in the Data Type column and using the selection list to select a data type. In Transact-SQL, you set the data type when you create the table and populate its columns or when you alter a table and add or change columns. Example 9-6 shows how you could use Transact-SQL commands to create a table and its columns.
Example 9-6. Creating a Table and Its Columns
USE OrderSystemDB CREATE TABLE Sales.Customers (CustomerID nchar(5) NOT NULL, CompanyName nvarchar(40) NOT NULL, ContactName nvarchar(30) NOT NULL, ContactTitle nvarchar(30) NOT NULL, Address nvarchar(60) NOT NULL, City nvarchar(15) NULL, Region nvarchar(15) NULL, PostalCode nvarchar(5) NULL, Country nvarchar(15) NULL, Phone nvarchar(24) NULL, Fax nvarchar(24) NULL)
You can create binary and character data types as fixed-length, variable-length, or max-length fields. When you use fixed-length data types, the column size you specify is reserved in the database and can be written to without manipulating the data in the column. This makes updates to the database quicker than with variable-length fields. When you use variable-length data types, SQL Server will squeeze more rows into data pages, if possible. When there are more rows per data page, the process of reading data is usually more efficient, which can translate into improved performance for read operations. When you use max-length data types, SQL Server stores a 2-byte pointer to the actual data in the table’s regular data space and stores the actual data in the large object data space. Generally speaking, you should:
Use fixed-length data types when the size of the data is consistent.
Use variable-length data types when the size of the data varies.
Use max-length data types when the size of the data exceeds the fixed-length or variable-length limit.
To gain a better understanding of the performance implications, consider the following scenario. With fixed-length columns of 80, 120, 40, and 500 bytes each, rows would always be written using 750 bytes of storage (740 bytes for data plus 10 bytes of overhead for each row). In this example, 10 rows will fit on each data page (8096/750, without the remainder).
If you use variable-length columns, however, the number of bytes used per row and the number of rows stored per page would vary. For example, assume that the average variable-length row uses 400 bytes. This includes 380 bytes of data and 20 bytes of overhead (12 bytes of overhead for rows that use variable-length data plus 2 bytes of overhead per variable-length column, thus 4 * 2 + 12 = 20). In this case, 20 rows will fit on each data page (8096/400, without the remainder), which would make data reads more efficient than the fixed-length example.
User-defined data types are special data types that are based on a native data type. You will want to use user-defined data types when two or more tables store the same type of data in a column and these columns must have exactly the same data type, length, and nullability. You can create user-defined data types yourself, or you can let SQL Server do the job. For example, sysname is a user-defined data type that is used to reference database object names. The data type is defined as a variable Unicode character type of 128 characters, which is why object names are limited to 128 characters throughout SQL Server. You can apply this same concept to ensure that specific data is used exactly as you want it to be used.
You create user-defined data types at the database level rather than at the table level, which is why user-defined data types are static and immutable (unchangeable). This ensures that there is no performance penalty associated with user-defined data types. User-defined data types do have some limitations, however. You cannot declare a default value or check constraint as part of the user-defined data type. You also cannot create a user-defined data type based on a user-defined data type.
When you create user-defined data types in a user-defined database, they apply only to that database. If you want user-defined data types to apply to multiple databases, define the data type in the model database. Then the user-defined data type will exist in all new user-defined databases.
In SQL Server Management Studio, you create a user-defined data type by completing the following steps:
In SQL Server Management Studio, connect to the server instance containing the database in which you want to work.
In Object Explorer, expand the Databases node, and then select a database and expand the view to show its resource nodes.
Expand the Programmability node, right-click Types, point to New, and then select User-Defined Data Type. This opens the New User-Defined Data Type Properties dialog box shown in Figure 9-4.
The dbo schema is the default schema. To place the data type in a different schema, click the button to the right of the Schema box, and then click Browse. Select the schema you want to use, and then click OK twice.
Type a name for the new data type.
In the Data Type list, select the data type on which you want to base the user-defined data type.
If the data type has a variable length, set the number of bytes or characters for the data type. For fixed-length variables, such as int, you will not be able to set a length.
To allow the data type to accept null values, select Allow Nulls.
Optionally, use the Default and Rule lists to select a default or rule to bind to the user-defined data type.
Click OK. If you open a new table or edit an existing table, you will see the new data type as one of the last entries in the Data Type selection list.
You can also create user-defined data types with the CREATE TYPE statement. Example 9-7 shows this procedure’s syntax and usage.
After you create user-defined data types, you often will need to manage their properties. To manage user-defined data types, complete the following steps:
In SQL Server Management Studio, connect to the server instance that contains the database in which you want to work.
In Object Explorer, expand the Databases node, and then select a database and expand the view to show its resource nodes.
Expand Programmability, Types, and User-Defined Data Types to list the current user-defined data types.
Right-click the user-defined data type you want to manage and then select:
Properties to view the data type’s properties and set dependencies.
Delete to delete the data type.
Rename to rename the data type.
To see where the data type is used in the database, right-click the user-defined data type, and then select View Dependencies from the shortcut menu.
When you create columns in a table, you can specify whether or not nulls are allowed. A null means there is no entry in the column for that row; it is not the same as zero or an empty string. Columns defined with a primary key constraint or identity property cannot allow null values.
If you add a row but do not set a value for a column that allows null values, SQL Server inserts the value NULL-unless a default value is set for the column. When a default value is set for a column and you insert a null value, SQL Server replaces NULL with the default value. Additionally, if the column allows nulls, you can explicitly set a column to null using the NULL keyword. Do not use quotation marks when setting null explicitly.
In SQL Server Management Studio’s Table view, you can:
Allow nulls in a column by selecting the Allow Nulls column property.
Disallow nulls in a column by clearing the Allow Nulls column property.
For an example of how to allow and disallow nulls with Transact-SQL, refer to Example 9-4 earlier in this chapter.
Null values are useful when you do not know a value or a value is missing. The use of null values is controversial, however, and a better alternative is to set a default value. The default value is used when no value is set for a column you are inserting into a table. For example, you may want a character-based column to have the value N/A rather than NULL, so you would set the default value as N/A.
Table 9-3 summarizes combinations of default values and nullability that are handled in different ways. The main thing to remember is that if you set a default value, the default is used whenever a value is not specified for the column entry. This is true even if you allow nulls.
When you design tables, you will often need to think about unique identifiers that can be used as primary keys or can ensure that merged data does not conflict with existing data. Unique identifiers for primary keys could include customer account numbers or Social Security numbers. However, if a unique identifier is not available, you may want to use the identity property to generate sequential values that are unique for each row in a table. You could also use this unique identifier to generate a customer account number, an order number, or whatever other unique value you need automatically.
Although the identity property provides a local solution for a specific table, it does not guarantee that the value used as an identifier will be unique throughout the database. Other tables in the database may have identity columns with the same values. In most cases, this is not a problem because the identity values usually are used only within the context of a single table and do not relate to other tables.
However, in some situations, you may want to use a value that is unique throughout one or more databases, and then globally unique identifiers provide the solution you need.
Globally unique identifiers are guaranteed to be unique across all networked computers in the world, which is extremely useful in merge replication. When you are merging data from multiple databases, globally unique identifiers ensure that records are not inadvertently associated with each other. For example, a company’s New York, Chicago, and San Francisco offices may have customer account numbers that are unique within those local offices but are not unique at the national level. Globally unique identifiers would ensure that account XYZ from New York and account XYZ from Chicago are not merged as the same account.
Identities and globally unique identifiers are not mutually exclusive. Each table can have one identifier column and one globally unique identity property. These values are often used together. For example, all clustered indexes in SQL Server should be unique, but they do not have to be unique.
In SQL Server Management Studio’s Table view, you set identity values for a table by completing the following steps:
Create or modify other columns in the table as appropriate, and then start a new column for the identity value.
Give the identity column a name, and then select a data type. Identifier columns must use one of the following data types: tinyint, smallint, int, bigint, decimal, or numeric. Globally unique identifier columns must have a data type of uniqueidentifier.
Clear the Allow Nulls check box for the identity column.
To assign a globally unique identifier, select the identity column in the Table view. In the Column Properties view, set Is RowGuid to Yes. A default value of newid() is created automatically for you.
The newid() function is used to generate new unique identifier values by combining the identification number of a network card with a unique number from the CPU clock. If a server process generates the identifier, the server’s network card identification number is used. If the identifier is returned by application API function calls, the client’s network card is used. Network card manufacturers guarantee that no other network card in the next 100 years will have the same number.
To assign a unique identifier, select the identity column in the Table view, expand Identity Specification in the Column Properties view, and then set (Is Identity) to Yes.
Type a value in the Identity Increment cell. This value is the increment that is added to the Identity Seed for each subsequent row. If you leave this cell blank, the value 1 is assigned by default.
Type a value in the Identity Seed cell. This value is assigned to the first row in the table. If you leave this cell blank, the value 1 is assigned by default.
If you are replicating a database as discussed in Chapter 12, and do not want the column to be replicated, set Is Not For Replication to Yes. Otherwise, you will typically want to set Is Not For Replication to No to allow the column to be replicated.
The identity seed and increment values are used to determine the identifier for rows. If you enter a seed value of 100 and an increment of 10, the first row has a value of 100, the second has a value of 110, and so on.
When you create a table in Transact-SQL, globally unique identifiers are not generated automatically. You must reference the newid() function as the default value for the identifier column, as shown in this example:
USE OrderSystemDB CREATE TABLE Sales.Customers (cust_lname varchar(40) NOT NULL, cust_fname varchar(20) NOT NULL, phone char(12) NOT NULL, uid uniqueidentifier NOT NULL DEFAULT newid())
Then when you insert a new row into the table, you call newid() to generate the globally unique identifier:
INSERT INTO Sales.Customers Values ('Stanek', 'William', '123-555-1212', newid())