Most applications need to manipulate some form of data. The Microsoft .NET Framework provides many techniques that simplify or improve the efficiency of common data manipulation tasks. The recipes in this chapter describe how to do the following:
Manipulate the contents of strings efficiently to avoid the overhead of automatic string creation due to the immutability of strings (recipe 2-1)
Represent basic data types using different encoding schemes or as byte arrays to allow you to share data with external systems (recipes 2-2, 2-3, and 2-4)
Validate user input and manipulate string values using regular expressions (recipes 2-5 and 2-6)
Create System.DateTime objects from string values, such as those that a user might enter, and display DateTime objects as formatted strings (recipe 2-7)
Mathematically manipulate DateTime objects in order to compare dates or add/subtract periods of time from a date (recipe 2-8)
Sort the contents of an array or an ArrayList collection (recipe 2-9)
Copy the contents of a collection to an array (recipe 2-10)
Use the standard generic collection classes to instantiate a strongly typed collection (recipe 2-11)
Use generics to define your own general-purpose container or collection class that will be strongly typed when it is used (recipe 2-12)
Serialize object state and persist it to a file (recipes 2-13 and 2-14)
Read user input from the Windows console (recipe 2-15)
Use large integer values (recipe 2-16)
Select elements from an array or collection (recipe 2-17)
Remove duplicate entries from an array or collection (recipe 2-18)
You need to manipulate the contents of a String object and want to avoid the overhead of automatic String creation caused by the immutability of String objects.
Use the System.Text.StringBuilder class to perform the manipulations and convert the result to a String object using the StringBuilder.ToString method.
String objects in .NET are immutable, meaning that once created their content cannot be changed. For example, if you build a string by concatenating a number of characters or smaller strings, the Common Language Runtime (CLR) will create a completely new String object whenever you add a new element to the end of the existing string. This can result in significant overhead if your application performs frequent string manipulation.
The StringBuilder class offers a solution by providing a character buffer and allowing you to manipulate its contents without the runtime creating a new object as a result of every change. You can create a new StringBuilder object that is empty or initialized with the content of an existing String object. You can manipulate the content of the StringBuilder object using overloaded methods that allow you to insert and append string representations of different data types. At any time, you can obtain a String representation of the current content of the StringBuilder object by calling StringBuilder.ToString.
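As a minimal sketch of the overloaded Append and Insert methods, the following builds a short message from an int, a char, a string, and a bool. The class name StringBuilderAppendDemo and the method Build are hypothetical names chosen for illustration.

```csharp
using System;
using System.Text;

public static class StringBuilderAppendDemo
{
    // Builds "3 items: True" by inserting and appending values of different types.
    public static string Build()
    {
        // Start from the content of an existing string.
        StringBuilder sb = new StringBuilder("items");

        // Insert the string representation of an int at the front.
        sb.Insert(0, 3);

        // Insert a single char between the number and the word.
        sb.Insert(1, ' ');

        // Append a string and then the string representation of a bool.
        sb.Append(": ").Append(true);

        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Build());
    }
}
```

Each call changes the same buffer; a new String is created only when ToString is called.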
Two important properties of StringBuilder control its behavior as you append new data: Capacity and Length. Capacity represents the size of the StringBuilder buffer, and Length represents the length of the buffer's current content. If you append new data that results in the number of characters in the StringBuilder object (Length) exceeding the capacity of the StringBuilder object (Capacity), StringBuilder must allocate a new buffer to hold the data. The size of this new buffer is double the size of the previous Capacity value. Used carelessly, this buffer reallocation can negate much of the benefit of using StringBuilder. If you know the length of data you need to work with, or know an upper limit, you can avoid unnecessary buffer reallocation by specifying the capacity at creation time or setting the Capacity property manually. Note that 16 is the default Capacity property setting. When setting the Capacity and Length properties, be aware of the following behavior:
If you set Capacity to a value less than the value of Length, the Capacity property throws the exception System.ArgumentOutOfRangeException. The same exception is also thrown if you try to raise the Capacity setting above the value of the MaxCapacity property. This should not be a problem unless you want to allocate more than 2 gigabytes (GB).
If you set Length to a value less than the length of the current content, the content is truncated.
If you set Length to a value greater than the length of the current content, the buffer is padded to the specified length with Unicode null characters (U+0000), not with spaces. Setting Length to a value greater than Capacity automatically increases the Capacity value so that it is at least as large as the new Length value.
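The Capacity and Length behavior described above can be sketched as follows. The class StringBuilderCapacityDemo and its methods are hypothetical names used only for this illustration; the sketch presizes a buffer, truncates content by lowering Length, and confirms that shrinking Capacity below Length throws.

```csharp
using System;
using System.Text;

public static class StringBuilderCapacityDemo
{
    // Builds a string of 'count' digits using a buffer sized up front,
    // so no reallocation is needed while appending.
    public static string BuildDigits(int count)
    {
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++)
        {
            sb.Append(i % 10);
        }
        return sb.ToString();
    }

    // Truncates the content by setting Length below the current length.
    public static string Truncate(string text, int length)
    {
        StringBuilder sb = new StringBuilder(text);
        sb.Length = length;
        return sb.ToString();
    }

    // Returns true if setting Capacity below Length throws
    // ArgumentOutOfRangeException.
    public static bool ShrinkBelowLengthThrows()
    {
        StringBuilder sb = new StringBuilder("abcdef");
        try
        {
            sb.Capacity = 3;
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(BuildDigits(12));             // 012345678901
        Console.WriteLine(Truncate("Hello, world", 5)); // Hello
        Console.WriteLine(ShrinkBelowLengthThrows());   // True
    }
}
```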
The ReverseString method shown in the following example demonstrates the use of the StringBuilder class to reverse a string. If you did not use the StringBuilder class to perform this operation, it would be significantly more expensive in terms of resource utilization, especially as the input string is made longer. The method creates a StringBuilder object of the correct capacity to ensure that no buffer reallocation is required during the reversal operation.
    using System;
    using System.Text;

    namespace Apress.VisualCSharpRecipes.Chapter02
    {
        class Recipe02_01
        {
            public static string ReverseString(string str)
            {
                // Make sure we have a reversible string.
                if (str == null || str.Length <= 1)
                {
                    return str;
                }

                // Create a StringBuilder object with the required capacity.
                StringBuilder revStr = new StringBuilder(str.Length);

                // Loop backward through the source string one character at a time and
                // append each character to StringBuilder.
                for (int count = str.Length - 1; count > -1; count--)
                {
                    revStr.Append(str[count]);
                }

                // Return the reversed string.
                return revStr.ToString();
            }

            public static void Main()
            {
                Console.WriteLine(ReverseString("Madam Im Adam"));
                Console.WriteLine(ReverseString(
                    "The quick brown fox jumped over the lazy dog."));

                // Wait to continue.
                Console.WriteLine("\nMain method complete. Press Enter");
                Console.ReadLine();
            }
        }
    }
You need to exchange character data with systems that use character-encoding schemes other than UTF-16, which is the character-encoding scheme used internally by the CLR.
Use the System.Text.Encoding class and its subclasses to convert characters between different encoding schemes.
Unicode is not the only character-encoding scheme, nor is UTF-16 the only way to represent Unicode characters. When your application needs to exchange character data with external systems (particularly legacy systems) through an array of bytes, you may need to convert character data between UTF-16 and the encoding scheme supported by the other system.
The abstract class Encoding and its concrete subclasses provide the functionality to convert characters to and from a variety of encoding schemes. Each subclass instance supports the conversion of characters between UTF-16 and one other encoding scheme. You obtain instances of the encoding-specific classes using the static factory method Encoding.GetEncoding, which accepts either the name or the code page number of the required encoding scheme.
Table 2-1 lists some commonly used character-encoding schemes and the code page number you must pass to the GetEncoding method to create an instance of the appropriate encoding class. The table also shows static properties of the Encoding class that provide shortcuts for obtaining the most commonly used types of encoding objects.
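Table 2.1. Character-Encoding Classes
Encoding Scheme | Class | Create Using |
---|---|---|
ASCII | ASCIIEncoding | GetEncoding(20127) or the ASCII property |
Default | Encoding | GetEncoding(0) or the Default property |
UTF-7 | UTF7Encoding | GetEncoding(65000) or the UTF7 property |
UTF-8 | UTF8Encoding | GetEncoding(65001) or the UTF8 property |
UTF-16 (big-endian) | UnicodeEncoding | GetEncoding(1201) or the BigEndianUnicode property |
UTF-16 (little-endian) | UnicodeEncoding | GetEncoding(1200) or the Unicode property |
Windows OS | Encoding | GetEncoding(1252) |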
Once you have an Encoding object of the appropriate type, you convert a UTF-16-encoded Unicode string to a byte array of encoded characters using the GetBytes method. Conversely, you convert a byte array of encoded characters to a string using the GetString method.
The following example demonstrates the use of some encoding classes:
    using System;
    using System.IO;
    using System.Text;

    namespace Apress.VisualCSharpRecipes.Chapter02
    {
        class Recipe02_02
        {
            public static void Main()
            {
                // Create a file to hold the output.
                using (StreamWriter output = new StreamWriter("output.txt"))
                {
                    // Create and write a string containing the symbol for pi.
                    string srcString = "Area = \u03A0r^2";
                    output.WriteLine("Source Text : " + srcString);

                    // Write the UTF-16 encoded bytes of the source string.
                    byte[] utf16String = Encoding.Unicode.GetBytes(srcString);
                    output.WriteLine("UTF-16 Bytes: {0}",
                        BitConverter.ToString(utf16String));

                    // Convert the UTF-16 encoded source string to UTF-8 and ASCII.
                    byte[] utf8String = Encoding.UTF8.GetBytes(srcString);
                    byte[] asciiString = Encoding.ASCII.GetBytes(srcString);

                    // Write the UTF-8 and ASCII encoded byte arrays.
                    output.WriteLine("UTF-8 Bytes: {0}",
                        BitConverter.ToString(utf8String));
                    output.WriteLine("ASCII Bytes: {0}",
                        BitConverter.ToString(asciiString));

                    // Convert UTF-8 and ASCII encoded bytes back to UTF-16 encoded
                    // string and write.
                    output.WriteLine("UTF-8 Text : {0}",
                        Encoding.UTF8.GetString(utf8String));
                    output.WriteLine("ASCII Text : {0}",
                        Encoding.ASCII.GetString(asciiString));
                }

                // Wait to continue.
                Console.WriteLine("\nMain method complete. Press Enter");
                Console.ReadLine();
            }
        }
    }
Running the code will generate a file named output.txt. If you open this file in a text editor that supports Unicode, you will see the following content:
    Source Text : Area = Πr^2
    UTF-16 Bytes: 41-00-72-00-65-00-61-00-20-00-3D-00-20-00-A0-03-72-00-5E-00-32-00
    UTF-8 Bytes: 41-72-65-61-20-3D-20-CE-A0-72-5E-32
    ASCII Bytes: 41-72-65-61-20-3D-20-3F-72-5E-32
    UTF-8 Text : Area = Πr^2
    ASCII Text : Area = ?r^2
Notice that using UTF-16 encoding, each character occupies 2 bytes, but because most of the characters are standard characters, the high-order byte is 0. (The use of little-endian byte ordering means that the low-order byte appears first.) This means that most of the characters are encoded using the same numeric values across all three encoding schemes. However, the numeric value for the symbol pi (A0-03 in UTF-16, CE-A0 in UTF-8, and 3F in ASCII) is different in each of the encodings. The value of pi requires more than 1 byte to represent. UTF-8 encoding uses 2 bytes, but ASCII has no direct equivalent and so replaces pi with the code 3F. As you can see in the ASCII text version of the string, 3F is the symbol for an English question mark (?).
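If you convert Unicode characters to ASCII or a specific code-page encoding scheme, you risk losing data. By default, any Unicode character with a character code that cannot be represented in the scheme is replaced with a substitute character (a question mark in the preceding example), so the original character cannot be recovered.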
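One way to detect this kind of data loss before it happens is to request an encoding that throws instead of substituting. The following is a minimal sketch, assuming the GetEncoding overload that accepts EncoderFallback and DecoderFallback arguments; the class LossyEncodingCheck and the method CanEncodeAsAscii are hypothetical names for illustration.

```csharp
using System;
using System.Text;

public static class LossyEncodingCheck
{
    // Returns true if every character in 'text' can be represented in ASCII.
    public static bool CanEncodeAsAscii(string text)
    {
        // Ask for an ASCII encoding that throws on unrepresentable characters
        // rather than silently substituting another character.
        Encoding strictAscii = Encoding.GetEncoding(
            "us-ascii",
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);

        try
        {
            strictAscii.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CanEncodeAsAscii("Area = r^2"));       // True
        Console.WriteLine(CanEncodeAsAscii("Area = \u03A0r^2")); // False
    }
}
```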
The Encoding class also provides the static method Convert to simplify the conversion of a byte array from one encoding scheme to another without the need to manually perform an interim conversion to UTF-16. For example, the following statement converts the ASCII-encoded bytes contained in the asciiString byte array directly from ASCII encoding to UTF-8 encoding:
    byte[] utf8String = Encoding.Convert(Encoding.ASCII, Encoding.UTF8, asciiString);
The static methods of the System.BitConverter class provide a convenient mechanism for converting most basic value types to and from byte arrays. An exception is the decimal type. To convert a decimal type to or from a byte array, you need to use a System.IO.MemoryStream object.
The static method GetBytes of the BitConverter class provides overloads that take most of the standard value types and return the value encoded as an array of bytes. Support is provided for the bool, char, double, short, int, long, float, ushort, uint, and ulong data types. BitConverter also provides a set of static methods that support the conversion of byte arrays to each of the standard value types. These are named ToBoolean, ToUInt32, ToDouble, and so on.
Unfortunately, the BitConverter class does not provide support for converting the decimal type. Instead, write the decimal type to a MemoryStream instance using a System.IO.BinaryWriter object, and then call the MemoryStream.ToArray method. To create a decimal type from a byte array, create a MemoryStream object from the byte array and read the decimal type from the MemoryStream object using a System.IO.BinaryReader instance.
The following example demonstrates the use of BitConverter to convert a bool type and an int type to and from a byte array. The second argument to each of the ToBoolean and ToInt32 methods is a zero-based offset into the byte array where the BitConverter should start taking the bytes to create the data value. The code also shows how to convert a decimal type to a byte array using a MemoryStream object and a BinaryWriter object, as well as how to convert a byte array to a decimal type using a BinaryReader object to read from the MemoryStream object.
    using System;
    using System.IO;

    namespace Apress.VisualCSharpRecipes.Chapter02
    {
        class Recipe02_03
        {
            // Create a byte array from a decimal.
            public static byte[] DecimalToByteArray (decimal src)
            {
                // Create a MemoryStream as a buffer to hold the binary data.
                using (MemoryStream stream = new MemoryStream())
                {
                    // Create a BinaryWriter to write binary data to the stream.
                    using (BinaryWriter writer = new BinaryWriter(stream))
                    {
                        // Write the decimal to the BinaryWriter/MemoryStream.
                        writer.Write(src);

                        // Return the byte representation of the decimal.
                        return stream.ToArray();
                    }
                }
            }

            // Create a decimal from a byte array.
            public static decimal ByteArrayToDecimal (byte[] src)
            {
                // Create a MemoryStream containing the byte array.
                using (MemoryStream stream = new MemoryStream(src))
                {
                    // Create a BinaryReader to read the decimal from the stream.
                    using (BinaryReader reader = new BinaryReader(stream))
                    {
                        // Read and return the decimal from the
                        // BinaryReader/MemoryStream.
                        return reader.ReadDecimal();
                    }
                }
            }

            public static void Main()
            {
                byte[] b = null;

                // Convert a bool to a byte array and display.
                b = BitConverter.GetBytes(true);
                Console.WriteLine(BitConverter.ToString(b));

                // Convert a byte array to a bool and display.
                Console.WriteLine(BitConverter.ToBoolean(b, 0));

                // Convert an int to a byte array and display.
                b = BitConverter.GetBytes(3678);
                Console.WriteLine(BitConverter.ToString(b));

                // Convert a byte array to an int and display.
                Console.WriteLine(BitConverter.ToInt32(b, 0));

                // Convert a decimal to a byte array and display.
                b = DecimalToByteArray(285998345545.563846696m);
                Console.WriteLine(BitConverter.ToString(b));

                // Convert a byte array to a decimal and display.
                Console.WriteLine(ByteArrayToDecimal(b));

                // Wait to continue.
                Console.WriteLine("Main method complete. Press Enter");
                Console.ReadLine();
            }
        }
    }
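The offset argument is easier to see when more than one value shares a byte array. The following is a small sketch, with the hypothetical class OffsetDemo and methods Pack and ReadSecond, that stores two int values back to back and reads the second one starting at offset 4.

```csharp
using System;

public static class OffsetDemo
{
    // Packs two ints back to back into one 8-byte array.
    public static byte[] Pack(int first, int second)
    {
        byte[] buffer = new byte[8];
        BitConverter.GetBytes(first).CopyTo(buffer, 0);
        BitConverter.GetBytes(second).CopyTo(buffer, 4);
        return buffer;
    }

    // Reads the second int, which starts 4 bytes into the array.
    public static int ReadSecond(byte[] buffer)
    {
        return BitConverter.ToInt32(buffer, 4);
    }

    public static void Main()
    {
        byte[] packed = Pack(3678, -42);
        Console.WriteLine(BitConverter.ToInt32(packed, 0)); // 3678
        Console.WriteLine(ReadSecond(packed));              // -42
    }
}
```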
The BitConverter.ToString method provides a convenient mechanism for obtaining a String representation of a byte array. Calling ToString and passing a byte array as an argument will return a String object containing the hexadecimal value of each byte in the array separated by a hyphen, for example "34-A7-2C". Unfortunately, there is no standard method for reversing this process to obtain a byte array from a string with this format.
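Reversing the process by hand is straightforward. The following is a minimal sketch rather than a library facility: the class HexStringParser and the method FromHyphenatedHex are hypothetical names, and the code assumes the input uses exactly the hyphen-separated format that BitConverter.ToString produces.

```csharp
using System;

public static class HexStringParser
{
    // Converts a string such as "34-A7-2C" back into a byte array.
    public static byte[] FromHyphenatedHex(string hex)
    {
        if (string.IsNullOrEmpty(hex))
        {
            return new byte[0];
        }

        // Each hyphen-separated element is one byte written in base 16.
        string[] parts = hex.Split('-');
        byte[] result = new byte[parts.Length];
        for (int i = 0; i < parts.Length; i++)
        {
            result[i] = Convert.ToByte(parts[i], 16);
        }
        return result;
    }

    public static void Main()
    {
        byte[] bytes = FromHyphenatedHex("34-A7-2C");
        Console.WriteLine(BitConverter.ToString(bytes)); // 34-A7-2C
    }
}
```

Malformed input causes Convert.ToByte to throw, so callers that accept untrusted strings may want to catch FormatException.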
You need to convert binary data into a form that can be stored as part of an ASCII text file (such as an XML file) or sent as part of a text e-mail message.
Use the static methods ToBase64CharArray and FromBase64CharArray of the System.Convert class to convert your binary data to and from a Base64-encoded char array. If you need to work with the encoded data as a string value instead of a char array, you can use the ToBase64String and FromBase64String methods of the Convert class instead.
Base64 is an encoding scheme that enables you to represent binary data as a series of ASCII characters so that it can be included in text files and e-mail messages in which raw binary data is unacceptable. Base64 encoding works by spreading the contents of 3 bytes of input data (24 bits) across 4 characters, each of which carries 6 bits of data and is drawn from a 64-character subset of ASCII. This means that each character of Base64-encoded data is a printable ASCII character and can be stored or transmitted anywhere ASCII characters are permitted.
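The 3-bytes-to-4-characters ratio determines the size of the encoded output. As a rough sketch, the hypothetical helper EncodedLength below computes the expected length (including padding) and the Main method compares it with what Convert.ToBase64String actually returns.

```csharp
using System;

public static class Base64LengthDemo
{
    // Every group of up to 3 input bytes becomes 4 output characters.
    public static int EncodedLength(int byteCount)
    {
        return ((byteCount + 2) / 3) * 4;
    }

    public static void Main()
    {
        for (int n = 1; n <= 6; n++)
        {
            string encoded = Convert.ToBase64String(new byte[n]);
            Console.WriteLine("{0} bytes -> {1} chars (expected {2}): {3}",
                n, encoded.Length, EncodedLength(n), encoded);
        }
    }
}
```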
The ToBase64CharArray and FromBase64CharArray methods of the Convert class make it straightforward to Base64 encode and decode data. However, before Base64 encoding, you must convert your data to a byte array. Similarly, when decoding you must convert the byte array back to the appropriate data type. See recipe 2-2 for details on converting string data to and from byte arrays and recipe 2-3 for details on converting basic value types. The ToBase64String and FromBase64String methods of the Convert class deal with string representations of Base64-encoded data.
The example shown here demonstrates how to Base64 encode and decode a byte array, a Unicode string, an int type, and a decimal type using the Convert class. The DecimalToBase64 and Base64ToDecimal methods rely on the ByteArrayToDecimal and DecimalToByteArray methods listed in recipe 2-3.
    using System;
    using System.IO;
    using System.Text;

    namespace Apress.VisualCSharpRecipes.Chapter02
    {
        class Recipe02_04
        {
            // Create a byte array from a decimal.
            public static byte[] DecimalToByteArray (decimal src)
            {
                // Create a MemoryStream as a buffer to hold the binary data.
                using (MemoryStream stream = new MemoryStream())
                {
                    // Create a BinaryWriter to write binary data to the stream.
                    using (BinaryWriter writer = new BinaryWriter(stream))
                    {
                        // Write the decimal to the BinaryWriter/MemoryStream.
                        writer.Write(src);

                        // Return the byte representation of the decimal.
                        return stream.ToArray();
                    }
                }
            }

            // Create a decimal from a byte array.
            public static decimal ByteArrayToDecimal (byte[] src)
            {
                // Create a MemoryStream containing the byte array.
                using (MemoryStream stream = new MemoryStream(src))
                {
                    // Create a BinaryReader to read the decimal from the stream.
                    using (BinaryReader reader = new BinaryReader(stream))
                    {
                        // Read and return the decimal from the
                        // BinaryReader/MemoryStream.
                        return reader.ReadDecimal();
                    }
                }
            }

            // Base64 encode a Unicode string.
            public static string StringToBase64 (string src)
            {
                // Get a byte representation of the source string.
                byte[] b = Encoding.Unicode.GetBytes(src);

                // Return the Base64-encoded string.
                return Convert.ToBase64String(b);
            }

            // Decode a Base64-encoded Unicode string.
            public static string Base64ToString (string src)
            {
                // Decode the Base64-encoded string to a byte array.
                byte[] b = Convert.FromBase64String(src);

                // Return the decoded Unicode string.
                return Encoding.Unicode.GetString(b);
            }

            // Base64 encode a decimal.
            public static string DecimalToBase64 (decimal src)
            {
                // Get a byte representation of the decimal.
                byte[] b = DecimalToByteArray(src);

                // Return the Base64-encoded decimal.
                return Convert.ToBase64String(b);
            }

            // Decode a Base64-encoded decimal.
            public static decimal Base64ToDecimal (string src)
            {
                // Decode the Base64-encoded decimal to a byte array.
                byte[] b = Convert.FromBase64String(src);

                // Return the decoded decimal.
                return ByteArrayToDecimal(b);
            }

            // Base64 encode an int.
            public static string IntToBase64 (int src)
            {
                // Get a byte representation of the int.
                byte[] b = BitConverter.GetBytes(src);

                // Return the Base64-encoded int.
                return Convert.ToBase64String(b);
            }

            // Decode a Base64-encoded int.
            public static int Base64ToInt (string src)
            {
                // Decode the Base64-encoded int to a byte array.
                byte[] b = Convert.FromBase64String(src);

                // Return the decoded int.
                return BitConverter.ToInt32(b, 0);
            }

            public static void Main()
            {
                // Encode and decode a general byte array. Need to create a char[]
                // to hold the Base64-encoded data. The size of the char[] must
                // be at least 4/3 the size of the source byte[] and must be
                // divisible by 4.
                byte[] data = { 0x04, 0x43, 0x5F, 0xFF, 0x0, 0xF0, 0x2D, 0x62, 0x78,
                    0x22, 0x15, 0x51, 0x5A, 0xD6, 0x0C, 0x59, 0x36, 0x63, 0xBD, 0xC2,
                    0xD5, 0x0F, 0x8C, 0xF5, 0xCA, 0x0C};

                char[] base64data =
                    new char[(int)(Math.Ceiling((double)data.Length / 3) * 4)];

                Console.WriteLine("\nByte array encoding/decoding");
                Convert.ToBase64CharArray(data, 0, data.Length, base64data, 0);
                Console.WriteLine(new String(base64data));
                Console.WriteLine(BitConverter.ToString(
                    Convert.FromBase64CharArray(base64data, 0, base64data.Length)));

                // Encode and decode a string.
                Console.WriteLine(StringToBase64
                    ("Welcome to Visual C# Recipes from Apress"));
                Console.WriteLine(Base64ToString("VwBlAGwAYwBvAG0AZQAgAHQAbwA" +
                    "gAFYAaQBzAHUAYQBsACAAQwAjACAAUgBlAGMAaQBwAGUAcwAgAGYAcgB" +
                    "vAG0AIABBAHAAcgBlAHMAcwA="));

                // Encode and decode a decimal.
                Console.WriteLine(DecimalToBase64(285998345545.563846696m));
                Console.WriteLine(Base64ToDecimal("KDjBUP07BoEPAAAAAAAJAA=="));

                // Encode and decode an int.
                Console.WriteLine(IntToBase64(35789));
                Console.WriteLine(Base64ToInt("zYsAAA=="));

                // Wait to continue.
                Console.WriteLine("\nMain method complete. Press Enter");
                Console.ReadLine();
            }
        }
    }
If you Base64 encode binary data for the purpose of including it as MIME data in an e-mail message, be aware that the maximum allowed line length in MIME for Base64-encoded data is 76 characters. Therefore, if your encoded data is longer than 76 characters, you must insert line breaks so that no line exceeds that length. For further information about the MIME standard, consult RFCs 2045 through 2049, which can be found at www.ietf.org/rfc.html.
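The Convert class can insert these line breaks for you. The following sketch uses the ToBase64String overload that takes a Base64FormattingOptions value; the class MimeBase64Demo and the method ToMimeBase64 are hypothetical names for illustration.

```csharp
using System;

public static class MimeBase64Demo
{
    // Encodes data as Base64 with a line break after every 76 characters.
    public static string ToMimeBase64(byte[] data)
    {
        return Convert.ToBase64String(
            data, Base64FormattingOptions.InsertLineBreaks);
    }

    public static void Main()
    {
        // 100 bytes encode to 136 characters, which spans two lines.
        string encoded = ToMimeBase64(new byte[100]);
        string[] lines = encoded.Split(
            new string[] { "\r\n" }, StringSplitOptions.None);

        foreach (string line in lines)
        {
            Console.WriteLine("{0} characters", line.Length);
        }
    }
}
```

Convert.FromBase64String ignores the inserted white space, so the wrapped text can be decoded without first removing the line breaks.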
You need to validate that user input or data read from a file has the expected structure and content. For example, you want to ensure that a user enters a valid IP address, telephone number, or e-mail address.
Use regular expressions to ensure that the input data follows the correct structure and contains only valid characters for the expected type of information.
When a user inputs data to your application or your application reads data from a file, it's good practice to assume that the data is bad until you have verified its accuracy. One common validation requirement is to ensure that data entries such as e-mail addresses, telephone numbers, and credit card numbers follow the pattern and content constraints expected of such data. Obviously, you cannot be sure the actual data entered is valid until you use it, and you cannot compare it against values that are known to be correct. However, ensuring the data has the correct structure and content is a good first step to determining whether the input is accurate. Regular expressions provide an excellent mechanism for evaluating strings for the presence of patterns, and you can use this to your advantage when validating input data.
The first thing you must do is figure out the regular expression syntax that will correctly match the structure and content of the data you are trying to validate. This is by far the most difficult aspect of using regular expressions. Many resources exist to help you with regular expressions, such as The Regulator (http://osherove.com/tools), and RegExDesigner.NET, by Chris Sells (www.sellsbrothers.com/tools/#regexd). The RegExLib.com web site (www.regexlib.com) also provides hundreds of useful prebuilt expressions.
Regular expressions are constructed from two types of elements: literals and metacharacters. Literals represent specific characters that appear in the pattern you want to match. Metacharacters provide support for wildcard matching, ranges, grouping, repetition, conditionals, and other control mechanisms. Table 2-2 describes some of the more commonly used regular expression metacharacter elements. (Consult the .NET SDK documentation for a full description of regular expressions. A good starting point is http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.aspx.)
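Table 2.2. Commonly Used Regular Expression Metacharacter Elements
Element | Description |
---|---|
`.` | Specifies any character except a newline character (`\n`) |
`\d` | Specifies any decimal digit |
`\D` | Specifies any nondigit |
`\s` | Specifies any whitespace character |
`\S` | Specifies any non-whitespace character |
`\w` | Specifies any word character |
`\W` | Specifies any nonword character |
`^` | Specifies the beginning of the string or line |
`\A` | Specifies the beginning of the string |
`$` | Specifies the end of the string or line |
`\z` | Specifies the end of the string |
`\|` | Matches one of the expressions separated by the vertical bar (pipe symbol); for example, `AAA\|ABA\|ABB` matches one of AAA, ABA, or ABB |
`[abc]` | Specifies a match with one of the specified characters; for example, `[AbC]` matches A, b, or C, but no other character |
`[^abc]` | Specifies a match with any one character except those specified; for example, `[^AbC]` matches any character except A, b, and C |
`[a-z]` | Specifies a match with any one character in the specified range; for example, `[A-C]` matches A, B, or C |
`( )` | Identifies a subexpression so that it's treated as a single element by the regular expression elements described in this table |
`?` | Specifies one or zero occurrences of the previous character or subexpression; for example, `A?B` matches B and AB, but not AAB |
`*` | Specifies zero or more occurrences of the previous character or subexpression; for example, `A*B` matches B, AB, AAB, AAAB, and so on |
`+` | Specifies one or more occurrences of the previous character or subexpression; for example, `A+B` matches AB, AAB, AAAB, and so on, but not B |
`{n}` | Specifies exactly n occurrences of the previous character or subexpression; for example, `A{2}` matches only AA |
`{n,}` | Specifies a minimum of n occurrences of the previous character or subexpression; for example, `A{2,}` matches AA, AAA, AAAA, and so on, but not A |
`{n,m}` | Specifies a minimum of n and a maximum of m occurrences of the previous character or subexpression; for example, `A{2,4}` matches AA, AAA, and AAAA, but not A or AAAAA |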
The more complex the data you are trying to match, the more complex the regular expression syntax becomes. For example, ensuring that input contains only numbers or is of a minimum length is trivial, but ensuring a string contains a valid URL is extremely complex. Table 2-3 shows some examples of regular expressions that match against commonly required data types.
Table 2.3. Commonly Used Regular Expressions
Input Type | Description | Regular Expression |
---|---|---|
Numeric input | The input consists of one or more decimal digits; for example, 5 or 5683874674. | `^\d+$` |
Personal identification number (PIN) | The input consists of four decimal digits; for example, 1234. | `^\d{4}$` |
Simple password | The input consists of six to eight characters; for example, ghtd6f or b8c7hogh. | `^\w{6,8}$` |
Credit card number | The input consists of data that matches the pattern of most major credit card numbers; for example, 4921835221552042 or 4921-8352-2155-2042. | `^\d{4}-?\d{4}-?\d{4}-?\d{4}$` |
E-mail address | The input consists of an Internet e-mail address. The `[\w-]+` expression indicates that each address element must consist of one or more word characters or hyphens; for example, user@example.com. | `^[\w-]+@([\w-]+\.)+[\w-]+$` |
HTTP or HTTPS URL | The input consists of an HTTP-based or HTTPS-based URL; for example, http://www.apress.com. | `^https?://([\w-]+\.)+[\w-]+(/[\w- ./?%=]*)?$` |
Once you know the correct regular expression syntax, create a new System.Text.RegularExpressions.Regex object, passing a string containing the regular expression to the Regex constructor. Then call the IsMatch method of the Regex object and pass the string that you want to validate. IsMatch returns a bool value indicating whether the Regex object found a match in the string. The regular expression syntax determines whether the Regex object will match against only the full string or match against patterns contained within the string. (See the ^, \A, $, and \z entries in Table 2-2.)
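The difference between matching the full string and matching a pattern contained within it can be sketched as follows. The class AnchorDemo and its two methods are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Unanchored: true if four consecutive digits appear anywhere in the input.
    public static bool ContainsFourDigits(string input)
    {
        return Regex.IsMatch(input, @"\d{4}");
    }

    // Anchored with \A and \z: true only if the whole input is four digits.
    public static bool IsExactlyFourDigits(string input)
    {
        return Regex.IsMatch(input, @"\A\d{4}\z");
    }

    public static void Main()
    {
        string note = "PIN: 1234 (temporary)";
        Console.WriteLine(ContainsFourDigits(note));    // True
        Console.WriteLine(IsExactlyFourDigits(note));   // False
        Console.WriteLine(IsExactlyFourDigits("1234")); // True

        // $ also matches just before a trailing newline, whereas \z does not.
        Console.WriteLine(Regex.IsMatch("1234\n", @"^\d{4}$")); // True
        Console.WriteLine(IsExactlyFourDigits("1234\n"));       // False
    }
}
```

For validation, the anchored form is usually the one you want, because the unanchored form accepts any input that merely contains a valid value.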
The ValidateInput method shown in the following example tests any input string to see if it matches a specified regular expression.
    using System;
    using System.Text.RegularExpressions;

    namespace Apress.VisualCSharpRecipes.Chapter02
    {
        class Recipe02_05
        {
            public static bool ValidateInput(string regex, string input)
            {
                // Create a new Regex based on the specified regular expression.
                Regex r = new Regex(regex);

                // Test if the specified input matches the regular expression.
                return r.IsMatch(input);
            }

            public static void Main(string[] args)
            {
                // Test the input from the command line. The first argument is the
                // regular expression, and the second is the input.
                Console.WriteLine("Regular Expression: {0}", args[0]);
                Console.WriteLine("Input: {0}", args[1]);
                Console.WriteLine("Valid = {0}", ValidateInput(args[0], args[1]));

                // Wait to continue.
                Console.WriteLine("\nMain method complete. Press Enter");
                Console.ReadLine();
            }
        }
    }
To execute the example, run Recipe02-05.exe and pass the regular expression and data to test as command-line arguments. For example, to test for a correctly formed e-mail address, type the following:
    Recipe02-05 ^[\w-]+@([\w-]+\.)+[\w-]+$ user@example.com
The result would be as follows:
    Regular Expression: ^[\w-]+@([\w-]+\.)+[\w-]+$
    Input: user@example.com
    Valid = True
You can use a Regex object repeatedly to test multiple strings, but you cannot change the regular expression tested for by a Regex object. You must create a new Regex object to test for a different pattern. Because the ValidateInput method creates a new Regex instance each time it's called, you do not get the ability to reuse the Regex object. As such, a more suitable alternative in this case would be to use a static overload of the IsMatch method, as shown in the following variant of the ValidateInput method:
    // Alternative version of the ValidateInput method that does not create
    // Regex instances.
    public static bool ValidateInput(string regex, string input)
    {
        // Test if the specified input matches the regular expression.
        return Regex.IsMatch(input, regex);
    }
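When the pattern is fixed, another option is to create the Regex object once and reuse it for every input. The following is a minimal sketch under that assumption; the class PinValidator and the method IsValidPin are hypothetical names, and the pattern is the four-digit PIN expression.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PinValidator
{
    // Created once and shared by every call to IsValidPin.
    private static readonly Regex PinPattern = new Regex(@"^\d{4}$");

    // Returns true if the input consists of exactly four decimal digits.
    public static bool IsValidPin(string input)
    {
        return input != null && PinPattern.IsMatch(input);
    }

    public static void Main()
    {
        string[] candidates = { "1234", "12345", "12a4" };
        foreach (string candidate in candidates)
        {
            Console.WriteLine("{0}: {1}", candidate, IsValidPin(candidate));
        }
    }
}
```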
You need to minimize the impact on application performance that arises from using complex regular expressions frequently.
When you instantiate the System.Text.RegularExpressions.Regex object that represents your regular expression, specify the Compiled option of the System.Text.RegularExpressions.RegexOptions enumeration to compile the regular expression to Microsoft Intermediate Language (MSIL).
By default, when you create a Regex object, the regular expression pattern you specify in the constructor is compiled to an intermediate form (not MSIL). Each time you use the Regex object, the runtime interprets the pattern's intermediate form and applies it to the target string. With complex regular expressions that are used frequently, this repeated interpretation process can have a detrimental effect on the performance of your application.
By specifying the RegexOptions.Compiled option when you create a Regex object, you force the .NET runtime to compile the regular expression to MSIL instead of the interpreted intermediary form. This MSIL is just-in-time (JIT) compiled by the runtime to native machine code on first execution, just like regular assembly code. You use a compiled regular expression in the same way as you use any Regex object; compilation simply results in faster execution.
However, a couple of downsides offset the performance benefits provided by compiling regular expressions. First, the JIT compiler needs to do more work, which will introduce delays during JIT compilation. This is most noticeable if you create your compiled regular expressions as your application starts up. Second, the runtime cannot unload a compiled regular expression once you have finished with it. Unlike with a normal regular expression, the runtime's garbage collector will not reclaim the memory used by the compiled regular expression. The compiled regular expression will remain in memory until your program terminates or you unload the application domain in which the compiled regular expression is loaded.
As well as compiling regular expressions in memory, the static Regex.CompileToAssembly method allows you to create a compiled regular expression and write it to an external assembly. This means that you can create assemblies containing standard sets of regular expressions, which you can use from multiple applications. To compile a regular expression and persist it to an assembly, take the following steps:
Create a System.Text.RegularExpressions.RegexCompilationInfo array large enough to hold one RegexCompilationInfo object for each of the compiled regular expressions you want to create.
Create a RegexCompilationInfo object for each of the compiled regular expressions. Specify values for its properties as arguments to the object constructor. The following are the most commonly used properties:
IsPublic, a bool value that specifies whether the generated regular expression class has public visibility
Name, a String value that specifies the class name
Namespace, a String value that specifies the namespace of the class
Pattern, a String value that specifies the pattern that the regular expression will match (see recipe 2-5 for more details)
Options, a System.Text.RegularExpressions.RegexOptions value that specifies options for the regular expression
Create a System.Reflection.AssemblyName object. Configure it to represent the name of the assembly that the Regex.CompileToAssembly method will create.
Execute Regex.CompileToAssembly, passing the RegexCompilationInfo array and the AssemblyName object.
This process creates an assembly that contains one class declaration for each compiled regular expression—each class derives from Regex
. To use the compiled regular expression contained in the assembly, instantiate the regular expression you want to use and call its method as if you had simply created it with the normal Regex
constructor. (Remember to add a reference to the assembly when you compile the code that uses the compiled regular expression classes.)
This line of code shows how to create a Regex object that is compiled to MSIL instead of the usual intermediate form:
    Regex reg = new Regex(@"[\w-]+@([\w-]+\.)+[\w-]+", RegexOptions.Compiled);
The following example shows how to create an assembly named MyRegEx.dll, which contains two regular expressions named PinRegex and CreditCardRegex:
    using System;
    using System.Reflection;
    using System.Text.RegularExpressions;

    namespace Apress.VisualCSharpRecipes.Chapter02
    {
        class Recipe02_06
        {
            public static void Main()
            {
                // Create the array to hold the Regex info objects.
                RegexCompilationInfo[] regexInfo = new RegexCompilationInfo[2];

                // Create the RegexCompilationInfo for PinRegex.
                regexInfo[0] = new RegexCompilationInfo(@"^\d{4}$",
                    RegexOptions.Compiled, "PinRegex", "", true);

                // Create the RegexCompilationInfo for CreditCardRegex.
                regexInfo[1] = new RegexCompilationInfo(
                    @"^\d{4}-?\d{4}-?\d{4}-?\d{4}$",
                    RegexOptions.Compiled, "CreditCardRegex", "", true);

                // Create the AssemblyName to define the target assembly.
                AssemblyName assembly = new AssemblyName();
                assembly.Name = "MyRegEx";

                // Create the compiled regular expressions.
                Regex.CompileToAssembly(regexInfo, assembly);
            }
        }
    }
You need to create a System.DateTime instance that represents the time and date specified in a string.
Use the Parse or ParseExact method of the DateTime class.
Many subtle issues are associated with using the DateTime class to represent dates and times in your applications. Although the Parse and ParseExact methods create DateTime objects from strings as described in this recipe, you must be careful how you use the resulting DateTime objects within your program. See the article titled "Coding Best Practices Using DateTime in the .NET Framework," at http://msdn.microsoft.com/netframework/default.aspx?pull=/library/en-us/dndotnet/html/datetimecode.asp, for details about the problems you may encounter.
Dates and times can be represented as text in many different ways. For example, 1st June 2005, 1/6/2005, 6/1/2005, and 1-Jun-2005 are all possible representations of the same date, and 16:43 and 4:43 p.m. can both be used to represent the same time. The static DateTime.Parse method provides a flexible mechanism for creating DateTime instances from a wide variety of string representations.
The Parse method goes to great lengths to generate a DateTime object from a given string. It will even attempt to generate a DateTime object from a string containing partial or erroneous information and will substitute defaults for any missing values. Missing date elements default to the current date, and missing time elements default to 12:00:00 a.m. After all efforts, if Parse cannot create a DateTime object, it throws a System.FormatException exception.
The Parse method is both flexible and forgiving. However, for many applications, this level of flexibility is unnecessary. Often, you will want to ensure that DateTime parses only strings that match a specific format. In these circumstances, use the ParseExact method instead of Parse. The simplest overload of the ParseExact method takes three arguments: the time and date string to parse, a format string that specifies the structure that the time and date string must have, and an IFormatProvider reference that provides culture-specific information to the ParseExact method. If the IFormatProvider value is null, the current thread's culture information is used.
The time and date must meet the requirements specified in the format string, or else ParseExact will throw a System.FormatException exception. You use the same format specifiers for the format string as you use to format a DateTime object for display as a string. This means that you can use both standard and custom format specifiers.
The following example demonstrates the flexibility of the Parse method and the use of the ParseExact method. Refer to the documentation for the System.Globalization.DateTimeFormatInfo class in the .NET Framework SDK documentation for complete details on all available format specifiers.
    using System;

    namespace Apress.VisualCSharpRecipes.Chapter02
    {
        class Recipe02_07
        {
            public static void Main(string[] args)
            {
                string ds1 = "Sep 2005";
                string ds2 = "Monday 5 September 2005 14:15:33";
                string ds3 = "5,9,5";
                string ds4 = "5/9/2005 14:15:33";
                string ds5 = "2:15 PM";

                // 1st September 2005 00:00:00
                DateTime dt1 = DateTime.Parse(ds1);

                // 5th September 2005 14:15:33
                DateTime dt2 = DateTime.Parse(ds2);

                // 5th September 2005 00:00:00
                DateTime dt3 = DateTime.Parse(ds3);

                // 5th September 2005 14:15:33
                DateTime dt4 = DateTime.Parse(ds4);

                // Current Date 14:15:00
                DateTime dt5 = DateTime.Parse(ds5);

                // Display the converted DateTime objects.
                Console.WriteLine("String: {0} DateTime: {1}", ds1, dt1);
                Console.WriteLine("String: {0} DateTime: {1}", ds2, dt2);
                Console.WriteLine("String: {0} DateTime: {1}", ds3, dt3);
                Console.WriteLine("String: {0} DateTime: {1}", ds4, dt4);
                Console.WriteLine("String: {0} DateTime: {1}", ds5, dt5);

                // Parse only strings containing LongTimePattern.
                DateTime dt6 = DateTime.ParseExact("2:13:30 PM",
                    "h:mm:ss tt", null);

                // Parse only strings containing RFC1123Pattern.
                DateTime dt7 = DateTime.ParseExact(
                    "Mon, 05 Sep 2005 14:13:30 GMT",
                    "ddd, dd MMM yyyy HH':'mm':'ss 'GMT'", null);

                // Parse only strings containing MonthDayPattern.
                DateTime dt8 = DateTime.ParseExact("September 05",
                    "MMMM dd", null);

                // Display the converted DateTime objects.
                Console.WriteLine(dt6);
                Console.WriteLine(dt7);
                Console.WriteLine(dt8);

                // Wait to continue.
                Console.WriteLine("\nMain method complete. Press Enter");
                Console.ReadLine();
            }
        }
    }
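ParseExact throws when the input does not match, which is awkward for user-entered text. The following is a minimal sketch, not part of the recipe's listing, that uses DateTime.TryParseExact to report a mismatch without an exception. The ParseIsoDate helper, the "yyyy-MM-dd" format, and the use of CultureInfo.InvariantCulture are assumptions chosen for illustration.

```csharp
using System;
using System.Globalization;

public static class ParseExactSketch
{
    // Hypothetical helper: returns the parsed date, or null when the text
    // does not match the "yyyy-MM-dd" custom format exactly.
    public static DateTime? ParseIsoDate(string text)
    {
        DateTime result;
        bool matched = DateTime.TryParseExact(
            text,
            "yyyy-MM-dd",
            CultureInfo.InvariantCulture,   // fixed culture, not the thread's
            DateTimeStyles.None,
            out result);
        return matched ? (DateTime?)result : null;
    }

    public static void Main()
    {
        // Matches the format, so a value is returned.
        Console.WriteLine(ParseIsoDate("2005-09-05").HasValue);

        // Parse might accept this string, but the exact format rejects it.
        Console.WriteLine(ParseIsoDate("5/9/2005").HasValue);
    }
}
```

Passing an explicit culture rather than null keeps the result independent of the machine's regional settings, which is usually what you want for data exchanged between systems.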
Use the DateTime and TimeSpan structures, which support standard arithmetic and comparison operators.
A DateTime instance represents a specific time (such as 4:15 a.m. on September 5, 1970), whereas a TimeSpan instance represents a period of time (such as 2 hours, 35 minutes). You may want to add, subtract, and compare TimeSpan and DateTime instances.
Internally, both DateTime and TimeSpan use ticks to represent time. A tick is equal to 100 nanoseconds (ns). TimeSpan stores its time interval as the number of ticks equal to that interval, and DateTime stores time as the number of ticks since 12:00:00 midnight on January 1 in 0001 CE. (CE stands for Common Era and is equivalent to AD in the Gregorian calendar.) This approach and the use of operator overloading make it easy for DateTime and TimeSpan to support basic arithmetic and comparison operations. Table 2-4 summarizes the operator support provided by the DateTime and TimeSpan structures.
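Table 2-4. Operators Supported by DateTime and TimeSpan
Operator | TimeSpan | DateTime |
---|---|---|
Assignment (=) | Because TimeSpan is a structure, assignment returns a copy, not a reference. | Because DateTime is a structure, assignment returns a copy, not a reference. |
Addition (+) | Adds two TimeSpan instances. | Adds a TimeSpan instance to a DateTime instance. |
Subtraction (-) | Subtracts one TimeSpan instance from another TimeSpan instance. | Subtracts a TimeSpan instance or a DateTime instance from a DateTime instance. |
Equality (==) | Compares two TimeSpan instances and returns true if they are equal. | Compares two DateTime instances and returns true if they are equal. |
Inequality (!=) | Compares two TimeSpan instances and returns true if they are not equal. | Compares two DateTime instances and returns true if they are not equal. |
Greater than (>) | Determines if one TimeSpan instance is greater than another. | Determines if one DateTime instance is greater than another. |
Greater than or equal to (>=) | Determines if one TimeSpan instance is greater than or equal to another. | Determines if one DateTime instance is greater than or equal to another. |
Less than (<) | Determines if one TimeSpan instance is less than another. | Determines if one DateTime instance is less than another. |
Less than or equal to (<=) | Determines if one TimeSpan instance is less than or equal to another. | Determines if one DateTime instance is less than or equal to another. |
Unary negation (-) | Returns a TimeSpan instance with the negated value of the specified TimeSpan instance. | Not supported |
Unary plus (+) | Returns the TimeSpan instance specified. | Not supported |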
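The tick representation and a few of the operators can be checked directly. The following is a small sketch under the assumption that nothing beyond the standard DateTime and TimeSpan members is needed; the Elapsed helper and the sample dates are hypothetical.

```csharp
using System;

public static class TickSketch
{
    // Hypothetical helper: the interval between two points in time.
    public static TimeSpan Elapsed(DateTime start, DateTime end)
    {
        return end - start;   // DateTime - DateTime yields a TimeSpan
    }

    public static void Main()
    {
        // One second is 10,000,000 ticks of 100 ns each.
        Console.WriteLine(TimeSpan.FromSeconds(1).Ticks == TimeSpan.TicksPerSecond);

        // DateTime counts ticks from midnight on January 1, 0001.
        Console.WriteLine(new DateTime(1, 1, 1).Ticks);

        DateTime start = new DateTime(2005, 9, 5, 14, 0, 0);
        DateTime end = new DateTime(2005, 9, 5, 16, 35, 0);
        TimeSpan elapsed = Elapsed(start, end);   // 2 hours, 35 minutes

        // DateTime + TimeSpan yields a DateTime.
        Console.WriteLine(start + elapsed == end);

        // Unary negation is available on TimeSpan only.
        Console.WriteLine(-elapsed < TimeSpan.Zero);
    }
}
```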
The DateTime structure also implements the AddTicks, AddMilliseconds, AddSeconds, AddMinutes, AddHours, AddDays, AddMonths, and AddYears methods. Each of these methods allows you to add (or subtract using negative values) the appropriate element of time to a DateTime instance. These methods and the operators listed in Table 2-4 do not modify the original DateTime; instead, they create a new instance with the modified value.
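As a brief illustration of the Add methods, the sketch below shows that the original value is left untouched and that a negative argument subtracts. The OneWeekBefore helper and the sample date are assumptions made for the example.

```csharp
using System;

public static class AddMethodsSketch
{
    // Hypothetical helper: the same time of day one week earlier.
    public static DateTime OneWeekBefore(DateTime value)
    {
        return value.AddDays(-7);   // a negative argument subtracts
    }

    public static void Main()
    {
        DateTime original = new DateTime(2005, 1, 31);

        // AddMonths returns a new value; the day is adjusted to the last
        // valid day of the resulting month when necessary.
        DateTime nextMonth = original.AddMonths(1);

        Console.WriteLine(original.ToString("yyyy-MM-dd"));    // still 2005-01-31
        Console.WriteLine(nextMonth.ToString("yyyy-MM-dd"));   // 2005-02-28
        Console.WriteLine(OneWeekBefore(original).ToString("yyyy-MM-dd"));
    }
}
```

Because the result is a new instance, a call such as original.AddDays(1) on its own line has no effect unless you assign the return value.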
The following example demonstrates the use of operators to manipulate the DateTime and TimeSpan structures:
    using System;

    namespace Apress.VisualCSharpRecipes.Chapter02
    {
        class Recipe02_08
        {
            public static void Main()
            {
                // Create a TimeSpan representing 2.5 days.
                TimeSpan timespan1 = new TimeSpan(2, 12, 0, 0);

                // Create a TimeSpan representing 4.5 days.
                TimeSpan timespan2 = new TimeSpan(4, 12, 0, 0);

                // Create a TimeSpan representing 3.2 days
                // using the static convenience method.
                TimeSpan timespan3 = TimeSpan.FromDays(3.2);

                // Create a TimeSpan representing 1 week.
                TimeSpan oneWeek = timespan1 + timespan2;

                // Create a DateTime with the current date and time.
                DateTime now = DateTime.Now;

                // Create a DateTime representing 1 week ago.
                DateTime past = now - oneWeek;

                // Create a DateTime representing 1 week in the future.
                DateTime future = now + oneWeek;

                // Display the DateTime instances.
                Console.WriteLine("Now   : {0}", now);
                Console.WriteLine("Past  : {0}", past);
                Console.WriteLine("Future: {0}", future);

                // Use the comparison operators.
                Console.WriteLine("Now is greater than past: {0}", now > past);
                Console.WriteLine("Now is equal to future: {0}", now == future);

                // Wait to continue.
                Console.WriteLine("\nMain method complete. Press Enter");
                Console.ReadLine();
            }
        }
    }
Use the static System.Linq.Enumerable.OrderBy method to sort generic collections and arrays. For other collections, use the Cast method to convert to a generic collection and then use Enumerable.OrderBy. Use ArrayList.Sort for ArrayList objects.
The static Enumerable.OrderBy method takes an implementation of the IEnumerable<T> interface and a function delegate (which can be a lambda expression). The generic collection classes all implement IEnumerable<T>, so they, as well as arrays, can be sorted. The function delegate allows you to specify which property or method will be used to sort the data. Its parameter is a data element from the collection or array, and its return value is the key that represents that element in the sort operation. So, for example, if you wish to sort a collection of MyType instances using the myProperty property for sorting, you would call
    List<MyType> list = new List<MyType>();
    Enumerable.OrderBy(list, x => x.myProperty);
Enumerable.OrderBy returns an instance of IOrderedEnumerable<T>, which you can use to enumerate the sorted data (for example, in a foreach loop) or use to create a new sorted collection by calling the ToArray, ToDictionary, or ToList method. The source collection itself is not modified.
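Because OrderBy is an extension method, it can also be called directly on the collection. The following sketch, which assumes a hypothetical SortByLength helper and sample names, materializes the sorted sequence with ToList and shows that the source list keeps its original order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderBySketch
{
    // Hypothetical helper: returns a new list sorted by string length.
    public static List<string> SortByLength(IEnumerable<string> names)
    {
        // Extension-method form of Enumerable.OrderBy(names, n => n.Length).
        return names.OrderBy(n => n.Length).ToList();
    }

    public static void Main()
    {
        List<string> names = new List<string> { "Michael", "Kate", "Andrea", "Angus" };
        List<string> sorted = SortByLength(names);

        Console.WriteLine(string.Join(", ", sorted));   // Kate, Angus, Andrea, Michael
        Console.WriteLine(names[0]);                    // Michael: the source is unchanged
    }
}
```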
Nongeneric collections (those that are created without the <type> syntax) must be converted to generic collections using the Cast<> method. You must either ensure that all of the items in your collection are of the type specified or use Cast<object>() to obtain a collection that will work with any type that is contained.
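A short sketch of the Cast<> approach follows, using the nongeneric System.Collections.Queue as the source. The SortStrings helper and the choice of an ordinal string comparer are assumptions for illustration.

```csharp
using System;
using System.Collections;
using System.Linq;

public static class CastSketch
{
    // Hypothetical helper: sorts the strings held in any nongeneric collection.
    public static string[] SortStrings(IEnumerable source)
    {
        // Cast<string>() throws InvalidCastException during enumeration
        // if an element is not a string.
        return source.Cast<string>()
                     .OrderBy(s => s, StringComparer.Ordinal)
                     .ToArray();
    }

    public static void Main()
    {
        Queue queue = new Queue();   // nongeneric System.Collections.Queue
        queue.Enqueue("Michael");
        queue.Enqueue("Kate");
        queue.Enqueue("Andrea");
        queue.Enqueue("Angus");

        foreach (string name in SortStrings(queue))
        {
            Console.WriteLine(name);
        }
    }
}
```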
The ArrayList collection is also nongeneric, so it cannot be passed to OrderBy directly. Rather than converting it with Cast<>, you can sort an ArrayList in place by calling its ArrayList.Sort() method, which is usually the simpler choice.
The following example demonstrates how to sort an array, a generic List, and an ArrayList:
    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Linq;

    namespace Apress.VisualCSharpRecipes.Chapter02
    {
        class Recipe02_09
        {
            public static void Main()
            {
                // Create a new array and populate it.
                int[] array = { 4, 2, 9, 3 };

                // Create a new, sorted array.
                array = Enumerable.OrderBy(array, e => e).ToArray<int>();

                // Display the contents of the sorted array.
                foreach (int i in array)
                {
                    Console.WriteLine(i);
                }

                // Create a list and populate it.
                List<string> list = new List<string>();
                list.Add("Michael");
                list.Add("Kate");
                list.Add("Andrea");
                list.Add("Angus");

                // Enumerate the sorted contents of the list.
                Console.WriteLine("\nList sorted by content");
                foreach (string person in Enumerable.OrderBy(list, e => e))
                {
                    Console.WriteLine(person);
                }

                // Sort and enumerate based on a property.
                Console.WriteLine("\nList sorted by length property");
                foreach (string person in Enumerable.OrderBy(list, e => e.Length))
                {
                    Console.WriteLine(person);
                }

                // Create a new ArrayList and populate it.
                ArrayList arraylist = new ArrayList(4);
                arraylist.Add("Michael");
                arraylist.Add("Kate");
                arraylist.Add("Andrea");
                arraylist.Add("Angus");

                // Sort the ArrayList.
                arraylist.Sort();

                // Display the contents of the sorted ArrayList.
                Console.WriteLine("\nArraylist sorted by content");
                foreach (string s in arraylist)
                {
                    Console.WriteLine(s);
                }

                // Wait to continue.
                Console.WriteLine("\nMain method complete. Press Enter");
                Console.ReadLine();
            }
        }
    }
Use the ICollection.CopyTo method implemented by all collection classes, or use the ToArray method implemented by the ArrayList, Stack, and Queue collections.
The ICollection.CopyTo method and the ToArray method perform roughly the same function: they perform a shallow copy of the elements contained in a collection to an array. The key difference is that CopyTo copies the collection's elements to an existing array, whereas ToArray creates a new array before copying the collection's elements into it.
The CopyTo method takes two arguments: an array and an index. The array is the target of the copy operation and must be of a type appropriate to handle the elements of the collection. If the types do not match, or no implicit conversion is possible from the collection element's type to the array element's type, a System.InvalidCastException exception is thrown. The index is the starting element of the array where the collection's elements will be copied. If the index is equal to or greater than the length of the array, or the number of collection elements exceeds the capacity of the array, a System.ArgumentException exception is thrown.
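The role of the index argument is easiest to see with a target array that is larger than the collection. This sketch assumes a hypothetical TryCopy helper that wraps CopyTo and reports whether the elements fit; it is an illustration rather than part of the recipe's listing.

```csharp
using System;
using System.Collections;

public static class CopyToSketch
{
    // Hypothetical helper: copies into target starting at index and reports
    // whether the elements fit.
    public static bool TryCopy(ICollection source, Array target, int index)
    {
        try
        {
            source.CopyTo(target, index);
            return true;
        }
        catch (ArgumentException)
        {
            // Raised when the elements do not fit from index onward.
            return false;
        }
    }

    public static void Main()
    {
        ArrayList list = new ArrayList { "Brenda", "George" };
        string[] target = new string[4];

        // Two elements fit in slots 2 and 3; slots 0 and 1 stay null.
        Console.WriteLine(TryCopy(list, target, 2));
        Console.WriteLine(target[2]);

        // Starting at slot 3 leaves room for only one element.
        Console.WriteLine(TryCopy(list, target, 3));
    }
}
```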
The ArrayList, Stack, and Queue classes and their generic versions also implement the ToArray method, which automatically creates an array of the correct size to accommodate a copy of all the elements of the collection. If you call ToArray with no arguments on one of the nongeneric classes, it returns an object[] array, regardless of the type of objects contained in the collection; the generic versions return an array of the collection's element type. For convenience, the ArrayList.ToArray method has an overload to which you can pass a System.Type object that specifies the type of array that the ToArray method should create. (You must still cast the returned strongly typed array to the correct type.) The layout of the array's contents depends on which collection class you are using. For example, an array produced from a Stack object will be inverted compared to the array generated by an ArrayList object.
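The difference in layout can be seen by loading the same values into a stack and a queue. The sketch below uses the generic Stack<T> and Queue<T> classes, whose ToArray methods return typed arrays; the StackOrder and QueueOrder helpers are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;

public static class ToArrayOrderSketch
{
    // Hypothetical helper: pushes the values and returns the stack as an array.
    public static int[] StackOrder(params int[] values)
    {
        Stack<int> stack = new Stack<int>();
        foreach (int value in values)
        {
            stack.Push(value);
        }
        return stack.ToArray();   // last value pushed comes first
    }

    // Hypothetical helper: enqueues the values and returns the queue as an array.
    public static int[] QueueOrder(params int[] values)
    {
        Queue<int> queue = new Queue<int>();
        foreach (int value in values)
        {
            queue.Enqueue(value);
        }
        return queue.ToArray();   // first value enqueued comes first
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", StackOrder(1, 2, 3)));   // 3, 2, 1
        Console.WriteLine(string.Join(", ", QueueOrder(1, 2, 3)));   // 1, 2, 3
    }
}
```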
This example demonstrates how to copy the contents of an ArrayList structure to an array using the CopyTo method, and then shows how to use the ToArray method on the ArrayList object.
    using System;
    using System.Collections;

    namespace Apress.VisualCSharpRecipes.Chapter02
    {
        class Recipe02_10
        {
            public static void Main()
            {
                // Create a new ArrayList and populate it.
                ArrayList list = new ArrayList(5);
                list.Add("Brenda");
                list.Add("George");
                list.Add("Justin");
                list.Add("Shaun");
                list.Add("Meaghan");

                // Create a string array and use the ICollection.CopyTo method
                // to copy the contents of the ArrayList.
                string[] array1 = new string[list.Count];
                list.CopyTo(array1, 0);

                // Use ArrayList.ToArray to create an object array from the
                // contents of the collection.
                object[] array2 = list.ToArray();

                // Use ArrayList.ToArray to create a strongly typed string
                // array from the contents of the collection.
                string[] array3 = (string[])list.ToArray(typeof(String));

                // Display the contents of the three arrays.
                Console.WriteLine("Array 1:");
                foreach (string s in array1)
                {
                    Console.WriteLine(" {0}", s);
                }

                Console.WriteLine("Array 2:");
                foreach (string s in array2)
                {
                    Console.WriteLine(" {0}", s);
                }

                Console.WriteLine("Array 3:");
                foreach (string s in array3)
                {
                    Console.WriteLine(" {0}", s);
                }

                // Wait to continue.
                Console.WriteLine("\nMain method complete. Press Enter");
                Console.ReadLine();
            }
        }
    }
You need a collection that works with elements of a specific type so that you do not need to work with System.Object references in your code.
Use the appropriate collection class from the System.Collections.Generic namespace. When you instantiate the collection, specify the type of object the collection should contain using the generics syntax.
The generics functionality makes it easy to create type-safe collections and containers (see recipe 2-12). To meet the most common requirements for collection classes, the System.Collections.Generic namespace contains a number of predefined generic collections, including the following:
Dictionary<TKey, TValue>
LinkedList<T>
List<T>
Queue<T>
Stack<T>
When you instantiate one of these collections, you specify the type of object that the collection will contain by including the type name in angled brackets after the collection name; for example, List<System.Reflection.AssemblyName>. As a result, all members that add objects to the collection expect the objects to be of the specified type, and all members that return objects from the collection will return object references of the specified type. Using strongly typed collections and working directly with objects of the desired type simplifies development and reduces the errors that can occur when working with general Object references and casting them to the desired type.
The following example demonstrates the use of generic collections to create a variety of collections specifically for the management of AssemblyName objects. Notice that you never need to cast to or from the Object type.
    using System;
    using System.Reflection;
    using System.Collections.Generic;

    namespace Apress.VisualCSharpRecipes.Chapter02
    {
        class Recipe02_11
        {
            public static void Main(string[] args)
            {
                // Create an AssemblyName object for use during the example.
                AssemblyName assembly1 = new AssemblyName("com.microsoft.crypto, " +
                    "Culture=en, PublicKeyToken=a5d015c7d5a0b012, Version=1.0.0.0");

                // Create and use a Dictionary of AssemblyName objects.
                Dictionary<string, AssemblyName> assemblyDictionary =
                    new Dictionary<string, AssemblyName>();
                assemblyDictionary.Add("Crypto", assembly1);
                AssemblyName a1 = assemblyDictionary["Crypto"];
                Console.WriteLine("Got AssemblyName from dictionary: {0}", a1);

                // Create and use a List of AssemblyName objects.
                List<AssemblyName> assemblyList = new List<AssemblyName>();
                assemblyList.Add(assembly1);
                AssemblyName a2 = assemblyList[0];
                Console.WriteLine("\nFound AssemblyName in list: {0}", a2);

                // Create and use a Stack of AssemblyName objects.
                Stack<AssemblyName> assemblyStack = new Stack<AssemblyName>();
                assemblyStack.Push(assembly1);
                AssemblyName a3 = assemblyStack.Pop();
                Console.WriteLine("\nPopped AssemblyName from stack: {0}", a3);

                // Wait to continue.
                Console.WriteLine("\nMain method complete. Press Enter");
                Console.ReadLine();
            }
        }
    }
You need to create a new general-purpose type such as a collection or container that supports strong typing of the elements it contains.
You can leverage the generics capabilities of the .NET Framework in any class you define. This allows you to create general-purpose classes that can be used as type-safe instances by other programmers. When you declare your type, you identify it as a generic type by following the type name with a pair of angled brackets that contain a list of identifiers for the types used in the class. Here is an example:
public class MyGenericType<T1, T2, T3>
This declaration specifies a new class named MyGenericType, which uses three generic types in its implementation (T1, T2, and T3). When implementing the type, you substitute the generic type names into the code instead of using specific type names. For example, one method might take an argument of type T1 and return a result of type T2, as shown here:
public T2 MyGenericMethod(T1 arg)
When other people use your class and create an instance of it, they specify the actual types to use as part of the instantiation. Here is an example:
MyGenericType<string,Stream,string> obj = new MyGenericType<string,Stream,string>();
The types specified replace T1, T2, and T3 throughout the implementation, so with this instance, MyGenericMethod would actually be interpreted as follows:
public Stream MyGenericMethod(string arg)
You can also include constraints as part of your generic type definition. This allows you to make specifications such as the following:
Only value types or only reference types can be used with the generic type.
Only types that implement a default (empty) constructor can be used with the generic type.
Only types that implement a specific interface can be used with the generic type.
Only types that inherit from a specific base class can be used with the generic type.
One generic type must be the same as, or derive from, another generic type (for example, T3 must be or derive from T1).
For example, to specify that T1 must implement the System.IDisposable interface and provide a default constructor, that T2 must be or derive from the System.IO.Stream class, and that T3 must be or derive from T1, change the definition of MyGenericType as follows:
    public class MyGenericType<T1, T2, T3>
        where T1 : System.IDisposable, new()
        where T2 : System.IO.Stream
        where T3 : T1
    {
        /* ...Implementation... */
    }
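Constraints matter because they determine what the implementation is allowed to do with the generic type. The following sketch uses a hypothetical Scoped<T> class (not part of the recipe) whose constraints permit it to construct a T with new T() and to dispose of it with a using statement.

```csharp
using System;
using System.IO;

// Hypothetical class for illustration: T must be disposable and must have
// a public parameterless constructor.
public class Scoped<T> where T : IDisposable, new()
{
    // Creates a T, passes it to the supplied function, and disposes of it.
    public TResult Run<TResult>(Func<T, TResult> work)
    {
        using (T resource = new T())   // allowed by the new() and IDisposable constraints
        {
            return work(resource);
        }
    }
}

public static class ConstraintSketch
{
    // Writes one byte to a temporary MemoryStream and returns its length.
    public static long WriteOneByte()
    {
        Scoped<MemoryStream> scoped = new Scoped<MemoryStream>();
        return scoped.Run(stream =>
        {
            stream.WriteByte(42);
            return stream.Length;
        });
    }

    public static void Main()
    {
        Console.WriteLine(WriteOneByte());   // 1

        // Scoped<string> would not compile: string does not implement
        // IDisposable and has no public parameterless constructor.
    }
}
```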
The following example demonstrates a simplified bag implementation that returns those objects put into it at random. A bag is a data structure that can contain zero or more items, including duplicates of items, but does not guarantee any ordering of the items it contains.
    using System;
    using System.Collections.Generic;

    namespace Apress.VisualCSharpRecipes.Chapter02
    {
        public class Bag<T>
        {
            // A List to hold the bag's contents. The list must be
            // of the same type as the bag.
            private List<T> items = new List<T>();

            // A single random number generator shared by all calls to
            // Remove. Creating a new Random for each call can repeat
            // values when the calls occur in quick succession.
            private Random random = new Random();

            // A method to add an item to the bag.
            public void Add(T item)
            {
                items.Add(item);
            }

            // A method to get a random item from the bag.
            public T Remove()
            {
                T item = default(T);

                if (items.Count != 0)
                {
                    // Determine which item to remove from the bag.
                    int num = random.Next(0, items.Count);

                    // Remove the item.
                    item = items[num];
                    items.RemoveAt(num);
                }
                return item;
            }

            // A method to provide an enumerator from the underlying list.
            public IEnumerator<T> GetEnumerator()
            {
                return items.GetEnumerator();
            }

            // A method to remove all items from the bag and return them
            // as an array.
            public T[] RemoveAll()
            {
                T[] i = items.ToArray();
                items.Clear();
                return i;
            }
        }

        public class Recipe02_12
        {
            public static void Main(string[] args)
            {
                // Create a new bag of strings.
                Bag<string> bag = new Bag<string>();

                // Add strings to the bag.
                bag.Add("Darryl");
                bag.Add("Bodders");
                bag.Add("Gary");
                bag.Add("Mike");
                bag.Add("Nigel");
                bag.Add("Ian");

                Console.WriteLine("Bag contents are:");
                foreach (string elem in bag)
                {
                    Console.WriteLine("Element: {0}", elem);
                }

                // Take four strings from the bag and display.
                Console.WriteLine("\nRemoving individual elements");
                Console.WriteLine("Removing = {0}", bag.Remove());
                Console.WriteLine("Removing = {0}", bag.Remove());
                Console.WriteLine("Removing = {0}", bag.Remove());
                Console.WriteLine("Removing = {0}", bag.Remove());

                Console.WriteLine("\nBag contents are:");
                foreach (string elem in bag)
                {
                    Console.WriteLine("Element: {0}", elem);
                }

                // Remove the remaining items from the bag.
                Console.WriteLine("\nRemoving all elements");
                string[] s = bag.RemoveAll();

                Console.WriteLine("\nBag contents are:");
                foreach (string elem in bag)
                {
                    Console.WriteLine("Element: {0}", elem);
                }

                // Wait to continue.
                Console.WriteLine("\nMain method complete. Press Enter");
                Console.ReadLine();
            }
        }
    }
Use a formatter to serialize the object and write it to a System.IO.FileStream object. When you need to retrieve the object, use the same type of formatter to read the serialized data from the file and deserialize the object. The .NET Framework class library includes the following formatter implementations for serializing objects to binary or SOAP format:
System.Runtime.Serialization.Formatters.Binary.BinaryFormatter
System.Runtime.Serialization.Formatters.Soap.SoapFormatter
Using the BinaryFormatter and SoapFormatter classes, you can serialize an instance of any serializable type. (See recipe 13-1 for details on how to make a type serializable.) The BinaryFormatter class produces a binary data stream representing the object and its state. The SoapFormatter class produces a SOAP document.
Both the BinaryFormatter and SoapFormatter classes implement the interface System.Runtime.Serialization.IFormatter, which defines two methods: Serialize and Deserialize. The Serialize method takes a System.IO.Stream reference and a System.Object reference as arguments, serializes the Object, and writes it to the Stream. The Deserialize method takes a Stream reference as an argument, reads the serialized object data from the Stream, and returns an Object reference to a deserialized object. You must cast the returned Object reference to the correct type.
You will need to reference the System.Runtime.Serialization.Formatters.Soap assembly in order to use SoapFormatter. The BinaryFormatter class is contained in the core assembly and requires no additional project references.
The example shown here demonstrates the use of both BinaryFormatter and SoapFormatter to serialize a System.Collections.ArrayList object containing a list of people to a file. The ArrayList object is then deserialized from the files and the contents displayed to the console.
    using System;
    using System.IO;
    using System.Collections;
    using System.Runtime.Serialization.Formatters.Soap;
    using System.Runtime.Serialization.Formatters.Binary;

    namespace Apress.VisualCSharpRecipes.Chapter02
    {
        class Recipe02_13
        {
            // Serialize an ArrayList object to a binary file.
            private static void BinarySerialize(ArrayList list)
            {
                using (FileStream str = File.Create("people.bin"))
                {
                    BinaryFormatter bf = new BinaryFormatter();
                    bf.Serialize(str, list);
                }
            }

            // Deserialize an ArrayList object from a binary file.
            private static ArrayList BinaryDeserialize()
            {
                ArrayList people = null;

                using (FileStream str = File.OpenRead("people.bin"))
                {
                    BinaryFormatter bf = new BinaryFormatter();
                    people = (ArrayList)bf.Deserialize(str);
                }
                return people;
            }

            // Serialize an ArrayList object to a SOAP file.
            private static void SoapSerialize(ArrayList list)
            {
                using (FileStream str = File.Create("people.soap"))
                {
                    SoapFormatter sf = new SoapFormatter();
                    sf.Serialize(str, list);
                }
            }

            // Deserialize an ArrayList object from a SOAP file.
            private static ArrayList SoapDeserialize()
            {
                ArrayList people = null;

                using (FileStream str = File.OpenRead("people.soap"))
                {
                    SoapFormatter sf = new SoapFormatter();
                    people = (ArrayList)sf.Deserialize(str);
                }
                return people;
            }

            public static void Main()
            {
                // Create and configure the ArrayList to serialize.
                ArrayList people = new ArrayList();
                people.Add("Graeme");
                people.Add("Lin");
                people.Add("Andy");

                // Serialize the list to a file in both binary and SOAP form.
                BinarySerialize(people);
                SoapSerialize(people);

                // Rebuild the lists of people from the binary and SOAP
                // serializations and display them to the console.
                ArrayList binaryPeople = BinaryDeserialize();
                ArrayList soapPeople = SoapDeserialize();

                Console.WriteLine("Binary people:");
                foreach (string s in binaryPeople)
                {
                    Console.WriteLine(" " + s);
                }

                Console.WriteLine("\nSOAP people:");
                foreach (string s in soapPeople)
                {
                    Console.WriteLine(" " + s);
                }

                // Wait to continue.
                Console.WriteLine("\nMain method complete. Press Enter");
                Console.ReadLine();
            }
        }
    }
The SOAP file that the example produces is shown here. The binary file is not human-readable.
    <SOAP-ENV:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
        xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
        xmlns:clr="http://schemas.microsoft.com/soap/encoding/clr/1.0"
        SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
      <SOAP-ENV:Body>
        <a1:ArrayList id="ref-1"
            xmlns:a1="http://schemas.microsoft.com/clr/ns/System.Collections">
          <_items href="#ref-2"/>
          <_size>3</_size>
          <_version>3</_version>
        </a1:ArrayList>
        <SOAP-ENC:Array id="ref-2" SOAP-ENC:arrayType="xsd:anyType[4]">
          <item id="ref-3" xsi:type="SOAP-ENC:string">Graeme</item>
          <item id="ref-4" xsi:type="SOAP-ENC:string">Lin</item>
          <item id="ref-5" xsi:type="SOAP-ENC:string">Andy</item>
        </SOAP-ENC:Array>
      </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>
Create a Stream that either writes to the destination you wish to serialize to or is the source of the data you wish to deserialize from. Create an instance of DataContractJsonSerializer, using the type of the object that you wish to serialize or deserialize as the constructor argument. Call WriteObject (to serialize) or ReadObject (to deserialize) using the object you wish to process as a method argument.
The DataContractJsonSerializer class is part of the wider .NET data contract support, which allows you to create a formal contract between a client and a service about the way in which data will be exchanged. For our purposes, we need only know that Microsoft has included data contract support for most .NET data types (including collections), allowing easy serialization to and from JSON.
You will need to reference the System.ServiceModel.Web and System.Runtime.Serialization assemblies in order to use DataContractJsonSerializer.
When creating an instance of DataContractJsonSerializer, you must supply the type of the object that you are going to serialize or deserialize as a constructor argument; you can obtain this by calling the GetType method on any object. To serialize an object, call the WriteObject method using the object you wish to serialize and the Stream you wish to serialize it to as method arguments. The WriteObject method will throw an exception if you try to serialize an object that does not match the type you used in the constructor.
To deserialize an object, call the ReadObject method using a Stream that contains the JSON data you wish to process. If you have received the JSON data as a string, you can use the MemoryStream class (see the code following for an illustration of this technique). The ReadObject method returns an object, and so you must cast to your target type.
To serialize a data type that you have created, use the [Serializable] annotation as follows:
    [Serializable]
    class MyJSONType
    {
        public string myFirstProperty { get; set; }
        public string mySecondProperty { get; set; }
    }
Using [Serializable] will serialize all of the members of your class. If you wish to be selective about which members are included in the JSON data, then use the [DataContract] annotation at the class level, and mark each member you wish to be included with the [DataMember] annotation, as follows:
    [DataContract]
    class MyJSONType
    {
        [DataMember]
        public string myFirstProperty { get; set; }
        public string mySecondProperty { get; set; }
    }
For the simple class shown, this will result in the myFirstProperty member being included in the JSON output and mySecondProperty excluded.
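The effect of [DataMember] can be checked by serializing an instance and inspecting the resulting text. The sketch below reuses the MyJSONType class from the text (made public here so that it can be shared); the ToJson helper and the sample property values are assumptions for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;

[DataContract]
public class MyJSONType
{
    [DataMember]
    public string myFirstProperty { get; set; }

    // Not marked with [DataMember], so it is left out of the JSON.
    public string mySecondProperty { get; set; }
}

public static class JsonContractSketch
{
    // Hypothetical helper: serializes the object and returns the JSON text.
    public static string ToJson(MyJSONType value)
    {
        DataContractJsonSerializer serializer =
            new DataContractJsonSerializer(typeof(MyJSONType));

        using (MemoryStream stream = new MemoryStream())
        {
            serializer.WriteObject(stream, value);
            // The serializer writes UTF-8 by default.
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public static void Main()
    {
        MyJSONType item = new MyJSONType();
        item.myFirstProperty = "first";
        item.mySecondProperty = "second";

        // Only myFirstProperty appears in the output.
        Console.WriteLine(ToJson(item));
    }
}
```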
The following example serializes a List of strings using a MemoryStream, prints out the resulting JSON, and then deserializes the List in order to print out the contents.
    using System;
    using System.Collections.Generic;
    using System.Runtime.Serialization.Json;
    using System.IO;
    using System.Text;

    namespace Apress.VisualCSharpRecipes.Chapter02
    {
        class Recipe02_14
        {
            public static void Main()
            {
                // Create a list of strings.
                List<string> myList = new List<string>()
                {
                    "apple", "orange", "banana", "cherry"
                };

                // Create memory stream - we will use this
                // to get the JSON serialization as a string.
                MemoryStream memoryStream = new MemoryStream();

                // Create the JSON serializer.
                DataContractJsonSerializer jsonSerializer
                    = new DataContractJsonSerializer(myList.GetType());

                // Serialize the list.
                jsonSerializer.WriteObject(memoryStream, myList);

                // Get the JSON string from the memory stream. The
                // serializer writes UTF-8, so decode with the same encoding.
                string jsonString
                    = Encoding.UTF8.GetString(memoryStream.ToArray());

                // Write the string to the console.
                Console.WriteLine(jsonString);

                // Create a new stream so we can read the JSON data.
                memoryStream
                    = new MemoryStream(Encoding.UTF8.GetBytes(jsonString));

                // Deserialize the list.
                myList = jsonSerializer.ReadObject(memoryStream) as List<string>;

                // Enumerate the strings in the list.
                foreach (string strValue in myList)
                {
                    Console.WriteLine(strValue);
                }

                // Wait to continue.
                Console.WriteLine("\nMain method complete. Press Enter");
                Console.ReadLine();
            }
        }
    }
Use the Read or ReadLine method of the System.Console class to read input when the user presses Enter. To read input without requiring the user to press Enter, use the Console.ReadKey method.
The simplest way to read input from the console is to use the static Read or ReadLine methods of the Console class. Both methods block your application until the user enters input and presses Enter, and in both cases the user sees the typed characters in the console. Once the user presses Enter, the Read method returns an int value representing the next character of input data, or -1 if no more data is available. The ReadLine method returns a string containing all the data entered, or an empty string if no data was entered.
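The return values described above can be observed without typing anything by redirecting Console.In to a StringReader. This is a sketch under that assumption: the simulated input and the names ConsoleReadSketch and Describe are hypothetical, and a real console session would block until the user presses Enter.

```csharp
using System;
using System.IO;

public static class ConsoleReadSketch
{
    // Feeds simulatedInput to the console's input stream and reports what
    // two ReadLine calls and one Read call return.
    public static string Describe(string simulatedInput)
    {
        TextReader originalInput = Console.In;
        try
        {
            // Assumption: a StringReader stands in for the keyboard.
            Console.SetIn(new StringReader(simulatedInput));

            string firstLine = Console.ReadLine();   // text up to the first newline
            string secondLine = Console.ReadLine();  // empty line gives an empty string
            int nextChar = Console.Read();           // -1 once the input is exhausted

            return string.Format("[{0}] [{1}] {2}", firstLine, secondLine, nextChar);
        }
        finally
        {
            // Restore the original input stream.
            Console.SetIn(originalInput);
        }
    }

    public static void Main()
    {
        // One line of text followed by an empty line.
        Console.WriteLine(Describe("hi\n\n")); // prints [hi] [] -1
    }
}
```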
The .NET Framework includes the Console.ReadKey method, which provides a way to read input from the console without waiting for the user to press Enter. The ReadKey method waits for the user to press a key and returns a System.ConsoleKeyInfo object to the caller. By passing true as an argument to an overload of the ReadKey method, you can also prevent the key pressed by the user from being echoed to the console.
The returned ConsoleKeyInfo object contains details about the key pressed. The details are accessible through the properties of the ConsoleKeyInfo class (summarized in Table 2-5).
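Table 2-5. Properties of the ConsoleKeyInfo Class
Property | Description |
---|---|
Key | Gets a value of the System.ConsoleKey enumeration identifying the key pressed. |
KeyChar | Gets a char value containing the Unicode character for the key pressed. |
Modifiers | Gets a bitwise combination of values from the System.ConsoleModifiers enumeration indicating whether Alt, Ctrl, or Shift was held down. |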
The KeyAvailable property of the Console class returns a bool value indicating whether a key press is waiting in the input buffer, which lets you check for input without blocking your code.
The following example reads input from the console one character at a time using the ReadKey method. If the user presses F1, the program toggles in and out of "secret" mode, where input is masked by asterisks. When the user presses Esc, the console is cleared and the input the user has entered is displayed. If the user presses Alt+X or Alt+x, the example terminates.
using System;
using System.Collections.Generic;

namespace Apress.VisualCSharpRecipes.Chapter02
{
    class Recipe02_15
    {
        public static void Main()
        {
            // Local variable to hold the key entered by the user.
            ConsoleKeyInfo key;

            // Control whether character or asterisk is displayed.
            bool secret = false;

            // Character List for the user data entered.
            List<char> input = new List<char>();

            string msg = "Enter characters and press Escape to see input." +
                " Press F1 to enter/exit Secret mode and Alt-X to exit.";
            Console.WriteLine(msg);

            // Process input until the user enters "Alt+X" or "Alt+x".
            do
            {
                // Read a key from the console. Intercept the key so that it is not
                // displayed to the console. What is displayed is determined later
                // depending on whether the program is in secret mode.
                key = Console.ReadKey(true);

                // Switch secret mode on and off.
                if (key.Key == ConsoleKey.F1)
                {
                    secret = !secret;
                }

                // Handle Backspace.
                if (key.Key == ConsoleKey.Backspace)
                {
                    if (input.Count > 0)
                    {
                        // Backspace pressed, remove the last character.
                        input.RemoveAt(input.Count - 1);
                        Console.Write(key.KeyChar);
                        Console.Write(" ");
                        Console.Write(key.KeyChar);
                    }
                }
                // Handle Escape.
                else if (key.Key == ConsoleKey.Escape)
                {
                    Console.Clear();
                    Console.WriteLine("Input: {0} ", new String(input.ToArray()));
                    Console.WriteLine(msg);
                    input.Clear();
                }
                // Handle character input.
                else if (key.Key >= ConsoleKey.A && key.Key <= ConsoleKey.Z)
                {
                    input.Add(key.KeyChar);
                    if (secret)
                    {
                        Console.Write("*");
                    }
                    else
                    {
                        Console.Write(key.KeyChar);
                    }
                }
            } while (key.Key != ConsoleKey.X
                || key.Modifiers != ConsoleModifiers.Alt);

            // Wait to continue.
            Console.WriteLine(" Main method complete. Press Enter");
            Console.ReadLine();
        }
    }
}
Numeric types in the .NET Framework have maximum and minimum values determined by how much memory the data type occupies. The System.Numerics.BigInteger structure has no such limits and can be used to perform operations on very large integer values.
Instances of BigInteger are immutable. You perform operations using the static methods of the BigInteger structure (or its overloaded arithmetic operators), each of which returns a new instance of BigInteger as the result; see the code in this recipe for an example.
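As a minimal sketch of that immutability, the following code adds 1 to Int64.MaxValue in two ways and shows that the original value is left untouched. The class and method names (BigIntegerImmutabilitySketch, Run) are hypothetical.

```csharp
using System;
using System.Numerics;

public static class BigIntegerImmutabilitySketch
{
    // Returns the original value and the two results as strings.
    public static string[] Run()
    {
        BigInteger original = Int64.MaxValue;

        // The static method returns a new BigInteger; original is not modified.
        BigInteger viaMethod = BigInteger.Add(original, 1);

        // The overloaded + operator behaves the same way.
        BigInteger viaOperator = original + 1;

        return new[]
        {
            original.ToString(),
            viaMethod.ToString(),
            viaOperator.ToString()
        };
    }

    public static void Main()
    {
        string[] values = Run();
        Console.WriteLine("Original:     {0}", values[0]); // 9223372036854775807
        Console.WriteLine("Via method:   {0}", values[1]); // 9223372036854775808
        Console.WriteLine("Via operator: {0}", values[2]); // 9223372036854775808
    }
}
```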
The following example creates a BigInteger with a value that is twice the maximum value of Int64 and then adds another Int64.MaxValue.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Numerics;

namespace Apress.VisualCSharpRecipes.Chapter02
{
    class Recipe2_16
    {
        static void Main(string[] args)
        {
            // Create a new big integer.
            BigInteger myBigInt = BigInteger.Multiply(Int64.MaxValue, 2);

            // Add another value.
            myBigInt = BigInteger.Add(myBigInt, Int64.MaxValue);

            // Print out the value.
            Console.WriteLine("Big Integer Value: {0}", myBigInt);

            // Wait to continue.
            Console.WriteLine(" Main method complete. Press Enter");
            Console.ReadLine();
        }
    }
}
LINQ allows you to select elements from a collection based on characteristics of the elements contained within the collection. The basic sequence for querying a collection or array using LINQ is as follows:
Start a new LINQ query using the from keyword.
Identify the conditions to use in selecting elements with the where keyword.
Specify the way you want the results to be sorted using the orderby keyword.
Indicate what value will be added to the result set from each matching element using the select keyword, which ends the query.
The output of a LINQ query is an instance of IEnumerable<T> containing the values selected from the collection/array elements that meet your search criteria. You can walk through the results using a foreach loop, or use them as the data source for further LINQ queries. The following is an example of a LINQ query against a string array: we select the first character of any entry longer than four characters and order the results based on length:
IEnumerable<char> linqResult = from e in stringArray where e.Length > 4 orderby e.Length select e[0];
Queries can also be written using lambda expressions and the LINQ extension methods available on collection classes and arrays. The preceding query would be as follows with lambda expressions:
IEnumerable<char> linqResult = stringArray.Where(e => e.Length > 4).OrderBy(e => e.Length).Select(e => e[0]);
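To see that the keyword form and the lambda form select the same values, the sketch below runs both against a sample array. The array contents and the names LinqSyntaxSketch, QuerySyntax, and MethodSyntax are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqSyntaxSketch
{
    // The query written with LINQ keywords.
    public static string QuerySyntax(string[] stringArray)
    {
        IEnumerable<char> linqResult = from e in stringArray
                                       where e.Length > 4
                                       orderby e.Length
                                       select e[0];
        return new string(linqResult.ToArray());
    }

    // The same query written with lambda expressions.
    public static string MethodSyntax(string[] stringArray)
    {
        IEnumerable<char> linqResult = stringArray
            .Where(e => e.Length > 4)
            .OrderBy(e => e.Length)
            .Select(e => e[0]);
        return new string(linqResult.ToArray());
    }

    public static void Main()
    {
        // Sample data: three entries are longer than four characters.
        string[] stringArray = { "banana", "fig", "apple", "kiwi", "cranberry" };

        Console.WriteLine(QuerySyntax(stringArray));  // abc
        Console.WriteLine(MethodSyntax(stringArray)); // abc
    }
}
```

Both forms produce the first letters of "apple", "banana", and "cranberry", in order of increasing length.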
For large collections and arrays, you can use Parallel LINQ (PLINQ), which will partition your query and use multiple threads to process the data in parallel. You enable PLINQ by calling the AsParallel method on your collection or array, for example:
IEnumerable<char> linqResult = from e in stringArray.AsParallel() where e.Length > 4 orderby e.Length select e[0];
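The PLINQ query above can be wrapped in a small runnable sketch. The sample array and the names PlinqSketch and FirstLetters are hypothetical; the entries are given distinct lengths so that the orderby clause fully determines the output order. For an array this small, the parallel version is unlikely to be faster than the sequential one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PlinqSketch
{
    // Runs the query in parallel and returns the selected characters.
    public static string FirstLetters(string[] stringArray)
    {
        IEnumerable<char> linqResult = from e in stringArray.AsParallel()
                                       where e.Length > 4
                                       orderby e.Length
                                       select e[0];
        return new string(linqResult.ToArray());
    }

    public static void Main()
    {
        // Sample data with distinct lengths for the matching entries.
        string[] stringArray = { "banana", "fig", "apple", "kiwi", "cranberry" };

        Console.WriteLine(FirstLetters(stringArray)); // abc
    }
}
```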
LINQ is a rich and flexible feature and provides additional keywords to specify more complex queries—see Chapter 16 for further LINQ recipes.
The following example defines a class Fruit, which has properties for the name and color of a type of fruit. A List is created and populated with fruits, which are then used as the basis of a LINQ query. The query is performed using keywords and then repeated using lambda expressions.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace Apress.VisualCSharpRecipes.Chapter02
{
    class Recipe02_17
    {
        static void Main(string[] args)
        {
            // Create a list of fruit.
            List<Fruit> myList = new List<Fruit>()
            {
                new Fruit("apple", "green"),
                new Fruit("orange", "orange"),
                new Fruit("banana", "yellow"),
                new Fruit("mango", "yellow"),
                new Fruit("cherry", "red"),
                new Fruit("fig", "brown"),
                new Fruit("cranberry", "red"),
                new Fruit("pear", "green")
            };

            // Select the names of fruits that aren't red and whose names
            // do not start with the letter "c".
            IEnumerable<string> myResult = from e in myList
                                           where e.Color != "red" && e.Name[0] != 'c'
                                           orderby e.Name
                                           select e.Name;

            // Write out the results.
            foreach (string result in myResult)
            {
                Console.WriteLine("Result: {0}", result);
            }

            // Perform the same query using lambda expressions.
            myResult = myList
                .Where(e => e.Color != "red" && e.Name[0] != 'c')
                .OrderBy(e => e.Name)
                .Select(e => e.Name);

            // Write out the results.
            foreach (string result in myResult)
            {
                Console.WriteLine("Lambda Result: {0}", result);
            }

            // Wait to continue.
            Console.WriteLine(" Main method complete. Press Enter");
            Console.ReadLine();
        }
    }

    class Fruit
    {
        public Fruit(string nameVal, string colorVal)
        {
            Name = nameVal;
            Color = colorVal;
        }

        public string Name { get; set; }
        public string Color { get; set; }
    }
}
The Distinct method is part of the LINQ feature of the .NET Framework, which we used in the previous recipe to select items from a collection. The Distinct method returns an instance of IEnumerable<T>, which can be converted into an array or collection with the ToArray, ToList, and ToDictionary methods. You can provide an instance of IEqualityComparer<T> as an argument to the Distinct method in order to supply your own rules for identifying duplicates.
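As a brief sketch of supplying a comparer, the code below uses the built-in StringComparer.OrdinalIgnoreCase (which implements IEqualityComparer<string>) so that strings differing only in case count as duplicates. The sample data and the names DistinctSketch and UniqueIgnoringCase are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctSketch
{
    // Removes entries that match an earlier entry when case is ignored,
    // and converts the result to a List.
    public static List<string> UniqueIgnoringCase(IEnumerable<string> values)
    {
        return values.Distinct(StringComparer.OrdinalIgnoreCase).ToList();
    }

    public static void Main()
    {
        string[] values = { "apple", "Apple", "fig", "FIG", "pear" };

        List<string> unique = UniqueIgnoringCase(values);

        // Three entries remain: one spelling each of apple, fig, and pear.
        Console.WriteLine("Unique count: {0}", unique.Count);
        foreach (string value in unique)
        {
            Console.WriteLine(value);
        }
    }
}
```

Without the comparer argument, Distinct would use the default equality rules for string and treat "apple" and "Apple" as different entries.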
The following example removes duplicates from a List<Fruit> using a custom implementation of IEqualityComparer<Fruit> passed to the Distinct method and prints out the unique items:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace Apress.VisualCSharpRecipes.Chapter02
{
    class Recipe02_18
    {
        static void Main(string[] args)
        {
            // Create a list of fruit, including duplicates.
            List<Fruit> myList = new List<Fruit>()
            {
                new Fruit("apple", "green"),
                new Fruit("apple", "red"),
                new Fruit("orange", "orange"),
                new Fruit("orange", "orange"),
                new Fruit("banana", "yellow"),
                new Fruit("mango", "yellow"),
                new Fruit("cherry", "red"),
                new Fruit("fig", "brown"),
                new Fruit("fig", "brown"),
                new Fruit("fig", "brown"),
                new Fruit("cranberry", "red"),
                new Fruit("pear", "green")
            };

            // Use the Distinct method to remove duplicates
            // and print out the unique entries that remain.
            foreach (Fruit fruit in myList.Distinct(new FruitComparer()))
            {
                Console.WriteLine("Fruit: {0}:{1}", fruit.Name, fruit.Color);
            }

            // Wait to continue.
            Console.WriteLine(" Main method complete. Press Enter");
            Console.ReadLine();
        }
    }

    class FruitComparer : IEqualityComparer<Fruit>
    {
        // Two fruits are duplicates when both name and color match.
        public bool Equals(Fruit first, Fruit second)
        {
            return first.Name == second.Name && first.Color == second.Color;
        }

        // Combine the hash codes of the same members that Equals compares.
        public int GetHashCode(Fruit fruit)
        {
            return fruit.Name.GetHashCode() + fruit.Color.GetHashCode();
        }
    }

    class Fruit
    {
        public Fruit(string nameVal, string colorVal)
        {
            Name = nameVal;
            Color = colorVal;
        }

        public string Name { get; set; }
        public string Color { get; set; }
    }
}