What’s In This Chapter?
Storage is central to most applications, and knowing how to store and retrieve information is fundamental to building them. The Azure Storage System provides cloud-based storage for tables, blobs, and queues. This chapter explains how to use Blob storage to store, retrieve, and manage your large binary files.
Binary Large Objects (blobs) have been in the IT vocabulary for quite a while. Blobs are simply binary files. They are commonly, but not exclusively, multimedia files; they can be any type of file, including images, videos, audio, and documents. Many of the files you access on a day-to-day basis could be considered blobs. Consider the video you watch online, the file that contains the data viewed in Microsoft Excel, or even this book if you are reading it on a computer or mobile device. Microsoft SQL Server has supported storing blobs within a database for some time. With Azure Storage, you now have support for Blob storage in the cloud.
As part of its Storage services, Azure supports the storage and retrieval of large unstructured files using Blob storage. Therefore, to understand blobs and Blob storage, you must have a basic understanding of Azure Storage.
Azure Storage is simple—it’s storage in the cloud. In line with Azure’s ability to host applications in the cloud, Azure Storage makes cloud data available as a service. Azure Storage provides the features and services to store and retrieve data in the cloud where it is available anywhere, anytime.
Four unique storage options are available in Azure Storage. Each one provides a solution for different application requirements. The four storage options are as follows:
Blob storage has a hierarchy of objects that define a resource URI and security. The hierarchy is a series of “containers” that have zero or more items. To understand how to leverage Blob storage, you must first understand how it uses these containers to define resource URIs and manage access. Figure 10-1 shows an example of the basic Blob storage container hierarchy, starting with the Storage account, which is the topmost container.
Storage accounts are associated with an Azure subscription and provide a security and access boundary separating the contained storage objects from other Storage account objects. The name given to a specific Storage account is also part of the root URI for the various entry points into the Azure Storage Service. Storage accounts are not specific for Blob storage; a single Storage account provides services for all storage types: Blobs, Tables, and Queues.
An Azure subscription can contain zero or more Storage accounts. These accounts are created and managed using the Azure Management Portal located at https://windows.azure.com. Creating a new Storage account provisions the required storage components in Azure and the required entry points to the services in DNS. These entry points are the resource URIs that access blobs stored in Blob storage. A new Storage account requires a unique account name, which becomes part of the resource URI. The account name must be unique from all other Storage account names, have a length of 3–24 characters, and use lowercase letters and numbers. As an example, Figure 10-1 uses a generic name of “school” for the account name, which will result in a Blob storage resource URI of http://school.blob.core.windows.net. This URL requests service from our Blob storage service.
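The naming rules above can be checked locally before any request is sent. The following is a minimal sketch rather than part of the chapter's sample code; the class name StorageAccountNames and both method names are hypothetical. It cannot verify that a name is unique across all Storage accounts, because only the service knows that.

```csharp
using System;
using System.Linq;

public static class StorageAccountNames
{
    // Returns true when the name is 3 to 24 characters long and contains
    // only lowercase ASCII letters and digits, per the rules described above.
    public static bool IsValid(string name)
    {
        if (string.IsNullOrEmpty(name)) return false;
        if (name.Length < 3 || name.Length > 24) return false;
        return name.All(c => (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9'));
    }

    // Builds the Blob storage root URI for a validated account name.
    public static string GetBlobEndpoint(string accountName)
    {
        if (!IsValid(accountName))
            throw new ArgumentException("Invalid Storage account name.", "accountName");
        return "http://" + accountName + ".blob.core.windows.net";
    }
}
```

Calling StorageAccountNames.GetBlobEndpoint("school") yields the root URI used throughout this chapter's examples.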
The Storage account is also a security boundary. Each Storage account has two generated keys called Access Keys. These keys are used to generate a signed Authorization header that the service uses to validate and authorize a web request. Without knowledge of a key, a request cannot be signed and therefore cannot be validated. Two keys are provided so that one can remain in use while the other is regenerated, which allows access keys to be changed with no loss of service. Keys are managed from the Azure Management Portal.
Similar to the Storage account, containers provide namespace and security boundaries for Blob storage, and each container holds zero or more blobs. Figure 10-1 includes two containers: math101 and chem260. Container names create a unique namespace and logical organization of contained blobs, and they give the appearance of a path-like structure similar to a file system. Containers also define the security for the contained blobs, but they exist only within a Storage account and cannot exist beyond the scope of the Storage account.
The Azure Management Portal does not provide an interface to create containers; therefore, you create them programmatically with a name and optional meta data. Containers cannot be nested, and each container name within a Storage account must be unique. Container names must follow these guidelines:
The container name is included as part of the resource URI. For example, in the container math101 displayed in Figure 10-1 the resource URI would be http://school.blob.core.windows.net/math101.
Containers may have associated meta data stored as name-value pairs. Meta data is added to and retrieved from Blob storage as header values in the web request and response. The name of each meta data item must start with x-ms-meta-. For example, you could identify the primary responsible party for a container using a custom meta data item such as x-ms-meta-responsibleparty.
Containers control access to the container as a whole and to the blobs it contains. There are only two access scenarios in Blob storage:
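Because every meta data header name must carry the x-ms-meta- prefix, it can help to apply the prefix in one place. The sketch below is an illustration only; the MetadataHeaders class and its WithPrefix method are hypothetical names that do not appear in the chapter's sample code, and the value used in the usage note is made up.

```csharp
using System;
using System.Collections.Generic;

public static class MetadataHeaders
{
    private const string Prefix = "x-ms-meta-";

    // Returns a new dictionary in which every key carries the x-ms-meta-
    // prefix. Keys that already start with the prefix are left unchanged.
    public static Dictionary<string, string> WithPrefix(IDictionary<string, string> metadata)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (KeyValuePair<string, string> pair in metadata)
        {
            string name = pair.Key.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase)
                ? pair.Key
                : Prefix + pair.Key;
            result[name] = pair.Value;
        }
        return result;
    }
}
```

Passing a dictionary with the key responsibleparty and a value such as "Math Department" produces a single entry named x-ms-meta-responsibleparty, ready to be added to a request's headers.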
Access | Permissions |
Full Public Read Access | Can view container and contained blob information; can enumerate blobs in the container. |
Public Read Access for Blobs Only | Can view only blobs; cannot view container information or enumerate blobs |
Basic access control is not granular. Either you have access to the Storage account Access Keys and can properly sign the Authorization header value in the web request or you are anonymous.
Blobs themselves are unstructured text or binary files up to 1 terabyte in length, depending on the blob type. Similar to containers, blobs are created programmatically. The Azure Management Portal does not provide a user interface to create blobs. Blobs do not have a specific security setting; security is determined by the container. The type of blob, Block or Page (discussed later in this section), is determined at the time of creation and cannot be changed.
The blob name is used as part of the resource URI and must be unique within a container. It can contain any character and must have a length of 1 to 1,024 characters. Reserved URL characters in the name must be properly escaped. It is legal to include path separators such as the forward slash (/) as part of the blob name. As noted in the previous “Working with Containers” section, the storage schema is flat. There are no subcontainers to create a hierarchical path similar to a file system path. You can create a virtual path when you use the forward slash in the blob name. For example, consider the math101 container in the school Storage account. A blob named additive.avi would have a resource URI of http://school.blob.core.windows.net/math101/additive.avi.
If the blob were named /videos/additive.avi, the resource URL of the blob would be http://school.blob.core.windows.net/math101/videos/additive.avi.
Although it appears there is a nested hierarchy, that’s not the case. The container remains math101, and the name of the blob is /videos/additive.avi, as shown in Figure 10-2.
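The way a blob name with forward slashes maps onto a resource URI can be sketched in a few lines. This is a hedged illustration, not the chapter's code: the BlobUris class and GetBlobUri method are hypothetical, and the sketch assumes that a leading slash in the blob name should be trimmed so that the result matches the URI shown above. Each name segment is escaped separately so that reserved URL characters are encoded while the slashes that form the virtual path are preserved.

```csharp
using System;
using System.Linq;

public static class BlobUris
{
    // Combines account, container, and blob name into a resource URI.
    // Assumption: a leading '/' in the blob name is dropped before joining.
    public static string GetBlobUri(string account, string container, string blobName)
    {
        string escapedName = string.Join("/",
            blobName.TrimStart('/').Split('/').Select(s => Uri.EscapeDataString(s)));
        return string.Format("http://{0}.blob.core.windows.net/{1}/{2}",
            account, container, escapedName);
    }
}
```

With these assumptions, GetBlobUri("school", "math101", "/videos/additive.avi") returns http://school.blob.core.windows.net/math101/videos/additive.avi, even though the container is still math101 and no videos subcontainer exists.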
Like containers, blobs support associated meta data as name-value pairs. These are added and retrieved from Blob storage as header values associated with the web request and response. The name of the meta data must start with x-ms-meta-. For this example, consider meta data that identifies the content owner. This meta data field may be named x-ms-meta-owner and have a value of Bob.
Blobs have two means of support for managing concurrency: ETags and leases.
Block blobs are commonly used for data such as videos and documents, with each blob being a single piece of content. Block blobs support a maximum size of 200 GB. Uploading large files in an HTTP-based system can be problematic because such uploads can time out, suffer network interruptions, or become corrupted; in fact, it’s not uncommon to upload a large file only to have network connectivity issues and need to restart the upload. Block blobs avoid these common issues by managing smaller pieces of the complete file. A Block blob is composed of one or more uploaded blocks, which may differ in size and which are then committed to the system as a single blob (see Figure 10-3). The current maximum block size is 4 MB. To upload a large blob to Blob storage, the file is split into multiple blocks, and each block is uploaded separately. When all the blocks are in Blob storage, the blocks are committed and the blob is available for use. Until the blocks are committed, the blob is not available.
The process of uploading blocks and ultimately committing multiple blocks is the responsibility of the programmer or application. You can upload files smaller than 64 MB as a single operation. You must partition files larger than 64 MB into blocks and commit them in Blob storage.
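Before uploading, an application needs to know how many blocks a file will produce and how large each one is. The following sketch works that out from the file length alone. The BlockPlanner class and GetBlockSizes method are hypothetical names used for illustration, and the 4 MB constant simply mirrors the block limit cited above.

```csharp
using System;
using System.Collections.Generic;

public static class BlockPlanner
{
    // 4 MB, the maximum block size cited in the text above.
    public const int MaxBlockSize = 4 * 1024 * 1024;

    // Returns the size in bytes of each block needed to cover fileLength.
    // Every block is blockSize bytes except possibly the last, which
    // holds whatever remains.
    public static List<int> GetBlockSizes(long fileLength, int blockSize = MaxBlockSize)
    {
        if (fileLength < 0) throw new ArgumentOutOfRangeException("fileLength");
        if (blockSize <= 0 || blockSize > MaxBlockSize)
            throw new ArgumentOutOfRangeException("blockSize");

        var sizes = new List<int>();
        long remaining = fileLength;
        while (remaining > 0)
        {
            int size = (int)Math.Min(blockSize, remaining);
            sizes.Add(size);
            remaining -= size;
        }
        return sizes;
    }
}
```

For a 10 MB file the sketch yields three blocks: two of 4 MB and a final block of 2 MB.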
Page blobs support random-access reads and writes in Blob storage. They allow the application to start a read or write operation at an offset location in the file. This means faster access to the data, and a write operation does not require the application to re-upload a large file. Blob storage uses Page blobs to support Windows Azure Drives and logging operations.
Consider the difference between a common Block blob (image) and a Page blob (log file). When the image file is modified, the complete image file is manipulated and must be reloaded to Blob storage. Most image formats do not allow for an isolated byte change in the file. Log files are commonly sequential files where you target an isolated change based on an offset. There is little need to rewrite the complete file for a byte level change.
Page blobs consist of an array or indexed collection of pages. Each page is 512 bytes in size. The total size of a Page blob cannot exceed 1 terabyte and must be a multiple of 512 bytes. Applications can write to pages in the Page blob based on the 512 byte offset, and a single write request can write up to 4 MB of data. The Page blob can grow by adding pages. Unlike the Block blob, Page blobs do not require a commit request. Data written to a Page blob is immediately available at the end of the write request.
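The 512-byte page rule translates into simple arithmetic. The sketch below rounds a requested size up to a page boundary and checks whether a write range is acceptable. The PageBlobMath class and its members are hypothetical, and the check assumes that a write must start on a page boundary and cover whole pages, which goes slightly beyond what the text above states.

```csharp
using System;

public static class PageBlobMath
{
    public const int PageSize = 512;                  // bytes per page
    public const int MaxWriteSize = 4 * 1024 * 1024;  // 4 MB per write request

    // Rounds a byte count up to the next multiple of 512.
    public static long RoundUpToPage(long length)
    {
        if (length < 0) throw new ArgumentOutOfRangeException("length");
        return (length + PageSize - 1) / PageSize * PageSize;
    }

    // True when the write starts on a page boundary, covers whole pages,
    // and does not exceed the 4 MB limit for a single request.
    public static bool IsValidWrite(long offset, long count)
    {
        return offset >= 0
            && count > 0
            && count <= MaxWriteSize
            && offset % PageSize == 0
            && count % PageSize == 0;
    }
}
```

For example, a Page blob intended to hold 1,000 bytes would be sized at RoundUpToPage(1000), which is 1,024 bytes, or two pages.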
Armed with a basic understanding of Azure Storage and Blob storage, you can now leverage Blob storage as a cloud-based data store. This section focuses on how to programmatically work with Blob storage using the Windows Azure Storage Services REST APIs.
Before you power up your favorite code editor and start programming against Blob storage, you need to create a Windows Azure Storage account. The Storage account is the top-level container that provides configuration and access to the Azure storage services. As mentioned previously, creating a Storage account generates the necessary namespaces in Azure as well as the access keys needed to create and manage content.
To create a Storage account for Windows Azure, follow these steps:
Windows Azure Storage Services provides REST APIs to access the items contained in Azure Storage. These APIs support programmability via common HTTP methods for Blob, Table, and Queue storage. This chapter focuses on blob access using the REST APIs. You can find these APIs documented at http://msdn.microsoft.com/en-us/library/windowsazure/dd179355.aspx.
The Blob service REST API works on three basic types of objects:
By working with the Blob service using the REST API, an application can:
The REST API provides the following Container functionality:
Finally, the REST API functionality for blobs includes the following:
REST APIs are web-based APIs and they require a request to the service endpoint using HTTP. REST APIs rely on the basic request methods defined in HTTP including PUT, GET, HEAD and DELETE. The URL of the request is a resource URI for a specific resource such as a container or blob. Some requests require additional action information, and those items can be included as a URI parameter, request header, or in the body of the request.
For example, to create a container named math101 in a Storage account named school, the URI would be http://school.blob.core.windows.net/math101. The request would also need to include a URI parameter named restype with a value of container. The Blob service also requires a few headers in the request. The headers required to create the container are x-ms-date, x-ms-version, Authorization, and Content-Length. Finally, the request is sent using the PUT HTTP verb.
Many of the requests are similar in format, and most of the example code is repetitive. The example C# code provided with this chapter follows the same basic format:
The example code included with the chapter has abstracted some of the repetitive code into helper methods including GetWebRequest, CreateSharedKeyLiteAuthorizationHeader, GetCanonicalizedHeaders, and GetCanonicalizedResourceString. You should review these to understand some of the key pieces to formulating a web request with the proper headers and values. The example code does not handle exceptions outside of a basic try/catch block and does not provide parameter validation to minimize the amount of code. A production-ready application should include robust exception handling, logging, validation, and even retries of failed web requests.
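The GetWebRequest helper is referenced throughout the listings but is not reproduced in this section. The following is a guess at its general shape, offered only as a sketch: the RequestSketch class, the static signature, and the x-ms-version value of 2009-09-19 are all assumptions, and the chapter's actual helper may differ. It creates the request, sets the HTTP method, and adds the x-ms-date and x-ms-version headers that the Blob service expects.

```csharp
using System;
using System.Globalization;
using System.Net;

public static class RequestSketch
{
    // Assumed service version; substitute the version your code targets.
    private const string StorageVersion = "2009-09-19";

    // Creates a request with the method and the two x-ms- headers set.
    // No network traffic occurs until the caller asks for the response.
    public static HttpWebRequest GetWebRequest(string method, string uri)
    {
#pragma warning disable SYSLIB0014 // WebRequest is marked obsolete on newer .NET versions
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
#pragma warning restore SYSLIB0014
        request.Method = method;
        // The "R" format produces an RFC 1123 date such as
        // "Tue, 15 Nov 1994 08:12:31 GMT".
        request.Headers.Add("x-ms-date",
            DateTime.UtcNow.ToString("R", CultureInfo.InvariantCulture));
        request.Headers.Add("x-ms-version", StorageVersion);
        return request;
    }
}
```

Because both headers start with x-ms-, they are picked up when the canonicalized headers are built for the Authorization value, so they must be added before the request is signed.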
Access to a container or blob is controlled by the container itself. Unless the container’s access policy has been changed, only the Owner is allowed access. Recall from the “How Do Containers Work” section that there are only two types of users with regard to access: Owner and anonymous. Blob storage uses a signed Authorization header value to authenticate an Owner. Blob storage supports two types of authentication schemes: Shared Key and Shared Key Lite. This chapter uses only Shared Key Lite because of its simplicity. For further understanding of the Shared Key authentication scheme, take a look at Authentication Schemes in the Azure API reference located at http://msdn.microsoft.com/en-us/library/windowsazure/dd179428.aspx.
When creating a request to Blob storage, the Authorization header value is one of the more challenging pieces. It must be constructed, converted to a byte array, signed, and finally encoded. Not following the rules to create the value results in an authentication failure. The Authorization value is computed from a string composed of key pieces of the request, including the HTTP method, the resource URI, the comp URI parameter (if present), and certain web request headers and values. These values are concatenated in a well-defined order into a newline-delimited string. The string is then converted to a byte array, a hash of the byte array is computed using the HMAC-SHA256 algorithm keyed with the account access key, and the hash is Base64 encoded.
Listing 10-1 contains the three custom methods used to create the Authorization header value for the chapter’s sample code. The CreateSharedKeyLiteAuthorizationHeader method controls the creation of the value. It uses the GetCanonicalizedHeaders method to select only the headers that start with x-ms- and to sort them. These headers must be in alphabetical order before they are added to the signature string in a name:value format. The GetCanonicalizedResourceString method is responsible for creating the correctly formatted resource string. The sharedKey value is one of the Storage account’s two access keys, converted to a byte array for the signing process. The result of the CreateSharedKeyLiteAuthorizationHeader method is added to each web request as the Authorization header value. Any web request that requires Owner access needs to include the Authorization value. The only exception is for container policies and Shared Access Signatures, which are discussed later in the “Managing Permissions” section.
Listing 10-1: Creating the Shared Key Lite Authorization Value
private String CreateSharedKeyLiteAuthorizationHeader(string method,
    WebHeaderCollection headers, string accountName, string encodedPath,
    string container, string contentType)
{
    byte[] sharedKey = Convert.FromBase64String("<YOUR ACCESS KEY>");
    //Each of the first four fields is terminated by a newline
    string signature = "{0}\n{1}\n{2}\n{3}\n{4}{5}";
    string signatureString = String.Format(CultureInfo.InvariantCulture, signature,
        method.ToUpperInvariant(),          //Uppercase HTTP Method - 0
        "",                                 //Content-MD5 - 1
        contentType,                        //Content-Type - 2
        "",                                 //Date - 3 (empty because x-ms-date is sent)
        GetCanonicalizedHeaders(headers),   //Canonicalized Headers - 4
        GetCanonicalizedResourceString(accountName, encodedPath, container) //Canonicalized Resource - 5
    );
    byte[] signatureBytes = System.Text.Encoding.UTF8.GetBytes(signatureString);
    //Compute the keyed hash and Base64 encode it
    using (HMACSHA256 crypto = new HMACSHA256(sharedKey))
    {
        byte[] hashedSignature = crypto.ComputeHash(signatureBytes);
        return System.Convert.ToBase64String(hashedSignature);
    }
}

//Select only values starting with x-ms- and sort collection
private string GetCanonicalizedHeaders(WebHeaderCollection headers)
{
    List<string> requiredHeaders = new List<string>();
    foreach (string header in headers)
    {
        if (header.StartsWith("x-ms-", StringComparison.Ordinal))
        {
            //Each canonicalized header is written as name:value followed by a newline
            string tmpHeader = String.Format("{0}:{1}\n", header, headers.GetValues(header)[0]);
            if (!requiredHeaders.Contains(tmpHeader))
            {
                requiredHeaders.Add(tmpHeader);
            }
        }
    }
    requiredHeaders.Sort(StringComparer.Ordinal);
    return String.Concat(requiredHeaders);
}

//Create the correct resource string
//The container parameter carries the value of the comp URI parameter, if any
private string GetCanonicalizedResourceString(string accountName,
    string encodedUriPath, string container)
{
    string containerParam = container == "" ? container : "?comp=" + container;
    return string.Format("/{0}/{1}{2}", accountName, encodedUriPath, containerParam);
}
Creating the correct Authorization value can be frustrating if you don’t know the intricacies of the process. Review Authentication Schemes in the Azure API reference located at http://msdn.microsoft.com/en-us/library/windowsazure/dd179428.aspx for more details of the process.
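To make the layout of the signed string concrete, the sketch below builds a Shared Key Lite string for a request and signs it. It is a simplified companion to Listing 10-1 rather than a replacement for it. The SharedKeyLiteSketch class and its method names are hypothetical, and the key used in the usage note is a dummy value, not a real access key.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class SharedKeyLiteSketch
{
    // Concatenates the fields in the order the scheme expects. The first
    // four fields each end with a newline; canonicalizedHeaders is assumed
    // to already end each header line with a newline.
    public static string BuildStringToSign(string method, string contentType,
        string canonicalizedHeaders, string canonicalizedResource)
    {
        return method.ToUpperInvariant() + "\n"
            + "\n"                  // Content-MD5, left empty
            + contentType + "\n"
            + "\n"                  // Date, empty because x-ms-date is sent
            + canonicalizedHeaders
            + canonicalizedResource;
    }

    // Computes the HMAC-SHA256 hash of the string with the Base64-decoded
    // key and returns the hash Base64 encoded.
    public static string Sign(string stringToSign, string base64Key)
    {
        using (var hmac = new HMACSHA256(Convert.FromBase64String(base64Key)))
        {
            byte[] hash = hmac.ComputeHash(Encoding.UTF8.GetBytes(stringToSign));
            return Convert.ToBase64String(hash);
        }
    }
}
```

For a request that creates the math101 container in the school account, the resource string would be /school/math101 and the headers string would hold the x-ms-date and x-ms-version lines in alphabetical order. Printing the result of BuildStringToSign while troubleshooting is often the quickest way to spot a missing newline or a header out of order.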
You saw earlier in the chapter that Storage accounts contain zero or more containers. There is no user interface provided in the Azure Management Portal to create or manage containers. You can create, modify, and delete containers using the Storage REST API. This section covers how to create a web request to create, modify, and delete containers in a Storage account.
A newly created Storage account does not have any containers. To create a container the web request to the storage service must include the following:
Optionally, you can include custom meta data during the creation of a container. Listing 10-2 demonstrates how to create a web request to generate a container. The containerName must be a valid name and the meta data parameter passes custom meta data using a name-value pair. The meta data parameter is required in this code, but passing in an empty dictionary object is allowed.
Listing 10-2: Creating a Blob Storage Container
private void CreateContainer(string containerName, Dictionary<string, string> metadata)
{
    HttpWebRequest request = GetWebRequest("PUT",
        ROOT_URI + "/" + containerName.ToLowerInvariant() + "?restype=container");
    //--- Create Headers ---
    request.ContentLength = 0;
    // custom metadata x-ms-meta
    foreach (KeyValuePair<string, string> nvp in metadata)
    {
        request.Headers.Add(nvp.Key, nvp.Value);
    }
    string encryptedHeader = CreateSharedKeyLiteAuthorizationHeader(
        request.Method, request.Headers, STORAGE_ACCOUNT,
        containerName.ToLowerInvariant(), "", "");
    request.Headers.Add("Authorization",
        string.Format(CultureInfo.InvariantCulture, "SharedKeyLite {0}:{1}",
            STORAGE_ACCOUNT, encryptedHeader));
    //Process response
    try
    {
        using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
        {
            if (response.StatusCode == HttpStatusCode.Created)
            {
                Console.WriteLine("Created container: {0}", containerName.ToLowerInvariant());
            }
        }
    }
    catch (WebException webEx)
    {
        Console.WriteLine("Error creating: {0}. {1}",
            containerName.ToLowerInvariant(), webEx.Message);
    }
}
Successful creation of the container results in an HTTP status of Created. If the container already exists, an error is returned, which can be caught and handled. By default, the container permissions are set to Owner. To access the container, a correctly signed Authorization header value must be supplied.
Containers within a Storage account can be listed. Web requests created to list containers must include the following:
Optionally, you can use the include=metadata URI parameter to return container meta data. This adds a meta data section to the results and returns any custom meta data associated with each container. A request for listing containers may also include the prefix URI parameter, which limits the results to containers whose names start with the provided value. Listing 10-3 includes the example code to list all containers within a Storage account. The code defines two requests: one that lists all containers and one, commented out, that uses the prefix URI parameter to limit the results. Also, notice that the value list is passed to the CreateSharedKeyLiteAuthorizationHeader method. The comp parameter value is required when building the Authorization header value. When there is no comp parameter, an empty string is used in the Authorization header value.
A successful web response has an HTTP status of OK and an XML body containing the results, which Listing 10-3 parses.
Listing 10-3: Listing Blob Storage Containers
private void ListContainers(bool includeMetadata)
{
    //List all containers
    HttpWebRequest request = GetWebRequest("GET",
        ROOT_URI + "/?comp=list" + (includeMetadata ? "&include=metadata" : ""));
    //List only containers that start with math
    // HttpWebRequest request = GetWebRequest("GET",
    //     ROOT_URI + "/?comp=list&prefix=math" + (includeMetadata ? "&include=metadata" : ""));
    string encryptedHeader = CreateSharedKeyLiteAuthorizationHeader(
        request.Method, request.Headers, STORAGE_ACCOUNT, "", "list", "");
    request.Headers.Add("Authorization",
        string.Format(CultureInfo.InvariantCulture, "SharedKeyLite {0}:{1}",
            STORAGE_ACCOUNT, encryptedHeader));
    //Process response
    Console.WriteLine("Containers:");
    using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
    {
        if (response.StatusCode == HttpStatusCode.OK)
        {
            using (StreamReader rdr = new StreamReader(response.GetResponseStream()))
            {
                XElement root = XElement.Parse(rdr.ReadToEnd());
                foreach (XElement c in root.Element("Containers").Elements("Container"))
                {
                    Console.WriteLine(c.Element("Name").Value);
                    if (includeMetadata)
                    {
                        foreach (XElement meta in c.Element("Metadata").Elements())
                        {
                            Console.WriteLine(" Metadata {0} = {1}",
                                meta.Name.ToString(), meta.Value);
                        }
                    }
                }
            }
        }
    }
}
Deleting a container requires a web request using the DELETE request method. The deletion process occurs in two steps. Containers are initially marked for deletion, which is a logical or “soft” delete. The container is not truly deleted until a garbage collection sweep occurs. Between the logical deletion and the sweep, you may receive an error if you attempt to create another container using the same container name. Web requests created to delete containers must include the following:
The result of a successful delete is an HTTP status of Accepted. Listing 10-4 includes example code to delete a container.
Listing 10-4: Deleting Containers
private void DeleteContainer(string containerName)
{
    HttpWebRequest request = GetWebRequest("DELETE",
        ROOT_URI + "/" + containerName.ToLowerInvariant() + "?restype=container");
    string encryptedHeader = CreateSharedKeyLiteAuthorizationHeader(
        request.Method, request.Headers, STORAGE_ACCOUNT,
        containerName.ToLowerInvariant(), "", "");
    request.Headers.Add("Authorization",
        string.Format(CultureInfo.InvariantCulture, "SharedKeyLite {0}:{1}",
            STORAGE_ACCOUNT, encryptedHeader));
    //Process results
    try
    {
        using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
        {
            if (response.StatusCode == HttpStatusCode.Accepted)
            {
                Console.WriteLine("Container {0} has been deleted", containerName);
            }
        }
    }
    catch (WebException webEx)
    {
        Console.WriteLine("Error deleting container: {0}. {1}",
            containerName.ToLowerInvariant(), webEx.Message);
    }
}
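Because a deleted container lingers until the garbage collection sweep, code that deletes and then re-creates a container with the same name may need to retry. The following sketch shows one possible approach. The ContainerRetry class is hypothetical, the createContainer delegate stands in for whatever code sends the PUT request and reports the resulting status, and the assumption that the service signals this situation with a Conflict status is not stated in this chapter.

```csharp
using System;
using System.Net;
using System.Threading;

public static class ContainerRetry
{
    // Invokes createContainer until it reports Created, reports a status
    // other than Conflict, or maxAttempts is reached. Waits for the given
    // delay between attempts that end in Conflict.
    public static bool CreateWithRetry(Func<HttpStatusCode> createContainer,
        int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            HttpStatusCode status = createContainer();
            if (status == HttpStatusCode.Created) return true;
            if (status != HttpStatusCode.Conflict) return false; // not retryable here
            if (attempt < maxAttempts) Thread.Sleep(delay);
        }
        return false;
    }
}
```

Keeping the request logic behind a delegate leaves the retry policy independent of how the request is built and signed, which also makes the policy easy to exercise without a network connection.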
Blobs are the core of Blob storage. Blobs can be created, modified, copied, and deleted. This section focuses on Block blobs. Page blobs are similar in most requests with the exception of the upload process.
There are two different approaches for creating a blob in a container, depending on the size of the blob.
Web requests for uploading a small binary as a blob in a single request require the following:
Listing 10-5 includes an example that creates a Block blob using a single request. The PUT request can include optional meta data. Meta data can also be added or modified in a separate request. Pay close attention to the blobName parameter, which must be a valid blob name. An invalid name results in an authorization error, which can lead your troubleshooting effort down the wrong path. The x-ms-blob-type header defines the type of blob created, Page or Block, and is required. This code also sets the content type of the blob. The content type value is used to create the Authorization header and is therefore passed as a parameter to the CreateSharedKeyLiteAuthorizationHeader method. The content type is optional, but if you include it as a web request header, it must also be included in the Authorization header value, or you receive an authorization failure error. If the request succeeds, the expected result is an HTTP status of Created.
Listing 10-5: Creating a Blob - Single Request
private void PutBLockBlob_Single(string blobName, string contentType,
    string filePath, Dictionary<string, string> metadata)
{
    HttpWebRequest request = GetWebRequest("PUT",
        ROOT_URI + "/" + blobName.ToLowerInvariant());
    //--- Create Headers ---
    request.Headers.Add("x-ms-blob-type", "BlockBlob");
    request.ContentType = contentType;
    // custom metadata x-ms-meta
    foreach (KeyValuePair<string, string> nvp in metadata)
    {
        request.Headers.Add(nvp.Key, nvp.Value);
    }
    string encryptedHeader = CreateSharedKeyLiteAuthorizationHeader(
        request.Method, request.Headers, STORAGE_ACCOUNT,
        blobName.ToLowerInvariant(), "", contentType);
    request.Headers.Add("Authorization",
        string.Format(CultureInfo.InvariantCulture, "SharedKeyLite {0}:{1}",
            STORAGE_ACCOUNT, encryptedHeader));
    //Copy the file into the request body; obtain the request stream once
    using (Stream vid1 = File.OpenRead(filePath))
    using (Stream requestStream = request.GetRequestStream())
    {
        byte[] buffer = new byte[4096];
        while (true)
        {
            int bytesRead = vid1.Read(buffer, 0, buffer.Length);
            if (bytesRead == 0) break;
            requestStream.Write(buffer, 0, bytesRead);
        }
    }
    //Process response
    using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
    {
        if (response.StatusCode == HttpStatusCode.Created)
        {
            Console.WriteLine("Blob: {0} created", blobName.ToLowerInvariant());
        }
    }
}
Files greater than 64 MB in size must be partitioned into smaller files called blocks. Each block along with its id is uploaded to Blob storage as an uncommitted block. After all blocks have been uploaded to Blob storage, the blocks can be committed and composed into the actual blob. Partitioning the file into smaller blocks of content allows for parallel upload of data and the potential to restart the upload of the remaining blocks after an error instead of restarting the file upload again. Listing 10-6 is an example of partitioning a file into smaller blocks, uploading the blocks to Blob storage, and committing the blocks to create the blob.
Multiple web requests are required to upload large files as multiple blocks. Listing 10-6 contains all the requests in a single method, but committing multiple blocks into a blob does not require that all the web requests happen in one method. There is no requirement that the blocks be uploaded in order or at the same time. The only requirement is that all the blocks have been uploaded successfully before the final list of blocks is uploaded to commit the blocks to a blob.
Two basic web request types are required to upload a multiblock blob. The first type of web request uploads a block, and one such request is sent for each block. The web request to upload a block requires the following:
The second web request type is used to commit the blocks as a single blob to Blob storage. This is done by sending an XML request body that lists, in order, the ids of the blocks that make up the blob. This web request requires the following:
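Two small pieces of the multiblock upload are easy to get wrong: the block ids and the XML that commits them. The sketch below generates ids from a zero-padded index and builds the block list document. The BlockListSketch class and its methods are hypothetical. The fixed-width padding reflects an assumption that all ids within one blob should be the same length, which the chapter text does not state, and the Uncommitted element name mirrors the one used in Listing 10-6.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;
using System.Xml.Linq;

public static class BlockListSketch
{
    // Encodes a block index as a Base64 id. Formatting the index to a
    // fixed width of six digits first keeps every id the same length.
    public static string GetBlockId(int index)
    {
        string padded = index.ToString("D6", CultureInfo.InvariantCulture);
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(padded));
    }

    // Builds the XML body that commits the given block ids, in order.
    public static string BuildBlockListXml(IEnumerable<string> blockIds)
    {
        XElement root = new XElement("BlockList");
        foreach (string id in blockIds)
        {
            root.Add(new XElement("Uncommitted", id));
        }
        return root.ToString(SaveOptions.DisableFormatting);
    }
}
```

The order of the ids passed to BuildBlockListXml determines the order of the content in the committed blob, so blocks can be uploaded in any order as long as the final list is sequenced correctly.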
There are a few items to point out with the code in Listing 10-6:
Listing 10-6: Using Blocks to Upload a Large File
private void PutBlockBlob_MultipleBlocks(string blobName, string contentType,
    string filePath, Dictionary<string, string> metadata)
{
    //Block is a small helper type exposing Id (string) and BlockArray (byte[])
    List<Block> blocks = new List<Block>();
    //Get file content
    Byte[] fileAsBytes = File.ReadAllBytes(filePath);
    int maxAllowedBlockSize = 4000000;
    int currentPos = 0;
    int len = fileAsBytes.Length;
    int currentBlockId = 0;
    //Partition the byte[] into smaller blocks; the last block may be shorter
    while (currentPos < len)
    {
        int targetBlockSize = Math.Min(maxAllowedBlockSize, len - currentPos);
        byte[] blockContent = new byte[targetBlockSize];
        Array.Copy(fileAsBytes, currentPos, blockContent, 0, targetBlockSize);
        blocks.Add(new Block()
        {
            Id = Convert.ToBase64String(System.BitConverter.GetBytes(currentBlockId++)),
            BlockArray = blockContent
        });
        currentPos += targetBlockSize;
    }
    //Put each block into Blob storage
    blocks.ForEach((blk) =>
    {
        //The Base64 id can contain reserved URL characters, so escape it
        HttpWebRequest request = GetWebRequest("PUT",
            ROOT_URI + "/" + blobName.ToLowerInvariant()
            + "?comp=block&blockid=" + Uri.EscapeDataString(blk.Id));
        //--- Create Headers ---
        request.ContentLength = blk.BlockArray.Length;
        string encryptedHeader = CreateSharedKeyLiteAuthorizationHeader(
            request.Method, request.Headers, STORAGE_ACCOUNT,
            blobName.ToLowerInvariant(), "block", "");
        request.Headers.Add("Authorization",
            string.Format(CultureInfo.InvariantCulture, "SharedKeyLite {0}:{1}",
                STORAGE_ACCOUNT, encryptedHeader));
        //Write the current block content to the request body
        using (Stream requestStream = request.GetRequestStream())
        {
            requestStream.Write(blk.BlockArray, 0, blk.BlockArray.Length);
        }
        //Process response
        using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
        {
            if (response.StatusCode == HttpStatusCode.Created)
            {
                Console.WriteLine("Block: {0} created", blk.Id);
            }
        }
    });
    //Create the BlockList as XML
    XElement root = new XElement("BlockList");
    blocks.ForEach((blk) => { root.Add(new XElement("Uncommitted", blk.Id)); });
    //Put the block list into Blob storage
    HttpWebRequest req = GetWebRequest("PUT",
        ROOT_URI + "/" + blobName.ToLowerInvariant() + "?comp=blocklist");
    //--- Create Headers ---
    req.Headers.Add("x-ms-blob-content-type", contentType);
    // custom metadata x-ms-meta
    foreach (KeyValuePair<string, string> nvp in metadata)
    {
        req.Headers.Add(nvp.Key, nvp.Value);
    }
    string encryptedHeader2 = CreateSharedKeyLiteAuthorizationHeader(
        req.Method, req.Headers, STORAGE_ACCOUNT,
        blobName.ToLowerInvariant(), "blocklist", "");
    req.Headers.Add("Authorization",
        string.Format(CultureInfo.InvariantCulture, "SharedKeyLite {0}:{1}",
            STORAGE_ACCOUNT, encryptedHeader2));
    //Write the block list to the request body
    byte[] byteArray = Encoding.UTF8.GetBytes(root.ToString());
    using (Stream requestStream = req.GetRequestStream())
    {
        requestStream.Write(byteArray, 0, byteArray.Length);
    }
    //Process response
    using (HttpWebResponse response = req.GetResponse() as HttpWebResponse)
    {
        if (response.StatusCode == HttpStatusCode.Created)
        {
            Console.WriteLine("Blob: {0} created", blobName.ToLowerInvariant());
        }
    }
}
There are two ways to retrieve blobs from Blob storage, depending on what you need to do. The first retrieves an XML list describing the blobs in a container. The second streams the blob itself out of storage, for example to download it to a file system. Blobs are stored as binary files in Blob storage, and the second approach retrieves this binary file stream.
Listing blobs is similar to listing containers. Web requests created to list blobs must include the following:
The web request to list blobs can also include a prefix URL parameter to limit the results to blobs whose names begin with the given value, and an include URL parameter to determine what additional information is returned. The include parameter accepts values of snapshots, metadata, or uncommittedblobs. Listing 10-7 includes example code to list all blobs in a container. The comp parameter value is passed to the CreateSharedKeyLiteAuthorizationHeader method. Failure to pass the comp value into the custom CreateSharedKeyLiteAuthorizationHeader method results in an authentication failure.
Listing 10-7: Listing Blobs in a Container
private void ListBlobs(string containerName, bool includeMetadata)
{
    //Show all blobs
    HttpWebRequest request = GetWebRequest("GET", ROOT_URI + "/" + containerName.ToLowerInvariant() + "?restype=container&comp=list" + (includeMetadata ? "&include=metadata" : ""));

    //Show only blobs that start with instruments
    //HttpWebRequest request = GetWebRequest("GET", ROOT_URI + "/" + containerName.ToLowerInvariant() + "?restype=container&comp=list&prefix=instruments" + (includeMetadata ? "&include=metadata" : ""));

    string encryptedHeader = CreateSharedKeyLiteAuthorizationHeader(
        request.Method, request.Headers, STORAGE_ACCOUNT,
        containerName.ToLowerInvariant(), "list", "");
    request.Headers.Add("Authorization", string.Format(CultureInfo.InvariantCulture,
        "SharedKeyLite {0}:{1}", STORAGE_ACCOUNT, encryptedHeader));

    Console.WriteLine("Blobs:");
    using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
    {
        if (response.StatusCode == HttpStatusCode.OK)
        {
            using (StreamReader rdr = new StreamReader(response.GetResponseStream()))
            {
                XElement root = XElement.Parse(rdr.ReadToEnd());
                foreach (XElement c in root.Element("Blobs").Elements("Blob"))
                {
                    Console.WriteLine(c.Element("Name").Value);

                    //Guard against a blob entry without a Metadata element
                    XElement metadata = c.Element("Metadata");
                    if (includeMetadata && metadata != null)
                    {
                        foreach (XElement meta in metadata.Elements())
                        {
                            Console.WriteLine(" Metadata {0} = {1}", meta.Name.ToString(), meta.Value);
                        }
                    }
                }
            }
        }
    }
}
A successful listing results in an HTTP status of OK and a stream of XML with the results of the list request. The example code retrieves the XML and displays the blob name and meta data to the console.
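The two optional URL parameters and the shape of the XML result can be explored offline. The sketch below is illustrative only: the class ListBlobsSketch, its method names, and the sample XML are assumptions made for this example. The sample is hand-written and abbreviated to the Blobs, Blob, Name, and Metadata elements that Listing 10-7 reads; a real response carries more elements.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical helper, introduced only for this sketch.
public static class ListBlobsSketch
{
    // Hand-written, abbreviated sample of a list result (not captured from the service).
    public const string SampleXml = @"<EnumerationResults>
  <Blobs>
    <Blob>
      <Name>additive.avi</Name>
      <Metadata><author>smith</author></Metadata>
    </Blob>
    <Blob>
      <Name>notes.txt</Name>
    </Blob>
  </Blobs>
</EnumerationResults>";

    // Builds the list URI; prefix and include are optional (pass null to omit).
    public static string BuildListUri(string rootUri, string container,
        string prefix, string include)
    {
        string uri = rootUri + "/" + container.ToLowerInvariant()
            + "?restype=container&comp=list";
        if (!string.IsNullOrEmpty(prefix))
            uri += "&prefix=" + Uri.EscapeDataString(prefix);
        if (!string.IsNullOrEmpty(include))
            uri += "&include=" + Uri.EscapeDataString(include);
        return uri;
    }

    // Maps each blob name to its metadata pairs (empty when none are returned).
    public static Dictionary<string, Dictionary<string, string>> ParseBlobs(string xml)
    {
        var result = new Dictionary<string, Dictionary<string, string>>();
        XElement blobs = XElement.Parse(xml).Element("Blobs");
        if (blobs == null) return result;

        foreach (XElement blob in blobs.Elements("Blob"))
        {
            var metadata = new Dictionary<string, string>();
            XElement metaElement = blob.Element("Metadata");
            if (metaElement != null)
            {
                foreach (XElement meta in metaElement.Elements())
                    metadata[meta.Name.LocalName] = meta.Value;
            }
            result[blob.Element("Name").Value] = metadata;
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(BuildListUri("http://school.blob.core.windows.net",
            "math101", "instruments", "metadata"));
        foreach (var blob in ParseBlobs(SampleXml))
            Console.WriteLine("{0}: {1} metadata item(s)", blob.Key, blob.Value.Count);
    }
}
```

Returning a dictionary instead of writing to the console keeps the parsing step reusable; the caller decides how to display or filter the results.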
Recall that the Blob service is REST enabled and that, in this model, a simple GET request is sufficient to retrieve the resource directly if the resource allows for unauthenticated access. This means, for example, that simply typing the address of a blob into a regular Internet browser will retrieve the blob and render it according to the browser’s rendering rules for the file type. Programmatically, accessing the blob is a little more work. You create a web request that uses the GET request method and the blob’s resource URI, read the response as a stream, and save that stream to a local file. The section that follows covers this latter scenario: generating a web request programmatically and retrieving a blob to a local file.
Web requests created to retrieve a single blob must include the following:
Listing 10-8 displays example code that downloads a blob and saves it to the file system. The majority of the code is simply for writing the response stream to the file.
Listing 10-8: Downloading a Blob
private void GetBlob(string containerName, string blobName, string outputPath)
{
    HttpWebRequest request = GetWebRequest("GET", ROOT_URI + "/" + containerName.ToLowerInvariant() + "/" + blobName.ToLowerInvariant());

    string encryptedHeader = CreateSharedKeyLiteAuthorizationHeader(
        request.Method, request.Headers, STORAGE_ACCOUNT,
        containerName.ToLowerInvariant() + "/" + blobName.ToLowerInvariant(), "", "");
    request.Headers.Add("Authorization", string.Format(CultureInfo.InvariantCulture,
        "SharedKeyLite {0}:{1}", STORAGE_ACCOUNT, encryptedHeader));

    //Process response
    try
    {
        using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
        {
            if (response.StatusCode == HttpStatusCode.OK)
            {
                using (Stream strm = response.GetResponseStream())
                {
                    //File.Create truncates an existing file, so no stale bytes
                    //remain if the new blob is shorter than the old file
                    using (Stream file = File.Create(outputPath))
                    {
                        byte[] buffer = new byte[8192];
                        int len;
                        while ((len = strm.Read(buffer, 0, buffer.Length)) > 0)
                        {
                            file.Write(buffer, 0, len);
                        }
                    }
                }
                Console.WriteLine("Blob: {0} downloaded to: {1}",
                    containerName.ToLowerInvariant() + "/" + blobName.ToLowerInvariant(), outputPath);
            }
        }
    }
    catch (WebException webEx)
    {
        Console.WriteLine("Error retrieving {0}. {1}",
            containerName.ToLowerInvariant() + "/" + blobName.ToLowerInvariant(), webEx.Message);
    }
}
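Because most of the download code only moves bytes from one stream to another, the read/write loop can be replaced with Stream.CopyTo when the application targets .NET Framework 4 or later. The following is a minimal sketch under that assumption; the class StreamCopySketch and the method SaveToFile are hypothetical names, and a MemoryStream stands in for the stream returned by response.GetResponseStream().

```csharp
using System;
using System.IO;

// Hypothetical helper, introduced only for this sketch.
public static class StreamCopySketch
{
    // Copies everything that remains in source into a new file at outputPath
    // and returns the number of bytes written.
    public static long SaveToFile(Stream source, string outputPath)
    {
        // File.Create overwrites (truncates) any existing file at outputPath.
        using (FileStream file = File.Create(outputPath))
        {
            // 8192 matches the buffer size used in the download listing.
            source.CopyTo(file, 8192);
            return file.Length;
        }
    }

    public static void Main()
    {
        byte[] payload = { 1, 2, 3, 4, 5 };
        string path = Path.GetTempFileName();

        // The MemoryStream is a stand-in for the blob's response stream.
        using (var source = new MemoryStream(payload))
        {
            long written = SaveToFile(source, path);
            Console.WriteLine("{0} bytes written to {1}", written, path);
        }

        File.Delete(path);
    }
}
```

The explicit loop in the listing remains useful when you want to report progress or compute a hash while the blob downloads.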
Blobs can be copied in a Storage account without the need to download the binary to a file system and upload the binary again. Copying a blob from one container to another also copies the binary and the meta data.
Web requests created to copy a blob must include the following:
Listing 10-9 shows example code used to copy a blob. The CopyBlob parameters define the destination of the copy. The x-ms-copy-source header is the location of the blob to be copied. This value must include the Storage account name in the resource URI or the request will fail. In the example code provided, the x-ms-copy-source value is passed in as custom meta data and is appended as a web request header. The copy request can also add meta data during the copy. A successful copy results in an HTTP status of Created.
Listing 10-9: Copying a Blob
private void CopyBlob(string containerName, string blobName, Dictionary<string, string> metadata)
{
    HttpWebRequest request = GetWebRequest("PUT", ROOT_URI + "/" + containerName.ToLowerInvariant() + "/" + blobName.ToLowerInvariant());

    //--- Create Headers ---
    request.ContentLength = 0;

    // custom metadata x-ms-meta
    foreach (KeyValuePair<string, string> nvp in metadata)
    {
        request.Headers.Add(nvp.Key, nvp.Value);
    }

    //Format of passed in copy-source metadata:
    //"x-ms-copy-source", "/<STORAGE_ACCOUNT>/<container>/<Blob path>"
    string encryptedHeader = CreateSharedKeyLiteAuthorizationHeader(
        request.Method, request.Headers, STORAGE_ACCOUNT,
        containerName.ToLowerInvariant() + "/" + blobName.ToLowerInvariant(), "", "");
    request.Headers.Add("Authorization", string.Format(CultureInfo.InvariantCulture,
        "SharedKeyLite {0}:{1}", STORAGE_ACCOUNT, encryptedHeader));

    //Process response
    try
    {
        using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
        {
            if (response.StatusCode == HttpStatusCode.Created)
            {
                Console.WriteLine("Blob has been copied");
            }
        }
    }
    catch (WebException webEx)
    {
        Console.WriteLine("Error copying Blob. {0}", webEx.Message);
    }
}
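Because CopyBlob receives the copy source as part of its metadata dictionary, the caller is responsible for formatting the x-ms-copy-source value. The sketch below shows one way to prepare that dictionary. The class CopySourceSketch and its methods are hypothetical, as is the course metadata entry; the account, container, and blob names are example values. Names are lowercased to match the convention used in the chapter's listings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, introduced only for this sketch.
public static class CopySourceSketch
{
    // Formats the x-ms-copy-source value as /<account>/<container>/<blob path>.
    public static string BuildCopySource(string account, string container, string blobName)
    {
        return "/" + account + "/" + container.ToLowerInvariant()
            + "/" + blobName.ToLowerInvariant();
    }

    // Builds the header dictionary expected by a CopyBlob-style method:
    // the copy source plus any custom x-ms-meta-* entries for the new blob.
    public static Dictionary<string, string> BuildCopyHeaders(string account,
        string sourceContainer, string sourceBlob,
        IDictionary<string, string> customMetadata)
    {
        var headers = new Dictionary<string, string>();
        headers["x-ms-copy-source"] = BuildCopySource(account, sourceContainer, sourceBlob);

        if (customMetadata != null)
        {
            foreach (KeyValuePair<string, string> pair in customMetadata)
                headers["x-ms-meta-" + pair.Key] = pair.Value;
        }
        return headers;
    }

    public static void Main()
    {
        // Example values only.
        var custom = new Dictionary<string, string> { { "course", "math101" } };
        Dictionary<string, string> headers =
            BuildCopyHeaders("school", "math101", "additive.avi", custom);

        foreach (KeyValuePair<string, string> header in headers)
            Console.WriteLine("{0}: {1}", header.Key, header.Value);

        // The dictionary would then be passed as the metadata argument, for example:
        // CopyBlob("archive", "additive.avi", headers);
    }
}
```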
Deleting a blob is similar to deleting a container. You use the DELETE request method with the URI of the blob. Like containers, blobs are initially marked for deletion. This is a logical delete. The blob is not truly deleted until a garbage collection sweep occurs. Between the logical deletion and the sweep, you may receive an error if you attempt to create another blob using the same blob name. Any snapshots created from the blob must be deleted before the blob can be deleted.
Web requests created to delete blobs must include the following:
The result of a successful delete is an HTTP status of Accepted. Listing 10-10 includes example code to delete a blob.
Listing 10-10: Deleting a Blob
private void DeleteBlob(string containerName, string blobName)
{
    HttpWebRequest request = GetWebRequest("DELETE", ROOT_URI + "/" + containerName.ToLowerInvariant() + "/" + blobName.ToLowerInvariant());

    string encryptedHeader = CreateSharedKeyLiteAuthorizationHeader(
        request.Method, request.Headers, STORAGE_ACCOUNT,
        containerName.ToLowerInvariant() + "/" + blobName.ToLowerInvariant(), "", "");
    request.Headers.Add("Authorization", string.Format(CultureInfo.InvariantCulture,
        "SharedKeyLite {0}:{1}", STORAGE_ACCOUNT, encryptedHeader));

    //Process response
    try
    {
        using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
        {
            if (response.StatusCode == HttpStatusCode.Accepted)
            {
                Console.WriteLine("{0} has been deleted",
                    containerName.ToLowerInvariant() + "/" + blobName.ToLowerInvariant());
            }
        }
    }
    catch (WebException webEx)
    {
        Console.WriteLine("Error deleting {0}. {1}",
            containerName.ToLowerInvariant() + "/" + blobName.ToLowerInvariant(), webEx.Message);
    }
}
As mentioned previously in this chapter, security is associated with a container and not a blob. There is no blob-level security in Azure. Access policies are associated with the container, so setting different access policies on different blobs requires placing them in different containers. The default access setting for a container is Owner. Owner is defined as anyone with access to the Storage account’s access keys.
Two methods allow you to set anonymous access to containers and the contained blobs. An anonymous user is anyone without access to the Storage account’s access keys. The first and simplest method is to modify the container’s x-ms-blob-public-access header value. The second method is more challenging and too lengthy to cover in a single chapter. It uses Container Policies and Shared Access Signatures, which can provide more granular access control to users. It still applies only to containers and not to individual blobs. For more information on container policies and Shared Access Signatures, review the topics in the Azure SDK located at http://msdn.microsoft.com/en-us/library/windowsazure/ee393343.aspx.
Setting and managing the x-ms-blob-public-access header value allows the Owner to define one of three options:
No public read access is the default Owner setting. Any request to access the container or contained blobs requires a correctly signed Authorization header value. The signed Authorization header value is what determines who can be considered the Owner. Public read access for blobs allows anonymous users to access blobs, blob properties, blob meta data, and block lists and page regions (used in Page blobs). With public read access for blobs set as the access policy on a container, users must know the correct URL to access the resource. Finally, full public read access allows anonymous users to access the same items as public read access for blobs and also to list the blobs in the container and view container properties and meta data. At no time can an anonymous user list the containers located in a Storage account.
To manage public access for a container, a web request must have the following:
The x-ms-blob-public-access header value can be set to container for full public read access, or blob for public read access for blobs. To remove public access to the container, you must exclude the x-ms-blob-public-access header from the web request. Listing 10-11 contains example code that sets the x-ms-blob-public-access header based on the parameter aclChoice, which is a custom enumeration used for the example.
Listing 10-11: Setting Public Access on a Container
private void SetContainerACLs(string containerName, containerACLs aclChoice)
{
    HttpWebRequest request = GetWebRequest("PUT", ROOT_URI + "/" + containerName.ToLowerInvariant() + "?restype=container&comp=acl");

    //--- Create Headers ---
    request.ContentLength = 0;
    if (aclChoice == containerACLs.PublicContainer)
        request.Headers.Add("x-ms-blob-public-access", "container");
    else if (aclChoice == containerACLs.PublicBlob)
        request.Headers.Add("x-ms-blob-public-access", "blob");
    //no check necessary for owner.
    //By default not setting the x-ms-blob-public-access
    //header results in owner access.

    string encryptedHeader = CreateSharedKeyLiteAuthorizationHeader(
        request.Method, request.Headers, STORAGE_ACCOUNT,
        containerName.ToLowerInvariant(), "acl", "");
    request.Headers.Add("Authorization", string.Format(CultureInfo.InvariantCulture,
        "SharedKeyLite {0}:{1}", STORAGE_ACCOUNT, encryptedHeader));

    Console.WriteLine("Set ACLs for {0}", containerName.ToUpperInvariant());
    using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
    {
        if (response.StatusCode == HttpStatusCode.OK)
            Console.WriteLine(" Container ACLs set to {0}", aclChoice.ToString());
    }
}
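The containerACLs enumeration itself is not shown in the listing. The following is one possible definition, offered as a sketch rather than the definition used by the example: the members PublicContainer and PublicBlob come from the listing, while the member name Private and the helper class AclHeaderSketch are assumptions. The helper returns null for the owner-only case to signal that the x-ms-blob-public-access header should be left out of the request.

```csharp
using System;

// Assumed definition: only PublicContainer and PublicBlob appear in the listing;
// the name Private for the owner-only option is hypothetical.
public enum containerACLs
{
    Private,
    PublicBlob,
    PublicContainer
}

// Hypothetical helper, introduced only for this sketch.
public static class AclHeaderSketch
{
    // Returns the x-ms-blob-public-access value for the choice,
    // or null when the header must be omitted (owner-only access).
    public static string ToHeaderValue(containerACLs choice)
    {
        switch (choice)
        {
            case containerACLs.PublicContainer:
                return "container";
            case containerACLs.PublicBlob:
                return "blob";
            default:
                return null;
        }
    }

    public static void Main()
    {
        foreach (containerACLs choice in Enum.GetValues(typeof(containerACLs)))
        {
            string value = ToHeaderValue(choice);
            Console.WriteLine("{0} -> {1}", choice,
                value ?? "(omit x-ms-blob-public-access header)");
        }
    }
}
```

Mapping the enumeration to a header value in one place keeps the request code free of string literals and makes the owner-only case explicit.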
With the container’s public access set to either container or blob access, anonymous users can access blobs by entering a URL into a browser. For example, if the school Storage account has a blob located at the URI http://school.blob.core.windows.net/math101/additive.avi, the user can copy this URL into Internet Explorer and view the video. If the container’s access is set to container, an anonymous user can also list the blobs in the container using a browser. For example, if the math101 container that holds this blob has its access set to container, the user may use the URL http://school.blob.core.windows.net/math101?restype=container&comp=list to list the blobs in the container.
Retrieving the container’s access policy is a simple web request. To retrieve the public access setting for a container, a web request must have the following:
The result of a successful request is an HTTP status code of OK. The response is the x-ms-blob-public-access value for the container. Listing 10-12 contains example code to retrieve the public access settings for a container.
Listing 10-12: Retrieving the ACLs for a Container
private void GetContainerACLs(string containerName)
{
    HttpWebRequest request = GetWebRequest("GET", ROOT_URI + "/" + containerName.ToLowerInvariant() + "?restype=container&comp=acl");

    string encryptedHeader = CreateSharedKeyLiteAuthorizationHeader(
        request.Method, request.Headers, STORAGE_ACCOUNT,
        containerName.ToLowerInvariant(), "acl", "");
    request.Headers.Add("Authorization", string.Format(CultureInfo.InvariantCulture,
        "SharedKeyLite {0}:{1}", STORAGE_ACCOUNT, encryptedHeader));

    Console.WriteLine("ACLs for {0}", containerName.ToLowerInvariant());
    using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
    {
        if (response.StatusCode == HttpStatusCode.OK)
        {
            string accessHeader = response.Headers["x-ms-blob-public-access"];
            if (accessHeader == null)
                Console.WriteLine("Access is private to account holder");
            else
                Console.WriteLine(accessHeader);

            //Read the response body once, from the stream already opened
            using (Stream strm = response.GetResponseStream())
            using (StreamReader rdr = new StreamReader(strm))
            {
                string results = rdr.ReadToEnd();
                Console.WriteLine(results);
            }
        }
    }
}
Azure Storage provides applications with a cloud-based storage solution. Developers can select one of several storage options depending on application requirements. Blob storage provides the features and functionality to store and manage large unstructured files in the cloud using a REST API. The nature of the REST API allows access to Blob storage from any language or platform that can create a web request; you are not limited to .NET languages. Using the REST APIs you can easily create, manage, and secure large binary objects and make them available to users and applications with access to the web.