Chapter 13. What’s New in SQL Server Analysis Services 2005

IN THIS CHAPTER

OLAP, Data Warehousing, Business Intelligence, and Data Mining

Analysis Services Fundamentals

Analysis Services 2005 Enhancements

Performance Tuning for Analysis Services

Administering Analysis Services

Analysis Services Security Overview

Migrating from SQL Server 2000 Analysis Services to SQL Server 2005 Analysis Services

SQL Server 2005 Reporting Services (SSRS) Overview

Microsoft SQL Server entered the business intelligence market when it introduced a new component with SQL Server 7 called OLAP Services. OLAP Services, in conjunction with Data Transformation Services (DTS), allowed information workers to combine data from disparate sources into a single, multidimensional repository, called the data warehouse or data mart. OLAP Services handled the storing, processing, and preparation of the multidimensional data warehouse, which is a pre-aggregated, summarized, de-normalized data store optimized for reporting, analysis, and decision making.

SQL Server 2000 extended this business intelligence support by introducing several enhancements to its analytical (OLAP) engine and support for a new technology called data mining, which is used to determine patterns between two or more pieces of information. OLAP Services from SQL Server 7 was renamed Analysis Services in SQL Server 2000. The new OLAP features introduced in Analysis Services 2000, combined with DTS, the robust extraction, transformation, and loading (ETL) tool, and the introduction of the Reporting Services add-on for SQL Server 2000, made SQL Server a clear leader in the business intelligence market.

Analysis Services 2005 builds on the solid foundation of business intelligence support in SQL Server 2000 and introduces new features, tools, and functionality to build and deploy end-to-end business intelligence solutions. In addition to introducing new features, Analysis Services 2005 addresses several limitations that were present in the previous releases.

The business intelligence support in SQL Server 2005 is based on the theme “integrate, analyze, and report.” Chapter 12, “SQL Server Integration Services Overview,” shows how to integrate data from various sources by using SQL Server Integration Services (SSIS); this chapter describes the new features introduced in Analysis Services to prepare your data for analysis, and also contains an overview of SQL Server 2005 Reporting Services.

Before looking at some of the most important new features in Analysis Services 2005, let’s take a quick look at OLAP, data warehousing, business intelligence, and data mining.

OLAP, Data Warehousing, Business Intelligence, and Data Mining

A relational database can either be designed for real-time business operations, such as order processing, or optimized to support the decision-making process. The former is referred to as an online transaction processing (OLTP) database; the latter is called an online analytical processing (OLAP) database. Data in an OLAP database is organized to support analysis rather than to process real-time transactions, as in an OLTP database. An ETL tool such as DTS is often used to extract, transform, and load data from various sources, including OLTP databases, into an OLAP database. OLTP databases are generally designed to support a much higher number of concurrent users than OLAP databases.

An OLAP database, also known as a data mart, consolidates and organizes data from varied sources, including the operational data from OLTP databases. This data is processed and pre-aggregated to provide superior performance for the ad hoc queries that deliver actionable business insights and aid in the business decision-making process. In other words, a data warehouse is a data store built using a systematic approach to combine data from various sources, cleanse it for accuracy and consistency, and organize it in a way that favors queries requesting thousands or millions of rows at a time over queries requesting the limited rowsets typical of an OLTP system. A data warehouse may also help in segregating expensive reporting queries from the OLTP system. A data warehouse often contains historical and summarized data that supports business decisions at many levels.

Let’s say you wanted to find out about local, regional, national, and worldwide sales per year for the past five years for a particular set of products. To obtain this information from an OLTP database, the query might have to process millions of records and could take a long time to come back with results. In contrast, because an OLAP data warehouse already stores the summarized data in a multidimensional fashion, it yields the desired results a lot faster than the relational database.

Business intelligence is a generic term that means different things to different people. Here, it refers to the sets of tools and applications that query OLAP data and provide reports and information to enterprise decision makers. Business intelligence tools and applications allow an organization to leverage its internal and external information assets for making better business decisions. Business intelligence capabilities include data transformation (ETL), OLAP, data mining, and reporting.

Data mining is an activity that operates on a data warehouse to ferret out trending patterns in the data. A common example of data mining is exhibited on the Amazon.com website. If you have ever received an email from Amazon that recommends a new book or DVD based on your past purchases, you have experienced data mining at work. Data mining involves exploring large quantities of data, using patterns and rules, in order to discover meaningful information about the data. A data mining component allows you to define data mining models based on existing data to discover trends and predict outcomes.

OLAP Terminology

Before you read the rest of this chapter, it’s important that you be familiar with some terms related to OLAP.

A table that contains columns and rows is a basic object in a two-dimensional database system. The equivalent to this in a multidimensional OLAP database is a cube. A cube is a conceptual container of detailed values consisting of dimensions and measures. The term measures refers to facts available in any data store that are interesting to the business user. Some common measures are sales amount, sales quantity, hourly rate, current inventory amount, total expenses, and cost. In other words, a measure is a summarizable numerical value used to monitor business activity. A dimension is an aspect of the business through which measures are analyzed. In other words, a dimension is a descriptive category, such as date, product, location, or customer. Dimensions may have multiple hierarchies, each hierarchy may have multiple levels, and each level may have multiple members. For instance, the location dimension may have a hierarchy consisting of country, state, and city levels.

Measures are aggregated based on the members of a dimension. A member is a particular instance of a dimension. Examples of members of a date dimension might be 2005, Quarter 1, January, and 1/1/2005. A level provides a grouping of members at a level of a hierarchy. In the preceding example, the levels that correspond to the example members might be Year, Quarter, Month, and Day. Dimension attributes are pieces of information about a member that are interesting for reporting or analysis, such as customer name, day of week, SKU, store hours, and so on.

The term fact table refers to a relational database table that is a central table in a data warehouse schema. A fact table usually contains a large number of rows, sometimes in the hundreds of millions of records. A fact table’s columns include numeric values for one or more measures and primary keys from dimension tables as the foreign key columns. An example of a fact table could be a table with the columns product_id, time_id, customer_id, promotion_id, store_id, store_sales, store_cost, and unit_sales. Note that a fact table contains primary keys from all the dimension tables as foreign keys, and the rest of the columns are the measures.

The two common schema design strategies for data warehouses include the star schema and the snowflake schema. The entity relationship diagram of the star schema resembles a star, with one fact table at the center and several dimension tables as the points of the star. The snowflake schema is an extension of a star schema such that one or more dimensions are defined by multiple tables. In a snowflake schema, only primary dimension tables are joined to the fact table. Additional dimension tables are joined to primary dimension tables. Snowflake schemas are more normalized than star schemas.

OLAP Storage Modes

There are three modes in which you can store dimensional data. Each choice has its own characteristic data storage requirement and can greatly affect both processing and querying performance:

Multidimensional OLAP (MOLAP)—MOLAP stores both the underlying data used to create the cube and the aggregations created for the cube in a multidimensional structure maintained on the OLAP engine, such as an Analysis Services server. Because the OLAP engine can answer all queries directly from this multidimensional structure without further accessing the relational or any other store, the MOLAP storage mode presents the best overall querying performance of all three storage modes. However, because the data used to create the cube is copied from the relational database to this multidimensional structure, even though the OLAP engine might compress this data, the storage requirement needed to support MOLAP is still the largest of the three storage modes. Another important consideration with the MOLAP strategy is the time it takes to process the cubes.

Relational OLAP (ROLAP)—With ROLAP storage mode, both the underlying data for the cube and the aggregates created for the cube are stored in a common relational database. Because the OLAP engine (the Analysis Server) must query the relational database to retrieve all cube data, this storage mode provides the slowest querying performance of the three storage modes.

Hybrid OLAP (HOLAP)—HOLAP combines elements from both MOLAP and ROLAP. With HOLAP, the aggregations are stored on the OLAP engine (the Analysis Services server) in multidimensional storage. However, the underlying data for the cube remains in the relational database and is not copied into multidimensional storage. HOLAP provides the best processing performance and uses the least additional storage. Query performance is typically better in HOLAP than in ROLAP but not as good as in MOLAP, depending on the aggregation design.

Analysis Services 2005 supports all three of these storage modes.

Analysis Services Fundamentals

Analysis Services is Microsoft’s OLAP engine. It stores multidimensional data in a way that facilitates the performance and flexibility of OLAP. Analysis Services 2000 was already the market leader in OLAP, and Analysis Services 2005 promises to further increase the appeal of the product in the OLAP market space. Two large factors differentiating Analysis Services from other products on the market are the way it deals with data explosion and data sparsity.

Data Explosion and Data Sparsity Handling

To explain data explosion, we first need to explore the power of OLAP engines in general. Most OLAP engines provide high-performance, flexible analytical environments by prestoring aggregations across several dimension levels. To better understand this, let’s look at a very simplistic example of a sales cube with two measures—sales amount and sales quantity—and two dimensions—product (category and SKU) and date (year, month, and day). Queries can request this data in a variety of different ways. Here are some sample requests that can be answered from the sample cube:

“How many widgets were sold in December of last year?”

“What is the average number of gadgets sold on Saturdays?”

“What were the total sales for last year?”

“How do the total sales for March of last year compare to the total sales for March of this year?”

OLAP engines provide quick answers by pre-calculating the answers to these and other questions. Given our simplistic example, there would be a total of 12 possible aggregations, as shown here:

Each of the four date levels (All, Year, Month, Day) is paired with each of the three product levels (All, Category, Product):

All Dates/All Products, All Dates/Category, All Dates/Product
Year/All Products, Year/Category, Year/Product
Month/All Products, Month/Category, Month/Product
Day/All Products, Day/Category, Day/Product

The number of actual pre-calculations depends on the number of members at each level in the combination. For instance, if we are storing 5 years and we have a total of 40,000 products, there would be 200,000 pre-calculations (5 × 40,000) for the “Year/Product” aggregation.

Data explosion occurs when additional dimensions or levels are added, especially those with a large number of members, such as customers. If each combination of dimension/level is precalculated, the number of aggregations can easily reach into the billions or higher. An efficient indexing strategy would be needed in order to parse through such a large set of data to find the relevant calculations!

Analysis Services solves this problem by selectively aggregating data. Taking the example above, Analysis Services might choose to store the “Month/Product” aggregation but not the “Year/Product” aggregation. A request for the total number of widgets sold last year could be easily resolved by adding the 12 “Month/Product” pre-calculations to obtain the total for the year. This is still far better than adding each product sale for the year.

One problem with this unguided, selective aggregation is that all dimensions and levels are considered equal candidates for aggregation. Analysis Services does not know to favor commonly queried dimension levels unless you tell it to. You can tell the Aggregation Design Wizard to eliminate some dimension levels from consideration for pre-aggregation. This takes the form of a property on the measure group that assigns the levels that should be removed from consideration. Another way to influence the aggregation design is to feed the wizard a query load that has occurred on the cube or measure group. This is referred to as usage-based optimization. These methods do not concretely direct the Aggregation Design Wizard to create one aggregation over another; they are simply means of influencing the wizard to work more intelligently. When you use these methods, you can actually aggregate at a higher level (that is, create more aggregations) because there are fewer available possibilities.

Another prevalent issue in OLAP is data sparsity. For any given cube, a number of dimension intersections will not be populated. In the preceding example, let’s say the store didn’t start selling a particular product until this year. This means that all the intersection points for this product in previous years would return empty cells. Cubes, by their very nature, tend to be highly sparse. A dense cube would mean nearly every combination of dimension levels would be active. This is not so hard to believe in our simplistic example, but in a more realistic cube, where we have Date, Product, Location, Customer, Discount Type, Promotion Type, and more, you can see where sparsity comes into play. In many other OLAP products, a placeholder is stored even for empty cells. Analysis Services compacts its cubes by not reserving space when there are no values for the intersection. The space savings can be in the neighborhood of 50% to 90%. This also results in a more efficient cube because there is less to manage and examine when queries are issued.

These two inherent features of Analysis Services have been integral to propelling it to its current position in the market. With this introduction to OLAP concepts and Analysis Services capabilities, let’s now see what’s new in Analysis Services 2005.

Analysis Services 2005 Enhancements

The enhancements to Analysis Services are extensive enough to warrant multiple books of their own. This chapter focuses on the enhancements that are relevant to Analysis Services administrators, mostly addressed at a very high level. Details are presented in specific cases where they provide critical insight into the new paradigms introduced in this new, dramatically improved version of the product. Specific changes to front-end development, such as the many language enhancements to Multidimensional Expressions (MDX), are not addressed in this chapter.

Integrated Tools and the Enhanced User Interface Experience

Business Intelligence Development Studio (BIDS) and SQL Server Management Studio (SSMS) are the two biggest tools introduced as part of the SQL Server 2005 toolset. BIDS provides an integrated environment for developing and deploying business intelligence solutions, and SSMS provides an administration and management interface for maintaining already deployed business intelligence solutions. SSMS also provides the ability to author and execute Multidimensional Expressions (MDX), Data Mining Extensions (DMX), and XML for Analysis (XMLA) queries. BIDS enables you to create SSIS packages, build Analysis Services objects such as cubes and data mining models, and author Reporting Services reports.

BIDS provides a number of templates to accelerate the development of cubes, dimensions, and so on. SSMS likewise provides templates to give you a head start in authoring MDX and DMX queries. Each management dialog in SSMS (such as the Analysis Server Properties dialog) contains a Script button that you can click to generate the XMLA script for the actions performed by that dialog. XMLA is discussed later in this chapter.

SSMS and BIDS are discussed in great detail in Chapter 5, “SQL Server 2005 Tools and Utilities.”

You can use the SQL Server Configuration Manager tool to start and stop the Analysis Services instance and to view or edit service properties, such as the account under which the service runs.

Profiler Support

The SQL Profiler tool can now be used to trace Analysis Services events. You can use it to view both administrative commands and query requests. The DDL commands are displayed as XMLA scripts, and the queries are displayed as either MDX or DMX. Using Profiler is a great way to learn about MDX, DMX, and XMLA queries. You can perform various operations, such as browse a cube, in SSMS or BIDS and use Profiler to see the commands executed by these tools. Profiler also exposes events to show whether data was read from cache or disk and which aggregations were used to satisfy the query. Figure 13.1 shows SQL Profiler in action, capturing the queries and commands submitted to an Analysis Services instance.

Figure 13.1. Profiler in SQL Server 2005 can be used to trace commands and queries submitted to Analysis Services 2005.


Multiple Instance Support

Analysis Services 2000 did not support the installation of multiple instances. SQL Server 2005 fixes that problem by allowing up to 50 instances of the Analysis Services engine with SQL Server 2005 Enterprise Edition, or up to 16 instances with other editions, to be installed on the same machine.

By default, Analysis Services listens on port 2383. Additional instances can be assigned specific ports or can use a dynamic port configuration. In the latter scenario, Analysis Services attempts to use port 2383. If that port is not available, it looks for the first available port in a range of port numbers. The Analysis Services redirector, a part of the SQL Browser Windows service, is responsible for redirecting connections to the appropriate ports by using instance names in the connection string. Therefore, the Analysis Services redirector port, 2382, must be open to any external traffic that does not explicitly specify port information.

Failover Clustering Support

Analysis Services 2000 did not support failover clustering. Analysis Services 2005 adds this high-availability support and allows Analysis Services to be installed in an 8-node failover cluster on 32-bit systems and in a 4-node failover cluster on 64-bit platforms. The SQL Server 2005 setup is cluster aware and can seamlessly install Analysis Services on cluster nodes.

Unified Dimensional Model (UDM) and Proactive Caching

The unified dimensional model (UDM) is arguably the most significant feature in Analysis Services 2005. The UDM provides a layer of abstraction between the source database and the data model that is presented to the querying user. The ultimate goal is to provide the best of OLAP and relational reporting.

Table 13.1 shows that the high-level benefits of relational reporting are not available in OLAP and vice versa. The UDM introduces a revolutionary platform for providing all these reporting features and more.

Table 13.1. Relational Reporting Versus OLAP Reporting


The UDM combines the best aspects of traditional OLAP-based analysis and relational reporting into one dimensional model. In its most simple form, the UDM could be used to provide user-friendly names to a querying client. Many OLTP applications have very cryptic column names that would not make sense to a business user, or even a developer, who tries to create queries on the information. You could simply define the UDM to provide a mapping to these underlying columns that renames them to more understandable names, such as changing CustID to Customer Identifier or CustFN to Customer First Name.

As mentioned previously, this is a highly simplified view of what the UDM provides; a UDM is far more powerful than that. The power of the UDM can best be demonstrated by walking through an implementation.

The UDM begins with the definition of one or more data sources. These are defined by using BIDS. These data sources can be any data store that is accessible via an OLE DB connection. This means that the UDM can actually be a combination of tables from various sources, such as SQL Server 2000, SQL Server 2005, Oracle, and DB2.

A data source view is then created to define a subset of information to be gathered from each data source. This data source view can then be manipulated to create relationships, rename tables or columns, add calculated columns, and more. The result is a logical data store that can be queried just as if it were physically stored in a single database. And, in fact, it can be physically stored. This is where the concept of proactive caching comes into play.

Proactive caching manages the local MOLAP data cache and can be configured to provide real-time, high-performance access to the information in the UDM at any point in time. This latency definition is literally represented as a slider bar in Analysis Services, as Figure 13.2 shows.

Figure 13.2. The Aggregation Design Wizard allows you to specify the storage mode and the caching options.


On one end of the slider bar, there is no latency. When changes have been made to the underlying data store, queries against the affected data revert to ROLAP until a new copy of the cache is rebuilt. At the other end of the slider bar is MOLAP. This means that the information is stored in Analysis Services and is updated within a period that is explicitly managed. Intermediate settings determine how often the MOLAP cache is rebuilt and whether the MOLAP cache can continue to be queried (instead of reverting to ROLAP) while the new cache is being built.
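To give a feel for what those slider settings map to, here is a minimal sketch using the AMO API (discussed later in this chapter). It assumes an already-retrieved AMO Partition object named partition, and every interval value shown is an illustrative assumption, not a recommendation:

Imports Microsoft.AnalysisServices

' Assumes an existing AMO Partition object named "partition"
' (see "Analysis Management Objects (AMO)" later in this chapter).
Dim pc As New ProactiveCaching()
pc.Source = New ProactiveCachingInheritedBinding() ' inherit change notifications from the data source
pc.SilenceInterval = TimeSpan.FromSeconds(10)          ' wait for a quiet period before rebuilding
pc.SilenceOverrideInterval = TimeSpan.FromMinutes(10)  ' rebuild anyway after sustained changes
pc.Latency = TimeSpan.FromHours(4)                     ' allow the MOLAP cache to be at most 4 hours stale
pc.ForceRebuildInterval = TimeSpan.FromDays(1)         ' rebuild at least once a day
partition.ProactiveCaching = pc
partition.Update()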

The beauty of proactive caching and the UDM is that they are fully configurable by the Analysis Services administrator. An experienced administrator knows that nothing comes without a price. It is rare that a business user does not say that he or she wants to receive information in real time, and it is up to administrators and developers to ascertain whether that is truly a requirement. The performance impact of real-time data should be taken into consideration; the UDM does not make that impact go away. MOLAP, at the other end of the spectrum, results in prestored, pre-calculated information that provides much better performance. MOLAP, however, does not give you the auto-processing functionality or the up-to-date information that you get with real-time or near-real-time analysis.

As you can see, the UDM does not remove all the real-world considerations of the past. However, it does provide a very easy mechanism for combining and presenting information from one or more source systems at a point in time that is completely configurable. Previously, such a solution would have required an extensive effort in architecting and programming.

Cube Enhancements

Analysis Services 2005 introduces several enhancements and new features related to designing and managing cubes. The following sections present some of the improvements introduced in Analysis Services 2005.

Multiple Fact Tables

You can use multiple fact tables in a single cube to consolidate related information. Each fact table is represented by a measure group, which is loosely equivalent to the concept of a cube in previous versions of Analysis Services. These multiple-fact-table cubes more closely resemble virtual cubes from Analysis Services 2000.

Intellicube

In Analysis Services 2000, an intimate knowledge of the underlying source database was required prior to the setup of dimensions and cubes. Each dimension had to be set up separately and added, explicitly, to a new cube. The Intellicube feature is invoked when you create a new cube and check the Auto Build box in the Cube Wizard. It examines the tables in the designated data source view and determines what appear to be dimension and fact tables. These can be accepted or changed during the cube-building process. This functionality is very accurate, particularly if the source tables follow traditional multidimensional design techniques (such as the star schema). This saves considerable time and effort in developing a new cube.

Key Performance Indicators (KPIs)

The term key performance indicator (KPI) refers to an indicator that measures how a business is doing in a given area. A simple example could be sales per day per retail square foot. For instance, for a given store, the KPI’s value might be defined as total sales divided by the number of days in the period and by the store’s retail square footage.

You could define this KPI as a part of a cube and easily have any application reference it. A KPI consists of a value, a goal, a status, and a trend. Several user-friendly graphics give a visual representation of the values, such as a gauge, traffic light, road sign, or thermometer. Figure 13.3 shows the KPIs tab inside BIDS.

Figure 13.3. The Cube Designer in BIDS allows you to define KPIs and associate visual representational graphics such as traffic lights, gauges, and so on with them.


Translations

Translations are a globalization feature that provides the ability to display metadata (that is, dimension and measure labels) and data in alternate languages. This is all controlled at the cube definition level, again removing complexity from the underlying applications. Figure 13.4 shows some sample label strings and their respective translation strings in Spanish and French.

Figure 13.4. The Translations tab allows you to define dimension and measure labels in multiple languages.


Perspectives

Perspectives provide a means of presenting alternate views of a cube to users to remove some of the underlying complexity. This is similar to creating a view in SQL Server that includes only a subset of the columns from the underlying tables.

Dimension Enhancements

At a high level, the concept of dimensions has changed in Analysis Services 2005. In Analysis Services 2000, dimensions were highly hierarchical structures. Dimension attributes, called member properties, could be used to create a separate dimension, called a virtual dimension. This greatly limited the navigational capabilities of business users.

Dimensions in Analysis Services 2005 are attribute based and are visible by default. They can be used for filtering and analysis, just like formal dimension structures were in Analysis Services 2000. This greatly expands the analytical capabilities of data analysts. Attributes correspond to the columns in the tables of a dimension. In addition to this extremely powerful aspect of dimension analysis, the following dimension enhancements are available in Analysis Services 2005.

No More 64,000 Limit

Analysis Services 2000 required that no member could have more than 64,000 children. This could be overcome by using member groups or introducing an explicit intermediate level that was invisible to the end user. This limitation is no longer present in Analysis Services 2005, and, therefore, no workaround is needed.

Role-Playing Dimensions

In Analysis Services 2000, any sourced dimension table could be used only once for a cube. This issue came up regularly if there were two date dimensions on the fact table that referenced a single date dimension. An example is when the fact table has an OrderDate and a ShipmentDate. Analysis Services 2000 required that a separate date dimension table be created to source the second dimension. In Analysis Services 2005, the same date dimension can be used for multiple foreign key columns in a fact table.

Reference Dimensions

A reference dimension table is one that is indirectly related to a fact table. Reference dimensions are readily apparent in snowflake designs, but Analysis Services 2005 provides the capability to separate out reference dimensions with or without an underlying snowflake design. These dimensions can be used as part of a dimension hierarchy or created as a separate dimension in their own right. An example might be a case where a Product dimension is related directly to a Sales fact table. A Vendor table may then be related to the Product table but not directly to the Sales fact table. In this case, Vendor could be a level above Product in the same dimension, and/or it could be segregated into its own explicit dimension, even though it is not directly related to the Sales fact table.

Fact Dimension Relationships

A fact dimension is a dimension whose attributes are drawn from a fact table. This provides a means of implementing degenerate dimensions in Analysis Services without explicitly creating and populating the dimension in the star schema. Due to their large size, degenerate dimensions were difficult and expensive to load and manage.

Many-to-Many Dimensions

Analysis Services 2005 provides a solution to the age-old problem of many-to-many dimensions. An example could be a product that exists in multiple categories. The relational model would typically have a ProductCategory bridge table between the Product dimension and the Sales fact table. Analysis Services allows you to specify this type of relationship when defining the relationships between dimension tables and fact tables. Analysis Services handles the rest, making sure each view of the data shows the correct totals.

Multiple Hierarchy Dimensions

The concept of multiple hierarchies existed in Analysis Services 2000 in name only. In essence, dimensions with multiple hierarchies were actually separate dimensions that shared a high-level name. In fact, these dimensions are migrated as separate dimensions when you use the Migration Wizard because that’s exactly what they are. In Analysis Services 2005, dimensions don’t have this formal hierarchical structure. Multiple hierarchies are supported through the use of dimension attributes, which are implicit within the product.

Data Mining Enhancements

Data mining can be used to provide tremendous insight into data without in-depth manual analysis. It can be used to determine patterns between two or more pieces of information (for instance, buying patterns based on age, gender, and years of education). Without the proper tools, the task of mining data is very tedious and requires expertise in statistical analysis. Analysis Services provides data mining algorithms that circumvent this requirement and allow for easy definition of mining models.

Analysis Services 2000 provided two data mining algorithms: clustering and decision trees. Data mining gets a significant boost in Analysis Services 2005, with the introduction of five new mining models. Here is a brief explanation of each of these new data mining algorithms:

Naïve Bayes—This is a relatively simple algorithm that works well for predictive behavior analysis. Because it requires less computational activity, it is faster than most other models.

Association—As its name implies, the Association algorithm associates items that are likely to appear together in a single unit, such as a transaction. This makes it ideal for market-basket analysis for the purposes of cross-selling.

Sequence clustering—This algorithm can be used to analyze the sequence of events in a unit. This makes it an excellent tool for click-stream analysis.

Time series—This is a forecasting algorithm that can be used to predict future sales trends based on historical data.

Neural network—This algorithm is similar to decision trees (available in Analysis Services 2000) but can define a three-dimensional node structure to analyze larger amounts of data.

In addition to these new algorithms, the decision trees algorithm has been enhanced to allow an additional, continuous attribute as a predictable column.

The small number of data mining algorithms in Analysis Services 2000 is only part of the reason that data mining was a largely unused part of the product. Another was the difficulty of setting up and maintaining mining models and presenting the resulting findings. Several new enhancements in Analysis Services 2005 have made data mining more accessible.

The Data Mining Wizard guides both novice and experienced users through the process of creating data mining structures and models. SSIS has been enhanced with new tasks that create and process mining models and can subsequently run queries against them. Reporting Services also includes the ability to build reports on top of mining models.

XMLA Support

As previously mentioned, Analysis Services communication is now based exclusively on XMLA. XMLA is a standard protocol for communicating with multidimensional data stores. It is the result of a consortium between Microsoft and two other leading OLAP vendors, Hyperion and SAS. The specification can be found at www.xmla.org.

According to xmla.org, XMLA provides an “open industry-standard web service interface designed specifically for online analytical processing (OLAP) and data-mining functions.” XMLA has actually been around since 2001 and was available for use in Analysis Services 2000, but it required the installation of the XML for Analysis Services Development Kit and the subsequent setup of a virtual directory in IIS. XMLA is the native language protocol in Analysis Services 2005 and is installed with the product. It can be integrated with SOAP to make Analysis Services a universally accessible web service.

Microsoft has further extended XMLA with a specification for managing an instance of Analysis Services and the Analysis Services objects such as cubes, dimensions, data sources, and mining models. These extensions are collectively referred to as the Analysis Services Scripting Language (ASSL). Objects can be scripted in either BIDS or SSMS, and the resulting scripts can be checked into a source code management system to provide database versioning. XMLA code can also be executed in SSMS.

Figure 13.5 shows an example of the XMLA script to create a new data source.

Figure 13.5. XMLA scripts can be authored and executed in SSMS. This script is for creating a data source in a database named Adventure Works DW.


XMLA consists of two methods: Discover and Execute. The Discover method provides a means of examining metadata, such as enumerating through databases, data sources, cubes, and data models on an Analysis Services instance. Properties can be examined and used to construct a subsequent query request in the form of MDX or DMX. These are sent to the Analysis Services instance by using the Execute method.
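As a hedged illustration of the Discover method at work, the following Visual Basic .NET fragment uses ADOMD.NET (covered later in this chapter), which issues the XMLA Discover request under the covers; the server and catalog names are assumptions:

Imports System.Data
Imports Microsoft.AnalysisServices.AdomdClient

Dim conn As New AdomdConnection("Data Source=localhost;Catalog=Adventure Works DW")
conn.Open()

' Behind the scenes, this issues an XMLA Discover request for the
' MDSCHEMA_CUBES schema rowset.
Dim cubes As DataSet = conn.GetSchemaDataSet("MDSCHEMA_CUBES", Nothing)
For Each row As DataRow In cubes.Tables(0).Rows
   Console.WriteLine(row("CUBE_NAME"))
Next

conn.Close()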

As mentioned earlier, all Analysis Services objects are scripted and saved in the Data directories. XMLA code can be scripted or viewed in various ways. One way is to find the .xml scripts throughout the data directory. Another is to select the View Code option in BIDS when viewing the object to be scripted. Probably the most intuitive method is to script the objects directly within SSMS. You can script virtually any object at any point in the Analysis Services hierarchy by right-clicking the object in Object Explorer and selecting the appropriate scripting menu option. Objects can even be scripted to an XMLA query window, where text can be globally replaced and the script run to create a replica of the source object. This approach is useful for copying cubes and for creating additional partitions on a measure group.

In summary, XMLA support in SQL Server 2000 was provided as an add-on, whereas XMLA is natively built into Analysis Services 2005. XMLA with Analysis Services 2000 required clients to send HTTP/SOAP requests, which were processed by IIS, the XMLA SDK, and the Pivot Table Service on the middle tier. The middle-tier components then communicated with Analysis Services 2000 to execute the XMLA commands and obtain the results. With Analysis Services 2005, the client can submit the XMLA commands to Analysis Services 2005 directly over TCP/IP or over HTTP by using IIS.


Tip

There are four ways to learn XMLA: (1) by using SQL Profiler and tracing the server activity to see XMLA queries submitted to the server; (2) by right-clicking Analysis Services objects in Object Explorer in SSMS and selecting the appropriate scripting option; (3) by right-clicking Analysis Services objects in Solution Explorer in BIDS and selecting View Code; and (4) by clicking the Script button on the management dialogs in SSMS.

For instance, you can right-click an Analysis Server instance in Object Explorer, select Properties, and click the Script button to generate XMLA script to alter the server properties. Because AMO (discussed next) is based on XMLA, you can write an AMO script, execute it while Profiler is running, and see the XMLA that it submits to the server to perform the action. For example, if you write the AMO code to end, or kill, a long-running or ghosted connection, you see an XMLA command similar to the following in Profiler:

<Cancel xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
   <SPID>1453</SPID>
</Cancel>


If XMLA seems complicated to you, you can use a .NET-based API called Analysis Management Objects (AMO) from .NET code or scripting languages such as VBScript to manage Analysis Services 2005 objects. AMO is discussed in the following section.

Analysis Management Objects (AMO)

AMO provides a programmer-friendly layer on top of XMLA. It is exposed as a .NET object model and can be coded using any .NET programming language, such as Visual Basic .NET or C#. It replaces its predecessor, Decision Support Objects (DSO). DSO is still available in Analysis Services 2005 for backward compatibility, but it has not been enhanced to support any of the new functionality. Therefore, cubes that have been changed to use any of the new features in Analysis Services 2005 are no longer manageable by DSO.

AMO provides a more logical view of the Analysis Services environment than DSO. DSO inevitably required the use of the universal MDStores interface in order to reference databases, cubes, partitions, and aggregations. This was certainly confusing to anyone new to the object model, and it resulted in code that was sometimes difficult to interpret. MDStores had a list of possible properties that could be interpreted differently and were optionally available, depending on the object being referenced.

AMO has its own complexities, but they are related to the new Analysis Services architecture. When you understand the new architecture, the AMO object model becomes very intuitive. UDM, measure groups, and perspectives are just some of the new functionality that changes the way you traverse the object hierarchy to implement code. For instance, in Analysis Services 2000, the natural hierarchy from server to partition looked like this:

Server → Database → Cube → Partition

The DSO code to traverse this hierarchy looked something like this:

Dim dsoServer As DSO.Server
Dim dsoDB As DSO.MDStore
Dim dsoCube As DSO.MDStore
Dim dsoPartition As DSO.MDStore

' Note that the database, cube, and partition are all MDStore objects.
Set dsoServer = New DSO.Server
dsoServer.Connect "localhost"

Set dsoDB = dsoServer.MDStores("Foodmart 2000")
Set dsoCube = dsoDB.MDStores("Sales")
Set dsoPartition = dsoCube.MDStores(1)

In Analysis Services 2005, the natural hierarchies have changed somewhat. In the preceding example, the roughly equivalent Analysis Services 2005 hierarchy is as follows:

Server → Database → Cube → Measure Group → Partition

A code segment to navigate this hierarchy would look something like this:

Server.Connect("LocalHost")
Database = Server.Databases("Foodmart 2000")
Cube = Database.Cubes("Sales")
MeasureGroup = Cube.MeasureGroups("Sales")
Partition = MeasureGroup.Partition(0)

AMO can be used to automate numerous tasks in Analysis Services, such as backing up all databases on a server, creating new partitions during incremental processing, and running reports to list metadata information and check for best practices. Virtually anything that can be done manually in Analysis Services can be coded in a .NET program, VBScript, or a script task in SSIS. This chapter focuses on SSIS because that is often the method of choice by system administrators. The SSIS script task is especially a good choice when using AMO to perform tasks such as adding partitions, performing backups, and so on because these tasks are often part of the end-to-end process of loading the data warehouse, which is usually performed in an ETL tool such as SSIS.

Using AMO Inside an SSIS Script Task

SSIS includes a new task that can be used to script managed code by using a .NET language. This has many advantages over VBScript, not the least of which is usability. The scripting environment offers many of the browse and help features of the Visual Studio .NET environment. You can use Visual Studio .NET to examine the objects, methods, and properties throughout the AMO model.

The first hurdle to implementing AMO, however, is referencing the object libraries. A limitation in Visual Basic for Applications (VBA), which is used in SSIS, requires that the AMO-related assemblies (Microsoft.AnalysisServices.DLL and Microsoft.DataWarehouse.Interfaces.DLL) be copied from the %ProgramFiles%\Microsoft SQL Server\90\SDK\Assemblies folder to the .NET folder in the Windows directory (for example, %windir%\Microsoft.NET\Framework\v2.0.50215). After you do this, the assemblies are accessible from the scripting task in SSIS. You can also right-click the script folder under Class View in the Microsoft Visual Studio for Applications script editor, select Add Reference, and then select Analysis Management Objects and Microsoft.DataWarehouse.Interfaces from the list. Figure 13.6 highlights the assembly references that are required for any basic AMO programming.

Figure 13.6. .NET assembly references are required to use the AMO API.


To see AMO in action from an SSIS script task, you can launch BIDS, select File | Open | Project/Solution or press Ctrl+Shift+O, and open the Backup Sample Project.sln solution file. Then you can double-click the BackupDatabase.dtsx SSIS package, double-click the BackupDatabase script task, and on the Script page click the Design Script button to view the script.

Here is how the Visual Basic .NET script that uses AMO to back up an Analysis Services database looks:

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime

Imports Microsoft.DataWarehouse.Interfaces
Imports Microsoft.AnalysisServices

Public Class ScriptMain

   Public Sub Main()
      Dim server As New Server
      Dim database As Database
      Dim databaseName As String
      Dim serverName As String

      Dim backupFolderName As String
      Dim backupInfo As New BackupInfo

      serverName = Dts.Variables("ServerName").Value.ToString()
      databaseName = Dts.Variables("DatabaseName").Value.ToString()
      backupFolderName = Dts.Variables("BackupDirectory").Value.ToString()

      server.Connect(serverName)
      database = server.Databases(databaseName)

      backupInfo.AllowOverwrite = True
      backupInfo.ApplyCompression = True
      ' The BackupDirectory value is expected to end with a path separator.
      backupInfo.File = backupFolderName & databaseName & ".abf"

      database.Backup(backupInfo)

      server.Disconnect()

      Dts.TaskResult = Dts.Results.Success
   End Sub

End Class

Three SSIS variables are created to pass the name of the Analysis Services server, a database name, and a backup directory. Right-click the workflow designer area, select Variables, provide the values for these three variables, save the package, and execute it. If the package execution fails, make sure that you have copied the Microsoft.AnalysisServices.DLL and Microsoft.DataWarehouse.Interfaces.DLL AMO assemblies from the %ProgramFiles%\Microsoft SQL Server\90\SDK\Assemblies folder into the .NET folder as mentioned previously. Also verify that you have updated the variable values with the correct server name, database name, and backup folder path, and that the specified backup folder already exists on the server.

AMO also allows you to create and maintain data mining objects, including model security, processing, and backup and restore.

ADOMD.NET

With SQL Server 2000, the ADO MD and XMLA add-ons were the two ways to run queries against the Analysis Server from a client application. ADO MD (MD for multidimensional) is a COM-based API created around OLE DB for OLAP. Just as other COM-based APIs are being replaced with .NET-based object libraries in SQL Server 2005, ADO MD is being replaced with ADOMD.NET.

ADOMD.NET is a .NET-based object model for querying the Analysis Services server. Like AMO, ADOMD.NET is based on XMLA. When you use the ADOMD.NET object model, behind the scenes it generates XMLA that is submitted over TCP/IP or HTTP to the Analysis Services server to retrieve data and metadata from the server.
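Here is a minimal sketch of an ADOMD.NET query in Visual Basic .NET. The server name, catalog (Adventure Works DW), and cube name (Sales) are assumptions; substitute your own, and reference Microsoft.AnalysisServices.AdomdClient.DLL in your project:

Imports Microsoft.AnalysisServices.AdomdClient

Module AdomdSample
   Sub Main()
      ' Connection details are assumptions; substitute your own server and catalog.
      Using conn As New AdomdConnection("Data Source=localhost;Catalog=Adventure Works DW")
         conn.Open()

         ' ADOMD.NET wraps this MDX query in an XMLA Execute request.
         Dim cmd As New AdomdCommand( _
            "SELECT [Measures].DefaultMember ON COLUMNS FROM [Sales]", conn)
         Dim cs As CellSet = cmd.ExecuteCellSet()

         Console.WriteLine(cs.Cells(0).FormattedValue)
      End Using
   End Sub
End Module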

Performance Tuning for Analysis Services

There are two primary aspects to tuning Analysis Services: tuning processing performance and tuning query performance. This is often a balancing act because improving one often has a negative impact on the other. In a well-tuned Analysis Services instance, increasing aggregation improves query performance. This has a negative impact on processing times. The goal is to intelligently aggregate based on what is known about query requests.

Usage-Based Optimization (UBO)

When no additional information is given, Analysis Services creates aggregations randomly, with no knowledge of what aggregations will be most used. This is similar to indiscriminately creating indexes on all columns of a SQL Server table, without thinking about what columns will most frequently be searched on. One difference in a data warehouse is that it must be assumed that any piece of information may be used as a filter. Nonetheless, there will certainly be dimension levels that will be far more active than others. As previously mentioned, aggregation design can be influenced by setting properties at various levels in a dimension or by implementing usage-based optimization (UBO). UBO takes a workload from Analysis Services and designs aggregations based on the queries that were logged. You’ll learn more about the implementation of UBO later in this chapter, in the section “Administering Analysis Services.”

Partitioning Enhancements

Partitioning continues to be a performance-enhancing technique in Analysis Services 2005. Partitioning can improve the performance of queries against cubes by minimizing the amount of data that needs to be interrogated. A common partitioning strategy in Analysis Services uses a date as the partitioning key. Dates are an integral part of most analysis across industry verticals, providing feedback on trends and comparisons to historical segments in time. Partitioning by date has the side benefit of facilitating administration of the OLAP cubes. As new data is loaded into cubes, only the affected partitions require any type of processing. This can significantly diminish the batch window required for MOLAP and HOLAP processing. In addition, you can easily archive data out of an active cube by simply deleting older partitions. Otherwise, the data would need to be removed from the underlying source tables, and the entire cube/measure group would have to be reprocessed to remove the older data.

The implementation of partitioning in Analysis Services 2005 has been enhanced in a couple of ways. First, data slices are no longer necessary if the partition’s storage mode will be MOLAP. With MOLAP, Analysis Services keeps heuristics on the data stored in the partitions and can quickly determine which partitions contain data relevant to a query. This does not apply to the other storage modes. Second, partitions are processed in parallel by default. In Analysis Services 2000, this behavior required the use of a DSO application.
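Partition maintenance is also easy to automate with AMO (discussed earlier in this chapter). The following fragment is a sketch only; the measure group variable, data source ID, partition name, and source query are all assumptions:

Imports Microsoft.AnalysisServices

' Assumes an already-retrieved MeasureGroup object named "measureGroup".
Dim dataSourceId As String = "Adventure Works DW" ' ID of an existing data source (assumption)

Dim p As New Partition("Sales 2005", "Sales 2005")
p.Source = New QueryBinding(dataSourceId, _
   "SELECT * FROM sales_fact WHERE time_id BETWEEN 20050101 AND 20051231")
p.StorageMode = StorageMode.Molap
measureGroup.Partitions.Add(p)
p.Update()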

The Optimize Schema Option

One of the primary options used in Analysis Services 2000 and earlier implementations was Optimize Schema. By default, Analysis Services 2000 was pessimistic about the underlying integrity of the star schema tables. When processing a cube, it would issue a SQL query that joined the fact table to each associated dimension table as defined in the cube. This greatly slowed cube processing. Because the foreign key in the fact table is usually the same as the cube’s member key, these joins could be removed by using the Optimize Schema option in the Cube Editor. This assumed that referential integrity was checked through foreign key relationships in the fact table or as part of the star schema loading process. Analysis Services 2005 uses the fact table’s member key instead of joining to the dimension table by default. The Optimize Schema option was unknown to many users of Analysis Services, so the fact that this optimization is now automatic will speed up processing by default.

Administering Analysis Services

Now let’s take a look at some of the changes in the administration of Analysis Services. Almost every aspect of administration has some noteworthy changes in SQL Server 2005.

One of the first notable areas is the location of the server/instance configuration properties. The properties associated with Analysis Services are more accessible than in previous implementations; some were previously available only through registry entries. Now you can right-click the Analysis Services instance in Object Explorer and select Properties to launch the Analysis Server Properties dialog. The configuration display now closely resembles SQL Server’s, showing each property’s current value, its default value, and whether a change requires a restart of the service. The Show Advanced (All) Properties check box at the bottom of the screen exposes additional configuration values. Figure 13.7 shows the Analysis Server Properties dialog.

Figure 13.7. The Analysis Server Properties dialog provides a convenient way to view and edit Analysis Services instance configuration properties.


Full explanation of all available properties is beyond the scope of this chapter, but the following sections describe what you need to do in order to implement UBO.

Implementing UBO

UBO is mentioned previously as a means of designing custom aggregations based on a workload. The first step is to provide a workload to Analysis Services. You accomplish this by turning on the query log. Several properties are available for controlling query log behavior. Best practice continues to be placing the query log in a SQL Server database. To do this, you enter a connection string in the Log\QueryLog\QueryLogConnectionString property by using the Analysis Server Properties dialog: select the Value cell and click the ellipsis (...) button to specify the server connection details. The name of the SQL Server table is controlled by the Log\QueryLog\QueryLogTableName property. For the duration of the query sampling period, you may want to modify the Log\QueryLog\QueryLogSampling property to sample queries more frequently. This property defaults to 10, but you can lower this value to sample every nth query instead of every 10th. There is a slight performance hit involved, so it is good to set the property back to the original value or higher when sampling is complete.

One final step is necessary in order to have Analysis Services create the log table in SQL Server. Check the Show Advanced (All) Properties check box to expose the Feature\CreateQueryLogTable property, and set its value to true. The Restart column for this property shows that a restart of Analysis Services is required for the change to take effect. After you restart, the table is created in SQL Server, and you see rows appear in the log after Analysis Services cubes have been queried.
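If you prefer scripting to the dialog, the same server properties can be set through AMO. The sketch below assumes the property names shown above, plus an illustrative connection string, database, and table name:

Imports Microsoft.AnalysisServices

Dim server As New Server()
server.Connect("localhost")

' Property names match those shown in the Analysis Server Properties dialog.
server.ServerProperties("Log\QueryLog\QueryLogConnectionString").Value = _
   "Provider=SQLNCLI;Data Source=localhost;Initial Catalog=ASQueryLog;" & _
   "Integrated Security=SSPI"
server.ServerProperties("Log\QueryLog\QueryLogTableName").Value = "OlapQueryLog"
server.ServerProperties("Log\QueryLog\QueryLogSampling").Value = "1" ' log every query

server.Update() ' commit the property changes to the server
server.Disconnect()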

The UBO Wizard has not changed significantly since Analysis Services 2000. It can be invoked through SSMS. If you have sampled a workload by following the steps just outlined, the UBO Wizard allows you to filter on information, such as queries run by a specific user or queries that exceeded a specified duration, and it attempts to design new aggregations from which those queries would benefit. Note that UBO aggregation design still follows the rules for good aggregation design, which weigh the cost of an aggregation versus the benefit.

In Analysis Services 2000, you could automate UBO by using the DSO COM API via the CubeAnalyzer and PartitionAnalyzer objects. This continues to be true in Analysis Services 2005 but is made easier by the fact that SSIS can now expand beyond ActiveX scripting. Some of the limitations of VBScript required that the NextAnalysisStep method be invoked externally in a Visual Basic wrapper. This is not the case when you’re using the Script task in SSIS.

Analysis Services Database Backup

You can back up Analysis Services databases to an operating system file to create a snapshot of a database at a point in time. To open the Backup Database dialog, connect to an Analysis Services instance by using Object Explorer in SSMS, right-click any database, and select Back Up. This is similar to the archive functionality in Analysis Services 2000, but with additional options. The backup options include an indication as to whether to overwrite the destination file if it exists, whether to apply compression, and whether to encrypt the backup file. You can also indicate whether to include security information with the backup; this is useful for production backups, but you can turn it off when you’re backing up a database with the intent to restore it on a different system. You can also include remote partitions in the Backup Database dialog, something that had to be handled separately in Analysis Services 2000.

Cube and Mining Model Processing

Cube and mining model processing is conceptually the same in SQL Server 2005 as in earlier implementations, although the introduction of the UDM changes how manual a process it needs to be. There are also a few new processing options, such as Process Index, Process Data, and Unprocess, that allow you to be more specific about what information is to be acted on. Also note that object processing is now done in parallel by default; for instance, when you process a cube, all partitions in all measure groups in the cube are processed in parallel. There are also new tasks in SSIS to more specifically define and perform the processing of cubes, dimensions, and mining models.
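For example, processing can be scripted with AMO; in this sketch the server, database, and cube names are assumptions:

Imports Microsoft.AnalysisServices

Dim server As New Server()
server.Connect("localhost")
Dim db As Database = server.Databases.FindByName("Adventure Works DW")

' Fully process one cube; its partitions are processed in parallel by default.
db.Cubes.FindByName("Sales").Process(ProcessType.ProcessFull)

' Alternatively, rebuild only the indexes and aggregations, leaving the data in place:
' db.Cubes.FindByName("Sales").Process(ProcessType.ProcessIndexes)

server.Disconnect()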

Analysis Services Security Overview

In the summer of 2003, Microsoft took a timeout to review the security vulnerabilities of literally every product it delivers. This is often referred to as Microsoft’s Trustworthy Computing initiative, and it resulted in a “secure by design, secure by default, and secure by deployment” approach. Many product configurations are shut off by default as a way of protecting consumers from potential security vulnerabilities that they might not even be aware of. This is certainly true of Analysis Services. Several options that were either not available in Analysis Services 2000 or were inactive by default are implemented in such a way as to make Analysis Services 2005 more secure with the default installation. Examples include disallowing anonymous access and encrypting authentication. You can change most of these options, but any such changes could make your server more vulnerable, so you should closely evaluate them to make sure you understand possible repercussions.

Analysis Services authentication works much the way it did in previous versions. There are two authentication schemes: integrated security and HTTP (IIS). When you use integrated security, Analysis Services leverages Windows authentication and all its associated robustness. Analysis Services can be configured to allow anonymous access, but, as mentioned previously, that is not the default, and it is recommended that you retain the defaults.

You can use HTTP to connect to Analysis Services over the Internet. Various middle-tier scenarios can be implemented to provide manageability when users connect from outside a trusted domain.
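For example, once the Analysis Services HTTP data pump (msmdpump.dll) has been configured in IIS, a client can connect with a URL-style data source. Here is a minimal ADOMD.NET sketch, assuming the Microsoft.AnalysisServices.AdomdClient assembly; the URL and catalog name are placeholders:

Imports Microsoft.AnalysisServices.AdomdClient

Module HttpConnectSketch
    Sub Main()
        ' msmdpump.dll is the Analysis Services HTTP data pump hosted in IIS.
        Dim conn As New AdomdConnection( _
            "Data Source=http://www.example.com/olap/msmdpump.dll;" & _
            "Catalog=Adventure Works DW")
        conn.Open()
        ' ... run MDX via AdomdCommand here ...
        conn.Close()
    End Sub
End Module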

Granular Permissions

Authorization within Analysis Services has changed to be more granular and to more closely resemble what is found in the SQL Server relational engine. Previous versions of Analysis Services provided a single, all-or-nothing administrative role. This meant that anyone who had permissions to perform administrative functions on one database had the same, all-inclusive access to every other database on the server. That user also had access to the server-level attributes. For instance, he or she could modify the memory settings or change logging characteristics.

Like Analysis Services 2000, Analysis Services 2005 has a single, serverwide role that enables its members to perform any activity within Analysis Services; it is similar to the system administrator (sysadmin) role in SQL Server. Members of the local Administrators group are automatically added as members of this role. A user in this role requires no additional permissions to perform administrative functions or to access any object in any database on that server instance. Unless explicit database access is granted through database roles, membership in this serverwide role is the only means of gaining access to Analysis Services databases.

In Analysis Services 2005, individual database roles define access within a database at a more granular level. Users can be added to a role that has Full Control, which grants full access to the database; much like the database owner role in SQL Server, no further permissions need to be granted to such users. More granular permissions can be granted by assigning a role one or more of the following permissions:

Process database

Read definition

Access a Data Source (Read/Write or None)

Access a Cube (Read, Read/Write, or None)

Access cell data (Read, Read-contingent, Read/Write, or None)

Access a Dimension (Read, Read/Write, or None)

Access a Mining Structure (Read or None)

Access a Mining Model (Read, Read/Write, or None)

More fine-grained permissions can be granted on individual objects within a database. If a user is a member of the Analysis Services administrators role or has Full Control access within a database, he or she needs no further permissions to access any object within that database. All other users require explicit access to objects.

Analysis Services objects can be secured at a number of levels, down to the cell data level. A user can be a member of more than one role and has access to objects based on the union of all access from all roles. For instance, if a user has Read permissions to the HR cube through membership in the Human Resources role and has Read/Write permissions to the HR cube through membership in the HR Admin role, that user has Read/Write permissions to this cube.
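To make the permission model concrete, here is a minimal AMO sketch that creates a database role and grants it Read access to a cube; the Microsoft.AnalysisServices assembly is assumed, and the role name, Windows group, and object names are placeholders:

Imports Microsoft.AnalysisServices

Module RoleSketch
    Sub Main()
        Dim srv As New Server()
        srv.Connect("Data Source=localhost")
        Dim db As Database = srv.Databases.FindByName("Adventure Works DW")

        ' Create a database role and add a Windows group as a member.
        Dim dbRole As Role = db.Roles.Add("Human Resources")
        dbRole.Members.Add(New RoleMember("MYDOMAIN\HRUsers"))

        ' Grant the role Read access to the HR cube.
        Dim perm As CubePermission = _
            db.Cubes.FindByName("HR").CubePermissions.Add(dbRole.ID)
        perm.Read = ReadAccess.Allowed

        db.Update(UpdateOptions.ExpandFull)
        srv.Disconnect()
    End Sub
End Module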

Dimension-Level Security

Probably the most integral aspect of end-user security is access to dimensions. In Analysis Services 2000, this access was based on access to dimension members in the dimension hierarchy. In Analysis Services 2005, the concept of hierarchy has been dismantled, and security is granted on attributes, because any attribute can participate in a dimension view. MDX is used to define AllowedSet and DeniedSet properties. As the names imply, an AllowedSet property defines a set of members that can be accessed, and a DeniedSet property defines a set of members to which access is denied. An additional property, ApplyDenied, is available on the DeniedSet property, and it defaults to True. The ApplyDenied property specifies whether additional members of the attribute hierarchy are denied based on the members specified in the DeniedSet property. For instance, if a role is denied access to the United States member of the Country attribute, having ApplyDenied set to True would also deny access to all members that are descendants of that member in the attribute hierarchy.
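As an illustration of these properties, the following hedged AMO sketch grants a role access to a dimension and then restricts its Country attribute; all object names and the MDX sets are placeholders:

Imports Microsoft.AnalysisServices

Module DimensionSecuritySketch
    Sub Main()
        Dim srv As New Server()
        srv.Connect("Data Source=localhost")

        Dim db As Database = srv.Databases.FindByName("Adventure Works DW")
        Dim geoDim As Dimension = db.Dimensions.FindByName("Geography")
        Dim dbRole As Role = db.Roles.FindByName("SouthCentral Sales")

        ' Give the role read access to the dimension, then restrict one attribute.
        Dim dimPerm As DimensionPermission = _
            geoDim.DimensionPermissions.Add(dbRole.ID)
        dimPerm.Read = ReadAccess.Allowed

        Dim attrPerm As AttributePermission = _
            dimPerm.AttributePermissions.Add( _
                geoDim.Attributes.FindByName("Country").ID)
        attrPerm.AllowedSet = "{[Geography].[Country].&[United States]}" ' visible
        attrPerm.DeniedSet = "{[Geography].[Country].&[Canada]}"         ' hidden

        db.Update(UpdateOptions.ExpandFull)
        srv.Disconnect()
    End Sub
End Module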

One of the primary issues with Analysis Services 2000 had to do with memory usage when dimension-level security was implemented. Dimension-level security controls the members to which a user has access; if a user is a member of the SouthCentral Sales role, he or she may be limited to sales information for the states of Texas, Louisiana, and Oklahoma. In Analysis Services 2000, this resulted in a replica dimension whenever a user connected with access to a different combination of dimension members. Dimension-level security uses memory much more conservatively in Analysis Services 2005, so memory utilization should no longer be a concern when deciding whether to use dimension-level security.

Role-Based Security

Analysis Services provides role-based security. A middle-tier application can authenticate the user itself and simply pass the roles to be used, via the Roles connection string property, when determining authorization to objects in SSAS. The disadvantage of this approach is that the session has no username, so dynamic security cannot be implemented. Two other connection string properties are EffectiveUser and EffectiveRoles. The EffectiveUser property provides the name of a user to impersonate; authorization is then determined based on the roles of which that user is a member. The EffectiveRoles property can provide a subset of the user’s roles that are defined within SSAS.
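As a brief sketch (the server, catalog, and role names are placeholders, and ADOMD.NET is assumed as the client library), a middle tier might pass roles like this:

Imports Microsoft.AnalysisServices.AdomdClient

Module RolesConnectionSketch
    Sub Main()
        ' Authorization is evaluated against only the roles listed here, which
        ' must be roles the connecting account could otherwise use.
        Dim conn As New AdomdConnection( _
            "Data Source=localhost;Catalog=Adventure Works DW;" & _
            "Roles=SouthCentral Sales")
        conn.Open()
        ' ... run MDX via AdomdCommand here ...
        conn.Close()
    End Sub
End Module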

Migrating from SQL Server 2000 Analysis Services to SQL Server 2005 Analysis Services

As should be apparent from reading the previous sections in this chapter, Analysis Services 2005 introduces a big paradigm shift from previous versions. There are two options for migrating to Analysis Services 2005: using the Migration Wizard or starting over and manually creating cubes from scratch.


Note

Analysis Services 2005 does not support migrating OLAP Services (the OLAP engine in SQL Server 7) databases.


Using the Migration Wizard

You can migrate Analysis Services 2000 databases to Analysis Services 2005 by using the Migration Wizard, which you can access by selecting Start | All Programs | Microsoft SQL Server 2005 | Analysis Services. You need to consider a number of issues when using this facility. The goal of the Migration Wizard is to migrate Analysis Services 2000 objects exactly as they are, so if you take this route, the new features of Analysis Services 2005 cannot be realized without some manual work afterward. The following information should help you determine some of the compatibility issues with objects in previous versions and how to immediately leverage some of the new functionality in Analysis Services 2005.

An easy migration could be just that: You could migrate one or more databases and reprocess the objects. Applications should be validated to make sure that no functionality has been lost as a result of the migration. Drill-through settings, linked objects, and remote partitions are not migrated. Table 13.2 shows a list of object types in Analysis Services 2000 and how they appear in Analysis Services 2005.

Table 13.2. Analysis Services 2005 Migration Outcomes


After you run the Migration Wizard, all objects need to be reprocessed. Any changes to incorporate any of the many new features in Analysis Services 2005 need to be performed manually.

Starting from Scratch

Rather than use the Migration Wizard and then manually update the features that need it, it might be more desirable to create cubes from scratch. This requires less effort than it might seem, thanks to the new IntelliCube technology. If you know your underlying schema well and it is in a star or snowflake format, the Cube Wizard can make a very accurate determination about which tables should become dimensions and which should become fact tables to feed the cubes when it invokes IntelliCube. Because all cubes need to be reprocessed with either the Migration Wizard or the Cube Wizard, the latter option is worth considering.

SQL Server 2005 Reporting Services (SSRS) Overview

Reporting is an integral part of the business intelligence development life cycle. Considering this, SQL Server 2005 includes a scalable and secure enterprise reporting platform called SQL Server 2005 Reporting Services. SSRS can be used to author, manage, and deliver various kinds of reports containing data from relational or multidimensional data sources. Reporting Services was originally released as an add-on to SQL Server 2000. It is now natively integrated into SQL Server 2005 and contains several enhancements over Reporting Services 2000. Before looking at these enhancements and new features, let’s review some of the core components and tools that make up the Reporting Services platform.

Reporting Services Components and Tools

The Reporting Services architecture consists of various components and tools that you can use to build and manage a Reporting Services application. Here is an overview of these components and tools:

• At the core of the Reporting Services architecture is a component called Report Server, which is responsible for data and report processing and for report delivery. Report Server itself consists of sub-components, including a Windows service and a web service; the Windows service provides the scheduling and delivery infrastructure, and the web service provides the processing and rendering programming interfaces. The Report Server interacts with a SQL Server database that stores information and metadata for published reports, report models, and the folder hierarchy. All access to this database must be handled through the Report Server, by using management tools (such as Report Manager and SQL Server Management Studio, discussed next) or programmatic interfaces such as URL access, the Report Server web service, or the Windows Management Instrumentation (WMI) provider.

• The web-based Report Manager tool can be used to access and manage reports; to perform administrative tasks such as managing folders, security, schedules, and subscriptions; and to launch Report Builder (which is discussed later). An alternative way to administer reports is to use Object Explorer in SQL Server Management Studio by connecting to a Reporting Services instance.

• SQL Server Business Intelligence Development Studio (BIDS) can be used to author and deploy reports. The report designer in BIDS is a full-featured report authoring tool that can be used to create both simple and complex reports. BIDS provides three SSRS project types: Report Server Project Wizard, Report Server Project, and Report Model Project. The wizard can be used to quickly create a report and then modify it; a Report Server Project can be used to build a report from the ground up by using the designer; and a Report Model Project can be used to create data sources, data source views, and report models that business users can use to create ad hoc reports in Report Builder (which is discussed next). A report model is an additional layer of information that maps database tables and views into concepts that are meaningful to business users.

• Report Builder is another report authoring tool available in SQL Server 2005. It is intended for business users who know their data and want to create ad hoc reports. Report Builder is the result of Microsoft’s purchase of a company named ActiveViews in March 2004. It is a lightweight reporting tool that allows a business user to design and build reports without knowledge of SQL or MDX and without using complex tools such as BIDS. Report Builder provides an interface similar to that of Microsoft Office tools such as Excel. After a user defines connection information to access a cube, the interface exposes the cubes, dimensions, and measures that are accessible to that user, who can then drag and drop the dimensions and measures to analyze. The resulting report can be deployed and managed like any other report. A report built using Report Builder can be further embellished through BIDS; a business user might define a report for one-time analysis, or he or she might decide that the report has ongoing benefit to additional users and request a more formalized report, based on the same report definition, from a formal report-writing department in the organization. Report Builder is a ClickOnce application, which means that when a user clicks the Report Builder link on the Report Manager website, the application is downloaded from the report server to the user’s local computer. ClickOnce facilitates centralized management of the application.

• The reports created by using BIDS and Report Builder are saved in an XML format called Report Definition Language (RDL). After a report is created, it can be made available to other users by publishing or deploying a report project in BIDS or by saving the report in Report Builder.

• Users can view reports by browsing to the Report Manager website. Developers can present reports in their custom applications by using the new ReportViewer controls (see the sketch following Figure 13.8) or on SharePoint sites by using the report viewer web part. BIDS and Report Builder both allow previewing reports. Reporting Services allows exporting reports to various formats, such as XML, PDF, CSV, TIFF, web archive (.mht), and Excel. Developers can extend the Reporting Services functionality by leveraging the WMI and SOAP web service programming APIs provided by SSRS.

• SSRS provides three command-line utilities that can be used to administer a report server: rsconfig.exe, rskeymgmt.exe, and rs.exe. The rsconfig.exe utility is used to set configuration information for the specified report server; the configuration values are stored in the RSReportServer.config file under the %ProgramFiles%\Microsoft SQL Server\MSSQL.instance_id\Reporting Services\ReportServer folder, and for security reasons, some configuration values are encrypted before they are saved. The rskeymgmt.exe utility is an encryption key management tool that can be used to back up, apply, and re-create symmetric keys. The rs.exe utility allows administrators to automate report server deployment and administration tasks by executing Visual Basic .NET scripts against the specified report server (an example script follows Figure 13.8).

• The new Reporting Services Configuration Manager tool allows configuring a local or remote Reporting Services installation. This tool allows managing Reporting Services virtual directories, Windows and web service credentials, the report server database, encryption keys, and SMTP e-mail delivery settings.

Figure 13.8 shows the various configuration options provided by the Reporting Services Configuration Manager.

Figure 13.8. Reporting Services Configuration Manager allows configuring a report server deployment.

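As mentioned in the component list above, developers can embed server reports by using the ReportViewer controls. Here is a minimal Windows Forms sketch, assuming the Microsoft.Reporting.WinForms control that ships with Visual Studio 2005; the server URL and report path are placeholders:

Imports Microsoft.Reporting.WinForms

Module ViewerSketch
    ' Call this from a form that contains a ReportViewer control.
    Sub ConfigureViewer(ByVal viewer As ReportViewer)
        ' Point the control at a report that lives on the report server.
        viewer.ProcessingMode = ProcessingMode.Remote
        viewer.ServerReport.ReportServerUrl = _
            New Uri("http://localhost/reportserver")
        viewer.ServerReport.ReportPath = "/Sales/OrderSummary"
        viewer.RefreshReport()
    End Sub
End Module

Setting ProcessingMode to Local instead renders a client-side report definition (.rdlc) without involving a report server.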
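Also as mentioned above, rs.exe runs Visual Basic .NET scripts against a report server. The sketch below simply lists the report server catalog; the rs proxy variable is supplied by the utility itself, and the script file name and server URL are placeholders:

' ListItems.rss: run with rs -i ListItems.rss -s http://localhost/reportserver
Public Sub Main()
    ' "rs" is the web service proxy that rs.exe supplies to the script.
    Dim items() As CatalogItem = rs.ListChildren("/", True)
    For Each item As CatalogItem In items
        Console.WriteLine("{0} ({1})", item.Path, item.Type)
    Next
End Sub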

Reporting Services 2005 Enhancements

SQL Server 2005 Reporting Services builds on the solid foundation of Reporting Services 2000, providing a deep integration with other SQL Server 2005 components and SharePoint, core product enhancements, improved development productivity, and new tools. Here are some of the new features and enhancements introduced by SQL Server 2005 Reporting Services:

• Setup and deployment enhancements: SQL Server 2005 setup has been enhanced to decouple setup from configuration. Setup provides a files-only installation option that copies the program files to disk; the Reporting Services Configuration Manager tool can then be used to complete the configuration. Setup also provides a default configuration option that installs a ready-to-use report server. Regardless of which setup option is chosen, the Reporting Services Configuration Manager tool can be used to configure and customize a Reporting Services installation. In addition, the new SQL Server 2005 Surface Area Configuration tool provides options to quickly enable or disable the Report Server Windows service (and hence scheduled events and report delivery) and the web service (and hence HTTP SOAP access).

• Management Studio integration: SQL Server Management Studio has been enhanced to allow administering one or more report server instances. In addition, scripting support allows generating a Visual Basic .NET script for report server objects; this script can optionally be updated and then executed by the rs.exe utility discussed earlier.

• Report Builder: As discussed earlier, SQL Server 2005 includes a new tool that business users can use to create ad hoc reports based on the report models created and published by model designers or developers. BIDS introduces a new project type, the Report Model Project, that can be used to create the report models consumed by the Report Builder client.

• Report functionality enhancements: Responding to feedback received on SQL Server 2000 Reporting Services, Microsoft has added several improvements to report functionality. Examples include the ability to specify multiple values for a parameter, enhanced printing support, interactive sorting in reports, SharePoint integration, and improved RDL. Various report authoring enhancements have also been made to the report designer inside BIDS, including a new graphical MDX query builder, an enhanced expression builder, and SSIS integration.

• Programmability enhancements: Reporting Services 2005 introduces two new web service interfaces, one to manage objects on the report server and one to control report processing and rendering. In addition, Visual Studio 2005 includes a set of freely redistributable ReportViewer controls that you can embed in your custom Reporting Services client applications.

Summary

Analysis Services 2000 was already a leader in business intelligence. SQL Server 2005 continues this leadership by introducing the model “integrate, analyze, and report.” The new tools and features introduced in this release make SQL Server 2005 an obvious choice for designing, developing, and deploying BI solutions. This chapter focuses on the “analyze” part of the new BI model and describes the latest and greatest features in Analysis Services 2005.

The chapter starts with an overview of OLAP, data warehousing, and BI concepts. Then it talks about some of the strengths of the Analysis Services 2000 release, the foundation on which Analysis Services 2005 is built. The chapter then discusses the new features in Analysis Services 2005, including the new UDM, cube and dimension enhancements, multiple-instance and clustering support, XMLA and AMO support, and data mining enhancements. The subsequent sections in the chapter provide an overview of Analysis Services 2005 performance tuning, administration, security, and migration-related topics. The chapter concludes with an overview of SQL Server 2005 Reporting Services, an integral part of the BI development life cycle in SQL Server 2005.

This chapter concludes the discussion on business intelligence support in SQL Server 2005. Chapter 14 discusses Service Broker, a very interesting feature introduced in SQL Server 2005 that allows developers to build scalable applications by using asynchronous messaging within the database.
