Metadata modeling
This chapter describes metadata modeling of relational data sources with IBM Cognos Framework Manager, a metadata modeling tool that drives query generation for IBM Cognos Business Intelligence (BI). Cognos BI enables performance management on normalized and denormalized relational data sources and also a variety of OLAP data sources.
The chapter contains the following sections:
3.1, "Cognos Framework Manager"
3.2, "Goals of metadata modeling relational data sources"
3.3, "Framework Manager architecture"
3.4, "Key objects of a relational model"
3.5, "Organizing relational models"
3.6, "Relational modeling for performance"
3.1 Cognos Framework Manager
With Cognos Framework Manager you can do the following tasks:
Create a project representing your data source (or sources)
Import required metadata, such as tables and views, from your data source
Model the metadata for your users
Set and manage security
Publish packages to make appropriate parts of the model available to report authors and self-service analysts
The chapter presents important reminders, examples, and preferred practices, but assumes you are already familiar with Framework Manager modeling concepts. It also assumes your project is using the dynamic query mode.
For additional details about Framework Manager modeling, see the following resources:
Framework Manager User Guide:
Guidelines for Modeling Metadata:
IBM developerWorks articles on proven practices in business analytics:
Business analytics product training from IBM Education:
3.2 Goals of metadata modeling relational data sources
Cognos Framework Manager is a metadata modeling tool where the building blocks for authoring reports and performing analysis are defined. A Framework Manager model is a business-focused representation of information from one or more data sources. It allows you to define reusable objects for security, translations, custom calculations, and other functions in a way that allows a single model to serve the needs of many groups of users.
Modeling with Framework Manager is an iterative process of refining different views of your metadata, starting with the data source view, then the business view, and finally the presentation view that your users consume. The end result is a metadata model that depicts all of your organization’s data in a simplified way that hides the structural complexity of the underlying data sources.
3.2.1 Modeling for self-service analysis
Successful self-service applications require presentation layers that have intuitive naming conventions and data formatting specifications that align with business conventions.
Performance problems associated with long-running queries during ad hoc analysis can be avoided by adding appropriate embedded filters to the Framework Manager model. Modelers should also consider adding reusable stand-alone filters and calculations to the package, so that users do not have to define these filters and calculations themselves, step by step, and wait for the intermediate queries that each step generates.
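For example, a reusable stand-alone filter published with the package can encapsulate a common business condition so that users apply it with a single action instead of building the logic themselves. The following expression is only a sketch; the item names are hypothetical and would need to match your own model:

[Presentation View].[Order header].[Order date] >= _add_years (current_date, -2)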
3.3 Framework Manager architecture
Framework Manager is a client-side graphical user interface that performs two primary functions:
Updates the project's model.xml file, which is the primary repository for models created in Framework Manager
Calls on the query service and other components of the Cognos BI server as needed
Figure 3-1 illustrates the communication channels with components of the Cognos BI server during various Framework Manager operations.
Framework Manager uses information entered into Cognos Configuration to communicate with server components. For example, Framework Manager uses the configured dispatcher URI to locate the active content manager service that populates the list of available data sources for metadata import. When testing query items, the query service on a Cognos BI server is what establishes connectivity to the data source and returns the requested results to Framework Manager. Framework Manager does not communicate with dynamic query data sources directly, which means that the associated JDBC drivers (which Framework Manager will not use) only need to be made available to the Cognos BI server. Note that a valid gateway URI must be entered into Cognos Configuration because Framework Manager connects through the gateway to authenticate users.
Figure 3-1 Framework Manager communication channels with Cognos BI server components
3.4 Key objects of a relational model
Query subjects, dimensions, determinants, and relationships are the primary objects used to build a metadata model. This section explains each of these objects and provides guidance on how to use them to convey the rules for generating effective queries against your business data.
3.4.1 Query subjects
A query subject is a set of query items that have an inherent relationship. For example, a table and its columns in a relational database may be represented in Cognos BI as a query subject and its query items. You use Framework Manager to modify query subjects to optimize and customize the data that they retrieve, such as by adding filters or calculations. When you change the definition of a query subject, Framework Manager regenerates the associated query items to ensure that any changes to the query subject properties are reflected in all query items for that query subject.
Various types of query subjects are available in Framework Manager:
Data source query subjects
Model query subjects
Stored procedure query subjects
Data source query subjects
Data source query subjects directly reference data in a single data source. Framework Manager automatically creates a data source query subject for each table and view that you import into your model.
For example, if you import the Employee Detail Fact table from the Great Outdoors Warehouse sample database (included with all Cognos BI products), Framework Manager creates a query subject using the following SQL statement:
Select * from [go_data_warehouse].EMPLOYEE_DETAIL_FACT
Framework Manager generates query subjects that represent tabular data from the data source. In this way, a query subject that references an entire table contains query items that represent each column in the table. If the SQL selects only specific columns, only those columns are represented as query items.
Unlike model query subjects (described later in this chapter), each data source query subject can reference data from only one data source at a time. Yet the advantage of data source query subjects is that you can directly edit the SQL that defines the data to be retrieved. This means that you can insert parameters to tailor your queries based on variables that are populated at run time, including attributes of the user that is initiating the query.
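For example, a macro can be embedded in the SQL of a data source query subject so that the query is filtered by an attribute of the user who runs it. The following sketch reuses the Employee Detail Fact query subject from above, but the MANAGER_ID column and the mapping to the authenticated user name are hypothetical; the macro is resolved at run time before the SQL is sent to the database:

Select * from [go_data_warehouse].EMPLOYEE_DETAIL_FACT
where EMPLOYEE_DETAIL_FACT.MANAGER_ID = #sq($account.personalInfo.userName)#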
Changing the type of SQL entered into data source query subjects
By default, the SQL statement for a data source query subject is Cognos SQL but you have the option to define it as either native SQL or pass-through SQL. Native and pass-through SQL statements must be completely self-contained and must not reference anything outside that SQL, such as database prompts, variables, or native formatting that would normally be supplied by the calling application. In contrast, Cognos SQL statements are analyzed using metadata from either the model or the relational data source. By default, Cognos SQL is case-sensitive, so it looks up metadata using the names as they are displayed in the SQL statement.
If you change an existing query subject to native SQL or pass-through SQL, you must first ensure that the SQL reflects the rules that apply to the native data source so that your query runs properly.
Cognos SQL
By default, Cognos SQL is used to create and edit query subjects in Framework Manager. Cognos SQL adheres to SQL standards and works with all relational and tabular data sources. The Cognos BI server generates Cognos SQL that is optimized to improve query subject performance, such as by removing unused elements at query time.
Cognos SQL works with any supported database because it is transformed into the appropriate native SQL at query time. In this way, working with Cognos SQL is preferable to the native or pass-through methods.
Native SQL
Native SQL is the SQL that the data source uses, such as DB2 LUW SQL or Oracle SQL. Use native SQL to pass the SQL statement that you enter to the database. Cognos BI can add statements to what you enter to improve performance. You cannot use native SQL in a query subject that references more than one data source in the project.
Pass-through SQL
Use pass-through SQL when the SQL statement that you enter is not valid inside a derived table or subquery. Pass-through SQL lets you use native SQL without any of the restrictions that the data source imposes on subqueries. This is because pass-through SQL query subjects are not processed as subqueries. Instead, the SQL for each query subject is sent directly to the data source where the query results are generated.
If the SQL you are entering is valid inside a derived table or subquery, identify it as native instead of pass-through because doing so increases the opportunity for improved performance when more query processing is performed by the database and less data is returned from the database to the Cognos BI server. To optimize performance, the Cognos BI server will always try to submit as much of the processing to the database as possible and will employ derived tables to do it. Identifying custom SQL as pass-through SQL prevents the Cognos BI server from submitting this SQL inside of a derived table.
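As a simplified sketch of the difference (the table, column, and filter values are illustrative only), SQL that is marked as native can be wrapped by the Cognos BI server in a derived table, so that additional processing such as a report filter is pushed to the database in the same statement:

select
    T0.PRODUCT_LINE_CODE,
    T0.PRODUCT_LINE_EN
from
    (
        -- the custom SQL entered in the query subject
        select * from GOSALES.PRODUCT_LINE
    ) T0
where
    T0.PRODUCT_LINE_CODE > 100

SQL that is marked as pass-through is sent to the database exactly as entered, so any additional processing of its results must be performed by the Cognos BI server.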
Model query subjects
Model query subjects are not generated directly from a data source but are based on query items in other query subjects or dimensions, including other model query subjects. By using model query subjects, you can create a more abstract, business-oriented view of a data source.
Model query subjects are based on the metadata in your model. This allows you to take the following actions:
Rename items in your model and reorganize them into a layer that is appropriately presented for authors.
Reuse complex SQL statements that exist elsewhere in the model.
Reference objects from different data sources in the same query subject.
The SQL for a model query subject is generated by the query service and cannot be edited directly. If you want to edit the SQL of a model query subject, the preferred method is to copy the SQL for the model query subject from the query information tab and paste it into a new data source query subject. Otherwise, you can convert the model query subject into a data source query subject through the Actions menu.
Stored procedure query subjects
Stored procedure query subjects are generated when you import a procedure from a relational data source. Framework Manager supports only user-defined stored procedures; system stored procedures are not supported.
The procedure must be run in Framework Manager to get a description of the result set that the procedure is expected to return. The stored procedure must return a single uniform result set. Cognos BI supports only the first result set that is returned. If the procedure can conditionally return a different result set, the format of that set must be consistent with the one used to define the metadata in Framework Manager.
Each result set must be returned in the same format, such as the same number and types of columns and column names. Overloaded signatures are supported by Cognos BI, but each procedure must be defined with a unique name and a separate query subject for each result set. Output parameters are not supported.
After you import or create a stored procedure query subject, it displays as a broken object. You must run it to validate the underlying stored procedure and specify the projection list. Static metadata often does not exist for the stored procedure in the relational data source that describes what a result set may look like. The result set may be known only at run time. When a stored procedure is updated in the data source, running the stored procedure in Framework Manager updates the query subject using the newly generated query items.
Sometimes, functions are imported as stored procedure query subjects. Review the stored procedure definition to determine what the procedure expects to be passed and what it attempts to return. Edit and test each stored procedure query subject that you think might be a function. If the test fails, the query subject is a function and must be deleted.
As of Cognos BI version 10.2.1, you can specify the type of transaction that is used by stored procedure query subjects. By default, a query subject that contains a stored procedure is run in a read-only transaction. However, the stored procedure might include operations that require a read/write transaction. The transaction access mode property for data sources specifies the access mode of a new transaction. The options are as follows:
Unspecified: A new transaction is started in the default mode of the JDBC driver
Read-Only: A new transaction is started in read-only mode
Read-Write: A new transaction is started in read/write mode
The transaction statement mode property applies only to the compatible query mode and is ignored in the dynamic query mode.
3.4.2 Dimensions
Dimensions must be defined to enable the OLAP experience associated with a dimensionally modeled relational (DMR) package. A dimension is a broad grouping of data about a major aspect of a business, such as products, dates, or markets. The types of dimensions that you can work with in Framework Manager are regular dimensions and measure dimensions.
Table 3-1 on page 31 presents an example of the dimensions in a project for sales analysis.
Table 3-1 Dimensions of a sample project for sales analysis
Name       Type      Description
Time       Regular   Dates of sales organized into years, quarters, months, weeks, and days when sales were made
Region     Regular   Locations of sales grouped into sales regions, countries, and cities
Product    Regular   Product details organized by product type, brand, model, color, and packaging
Customer   Regular   Customer information
Sales      Measure   Purchase details such as units sold, revenue, and profit
Query subjects and dimensions serve separate purposes. The query subject is used to generate relational queries and can be created using star schema rules, while the dimension is used for DMR analytics, which introduces OLAP behavior. Because query subjects are the foundation of dimensions, a key part of any dimensional model is a sound relational model. By creating a complete relational model that delivers correct results and good performance, you will have a strong foundation for developing a dimensional model.
3.4.3 Determinants
Determinants establish granularity by representing subsets or groups of data in a query subject and are used to ensure correct aggregation of repeated data. Determinants are closely related to the concept of keys and indexes in the data source and are imported from the database based on unique key and index information in the data source. It is preferred that you always review the determinants that are imported and, if necessary, modify them or create additional ones. By modifying determinants, you can override the index and key information in your data source, replacing it with information that is better aligned with your reporting and analysis needs. By adding determinants, you can represent groups of repeated data that are relevant for your application.
Determinants affect the grouping and aggregation of data, both for the query subject itself and for other query subjects that have relationships with it. When you define a non-unique item as a determinant, select the Group by check box. This indicates to the Cognos BI server that when the keys or attributes associated with that determinant are repeated in the data, the server should apply aggregate functions and grouping to avoid double-counting. Do not specify determinants that have neither of the following check boxes selected, because such determinants have no effect:
Uniquely identified
Group by
Determinants for query subjects are not the same as levels and hierarchies for regular dimensions but they can be closely related to a single hierarchy. If you plan to use your query subjects as the foundation for dimensions, consider the structure of the hierarchies you expect to create and ensure that you have created determinants that will support correct results when aggregating. The query subject should have a determinant specified for each level of the hierarchy in the regular dimension. Specify the determinants in the same order as the levels in the regular dimension.
If you expect to have multiple hierarchies that aggregate differently, you might consider creating an additional query subject with different determinants as the source for the other hierarchy.
Although determinants can be used to solve a variety of problems related to data granularity, use them primarily in the following cases:
A query subject that behaves as a dimension has multiple levels of granularity and will be joined on different sets of keys to fact data. An example is a time query subject that has multiple levels of granularity and is joined to the inventory query subject on the month key and to the sales query subject on the day key.
There is a need to count or perform other aggregate functions on a key or attribute that is repeated. For example, the time query subject has a month key and an attribute, days in the month, that is repeated for each day. If you want to use days in the month in a report, you do not want the sum of days in the month for each day in the month. Instead, you want the unique value of days in the month for the chosen month key.
You want to uniquely identify the row of data when retrieving text BLOB data from the data source. Querying BLOBs requires additional key or index type information. If this information is not present in the data source, you can add it using determinants.
A join is specified that uses fewer keys than a unique determinant that is specified for a query subject. There will be a conflict if your join is built on a subset of the columns that are referenced by the keys of a unique determinant on the 0..1 or 1..1 side of the relationship. Resolve this conflict by modifying the relationship to fully agree with the determinant or by modifying the determinant to support the relationship.
You want to override the determinants imported from the data source that conflict with relationships created for reporting. For example, there are determinants on two query subjects for multiple columns but the relationship between the query subjects uses only a subset of these columns. If it is not appropriate to use the additional columns in the relationship, then you must modify the determinant information of the query subject.
Determinants example
Table 3-2 on page 33 presents sample rows from a time query subject with unique month keys. Table 3-3 on page 33 presents sample rows from a time query subject that has non-unique month keys. These two data sets illustrate the concept of determinants.
In both data sets, the day key is the unique key of the table, so you can associate all of the columns in the table to this key. Because it is a unique key in both data sets, in both scenarios you identify the day key as a determinant with the check boxes in the following states:
Selected: Uniquely identified
Deselected: Group by
In both scenarios, the year key is not unique so the Uniquely identified box should remain deselected for this determinant. However, because the year key is all that is needed to identify a year in the data, the Group by box is selected to ensure that both the select distinct and group by SQL clauses are used to display individual years in reports instead of repeated values.
The values of the month key are what provide the difference between the two data sets in this example. Unlike the month key data in Table 3-2 on page 33, the month key data in Table 3-3 on page 33 is not sufficient to identify a particular month in the data (because January in different years would have the same month key value). For the Table 3-2 on page 33 scenario, only the month key is required for the month determinant because each key contains enough information to identify the group within the data. For the Table 3-3 on page 33 scenario, the month determinant requires both the month key and the year key to identify months as a sub-grouping of years. Table 3-4 on page 33 summarizes the determinants of both data sets.
Table 3-2 Sample data set with unique month keys
Year key   Month key   Month name   Day key    Day name
2013       201301      January      20130101   January 1, 2013
2013       201301      January      20130102   January 2, 2013
Table 3-3 Sample data set with non-unique month keys
Year key   Month key   Month name   Day key    Day name
2013       01          January      20130101   January 1, 2013
2013       01          January      20130102   January 2, 2013
Table 3-4 Determinant settings for Table 3-2 and Table 3-3 data sets
Data set    Name of determinant   Key                   Attributes                       Uniquely identified   Group by
Table 3-2   Year                  Year key              None                             No                    Yes
Table 3-3   Year                  Year key              None                             No                    Yes
Table 3-2   Month                 Month key             Month name                       No                    Yes
Table 3-3   Month                 Year key, month key   Month name                       No                    Yes
Table 3-2   Day                   Day key               Year key, month name, day name   Yes                   No
Table 3-3   Day                   Day key               Year key, month name, day name   Yes                   No
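To illustrate the effect of these settings, consider a request for Year key and Month name against the Table 3-3 data set. The following is a simplified sketch only (it assumes a hypothetical TIME_DIM table with the columns shown above and is not the exact SQL that the query service generates), but it shows how the Group by determinants cause the data to be grouped on the year and month keys so that each month appears once rather than once per day:

select
    TIME_DIM.YEAR_KEY as Year_key,
    MIN(TIME_DIM.MONTH_NAME) as Month_name
from
    TIME_DIM
group by
    TIME_DIM.YEAR_KEY,
    TIME_DIM.MONTH_KEY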
3.4.4 Relationships
A relationship describes how to create a relational query for multiple objects in the model. Without relationships, these objects are isolated sets of data. Relationships work in both directions. You often must examine what is happening in both directions to fully understand the relationship.
When importing metadata, Framework Manager can create relationships between objects in the model based on the primary and foreign keys in the data source or by matching query item names. You can create or remove relationships in the model so that the model better represents the logical structure of your business. After you import metadata, verify that the relationships you require exist in the project and that the cardinality is set correctly. The data source might have been designed without using referential integrity. Often, many primary and unique key constraints are not specified. Without these constraints, Framework Manager cannot generate the necessary relationships between fact tables and dimension tables.
The cardinality of a relationship is the number of related rows for each of the two query subjects. The rows are related by the expression of the relationship, which typically refers to the primary and foreign keys of the underlying tables.
The Cognos BI server uses the cardinality of a relationship in the following ways:
To avoid double-counting fact data
To support loop joins that are common in star schema models
To optimize access to the underlying data source system
To identify query subjects that behave as facts or dimensions; 1-to-n cardinality implies fact data on the n side and dimension data on the 1 side
By default, Framework Manager uses Merise notation in relationship diagrams. Merise notation marks each end of the relationship with the minimum and maximum cardinality of that end. When you interpret cardinality, you must consider the notation that displays at both ends of the relationship. Possible end labels are as follows:
0..1 (zero or one match)
1..1 (exactly one match)
0..n (zero or more matches)
1..n (one or more matches)
The first part of the notation specifies the type of join for the relationship:
An inner join (1) shows all matching rows from both objects.
An outer join (0) shows everything from both objects, including the items that do not match. An outer join can be qualified as full, left, or right. Left and right outer joins take everything from the left or right side of the relationship respectively but take only what matches from the other side.
Users see a different report depending on whether you use an inner or outer join. Consider, for example, users who want a report that lists sales people and orders. If you use an outer join to connect sales people and orders, the report shows all salespeople, regardless of whether they have any orders. If you use an inner join, the report shows only salespeople who have placed orders.
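In terms of the generated SQL, the difference resembles the following sketch (the SALES_REP and ORDER_HEADER tables and their columns are hypothetical):

-- Outer join: every sales person is returned, even those with no orders
select SALES_REP.REP_NAME, ORDER_HEADER.ORDER_NUMBER
from SALES_REP
left outer join ORDER_HEADER on ORDER_HEADER.REP_ID = SALES_REP.REP_ID

-- Inner join: only sales people who have at least one order are returned
select SALES_REP.REP_NAME, ORDER_HEADER.ORDER_NUMBER
from SALES_REP
inner join ORDER_HEADER on ORDER_HEADER.REP_ID = SALES_REP.REP_ID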
Data in one object might have no match in the other object. However, if the relationship has a minimum cardinality of 1, an inner join is always used and these records are ignored. Conversely, if all items match but the relationship in the model has a minimum cardinality of 0, an outer join is always used, although the results end up being the same as with an inner join. For example, the underlying table for one object contains a mandatory (non-NULL) foreign key for the other object. In this case, you must ensure that the data and cardinalities match.
The second part of the notation defines the relationship of query items between the objects.
Cognos BI supports both minimum-maximum cardinality and optional cardinality.
In a 0..1 relationship, 0 is the minimum cardinality, 1 is the maximum cardinality.
In a 1..n relationship, 1 is the minimum cardinality, n is the maximum cardinality.
A relationship with cardinality specified as being in a range from 1..1 to 1..n is commonly referred to as 1-to-n when focusing on the maximum cardinalities.
A minimum cardinality of 0 indicates that the relationship is optional. You specify a minimum cardinality of 0 if you want the query to retain the information on the other side of the relationship in the absence of a match. For example, a relationship between customer and actual sales can be specified as 1..1 to 0..n, in which case reports will show the requested customer information although there might not be any sales data present.
This means that a 1-to-n relationship can also be specified as any of the following items:
0..1 to 0..n
0..1 to 1..n
1..1 to 0..n
1..1 to 1..n
Use the Relationship impact statement in the Relationship definition dialog box to help you understand the cardinality of each side of a relationship.
When generating queries, Cognos software follows these basic rules to apply cardinality:
Cardinality is applied in the context of a query
A 1-to-n cardinality implies fact data on the n side and dimension data on the 1 side.
A query subject may behave as a fact query subject or as a dimensional query subject, depending on the relationships that are required to answer a particular query.
3.5 Organizing relational models
When building a model, it is important to understand that there is no single workflow that will deliver a model suitable for all applications. Before beginning your model, you must understand the application’s requirements for functionality, ease of use, and performance.
A well organized model helps users more easily find and understand the data in the model. It also makes the model easier for you to manage and maintain. By ensuring that a layer of model objects (either query subjects or dimensions) exists between the data source and the objects exposed to authors, you are better able to shield users from change.
A good approach is to import and work with small portions of metadata. In this way, determining how to build relationships and provide a clear view of the data for the users who will author reports using what you publish is easier.
A leading practice is to create several views, or layers, in the model: a data view, a business logic view, and a presentation view. Depending on the complexity of your situation, you may not need a presentation view; it might be possible to publish your business view directly for use by users.
Security can be defined in any of the views. The choice depends on your business requirements. For example, while security is typically applied in the business view, if you must prevent everyone from viewing an object, you add security to the object in the import view.
3.5.1 Data view
The data view, represented as a namespace, houses your data source-level query subjects and relationships. This view can be considered the import or physical layer.
Consolidating the creation of relationships in this view will take advantage of optimized queries in the query engine. If relationships are defined on model query subjects, then all associated joins and columns will be treated as a view and the Cognos BI server will not try to minimize the SQL for those items. For this performance reason, relationships should be defined on data source query subjects in the data view.
Avoid renaming data source query subjects or modifying their SQL in the data view, so that the query engine can make maximum use of caching. With minimal renaming, the query engine can make greater use of cached results, which improves performance by reducing the need to re-query your data source for metadata.
Create one or more optional namespaces or folders for resolving complexities that affect querying using query subjects or dimensional objects. To enable an OLAP experience over a relational database through a DMR package, there must be a namespace or folder in the model that represents the metadata with dimensional objects.
3.5.2 Business logic view
The business logic view, often referred to as the logical view, is where you can begin to simplify your data for presentation. This is done by denormalizing your data view by consolidating snowflake dimensions into one model query subject, or by hiding codes found in fact query subjects.
This view contains all of your model query subjects, and query items can be renamed to be more user-friendly. You can add further value with business logic by creating calculations and filters in this view.
Ideally, this view can provide an insulating layer from changes to the data source, so if the schema of the underlying database changes, there would be no need to modify any existing reports based on the business view or the presentation view. When table or other structure changes occur in the database, you can remap the model query subjects in the business view or the presentation view to new data source query subjects without any impact to report authors.
Security can be dynamically applied with calculations, session parameters, parameter maps and macros.
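For example, a security filter in the business view can use a macro that resolves at run time. The following sketch assumes a hypothetical parameter map named security_region_map that maps user names to region codes, and a hypothetical Region code query item; the macro looks up the entry for the current user and quotes the result:

[Business View].[Sales fact].[Region code] = #sq($security_region_map{$account.personalInfo.userName})#

Because the expanded value of the macro differs from user to user, users who resolve to different region codes see different results and, by default, do not share cached results.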
3.5.3 Presentation view
The presentation view can further simplify and organize your model. Depending on your audience, you can skip creating a presentation view as the organization and work done in the business view may suffice.
Using folders to group relevant items, filters, and shortcuts and relate them to the query items created in the business view allows you to provide a clear and simple view into the data for report builders.
3.6 Relational modeling for performance
This section provides guidance for creating models that will enable efficient, high performing reports. The impact of modeling techniques on SQL generation and caching is also described.
3.6.1 As view versus minimized SQL generation
There are two approaches to SQL generation that the Cognos BI server can employ when planning queries: minimized SQL and as view. The key difference between these two approaches is whether the query service is free to optimize SQL generation as it wants to or if there are constraints imposed by the Framework Manager modeler that must be respected.
With minimized SQL, the SQL generated by the Cognos BI server contains only the minimum set of tables and joins needed to obtain values for the selected query items. If you are modeling a normalized data source, you might choose to focus on minimized SQL because it can reduce the number of tables used in some requests and perform better.
With the as view SQL generation type, the SQL stays the same no matter which items in the query subject are referenced.
Minimized SQL example
Figure 3-2 shows four product tables as four data source query subjects that are joined to each other. The associated query items are combined into a model query subject, as shown in Figure 3-3.
Figure 3-2 Product tables as four data source query subjects
Figure 3-3 Model query subject combining all product query items shown in Figure 3-2
If you test the products model query subject depicted in Figure 3-3 as a whole, you can see that all four product tables are referenced in the from clause of the query, as shown in Example 3-1 on page 38.
Example 3-1 SQL generated for testing all query items
select
PRODUCT_LINE.PRODUCT_LINE_CODE as Product_Line_Code,
PRODUCT_LINE.PRODUCT_LINE_EN as Product_Line,
PRODUCT_TYPE.PRODUCT_TYPE_CODE as Product_Type_Code,
PRODUCT_TYPE.PRODUCT_TYPE_EN as Product_Type,
PRODUCT.PRODUCT_NUMBER as Product_Number,
PRODUCT_MULTILINGUAL.PRODUCT_NAME as Product_Name,
PRODUCT_MULTILINGUAL.DESCRIPTION as Product_Description,
PRODUCT.INTRODUCTION_DATE as Introduction_Date,
PRODUCT.PRODUCT_IMAGE as Product_Image,
PRODUCT.PRODUCTION_COST as Production_Cost,
PRODUCT.MARGIN as Margin
from
gosl..gosl.PRODUCT_LINE PRODUCT_LINE,
gosl..gosl.PRODUCT_TYPE PRODUCT_TYPE,
gosl..gosl.PRODUCT PRODUCT,
gosl..gosl.PRODUCT_MULTILINGUAL PRODUCT_MULTILINGUAL
where
(PRODUCT_MULTILINGUAL."LANGUAGE" = N'EN')
and
(PRODUCT_LINE.PRODUCT_LINE_CODE = PRODUCT_TYPE.PRODUCT_LINE_CODE)
and
(PRODUCT_TYPE.PRODUCT_TYPE_CODE = PRODUCT.PRODUCT_TYPE_CODE)
and
(PRODUCT.PRODUCT_NUMBER = PRODUCT_MULTILINGUAL.PRODUCT_NUMBER)
If you test only the product name query item, you can see that the resulting query uses only the product multilingual table, which is the only table that was required. This is the effect of minimized SQL generation, as shown in Example 3-2.
Example 3-2 SQL generated that has been minimized
select
PRODUCT_MULTILINGUAL.PRODUCT_NAME as Product_Name
from
gosl..gosl.PRODUCT_MULTILINGUAL PRODUCT_MULTILINGUAL
where
(PRODUCT_MULTILINGUAL."LANGUAGE" = N'EN')
Criteria that prevent minimized SQL generation
The following modeling scenarios will cause the corresponding query subject to function as a view, which means that the associated SQL will not be minimized:
Modifying the default SQL in the expression of a data source query subject
Adding filters or calculations to a data source query subject
Defining relationships on a model query subject
To allow minimized SQL to be generated, relationships and determinants must be defined on data source query subjects and not model query subjects. When a relationship is defined on a model query subject, the resultant SQL changes because now it is considered a query and not only a folder of expressions.
Minimized SQL takes better advantage of database optimization than complex SQL does, so avoid the three scenarios just described where possible. However, there might be times when losing SQL minimization is necessary, such as when you require model query subjects with overriding relationships to control query execution paths, or when you need to change the SQL on data source query subjects. In those cases the query subject behaves as a view, which keeps the number of rows that are returned from the query subject stable, regardless of the elements that are reported from it.
3.6.2 Security-aware caching
As detailed in Chapter 1, “Overview of Cognos Dynamic Query” on page 1, the Cognos BI query service employs a sophisticated cache management system to reuse objects captured from queries to data sources and use those objects to satisfy subsequent requests for that information. Caching can drastically reduce user wait times because it will always be faster for the query service to reuse an object it already has in memory than to go back to the data source for the information. The performance benefits of caching are clearly noticeable except for lighter queries where retrieving data from the database is a sub-second operation.
In cases where the underlying relational database responses appear instantaneous to users, it may be best to disable caching, either at the package level or the level of a particular query inside of a Report Studio report. Caching can be disabled either in Framework Manager (by deselecting the Allow usage of local cache check box governor and then republishing the package), or in Report Studio (by setting the Use Local Cache property of a query in the query explorer area to No). The DMR cache can consume considerable amounts of memory, which may initiate internal memory management operations that impact performance. For DMR and pure relational packages, disabling caching, either in the report or in the model, will ensure that only the portion of the query service’s Java memory that is required to execute the largest report is consumed, and nothing more. This keeps the query service’s memory footprint minimal. When the cache is enabled but is not being utilized, such as might be the case with complex security requirements, then resource consumption becomes sub-optimal and may degrade performance. For Report Studio reports, DMR caching is disabled only if one of the following conditions is present:
The Use Local Cache property of the query object in the report is set to No
The Allow usage of local cache governor in the Framework Manager model is deselected when the package is published
The query service’s caches for DMR and pure relational packages are security-aware in that, by default, cached metadata and data will not be shared between users with different security profiles. Users with the same security profiles are also similar in these respects:
Sign-on information (user name and password) to the data source
Expanded values of data source connection command blocks
Model security defined in Framework Manager
Expanded values of macros in the queries used to populate the cache
Pure relational caching
Unlike dimensional queries, such as those for DMR packages (which, as of Cognos BI version 10.2, persist in memory until a manual or scheduled clear cache operation occurs), results from pure relational queries are retained in memory only while that same data source connection remains open. Query service data source connections have a default idle timeout of 300 seconds (5 minutes), but this can be configured to another value using the properties window of the query service in Cognos Administration, as explained in the product information center:
Results that are cached from one user’s request can be used for a different user’s request under default settings, if both users have the same data security profiles.
DMR caching
To optimize your use of the DMR cache, it is important to understand its design, including what triggers the creation of a new cube cache or the reuse of an existing one.
The DMR cache stores members and cell values that combine to make a cube cache. The DMR cache for a certain package can have one or more cube caches in memory, as shown in Figure 3-4.
Figure 3-4 DMR cache composition
A cube cache primarily uses a member cache, which is shared by all the cube caches associated to the same package, and a cell value cache that is exclusively used by the cube cache.
Not all data requests will be cached because, for some types of requests, caching provides no benefit. For example, very large batch reports that generate hundreds or thousands of pages of results consume more time writing results to the cache than would be spent simply re-executing the same request multiple times. In addition, some queries perform well when executed on the underlying data source and may not benefit from using the cache. The query service automatically determines when to avoid writing results to the cache for queries that would not benefit from it.
When using the cache, the query service caches each result in the context of all dimensions in the published package. Although many factors affect read/write performance from the cache, a high number of dimensions negatively affects cache performance. Therefore, a good practice is to limit the choice of dimensions in a package to those that are required to satisfy the business requirements. In some situations this results in a noticeable performance gain.
Elements that compose the key to a cube cache
A cube cache is populated on demand as metadata and cell values are retrieved during the execution of reports. Before a new cube cache is created, the query service checks whether it can reuse an existing cube cache. Each of the cube caches associated with a package has a key that controls when that cube cache can be reused. If a new request has a key that matches that of a cache in memory, then that in-memory cache is used. Otherwise, a new cube cache is created (assuming caching has not been disabled).
A DMR cube cache is secured by a composite key that requires matches on these elements:
Package instance
Data source connection
Data source sign-on
Resolved data source command block values
Detail filters:
 – Pre-aggregation filters
 – Filters defined in model
 – Slicers defined in report
Model security
Prompt selections
Resolved macro values
Runtime locale
Publishing different packages or versions of the same package from the same model will create different cube caches. If a version of a package results in multiple cube caches, the metadata cache will be shared by all of those cube caches through the application of security context on the metadata. Publishing the same package twice without any changes to the model causes two cube caches to be created, one for each package instance. A cube cache is reused only if the request originates from the same package instance.
A package references one or more data sources and every report execution is associated with a data source connection definition. A data source connection definition is composed of the connection string and, optionally, command blocks. By default, a cube cache is re-used only if the connection string (as sent to the database) and any resolved command block values are identical. This condition is also used when the query service is determining if it can reuse an open connection to a data source.
Pre-aggregation detail filters, which are typically computed by the database, include both those defined in the report and those defined in the Framework Manager model. If the detail filters applied when populating a cube cache are not the same list of expressions as those in a new request, the cube cache cannot be reused because the members, the cell values, or both might be different. Any slicers (also known as context filters) that are defined in a report are converted into pre-aggregation detail filters but are applied only to the query against the fact table used to get cell values; they are not applied when loading members.
A report, the model, or the report and the model might have prompts defined. Not all prompts are applied to the key of a cache; only the prompts directly referenced in a query or indirectly referenced from the model (embedded in a query item) affect the cube cache's key. Prompts that are not referenced at all in the query are ignored with respect to the cache.
A cube cache is reused only if the associated macros whose values are resolved to create the cube cache are the same as those resolved values in a new query. If the current query has additional macros not present in the original query, the cube cache may be reused and those new macros are added to the cube cache’s key.
If object security is defined in the Framework Manager model, it implies that certain users do not have authorization to all objects in the model. In this case, by default, the profile associated with the object security is added to the key that controls sharing of that cube cache.
Finally, a cache is reused only if the runtime locale used to create the cache is the same as the current request's runtime locale.
Governors that control cache sharing sensitivity
Framework Manager offers governors that let you prevent certain elements from affecting a cache's key. To ensure confidentiality, the default settings for these governors are the most restrictive possible. You can change these governor settings to allow for greater cache reuse, if your security requirements allow you to do so. For example, users might have different data source credentials only because the database administrator wants to audit the queries that are submitted to the database, and there are no data authorization differences between these database user accounts. In such a scenario, you might want to set the Cache is sensitive to DB info governor to DB + Connection instead of the default value of DB + Connection + Signon.
 
Note: For more details, see the dynamic query mode governor section of the product documentation at this website: