Planning an RDM project
This chapter describes the considerations for planning a reference data management project with IBM InfoSphere Master Data Management Reference Data Management Hub (InfoSphere MDM Ref DM Hub). The purpose is to detail the necessary resources and provide an overview of the activities that must be accomplished, giving enough factual information and understanding of reference data management to effectively drive and manage an implementation. More detailed descriptions of activities follow in subsequent chapters.
There are many comprehensive project management tools and methodologies in use; this chapter does not attempt to describe or to follow any specific methodology. Instead, it describes the general phases of a project and the considerations that are particular to reference data management, with a checklist of inputs and outputs from each phase.
The roles and skills required of the project team members are outlined once in 3.4.2, “Transforming business roles to RDM roles” on page 63 to avoid repetition. It will be clear from reading the tasks involved in planning an InfoSphere MDM Ref DM Hub implementation which roles are applicable where.
3.1 RDM methodology overview
The structure used here to introduce a methodology for planning and implementing an InfoSphere MDM Ref DM Hub project is taken from the IBM Information Management Unified Method (IMUM). Only the basic phases are described, and only in the context of an InfoSphere MDM Ref DM Hub implementation. This methodology overview does not attempt to describe or explore IMUM; it considers only the operational and technical stream relevant to reference data management (RDM). Project management activities can be left to whichever tools and methods you prefer in any implementation.
The phases are as follows:
Analyze: Define what the RDM has to accomplish.
Design: Identify the RDM components and their dependencies.
Configure: Build the RDM solution.
Deploy: Test and move into production.
Operate: Run and develop RDM.
3.1.1 Analyze
The analyze phase defines what the RDM solution must accomplish. This covers the functional requirements, which involve understanding and using the features of InfoSphere MDM Ref DM Hub (for example, set mappings and value hierarchies), and the non-functional requirements, such as usability and performance.
During this phase, the most common working practice involves workshop sessions where the high level business requirements and any available use cases are matched to the functionality and processes of InfoSphere MDM Ref DM Hub. Repeating analysis for various requirements and use cases raises a set of common questions and answers that can be used to formalize the results of analysis.
The results of this work can give a picture of the business processes that use reference data, the amount of duplication found in the various code sets, and the amount of translating that is currently carried out to match or roll-up values from one set to another. It soon becomes apparent that some sets are duplicated in various places and systems and can usefully be collapsed into one “master” set. However, there might be business reasons why two or more editions or versions of sets should exist separately and continue to do so.
For example, consider “product” as a set of values. There are three product set instances found, as shown in Table 3-1.
Table 3-1 Product sets
| Set owner | Set purpose | Governance |
|---|---|---|
| Sales product codes | List of all products that might be offered for sale | Owned by Sales and Marketing. Agreed with Manufacturing |
| Manufacturing product codes | List of all products that are manufactured; many products that are used as multiple components in other products | Owned by Manufacturing. Aligned with materials stocks and used in production scheduling |
| Accounting product code mapping | List of products that are costed, might be a mix of components and product. List of sales products by profit margin | Owns neither set but owns the mapping between them to use in costing work |
For each set, an outcome from analysis should include the following items:
Code set owner: Which department or system owns this set
Code set control: Specify whether set is an internal set or an external set, for example ISO currency codes
List of values and descriptions held in the set
List of applications where the set is found
List of business processes that use this set
List of related or similar sets
List of business and system processes where set values are mapped or transcoded
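One way to keep these outcomes consistent across many sets is to record them in a small structured inventory. The following Python sketch is illustrative only: the class `CodeSetRecord`, its field names, and the sample values are assumptions made for this example and are not part of InfoSphere MDM Ref DM Hub.

```python
from dataclasses import dataclass, field


@dataclass
class CodeSetRecord:
    """Analysis outcome for one code set (hypothetical structure, not a product API)."""

    name: str
    owner: str            # department or system that owns the set
    is_external: bool     # True for externally controlled sets, such as ISO currency codes
    values: dict = field(default_factory=dict)              # code -> description
    applications: list = field(default_factory=list)        # where the set is found
    business_processes: list = field(default_factory=list)  # processes that use the set
    related_sets: list = field(default_factory=list)        # related or similar sets
    transcoding_points: list = field(default_factory=list)  # where values are mapped or transcoded

    def missing_items(self) -> list:
        """Return the checklist items that analysis has not yet filled in."""
        checklist = {
            "values": self.values,
            "applications": self.applications,
            "business processes": self.business_processes,
            "related sets": self.related_sets,
            "transcoding points": self.transcoding_points,
        }
        return [item for item, content in checklist.items() if not content]


# Sample record; the codes, application name, and process name are invented for illustration.
sales_products = CodeSetRecord(
    name="Sales product codes",
    owner="Sales and Marketing",
    is_external=False,
    values={"P100": "Sample product A", "P200": "Sample product B"},
    applications=["Order entry"],
    business_processes=["Quotation"],
)

print(sales_products.missing_items())  # prints ['related sets', 'transcoding points']
```

A record with open checklist items shows where further workshops or interviews are still needed before the set can be designed into the solution.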
The activity of analysis also gives identification and familiarization:
Identification of issues in the business processes
Identifies where the existing use or maintenance of reference data sets is causing problems. These problems might be in complicated interfaces or in the need to manually compare sets and codes. Noting these problems gives an input into prioritizing later development.
Familiarization with RDM concepts for the business and technical teams and the wider client environment
3.1.2 Design
In this chapter the design phase is confined to configuring InfoSphere MDM Ref DM Hub. There is a fixed data model in InfoSphere MDM Ref DM Hub that is designed to support a wide range of functionality for the handling and maintenance of reference data sets. As an overview, consider the functionality of InfoSphere MDM Ref DM Hub to do the following tasks:
Manage reference data sets.
Relate sets through mappings.
Make subscriptions to link to external systems.
Import and export sets.
The design phase makes decisions on exactly how to configure InfoSphere MDM Ref DM Hub. The first step in designing is to decide, from the requirements and data sets found in the analyze phase, whether all business requirements can be met with this functionality.
If significant requirements lie outside these areas, you can customize InfoSphere MDM Ref DM Hub by adding and extending the data model and by building custom services for add and update functions.
3.1.3 Configure
In the configure phase, the configurable parts of InfoSphere MDM Ref DM Hub are as follows:
Security and access includes creating user accounts and associating users to groups, and groups to roles.
Component configuration includes adjusting transaction time-outs, configuring batch import and export, and customizing the InfoSphere MDM Ref DM Hub user interface web page.
Defaults include date and time formats; default start and end dates used in sets, mappings, and import and export wizards; and the separator character used in separated value files (usually a comma, as in comma-separated values (CSV) files).
Managed systems is a term for any systems that are external to InfoSphere MDM Ref DM Hub where an integration is built to InfoSphere MDM Ref DM Hub. Managed systems are integration points held in InfoSphere MDM Ref DM Hub where the configurations and properties are stored that allow integration with a particular external system. The owners of the external system must collaborate with the integration from their side.
3.1.4 Deploy
The deploy phase is a set of tasks that moves the InfoSphere MDM Ref DM Hub system from a development environment into a production environment.
The technical specialists and client IT staff collaborate to build and test a working configured production environment. This setup is a repeat of previous installations of testing and development environments and incorporates the lessons learned of what specific procedures are needed in the unique client IT environment. All the configurations and customizations that are built into InfoSphere MDM Ref DM Hub are included in the installation. The joint effort with the client’s deployment team should execute and document the mechanism of transferring the code from the test or development environment instances into production.
With the environment prepared, an initial load of the reference data sets can take place. All import files that were tested can be used to load reference sets, including hierarchies within and between sets. At this point, the solution is connected to the production environment.
Deployment is described in full in Chapter 7, “Implementation” on page 159.
3.1.5 Operate
The operate phase begins after the transition is complete and use of the solution is established in production. The final implementation activities include a comprehensive review of all of the project documentation and an evaluation of the system implementation results.
The purpose is to ensure the InfoSphere MDM Ref DM Hub system is running smoothly in production and that the baseline system behavior is captured for future reference. Part of the implementation effort will have established business and IT processes that enable new reference data sets, mappings, and interfaces to be added to InfoSphere MDM Ref DM Hub in the future.
3.2 Iterative implementation of InfoSphere MDM Ref DM Hub
As with all software implementations, various methodologies are used to support various development styles and business practices; no one process suits all situations (iterative or waterfall or any process in between). The methodologies themselves are not described here, only the use of iterative techniques. The InfoSphere MDM Ref DM Hub solutions are suitable for an iterative implementation process for the following reasons:
The software is comprehensive and configurable, and can be extended and customized.
Each functional area can be treated separately for use and development. A small sample of code sets can be created quickly.
Artifacts in InfoSphere MDM Ref DM Hub depend on other artifacts that must be created first, and this order of precedence becomes immediately apparent.
Small numbers of sets can be manipulated in mappings, imported, exported, and copied. If they meet acceptance criteria for operational InfoSphere MDM Ref DM Hub, they can be preserved. If not, they can be easily discarded.
This approach has the advantages of putting project team members and business users into an environment where experimentation is possible and there is an emphasis on communication, simplicity, feedback, and easier compilation of documentation.
For an iterative approach, the overall functionality of InfoSphere MDM Ref DM Hub is described here in discrete features. These building blocks are created directly into the InfoSphere MDM Ref DM Hub system.
Table 3-2 on page 59 lists five sample iterations that are based on groups of individual features that are related by functional area and are few enough to consider together. In all implementations, the specific use cases and requirements influence the exact order and content of iterations. Any iteration can be repeated many times and include steps from previous iterations.
Table 3-2 Sample iterations
| Iteration 1 | Iteration 2 | Iteration 3 | Iteration 4 | Iteration 5 |
|---|---|---|---|---|
| Add set type | Add set types with properties | Prepare CSV files and import sets | Add mapping types with properties | Move sets through one or more lifecycles |
| Add mapping type | Add sets with tree hierarchies | Export sets | Add mappings with some values mapped | Add set versions |
| Add two sets | Add sets with level hierarchies | Add translations | Add mappings with many to one mappings | Copy sets |
| Add a mapping between the two sets | Create users and groups | Add managed systems | - | Merge sets |
| Make folder structure | - | - | - | Manage owners |
The benefits of using iterations are as follows:
Team members can quickly learn the capabilities of InfoSphere MDM Ref DM Hub and discard any work that was illustrative or experimental.
Concurrently a more formal approach can build work in the iterations that can be saved and documented to become part of the established solution.
The results of repeated iterations give a basis on which to build test cases for functionality and user acceptance.
Every iteration tests and proves that the solution architecture is stable.
The durations of iterations become well known and as iterations are completed, the results give project management good input into planning, estimation, number of iterations, and an indication of the likely project schedule and effective collaboration techniques.
Using an iterative approach engenders collective ownership; as the architecture and processes of the system settle, the team can move towards a complete working solution.
3.3 Business requirements
InfoSphere MDM Ref DM Hub has a concise focus on managing and maintaining code sets throughout an enterprise. The business requirements found in many implementations therefore have much in common. The following list and discussion of business requirements cannot be exhaustive, because there will always be special cases, but these do cover a large number of requirements found in real implementations. In planning any specific InfoSphere MDM Ref DM Hub project, you can expect to find a significant number of these requirements expressed by the business.
Reference data management requirements
Identify and collect the following requirements for reference data management:
Requirement to maintain and manage a standard view of reference data that is accessible and accepted throughout the enterprise.
In meeting this requirement, consider each code set that is to be stored in InfoSphere MDM Ref DM Hub and, from the population of data sets, select only those that have a truly enterprise-wide interest. Each of these enterprise sets contains only the values that are necessary. For example, a set of mandatory Customer Status codes might require all values in use throughout the enterprise to be included, and can be described as canonical (includes all possible values). In another example, the enterprise is interested in sales at a country level. The enterprise sales area set must include only countries where the enterprise does business, so it will have fewer values than the address country code set.
Requirement to maintain and manage all reference data sets that are used in more than one business function or IT application.
There are many more code sets and variations of code sets found in various business areas, functions, and IT systems that are local. They might not have an importance to the enterprise view, but their complex management and mappings require a solution.
Requirement for a simple process to actively manage reference data in multiple systems.
The InfoSphere MDM Ref DM Hub enables a single point of authorship and revision. Code sets in InfoSphere MDM Ref DM Hub can be set by subscriptions to accept changes or propagate changes from and to external systems. The requirement is stated as making changes to reference data and distributing to downstream systems.
Requirement for a simple process to match the values across separate code sets.
InfoSphere MDM Ref DM Hub compares any two reference data sets in one mapping. You can easily find the equivalent value in a similar set or in a different instance of the same set. This feature can help you automate in-flight reference data transcoding to convert values between different systems and formats.
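In-flight transcoding with a mapping can be pictured as a lookup from a source code to its equivalent target code. The following Python sketch is a minimal illustration under stated assumptions: the mapping `CRM_TO_BILLING_STATUS`, its codes, and the function names are hypothetical, and the sketch does not call or represent any InfoSphere MDM Ref DM Hub interface. It assumes that the mapping was already exported or retrieved from the hub by some other means.

```python
# Hypothetical mapping between two code sets: source code -> target code.
# Several source codes can map to the same target code (many to one).
CRM_TO_BILLING_STATUS = {
    "ACT": "A",
    "ACTIVE": "A",
    "SUSP": "S",
    "CLOSED": "C",
}


def transcode(value, mapping):
    """Return the equivalent target code, or raise LookupError if the value is unmapped."""
    try:
        return mapping[value]
    except KeyError:
        raise LookupError(f"No mapping defined for source code {value!r}") from None


def transcode_records(records, field_name, mapping):
    """Convert one field in each record; records with unmapped codes are set aside."""
    converted, rejected = [], []
    for record in records:
        try:
            new_value = transcode(record[field_name], mapping)
        except LookupError:
            rejected.append(record)  # left unchanged for a data steward to review
        else:
            converted.append({**record, field_name: new_value})
    return converted, rejected


# Sample records in flight between two systems (invented for illustration).
records = [{"id": 1, "status": "ACT"}, {"id": 2, "status": "PENDING"}]
converted, rejected = transcode_records(records, "status", CRM_TO_BILLING_STATUS)
```

Setting unmapped records aside, rather than passing them through or guessing, keeps gaps in the mapping visible so that they can be corrected at the single point of authorship.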
Requirements to simplify complexity
The following examples are actual cases where the requirement was stated as “We need to make our applications and processes simpler.” These requirements can all be enabled by using InfoSphere MDM Ref DM Hub to isolate and manage the reference sets involved:
Two hundred batch jobs have separate code equivalents.
There is siloed and parallel work duplication.
Different sources use different formats.
Many business processes are required.
Authorship is a business task but now given to IT to implement every case.
Data governance and compliance requirements
The data governance and compliance requirements have the following characteristics:
A need to demonstrate repeatable processes for enforcing data quality, ownership, auditing, reporting, storage, life-cycle management, security, sharing, and classification.
A need to incorporate reference data into enterprise information management processes, for example, alignment with a common logical data model, a complete information catalog, metadata standards, and an Information Management maturity model.
A need to be proved against various standards and programs, for example, enterprise data warehouse, Solvency II, new core banking systems, Foreign Account Tax Compliance Act (FATCA), and Basel III.
Data integration requirements
The following examples are of data integration requirements:
A need for a common unified classification for many sets.
A need for common consolidation rules and timings to allow an integrated flow for updates to the data warehouse. This applies to reference data, including mappings and hierarchies.
Ability to import external code sets such as ISO and any used by international or statutory authorities.
3.4 Analysis of data
This section looks in detail at the reference sets that will be managed in InfoSphere MDM Ref DM Hub. For as many candidate reference sets as possible, the aim is to know the following information:
Reference set owner
Systems where reference set is used
Business processes where reference set is used
All values found in every instance of a reference set
Where the reference set has interfaces to and from
Versioning of the reference set if any
Validity dates of the reference set if any
Categorization of the reference set, for example, with hierarchy or with compound key
Canonical or subset, that is, all values in an enterprise (canonical) or a subset used for a particular business process
In the process of getting and discussing this data, from sample data files, interviews, or workshops with business analysts and source system owners, it might be possible to discard sets or agree to adopt one instance of a set and impose that as a standard to take the place of other instances.
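The decision to discard a set, or to adopt one instance as the standard, is easier when the instances can be compared value by value. The following Python sketch shows one way to do that from sample data files. It is an illustration only: the function `compare_instances`, the system names, and the sample codes are assumptions for this example.

```python
from itertools import combinations


def compare_instances(instances):
    """Summarize value overlap between instances of one reference set.

    `instances` maps a system name to the collection of codes found in that system.
    """
    if not instances:
        raise ValueError("at least one instance is required")
    as_sets = {system: set(codes) for system, codes in instances.items()}
    all_values = set().union(*as_sets.values())
    common = set.intersection(*as_sets.values())
    # Codes that exist somewhere in the enterprise but not in this system.
    missing = {system: all_values - codes for system, codes in as_sets.items()}
    # Pairs of systems that hold exactly the same codes are candidates for one shared set.
    identical = [
        (a, b) for a, b in combinations(sorted(as_sets), 2) if as_sets[a] == as_sets[b]
    ]
    return {"all_values": all_values, "common": common, "missing": missing, "identical": identical}


# Sample instances of a country code set (system names and codes invented for illustration).
country_codes = {
    "CRM": {"DE", "FR", "US"},
    "Billing": {"DE", "FR", "US"},
    "Warehouse": {"CA", "DE", "US"},
}
report = compare_instances(country_codes)
```

In this sample, the CRM and Billing instances are identical and could be served by one set, while the Warehouse instance differs in ways that need a business decision.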
The outcome of analysis of this data is then used to provide information that informs other phases and activities in the InfoSphere MDM Ref DM Hub project, specifically the following items:
A list of business owners and numbers of sets owned.
This list can often be made into a reference data set itself and added as a property to other sets.
A measure of the complexity of the sets and any additional data required.
If many sets are found to need any additional data element, the elements can be included in a reference data set type, or in a linked code table added as a property to other sets.
An indication of the folder structure and any naming conventions needed to organize sets.
The number of mapping types needed to categorize mappings and the likely number of actual mappings.
Which sets are sources and which sets are targets in mappings.
It is likely that one set can appear as a target in one mapping and a source in another. For example, District_Sales_Codes is a source to the target Country_Sales_Codes, and Country_Sales_Codes is in turn a source to the target Regional_Sales_Codes.
A list of external systems that can be linked by subscription to receive or propagate code changes automatically.
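The source and target roles in the sales code example above can be derived from a simple list of planned mappings, and two mappings that share a set can be chained to roll values up. The following Python sketch illustrates both ideas. The set names come from the example in the text; the individual codes, the function names, and the data layout are assumptions for illustration.

```python
# Planned mappings as (source set, target set) pairs, taken from the example in the text.
PLANNED_MAPPINGS = [
    ("District_Sales_Codes", "Country_Sales_Codes"),
    ("Country_Sales_Codes", "Regional_Sales_Codes"),
]

# Invented sample values for the two mappings (source code -> target code).
DISTRICT_TO_COUNTRY = {"D-BAV": "DE", "D-NRW": "DE", "D-IDF": "FR"}
COUNTRY_TO_REGION = {"DE": "EMEA", "FR": "EMEA", "US": "AMER"}


def set_roles(mappings):
    """Return, for each set, the roles ('source', 'target') it plays across all mappings."""
    roles = {}
    for source, target in mappings:
        roles.setdefault(source, set()).add("source")
        roles.setdefault(target, set()).add("target")
    return roles


def compose(first, second):
    """Chain two mappings; report source codes whose intermediate code is unmapped."""
    composed, unmapped = {}, []
    for source, intermediate in first.items():
        if intermediate in second:
            composed[source] = second[intermediate]
        else:
            unmapped.append(source)
    return composed, unmapped


roles = set_roles(PLANNED_MAPPINGS)
district_to_region, unmapped = compose(DISTRICT_TO_COUNTRY, COUNTRY_TO_REGION)
```

Sets that play both roles, such as Country_Sales_Codes here, are worth identifying early because a change to their values affects mappings on both sides.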
3.4.1 Discovering reference data
The focus of InfoSphere MDM Ref DM Hub is on reference data sets and, by their nature, they are distributed throughout business functions and IT applications. There is often no central source from which you can identify all the existing sets. However, somewhere in the enterprise will be awareness of those sets that are causing problems. From this starting point, it is necessary to trawl through as many sources as possible. In discovery workshops, the business analysts and subject matter experts document those reference sets that they consider critical. Next is the task of discovering all sets. The following sources are likely:
Business analysts and subject matter experts
Business process descriptions
Logical data models and data glossaries
User interfaces, directly on-screen or from design documents
Physical models and database schema
The amount of analysis and discovery that can be done is limited by the normal constraints of resources and time. However, effort made here will simplify the solution in later stages of the implementation. For example, incorporating newly discovered attributes might require the rework of building new types and assigning existing sets to the new type, which must be achieved by deletion and re-creation.
3.4.2 Transforming business roles to RDM roles
This chapter assumes that a project team exists that has a high level of business and technical skills and a considerable degree of continuity. Typically, the project team is composed of people from the client side and from IBM; their skills together can cover all the necessary subjects around an InfoSphere MDM Ref DM Hub implementation. All the work described is done by the team. Undertaking a project to implement InfoSphere MDM Ref DM Hub in a business is a substantial commitment that requires enabling people during the implementation process and all team members learning from their colleagues. Some part of the team typically remains after implementation to develop and maintain the solution.
The necessary skills range from a comprehensive knowledge of how the client’s business requirements are met in the current IT systems to detailed design and build of databases, interfaces, and services. The headings in this section are generic but well known; they vary from business to business. No numbers are given here because resources vary in availability and skill level. Every business will have a preferred spread of skills over a number of resources depending on availability and criticality.
The usual case is for InfoSphere MDM Ref DM Hub implementations to be staffed with a joint team made up of a combination of resources from the client and IBM with a third-party integrator if required. Any degree of skills transfer from IBM and any integrator to the client staff is possible depending on the client’s business practices, from requiring ongoing support in all development and maintenance of the product to a level close to self-sufficiency. The team effort of implementation is a good environment for skills transfer.
Enterprise architect
The enterprise architect role has three views:
A vision, shared by the business, of the future state that the company wants to achieve and an evolving road map of how IT will grow to support it getting there
A complete view of the existing business systems in use and their capabilities, ability to integrate, and potential for future enhancement
A prioritized set of all the shortcomings or difficulties with the present systems functionality to serve all business requirements
The enterprise architect’s involvement with the InfoSphere MDM Ref DM Hub project begins with the awareness of the business problem, as described in Chapter 1, “Reference data management” on page 1, and continues through requirements analysis, product selection, and the overall implementation and integration. When InfoSphere MDM Ref DM Hub is implemented, the enterprise architect revises the views on the overall architecture by adding the InfoSphere MDM Ref DM Hub functionality and its relevance to business needs and capabilities. The ongoing promotion of use and development of InfoSphere MDM Ref DM Hub is overseen by the enterprise architect.
This role is universally found in the client organization and remains there, acting as liaison with business analysts, and learning the necessary influences, uses and potential of InfoSphere MDM Ref DM Hub.
Business analyst, subject matter expert
Any enterprise has many business analysts and subject matter experts. Their knowledge of business processes, and associated IT systems and applications, gives these analysts and experts a view of the actual use and distribution of reference data throughout the enterprise. These people are also aware of the quality of reference data and the problems that must be overcome in using reference data. In planning and implementing an InfoSphere MDM Ref DM Hub project, a good practice is to consult as many business analysts or subject matter experts as possible because their roles often encompass subsets of departments, systems, and functions. Involving as many as possible will give the most comprehensive view across the enterprise and include all variations of practice and coding.
This role is universally found in the client organization and will remain there. As the implementation matures, more decisions regarding management of reference sets will be made by the business analyst community because learning and practice of InfoSphere MDM Ref DM Hub will transfer skills from the solution architect to the client team.
InfoSphere MDM Ref DM Hub solution architect
The InfoSphere MDM Ref DM Hub solution architect, sometimes known as the product consultant, designs the implementation based on business requirements and the goals of the client. The role collaborates with other technical and functional project team members to create a well-defined and agreed solution architecture, including any data models, services, interfaces, and custom and configuration work on InfoSphere MDM Ref DM Hub. All requirements and processes that require modifications to the standard product are evaluated by the solution architect for their feasibility and degree of difficulty, and designed into the implementation.
This role is typically supplied to the project by IBM or a specialist integrator, and the work of translating business requirements to technical requirements is done in close cooperation with the business analysts and technical specialists. There should be complete agreement on, and documentation of, the design. The solution architect works closely with business analysts and technical resources and provides as much skills transfer as the client requires.
Technical specialist
The technical specialist has overall responsibility for the InfoSphere MDM Ref DM Hub solution construction, implementation, and system integration. This role ensures the solution can run in the client's technical environment, and that after it is installed, the product is configured as in the implementation design to meet the client's business requirements. The technical specialist is responsible for the technical design and correct functioning of any custom work in the solution and testing and integration with the client environment.
This role is typically supplied to the project by IBM or a specialist integrator and can work with client technical resources. As with the solution architect, the project environment offers a good opportunity for skills transfer as required. Tasks that are done by this role, such as scripting and programming in support of database performance tuning, and leadership of developer teams, are probably already done in-house to an extent; the technical specialists bring their deep, specific product knowledge to these tasks.
Developer
The developer has specialist expertise and, during implementation, will create and assemble components to build any customization to InfoSphere MDM Ref DM Hub, write code, and execute unit tests, integration tests, and functional tests.
The developer role is a specialist role that works closely from the detailed designs that are provided by the technical specialist. A worthwhile approach is to use in-house developers, if available, because engagement on the project will build the InfoSphere MDM Ref DM Hub skills, and these skills will remain with the client.
Data steward
The data steward is a business user but with a deeper knowledge of how to use InfoSphere MDM Ref DM Hub in everyday operating tasks, principally through the InfoSphere MDM Ref DM Hub user interface. The data stewards execute tasks such as creating reference data sets, organizing the folder structures, creating types and mappings, and carrying out import and export jobs, which means all the regular maintenance and development of reference data management with InfoSphere MDM Ref DM Hub. A good practice is to introduce people who will be data stewards into the project at an early stage so they understand the solution and provide input on the current custom and use of data and codes, which might differ from existing specifications.
This role is universally found in the client organization and will remain there. Data stewards work with the business analysts and other users to actually operate the system.
3.5 Model design for planning
The model design here is a simplified view of the reference data management components and their relationships within InfoSphere MDM Ref DM Hub. These components are the major components that are considered for planning an InfoSphere MDM Ref DM Hub implementation. For a full discussion on the complete detailed data model in InfoSphere MDM Ref DM Hub, see Chapter 5, “InfoSphere RDM model design” on page 99.
Figure 3-1 shows a high-level view of reference data management components and their relationships in InfoSphere MDM Ref DM Hub.
Figure 3-1 High-level view of RDM components and their relationships
Reference data set type or mapping type
A data set type works as a definition. The data set type contains properties that apply to reference sets at both set-level and value-level. The InfoSphere MDM Ref DM Hub system contains many instances of sets. Every set must belong to one set type. The set type properties define and store metadata about the set.
The initial InfoSphere MDM Ref DM Hub installation has two basic set types: one for sets, and one for mappings (known as Default Type), which can be used for many sets and mappings. If more data must be stored to define or control a set and the set values, then a new set type must be created. Examples of custom properties are as follows:
Reason_Changed_Code: A mandatory code added when a value is expired
Source_System_Code: A field containing the origin of the set or value
Figure 3-2 illustrates a reference data set type that contains properties applied to all associated sets. Types can be created with custom properties added.
Figure 3-2 Reference data set type contains properties applied to all associated sets
Figure 3-3 illustrates the data mapping set types using an identical structure to reference data set type. The data mapping set type can be customized.
Figure 3-3 Data mapping set types use an identical structure to Reference data set type
The simplest approach is to create one type for every reference set so the structure in Figure 3-2 is always 1:1. Although this approach is simple to use initially, it can lead to an excessive number of types to manage and to repeated information.
Set types and mapping types can be organized by owner or by function or any classification system that works in a particular business. Consider using a naming convention for sets, types, and mappings. Naming conventions also apply to folders in the same way.
Reference data sets
Reference data sets are the instances of code tables that are managed in InfoSphere MDM Ref DM Hub.
Subscriptions
Subscriptions are a means to link to external systems, referred to here as managed systems (Figure 3-4). This mechanism allows automated import and export of changes between InfoSphere MDM Ref DM Hub and the managed system.
Figure 3-4 Reference data sets are linked to any number of external systems through a subscription
Any set can be linked to any number of external systems through one subscription. The flexibility of using subscriptions to manage external systems requires understanding the supported business processes and setting permissions on whether InfoSphere MDM Ref DM Hub is allowed to perform automatic updates in an external set or whether InfoSphere MDM Ref DM Hub will accept automatic updates from external sets. The most common scenario is that changes are authored in InfoSphere MDM Ref DM Hub and propagated to external sets.
All versions of a set are carried in the subscription; therefore, the external set must be aware of the versioning that is used or have a limitation built on what data to accept.
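A receiving system that gets every version of a set needs some rule for deciding which version to apply. The following Python sketch shows one possible rule, which is to select the version whose validity period covers a given date. The record layout (`version`, `effective_from`, `effective_to`, `values`) is an assumption made for this example and does not represent the actual subscription payload of InfoSphere MDM Ref DM Hub.

```python
from datetime import date


def select_effective_version(versions, on_date):
    """Return the version whose validity period covers on_date, or None if none does.

    If several versions qualify, the one that started most recently wins.
    """
    candidates = [
        v for v in versions
        if v["effective_from"] <= on_date
        and (v["effective_to"] is None or on_date <= v["effective_to"])
    ]
    return max(candidates, key=lambda v: v["effective_from"], default=None)


# Hypothetical versions of one set as received by an external system.
received_versions = [
    {
        "version": 1,
        "effective_from": date(2023, 1, 1),
        "effective_to": date(2023, 12, 31),
        "values": {"A": "Active"},
    },
    {
        "version": 2,
        "effective_from": date(2024, 1, 1),
        "effective_to": None,  # open-ended validity
        "values": {"A": "Active", "S": "Suspended"},
    },
]

current = select_effective_version(received_versions, date(2024, 6, 1))
```

Other rules are equally plausible, for example accepting only the highest version number; the point for planning is that the rule must be agreed with the owners of each external system.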
Folders and naming conventions
A folder structure in InfoSphere MDM Ref DM Hub is used to organize reference data sets. The folders can be organized into a hierarchy so that users can create as many folders and subfolders as they want. Each reference data set can exist in only one folder, unlike in some file explorers where copies can exist in multiple locations. It is possible to copy a set and move the copy to another folder, but InfoSphere MDM Ref DM Hub enforces a different name. The copy is intended for shortening the set authoring process, not for allowing duplicate sets.
The folder structure is seen in an InfoSphere MDM Ref DM Hub user interface and can be modified by dragging and dropping. When a new set is created, it is then listed under the folders and can then be dragged into the appropriate folder.
Imposing a naming convention on the organization of folders is beneficial. This convention establishes a consistent and logical way to distinguish folders and their contents. Many businesses already have their own formal or informal naming conventions to draw from and understand the subject areas and departmental boundaries. A strong naming convention enables users to browse the content of InfoSphere MDM Ref DM Hub more effectively and to add their own folders and sets without having to rethink the structure.
An example of a naming convention in current use takes two views of categorization, a set view and a departmental view.
Figure 3-5 shows a fragment of naming convention used in folder structure and also for naming of types and sets.
Figure 3-5 Fragment of naming convention used in folder structure
Set category
Sets are split into one of three categories:
Master set: This set’s content is regarded as the enterprise standard.
Source set: This set’s content serves one or more business needs and may be mapped to other sets and also to the master set.
RDM internal set: This set is used within InfoSphere MDM Ref DM Hub, for example a set of status codes.
Departmental category
The departmental category is shown in the illustration at the second level, where the business structure is represented. A department can appear under a master set, a source set, or an RDM internal set category. The numbering given to the departments is itself another taxonomy within the enterprise. The following values are samples:
01_Design
04_Production
02_IT
There is no constraint on which department can use which set category.
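A naming convention of this kind can be checked mechanically before folders are created. The following is a minimal sketch only: the category folder names and the "category/department" path layout are assumptions made for illustration, and the function is not part of InfoSphere MDM Ref DM Hub. Only the department pattern (two digits, an underscore, and a name) is taken from the samples above.

```python
import re

# Hypothetical top-level folder names for the three set categories.
SET_CATEGORIES = {"Master", "Source", "RDM_Internal"}

# Department folders follow the sample pattern "01_Design":
# two digits, an underscore, then a name.
DEPARTMENT_PATTERN = re.compile(r"^\d{2}_[A-Za-z][A-Za-z0-9]*$")


def is_valid_folder_path(path):
    """Return True if path has the assumed form '<set category>/<NN_Department>'."""
    parts = path.split("/")
    if len(parts) != 2:
        return False
    category, department = parts
    # Any department may appear under any set category.
    return category in SET_CATEGORIES and DEPARTMENT_PATTERN.match(department) is not None
```

Under these assumptions, "Master/01_Design" and "Source/02_IT" are accepted, and "Master/Design" is rejected because the department number is missing.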
Reference data set values
After sets are established, an unlimited number of values might be added to any of the sets. Values can be added either manually or by using the import functionality. When adding a value or values to a set, you are constrained by the properties that are defined in the associated set type that is established for this set. When adding values, you can see the conditions of the properties fields, for example, whether they are mandatory, forced to be unique, part of a primary key, or have a default imposed.
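The effect of these property conditions can be illustrated outside the product. The following is a minimal sketch, assuming a simplified in-memory model; the PropertyDef class and the add_value function are hypothetical names and do not represent the InfoSphere MDM Ref DM Hub data model or API.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class PropertyDef:
    """One property of a hypothetical set type."""
    name: str
    mandatory: bool = False
    unique: bool = False
    default: Optional[Any] = None


def add_value(values, new_value, properties):
    """Check new_value against the property definitions, then append it to values."""
    row = dict(new_value)
    for prop in properties:
        # Apply the default when the property was not supplied.
        if row.get(prop.name) is None and prop.default is not None:
            row[prop.name] = prop.default
        current = row.get(prop.name)
        if prop.mandatory and current is None:
            raise ValueError(f"property '{prop.name}' is mandatory")
        # A unique property must not repeat a value that is already in the set.
        if prop.unique and current is not None:
            if any(existing.get(prop.name) == current for existing in values):
                raise ValueError(f"property '{prop.name}' must be unique")
    values.append(row)
    return row
```

For example, with a mandatory and unique "code" property and a "status" property that defaults to "active", adding {"code": "A1"} stores the value with the default status, and adding a second value with the same code is rejected.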
Values translations
Values can be stored in an unlimited number of languages; the translation is a manual task but the input can be manual or done by the import function. The languages available are held in a reference set as part of InfoSphere MDM Ref DM Hub administration and are used to qualify the language used in other sets.
Values in hierarchies
Within any reference set, the values can be arranged in a tree-like hierarchy. A set might have an unlimited number of hierarchies and each one might contain all or some of the values in the set.
Hierarchies in sets are useful where there is a common set but different views are required, for example the GroupOrganizationCode, split by function.
Figure 3-6 shows one set represented in three hierarchies. The first hierarchy (on the left) contains all values; the second hierarchy (middle) contains only values that are used in Finance; the third hierarchy shows all values in a different order.
Figure 3-6 One set represented in three hierarchies
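The idea of several hierarchies over one set can be sketched as follows. This is an illustration only: the organization codes and the child-to-parent representation are assumptions, not the product's storage model. Each hierarchy may use all or some of the values, but never a value that is outside the set.

```python
# Hypothetical organization codes; only the idea of several
# hierarchies over one set comes from the text.
ORG_CODES = {"GRP", "FIN", "FIN_AP", "MKT"}

# Each hierarchy maps a value to its parent (None marks a root).
FULL_VIEW = {"GRP": None, "FIN": "GRP", "FIN_AP": "FIN", "MKT": "GRP"}
FINANCE_VIEW = {"FIN": None, "FIN_AP": "FIN"}


def check_hierarchy(set_values, parent_of):
    """Raise ValueError if the hierarchy refers to a value that is not in the set."""
    used = set(parent_of) | {p for p in parent_of.values() if p is not None}
    unknown = used - set(set_values)
    if unknown:
        raise ValueError(f"values not in set: {sorted(unknown)}")


def children_of(parent, parent_of):
    """Return the direct children of parent in one hierarchy, sorted by code."""
    return sorted(v for v, p in parent_of.items() if p == parent)
```

Both views pass check_hierarchy against the same set, although FINANCE_VIEW contains only a subset of the values.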
Sets in hierarchies, level-based hierarchies
When creating value properties in a reference data set type, you may choose to define the property by selecting another pre-existing reference data set. Only the values found in this other reference data set can then be used to populate the property.
There are several choices of format when creating a property. See Figure 3-7 on page 73. Selecting Reference Data Set allows you to nominate any set from which values might be taken to populate the property. Only values from that set are allowed.
Figure 3-7 Property format
An example of using set hierarchies is to answer the business requirement:
All Marketing reference data sets must have the number of the Approval Board included in the set data
To satisfy this business requirement, you can make a property at set level named Approval Boards. This property can be a free text field allowing anything to be written in it. There is also the option of choosing to link the property to a reference data set named Approval Boards. Only values found in Approval Boards can then be used in the set property.
Figure 3-8 shows reference data set MARKETING with the property Approval Board controlled by the contents of another reference data set. Here the values can only be AB001, AB002 and so on.
Figure 3-8 Reference data set MARKETING with the property Approval Board
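The restriction that a property accepts only values from another reference data set can be sketched as a simple membership check. This is a minimal sketch under stated assumptions: the function name and the dictionary model are hypothetical, and only the values AB001 and AB002 are taken from the example.

```python
# Values of the controlling reference data set; "AB001" and "AB002"
# appear in the example, and further values would follow the same pattern.
APPROVAL_BOARDS = {"AB001", "AB002"}


def set_property_from_reference_set(properties, name, value, allowed_values):
    """Assign value to the named property only if it exists in the controlling set."""
    if value not in allowed_values:
        raise ValueError(f"{value!r} is not a value of the controlling reference data set")
    properties[name] = value


# Hypothetical set-level properties of the MARKETING reference data set.
marketing_properties = {}
set_property_from_reference_set(
    marketing_properties, "Approval Board", "AB001", APPROVAL_BOARDS
)
```

A free text property would accept any string at this point; the linked property rejects anything that is not in the controlling set.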
Set mappings and value mappings
As with reference data sets, mappings cannot be created unless there is a pre-existing data mapping set type. As an example of the requirement and use for mappings, consider the case where analysis shows many instances of a product code set within a business. Several departments hold their independent versions of product codes for their own internal use, and there is a business requirement to communicate and transcode between the departments and their IT systems.
It is apparent that all instances of the product code set have a collection of common metadata, and this is needed in the mappings between any two sets. The common data is, for instance, a field containing a URL that points to a web page, which describes product usages. The URL property can be added to a mapping type named ProdCdTp, and the following rule introduced:
All product code set mappings must use ProdCdTp mapping type
This rule enforces consistency in the mappings and helps to locate all of them.
After the types and sets are identified, a mapping can be made. A mapping takes one set called a source and another set called a target. The use of the words source and target is subjective and in each case the user must make a decision based on the user’s business knowledge. This mapping is at set level.
The next step is to select values from the source set and link them to values in the target set. In many cases, the codes used are the same and the mapping is straightforward. However, in other cases, different values from different sets might have the same meaning or a close equivalent, so the user, with expertise, must establish a meaningful link.
Not all values in the source and target sets must be included in a mapping. It is also possible to map one value from either the source or the target to many values in the other set. This is useful where a source set has a finer grain of codes and a target set has fewer codes, each of which covers a wider subject. For example, two codes in a source set, “steel” and “iron,” can both map to one code in the target set named “metal.”
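A value mapping of this kind can be pictured as a list of source and target pairs. The following is a minimal sketch; the pair-list representation and the function names are assumptions for illustration, and only the codes "steel", "iron", and "metal" come from the example.

```python
# Each pair links one source value to one target value. A value can
# appear in several pairs, and values that are absent are simply unmapped.
MAPPING_PAIRS = [("steel", "metal"), ("iron", "metal")]


def targets_for(source_code, pairs):
    """Return the target codes mapped from source_code, sorted by code."""
    return sorted({t for s, t in pairs if s == source_code})


def sources_for(target_code, pairs):
    """Return the source codes mapped to target_code, sorted by code."""
    return sorted({s for s, t in pairs if t == target_code})
```

Here both source codes transcode to "metal", and a source value that was not mapped returns an empty list instead of an error.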
From the data in the sets shown in Figure 3-9, there is a requirement to consolidate the regional product codes to an enterprise master set of product codes.
Figure 3-9 Three sets of product codes, Sales Region 1, Sales Region 2, and a Master product code set
Figure 3-10 shows the result of mapping of Sales Region 1 product codes to the Sales Master set of product codes.
Figure 3-10 The result of mapping Sales Region 1 product codes to the Master set of product codes
3.6 Implementation
Planning for implementation takes in the experience of previous iterative cycles and will enable a straightforward implementation. The commented list here is an example. It does not replace the InfoSphere MDM Ref DM Hub documentation; it is a quick reference of the most important installation and configuration steps taken from an actual implementation.
Installation of infrastructure
The basic installed and tested components are DB2, WebSphere Application Server, and WebSphere MQ. These each require a small adaptation to the local environment file structures and paths.
Installation of InfoSphere MDM Ref DM Hub
InfoSphere MDM Ref DM Hub requires users and groups to be defined. This step can be done by linking to any pre-existing (external) repository such as LDAP or WebSphere Application Server, or through InfoSphere MDM Ref DM Hub. In this instance, the InfoSphere MDM Ref DM Hub role-based security concept was used, where both activities and stated entities are related to roles that are linked to groups that are defined in WebSphere Application Server. The actual users are members of these groups. The following examples are of standard groups that were used:
RDMRole_Administrators
RDMRole_All
RDMRole_Approvers
RDMRole_Custom
RDMRole_Integrators
RDMRole_Stewards
InfoSphere MDM Ref DM Hub post installation activities
The following activities are done after the installation:
Adding any customer-specific customizations to InfoSphere MDM Ref DM Hub
Using scripts to apply additional constraints to the database (mandatory and optional changes, format and length changes)
Using scripts to create materialized query tables to be used for reporting
Using scripts to configure notifications arising from subscriptions
Testing and reviewing
Testing is done at all stages of development. Tests that are done in deployment are the final acceptance that the InfoSphere MDM Ref DM Hub is ready for use (it can “go live”). User acceptance testing is not normally repeated here because that process was closed earlier in the project.
For deployment, the following tests are repeated:
All InfoSphere MDM Ref DM Hub customizations unit tests
All customized code unit tests
System end-to-end testing where sets and types and mappings are moved through their full lifecycle
Integration testing in the user IT environment
User interface testing
Populating InfoSphere MDM Ref DM Hub
An order of creation must be observed when InfoSphere MDM Ref DM Hub is populated with data.
Figure 3-11 shows a summary of the dependencies in InfoSphere MDM Ref DM Hub. Reading from left to right, the sequence is that types must exist before sets or mappings can be created. Sets must exist before they are used in mappings or linked to provide values in other sets.
Figure 3-11 Dependencies in RDM
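The order of creation can be derived from the dependencies with a topological sort. The following is a minimal sketch using the Python standard library (graphlib, available from Python 3.9); the dictionary is a simplified reading of the dependencies described above, not a product artifact, and the item names are chosen for illustration.

```python
from graphlib import TopologicalSorter

# Each key depends on the items in its value set.
# Types come first; mappings need both a mapping type and existing sets.
DEPENDENCIES = {
    "set types": set(),
    "mapping types": set(),
    "sets": {"set types"},
    "mappings": {"mapping types", "sets"},
}


def creation_order(dependencies):
    """Return one valid creation order in which every item follows its prerequisites."""
    return list(TopologicalSorter(dependencies).static_order())
```

Any order that the function returns places set types before sets, and sets and mapping types before mappings. A set that takes property values from another set would add a further dependency between the two sets.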
The result of analysis and iterations identifies the actual reference data sets that are to be loaded into InfoSphere MDM Ref DM Hub. Sets can be loaded manually or by using a batch import job. Another possibility is to import sets with their associated types at the same time. In practice, a mix of manual and import routines will be used depending on the complexity of sets. In some cases, the effort of creating an import file is the same as or greater than that of entering a set manually. All import files that were proved in previous iterations can be used at this stage.
Mappings can be built either manually or by importing; the same considerations of complexity apply as for sets.
Other tasks to complete during implementation include distributing reference data sets and mappings manually or by exporting, incorporating any workflow process, and setting up subscriptions.
This description is for planning purposes only. See Chapter 7, “Implementation” on page 159 for a full discussion and information about implementation.
 