Master Data Processes and Data Usage
By Tom Haughey on May 20, 2010
In the previous blog, we defined master data (MD) and master data management (MDM). Here we will describe MDM processes and the different types of MD usage. Data modeling plays an important role in MDM processes.
MDM Processes
Processes commonly performed in MDM solutions include:
- Source data identification. Source data includes data from internal and external systems. Typically the same data is stored in multiple places in the organization.
- Data collection and recognition of change data. MDM will need to access the existing sources, extract the data and recognize what MD has changed since the last extract.
- Data transformation. Once the data is recognized and collected, it will need to be transformed (and sometimes enhanced) to a usable, consistent format.
- Data normalization. The purpose of this is to eliminate redundancy. This is usually accomplished through conventional data modeling, including reverse engineering of existing MD databases.
- Enforcement of business rules. Business and integrity rules will need to be applied to the data to ensure data consistency.
- Data quality. It is essential that the quality of the data, and confidence in it, be increased through the detection, correction and prevention of errors in the data.
- Data consolidation. In this, disparate data is merged either physically or dynamically. Data mapping, which defines the transformation rules between source and target data, will be important in achieving this.
- Data storage. We persist the data to preserve it, to share it, to ensure its integrity and to allow access to it in multiple ways.
- Data movement. Because MD is highly shared, it will be moved around the organization more than any other type of data.
- Data governance. Standards, controls and organizations need to be in place to provide guidelines for all of the other processes.
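As an illustration of the change-data step in the list above, here is a minimal sketch of one common way to recognize what has changed since the last extract: comparing a fingerprint of each record between two extracts. The record layout, the customer keys and the function names are assumptions made for this example, not part of any particular MDM product.

```python
import hashlib
import json


def record_fingerprint(record: dict) -> str:
    """Return a stable hash of a record's attribute values."""
    # Sorting keys makes the hash independent of attribute order.
    canonical = json.dumps(record, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_changes(previous: dict, current: dict) -> dict:
    """Compare two extracts keyed by business key and classify the differences."""
    inserted = sorted(k for k in current if k not in previous)
    deleted = sorted(k for k in previous if k not in current)
    updated = sorted(
        k
        for k in current
        if k in previous
        and record_fingerprint(current[k]) != record_fingerprint(previous[k])
    )
    return {"inserted": inserted, "updated": updated, "deleted": deleted}


# Hypothetical extracts of customer master data, keyed by customer number.
previous_extract = {
    "C001": {"name": "Acme Corp", "city": "Boston"},
    "C002": {"name": "Globex", "city": "Chicago"},
}
current_extract = {
    "C001": {"name": "Acme Corp", "city": "New York"},  # city changed
    "C003": {"name": "Initech", "city": "Austin"},      # new customer
}

changes = detect_changes(previous_extract, current_extract)
print(changes)
```

In practice, sources that expose timestamps or change logs may make a full comparison unnecessary; the hash comparison is simply the approach that assumes the least about the source system.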
Layers of Data
There are six layers of data in any organization.
- Metadata, which describes the logical or physical meaning of the data. Metadata comes from many sources, including existing databases, transformation rules, data models, reporting and analytical applications, among others. It is kept in repositories and in the database’s system catalog. Sometimes it is even included in database tables.
- Reference Data, which contains standard codes and descriptions. Tables containing this data usually have a comparatively small number of rows and columns, but are essential for consistency of the data.
- Core Business Entities (Master Data). This represents the parties to the transactions, facts and summaries of the enterprise. This is classical MD, such as Customer, Product, Vendor and Employee. These are subject to change over time. An organization may have many transactions daily, though these do not necessarily change in and of themselves. Overall, MD can be changed often, for example as customers call in modifications to their indicative data.
- Hierarchy Data. This is data that describes the structure of certain core entities, such as product or organization. Typical hierarchies are an organizational structure or a financial reporting structure. Two typical examples of hierarchy data are the product structure at the operational level and the product reporting structure at the analytical level.
- Transaction Event Data. This represents the fundamental activities of the business, the traditional focus of systems, such as Order, Delivery, or Brokerage Trade. This includes aggregate data, which are usually summaries of event data, rolling up along the levels of hierarchy data.
- Audit Data. This is data that tracks the life cycle of individual changes. It usually includes the before-image of the changed data. It can include server logs. This data is generally not made available to the typical business consumer.
The term MD typically includes Core Business Entities, Hierarchy Data and Reference Data. Of course, Metadata needs to be defined for all data.
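To make the relationship between hierarchy data and aggregate data concrete, the following sketch rolls transaction event amounts up along a product hierarchy. The product names, the amounts and the child-to-parent representation are invented for illustration; a real hierarchy would come from the MD store.

```python
from collections import defaultdict

# Hierarchy data: each node maps to its parent (None marks the root).
product_hierarchy = {
    "All Products": None,
    "Hardware": "All Products",
    "Software": "All Products",
    "Laptop": "Hardware",
    "Monitor": "Hardware",
    "Editor": "Software",
}

# Transaction event data: (product, order amount).
orders = [
    ("Laptop", 1200.0),
    ("Monitor", 300.0),
    ("Laptop", 1100.0),
    ("Editor", 50.0),
]


def roll_up(events, hierarchy):
    """Add each event amount to its node and to every ancestor of that node."""
    totals = defaultdict(float)
    for node, amount in events:
        while node is not None:
            totals[node] += amount
            node = hierarchy[node]
    return dict(totals)


totals = roll_up(orders, product_hierarchy)
for name in ("Laptop", "Hardware", "Software", "All Products"):
    print(name, totals[name])
```

The event data itself never changes here; only the hierarchy determines how the summaries are formed, which is why a change to hierarchy data can alter reported totals without any new transactions.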
Changing the Grain of Data
For the general purposes of MDM, it may be necessary to make two surprising types of changes to any master data, namely to generalize the MD and to increase the grain of MD.
MD is often generalized to make it more easily maintained and shared. For example, one could have entirely specialized entities for Customer, Employee and Vendor. Instead, one could generalize these into Business Party, thereby subsuming Customer, Employee and Vendor as three subtypes. Generalization should include entities, attributes and relationships. For example, instead of a specialized Customer and Cosigner relationship, it could be generalized into Business Party and Business Party Relationship. This would enable greater data sharing.
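A minimal sketch of this generalization follows, assuming a simple type discriminator for the subtypes and a typed relationship between any two parties. The class names follow the entities named above; the attribute names, identifiers and sample data are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class PartyType(Enum):
    """Subtype discriminator for the generalized Business Party entity."""
    CUSTOMER = "customer"
    EMPLOYEE = "employee"
    VENDOR = "vendor"


@dataclass(frozen=True)
class BusinessParty:
    party_id: str
    name: str
    party_type: PartyType


@dataclass(frozen=True)
class BusinessPartyRelationship:
    """Generalized relationship; 'cosigner' is just one relationship type."""
    from_party_id: str
    to_party_id: str
    relationship_type: str


# Hypothetical sample data.
parties = {
    "P1": BusinessParty("P1", "Ann Lee", PartyType.CUSTOMER),
    "P2": BusinessParty("P2", "Bob Lee", PartyType.CUSTOMER),
    "P3": BusinessParty("P3", "Parts Inc", PartyType.VENDOR),
}
relationships = [
    BusinessPartyRelationship("P1", "P2", "cosigner"),
    BusinessPartyRelationship("P3", "P1", "supplier"),
]


def find_related(party_id: str, relationship_type: str) -> list:
    """Return the parties linked from party_id by the given relationship type."""
    return [
        parties[r.to_party_id]
        for r in relationships
        if r.from_party_id == party_id and r.relationship_type == relationship_type
    ]


print([p.name for p in find_related("P1", "cosigner")])
```

Because the relationship is held between parties rather than between specific subtypes, a new relationship type can be added as data instead of as a new structure.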
The grain of the original data may need to be increased to allow for multiple companies and countries. This change in grain is usually done by adding to the current natural key. For example, organizations have customers. However, a multinational company may need to add Company Code and Country Code to the primary key of Customer (and other entities). This allows it to differentiate data coming from or belonging to different multinational sources.
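The effect of extending the key can be sketched as follows. The company codes, country codes, customer numbers and names are made up for the example; the point is only that the same customer number can legitimately occur in more than one source.

```python
from typing import NamedTuple


class CustomerKey(NamedTuple):
    """Natural key extended with Company Code and Country Code."""
    company_code: str
    country_code: str
    customer_number: str


# Hypothetical rows from two national systems that reuse customer number 1001.
source_rows = [
    {"company_code": "ACME", "country_code": "US",
     "customer_number": "1001", "name": "Jones Hardware"},
    {"company_code": "ACME", "country_code": "DE",
     "customer_number": "1001", "name": "Mueller GmbH"},
]

# Keyed by customer number alone, the second row overwrites the first.
by_number = {row["customer_number"]: row["name"] for row in source_rows}

# Keyed by the extended key, both customers are preserved.
by_extended_key = {
    CustomerKey(row["company_code"], row["country_code"], row["customer_number"]): row["name"]
    for row in source_rows
}

print(len(by_number), len(by_extended_key))
```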
Three Types of MDM
Typically three types of MDM are defined, namely, collaborative, operational and analytical. They are often presented as different styles of MD Management. This is not quite so. Operational and analytical describe the MD content and how the MD is used. Operational MD supports the transaction events of the business. Analytical MD supports reporting and business intelligence environments. Operational and analytical data are not completely mutually exclusive. An organization will have both types of MD.
The content of the data can be similar but also quite different between operational and analytical MD. For example, Operational MD can include basic customer data. Analytical MD can include customer demographics and marketing data. They will each move differently throughout an organization. Operational MD will feed transactional systems, such as a product bill of materials. Analytical MD, such as a product roll-up hierarchy, will feed reporting systems. So these represent kinds of master data and methods of MD usage, not so much types of MD management.
However, collaborative MDM describes a method for creation and maintenance of the MD in which different parties in the organization work together to produce the MD. MDM should be inherently collaborative.
Next Blog
In our next blog, we will describe the Patterns of MDM, namely, Coexistence, Registry, Transactional Hub, and Consolidation.
About the Author
Tom Haughey is considered one of the four founding fathers of Information Engineering in America. He is currently President of InfoModel, LLC, a training and consulting company. His courses on data management, data warehousing, and software development have been delivered to Fortune 100 companies around the world.