The Different Layers of Master Data Management

Master data can be defined as the non-transactional, top-level business entities or elements of an organization, which relate to one another in observable ways. An organization may use master data on more than one platform or across a variety of software programs and technologies. Master data can also be described as the data and objects that have been agreed on and shared across an organization. Discussions of master data cover a range of data types: reference data, transactional, unstructured, analytical, and hierarchical data, and metadata. Master data is used across major industries such as healthcare, where master data management has become a sought-after service because of its usefulness and importance.

The Variety of Data

Contrary to what people might think, data is not homogeneous. Different kinds of data have different properties, behaviors, and management needs; in essence, data is heterogeneous. This heterogeneity is one of the major reasons why master data management is needed: it is used to manage the qualitative differences among data entities at both the logical and physical levels. In recent times, there has been strong evidence that data can be categorized within a taxonomy that recognizes the different roles data plays in the operational transactions of the enterprise.

The Layers of Master Data Management

Data can be classified into different layers. Four of these layers, and their significance to a master data management system, are discussed below.

Layer One – Metadata

In most schemes, the first layer of data is metadata. In a logical data model, metadata is the descriptive information about entities, attributes, and relationships. For physical data, metadata is the information about tables and columns. Physical metadata is held in the system catalogs of databases, where it is itself materialized as tables.
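As a minimal sketch of physical metadata living in a system catalog, the example below uses Python's built-in sqlite3 module. SQLite is only one possible platform, and the customer table and its columns are hypothetical names chosen for illustration. The catalog table sqlite_master lists the tables, and PRAGMA table_info lists the columns of one table.

```python
import sqlite3

# Hypothetical table, created only so that the catalog has something to describe.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, full_name TEXT NOT NULL)"
)

# sqlite_master is SQLite's catalog: one row per table, index, view, or trigger.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]

print(table_names)
print(columns)
```

Other database platforms expose the same kind of information through their own catalog views, so the queries differ even though the idea is the same.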

Metadata usually has semantic content that needs to be managed: tables and columns have meanings. The metadata has to be defined before the database is implemented, and ideally it remains unchanged throughout the lifespan of the database. A change to the metadata affects every platform on which the data exists. On some platforms you can run a debug test to try to correct the resulting problems.

Layer Two – Reference Data

Reference data is any kind of data that is used solely to categorize other data found in a database, or solely to relate data in a database to information beyond the boundaries of the enterprise. Reference data tables are also called code tables, lookup tables, or domain values. A typical table in this layer consists of a code column and a description column and holds just a few rows, and its data changes infrequently. Because of this apparent structural simplicity, low volume, and slow rate of change, these tables do not usually get a lot of attention.
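The code-and-description structure described above can be sketched with Python's built-in sqlite3 module. The table names, column names, and status codes below are all hypothetical; the point is only to show a small lookup table categorizing rows in another table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reference (lookup) table: a code column and a description column.
    CREATE TABLE order_status (
        status_code TEXT PRIMARY KEY,
        description TEXT NOT NULL
    );
    INSERT INTO order_status VALUES
        ('N', 'New'), ('S', 'Shipped'), ('C', 'Cancelled');

    -- Data that the reference table categorizes.
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        status_code TEXT NOT NULL REFERENCES order_status (status_code)
    );
    INSERT INTO customer_order VALUES (1, 'N'), (2, 'S'), (3, 'S');
""")

# Count orders per category by joining to the lookup table.
# The LEFT JOIN keeps categories that currently have no orders.
orders_by_status = dict(conn.execute("""
    SELECT s.description, COUNT(o.order_id)
    FROM order_status AS s
    LEFT JOIN customer_order AS o ON o.status_code = s.status_code
    GROUP BY s.status_code
    ORDER BY s.status_code
"""))

print(orders_by_status)
```

Because the order rows store only the code, the description can be corrected in one place without touching the categorized data.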

However, they can represent anywhere from 20% to 50% of the tables in an implemented database. Although these tables receive little attention, IT professionals and data analysts prefer not to change the values in them. Reference data also has something in common with metadata: its physical values have semantic content. This semantic property is why reference data is used to drive business rules. If business rule logic refers to actual data values, it is a near certainty that these values come from reference data tables.

Layer Three – Enterprise Structure Data

Enterprise structure data is what businesses use to report on their activities, for example charts showing organizational structures, accounts, or a business process map. One of the main difficulties with enterprise structure data is managing hierarchies. This kind of data is also quite difficult to change. For example, if a product line moves from one line of business to another, historical reports inevitably have to be produced from the perspective of the product line being the responsibility of either line of business.
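One way to support reports from either perspective is to keep dated assignments rather than overwriting the hierarchy. The sketch below is an illustration under that assumption, not a prescribed design; the product line, lines of business, dates, and amounts are all invented.

```python
from datetime import date

# Hypothetical effective-dated assignments:
# (product_line, line_of_business, valid_from, valid_to).
# valid_to is exclusive; None means the assignment is still current.
ASSIGNMENTS = [
    ("Widgets", "Consumer", date(2020, 1, 1), date(2022, 1, 1)),
    ("Widgets", "Industrial", date(2022, 1, 1), None),
]

# Hypothetical sales facts: (product_line, sale_date, amount).
SALES = [
    ("Widgets", date(2021, 6, 1), 100),
    ("Widgets", date(2022, 6, 1), 250),
]


def owner_on(product_line, as_of):
    """Return the line of business responsible for product_line on a given date."""
    for line, lob, start, end in ASSIGNMENTS:
        if line == product_line and start <= as_of and (end is None or as_of < end):
            return lob
    raise LookupError(f"no assignment for {product_line} on {as_of}")


def report(as_of=None):
    """Total sales per line of business.

    With as_of=None each sale is attributed to the owner at the time of the sale.
    With a date, all history is restated under the structure in force on that date.
    """
    totals = {}
    for line, sale_date, amount in SALES:
        lob = owner_on(line, as_of or sale_date)
        totals[lob] = totals.get(lob, 0) + amount
    return totals


print(report())                  # history as it happened
print(report(date(2023, 1, 1)))  # history restated under the current structure
print(report(date(2021, 1, 1)))  # history restated under the earlier structure
```

Keeping both boundaries of each assignment makes the restated views a matter of choosing a date rather than rewriting stored data.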

Layer Four – Transaction Structure Data

For a transaction to occur, at least two parties must be involved and must exchange operational data. The most common entities involved in a transaction are the product and the customer (B2C), or two products or two services being exchanged (B2B). Transaction structure data is data that represents the direct participants in a transaction, and which must be present before a transaction executes. One major characteristic of transaction structure data is that it is usually implemented as single tables that contain hidden subtypes.

Certain columns in a product table, for example, will only apply to certain kinds of products, or to products at a certain point in their life cycle, or to some kind of externally imposed grouping such as dangerous products. Sorting out what columns are relevant to a particular product record is a difficult and frequently neglected master data management challenge.
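To make the hidden-subtype problem concrete, the sketch below assumes that the subtype of each product is recorded in a product_type column and that someone has written down which columns apply to which subtype. Both the column names and the mapping are hypothetical; in practice, discovering this mapping is the hard part.

```python
# Columns assumed to apply to every product, whatever its subtype.
COMMON_COLUMNS = {"product_id", "name", "product_type"}

# Hypothetical mapping from product subtype to the extra columns that apply to it.
SUBTYPE_COLUMNS = {
    "hazardous": {"un_number", "hazard_class"},
    "perishable": {"shelf_life_days"},
    "standard": set(),
}


def relevant_columns(record):
    """Return the set of columns that apply to this record's subtype."""
    return COMMON_COLUMNS | SUBTYPE_COLUMNS[record["product_type"]]


def misplaced_values(record):
    """Return columns that hold a value although they do not apply to the subtype."""
    allowed = relevant_columns(record)
    return sorted(
        column
        for column, value in record.items()
        if value is not None and column not in allowed
    )


# A perishable product that carries a value in a hazardous-goods column.
milk = {
    "product_id": 1,
    "name": "Milk",
    "product_type": "perishable",
    "shelf_life_days": 7,
    "un_number": "UN1203",
    "hazard_class": None,
}

print(sorted(relevant_columns(milk)))
print(misplaced_values(milk))
```

A check like this only flags values sitting in columns that should not apply; it says nothing about whether the applicable columns are filled in correctly.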

Transaction structure data typically consists of entities with large numbers of attributes, which makes them very easy to spot in data models. This class of data inevitably has problems of identity management. This is easy to appreciate for customers, whose names may be captured incorrectly or may change. Yet even products can change their identifiers as they pass through their life cycle or are rebranded. Standardization of identity is extremely difficult to achieve with transaction structure data, even though it is the subject of many initiatives.
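The sketch below illustrates why identity is hard, using a deliberately naive approach that is an assumption of this example rather than an established matching method: customer names are reduced to a normalized key, and records that share a key are flagged as possible duplicates. The sample names and record ids are invented.

```python
import re
from collections import defaultdict


def match_key(name):
    """Naive key: lowercase the name, drop punctuation, sort the remaining tokens."""
    tokens = re.findall(r"[a-z0-9]+", name.casefold())
    return " ".join(sorted(tokens))


def candidate_duplicates(records):
    """Group record ids whose names share a match key; keep groups of two or more."""
    groups = defaultdict(list)
    for record_id, name in records:
        groups[match_key(name)].append(record_id)
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical customer records: (record_id, captured_name).
customers = [
    (1, "John Smith"),
    (2, "Smith, John"),
    (3, "JOHN  SMITH."),
    (4, "Jon Smith"),
]

print(candidate_duplicates(customers))
```

Record 4 is not grouped with the others, which shows the limit of this kind of normalization: it absorbs differences in case, punctuation, and word order, but not misspellings or genuine name changes.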