Tom Haughey

Data Management and Service Oriented Architecture (SOA)

By Tom Haughey on March 15, 2010

Data management is essential to all system development. Service Oriented Architecture (SOA) is no exception. SOA is a software architecture that defines the use of loosely coupled services to support process and information requirements of the business and its software consumers. It is called service-oriented because the resources in a SOA environment are made available on a network as independent services. These services can be accessed without knowledge of their underlying implementation. As an architecture for modern systems, SOA is here to stay.

Two data management disciplines are important to SOA, namely, data modeling and data mapping. Let us examine the context of SOA first to help us understand where these fit in.

SOA has five levels of abstraction, as follows:

1. Business Event Layer handles the business occurrences (events) that are triggers or responses.

2. Processes Layer handles the activities performed to support the business events.

3. Services Layer contains the atomic or composite services that support the process layer.

4. Data Services (DS) Layer deals with the physical connection to the data.

5. Data Abstraction (DA) Layer holds mappings between the required data and the physical data. It provides the data to the consumer.
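As a rough illustration of how these five layers might relate, here is a minimal sketch in which each layer is a single function. Every name in it (the functions, the `POL_STAT_CD` column, the event shape) is hypothetical, and the top-down call order is an assumption made for the example, not something SOA prescribes.

```python
# All names below are hypothetical; each function stands in for one layer.

# Stand-in for a physical source database.
_PHYSICAL_ROWS = {"123": {"POL_STAT_CD": "A"}}


def data_services_fetch(policy_number):
    """Data Services layer: the physical connection to the data."""
    return _PHYSICAL_ROWS[policy_number]


def data_abstraction_get_policy(policy_number):
    """Data Abstraction layer: maps physical data to the required data."""
    row = data_services_fetch(policy_number)
    return {"policy_status": {"A": "ACTIVE"}[row["POL_STAT_CD"]]}


def policy_status_service(policy_number):
    """Services layer: an atomic service used by the process layer."""
    return data_abstraction_get_policy(policy_number)["policy_status"]


def handle_status_inquiry_process(policy_number):
    """Processes layer: the activity performed to support the event."""
    status = policy_status_service(policy_number)
    return f"Policy {policy_number} is {status}"


def on_business_event(event):
    """Business event layer: a trigger arrives and a response goes back."""
    if event["type"] == "status_inquiry":
        return handle_status_inquiry_process(event["policy_number"])
    raise ValueError(f"unhandled event type: {event['type']}")


response = on_business_event({"type": "status_inquiry", "policy_number": "123"})
print(response)
```

The point of the sketch is that only the bottom two functions know anything about the physical column names; the layers above them work with the business term `policy_status`.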

Here’s where data management comes in. SOA provides data according to the Canonical Information Model, which is governed by the Canonical Data Model. What is a canonical model? It is a logical model independent of technology, implementation, and application-specific data and rules. There are two general kinds of data: data at rest and data in motion. Correspondingly, there are two canonical models: a canonical data model (CDM) and a canonical information model (CIM).

The CDM represents data at rest. It is a logical model of common data independent of application. This model is not physically implemented or materialized as such. It is the basis for definition of the data requirements. This model is represented in ER (entity relationship) format and is normalized.

The CIM represents data in motion, that is, exchanges of data. It is hierarchical and represents the information as the parties will use it. The data exchanges in the CIM are hierarchical projections from the CDM. Unlike the CDM, the CIM can contain redundancies. This model is usually represented in XML format but can be in other formats.
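To make the contrast concrete, the sketch below treats a few normalized records as a stand-in for the CDM and projects them into one hierarchical XML message as a stand-in for the CIM. The entities (`Party`, `Policy`, `Coverage`), the element names, and the sample values are all invented for illustration; a real canonical model would be a design artifact, not Python classes.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


# Hypothetical CDM entities: normalized, each fact stored once, linked by keys.
@dataclass(frozen=True)
class Party:
    party_id: int
    name: str


@dataclass(frozen=True)
class Policy:
    policy_id: int
    holder_id: int  # key of the Party who holds the policy
    status: str


@dataclass(frozen=True)
class Coverage:
    coverage_id: int
    policy_id: int  # key of the Policy this coverage belongs to
    kind: str


def project_policy_message(policy, parties, coverages):
    """Build a hierarchical, CIM-style message for one policy."""
    holder = next(p for p in parties if p.party_id == policy.holder_id)
    root = ET.Element("PolicyMessage", {"policyId": str(policy.policy_id)})
    ET.SubElement(root, "Status").text = policy.status
    # The holder's name is copied into the message: redundancy is fine here.
    ET.SubElement(root, "HolderName").text = holder.name
    coverages_el = ET.SubElement(root, "Coverages")
    for coverage in coverages:
        if coverage.policy_id == policy.policy_id:
            ET.SubElement(coverages_el, "Coverage", {"kind": coverage.kind})
    return root


parties = [Party(1, "A. Example")]
policies = [Policy(100, 1, "ACTIVE")]
coverages = [Coverage(10, 100, "collision"), Coverage(11, 100, "liability")]

message = project_policy_message(policies[0], parties, coverages)
xml_text = ET.tostring(message, encoding="unicode")
print(xml_text)
```

If the same party held several policies, the holder name would appear once in the normalized records but once per message in the exchanges, which is the kind of redundancy the CIM tolerates.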

In SOA, data can come from many sources. This is where data mapping comes in. Each source is a physical database and has its own physical data model. For example, insurance policy data can come from many source systems in the company, but the consumer wants to see one policy status. Data mapping is what achieves this. It is the process of showing how source data is transformed into target data. Target data is the data the consumer uses and sees. Source data is the original data needed to provide the target data. Data mapping is ETL (extract, transform and load) on the fly. Think of a data map as a spreadsheet that defines the source data, the target data and the transformation rules. Source data can be in heterogeneous formats and contexts. Target data should be in one format. This is what makes SOA so practical: you do not have to change the sources just for SOA. You define the target, you define the source and you define how one is transformed into the other.
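The spreadsheet analogy can be sketched in code. In the example below each `MappingRule` plays the role of one spreadsheet row: a source, a target and a transformation rule. The two source systems, their field names and the status codes are hypothetical, chosen only to show two differently shaped sources arriving at one target format.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class MappingRule:
    """One row of the data map: source, target and transformation rule."""
    source_system: str
    source_field: str
    target_field: str
    transform: Callable[[str], str]


# Hypothetical status codes used by an assumed legacy system.
LEGACY_STATUS = {"A": "ACTIVE", "L": "LAPSED"}

DATA_MAP = [
    MappingRule("legacy_admin", "POL_STAT_CD", "policy_status",
                lambda v: LEGACY_STATUS[v.strip()]),
    MappingRule("legacy_admin", "POL_NO", "policy_number",
                lambda v: v.lstrip("0")),
    MappingRule("web_portal", "status", "policy_status",
                lambda v: v.upper()),
    MappingRule("web_portal", "policyNumber", "policy_number",
                lambda v: v),
]


def map_record(source_system, record):
    """Apply every rule for this source on the fly; the source is untouched."""
    target = {}
    for rule in DATA_MAP:
        if rule.source_system == source_system and rule.source_field in record:
            target[rule.target_field] = rule.transform(record[rule.source_field])
    return target


legacy_row = {"POL_STAT_CD": "A ", "POL_NO": "000123"}
portal_row = {"status": "active", "policyNumber": "123"}

legacy_target = map_record("legacy_admin", legacy_row)
portal_target = map_record("web_portal", portal_row)
print(legacy_target)
print(portal_target)
```

Both source rows end up as the same target record, and neither source had to change. Adding a third source system would mean adding rows to `DATA_MAP`, not rewriting the consumer.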

The canonical data model shows you the data and its rules. The canonical information model shows you the data as consumers will use it. Data mapping shows you how you go from one to the other. The glossary defines all the business terms. The following diagram illustrates this:

Master Data Management (MDM) plays an important role in all systems, especially SOA. MDM will be discussed in our next blog post.


About the Author

Tom Haughey is considered one of the four founding fathers of Information Engineering in America. He is currently President of InfoModel, LLC, a training and consulting company. His courses on data management, data warehousing, and software development have been delivered to Fortune 100 companies around the world.
