Tom Haughey

The Conceptual Data Model

By Tom Haughey on March 4, 2011

A conceptual data model is a high-level, coarse data model that is preliminary in structure, possibly abstract in content and sparse in attributes, and that is intended to represent a business area. It is preliminary in structure because it may contain many-to-many relationships. It is abstract in content because it may need to suppress some details so as to subsume different subtypes. It is sparsely attributed because it will usually contain only major business attributes. Because of these characteristics, you cannot implement a conceptual data model. An example is shown at the end of this blog.
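To make these three characteristics concrete, here is a minimal Python sketch of how a conceptual model might be held in memory. Everything in it is an assumption made for illustration: the class names (`Entity`, `Relationship`, `ConceptualModel`), the sample entities (`Party`, `Order`, `Product`) and the cardinality strings do not come from any particular tool or methodology.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    """A business entity carrying only its major business attributes."""
    name: str
    major_attributes: list[str] = field(default_factory=list)


@dataclass
class Relationship:
    """A relationship between two entities; "M:M" is allowed at this level."""
    left: str
    right: str
    cardinality: str  # hypothetical notation: "1:1", "1:M" or "M:M"


@dataclass
class ConceptualModel:
    entities: dict[str, Entity] = field(default_factory=dict)
    relationships: list[Relationship] = field(default_factory=list)

    def add_entity(self, name: str, *major_attributes: str) -> None:
        self.entities[name] = Entity(name, list(major_attributes))

    def relate(self, left: str, right: str, cardinality: str) -> None:
        # Both ends must already exist in the model.
        for name in (left, right):
            if name not in self.entities:
                raise KeyError(f"unknown entity: {name}")
        self.relationships.append(Relationship(left, right, cardinality))

    def unresolved_many_to_many(self) -> list[Relationship]:
        """Relationships that a later, more detailed model would have to resolve."""
        return [r for r in self.relationships if r.cardinality == "M:M"]


# Hypothetical business area, small enough to fit on one page.
model = ConceptualModel()
model.add_entity("Party", "Party Name")      # abstract: subsumes customer, supplier, etc.
model.add_entity("Order", "Order Date")      # sparse: major attributes only
model.add_entity("Product", "Product Name")
model.relate("Party", "Order", "1:M")
model.relate("Order", "Product", "M:M")      # preliminary: left unresolved

print([(r.left, r.right) for r in model.unresolved_many_to_many()])
# → [('Order', 'Product')]
```

The unresolved Order-to-Product relationship, the abstract Party entity and the one-attribute entities are exactly what keep such a model from being implemented as it stands.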

Given these characteristics, why in the world would we have one? There are three main reasons:

  • For management presentations. If you have to justify or explain a project to management, you do not have forever. Management has a short attention span. This is not a value judgment about management. If you are presenting to senior management, or to any non-technical management, you need to get to the point in (say) 15 minutes or so. You have to be able to explain the issue to them clearly and quickly. A conceptual data model is ideal for this purpose. For this reason, the conceptual data model should fit on a single page.
  • For planning. A project that is in the planning stages does not need detail yet. You need to know only what overall data, and sometimes only what types of data, are contained in the business area. A mass of detail is not necessary for this. The conceptual data model serves this purpose.
  • To start top-down modeling. There are two circumstances in which a conceptual data model is necessary: new development and major maintenance. New development is development from scratch. Only about 20% of software projects are new development. It is dangerous and cumbersome to do new development only bottom-up. In major maintenance, substantial new requirements are added to an existing system. Such projects will start bottom-up, by reverse engineering the existing database. The conceptual data model will be necessary to understand how to introduce the new requirements.

Let us expand our thoughts on bottom-up in data modeling. Use-case modeling is very popular these days. Use-cases are a simplified form of process modeling and an enriched form of process definition. It is helpful to employ use-cases to expand a data model, but the data model should not rely exclusively on use-cases. All data modeling should be a combination of top-down, bottom-up and middle-out. New development starts top-down. Maintenance starts bottom-up. Developing a data model exclusively from use-cases is essentially doing a data model bottom-up. It restricts the kind of creative thinking that should naturally go into a data model. The conceptual data model is used to start the top-down process. The use-cases provide the bottom-up perspective. Working with the data model in and of itself, exploring business needs, and then expanding the data model from within is the middle-out part of the process. It is where the real creativity occurs. The data modeler and use-case modeler should be simultaneously exposed to requirements from the business experts. Then each individually develops a distinct deliverable from these requirements: the data model in one case, the use-cases in the other. These two models should then be cross-checked and balanced against each other. A future blog will discuss model interaction.
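One simple way to picture balancing the two models against each other is as a two-way coverage check. The Python sketch below is an illustration under stated assumptions, not a description of any particular method: the `balance` function, the use-case names and the entity names are all hypothetical, and each use-case is reduced to the set of entities it mentions.

```python
from __future__ import annotations


def balance(model_entities: set[str],
            use_cases: dict[str, set[str]]) -> tuple[list[str], list[str]]:
    """Compare the entities in a data model with those the use-cases mention.

    Returns two sorted lists:
      - entities the use-cases refer to that the data model lacks
      - entities in the data model that no use-case refers to
    """
    referenced: set[str] = set()
    for entities in use_cases.values():
        referenced |= entities
    missing_from_model = sorted(referenced - model_entities)
    unused_by_use_cases = sorted(model_entities - referenced)
    return missing_from_model, unused_by_use_cases


# Hypothetical inputs: top-down entities and bottom-up use-cases.
model_entities = {"Customer", "Order", "Product", "Supplier"}
use_cases = {
    "Place Order": {"Customer", "Order", "Product"},
    "Ship Order": {"Order", "Shipment"},
}

missing, unused = balance(model_entities, use_cases)
print(missing)  # → ['Shipment']
print(unused)   # → ['Supplier']
```

Neither list is an error by itself. Each is a question to take back to the business experts: is Shipment a genuine entity, and which process maintains Supplier?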

Caveat
The term “conceptual” is not an absolute term with one usage across the IT industry. For example, any methodology or product based on the French Merise methodology will define “conceptual” differently than this blog does. In the Merise methodology, the term “conceptual” refers to the fully detailed, business-oriented data model that is independent of optimization and technology. In the US, the term “logical” data model is generally used to describe such a model.

About the Author

Tom Haughey is considered one of the four founding fathers of Information Engineering in America. He is currently President of InfoModel, LLC, a training and consulting company. His courses on data management, data warehousing, and software development have been delivered to Fortune 100 companies around the world.

PhilJJr
March 28, 2011

Great Article! I find that an Entity-level model with Definitions in the Entity “display box” is a good way to get the conversation going as well.

Phil

Tim Hosking
July 8, 2011

I am really not sure why a conceptual model has to only be “coarse-grained” or “high-level”. I would prefer to say that a conceptual model only contains attributes that are important to the business. I would not expect to see foreign keys or “audit” attributes like “Last Updated By User Id” in a conceptual model. But I would expect to see the full number of attributes that a business person would be looking for. Otherwise, how can you ensure you have all the information needs of the business covered? I would agree that you can use complex attributes at times - e.g. there may be no need to spell out all the possible fields in a structured address entity - these can be shown as a few attributes, like “Structured Street Address” along with “City”, “State”, etc. That would make it easier for business people to read.
Also, if you only include a few attributes, you may miss the opportunity to find that new entities are needed, or you may miss a pattern that could lead to a simpler design.
In my view the conceptual model is a vital input to the logical model, which will produce the database design.

I see conceptual models as being much more than just a few entities with their definitions; they are about capturing all the business information needs as well as the business semantics.

Further, I would be wary of basing a data model on use cases, as they describe the interaction a user expects to have with a system and often do not cover the entire business process. In my experience business process modellers and data modellers should be working closely together. In BPMN you have the data object, which can be used to link to entities in a data model. If a BP modeller finds that there are no entities he can link to in a process that updates some data, then he has uncovered a gap in the data model. Conversely, if he sees an entity in a data model with no corresponding process, he may have uncovered a gap in his process models. The same works for the data modeller watching the work of the BP modeller.

Maniragaba Innocent
September 9, 2011

Hello, I would like to ask for your help in designing a conceptual data model for the management of park information, covering the registration of visitors, their payments and transportation agencies. Thanks.
