Steve Hoberman

Modeling Events and Documents

By Steve Hoberman on January 14, 2011

Welcome back. In my Data Modeling Master Class, one of the definitions I provide for an entity is that in almost all cases it is a who, what, when, where, why, or how. I go on to explain each of these, describing ‘why’ as “Why is the organization in business?” The organization is in business because of the recurring transactions (every minute, daily, monthly, and so on) that occur, such as orders, credits, and returns. There are numerous synonyms for ‘why’, the two most common being ‘event’ and ‘transaction’. I’ll use the term ‘event’ to refer to the ‘why’ in the rest of this blog.

The ‘how’ concept is “How does the business record that these transactions ever took place?” In other words, ‘how’ represents the paper trail. The “paper trail” could be a document (paper or electronic medium) and represents how this event or transaction gets documented. There are numerous synonyms for ‘how’, such as ‘agreement’, ‘contract’, ‘document’, and ‘evidence’. I’ll use the term ‘document’ to refer to the ‘how’ in the rest of this blog.

Events and documents go hand in hand. For example, if someone places an order for goods, the event is the order and the document can be the purchase order, packing slip, and invoice. If an event is a credit, the document is the credit memo. If an event is a purchase, the document is the receipt. And so on.

At a conceptual level, modeling event and document is pretty straightforward. Here is one common way of showing both:

Each Event can be recorded through one or many Documents.

Each Document must record one Event.
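
To make the two rules above concrete, here is a minimal sketch in Python. The class and attribute names (Event, Document, event_type, and so on) are my own illustrative assumptions, not part of any standard notation. It only shows that an Event may hold any number of Documents, while a Document cannot exist without its one Event.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    """The 'why': a business transaction such as an order or a credit."""
    event_id: int
    event_type: str
    # An Event can be recorded through one or many Documents.
    documents: List["Document"] = field(
        default_factory=list, repr=False, compare=False
    )


@dataclass
class Document:
    """The 'how': the paper trail that records exactly one Event."""
    document_id: int
    document_type: str
    event: Event  # mandatory: each Document must record one Event

    def __post_init__(self) -> None:
        if self.event is None:
            raise ValueError("Each Document must record one Event")
        # Keep the one-to-many side of the relationship in sync.
        self.event.documents.append(self)


# Hypothetical example: one order event recorded through three documents.
order = Event(event_id=1, event_type="Order")
Document(101, "Purchase Order", order)
Document(102, "Packing Slip", order)
Document(103, "Invoice", order)

document_types = [d.document_type for d in order.documents]
```

Here the order event ends up with three documents, and trying to create a Document with no Event raises an error.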

At the logical and physical levels, however, it gets trickier, and I have seen many variations in how this is modeled. One common approach is to combine the event and document, usually into an Event entity. The advantage of this approach is simplicity. The drawbacks are that events and documents behave differently, and their requirements can conflict. For example, regulations can require that the actual documents (or links to them) be stored. Reporting can require analyzing the rich information in documents, such as finding emotional-sounding words in a discussion group or searching for the plaintiff in a contract.
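
As a rough illustration of the combined approach, the sketch below uses Python's built-in sqlite3 module to create a single table that carries both event and document details. The table name, column names, and sample values are hypothetical, chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Combined approach: one Event entity that also carries document details.
# All table and column names here are hypothetical.
conn.execute("""
    CREATE TABLE order_event (
        order_number        TEXT PRIMARY KEY,
        order_date          TEXT NOT NULL,
        purchase_order_link TEXT,  -- document detail folded into the event
        invoice_link        TEXT   -- document detail folded into the event
    )
""")

conn.execute(
    "INSERT INTO order_event VALUES (?, ?, ?, ?)",
    ("ORD-1", "2011-01-14", "docs/po-1.pdf", None),
)

combined_row = conn.execute(
    "SELECT order_number, purchase_order_link, invoice_link FROM order_event"
).fetchone()
```

The simplicity is clear: one row per order. The cost shows up as soon as an order has a second invoice or a new kind of document, because there is nowhere to put it without changing the table.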

The other approach is to model event and document separately. The advantage is that we can model each one's behavior and its specific requirements. The drawback is that it becomes harder to know where to place data elements and relationships. For example, would Order Number belong to the Purchase Order or to the Order Event? Would Product relate to the Order Event or to the Invoice? Distinguishing event from document can also blur the line between data and process. The order process is described outside the data model, most likely in a data flow diagram or prototype. The Order entity contains only the data elements and business rules surrounding the order, and does not capture process. Yet when we try to separate the Order Event from order documents such as the purchase order or invoice, the distinction between the Order Process and the Order Event can become blurry.
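
Here is a minimal sketch of the separate approach, again using Python's built-in sqlite3 module. The table and column names are hypothetical, and placing Order Number on the event rather than on the purchase order is just one possible answer to the question above, assumed here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Separate approach: the event and its documents are distinct entities.
# Assumption for this sketch: Order Number lives on the event.
conn.executescript("""
    CREATE TABLE order_event (
        order_number TEXT PRIMARY KEY,
        order_date   TEXT NOT NULL
    );
    CREATE TABLE order_document (
        document_id   INTEGER PRIMARY KEY,
        document_type TEXT NOT NULL,
        -- Each Document must record one Event.
        order_number  TEXT NOT NULL REFERENCES order_event (order_number)
    );
""")

conn.execute("INSERT INTO order_event VALUES (?, ?)", ("ORD-1", "2011-01-14"))
conn.executemany(
    "INSERT INTO order_document (document_type, order_number) VALUES (?, ?)",
    [("Purchase Order", "ORD-1"), ("Packing Slip", "ORD-1"), ("Invoice", "ORD-1")],
)

document_count = conn.execute(
    "SELECT COUNT(*) FROM order_document WHERE order_number = ?", ("ORD-1",)
).fetchone()[0]

# A document that points at a nonexistent event should be rejected.
try:
    conn.execute(
        "INSERT INTO order_document (document_type, order_number) VALUES (?, ?)",
        ("Invoice", "ORD-999"),
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Adding another invoice is now just another row, but every new data element forces the placement question: does it describe the event or one of its documents?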

So like many things in our world, there are multiple ways to model events and documents. Feel free to share your ideas here too. Many of you know I send out a monthly design challenge, and I plan on making this event/document challenge the subject of my next puzzle, so make sure you’re on the list (sign up here: www.stevehoberman.com).

Until the next blog!


About the Author

Steve Hoberman is one of the world’s most well-known data modeling gurus. He understands the human side of data modeling and has evangelized “next generation” techniques. Steve taught his first data modeling class in 1992 and has educated more than 10,000 people about data modeling and business intelligence techniques since then.

Jeff Harris
February 6, 2011

Steve,
I agree with your comments and appreciate your view, but the complexity comes from the reason for storing the data. For example, in an ODS (Operational Data Store), one would ideally capture the details of the document rather than the original document itself. In a DWH (Data Warehouse), one would look only at the rolled-up effect of the data, not so much at the details. And in an application database, one would want to record the original document (e.g., a scanned copy) and most probably key details about the document.
So in short, what I am saying is that the answer to your question of whether to combine the event and the document depends largely on the implementation you will be performing. What works in an application database will not always work in an ODS, and the same applies to a DWH.
I know I have not shed any light on the topic and have most probably muddied the waters even more; I just wanted to offer readers some points to think about.
Jeff
