Alec Sharp

Taking it from the Top

By Alec Sharp on September 9, 2010

Oh-oh, here we go again. My last few posts started out as a single point, but by the time I’d covered all the necessary background and preamble, it took four posts, and therefore four months, to cover my initial topic, “generalization and recursion.” This one appears set to follow a similar trajectory. Here’s how things have evolved:

  • At a recent conference I asked one of the other speakers a question, which was dismissed as being “simply a Fifth Normal Form problem.” I pushed back a little, and it was immediately apparent that the speaker knew the words to the formal definition of 5NF, but not what it really meant. “Aha,” I thought, “I have a blog post.”
  • As soon as I started writing about 4NF and 5NF, though, I realized that jumping over First, Second, and Third Normal Forms was a bit rash, especially given that very few people, in the grand scheme of things, really understand them. I run my Data Modeling workshops all over the world, and a constant theme is that about half of the participants have studied normalization in school, and virtually none of them – even the data management professionals – can explain normalization in a way that is understandable to mere mortals. This is a problem – if we can’t explain some of our basic tenets, we’re unlikely to get the traction we need.
  • Once I started writing about how I make normalization relevant to mere mortals, I realized I had to step even further back and describe the graphic principles that make normalization easy to explain. That led to the starting point for this post – graphic principles in data models.

So, let’s look at some of the important principles in how we draw our data model diagrams, beginning with a real-life example to illustrate their importance.

A story
Many years ago, I had an experience on a consulting engagement that illustrates the importance of the layout of a data model diagram. Since then, the same theme has been repeated many times, and not just with data models – the same principles apply in process models, lifecycle models, and so on – but this example is a personal favourite.

My client had engaged a consulting organization to develop a data model that would reflect some important changes in a critical business area. (I was working in a different area at the time.) As I understood the situation, the job went well until it was time to present the model for validation to an audience of senior business and IT directors, at which point things unraveled. The consultants conducted several review presentations, but no one was getting anything, except perhaps a headache. In exasperation, I’m told, one of the senior managers suggested that perhaps Alec (the funny-looking, bald-headed, Canadian consultant) should be asked to present the model on the grounds that “He seems to be able to make things understandable.”

I was game, so I set about learning the model, which was no problem – it was extremely well thought out, with just the right level of generalization the business needed, and appropriate use of some common patterns. So what was the issue? Well, I can’t say that the graphic layout was 100% of the problem, but it was part of it. The consultants had used a form of the “fundamental entities on the bottom, dependency flowing up” layout popular in the Oracle world. (More on this next month.) Models drawn in this fashion are far better than the vast majority of models I encounter, but just don’t align with the instinctive preferences of most people, including me. The model was excellent, as I said, but I really didn’t know until I’d redrawn it – that was actually my first step.

I didn’t have access to the model in a data modeling tool, so I used my favourite tool – the Post-it. I wrote each entity name on a Post-it, and then laid them out according to dependency, strictly top-down. That helped me see the data model in the way that is most intuitive for most people. Think about all the diagrams you see that could be drawn left-to-right, bottom-up, middle-out, or randomly, but are consistently drawn top-down: family trees, organization charts, biological (or other) classification schemes, decision trees, and so on. In all cases, the fundamental concepts go to the top, and the more granular ones go to the bottom. Why not apply the same guidelines to data models, where there is a fundamentally important factor – dependency – that lends itself perfectly to an organized layout? When time or sequence is involved, the natural order is from left to right, but we’ll save that discussion for a future post on process modeling. I’ll also acknowledge that in different cultures what is “natural” is different, and as someone who works all over the world, I encourage you to adapt these guidelines to the bottom-up or right-to-left environments you may be working in.
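For readers who think in code, the Post-it exercise above can be roughly sketched as a small layering routine. This is only an illustration under my own assumptions, not a feature of any modeling tool: the function name `layer_entities`, the (parent, child) pair format, and the sample entities (Customer, Product, Order, Order Line) are all hypothetical. Entities with no parent land in the top row, and every dependent entity lands one row below its lowest parent.

```python
from collections import defaultdict


def layer_entities(entities, relationships):
    """Group entities into rows, top-down, by dependency.

    `relationships` is a list of (parent, child) pairs, one per
    one-to-many relationship, with the child at the "many" end.
    Returns a list of rows; row 0 holds entities with no parent.
    """
    parents = defaultdict(set)
    for parent, child in relationships:
        parents[child].add(parent)

    rows = {}

    def row_of(entity, visiting=()):
        if entity in rows:
            return rows[entity]
        if entity in visiting:
            raise ValueError(f"circular dependency involving {entity}")
        # A recursive relationship (entity is its own parent) does not
        # push the entity further down, so it is skipped here.
        above = [
            row_of(p, visiting + (entity,))
            for p in parents[entity]
            if p != entity
        ]
        rows[entity] = 1 + max(above) if above else 0
        return rows[entity]

    for entity in entities:
        row_of(entity)

    layout = defaultdict(list)
    for entity in entities:
        layout[rows[entity]].append(entity)
    return [layout[r] for r in sorted(layout)]


# Hypothetical sample model, purely for illustration.
ENTITIES = ["Customer", "Product", "Order", "Order Line"]
RELATIONSHIPS = [
    ("Customer", "Order"),
    ("Order", "Order Line"),
    ("Product", "Order Line"),
]

layout = layer_entities(ENTITIES, RELATIONSHIPS)
for number, row in enumerate(layout):
    print(f"Row {number}: {', '.join(row)}")
```

With the sample data, Customer and Product sit in the top row, Order in the next, and Order Line at the bottom, which is the same arrangement the Post-its would give you. Placing a child one row below its lowest parent is a design choice in this sketch so that every parent-child relationship runs downward.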

Returning to the story, I ended up presenting the model (someone else’s!) to an audience of senior managers. Using some guidelines I’ll cover in a future post on “presenting models,” I built the model iteratively, starting with the fundamental (“kernel”) entities across the top of a whiteboard and working my way down through the family tree of dependent entities. It went really well, and lots of good questions came up. After about half an hour, though, came what is essentially the punch line of this story. One of the senior IT managers abruptly stopped my presentation by saying “Wait a minute, there’s something here I don’t understand. We went through several presentations of this model, and none of us got it, and now this fellow has been presenting for 30 minutes and we’re all eating out of his hand. (my italics) Can someone tell me what happened?”

I wanted to say “Well, it’s because I’m an awesome Canadian data modeler” but reason prevailed, so I took a different tack. My answer was essentially “The model you’d originally been shown was excellent, which is what I hoped to show – all I did was reorganize it in a way that made it easier to understand.”

Since then, variations on this scenario have played out many times – taking an existing model, redrawing it so it aligned with people’s natural inclination, and getting a great response. In fact, a high percentage of the income I earn from data modeling involves precisely this – redrawing an existing model to make it understandable to a broad audience. As often as not, some level of simplification is also involved to bring it from the physical or logical level up to the conceptual level, which is yet another topic for a future post. By the way, I’m speaking about E-R models depicting operational data, not dimensional or star schema models.

The bottom line is that models in which dependency flows from top to bottom are most understandable to most people, just as process models and project timelines convey information best when they’re drawn left-to-right (cultural variations notwithstanding).

Given that dependent entities are essentially always at the “crowsfoot” end of a one-to-many relationship, we’ll see on our diagram that the crows are usually standing “upright.” Relationships that are not based on a parent-child dependency will go from side-to-side, so the crows will be sleeping (or drunk) with their happy little crowsfeet pointed to the side. In a conceptual model, that’s the way all M:M relationships would be drawn – side-to-side. Here’s an example showing some typical configuration elements – it mixes things that would normally be seen only at the logical level with things that belong at the conceptual level (like the M:M relationship) but it will serve for now to illustrate what I’m talking about. I’ll have lots more to say about this next month!



You’ll note that none of the crowsfeet point upward – if they did, that would be a sure sign of a dead crow, something you’ll never see in one of my diagrams. Hence my dictum “No Dead Crows.” Jessica Marz, a participant in one of my Data Modeling workshops a few years ago, took this guideline to heart and produced the following graphic for me. I love it!

Next month

I’m just getting warmed up on the topic of following sound graphic principles in laying out an E-R diagram. In the next post, we’ll look at some specific guidelines and examples related to the “no dead crows” guideline, as well as a few other graphic principles to keep in mind. I know from 30 years in the game that data modelers are loath to give up their preferred approaches, and I can already hear some objections (and maybe some cheers?), but please come back for more on this important subject. In the meantime, let me know what you think.


About the Author

Alec Sharp has managed his consulting and education business, Clariteq Systems Consulting Ltd., for close to 30 years. Serving clients from Ireland to India, and Washington to Wellington, Alec has expertise in a rare combination of fields - data management, business analysis, business process improvement, and enterprise architecture.

Alan Gredell
August 25, 2011

Alec, this is just how Bill Smith taught it as well.  Excellent points about how much easier it is to understand concepts by putting the fundamental ones at the top, and the ones dependent upon them below. 

The point about many-to-many relationships being side-by-side wasn’t (I don’t think, but it’s been several years) taught by Bill, but it makes perfect sense.  I’m really enjoying this series!

Todd Everett
October 19, 2011

Great post - thanks for this.  I just discovered your series and am reading through them now and learning what I can.  I never did understand the “up and left” philosophy and I always draw my models “down and right”.  But I never thought about ensuring parent to child relationships always go down and associative always go left/right.  I’ll take up that tip.  Thanks.
