Posts From This Author
About Our Authors
Recursion via Subtypes
By Alec Sharp on August 25, 2010View Full Bio →
Way back in March, I set out to write a post on “those data modeling soulmates, generalization and recursion.” Here’s the chronology of how that worked out:
- In April, I ended up writing a post on entity naming, spurred by explaining the name of an associative entity arising from a recursive M:M relationship;
- Then, in May’s post, I discussed “generalization in general,” which did manage to touch on recursion when we looked at how a generalized entity coupled with a recursive 1:M relationship is far more flexible than a rigid hierarchy of ungeneralized entities;
- June’s post looked at five common errors in generalization.
Finally, I’m writing the post I initially set out to, focusing on a very specific way of combining generalization with recursion in which the recursion occurs through one or more of the subtypes. With all of the preceding posts covering the basics, this may well turn out to be the short post I always intend to write.
Recursion via subtypes
Many interesting business rules can be expressed in an entity-relationship data model by combining recursion and subtyping, but even among experienced modelers, I don’t see this technique used as often as I’d expect. Perhaps this post will provide some encouragement to give it a try.
The basic variations
There are two ways in which recursion can occur through the subtypes:
1 – Via a relationship between subtypes
2 – Via a relationship between the supertype and one of the subtypes
The following diagrams illustrate each variation. I’ve shown them with 1:M relationships, but they could just as well be 1:1 (although that’s rare) or M:M.

Of course, either a supertype or a subtype can be related recursively to itself, but here we’ll just consider the aforementioned two patterns.
I’ve had it pointed out to me that this isn’t really recursion, because the relationship line doesn’t “connect a box to itself.” However, it most definitely is recursion because in both cases one instance of an entity is connected to another instance of the same entity. The confusion arises because the connection is through a subtype of the entity.
Let’s also stress that point about connecting one entity instance to a different instance of the same entity. This can also be a source of confusion, especially for the inexperienced, as follows.
1 – Subtype to subtype relationship
The diagram in case 1 can be misinterpreted to mean that one of an instance’s subtypes is related to many of the same instance’s other subtype. This is impossible, because a given instance has precisely one subtype, at least if you follow the “mandatory, mutually exclusive, 1:1” guideline I covered in an earlier post. Thus, one subtype instance couldn’t possible be related to any number of its supertype’s other subtype instances, because there wouldn’t be any. What we actually have in this case is that Subtype 2 of Instance A may be related to Subtype 1 of Instance B, Subtype 1 of Instance C, and so on.
2 – Supertype to subtype relationship
In case 2, it could appear to some that an instance is related to one or more of its own subtype instances via a 1:M relationship. This can’t be correct because (a) the supertype is already related to the subtype via the mutually exclusive, 1:1 supertype – subtype relationship, and (b,) more important, one supertype instance couldn’t be related to many of its own subtype instances because there can be precisely one subtype instance for any one supertype instance. What the diagram actually illustrates is that the Instance A supertype may be related to subtype 2 of Instance B, subtype 2 of Instance C, and so on.
One final point of clarification – like all recursive relationships, these are fully optional. That is, they are optional in both directions. Otherwise, you have the data modeling equivalent of an infinite loop.
Let’s have a look at some examples to illustrate these two patterns at work.
Recursive relationship between subtypes
When a Trade Order is placed on a stock exchange, it is essentially an offer to buy or sell some quantity of a security. In this simple example, we’ll consider Trade Orders offering to either buy or sell a specified number of shares of a stock. Note the use of the words “or” and “either” – good clues that subtyping might be involved.
The stock exchange then “fills” placed Trade Orders by “matching” them with complementary Trade Orders where the buying and selling prices are the same or within some tolerance. Sometimes, this is a 1:1 match – a Sell Trade Order offering 10,000 shares of XYZ Industries is matched with a Buy Trade Order seeking 10,000 shares of XYZ. Just as often there are multiple “fills” or “matches” for a Trade Order. For instance, the Sell Trade Order for 10,000 might be filled by matching it with three separate Buy Trade Orders for 5000, 2000, and 3000 shares. Each of these is effectively a separate sale, and is referred to as either a “trade,” a “match”, or, the term we’ll use, a “fill.”
Using subtyping to distinguish between Buy and Sell Trade Orders is appropriate in this case, with a recursive relationship between the subtypes to represent the “filling” of Trade Orders. The example on the left uses a recursive M:M relationship to represent the following rules:
- Each Buy Trade Order may be filled by one or more Sell Trade Orders
- Each Sell Trade Order may be filled by one or more Buy Trade Orders
In the example on the right, the M:M relationship has been resolved, resulting in the Associative entity Fill. It nicely illustrates the rule that each Fill is a match between precisely one Buy Trade Order and one precisely one Sell Trade Order.

Recursive relationship between supertypes and subtypes
A familiar but very useful example is our old friend the “Party” (or “Legal Entity”) data structure in which Party has two subtypes, Person and Organization. This, in combination with recursion, allows the depiction of all sorts of important rules. For one client, a government registry, it was that an Organization may be owned by one or more Persons and/or one or more other Organizations. This rule can be simplified such that each Organization can be owned by one or more Parties, the supertype. That’s because in supertype-subtype structures, if a fact (attribute, relationship, or rule) is valid for all subtypes, then it is depicted as a fact about the supertype. In this example, the rule can be conveniently represented as a recursive “ownership” relationship between the Party supertype and the Organization subtype. We can’t have a recursive “ownership” relationship on the Party because that would allow Organizations to own Persons, and Persons to own Persons. Your job might make you feel that this is true, but it isn’t!
As in the earlier example, the left side illustrates the initial rule with a M:M relationship, and the right side shows it with the relationship resolved. By the way, in the model we developed for this client, there was another recursive relationship which recorded which Persons held “directorships” in which Organizations. Try modifying the model to reflect that as well.

Closing thoughts
Over the years, I’ve had many, many experiences in which this technique has substantially clarified not just my understanding of the situation, but my client’s understanding. This is one of the subtle but important benefits of representing certain rules with recursive subtype relationships – it can greatly clarify and simplify the situation. I recall modeling flight schedules at a global airline, and the team was well and thoroughly stuck by how to handle all the different types of flights that had been created by marketing arrangements – code share flights, co-share flights, overlap flights, joint venture flights, and others. Handling all the variations seemed intractable until a clear-headed data modeler (I wish it was me, but it wasn’t) stepped back and stated three important facts as assertions (always a useful technique before jumping into diagramming!):
1 – Each Flight is made up of one or more Flight Legs (an uninterrupted journey from one airport to the next);
2 – Each Flight Leg is either a “real” Flight Leg (as recognized by Air Traffic Control) or a “fictitious” flight leg (created for marketing purposes) which was a lovely generalization that resulted, ironically, in subtyping;
3 – Each “fictitious” Flight Leg is based on precisely one “real” Flight Leg, and each “real” Flight Leg may support one or more “fictitious” Flight Legs.
This led to a structure that vastly simplified the situation by employing both subtyping and recursion. Now model it – I’ve given you enough information.
(By the way, our model didn’t use the terms “real” and “fictitious” – we came up with something more palatable to the airline.) The important point is that this depiction was immediately embraced by the airline folks who saw it as a powerful simplification and a way to see that all the variations they’d created were fundamentally the same. Now that’s data modeling!
The other point I want to close with is that these examples shouldn’t be taken as an argument that all rules should be implemented within the data structure and be visible on the diagram – far from it. Many, perhaps most, rules should be expressed declaratively as (surprise!) a business rule. This is true in data modeling just as it is in business process modeling. In that arena, inexperienced practitioners often make the mistake of expressing complex rules within a modeling notation such as BPMN (Business Process Modeling Notation) where they can be extremely difficult to interpret by the people the diagram is intended to serve. A picture may be worth a thousand words, but some well-chosen words are often preferable to an overly complex picture.
Well, so much for the “short” post I’m always hoping to write. Maybe next time…
As always, I’d love to hear your thoughts on this post, and will welcome any suggested topics for future posts.
Follow all Expert Blog updates by subscribing to the
RSS feed.
About the Author
Alec Sharp has managed his consulting and education business, Clariteq Systems Consulting Ltd., for close to 30 years. Serving clients from Ireland to India, and Washington to Wellington, Alec has expertise in a rare combination of fields - data management, business analysis, business process improvement, and enterprise architecture.
“Dying career”? That made me think of this:
http://it.toolbox.com/blogs/infosphere/data-governance-is-career-suicide-38647
Certainly, top-down modeling is an (the?) important aspect of what I do, but I spend a lot more time reverse-engineering existing (pre-packaged) physical models and integrating them. Which means I actually spend a fair amount of time writing SQL for ETL purposes (I’m in the data warehousing game, so we get a lot of data force-fed to us with lot of surprises in every load!)





















October 13, 2010