Data architecture and enterprise architecture (EA) overlap, because data architecture is actually an offshoot of enterprise architecture. Even so, there are stark differences between the two.
In simple terms, EA provides a holistic, enterprise-wide overview of an organization’s assets and processes, whereas data architecture gets into the nitty-gritty.
The difference between data architecture and enterprise architecture can be represented with the Zachman Framework. The Zachman Framework is an enterprise architecture framework that provides a formalized view of an enterprise across two dimensions.
The first deals with interrogatives (what, how, where, who, when, and why), which form the columns. The second deals with reification (the transformation of an abstract idea into a concrete implementation), which forms the rows, or levels.
We can abstract the interrogatives in the columns into data, process, network, people, timing and motivation perspectives.
So, in terms of the Zachman Framework, the role of an enterprise architect spans the full schema.
A data architect’s scope, by contrast, is mostly limited to the “What” (data) column, viewed from a system model/logical (level 3) perspective.
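As a rough illustration, the two scopes can be sketched as cells in a grid. The column-to-perspective mapping follows the text above; the row labels are an assumption based on one common naming of the Zachman levels, and the function names are hypothetical.

```python
from itertools import product

# Columns: the interrogatives, mapped to the perspectives named above.
COLUMNS = {
    "what": "data",
    "how": "process",
    "where": "network",
    "who": "people",
    "when": "timing",
    "why": "motivation",
}

# Rows: reification levels. Labels are an assumption; wording varies by source.
ROWS = {
    1: "scope / contextual",
    2: "business model / conceptual",
    3: "system model / logical",
    4: "technology model / physical",
    5: "detailed representation",
    6: "functioning enterprise",
}

def enterprise_architect_scope():
    """Every (level, interrogative) cell in the grid."""
    return set(product(ROWS, COLUMNS))

def data_architect_scope():
    """Mostly the 'what' (data) column at level 3."""
    return {(3, "what")}

ea = enterprise_architect_scope()
da = data_architect_scope()
print(len(ea))   # 36 cells in total
print(da <= ea)  # True: the data architect's cell sits inside the EA grid
```

In practice a data architect touches neighboring cells as well, which is why the text says "mostly limited"; the single cell here is a simplification.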
We’re working in a fast-paced digital economy in which data is extremely valuable. Those that can mine it and extract value from it will be successful, from local organizations to international governments. Without it, progress will halt.
Good data leads to better understanding and ultimately better decision-making. Those organizations that can find ways to extract data and use it to their advantage will be successful.
However, we really need to understand what data we have, what it means, and where it is located. Without this understanding, data can proliferate and become more of a risk to the business than a benefit.
Data architecture is an important discipline for understanding data and includes data, technology and infrastructure design.
Data modeling is a key facet of data architecture and is the process of creating a formal model that represents the information used by the organization and its systems.
It helps you understand data assets visually and provides a formal practice to discover, analyze and communicate the assets within an organization.
There are various techniques and sets of terminology involved in data modeling. These include conceptual, logical, physical and hierarchical models, as well as knowledge graphs, ontologies, taxonomies, semantic models and many more.
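To make the logical and physical levels a little more concrete, here is a minimal sketch. The `Customer` and `SalesOrder` entities, the `SQL_TYPES` mapping and the `to_ddl` helper are all hypothetical and chosen only for illustration; real modeling tools capture far more (keys, constraints, relationships).

```python
import sqlite3
from dataclasses import dataclass, fields

# Logical level: entities and attributes, independent of any database product.
@dataclass
class Customer:
    customer_id: int
    name: str

@dataclass
class SalesOrder:
    order_id: int
    customer_id: int  # relationship: each order belongs to one customer
    total: float

# Physical level: one possible mapping of attribute types to SQLite column types.
SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL"}

def to_ddl(entity):
    """Derive a CREATE TABLE statement from a logical entity."""
    cols = ", ".join(f"{f.name} {SQL_TYPES[f.type]}" for f in fields(entity))
    return f"CREATE TABLE {entity.__name__} ({cols})"

# Create the physical tables in an in-memory database.
conn = sqlite3.connect(":memory:")
for entity in (Customer, SalesOrder):
    conn.execute(to_ddl(entity))

print(to_ddl(Customer))
# CREATE TABLE Customer (customer_id INTEGER, name TEXT)
```

The same logical entities could be mapped to a different physical design, which is the point of keeping the two levels separate.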
Data modeling has gone through four basic growth periods:
Early data modeling, 1960s-early 2000s. With the advent of the first pure commercial database systems, both General Electric and IBM came up with graph forms to represent and communicate the intent of their own databases. The evolution of programming languages had a strong influence on the modeling techniques and semantics.
Relational data modeling, 1970s. Edgar F. Codd published ideas he’d developed in the late 1960s and offered an innovative way of representing a database using tables, columns and relations. The relations were accessible by a language. Much higher productivity was achieved, and IBM released SQL (structured query language).
Relational model adoption, 1980s. The relational model became very popular, supported by vendors such as IBM, Oracle and Microsoft. Most industries adopted the relational database systems and they became part of the fabric of every industry.
Growth of non-relational models, 2008-present. With increasing data volumes and digitization becoming the norm, organizations needed to store vast quantities of data regardless of format. The birth of NoSQL databases provided the ability to store data that is often non-relational, doesn’t require rigor or schema and is extremely portable. NoSQL databases are well-suited for handling big data.
Data modeling is therefore more necessary than ever before when dealing with non-relational, portable data because we need to know what data we have, where it is, and which systems use it.
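A small sketch shows why schema-less data still needs modeling. The two JSON documents and the `infer_fields` helper are invented for illustration; they stand in for records that a document store would accept without complaint even though their shapes disagree.

```python
import json
from collections import defaultdict

# Two "order" documents with no enforced schema (sample data is made up).
docs = [
    '{"id": 1, "customer": "Acme", "total": 120.5}',
    '{"id": 2, "customer": {"name": "Globex"}, "items": ["bolt", "nut"]}',
]

def infer_fields(raw_docs):
    """Collect every top-level field and the value types seen for it."""
    seen = defaultdict(set)
    for raw in raw_docs:
        for key, value in json.loads(raw).items():
            seen[key].add(type(value).__name__)
    return {key: sorted(types) for key, types in seen.items()}

for field, types in infer_fields(docs).items():
    print(field, types)
# id ['int']
# customer ['dict', 'str']
# total ['float']
# items ['list']
```

Here `customer` is a string in one document and a nested object in the other. The database does not flag the discrepancy, so it falls to a data model to record which shape is intended and which systems rely on it.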
The location and usage of data are key facets of EA. Without the context of locations, people, applications and technology, data has no true meaning.
For example, an “order” could be viewed one way by the sales department and another way by the accounting department. We have to know if we are dealing with a sales order from an external customer or an order placed by our organization with the supply chain for raw goods and materials.
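One way to picture this is to attach the missing context to the data itself. The sketch below is an assumption-laden illustration: the `OrderKind` values, the field names and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

class OrderKind(Enum):
    SALES = "sales order from an external customer"
    PURCHASE = "order placed with a supplier for raw goods and materials"

@dataclass(frozen=True)
class Order:
    order_id: str
    kind: OrderKind
    counterparty: str       # the customer or the supplier
    owning_department: str  # who is responsible for the record
    source_system: str      # hypothetical application that holds it

orders = [
    Order("SO-1001", OrderKind.SALES, "Acme Corp", "sales", "crm"),
    Order("PO-2001", OrderKind.PURCHASE, "Steel Supply Ltd", "procurement", "erp"),
]

def sales_orders(all_orders):
    """Keep only the orders that came from external customers."""
    return [o for o in all_orders if o.kind is OrderKind.SALES]

print([o.order_id for o in sales_orders(orders)])  # ['SO-1001']
```

Without the `kind`, department and system fields, both records would simply be "orders" and a report could not tell them apart.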
Organizations using erwin Evolve can synergize EA with wider data governance and management efforts. That means a clear and full picture of the whole data lifecycle in context, so that the intersections between data and the organization’s assets are clear.
You can even try erwin Evolve for yourself and keep any content you produce should you decide to buy.