As data-driven business becomes increasingly prominent, an understanding of data modeling and data modeling best practices is crucial. This post outlines just that, along with other key questions related to data modeling such as “SQL vs. NoSQL.”
Data modeling is a process that enables organizations to discover, design, visualize, standardize and deploy high-quality data assets through an intuitive, graphical interface.
Data models provide visualization, create additional metadata and standardize data design across the enterprise.
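To make the idea concrete, here is a minimal sketch of a logical data model expressed as typed entity definitions. The Customer and Order entities, their fields, and the relationship between them are hypothetical examples, not taken from any particular erwin model; the point is that standardized definitions also double as lightweight metadata.

```python
from dataclasses import dataclass, fields

# Hypothetical logical model: two entities and the relationship
# between them, captured as typed, standardized definitions.
@dataclass
class Customer:
    customer_id: int
    name: str

@dataclass
class Order:
    order_id: int
    customer_id: int  # relates each Order back to one Customer
    total: float

# The definitions themselves are inspectable metadata about the design.
order_fields = [f.name for f in fields(Order)]
print(order_fields)  # ['order_id', 'customer_id', 'total']
```

Even a sketch this small shows the standardization benefit: every team that consumes an Order sees the same field names and types.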
As the value of data and the way it is used by organizations has changed over the years, so too has data modeling.
In the modern context, data modeling is a function of data governance.
While data modeling has always been the best way to understand complex data sources and automate design standards, modern data modeling goes well beyond these domains to accelerate and ensure the overall success of data governance in any organization.
As well as keeping the business in compliance with data regulations, data governance – and data modeling – also drive innovation.
Companies that want to advance artificial intelligence (AI) initiatives, for instance, won’t get very far without quality data and well-defined data models.
With the right approach, data modeling promotes greater cohesion and success in organizations’ data strategies.
But what is the right data modeling approach?
The right approach to data modeling is one in which organizations can make the right data available at the right time to the right people. Otherwise, data-driven initiatives can stall.
Thanks to organizations like Amazon, Netflix and Uber, businesses have changed how they leverage their data and are transforming their business models to innovate – or risk becoming obsolete.
According to a 2018 survey by Tech Pro Research, 70 percent of respondents said their companies either have a digital transformation strategy in place or are working on one. And 60 percent of companies that have undertaken digital transformation have created new business models.
But data-driven business success doesn’t happen by accident. Organizations that adopt such a strategy without the necessary processes, platforms and solutions quickly realize that data creates a lot of noise but not necessarily the right insights.
This phenomenon is perhaps best articulated through the lens of the “three Vs” of data: volume, variety and velocity.
The three Vs describe the volume (amount), variety (type) and velocity (speed at which it must be processed) of data.
Data’s value grows with context, and that context comes from other data. This creates an incentive to generate and store ever-higher volumes of data.
Typically, an increase in the volume of data leads to more data sources and types. And higher volumes and varieties of data become increasingly difficult to manage in a way that provides insight.
Without due diligence, the above factors can lead to a chaotic environment for data-driven organizations.
Therefore, data modeling best practice is an approach that allows users to view any data from anywhere – a data governance and management best practice we dub “any-squared” (Any2).
Organizations that adopt the Any2 approach can expect greater consistency, clarity and artifact reuse across large-scale data integration, master data management, metadata management, Big Data and business intelligence/analytics initiatives.
For the most part, databases use Structured Query Language (SQL) for maintaining and manipulating data. This structured approach and its proficiency in handling complex queries have led to its widespread use.
But despite the advantages of such structure, its inherently sequential nature (“this,” then “this”) means it can be hard to operate holistically and deal with large amounts of data at once.
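To illustrate the structured, declarative style described above, here is a minimal relational example using Python’s built-in sqlite3 module. The customers/orders schema is a hypothetical stand-in, not drawn from the post; it simply shows how SQL expresses a multi-table query in one declarative statement.

```python
import sqlite3

# Hypothetical relational schema: customers and their orders,
# linked by a foreign key, held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(id), total REAL)"
)
cur.execute("INSERT INTO customers VALUES (1, 'Acme Corp')")
cur.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, 1, 250.0), (2, 1, 99.5)])

# A complex query expressed declaratively: total order value per customer.
cur.execute("""
    SELECT c.name, SUM(o.total)
    FROM customers c JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
""")
rows = cur.fetchall()
print(rows)  # [('Acme Corp', 349.5)]
```

The strength on display here is also the constraint the post goes on to describe: the query works because both tables conform to a rigid, predeclared schema.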
Additionally, as alluded to earlier, the nature of modern, data-driven business and the three Vs means organizations are dealing with increasing amounts of unstructured data.
As such, in a modern business context, the three Vs have become somewhat of an Achilles’ heel for SQL databases.
The sheer rate at which businesses collect and store data – as well as the various types of data stored – means organizations have to adapt and adopt databases that can be maintained with greater agility.
That’s where NoSQL comes in.
Despite what many might assume, adopting a NoSQL database doesn’t mean abandoning SQL databases altogether. In fact, NoSQL is actually a contraction of “not only SQL.”
The NoSQL approach builds on the traditional SQL approach, bringing old (but still relevant) ideas in line with modern needs.
NoSQL databases are scalable, promote greater agility, and handle changes to data and the storing of new data more easily.
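That schema flexibility can be sketched in a few lines. The example below uses plain Python dicts as stand-ins for documents in a document-oriented NoSQL store (an assumption for illustration; no real database driver is involved): two records of the same logical entity carry different fields, and no schema migration is needed to add the new one.

```python
# A minimal sketch of document-style (NoSQL) flexibility.
# Each dict below stands in for a document in a document store;
# the collection is just a Python list for illustration.
collection = []

# Two records with different shapes: the second adds a nested
# "social" field the first lacks -- no schema change required.
collection.append({"name": "Acme Corp", "industry": "Manufacturing"})
collection.append({"name": "Globex", "industry": "Energy",
                   "social": {"twitter": "@globex"}})

# Queries tolerate missing fields by reading them with a default.
energy = [d["name"] for d in collection if d.get("industry") == "Energy"]
print(energy)  # ['Globex']
```

The trade-off relative to the SQL example is that the structure lives in application code rather than in an enforced schema, which is exactly why data modeling remains important even for NoSQL stores.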
It perhaps goes without saying, but different organizations have different needs.
For some, the legacy approach to databases meets the needs of their current data strategy and maturity level.
For others, the greater flexibility offered by NoSQL databases makes NoSQL databases, and by extension NoSQL data modeling, a necessity.
Some organizations may require an approach to data modeling that promotes collaboration.
Bringing data to the business and making it easy to access and understand increases the value of data assets, providing a return on investment and a return on opportunity. But neither would be possible without data modeling providing the backbone for metadata management and proper data governance.
Whatever the data modeling need, erwin can help you address it.
erwin DM is available in several versions, including erwin DM NoSQL, with additional options to improve the quality and agility of data capabilities.
And we just announced a new version of erwin DM, with a modern and customizable modeling environment, support for Amazon Redshift, updated support for the latest DB2 releases, time-saving modeling task automation, and more.
New to erwin DM? You can try the new erwin Data Modeler for yourself for free!