Data Security, Data Governance, and Information Architecture
By David Loshin on January 24, 2012
We have been doing a lot of work with clients who are essentially seeking to renovate their information architectures, largely under the guise of a technical challenge such as creating a master data repository or improving the quality of how their business processes use information. Two themes crop up with predictable frequency: the desire for better understanding and management of shared metadata, and the need to analyze existing data sets to understand existing assumptions, data dependencies, and embedded integrity relationships. Both expectations can be met through an empirical analysis of data sets selected from across the organization. Different analytical approaches can be used, such as data profiling tools or crafted queries of table metadata or value frequency. These methods help expose the underlying structure, gaps in semantics, issues with referential integrity, and potential value anomalies, so that the information architect can feed the reverse-engineered findings into the information architecture process.
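To make "crafted queries of table metadata or value frequency" concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. Everything specific in it is an assumption made for illustration: the customers and accounts tables, their columns, the sample rows, and the helper names (build_demo_db, profile_column, orphan_count) are all hypothetical, and a real assessment would run comparable queries against the organization's own database rather than an in-memory one.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (hypothetical tables and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, state TEXT)")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, customer_id INTEGER)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "NY"), (2, "ny"), (3, None)],
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [(10, 1), (11, 2), (12, 99), (13, None)],
    )
    return conn


def profile_column(conn, table, column, top_n=5):
    """Return row count, null count, distinct count, and most frequent values.

    Identifiers are interpolated into the SQL text, so pass only trusted
    table and column names.
    """
    rows, nulls, distinct = conn.execute(
        f'SELECT COUNT(*), '
        f'SUM(CASE WHEN "{column}" IS NULL THEN 1 ELSE 0 END), '
        f'COUNT(DISTINCT "{column}") FROM "{table}"'
    ).fetchone()
    top_values = conn.execute(
        f'SELECT "{column}", COUNT(*) AS n FROM "{table}" '
        f'GROUP BY "{column}" ORDER BY n DESC, "{column}" LIMIT ?',
        (top_n,),
    ).fetchall()
    return {
        "rows": rows,
        "nulls": nulls or 0,  # SUM yields NULL on an empty table
        "distinct": distinct,
        "top_values": top_values,
    }


def orphan_count(conn, child, fk, parent, pk):
    """Count child rows whose non-null foreign key matches no parent row."""
    (count,) = conn.execute(
        f'SELECT COUNT(*) FROM "{child}" AS c '
        f'LEFT JOIN "{parent}" AS p ON c."{fk}" = p."{pk}" '
        f'WHERE c."{fk}" IS NOT NULL AND p."{pk}" IS NULL'
    ).fetchone()
    return count


if __name__ == "__main__":
    conn = build_demo_db()
    # Table metadata: column names and declared types.
    print(conn.execute('PRAGMA table_info("customers")').fetchall())
    # Value frequency: exposes the inconsistent "NY" / "ny" spellings.
    print(profile_column(conn, "customers", "state"))
    # Referential integrity: accounts pointing at a missing customer.
    print(orphan_count(conn, "accounts", "customer_id", "customers", "id"))
```

Note that every one of these queries reads the underlying rows directly, which is exactly the kind of access the security policies discussed below tend to restrict.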
Curiously, though, one aspect of data governance presents a barrier to achieving the stated goals, because its assertion and enforcement are designed to prevent exactly the kind of analysis that is so often desired. The issue is information security, and before delving into it in detail, I think a good way to illustrate the dilemma is through some personal anecdotes.
The first happened a few years back: we were on a task to develop an information architecture strategy for a mid-sized financial institution, and our colleagues employed by the client were keen not just to perform a data quality assessment but also to watch us as we went through the process. They had already sunk a significant investment into data management tools (yes, many $$$), and were just waiting for the tools to be installed, configured, and made available for use.
The process actually seemed to take a little longer than expected. The tool had to be reinstalled. It turned out that some other aspect of the suite configuration had been done incorrectly, so they had to uninstall everything and start again. The tool itself had to be configured for use. We had to be granted certain permissions for use of the tool. The tool had to be set up within a particular network domain. Individuals within the network domain needed particular rights for accessing the tool. Individuals with rights to access the tool also needed rights to access the data. We did not have those rights, so they had to set up the configuration again.
And so on and so forth. Essentially, a stream of security, protection, and access rights policies prevented us from getting access to both the tool and the data. This went on for days until our direct contact finally walked over to me, handed me an external USB drive, told me that he had run an extract and downloaded the data onto the drive, and asked if I could please load the data onto my own machine and do the assessment. Of course, the company had a strict rule against copying data to external drives, as well as against handing corporate data to non-employees, but following those rules would have prevented them from doing the assessment they needed in order to improve their processes.
In a second case, our client not only had strict rules about copying data to external drives; the rules were enforced at the system level. You could not write to an external drive. And the data management/data architecture people we worked with did not have access rights to the data they wanted to analyze. So in essence, if they wanted to analyze the data to understand inherent structure and relational dependencies, they would have to direct the people with access rights to perform the analysis, and then work with them to review the results, presumably without looking at the data! The upshot is that they are at a standstill: they can conjecture data architecture requirements, but they have not yet determined how to verify which data sets already conform to their expectations in a way that is consistent with their access control policies.
Third situation: with yet another client, external consultants can't look at the data. Our internal contact does have access rights to the data sets but, yet again, has neither access rights to analysis tools nor the ability to install software on his desktop system. The team is extremely limited in its analysis: without any analysis tools to run against the data, they have to replicate manually what a tool would produce, using sets of SQL queries and manual review of the query results.
Each of these scenarios demonstrates that one set of data governance policies, namely those associated with access rights, data security, and protection, poses constraints and barriers to data access, which then prevent the execution of processes related to other data governance, information architecture, or metadata management tasks. Interestingly, one potential root cause is that data access control and security management are viewed as system policies, not data management policies. Resolving the conflict therefore demands collaboration between the system security team and the data governance team to negotiate the methods by which both sets of policies can be observed.
But this raises a more serious data governance question: to what extent do you extend visibility and access rights for the purpose of analysis, architecture, and stewardship? And in turn, to what extent must an organization specify the requirements for assigning access rights for those purposes? Do these types of requirements change the underlying job requirements for data management personnel? And ultimately, whose job is it to ensure that the data stewards and analysts continue to meet the requirements?
With growing concern about data breaches, laws with increasingly strict penalties for exposure of private or uniquely identifiable information, and greater media publicity for organizations whose lax policies or nonexistent controls allowed data leaks, the question of governance over data security is not going to go away soon. So clearly, there is fertile ground for evolving ways of interleaving the management of data access policies with data architecture and metadata management policies, and we will look into some ideas in upcoming posts.
About the Author
David Loshin, president of Knowledge Integrity, Inc., is a recognized thought leader and expert consultant in the areas of data quality, master data management, and business intelligence.