Avoid Recurring Data Projects: Metadata, the Unsung Hero
November 15, 2021
Organizations are increasingly engaging in projects to accelerate data democratization. Data democratization focuses on bringing users closer to data, breaking down silos, and creating value from the organization’s data. To accomplish this, organizations take on many initiatives, such as adopting modern cloud architectures, data virtualization, and replication tools; implementing data governance programs and technologies; establishing Centers of Enablement for BI; introducing low-code/no-code AI and ML platforms… and the list goes on.
At the strategic level, these programs all make sense and get complete buy-in, at least until the value promised in the business case fails to materialize. Organizations like Adastra are often called in to help diagnose and remediate ground-level issues that prevent the complete fulfillment of data democratization. Too often, organizations rely on the promise of technology without truly ensuring a sustainable, reusable data layer. Even in organizations that have implemented some Data Governance programs, we often get engaged to help address data management issues. We want to introduce here the concept of data reusability: the ability to leverage the same data across multiple use cases and environments with the assurance that the source data is well understood.
To start, organizations need to have a strong Data Governance program, with technology platforms and organizational capabilities fully developed and deployed. This, coupled with modern architecture, helps lay the foundations of data reusability, but it is not enough. Unfortunately, as time goes by, even the best curated data lake will eventually turn into a data swamp. While this is good for organizations like Adastra and ensures predictable revenues, it also pains us to see our clients in this boat. The question is: “Why does this keep happening?” The answer lies in a small, often overlooked, unsung hero called Metadata.
The concepts of Metadata are well understood, but too often they are not fully leveraged or properly mastered. We have recently seen that organizations that have embraced the mastering of Metadata have significantly reduced their data management costs and attained the nirvana of data reusability. As a starting point, let us reiterate the elements of Metadata management.
Metadata is the data behind the data. It describes what the data is, where it originates, and how it is represented. Metadata management aims at correctly defining, integrating, managing, and sharing reliable metadata within an organization through the combination of organization, policies, processes, procedures, standards, and technology. This is different from Data Quality, which aims to ensure that the data consistently conforms to its definitions and that deviations can be corrected through Data Quality tools and business enablement.
With the adoption of emerging technologies and the shift to digital business strategies, organizations are experiencing an influx of data. This new and vast data offers in-depth insights and holds great potential, but without properly defined and managed metadata, there is little context and comprehension, and the value of the data is lost. With a centralized metadata management system, data across the organization can be used for various analytics needs, so that it serves its purpose consistently across your organization’s business initiatives.
The Business Glossary is the business vocabulary that is well understood across the organization. It is a list of business terms that ensures everyone has the same understanding of their meaning as it pertains to a specific data element, e.g. what is a sale, what is a product.
The Data Dictionary is the collection of attributes for each data element. It details the properties of the data element, e.g. attribute type, attribute name, rule for validation.
Data Lineage is the cataloging of the movement of data elements from one system to another. It provides critical information about the source of the data, the target environment, and the transformations, mappings and transitions applied along the way.
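To make the Business Glossary, Data Dictionary and Data Lineage more concrete, here is a minimal Python sketch of how the three could be represented and linked. It is an illustration only, not a description of any particular metadata tool: the class names, field names, the `trace_origins` helper and the sample element names (such as `mart.sales.order_total`) are all hypothetical and chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlossaryTerm:
    """Business Glossary entry: one agreed business meaning for a term."""
    name: str
    definition: str


@dataclass(frozen=True)
class DictionaryEntry:
    """Data Dictionary entry: the properties of one data element."""
    attribute_name: str
    attribute_type: str
    validation_rule: str
    glossary_term: str  # name of the GlossaryTerm this element implements


@dataclass(frozen=True)
class LineageStep:
    """Data Lineage record: one movement of a data element between systems."""
    source: str
    target: str
    transformation: str


def trace_origins(element, lineage):
    """Walk lineage steps backwards and return the elements with no upstream source."""
    origins, seen, stack = set(), set(), [element]
    while stack:
        current = stack.pop()
        if current in seen:  # guard against circular lineage records
            continue
        seen.add(current)
        parents = [step.source for step in lineage if step.target == current]
        if parents:
            stack.extend(parents)
        else:
            origins.add(current)
    return origins


# Hypothetical sample metadata, invented for this sketch.
term = GlossaryTerm("Sale", "A completed customer order, including tax.")
entry = DictionaryEntry(
    attribute_name="mart.sales.order_total",
    attribute_type="decimal",
    validation_rule="value >= 0",
    glossary_term=term.name,
)
lineage = [
    LineageStep("crm.orders.amount", "lake.orders.amount", "replicated as-is"),
    LineageStep("lake.orders.amount", "mart.sales.order_total", "summed per order"),
    LineageStep("erp.invoices.tax", "mart.sales.order_total", "added to order amount"),
]

origins = trace_origins(entry.attribute_name, lineage)
```

With this sample data, `origins` contains the two upstream elements `crm.orders.amount` and `erp.invoices.tax`. The point of the sketch is the linkage: a report field can be tied to a business definition through the dictionary entry and traced back to its sources through lineage, which is what allows the same data to be reused with confidence.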
We have found that organizations that have gone through the exercise of collecting, cataloguing and mastering Metadata are 10x more likely to succeed in their projects, incur 7x lower costs after implementation and see a 10x increase in ROI. We also see these types of projects delivered faster and within budget.
At first glance, the task of mastering Metadata seems daunting, time-consuming and tedious. It can also feel very expensive to do across the enterprise. At Adastra, we have developed tools and accelerators to help clients quickly, accurately and cost-effectively master Metadata. We also provide organizational guidance on developing the competencies needed to maintain high-quality Metadata going forward.
As you prepare for your next impactful data project, don’t forget about the little hero that will make you successful: Metadata.