Creating a Single Source of Truth with a Governed Data Lake

Leveraging a big data lake environment to consolidate enterprise data into a single governed source of truth.

Project Overview

The IT department of a national telecom provider was tasked with bringing viewership data under the IT umbrella. This involved updating, consolidating, and extending the data, as well as implementing management and governance of the data.

Previously, viewership data resided inside disparate business units and was not readily available to everyone who could benefit from it. RevGen was asked to lead the effort to understand the current environment and plan the overall migration into IT. In addition, we were responsible for recommending the needed changes, management, and governance that could be put in place to ensure accuracy, completeness, and confidence in the data going forward. 

Client Challenge

Our client had many business decisions that could have benefited from the insight of their viewership data. However, only portions of the overall data existed within the organization, and this data was built specifically to meet the needs of single business units and was not available enterprise-wide.

Disparate data

After many years of mergers and acquisitions, the organization was made up of multiple legacy companies, each with its own unique collection methods and rules applied to viewership data. Standardization of the data was needed. Additionally, the existing data was not built for broad organizational use and had logic and rules applied for individual business units’ specific use cases.

Lack of quality controls

No one truly understood the extent of the data completeness and accuracy. Consistent data quality efforts were not in place.

Lack of holistic viewership data

There was no single place within the organization where all data for each of the company’s viewing platforms was collected for analysis.

Approach

RevGen knew that a thorough analysis of the existing solutions was needed first, to understand the processes and challenges of the current environment. Understanding the needs of current and future data consumers was equally important in defining the desired result: a dataset processed enough to be used by the whole enterprise but not so specific as to exclude any business units. Along with our client’s IT representatives, we entered an analysis and planning phase, which was immediately followed by implementation phases.

Solution

RevGen produced a multi-phased implementation roadmap to accomplish both near-term and long-term goals. The near-term phase included moving the core data from their live TV platform under the IT umbrella, with an audit logging framework, a data quality framework, and standardization in place.

The long-term phase included the addition of platforms, one by one, following business prioritization, and the continued build-out of overall data quality. RevGen assisted the IT department during implementation by taking on lead architecture, project management, and data engineering roles.

Multi-phased implementation roadmap

Detailed multi-phased roadmap that gave our client’s leadership the direction, staffing, and budgeting they needed to proceed with confidence.

Data quality framework

Detailed data quality framework that included technical implementation along with overall workflow including roles and responsibilities.
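As an illustration only, the technical side of a data quality framework often comes down to running named rules over records and reporting a pass rate per rule. The sketch below is not the client's implementation: the record fields (`device_id`, `channel`, `duration_sec`), the `Rule` class, and the `run_rules` function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Rule:
    """A named data quality rule (hypothetical structure for illustration)."""
    name: str
    check: Callable[[dict], bool]  # returns True when a record passes


def run_rules(records: Iterable[dict], rules: List[Rule]) -> Dict[str, float]:
    """Return the pass rate (0.0 to 1.0) of each rule over the records."""
    rows = list(records)
    total = len(rows)
    report = {}
    for rule in rules:
        passed = sum(1 for row in rows if rule.check(row))
        # An empty input is treated as fully passing; this is an assumption.
        report[rule.name] = passed / total if total else 1.0
    return report


# Assumed rules: one completeness check and one accuracy check.
RULES = [
    Rule("device_id_present", lambda r: bool(r.get("device_id"))),
    Rule(
        "duration_non_negative",
        lambda r: isinstance(r.get("duration_sec"), (int, float))
        and r["duration_sec"] >= 0,
    ),
]

# Made-up sample records, not real viewership data.
sample = [
    {"device_id": "A1", "channel": "news", "duration_sec": 120},
    {"device_id": "", "channel": "sports", "duration_sec": 45},
    {"device_id": "B7", "channel": "movies", "duration_sec": -5},
    {"device_id": "C3", "channel": "news", "duration_sec": 300},
]

report = run_rules(sample, RULES)
for name, rate in report.items():
    print(f"{name}: {rate:.0%} of records passed")
```

In a workflow with defined roles and responsibilities, a report like this would typically be routed to the owner of each rule, who decides whether a low pass rate blocks publication of the dataset.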

Consumption layer

Defined standardized datasets, agreed upon enterprise-wide, that would function as the consumable data.

Results

Within a very short timeframe, RevGen was able to assist the company in understanding, planning, and delivering on the near-term phases of work. This resulted in meaningful, finalized data available in a single governed data lake for anyone in the enterprise to consume. The data lake continues to grow and has allowed high-value business use cases to be solved easily, without analysts having to spend time sourcing, standardizing, and reconciling data. Today, their data lake is a multi-tenant environment serving the data and analytic needs of many different business units.

Enabled high-value business use cases

This work helped enable use cases such as data monetization, increasing media revenue, reducing programming expenses, personalizing and improving the customer experience, mitigating churn, optimizing pricing, and identifying upsell and cross-sell opportunities.

Improved user experience and efficiencies

Saved analysts time pulling data from multiple systems and eliminated the need to reconcile and standardize disparate data across legacy companies.

High-performing and extensible big data platform

This was the first enterprise-wide implementation of a cost-effective, extensible, high-performing, multi-tenant Hadoop platform, utilizing technologies such as HDFS, Hive, Spark, HBase, and Kafka to enable additional high-value big data use cases for the organization.
