DATA INSIDER

What Is Dark Data?

Dark data is all of the unused, unknown and untapped data across an organization, generated as a result of users’ daily interactions online with countless devices and systems — everything from machine data to server log files to unstructured data derived from social media.

Organizations may consider this data too old to provide value, incomplete or redundant, or limited by a format that can’t be accessed with available tools. All too often, they don’t even know it exists.

However, dark data may be one of an organization’s biggest untapped resources. Data is increasingly a major organizational asset, and competitive organizations will need to tap into its full value. Further, more stringent data regulations may necessitate complete management of an organization’s data.

The following article explores the definition of dark data and how it can affect your organization, how organizations can research, access and analyze their dark data, and how they can create a comprehensive strategy to prepare for a new data future.

What Is Dark Data?

What percentage of data is dark?

Globally, about 55% of an organization’s data is considered “dark,” meaning data that is unknown, undiscovered, unquantified, underutilized or completely untapped. This is according to the global 2019 State of Dark Data report by TRUE Global Intelligence, sponsored by Splunk. Drilling down, the research found that a third of respondents estimated that more than 75% of their organization’s data is dark, while just 11% reported that less than a quarter of their data is dark.

The numbers varied across the seven global economies surveyed. For instance, while 44% of Chinese respondents reported that at least half their data was dark, 65% said the same in France and Japan. (The global average: 60%.)
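The survey figures above are respondents' estimates. As a rough illustration of how an organization might put its own number on the question, the sketch below computes the share of stored volume that sits in sources nobody uses. The `DataSource` record, the `in_use` flag and the sample inventory are all hypothetical and are not taken from the report; how "in use" is determined is left as an assumption.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    """One entry in a hypothetical data inventory."""
    name: str
    size_gb: float
    in_use: bool  # assumed flag: True if any report, dashboard or model reads it


def dark_share(sources):
    """Return the fraction of total volume held in unused sources (0.0 if empty)."""
    total = sum(s.size_gb for s in sources)
    if total == 0:
        return 0.0
    dark = sum(s.size_gb for s in sources if not s.in_use)
    return dark / total


# Made-up sample inventory, for illustration only.
inventory = [
    DataSource("crm_exports", 120.0, True),
    DataSource("web_server_logs", 300.0, False),
    DataSource("sensor_archive", 180.0, False),
]

print(f"{dark_share(inventory):.0%} of stored volume is dark")
```

With these made-up numbers, 480 of 600 GB is unused, so the script reports 80%. Measuring by volume is only one possible choice; counting sources or records instead would give a different figure.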

How does dark data relate to big data?

Gartner defines big data as high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing. As big data continues to grow exponentially, so too does the amount of hidden dark data. The amount of enterprise machine data, for example, is rising much more quickly than traditional organizational data, and it contains information of increasing importance to strategic business decision-making. Perhaps not surprisingly, then, it’s also a major source of dark data.

What is dark data analytics?

Dark data analytics is any software or solution that enables organizations to better locate, identify and tap into previously unknown data for critical business decision-making. According to the State of Dark Data report, respondents viewed analytics as the leading solution that could adequately address mounting dark data challenges, enabling a wider swath of less technical employees to understand their organization’s needs. Specifically, a dark data analytics solution will provide a more comprehensive, insightful and accurate understanding of users’ data and give them a big picture of their environment.

How is dark data related to AI?

Going forward, it’s likely that dark data analytics will be powered by AI, simply because AI has the ability to process enormous amounts of diverse data at a very high speed. Also, with copious amounts of information, including dark data, AI has the power to produce deep, nuanced and more accurate business insights. Thus, while current AI can’t supplant human thought and creativity, it has the ability to rapidly process data at a scale and a velocity that can’t be replicated by humans. It’s not much of a stretch to foresee that AI and dark data will be further intertwined.

Discovering Dark Data

How do you conduct a dark data assessment?

A dark data assessment can be conducted in several ways. For one, organizations can hire an independent consultant to assess their environment and thoroughly scour for hidden and overlooked dark data.

Another option is to assess the entirety of your environment, which improves visibility and makes data management more efficient and effective. Such an assessment also helps identify compliance violations and find security gaps, vulnerabilities and malicious activity that might put your organization’s data at risk.
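One small piece of such an assessment can be sketched in code. The example below walks a directory tree and lists files that have not been modified for a given number of days, then totals their size by file extension. It is a minimal sketch under stated assumptions: the function names `find_stale_files` and `summarize_by_extension` are hypothetical, "not modified for a year" is only a crude proxy for dark data, and a real assessment would also cover databases, cloud storage and machine data streams.

```python
import os
import time
from collections import Counter


def find_stale_files(root, max_age_days=365, now=None):
    """Return (path, size_bytes, age_days) for files not modified in max_age_days."""
    now = time.time() if now is None else now
    cutoff_seconds = max_age_days * 86400
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # unreadable or vanished file; skip it
            age_seconds = now - info.st_mtime
            if age_seconds > cutoff_seconds:
                stale.append((path, info.st_size, age_seconds / 86400))
    return stale


def summarize_by_extension(stale):
    """Return (extension, total_bytes) pairs for stale files, largest first."""
    totals = Counter()
    for path, size, _age_days in stale:
        ext = os.path.splitext(path)[1].lower() or "(none)"
        totals[ext] += size
    return totals.most_common()
```

A call such as `summarize_by_extension(find_stale_files("/var/log", max_age_days=180))` would show which file types account for most of the untouched volume under that path. Modification time is used rather than access time because many systems do not update access times reliably.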

What dark data research is available?

With most enterprises in very nascent stages of addressing these challenges, concrete dark data research is only just emerging. Leading global analyst firm Gartner, which coined the term “dark data,” has addressed how enterprises can begin to manage “data hoarding” and other related topics.

Before the phenomenon was officially known as “dark data,” consulting firm Deloitte alluded to impending data challenges with a report on how organizations can find opportunities within unstructured data, providing critical foresight into the industry-wide struggles around unknown data that followed.

The most recent dark data research at the time of writing is the 2019 State of Dark Data report. Conducted by TRUE Global Intelligence and sponsored by Splunk, the report surveyed more than 1,300 business managers and IT leaders around the world on how their organizations collect, manage and use data. In addition to the findings noted above, significant findings include:

  • Seventy-six percent of respondents agreed that “the organization that has the most data is going to win.”
  • Sixty percent said that more than half of their organizations’ data is dark, and one-third of respondents said more than 75% of their data is dark.
  • According to business leaders, the top obstacle to recovering dark data is the volume of data, followed by the lack of necessary skill sets and resources.
  • More than half (56%) admit that “data-driven” is just a slogan in their organization.

The report also addresses concerns around the rise of artificial intelligence and the challenges of increasingly data-oriented job roles.

Dark Data: Real-World Uses and Trends

What are the trends in dark data?

A major trend in dark data is a greater awareness of the importance of effectively managing all of an organization’s data. While it’s clear that data is the fuel that will drive future success, most organizations are unprepared to address associated challenges, limited by a skills gap and complacency. Significant dark data trends include:

  • Data will be key to future business success: More than two-thirds of organizations (71%) expect data to become more valuable over the next 10 years, while nearly all expect data to become more influential to their decision-making.
  • Data skills will be critical for future jobs: The vast majority of enterprises believe that data skills will continue to become more important for workers in all roles within their organization, not just IT. What’s more, becoming a senior leader in any organization will almost certainly require being data-literate.
  • Enterprises lack tools to make data usable: Despite the growing importance of data, enterprises acknowledge that their organizations lack the necessary resources, tools and skills to make the abundance of data actionable or take advantage of its benefits.
  • Enterprises have a dark data skills gap: Many enterprise employees are reluctant to learn new skills, with 69% saying that they’re content to keep doing what they’re doing, even if it means not getting promoted.

What are the uses of dark data?

One very important use for dark data will be its role in fueling AI-powered solutions — more data increases the wealth of information that AI can analyze, and should allow AI tools to produce even deeper, more accurate insights.

The number of specific use cases is vast. One of the biggest will be creating and developing new and more productive enterprise business strategies. That includes helping organizations determine which department in the organization owns what data, the type of data owned by management and leadership, and what data they should own. It can also be used to improve quality assurance processes; detect and correct process errors; and look for privacy loopholes, security vulnerabilities and potential compliance violations.

Going forward, dark data can be used proactively to create new data management strategies around rapidly growing technologies such as IoT, and provide the fuel for short- and long-term trend analysis to demonstrate quantifiable results to managers, directors and leadership.

What are some dark data use cases in healthcare?

Because the need for efficiencies, a complete data picture and innovative approaches is acutely important to healthcare, unleashing the power of dark data will have an enormous, long-term impact on the industry. More effective use of data can help hospitals, doctors’ offices and specialists hyper-personalize patients’ medical experience, resulting in better care, and improve the security around personal medical data.

Getting Started

Laying the groundwork for a data culture shift

Enterprises face conflicting challenges: They know that data and AI hold almost unlimited potential to transform their business and that data literacy will be essential to their job function and performance. Yet at the same time, they also have limited confidence in their data acumen, AI expertise and ability to make sound decisions based on the data they have.

In light of this contrast, below are essential recommendations for enterprises attempting to move forward from a place of data uncertainty into a data-driven future.

Be prepared for AI. Stay on top of burgeoning technologies such as AI and machine learning, while also finding use cases appropriate for your industry and organization. Among other things, business and IT leaders should follow general developments in AI and understand how these technologies are maturing in various markets. Also consider the potential for automation to create greater efficiencies and accuracy, and hone your ability to work effectively with large volumes of data.

Build a data-centric culture. Creating the necessary infrastructure will be the first step in making a data-driven future a reality. From there, take steps to understand your data and commit to bringing more of it into the light as a critical part of your business strategy. You’ll also need to put automation and AI on your IT roadmap and infuse data and analytics into strategic decision-making.

Recruit for data skills. In light of an industry-wide data skills shortage, you’ll need to step up recruitment of new data talent. That will mean creating a talent pipeline, collaborating with local colleges, and attending job fairs, tech meetups and other events. Competition is stiff for skilled, data-literate workers; to stand out from your competitors, be sure to raise your organization’s profile as a forward-thinking enterprise to both attract and retain top talent.

Provide training opportunities. It will be incumbent upon you to ensure that your existing workers get the necessary training they need to keep up with new technologies that will help transform your business. Provide opportunities to grow by partnering with online learning sites, sending staff to conferences and events, and providing tuition rebates. Encourage your workers to take charge of their own career development and professional goals, but then give them the tools to see their goals to fruition.

The Bottom Line

Set the stage for a data-driven future

There’s near-universal understanding that data is driving everything — from product development and supply chain to customer experience and overall business strategy — to an unprecedented degree. Yet many of today’s business leaders aren’t fully prepared for this revolution. This presents a challenge to organizations, but there are also opportunities.

Without a doubt, organizations will have to work hard to recruit, hire and train a data-literate workforce to prepare for the realities of a data-oriented future. They’ll have to work hard to instill a data-driven culture. And finally, they will have to take steps to bring all of their data into the light. Data is an increasingly valuable business asset, and businesses will need the people, processes and technology to manage — and maximize the value of — all of it.