
What Is Data Fabric?

Whether viewed as a concept for organizing data or as an architecture for putting it to use, the term data fabric describes a methodology for integrating data across all storage and use environments while applying a common set of protocols, procedures, organization and security. The data fabric concept is inextricably linked to other big data concepts, including data lakes, data warehouses, data meshes and even data lakehouses.

A core principle of data fabric architecture is that it is applied across all data structures and data sources in a hybrid multicloud environment, from on-premises to cloud to edge.

The end goal of a data fabric is to make an organization’s data useful to as many people as possible, not just data scientists and data engineers, as quickly and as safely as possible. It does this by standardizing data management and data governance practices, making data visible and delivering insights to a range of business users, all while maintaining control, protection and data security.

Noel Yuhanna of Forrester Research is credited with being one of the first to articulate the idea of a data fabric architecture. Yuhanna refers to data fabric as a platform that helps organizations adopt new business processes faster. According to Yuhanna, data fabric “automates the ingestion, curation, transformation, governance and integration of data across disparate data in real time and near real time.”

The data fabric concept is a step toward moving enterprise data away from central, on-premises databases, decoupling data from physical servers and providing each user with the data access they need, in the format they need, regardless of where they or the data are located.

In the following article, we’ll detail use cases of a data fabric framework, compare it to other data architectures and discuss the benefits and how it can bring value to your organization.


What value does data fabric bring to an organization?

Concepts like data fabric architecture were unknown, and unnecessary, when all of an organization's data sat on a single on-premises server. As data networks have grown in size and complexity, alongside the rise of hybrid cloud infrastructures, more complex workloads and increased risk around data pipelines, IT departments face growing requirements to keep their data safe, secure and usable, and to track and access it in real time, on demand, throughout its lifecycle.

A data fabric solution can be an effective way to ensure that all of an organization's data, in all formats, is accounted for and searchable when needed. In a world of highly disparate storage systems, concepts like a data fabric architecture are essential to ensuring the quality and usability of an organization's most valuable asset. The reason? Data fabrics bridge data silos without creating new ones, by providing connectivity at the compute/processing layer rather than the data/storage layer. This, in turn, makes data accessible, enabling self-service data sharing, consumption and collaboration across the entire environment.
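
To make the compute-layer idea concrete, here is a minimal sketch, assuming a hypothetical on-premises SQLite database and a hypothetical CSV export from a cloud application, of joining two silos at processing time without copying either into a new central store:

```python
# A minimal sketch: connecting two separate silos at the compute layer
# instead of consolidating them into yet another store.
# Assumes pandas is installed; the database file, table and CSV path are hypothetical.
import sqlite3
import pandas as pd

# Silo 1: customer records in an on-premises relational database
conn = sqlite3.connect("crm.db")                      # hypothetical on-prem database
customers = pd.read_sql("SELECT customer_id, region FROM customers", conn)

# Silo 2: order events exported from a cloud application as CSV
orders = pd.read_csv("s3_export/orders.csv")          # hypothetical cloud export

# The join happens in the processing layer; neither source is moved or restructured
report = orders.merge(customers, on="customer_id").groupby("region")["amount"].sum()
print(report)
```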

Gartner named data fabric one of its top strategic technology trends for 2022, and the analyst firm finds that data fabric can reduce data management efforts by up to 70%, thanks to the use of analytics to learn and actively manage data recommendations and usage.

What are the steps for implementing a data fabric framework?

Implementing a data fabric framework can be done in any number of ways, depending on an organization's skills, timeframe and budget, as well as its data environment. Integration among the various components of a data fabric is usually handled via APIs, with data commonly exchanged in the JSON format. That said, industry experts are still defining the concept of a data fabric infrastructure. Some vendors offer end-to-end data management solutions that exhibit many of the characteristics of a data fabric, and more organizations are adopting a data fabric approach as part of their service offerings.
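
As one illustration of API- and JSON-based integration, here is a minimal sketch, assuming a hypothetical metadata catalog endpoint and payload fields (not a real product API), of one component registering a dataset so that other components can read the same record:

```python
# A minimal sketch: two data fabric components exchanging metadata over an HTTP API
# using JSON. The endpoint URL and payload fields are hypothetical.
import json
import requests

catalog_url = "https://catalog.example.com/api/v1/datasets"   # hypothetical metadata catalog

# Register a newly ingested dataset with the catalog
payload = {
    "name": "sales_orders",
    "location": "s3://raw-zone/sales/orders/",
    "format": "parquet",
    "classification": "internal",
}
resp = requests.post(catalog_url, json=payload, timeout=10)
resp.raise_for_status()

# Downstream components (governance, lineage, access control) consume the same JSON record
dataset = resp.json()
print(json.dumps(dataset, indent=2))
```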

What are some use cases for data fabric architecture?

A data fabric architecture is designed to create an overall schema for managing all types of stored datasets so that they can be made useful when needed. Anything from a sales forecast to a report on the status of an organization’s IT infrastructure or user endpoints can make use of these types of data.

The use cases for data fabric architecture are the same as the use cases for any kind of data in an organization, from sales to marketing to IT to cybersecurity and more. In almost every use case, an organization's data falls into one of three categories: structured, semi-structured or unstructured. Structured data, such as database records, can be stored in a relational database and put to use without additional manipulation.

Unstructured data has not been cleansed or organized and must be made ready for use when called upon. Output from machine learning and analytics tools, sensor readings, and data generated by cloud computing and productivity applications are common examples that many organizations gather and save for future use. Semi-structured data (e.g., zip files, web pages and emails) contains elements of both, with data of a known type stored alongside unstructured content.
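
A minimal sketch of the three categories, using hypothetical file names and records, shows why they demand different handling: structured rows are queryable as-is, semi-structured records mix known fields with free text, and unstructured bytes must be prepared before analysis.

```python
# A minimal sketch contrasting structured, semi-structured and unstructured data.
# Table contents, the email record and the file name are hypothetical.
import json
import sqlite3

# Structured: rows in a relational table, usable without further manipulation
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.execute("INSERT INTO orders VALUES (1, 19.99)")
print(conn.execute("SELECT SUM(amount) FROM orders").fetchone())

# Semi-structured: known fields (sender, subject) stored alongside free text
email = json.loads('{"from": "a@example.com", "subject": "Q3 numbers", "body": "see attached..."}')
print(email["subject"])

# Unstructured: raw bytes with no schema, awaiting cleansing and organization
with open("sensor_dump.bin", "rb") as f:   # hypothetical raw capture
    raw = f.read()
print(len(raw), "bytes to prepare before use")
```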

If you research uses of data fabric, you will find dozens of potential use cases based on the data fabric’s ability to help organizations access and use their data more quickly and effectively. Some common scenarios involve:

  • Real-time data analytics
  • IoT analytics
  • Customer intelligence
  • Fraud detection
  • Operational efficiency gains
  • Supply chain logistics

There are dozens of potential use cases for data fabric architecture that help organizations access and make use of their data strategically.

Key advantages of a data fabric include:

  • The ability to intelligently ingest, clean, organize and manipulate data for future consumption, regardless of where it is sourced. A data fabric supports data from multiple applications originating anywhere in an organization’s IT infrastructure — from edge to cloud — and makes data preparation and orchestration easier for end users.
  • A set of guidelines and policies that ensure data is stored, cleansed, labeled and otherwise manipulated in a consistent way across the organization: These guidelines govern both structured data stored in the format that end users require (such as in a data warehouse) and unstructured data stored in a data lake and cleansed on extraction (see the sketch after this list).
  • A data fabric architecture encompasses all data structures throughout an organization's ecosystem, from on-premises servers to edge networks to virtualization and cloud storage. Ensuring consistency in a hybrid cloud environment is one of the biggest data challenges facing modern IT departments. A data fabric enforces consistency at every step.
  • The ability to apply artificial intelligence (AI) and machine learning (ML) to your data in a consistent, effective manner for data analytics. Forward-looking organizations that use AI- and ML-based IT tools gain a distinct advantage in a data fabric architecture. Ensuring consistency, connectivity and quality across an organization’s data landscape paves the way for AI use at scale.
  • A data fabric architecture makes all your data available in the right place at the right time. By bringing together disparate elements of your network, you get the advantages of having an architecture that is highly modular and functions seamlessly with your apps, processes and overall IT infrastructure.
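
On the governance point above, here is a minimal sketch, assuming hypothetical policy rules, dataset names and fields, of applying one labeling policy to every registered dataset regardless of where it lives:

```python
# A minimal sketch: one classification policy applied consistently at registration time.
# The policy rule, dataset names and source locations are hypothetical.
from dataclasses import dataclass

@dataclass
class Dataset:
    name: str
    contains_pii: bool
    source: str
    classification: str = "unclassified"

def apply_policy(ds: Dataset) -> Dataset:
    # The same rule runs whether the data sits on-premises, in the cloud or at the edge
    ds.classification = "restricted" if ds.contains_pii else "internal"
    return ds

datasets = [
    Dataset("customer_emails", contains_pii=True, source="crm.db"),
    Dataset("click_stream", contains_pii=False, source="s3://raw-zone/clicks/"),
]
for ds in map(apply_policy, datasets):
    print(ds.name, "->", ds.classification)
```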

What is the difference between data lakes and data fabric?

A data lake is a place for storing data and data assets, whereas a data fabric is a methodology for extracting and using it. The two terms are related; many experts see a data fabric as the best way to get the most value out of data stored in a data lake. But the two terms have distinctly different meanings.

A data lake is a repository of data in its raw format, not sorted or indexed in any way. The data can be anything from a simple file to a binary large object (BLOB) like a video, audio, image or multimedia file. Any manipulation of the data to make it usable — discovery, extraction, cleansing and integration — is conducted when the data is extracted.
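
A minimal sketch of this extraction-time work, assuming a hypothetical newline-delimited JSON file in the lake and hypothetical field names, is shown below; the raw records stay untouched, and cleansing happens only when data is pulled for use.

```python
# A minimal sketch of cleansing at extraction time: raw records in the lake are
# never modified; structuring happens on the way out. Path and fields are hypothetical.
import json

def extract_clean_records(path):
    """Read raw newline-delimited JSON and clean each record on extraction."""
    with open(path) as f:
        for line in f:
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue                      # skip malformed raw entries
            # Normalize types and fill defaults only for the extracted copy
            yield {
                "device_id": str(record.get("device_id", "unknown")),
                "reading": float(record.get("reading", 0.0)),
            }

for row in extract_clean_records("lake/raw/sensor_events.jsonl"):
    print(row)
```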

Data fabric is a methodology for integrating all of an organization’s data — from across all storage and use environments — and applying a common set of protocols, procedures, organization and security.

What is the difference between data fabric and data mesh?

Data fabric and data mesh are two connected concepts that are not always clearly delineated. In general, the two are similar in that both are methodologies for determining how organizations handle large amounts of stored data. A data fabric methodology attempts to govern data by building a management layer on top of it, wherever it is stored. The data mesh approach differs in that responsibility for managing certain types of data sits with the teams or groups in the organization that use that data.

While both approaches provide an architecture to access data across numerous technologies and platforms, a data fabric is a technology-centric architectural approach addressing the complexity of data and metadata, while a data mesh focuses on organizational change, placing more emphasis on people and process than architecture.

The Bottom Line: Consider a data fabric to meet your growing data needs

While many organizations are undergoing their own digital transformations, much of IT infrastructure work is still about playing catch-up with traditional data. It's not surprising, then, that data architecture, one of the fastest growing areas of the business, often requires a revamp. When it comes to data storage, access and usage, keeping up manually is nearly impossible. An increasingly popular approach for addressing that scale is automation, which data fabrics provide in spades for essential functions such as governance, data protection and integration. Now that organizations are embracing technologies such as data lakes and data warehouses, it makes sense to adopt the fabric approach before your organization's data grows any further.
