Analysts such as Gartner claim that data fabrics are the future of data management. But in reality, the future is already here. We see many signs of market maturity, ranging from total addressable market projections to vendors touting ROI. The data fabric’s unique ability to integrate enterprise data and reduce repetitive tasks in data discovery, analysis and implementation is why many believe this will be the breakout year for the modern data integration approach.
Gartner defines a data fabric as a design concept that serves as an integrated layer, or fabric, of data and connection processes. A data fabric enables data that is distributed across multiple locations and used by different applications to be accessed and analyzed in real time within a unifying data layer, under the same management and security. It does this by leveraging both human and machine capabilities.
The data fabric model continues to grow into an established technology largely because data is growing exponentially, data sources are becoming more distributed, and many enterprises have not yet figured out how to extract the useful data needed to drive their bottom line. Consequently, the businesses that use data fabrics will be the ones that achieve success.
Dissecting data fabrics – more than the sum of their parts
Some believe that a data fabric is just another term for a metadata management system. To be sure, enterprises must incorporate a metadata-driven design to dynamically support different data delivery styles and to ensure a successful data fabric. But that is just the beginning.
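To make that metadata-driven idea concrete, here is a minimal, hypothetical sketch in Python: a catalog entry describes each dataset, and the fabric layer consults that metadata to choose a delivery style. Every name, location and style value in it is an assumption for illustration, not a reference to any particular product.

```python
# Hypothetical sketch of metadata-driven delivery: a catalog entry describes each
# dataset, and the fabric layer reads that metadata to decide whether to virtualize
# (query in place) or replicate (query a local copy).
from dataclasses import dataclass
from typing import Dict

@dataclass
class DatasetMetadata:
    name: str
    location: str          # e.g. a JDBC URL, S3 path, or API endpoint
    delivery_style: str    # "virtualize" or "replicate"

CATALOG: Dict[str, DatasetMetadata] = {
    "customers": DatasetMetadata("customers", "postgres://crm/customers", "virtualize"),
    "clickstream": DatasetMetadata("clickstream", "s3://logs/clicks/", "replicate"),
}

def query_in_place(meta: DatasetMetadata, query: str) -> str:
    # Placeholder: push the query down to the source system.
    return f"federated query against {meta.location}: {query}"

def query_local_copy(meta: DatasetMetadata, query: str) -> str:
    # Placeholder: run the query against a locally materialized replica.
    return f"local query over replica of {meta.location}: {query}"

def run(dataset: str, query: str) -> str:
    meta = CATALOG[dataset]
    handler = query_in_place if meta.delivery_style == "virtualize" else query_local_copy
    return handler(meta, query)

print(run("customers", "SELECT * FROM customers WHERE region = 'EMEA'"))
```

The point of the sketch is only that the routing decision lives in metadata, so changing a dataset's delivery style is a catalog update rather than a code change.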
Despite the successful use of data virtualization in data fabrics, it is wrong to define a data fabric as a system that merely virtualizes and hides other data sources. Yes, data virtualization creates a data abstraction layer to integrate all data without physically moving it. But data fabrics don’t stop there. Others view a data fabric as a way to access all the file-level data from any machine in their data center. That is true, but, again, it is only one piece of a real data fabric.
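As a rough illustration of that virtualization layer (a toy sketch, not any vendor's implementation), the following Python joins two independent "sources", an in-memory SQL table and a mock service response, at query time without copying either into a shared store.

```python
import sqlite3

# Toy illustration of a virtual view: orders live in a SQL table, customer names
# come from a mock service, and the join happens lazily at query time.
orders_db = sqlite3.connect(":memory:")
orders_db.execute("CREATE TABLE orders (order_id INTEGER, customer_id TEXT, amount REAL)")
orders_db.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                      [(1, "c1", 120.0), (2, "c2", 75.5), (3, "c1", 19.9)])

# Pretend this came from a REST API or a second database.
customer_service = {"c1": {"name": "Acme Corp"}, "c2": {"name": "Globex"}}

def virtual_customer_orders():
    """Yield joined rows on demand; nothing is materialized ahead of time."""
    for order_id, customer_id, amount in orders_db.execute(
            "SELECT order_id, customer_id, amount FROM orders"):
        yield {"order_id": order_id,
               "customer": customer_service[customer_id]["name"],
               "amount": amount}

for row in virtual_customer_orders():
    print(row)
```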
Using both human and machine capabilities, a data fabric includes all of the above components and provides an orchestrated approach for collecting, unifying and governing data sources across the enterprise data management system. In fact, many early adopters built a data fabric to solve a narrower problem or support a specific use case, only to discover other ways its capabilities could be used.
The convergence of triggering factors
During the Covid-19 pandemic, many industries adopted digital transformation to survive. These changes increased the demand for accessible data, leading to greater adoption of the data fabric concept. But the shift toward data fabric adoption was already well underway. The three Vs of data (volume, variety and velocity) remain a persistent problem, compounding other data issues that data fabrics are well suited to address.
Take security management and fraud detection/prevention, for example. A data fabric can automatically detect data anomalies and take appropriate action to correct them, reducing losses and improving regulatory compliance. A data fabric enables organizations to define governance norms and controls, improve risk management and strengthen monitoring, capabilities that are growing in importance as legal standards for data management and risk management become more demanding and compliance becomes essential. It can also deliver cost savings by avoiding potential regulatory fines.
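A hedged sketch of the kind of automated check a fabric's monitoring layer might run is shown below; the z-score rule, field names and threshold are illustrative assumptions rather than a prescribed detection method.

```python
import statistics

# Flag transactions whose amount deviates sharply from the norm (simple z-score),
# so downstream governance controls can quarantine or review them.
def flag_anomalies(transactions, threshold=3.0):
    amounts = [t["amount"] for t in transactions]
    mean = statistics.mean(amounts)
    stdev = statistics.pstdev(amounts) or 1.0  # guard against zero variance
    return [t for t in transactions if abs(t["amount"] - mean) / stdev > threshold]

history = [{"id": i, "amount": 100 + (i % 7)} for i in range(200)]
history.append({"id": 999, "amount": 9_500})  # the outlier we expect to catch

for suspect in flag_anomalies(history):
    print("review:", suspect)
```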
A data fabric represents a fundamentally different way of connecting data. Those who have adopted one now understand that they can do many things differently, providing an excellent route for businesses to rethink a host of issues. Because data fabrics span the entire spectrum of data work, they address the needs of all constituents: developers, business analysts, data scientists and IT team members collectively. As a result, proofs of concept will continue to grow across departments and divisions.
As the need for data sharing across big data, small data, analytics, business agility and AI/ML continues, enterprises are realizing the utility of having multi-API access to the same underlying data.
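The sketch below illustrates that idea in miniature, assuming nothing about any specific product: one shared record store is exposed through two access styles, a key-based lookup such as a REST handler might use and a filter-based query such as an analytics layer might use.

```python
# One shared record store, two APIs over the same data (illustrative only).
RECORDS = {
    "p-100": {"sku": "p-100", "category": "sensor", "stock": 42},
    "p-101": {"sku": "p-101", "category": "sensor", "stock": 0},
    "p-200": {"sku": "p-200", "category": "gateway", "stock": 7},
}

def get_record(sku: str) -> dict:
    """Key-based access, as a REST endpoint handler might use."""
    return RECORDS[sku]

def query_records(**filters) -> list:
    """Predicate-based access, as an analytics or BI layer might use."""
    return [r for r in RECORDS.values()
            if all(r.get(k) == v for k, v in filters.items())]

print(get_record("p-100"))
print(query_records(category="sensor", stock=0))
```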
According to Gartner, the data fabric is becoming increasingly popular because it is a single architecture that can address the diversity, distribution, scale and complexity of an organization’s data assets. The firm also states that the approach reduces integration design time by 30%, deployment time by 30% and maintenance by 70%, because data fabric designs leverage the ability to use, reuse and combine different data integration styles.
The same report commends the approach for driving automated data and metadata discovery, data quality and integration that improve overall data management. Automating repetitive tasks in most data quality, master data management and integration solutions is known to reduce the overall cost of these solutions by 35-65%, depending on the approach in place.
It also allows organizations to take advantage of application resiliency, keeping applications functioning despite the failure of a system component, a difficult task that becomes harder when applications are distributed. Resiliency is becoming increasingly important as organizations continue to rapidly deploy software across multiple tiers and technology infrastructures. Yet achieving resilience requires planning at all levels of the architecture and constant revisiting.
The desire to standardize APIs, increase consistency of access, and create easy ways to import and consume all types of data within an organization is becoming paramount. A well-crafted data fabric addresses these goals and makes applications resilient to changes and errors in data sources.
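One common way to get that kind of resilience, sketched here purely as an assumption about how an application might consume fabric data rather than as a prescribed design, is to retry a flaky source with backoff and fall back to the last good response when it stays unavailable.

```python
import time

# Retry a flaky source with exponential backoff, then degrade gracefully to the
# last good response so the application keeps working when a component fails.
_last_good = {}

def fetch_with_fallback(source_name, fetch_fn, retries=3, backoff_s=0.5):
    for attempt in range(retries):
        try:
            result = fetch_fn()
            _last_good[source_name] = result      # remember the last good response
            return result
        except Exception:
            time.sleep(backoff_s * (2 ** attempt))  # exponential backoff
    if source_name in _last_good:
        return _last_good[source_name]            # serve cached data instead of failing
    raise RuntimeError(f"{source_name} unavailable and no cached copy exists")

# Example: a source that fails on every other call.
calls = {"n": 0}
def flaky_source():
    calls["n"] += 1
    if calls["n"] % 2 == 1:
        raise ConnectionError("source temporarily unreachable")
    return {"rows": 128}

print(fetch_with_fallback("inventory", flaky_source))
```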
The rise of irrefutable benchmark proof
Organizations are also looking for ways to leverage very large public datasets, such as Wikidata, the structured counterpart of Wikipedia and other Wikimedia projects. The largest open RDF dataset, Wikidata contains 17 billion triples and about 100 million entities, which may be why enterprises are increasingly interested in combining these public data sources with their own internal data. Publicly available data also gives organizations an easy way to compare the standardized work of various data fabric enablers, provided vendors and integrators benchmark how quickly they create databases and how well queries perform at massive scale. As benchmarks become more publicly available, they will further demonstrate that the technology underlying and supporting data fabrics can deliver exceptional results.
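For readers who want to try this, the snippet below runs a small query against the public Wikidata SPARQL endpoint using Python's requests library; the query itself (a handful of people with the occupation "computer scientist") is only an example of mixing an open dataset into your own analysis, not a benchmark workload.

```python
import requests

# Query the public Wikidata SPARQL endpoint and print a few labeled results.
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"
QUERY = """
SELECT ?person ?personLabel WHERE {
  ?person wdt:P106 wd:Q82594 .   # occupation: computer scientist
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 5
"""

response = requests.get(
    SPARQL_ENDPOINT,
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "data-fabric-example/0.1"},
    timeout=30,
)
response.raise_for_status()
for row in response.json()["results"]["bindings"]:
    print(row["person"]["value"], "-", row["personLabel"]["value"])
```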
Enterprise knowledge graphs as an entry point
Because data fabrics describe an integrated set of data management technologies, they can be constructed in a variety of ways. However, capabilities such as semantic knowledge graphs, active metadata management and embedded machine learning (ML) are essential components of a successful data fabric design.
Enterprise knowledge graphs (EKGs) enable all three characteristics, so they are considered an ideal entry point for building data fabrics. In fact, many are adopting EKGs to build a single data layer rather than having to rip and replace their existing data warehouses and data lakes.
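A minimal sketch of that entry point, using the open-source rdflib library and entirely hypothetical identifiers, describes an internal record in RDF and links it to a public Wikidata entity rather than copying the public data in.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS, OWL

# Describe an internal CRM record as RDF and link it to a public identifier.
# The namespaces, supplier ID and Wikidata QID below are hypothetical examples.
EX = Namespace("http://example.com/crm/")
WD = Namespace("http://www.wikidata.org/entity/")

g = Graph()
g.bind("ex", EX)
g.bind("wd", WD)

supplier = EX["supplier/4711"]
g.add((supplier, RDF.type, EX.Supplier))
g.add((supplier, RDFS.label, Literal("Example Semiconductor GmbH")))
g.add((supplier, EX.annualSpend, Literal(1_250_000)))
# owl:sameAs ties the internal record to a public entity without moving any data.
g.add((supplier, OWL.sameAs, WD["Q55089"]))   # hypothetical Wikidata QID

print(g.serialize(format="turtle"))
```

The same graph can keep growing as more sources are mapped in, which is what makes a knowledge graph a natural single data layer over existing warehouses and lakes.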
In the above report, Gartner asserts that “data fabric is the foundation,” as the method improves existing infrastructure, gradually adds automation to overall data management, and combines traditional practices with emerging ones. In the same report, Gartner says that to succeed with data fabrics, organizations must ensure that they dynamically support the combination of different data delivery styles (through metadata-driven design) for specific use cases. They should operationalize the data fabric by implementing ongoing and evolving data engineering practices across the data management ecosystem, and build it on existing, well-understood and established integration technologies and standards while continuing to educate the team on new approaches and practices such as DataOps and data engineering, including in edge environments.
The data fabric has been a developing trend for the past few years. But the future is now, and there has never been a better time to start.
About the Author: Navin Sharma is Vice President of Product at Stardog, a leading enterprise knowledge graph (EKG) platform provider. Navin is a self-described intrapreneur and a seasoned product management executive who thrives at the intersection of technology innovation and business challenges to create value for both the employer and the customer.
Related items:
Data Mesh vs. Data Fabric: Understanding the Differences
Data fabrics are emerging to soothe cloud data management nightmares
Why every business needs a data fabric and data operations to solve their data management problems