The New Frontier for Data Management: Edge-Core-Cloud


By Eric Bassier, Quantum Corp.


With the proliferation of IoT devices, sensors and robots across industries such as autonomous vehicles, manufacturing, surveillance, smart cities and healthcare, edge computing is creating vast amounts of unstructured data that will, at some point in its lifecycle, need to be centralized at the core or moved to the cloud. And because HPC and machine learning workloads consume more data than most others, edge-core-cloud data management is even more critical for them.

This presents organizations with complex data management challenges that must be overcome if they want to use their data as efficiently as possible. However, thanks to technological advancements, organizations can follow several best practices to strike the right balance for storing and managing the data they create.

Extracting More Value from Data

As the volume of data collected at the edge has grown, organizations’ storage requirements and priorities have shifted. For example, because edge sites are now so active, ensuring that the infrastructure runs in a secure environment, guarded against ransomware and other cyberattacks, has taken on newfound importance. Along with the ability to move data to the cloud or core data centers for additional processing and long-term storage, organizations also need to take a smarter approach to data storage to ensure cost efficiency.

These challenges are compounded by the growing need to apply AI and machine learning capabilities to the raw data as it’s collected. The need to analyze and learn from this data is accelerating the push to centralize unstructured edge data to wherever this compute power resides. This is particularly true in use cases that generate massive amounts of machine and sensor data from a variety of sources.

Once data has been captured and initial analysis performed – either locally or in the cloud – consolidating data in a core data center means organizations can extract more value from their data and ensure that key information is shared across offices and teams. For example, within an organization, the APAC office might be working primarily with video content, the U.S. office with telemetry data and the European office with another type of data. Instead of tying these remote edge offices together across a complex and expensive infrastructure, the organization can centralize the data from all three offices in one large data lake that’s accessible to all of them. The data can also be replicated to the cloud or to a second or even third data center so that it’s highly protected in case of an outage.
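As a deliberately simplified illustration of this pattern, the Python sketch below consolidates data from three regional edge sites into a central data lake and then mirrors the lake to a secondary location. The paths, site names and use of plain file copies are assumptions for the sake of the example; a real deployment would typically rely on object storage replication or dedicated data-movement software.

```python
# Hypothetical sketch: consolidate regional edge data into a central data lake,
# then mirror the lake to a secondary site for outage protection.
# All paths and the directory layout are illustrative assumptions.
import shutil
from pathlib import Path

EDGE_SITES = {
    "apac": Path("/mnt/edge/apac/video"),     # video content
    "us": Path("/mnt/edge/us/telemetry"),     # telemetry data
    "emea": Path("/mnt/edge/emea/other"),     # other data types
}
DATA_LAKE = Path("/mnt/core/data-lake")       # central lake shared by all offices
REPLICA = Path("/mnt/dr-site/data-lake")      # second copy for outage protection

def consolidate_and_replicate() -> None:
    """Copy each edge site's data into the central lake, then mirror the lake."""
    for site, source in EDGE_SITES.items():
        if source.exists():
            shutil.copytree(source, DATA_LAKE / site, dirs_exist_ok=True)
    shutil.copytree(DATA_LAKE, REPLICA, dirs_exist_ok=True)

if __name__ == "__main__":
    consolidate_and_replicate()
```

So, how can organizations go about putting this approach into action?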

Centralizing Data Storage

The first consideration for organizations is whether data will be kept on-premises in the core or moved to the cloud. This will depend on business requirements. If data goes directly to the cloud, the organization gets the scalability and cost-efficiency benefits that come with it.

However, there will also be constraints in terms of performance and SLAs. As such, we’re seeing many organizations adopt a hybrid approach that delivers the best of both worlds. Organizations can capitalize on the flexibility of the cloud while also enjoying the increased security and performance of on-premises for certain types of data.

In this new paradigm, data management becomes critical: software needs to understand what type of data is being worked on and where it needs to go. These tools classify data and use policy-based automation to proactively move data to where it’s needed at that moment – for example, to a performance tier, or to the cloud to leverage elastic compute resources. The same policy-based automation can also protect and secure data based on compliance needs, or delete data at the end of its defined useful life. By automatically placing data where it will be most effective, these tools help organizations improve process completion times, increase the number of projects per resource, reduce wasted effort and improve cost efficiency.
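To make this more concrete, here is a minimal Python sketch of what such policy-based placement might look like. The tier paths, file types, thresholds and retention period are illustrative assumptions rather than any particular product’s policy engine, and a production tool would work from a metadata catalog rather than scanning the file system directly.

```python
# Hypothetical sketch of policy-based data placement: classify each file by age
# and type, keep hot data on a performance tier, move cold data to cheaper
# storage, and delete data past its defined useful life.
# Tier paths, file types, thresholds and retention are illustrative assumptions.
import shutil
import time
from pathlib import Path

PERFORMANCE_TIER = Path("/mnt/nvme-tier")     # fast storage for active projects
ARCHIVE_TIER = Path("/mnt/cloud-archive")     # low-cost storage for cold data
ACTIVE_WINDOW_DAYS = 30                       # data newer than this stays hot
RETENTION_DAYS = 7 * 365                      # defined useful life of the data

def classify(path: Path) -> str:
    """Return the target tier for a file: 'performance', 'archive' or 'expired'."""
    age_days = (time.time() - path.stat().st_mtime) / 86400
    if age_days > RETENTION_DAYS:
        return "expired"
    if age_days <= ACTIVE_WINDOW_DAYS and path.suffix in {".mp4", ".csv", ".parquet"}:
        return "performance"
    return "archive"

def apply_policy(source_dir: Path) -> None:
    """Scan a directory tree and place (or delete) each file per the policy."""
    for item in source_dir.rglob("*"):
        if not item.is_file():
            continue
        tier = classify(item)
        if tier == "expired":
            item.unlink()                                    # past retention: delete
        elif tier == "performance":
            shutil.move(str(item), str(PERFORMANCE_TIER / item.name))
        else:
            shutil.move(str(item), str(ARCHIVE_TIER / item.name))
```

Running a policy like this on a schedule, rather than on demand, is what lets data land where it will be most effective without manual intervention.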

Finally, organizations must take the time to understand how different processes and storage methods affect SLAs. Retrieving data from the cheapest storage option can take the longest, so there must be a balance between cost efficiency and the ability to access the right data when it’s needed.
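One simple way to reason about that balance is to pick the cheapest tier whose typical retrieval time still meets the SLA, as in the sketch below; the tier names, prices and latencies are illustrative numbers only.

```python
# Hypothetical sketch: choose the cheapest storage tier whose typical retrieval
# time still satisfies the SLA. Tier names, prices and latencies are illustrative.
TIERS = [
    # (name, $ per GB per month, typical retrieval time in seconds)
    ("nvme", 0.20, 0.001),
    ("object-standard", 0.023, 0.1),
    ("object-archive", 0.004, 5 * 3600),   # hours-long restore window
]

def cheapest_tier_meeting_sla(max_retrieval_seconds: float) -> str:
    """Return the lowest-cost tier that can return data within the SLA."""
    eligible = [t for t in TIERS if t[2] <= max_retrieval_seconds]
    if not eligible:
        raise ValueError("No tier satisfies this retrieval SLA")
    return min(eligible, key=lambda t: t[1])[0]

print(cheapest_tier_meeting_sla(60))       # -> object-standard
print(cheapest_tier_meeting_sla(86400))    # -> object-archive
```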

Ultimately, it’s clear that the rise of edge computing is forcing organizations to reconsider how and where they process, store and manage all the unstructured data being generated. Although there are challenges, centralizing data can enable organizations to get the most out of their data in today’s world of ever-increasing data volumes.

Eric Bassier is senior director, products, at Quantum Corp.