Let’s talk for a minute about data silos. Real-world silos are, of course, those towers on farms used to store grain for future use or sale: tall structures that usually hold just one type of raw material. The silo works well as a metaphor for a large collection of raw data that’s stored separately from all the rest.
Servers and devices often silo data. Different machines store data but don’t necessarily share it all with other devices. Applications generate and store data, too, but only some of it might be shared, and only if a well-written API (application programming interface) is in place. Over time, organizations find themselves with a lot of data, most of it isolated in separate metaphorical silos, never to be part of a larger whole.
How edge computing creates the perfect storm for data silos
When it comes to enterprise networking, especially edge-to-cloud, data silos occur naturally. Every device at the edge produces data, but much of that data may remain at the device, or at the very least, the cluster of devices at that edge location. The same is true of cloud operations. Data is created and stored at many different cloud providers and, while they sometimes exchange data, most of it lives isolated from the rest of the enterprise.
But insights and actionable strategies come when all data across the enterprise is accessible to appropriate users and systems. Let’s look at one example that might occur at the fictional home goods retailer, Home-by-Home, we discussed previously.
Home-by-Home sells a wall-mounted lighting fixture that uses plastic brackets to affix it to the wall. Usually, it’s a great seller. But every year in March and April, the company gets a flood of returns because the brackets crack. The returns come from all over the country, from Miami to Seattle. That’s our first data set, and it’s known only to the stores themselves.
The brackets are built by a partner company in its factory. Normally, the factory operates at temperatures above 62 degrees Fahrenheit, but in January and February, its ambient temperature drops to an average of 57 degrees. That’s our second data set: the temperature in the factory.
Neither data set is connected to the other. But as we explored in some depth a while back, some plastic production processes begin to fail below 59 degrees or so. Without a way to correlate a data set from the factory with return statistics from the stores, the company would have no way of knowing that a slightly cooler factory was producing substandard brackets, which then failed all over the country.
But by capturing all the data and making data sets available for analysis (and AI-based correlation and big data processing), insights become possible. In this case, because Home-by-Home made digital transformation part of its DNA, the company was able to make the connection between factory temperature and returns, and now customers who purchase those lighting fixtures experience far fewer failures.
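To make that concrete, here is a minimal sketch of the kind of correlation Home-by-Home’s analysts could run once both data sets live in one accessible place. All numbers are made up for illustration, and the two-month lead time between molding a bracket and selling the fixture is an assumption of this sketch:

```python
# Hypothetical monthly data for the fictional Home-by-Home example.
# Assumption: brackets molded in month M ship and sell about two months
# later, so Jan/Feb factory temperatures line up with Mar/Apr returns.
factory_temp_f = {            # average ambient temp on the factory floor
    "Jan": 57, "Feb": 57, "Mar": 62, "Apr": 64, "May": 66, "Jun": 68,
}
returns_per_10k = {           # bracket-failure returns per 10k units sold
    "Mar": 41, "Apr": 38, "May": 9, "Jun": 7, "Jul": 6, "Aug": 5,
}

LEAD_TIME = 2  # months between molding and retail sale (assumption)
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug"]

# Pair each production month's temperature with returns LEAD_TIME months later.
pairs = [
    (factory_temp_f[m], returns_per_10k[months[i + LEAD_TIME]])
    for i, m in enumerate(months)
    if m in factory_temp_f and months[i + LEAD_TIME] in returns_per_10k
]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient, no external libraries."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

temps, rets = zip(*pairs)
r = pearson(temps, rets)
print(f"correlation(temp, returns {LEAD_TIME} months later) = {r:.2f}")
```

With these toy numbers, the coefficient comes out strongly negative: cooler production months line up with more returns. The point isn’t the arithmetic, which is trivial; it’s that this computation is only possible when both data sets are accessible from one place.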
Your data is everywhere, but is it actionable?
This is just one example of the potential to harvest data from edge-to-cloud. There are a few key ideas here that are all interrelated.
Your data is everywhere: Nearly every computer, server, internet-of-things device, phone, factory system, branch office system, cash register, vehicle, software-as-a-service app, and network management system is constantly generating data. Some of it is purged as new data is generated. Some of it builds up until storage devices become clogged due to overuse. Some of it sits in cloud services for each login account you have.
Your data is isolated: Most of these systems don’t talk to each other. In fact, data management often takes the form of figuring out what data can be deleted to make room for more. While some systems have APIs for data exchange, most of those APIs go unused (and a few are overused). When complaining about certain local businesses, my dad loved the phrase, “The left hand doesn’t know what the right hand is doing.” When data is isolated, an organization is just like that.
Insights come when correlating multiple inputs: While it’s possible to subject a single dataset to comprehensive analysis and come up with insights, you’re far more likely to see trends when you can relate data from one source to data from other sources. We earlier showed how the temperature of a factory floor has a distant, but measurable, connection to the volume of returns in stores across the nation.
To do that, all that data needs to be accessible across your enterprise: But those correlations and observations are only possible when analysts (both human and AI) can gain access to many sources of data to learn what stories it all tells.
Making data usable and turning it into intelligence
The challenge then is making all that data usable, harvesting it, and then processing it into actionable intelligence. To do this, four things need to be considered.
The first is travel. Data must have a mechanism to move from all those edge devices, cloud services, servers, and other systems to somewhere it can be acted upon. The second is storage, where the traveling data is aggregated. Terms like “data lake” and “data warehouse” describe this concept of aggregation, even though the actual storage of the data may be quite scattered.
The third consideration, really a pair, is security and governance, and it applies to both the storing and the movement of data. Data in motion and data at rest both need to be protected from unauthorized access, while still being available to the analysts and tools that can mine the data for opportunities. Governance can be an issue as well, since data generated in one geographic location may raise regulatory or taxation questions if it’s moved to a new locale.
And finally, the fourth factor to consider is analysis. Data has to be stored in a way that’s accessible for analysis: updated often enough, cataloged properly, and curated with care.
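The four considerations above can be sketched with a toy example. This is a minimal illustration, not a real pipeline; every name in it (`DATA_LAKE`, `ingest`, the device IDs and regions) is hypothetical:

```python
import time

# Toy sketch of edge-to-central aggregation (all names are hypothetical).
# Each record is wrapped with enough metadata to touch all four factors:
# it travels to one place, is stored there, carries a governance tag, and
# is cataloged well enough to be found at analysis time.

DATA_LAKE = []  # stand-in for a central data lake or warehouse

def ingest(source, region, payload):
    """Tag an edge payload with catalog metadata and land it centrally."""
    record = {
        "source": source,            # which device or system produced it
        "region": region,            # governance: where the data originated
        "ingested_at": time.time(),  # freshness: "updated often enough"
        "payload": payload,
    }
    DATA_LAKE.append(record)
    return record

# Data from two previously siloed sources lands in one queryable place:
ingest("factory-7/thermostat", "us-east", {"temp_f": 57})
ingest("store-112/pos", "us-west", {"sku": "LF-203", "returned": True})

# Analysis side: one catalog query instead of two silos.
factory_readings = [r for r in DATA_LAKE if r["source"].startswith("factory")]
print(len(DATA_LAKE), "records;", len(factory_readings), "from a factory")
```

In a real deployment, the list becomes object storage or a warehouse and the function becomes an ingestion service, but the shape of the idea is the same: data moves once, is tagged at the door, and is queryable by anyone authorized to ask.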
A gentle introduction to data modernization
Humans are curious creatures. What we create in real life, we often reproduce in our digital worlds. Many of us have cluttered homes and workplaces because we’ve never found the perfect storage location for every object. The same, sadly, is often true of how we manage data.
As we discussed earlier, we’ve siloed so much of it. But even when we pull all that data into a central data lake, we don’t have the best ways to search, sort, and sift through it all. Data modernization is all about updating how we store and retrieve data to make use of modern advances like big data, machine learning, AI, and even in-memory databases.
The IT buzz-phrases of data modernization and digital transformation go hand-in-hand. That’s because a digital transformation can’t take place unless the methodologies of storing and retrieving data are a top (often the top) organizational IT priority. This is called a data-first strategy and it can reap substantial rewards for your business.
See, here’s the thing. If your data is tied up and trapped, you can’t use it effectively. If you and your team are always trying to find the data you need, or never seeing it in the first place, innovation will be squelched. But free up that data, and it unlocks new opportunities.
Not only that, poorly managed data can be a time sink for your professional IT staff. Instead of working to drive the organization forward through innovation, they’re spending time managing all these different systems, databases, and interfaces, and troubleshooting all the different ways they can break.
Modernizing your data not only means you can innovate; it also frees your team to think instead of react. That, in turn, provides time to deploy more applications and features that can open new horizons for your business.
Find the value and actionable insights hidden in your data
The process of data modernization and adopting a data-first strategy can be challenging, but technologies like cloud services and AI can help. Cloud services provide on-demand, scale-as-needed infrastructure that can grow as more and more data is harvested. AI provides tools that can sift through all that data and organize it coherently, so your specialists and line-of-business managers can take action.
But it’s still a big ask for most IT teams. Usually, IT doesn’t set out to silo all that data. It just happens organically as more and more systems are installed and more and more to-do items are put on peoples’ lists.
That’s where management and infrastructure services like HPE GreenLake and its competitors can help. GreenLake offers a pay-per-use model, so you don’t have to guesstimate capacity ahead of time. With cross-application and cross-service dashboards and a wide range of professional support, HPE GreenLake can help you turn your data-everywhere challenge into a data-first strategy.