In the realm of data and analytics, the adage "time is money" holds truer than ever. According to research conducted by IDC, a staggering 75 percent of business decision-makers believe that data loses its value within a short period of time. This places organizations in a race against the clock to extract actionable insights before their data becomes obsolete. While challenges such as data quality and the effectiveness of algorithms contribute to the complexity of this issue, one often underestimated factor is data storage. Far from being just a repository for data, modern storage infrastructure plays a pivotal role in the analytics ecosystem.
Making informed decisions about data storage is crucial for expediting the delivery of insights and supporting sound decision-making. In a recent discussion with Shawn Rosemarin, VP of R&D - Customer Engineering at Pure Storage, we delved into the challenges businesses face in this domain and their implications for two of the hottest technology trends: artificial intelligence (AI) and analytics.
Today, organizations store data for entirely different reasons than they did just a few decades ago. During the early stages of the digital revolution, data storage primarily served compliance, governance, or performance tracking purposes. As Rosemarin aptly puts it, "It's only in the last couple of decades that we started to say … what could we glean from our historical data … what could we glean from what happened in the past to help us try to predict what might happen in the future?”
This shift in perspective has led to significant advancements in leveraging data for informed decision-making. It has paralleled the explosion in data volume generated by organizations and the evolution of technologies for data capture and analysis. However, the question of how and where to store data has often been an afterthought. It might come as a surprise to many that a substantial portion of the world's data still resides on aging mechanical disks or even older tape storage systems. Data stored in this manner is difficult to access quickly and costly to maintain, both in financial terms and in energy consumption.
Rosemarin points out, "When we look at an all-flash data center, and we look at the benefits of flash versus disk, and we look at the current environment we’re in – the more I can free up energy … electricity consumption … human overheads and management – the more I can focus that energy savings, efficiency, and humans on what I’m trying to do – which is to solve AI and analytics challenges.” This is particularly critical in fields like drug discovery, where time-to-insight can have a profound impact not just on business outcomes but also on human health, particularly in the fight against pandemics. For instance, one of Pure's clients, McArthur Lab, deals with processing millions of data points daily to combat antimicrobial resistance. Migrating their storage infrastructure to Pure Storage technology has resulted in a staggering 300x increase in the speed of certain analytical processes. This has enabled researchers to accelerate the identification of "superbugs" and the assessment of potential cures.
During our conversation, Rosemarin also shed light on Pure's collaboration with Chungbuk Technopark, a South Korean innovation center specializing in incubating deep-learning and machine-learning solutions with local companies. Recognizing the need for AI-optimized solutions to reduce the energy consumption of its data storage infrastructure, Chungbuk Technopark transitioned its operations to Pure Storage infrastructure, experiencing a two-fold increase in processing speed for AI workflows.
Data quality remains a major challenge for businesses transitioning to a data-driven culture, particularly where time-to-insight is concerned. Ensuring that data matches reality is a significant hurdle. Rosemarin illustrates this with the example of a doctor taking notes during a patient consultation, emphasizing that discrepancies between what occurred and what was recorded can lead to downstream problems.
The effectiveness of models hinges on the accuracy of data. Therefore, addressing gaps in data quality should be a top priority for organizations undergoing this transformation. Storage infrastructure often plays a role in data quality, impacting accessibility for validation and correction, governance to ensure adherence to regulations, and data silos caused by non-scalable storage solutions.
When asked for advice for enterprises in ensuring their storage infrastructure supports today's AI-driven analytics initiatives, Rosemarin emphasizes the importance of embracing simplicity and eliminating complexity. Simplification in data storage infrastructure can significantly accelerate "quick win" analytics initiatives, freeing up time and resources for analytics and AI projects.
In conclusion, the prevalence of flash storage is rapidly growing in various aspects of technology, from phones and computers to appliances and cars. However, the data center remains one area where spinning disks are still common due to the complexity involved in migrating to flash storage. Pure Storage sees this as a challenge they can help their customers overcome, not just for the sake of technological advancement but also for the planet's future. Energy efficiency and sustainability are pressing concerns, and adopting more efficient storage solutions, including all-flash technology, can contribute to reducing an organization's environmental footprint. In essence, considering storage infrastructure more seriously is a worthwhile endeavor, especially as it pertains to accelerating time-to-insight and addressing the challenges of the data-driven age.