Solving Big (Fast Data) Problems: A Conversation with WekaIO

I recently sat down with David Hiatt, Global Director, Business Development and Strategic Alliances at WekaIO, to talk about data storage challenges and how new applications are impacting storage architectures. Our conversation followed a recent webinar that we participated in with them, “Fuel Innovation and Discovery with NVMe™ Parallel Storage Performance and the Efficiency of a Private Cloud.” Here are highlights from our discussion:

What surprised you about this year and how organizations are tackling their data storage challenges, now that we are halfway through 2019?

David: “As the year progresses, it continues to be difficult for IT to predict storage requirements to support mixed user workloads and business demands. Conventional wisdom, adding more traditional enterprise NAS, no longer delivers a competitive advantage. While easy to deploy, these systems were not designed for the unprecedented performance, scale and processing of data required by modern user workloads and new applications such as Machine Learning and AI. With every new technology deployment, it is important to decide what matters most to you (for example, performance or scalability) in order to mitigate any challenges you might face.”

I agree with David – IT faces demands from various stakeholders, and the stakes for building an infrastructure that transforms data into actionable insights are high. Competitive advantage is not going to be driven by adding on more of the same technology, but rather by mapping out a strategy that shifts the data center from a necessary repository to an indispensable business asset. To make that shift effectively, IT needs to move from infrastructure strategy to data strategy.

Shifting from Infrastructure Strategy to Data Strategy

What are some tips you can share on what organizations must consider as they build – or revise – their data strategy?

David: “The new wave of applications in AI/ML/DL, financial services, life sciences, high-velocity analytics, engineering, and HPC ingest and output massive amounts of data at multi-petabyte scale. A data strategy is critical in not only managing the data, but transforming the data from input to insight. As businesses consider their data strategy, they should think carefully about the following:

  • Identify overall business goals and outcomes
  • Outline workloads and applications that support the business goals and will drive to the desired outcomes
  • Identify how data is currently stored, whether it is spread across multiple data silos, and why; and
  • Consider what is important to each workload, especially for this new wave of applications: high performance to satisfy I/O-intensive workloads, scalability while maintaining performance, durability, and affordability

Once the data strategy is outlined and you better understand the workloads that drive business goals, you can start to identify the right solutions to move the strategy forward.”

Fueling Innovation with Effective File Management

One question we often get asked is how you can bring your current infrastructure up to speed to better manage the data explosion – especially given that data is the new currency and lifeline of your business. Our relationship with WekaIO is a great example of helping organizations deploy an advanced storage option that does not simply manage their files, but drives business success. Let’s face it, you might describe your relationship with files as “love/hate” – you love what they offer the business, but you hate to manage them. Yet when managed properly, they can fuel innovation and discovery.

What’s your advice on navigating file and data management?

David: “As we highlighted in our recent joint webinar on HPCwire, it’s important to understand how you can get the best of both worlds: storage optimized for high performance and storage optimized for capacity, durability and affordability. Although it offers blazing performance, keeping all data on an NVMe storage file system can be costly, so you need to combine it with a highly dense, cost-efficient solution. This requires a level of management and strategy to understand what data goes where, how you can automate data movement across your tiers to deliver optimal performance, capacity and cost, and how you can simplify this overall environment.

It was fun to dive into the topic of NVMe-native parallel file systems and private cloud storage, and share some of the use cases, specifically how they are transforming the data pipelines for autonomous vehicles and for genomic sequencing. For example, one use case shortened machine learning cycle times by cutting analysis time by more than 80x! Ingest, analysis, and processing times for genomics were all significantly reduced. In environments where arriving at answers is a business-critical operation, eliminating performance bottlenecks and reducing complexity and cost at scale is vital.”
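To make the “what data goes where” question a little more concrete, here is a minimal Python sketch of one possible tiering rule: keep recently accessed files on the fast NVMe tier while space lasts, and send everything else to the capacity tier. This is only an illustration and not how WekaIO or any particular product decides placement. The names `FileRecord` and `plan_tiers`, the 30-day threshold, the capacity figure and the file paths are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    path: str               # hypothetical file path
    size_gb: float          # file size in gigabytes
    days_since_access: int  # age of the last read or write, in days


def plan_tiers(files, hot_days=30, nvme_capacity_gb=1000.0):
    """Split files into (hot, cold) lists.

    Assumed policy: a file stays on the NVMe tier only if it was accessed
    within `hot_days` AND it still fits in the remaining NVMe capacity.
    """
    hot, cold = [], []
    used_gb = 0.0
    # Visit the most recently accessed files first so they claim flash first.
    for f in sorted(files, key=lambda f: f.days_since_access):
        fits = used_gb + f.size_gb <= nvme_capacity_gb
        if f.days_since_access <= hot_days and fits:
            hot.append(f)
            used_gb += f.size_gb
        else:
            cold.append(f)
    return hot, cold


# Example with made-up data
files = [
    FileRecord("/genomes/run42.bam", 400, 2),
    FileRecord("/genomes/run07.bam", 700, 5),
    FileRecord("/archive/2017.tar", 900, 400),
]
hot, cold = plan_tiers(files)
print([f.path for f in hot])   # files that stay on the NVMe tier
print([f.path for f in cold])  # files that move to the capacity tier
```

A real deployment would weigh more than access age and size (for example, workload priority or cost targets), but even a simple rule like this shows why automating data movement matters once you have more than one tier.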

You’re Not Alone!

It was great to connect with David Hiatt from WekaIO and hear more from him on managing a data strategy and deploying a massively scalable file solution that can support that strategy. With data growing exponentially, building a strategy and identifying the right solution may feel daunting, as though everything is in constant flux. Have no fear: we have a few resources to help you as you continue to identify new and possibly more effective ways to manage your data strategy and deploy the right file management solutions.

For more information:
