What the Industry is Doing About its Zetta-Scale Problem

November 17, 2020 6 min read Technology

By Ted Marena

It’s no headline news that the amount of data generated is accelerating at a breakneck pace. But for those of us on the data infrastructure end of things, it’s what keeps us up at night. Any way you look at it – storage, memory, processing, or networking – we as an industry have to push the boundaries of existing technologies, or the world won’t be able to keep up with data.

So, what are we doing about it? A lot. And our new series of meetups is where we bring together experts from across the industry to talk about what’s on the cusp of next-gen data architectures.

Get Used to Talking in Zettabytes

Some stats to feed your data hunger: in 2025, the US is expected to generate 30.6ZB of data while China will generate 48.6ZB of data.¹ If that’s a little hard to grasp, let me help you.

Numerically speaking, a zettabyte is 1,000 exabytes, or 1,000,000 petabytes, or 1,000,000,000 (one billion) terabytes, or 1,000,000,000,000 (one trillion) gigabytes. If you think that’s a far-off future, you’re wrong. Two years ago, the industry already shipped almost a zettabyte of new storage devices. We crossed that chasm before most even took notice.² And we’ve been busy building the foundation to manage where our Herculean sea of data is headed.
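To make the scale concrete, here is a minimal Python sketch of the unit arithmetic above. It assumes decimal (SI) units, where each step is a factor of 1,000; the names UNITS and convert are illustrative only and do not come from any library.

```python
# Decimal (SI) storage units; each one is 1,000 times the previous one.
UNITS = ["GB", "TB", "PB", "EB", "ZB"]


def convert(value, from_unit, to_unit):
    """Convert a quantity between decimal storage units."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    if steps >= 0:
        return value * 1000 ** steps
    return value / 1000 ** (-steps)


# One zettabyte expressed in the smaller units listed in the text.
for unit in ["EB", "PB", "TB", "GB"]:
    print(f"1 ZB = {convert(1, 'ZB', unit):,} {unit}")

# The 2025 US forecast quoted earlier (30.6 ZB), expressed in terabytes.
us_2025_tb = convert(30.6, "ZB", "TB")
print(f"30.6 ZB is roughly {us_2025_tb:,.0f} TB")
```

Going from a larger unit to a smaller one multiplies by a power of 1,000, and going the other way divides by it, which is all the conversion in the paragraph above amounts to.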

What Did You Miss?

Back in July, the Bay Area Storage Solutions Meetup group held its inaugural event. The first topics were how a graphics processing company views fabrics and NVMe-oF™, how Western Digital is rethinking storage efficiencies through Zoned Storage, and how one startup is challenging the control of main memory with OmniXtend™, a cache coherent memory fabric.

Ahead of the next virtual Meetup on Dec. 2, 2020, let me get you up to speed on the discussions and the technologies presented in July.

NVMe™-over-Fabrics

The first presentation explained why NVMe-over-Fabrics (NVMe-oF) is needed given the faster speeds of storage that exist today. In some architectures, the network has now become the bottleneck for performance. NVMe-oF is an open standard that defines how to share storage across multiple servers/CPUs. Improving the storage throughput enables increased performance for applications such as machine learning and AI.

The presentation walked through the company’s RDMA (Remote Direct Memory Access) implementations, specifically RDMA support over Ethernet, known as RoCE (RDMA over Converged Ethernet). Application examples and network performance benchmarks were shown for various implementations. Supporting RDMA requires both hardware at the network interface and RDMA-aware system software on the host.

Zoned Storage

The next presentation was on Zoned Storage by Dave Landsman from Western Digital. Dave is on the board of the NVM Express group, which sets the open standards for storage and data architectures. He explained that both HDDs and SSDs consist of numerous regions, blocks, or zones, and that each of these can physically only be written sequentially. For most systems, this restriction has not been apparent because the drive controller handles the data management. This conventional implementation works but does not scale to higher densities.

For HDDs, zoned storage is implemented with SMR (Shingled Magnetic Recording). For SSDs, the zoned storage implementation is called ZNS (Zoned Namespaces). This standard requires the host software to cooperate in organizing the data to be stored on a ZNS SSD. There are numerous zoned block software options that can be implemented. The software details can be found at www.zonedstorage.io.
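The sequential-write rule the host has to respect can be sketched as a toy model in Python. This is an illustration under stated assumptions, not the actual SMR or ZNS command set: the Zone class, its fields, and the block counts are all hypothetical, and a real device tracks more zone states than this.

```python
class Zone:
    """Toy model of one zone: a write must start exactly at the write pointer."""

    def __init__(self, start_lba, size_blocks):
        self.start_lba = start_lba
        self.size_blocks = size_blocks
        self.write_pointer = start_lba  # next block that may be written

    def write(self, lba, num_blocks):
        """Accept a write only if it is sequential and fits inside the zone."""
        if lba != self.write_pointer:
            raise ValueError("non-sequential write rejected")
        if lba + num_blocks > self.start_lba + self.size_blocks:
            raise ValueError("write crosses the end of the zone")
        self.write_pointer += num_blocks

    def reset(self):
        """Discard the zone's contents and rewind the write pointer."""
        self.write_pointer = self.start_lba


zone = Zone(start_lba=0, size_blocks=8)
zone.write(0, 4)        # accepted: starts at the write pointer
zone.write(4, 2)        # accepted: continues sequentially
try:
    zone.write(1, 1)    # rejected: would overwrite in place
    rejected = False
except ValueError:
    rejected = True
zone.reset()            # the whole zone must be reset before it is rewritten
```

The point of the model is that the host, not the drive controller, decides where data lands: it appends at the write pointer and reclaims space a whole zone at a time.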

The advantage of implementing zoned storage is that data is intelligently placed on the drives, which leaves the drive controller with minimal data management tasks to perform. The result is that zoned storage enables higher densities, better QoS, and lower TCO. Zoned storage SMR HDDs and ZNS SSDs can address the explosive data growth and support zettabyte scale for data centers and cloud providers. Western Digital recently announced its first ZNS SSD, the Ultrastar® DC ZN540 ZNS NVMe SSD. Learn more at www.westerndigital.com/zoned-storage.

OmniXtend

The last presentation was about OmniXtend. This architecture breaks the CPU’s stranglehold on main memory. OmniXtend is an open, cache coherent memory fabric based on low-cost Ethernet. Although there are many memory interface architectures, none of them is at once open, based on Ethernet, and coherency-preserving.

OmniXtend allows all nodes on a network to share main memory equally. No longer does a CPU own main memory. OmniXtend serializes TileLink, the cache coherency bus, and sends it at layer 2 in Ethernet frames. This enables not just CPUs but also GPUs, FPGAs, ML accelerators, and other devices to access main memory and share it coherently.
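As a rough illustration of what carrying a memory message directly in a layer 2 Ethernet frame involves, here is a Python sketch. It is not the OmniXtend wire format: the message layout (opcode, address, length, data), the opcode value, and the function names are invented for this example, and the EtherType used is the IEEE local experimental value 0x88B5 as a placeholder.

```python
import struct

# Placeholder EtherType (IEEE local experimental), not the one OmniXtend uses.
ETHERTYPE_EXPERIMENTAL = 0x88B5
MIN_FRAME_LEN = 60  # minimum Ethernet frame size, excluding the 4-byte FCS


def encode_message(opcode, address, data):
    """Hypothetical message: 1-byte opcode, 8-byte address, 2-byte length, data."""
    return struct.pack("!BQH", opcode, address, len(data)) + data


def build_frame(dst_mac, src_mac, payload):
    """Prepend a 14-byte Ethernet header and pad to the minimum frame size."""
    header = struct.pack("!6s6sH", dst_mac, src_mac, ETHERTYPE_EXPERIMENTAL)
    return (header + payload).ljust(MIN_FRAME_LEN, b"\x00")


def parse_frame(frame):
    """Recover the header fields and the hypothetical message from a frame."""
    dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    opcode, address, length = struct.unpack("!BQH", frame[14:25])
    return dst_mac, src_mac, ethertype, opcode, address, frame[25:25 + length]


memory_node = bytes.fromhex("020000000001")   # locally administered MACs
accelerator = bytes.fromhex("020000000002")
frame = build_frame(memory_node, accelerator,
                    encode_message(opcode=1, address=0x1000, data=b"\xde\xad"))
fields = parse_frame(frame)
```

Because the message sits directly behind the Ethernet header, with no IP or TCP layer in between, any device that can send and receive Ethernet frames could in principle take part in such a fabric.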

The open source hardware group CHIPS Alliance is developing OmniXtend further. Currently there is an FPGA implementation of multiple quad RISC-V cores that share L2 cache and can also access the cache on other boards via Ethernet. Although OmniXtend is still early in its development as a standard, if it is adopted it would enable new data center architectures and better serve memory-intensive workloads.

Join Us on Dec. 2

ZNS SSDs are picking up. Get to know the ecosystem surrounding zoned storage and hear from some of the industry’s biggest movers and shakers. Save the date here.

¹ Data Age 2025, IDC, May 2020
² https://www.businesswire.com/news/home/20190307005812/en/TRENDFOCUS-Combined-SSD-HDD-Storage-Shipped-Jumps
