From Petabytes to Zettabytes — Get Serious about Ceph Storage Scalability at Cephalocon

Cephalocon is taking place next week in Barcelona, and we have several exciting technology developments to share, spanning NVMe™ SSD and capacity-optimized HDD storage devices, along with community-driven, open source software approaches that improve Ceph Storage Cluster efficiency, performance, and cost.

These latest storage developments are particularly relevant for Ceph administrators managing anywhere from a few petabytes to hundreds of petabytes of storage.

Time to Get Serious about Ceph Storage Scalability

According to last year’s Ceph User Survey, 25% of surveyed Ceph users reported between 1 petabyte and 100 petabytes of raw storage capacity. As more Ceph users hit scaling pains on the way from multiple petabytes toward a zettabyte, now is the time to take steps toward better-managed Ceph Storage Cluster scalability than has been achievable to date.

Managing Massive Data with Ceph

Which types of storage devices do Ceph users deploy today? According to the Ceph User Survey:

  • 89% use HDDs
  • 66% use SAS/SATA SSDs
  • Only 32% use NVMe SSDs

A lot of great work has already gone into flash SSD storage efficiency and I/O performance in BlueStore. But for the 66% using SAS and SATA SSDs, it’s time to talk with Western Digital about the long-term viability of those two interfaces for the long-term scalability of Ceph storage environments.

NVMe SSDs are the path forward to greater flash storage performance and capacity (read our technical guide here), while capacity-optimized HDDs continue to increase their double-digit-terabyte capacity points. We need your collaboration on a community-driven, open source effort around NVMe SSD technology to build on the success of BlueStore.

Movin’ On Up (the Stack)

What if we could move some HDD and SSD read and write operations “up the stack,” onto the host side, into the Linux® kernel and application stack levels for Ceph block, file, and object storage functions?

There’s an open source, community-driven effort to do exactly that. Here’s some of what becomes possible when you pursue these “up the stack” initiatives:

  • Enable I/O Determinism: Applications know which reads and writes are coming, so let applications, rather than the device, organize reads and writes to and from storage.
  • Lower device cost and less overprovisioning: If applications organize reads and writes, Western Digital can reduce the amount of expensive DRAM per SSD and reduce NAND overprovisioning.
  • Host-Managed SMR: “Up the stack” enables the deployment of capacity-optimized SMR HDDs to take advantage of improved HDD areal density over time (see how Dropbox did it, and the host-side sketch after this list).
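As a small illustration of what host-side awareness looks like, the sketch below reads the standard Linux sysfs attributes under /sys/block/<dev>/queue/ to report whether each block device is host-managed (SMR), host-aware, or conventional, along with its zone count and zone size where the kernel exposes them. This is an illustrative example only, not part of any Ceph or Western Digital tooling, and attribute availability depends on your kernel version.

```python
#!/usr/bin/env python3
"""Minimal sketch: report zoned (host-managed SMR) block devices via Linux sysfs."""
from pathlib import Path


def read_sysfs(dev: str, attr: str) -> str:
    """Return a block-device queue attribute from sysfs, or '' if absent."""
    path = Path("/sys/block") / dev / "queue" / attr
    try:
        return path.read_text().strip()
    except OSError:
        return ""


def zoned_model(dev: str) -> str:
    """'host-managed', 'host-aware', or 'none' for conventional devices."""
    return read_sysfs(dev, "zoned") or "none"


if __name__ == "__main__":
    # Walk every block device the kernel exposes and print its zone model.
    for dev_path in sorted(Path("/sys/block").iterdir()):
        dev = dev_path.name
        model = zoned_model(dev)
        nr_zones = read_sysfs(dev, "nr_zones")           # zone count (newer kernels)
        zone_sectors = read_sysfs(dev, "chunk_sectors")  # zone size in 512-byte sectors
        print(f"{dev}: zoned={model} "
              f"nr_zones={nr_zones or 'n/a'} zone_sectors={zone_sectors or 'n/a'}")
```

A host-managed device reported here must be written sequentially within each zone, which is exactly the kind of knowledge an application or file system “up the stack” can exploit.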

Here’s a 17-minute overview video on moving SSD FTL functions “up the stack”:

Headed to Cephalocon 2019? Top 3 Discussion Topics You Should Have with Us:

  1. NVMe SSD storage options for Ceph block performance
  2. Capacity-optimized Hard Disk Drives for Ceph
  3. Community-driven and open source software approaches to make better use of NVMe SSDs and HDDs under a single, unified management framework for both types of storage devices

For further reading, a 2-page solution brief on Western Digital Ultrastar NVMe SSDs and HDDs for Ceph is available here.

We’ll see you at Cephalocon!
