In my last blog I walked through what NVMe is, its evolution, its value proposition, and why it’s important for data-driven businesses. In this blog post I’ll take a deep dive into NVMe features for enterprise and Edge storage.
Whether it’s high-performance artificial intelligence, hyperscale always-on systems, or even personal gaming, the NVMe protocol meets the ever-demanding and dynamic needs of enterprise and edge storage environments. But it’s not only performance that makes this new protocol significant for data-driven workloads. NVMe also has some very innovative features that bring unique benefits to existing workloads and open possibilities for new applications. Let me share a few NVMe features you should be aware of:
1. No (custom) device driver required
Early PCIe-connected SSDs each required their own device driver to do anything. If a user upgraded the operating system, switched operating systems or hypervisors, or even just upgraded a kernel with a security patch, a completely new device driver often had to be deployed to access the SSD. This was an error-prone, headache-inducing process that didn’t directly benefit an enterprise.
NVMe SSDs, however, are supported out-of-the-box in the major modern operating systems and hypervisors. Because the interface has been standardized, a single device driver can support any NVMe SSD from any manufacturer.
2. I/O Multipath, Namespaces, and SR-IOV
Beyond performance, the NVMe protocol also supports I/O multipath, which is particularly useful for redundancy and load balancing. This is a mandatory feature for high availability systems: if one path is inaccessible or busy, the data can still be reached via the other path. Namespace owners can either have exclusive control of a namespace or share it with one another (See Image #1). Shared namespace owners can operate concurrently with command atomicity. NVMe namespace sharing combined with multipathing builds the foundation for enterprise-class storage systems.
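To make the failover and load-balancing idea concrete, here is a minimal sketch of host-side path selection. It is only an illustration: the `PathSelector` class, its method names, and the path labels are hypothetical, and real multipath stacks weigh far more state than a healthy/unhealthy flag.

```python
class PathSelector:
    """Toy round-robin selector over the paths to one shared namespace.

    Hypothetical illustration only; not an API from any NVMe driver.
    """

    def __init__(self, names):
        self._order = list(names)
        self.healthy = {name: True for name in self._order}
        self._next = 0  # index of the path to try first on the next pick

    def mark(self, name, healthy):
        """Record whether a path is currently usable."""
        self.healthy[name] = healthy

    def pick(self):
        """Return the next healthy path, skipping any that are down."""
        count = len(self._order)
        for offset in range(count):
            idx = (self._next + offset) % count
            name = self._order[idx]
            if self.healthy[name]:
                self._next = (idx + 1) % count
                return name
        raise RuntimeError("no healthy path to the namespace")


selector = PathSelector(["path-A", "path-B"])
first_three = [selector.pick() for _ in range(3)]   # alternates between paths
selector.mark("path-A", False)                      # simulate a path failure
after_failure = [selector.pick() for _ in range(2)] # all I/O moves to path-B
```

While both paths are up, requests alternate between them (load balancing); once one is marked down, every request is routed over the survivor (redundancy).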
I/O virtualization, together with namespaces, makes NVMe very interesting for enterprise SAN, hyperscale server SAN, virtualization, and hyperconvergence use cases. Taking it one step further, SR-IOV (Single Root I/O Virtualization) allows different virtual machines (VMs) to share a single PCIe hardware interface. With SR-IOV, hypervisors need not participate in I/O activity and can still share components, which helps improve I/O performance and overall system utilization as well as infrastructure consolidation.
3. Multi-stream Writes
SSDs wear differently from hard drives. Due to the characteristics of NAND flash, SSDs have a finite lifetime dictated by the number of program/erase (P/E) cycles the NAND flash can endure (learn more about SSD endurance).
Multi Write Streams is another useful feature, which helps an SSD place similar data at contiguous locations in order to minimize garbage collection efforts. Properly implemented, this reduces write amplification, improves user write performance, and lowers write latencies. By reducing the number of system (device management) writes and increasing user writes, it can also increase device lifetime (See Diagram #2).
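A toy model helps show why grouping writes by stream reduces garbage collection work. The sketch below is my own simplification, not how any particular SSD firmware works: the four-page erase block, the "hot" and "cold" stream labels, and both function names are assumptions made for illustration.

```python
from collections import defaultdict

PAGES_PER_BLOCK = 4  # assumed erase-block size, kept tiny for readability


def place(writes, use_streams):
    """Fill erase blocks in arrival order, one open block per stream.

    Without streams, every write shares a single open block.
    Returns a list of blocks, each a list of (page_id, stream) tuples.
    """
    open_blocks = defaultdict(list)
    closed = []
    for page_id, stream in writes:
        key = stream if use_streams else "shared"
        open_blocks[key].append((page_id, stream))
        if len(open_blocks[key]) == PAGES_PER_BLOCK:
            closed.append(open_blocks[key])
            open_blocks[key] = []
    closed.extend(block for block in open_blocks.values() if block)
    return closed


def pages_to_relocate(blocks, dead_stream):
    """Count still-valid pages that must be copied out before the blocks
    holding invalidated `dead_stream` pages can be erased."""
    moved = 0
    for block in blocks:
        dead = sum(1 for _, stream in block if stream == dead_stream)
        if dead:
            moved += len(block) - dead
    return moved


# Interleaved short-lived ("hot") and long-lived ("cold") writes.
writes = [(i, "hot" if i % 2 == 0 else "cold") for i in range(16)]
mixed = pages_to_relocate(place(writes, use_streams=False), "hot")
streamed = pages_to_relocate(place(writes, use_streams=True), "hot")
```

In this model, once the hot data is invalidated, the mixed layout forces 8 cold pages to be copied elsewhere before any block can be erased, while the streamed layout needs no copies because the hot blocks contain nothing still valid. Those extra copies are the device-management writes that drive up write amplification.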
4. Asynchronous Events Capture (SMART & Firmware Image Load)
The NVMe protocol also supports asynchronous events such as SMART status check, error reporting, firmware commit, sanitize, etc. These are critical for the host to understand what is happening, react, and bring the device under control if needed. These events may not be executed immediately, nor do they time out. Whenever these events occur, the host receives the message and triggers actions (e.g. if the temperature exceeds the pre-determined thresholds, throttling may start, I/O operations can be stopped until the device cools down, or a fan can run at a higher speed). The host can also issue asynchronous firmware download and commit commands to download a firmware image, verify it, and make it available in a specific image slot.
Firmware upgrades are common as new features are added, bugs are fixed, and security patches are released. However, in an enterprise data center, downtime means loss of revenue. Therefore, these devices need to support planned downtime, have more than one firmware image available, and minimize downtime as much as possible. The controller can use a checksum, a cryptographic hash, or a digital signature to verify and validate the image and make it available in the specified firmware slot. The images are generally available to load after the next reset cycle.
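The verify-then-stage flow can be sketched in a few lines. This is a host-side illustration of the cryptographic hash option only, using SHA-256 as an assumed algorithm; `verify_image`, `stage_firmware`, and the `slots` dictionary are hypothetical names, not NVMe commands or a vendor API.

```python
import hashlib
import hmac


def verify_image(image: bytes, expected_sha256_hex: str) -> bool:
    """Return True when the image hashes to the expected SHA-256 digest."""
    actual = hashlib.sha256(image).hexdigest()
    # compare_digest avoids leaking where the two strings first differ
    return hmac.compare_digest(actual, expected_sha256_hex.lower())


def stage_firmware(slots: dict, slot: int, image: bytes,
                   expected_sha256_hex: str) -> bool:
    """Place the image in a slot only if verification succeeds.

    `slots` is a plain dict standing in for the device's firmware slots;
    an existing image is left untouched when verification fails.
    """
    if not verify_image(image, expected_sha256_hex):
        return False
    slots[slot] = image
    return True


good_image = b"example firmware payload"
digest = hashlib.sha256(good_image).hexdigest()

slots = {1: b"currently running image"}
staged = stage_firmware(slots, 2, good_image, digest)
rejected = stage_firmware(slots, 3, good_image + b"tampered", digest)
```

Keeping the running image in its own slot while the new one is staged in another is what allows the switch to happen at the next planned reset rather than immediately.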
5. Boot Partition
The NVMe specifications also define a boot partition, which can be read even before the controller is ready. The partition space may hold system initialization code to boot to a pre-OS environment such as UEFI. This option is useful for secure boot applications as well.
6. Power & Thermal Management
The underlying NAND devices consume more power during write operations than during read operations. The higher the power, the higher the heat dissipation. Recognizing the I/O pattern and then allowing higher power to a select few devices can be an interesting feature. This gives users a lot of flexibility in managing power and temperature challenges while delivering enterprise-level performance. For example, an application can set higher power and thermal budgets for write-intensive workloads and lower ones for read-dominated workloads. This can help to manage the overall thermal and power budgets for enterprise server/storage systems.
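As a rough sketch of that budgeting idea, the function below hands the higher power level to the most write-intensive drives first and keeps the rest at a lower level. Everything here is assumed for illustration: the `assign_power` name, the 25 W and 12 W levels, and the drive names and write fractions do not come from any specification or product.

```python
def assign_power(write_fraction, budget_watts, high_watts=25.0, low_watts=12.0):
    """Greedy power plan under a shared budget.

    write_fraction: dict of drive name -> share of I/O that is writes (0..1).
    Returns a dict of drive name -> allotted watts.
    The wattage levels are placeholders, not real NVMe power states.
    """
    plan = {name: low_watts for name in write_fraction}
    remaining = budget_watts - low_watts * len(write_fraction)
    if remaining < 0:
        raise ValueError("budget cannot cover the low level for every drive")

    extra = high_watts - low_watts
    # Most write-heavy drives are considered first for the higher level.
    for name in sorted(write_fraction, key=write_fraction.get, reverse=True):
        if remaining >= extra:
            plan[name] = high_watts
            remaining -= extra
    return plan


drives = {"log-drive": 0.9, "cache-drive": 0.7, "archive-drive": 0.1}
plan = assign_power(drives, budget_watts=62.0)
```

With a 62 W budget, the two write-heavy drives receive the higher level and the read-dominated one stays at the lower level, so the total never exceeds the budget.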
NVMe Features: from Performance Leadership to New Possibilities
NVMe devices have demonstrated leadership performance, with up to 1.2M IOPS in a single drive from our Ultrastar® SN260 product. Application workloads such as databases, virtualization, data mining, real-time analytics, IoT and other high performance compute can take advantage of high throughput and lower latencies. Moreover, unique NVMe features open the door for new applications in data centers, the cloud and the Edge.
We are just starting to scratch the surface of the data revolution. New discoveries coming from IoT, machine learning and new applications are transforming the value of data, and require organizations to rethink how data is captured, preserved, accessed and transformed. NVMe will be a pervasive technology in enabling new opportunities at scale. I’ll share more about how our customers are taking advantage of NVMe today and how Western Digital’s breadth of expertise and level of integration give us an unmatched ability to deliver carefully calibrated solutions for every type and use of data.
Stay tuned for my next blog where I’ll share NVMe use cases from real-world businesses today.
If you are looking to learn more about NVMe, watch this webcast: NVMe Storage Q&A Webinar – Facts, Myths, and Answers to All of Your NVMe Questions
Rohit has more than 10 years of compute & storage industry experience in various capacities of increasing cross functional responsibilities.