VMware Virtual SAN (VSAN) is a new approach to hypervisor-converged infrastructure that I’ve written about extensively on this blog. I want to take the opportunity to give an overview of the role and benefits of high-capacity flash in the VSAN capacity tier, and to share some suggestions and best practices for implementation.
VSAN Capacity Tier
For those who are not familiar with VSAN, you can learn more about VSAN here. The whole idea behind this new software-defined architecture is to bring compute and storage together. That means using server-side storage while providing the functionality of the traditional approach, where compute and storage are separate and connected over a network, whether IP-based or FC-based.
In VSAN, the storage in each server is organized into one or more disk groups. Each disk group contains several disks: one is designated as the “cache tier” and the remaining disks form the “capacity tier”.
A basic VSAN configuration needs three nodes (three servers), each contributing both compute and storage. As you scale beyond that, each node can contribute compute, storage, or both.
VSAN can be architected in two different ways:
- Hybrid VSAN: In this approach the cache layer is built from flash and the capacity tier from magnetic media. From a technology point of view, when an application reads data from or writes data to storage, the I/O goes to the cache layer first (occasionally a read is served from the magnetic layer, but this is not desirable). VSAN’s software divides the flash in the cache tier between read cache and write buffer according to a defined percentage of its total capacity. That means the cache tier provides both read and write caching, while the capacity tier only stores data.
- All-Flash VSAN: With all-flash, the approach changes slightly. The capacity tier not only stores data but also serves as the read cache. In this case the cache tier is used as a write buffer, while the capacity tier handles reads and stores data.
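To make the difference between the two architectures concrete, here is a minimal Python sketch of how one disk group's cache device would be divided. The function name and the 70/30 default split are my own illustrative assumptions; the text above only says that hybrid VSAN uses "a defined percentage", so treat the ratio as a placeholder parameter.

```python
def split_cache_tier(cache_gb: float, all_flash: bool,
                     read_fraction: float = 0.7) -> dict:
    """Divide a disk group's cache device into read cache and write buffer.

    read_fraction is an assumed, illustrative ratio for hybrid VSAN.
    In all-flash VSAN the cache tier is used only as a write buffer,
    because reads are served from the capacity tier.
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    if all_flash:
        return {"read_cache_gb": 0.0, "write_buffer_gb": float(cache_gb)}
    read_gb = cache_gb * read_fraction
    return {"read_cache_gb": read_gb, "write_buffer_gb": cache_gb - read_gb}


print(split_cache_tier(400, all_flash=False))
print(split_cache_tier(400, all_flash=True))
```

The point of the sketch is simply that the same cache device yields a much larger write buffer in an all-flash configuration.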
Building a Large Capacity Tier in All Flash VSAN
As any IT or storage manager knows, data in an organization is always growing, and there seems to be no letup. Although VSAN can scale up to 64 nodes per cluster to serve the ongoing demand of data growth, buying and managing more infrastructure is most likely not the first step any organization wants to take to cater to its growing data needs.
With VSAN it is very simple to add one or more nodes to scale out. From an administrator's and IT manager's perspective, however, there is still work to be done: you need to plan for additional floor space, power and cooling, and, last but not least, the budget for the additional hardware.
Here are some of the considerations that should be evaluated before scaling out a VSAN environment, and specifically why I see higher-capacity drives as key to keeping the VSAN infrastructure footprint low and achieving better cost and management efficiencies when tackling the rapid growth of data.
1. Deduplication
Deduplication is one of the data services provided at the storage layer. Put simply, it means “avoid retaining duplicate copies of data”. Deduplication is absent in VSAN today but will most likely be added later. Some argue that deduplication pays for better storage utilization with CPU cycles, and that adding one more storage node to a VSAN cluster can provide a similar gain in usable capacity without the CPU cost of the deduplication process. As more powerful hardware configurations become available at lower cost, this may be feasible from a cost perspective. However, as I wrote, I see a great operational and cost advantage in maximizing efficiencies and minimizing hardware.
2. Compression
Compression is a similar data service provided at the software-defined storage layer. It, too, is currently absent in VSAN. Without compression, the stored data grows significantly faster than it would in a compressed state.
3. Number of Failures to Tolerate
This is a VSAN high availability feature. You can set the number of failures to tolerate (FTT) from 0 to 3; the default is 1. The default policy means every VM disk in the environment has a mirror copy, so if one copy of a VM disk fails it can be rebuilt from the other. From an enterprise deployment perspective, an FTT of 2 or higher is recommended (meaning two or more mirror copies of each VM disk) so that the environment can survive multiple failures. The failure could be a node failure or a failure of the actual storage disks.
Because the same VM disk is copied multiple times, additional capacity is needed to hold those copies. Users can set FTT to 0 in order to save capacity. However, I don’t think any storage administrator would take this approach, as high availability is a key requirement for any organization.
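The capacity impact of FTT can be sketched in a few lines of Python. This assumes plain mirroring as described above, where an FTT of n keeps the original plus n mirror copies (n + 1 copies in total). The function name is hypothetical, and the sketch ignores any other overhead.

```python
def raw_capacity_needed_tb(usable_tb: float, ftt: int) -> float:
    """Raw capacity consumed when every VM disk is stored ftt + 1 times.

    Assumes simple mirroring: FTT=1 keeps one mirror copy next to the
    original, FTT=2 keeps two mirror copies, and so on.
    """
    if not 0 <= ftt <= 3:
        raise ValueError("FTT must be between 0 and 3")
    return usable_tb * (ftt + 1)


# Raw capacity required to hold 10 TB of VM data at each FTT setting.
for ftt in range(4):
    print(f"FTT={ftt}: {raw_capacity_needed_tb(10, ftt)} TB raw")
```

Moving from the default FTT of 1 to an FTT of 2 raises the raw capacity requirement from two times to three times the VM data, which is exactly where larger drives help.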
4. Server Size and Scale-out Nodes
Storage bays in any given server are limited. A standard server specification generally offers 16 to 24 drive slots. From a VSAN perspective, 2 to 4 of those slots are used for installing vSphere and providing the VSAN cache tier. vSphere can be installed on SD cards to save some drive slots, but you must remember that not all. Each node will still have at least 2-3 disk groups, which means the cache tier will consume some of the drive slots.
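As a rough illustration of the slot budget, the following Python sketch counts how many slots remain for capacity drives. It assumes one cache device per disk group, as described earlier; the function and parameter names are hypothetical.

```python
def capacity_slots(total_slots: int, boot_slots: int, disk_groups: int) -> int:
    """Slots left for capacity drives after boot and cache devices.

    Assumes each disk group uses exactly one slot for its cache device.
    """
    remaining = total_slots - boot_slots - disk_groups
    if remaining < disk_groups:
        raise ValueError("each disk group needs at least one capacity drive")
    return remaining


# 16-slot server, vSphere on two drives, two disk groups
print(capacity_slots(total_slots=16, boot_slots=2, disk_groups=2))
# 24-slot server, vSphere on SD card, three disk groups
print(capacity_slots(total_slots=24, boot_slots=0, disk_groups=3))
```

Since the number of remaining slots is fixed by the chassis, the only lever left for per-node capacity is the size of each drive.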
5. Snapshots
With VSAN 6.0, VMware now supports 32 snapshots per VMDK file of each VM. Virtualization and storage admins will definitely plan to use this feature for its operational efficiency and benefits. A snapshot does not create a full disk-size image, only a delta disk, but that additional space still needs to be planned for, and higher-capacity drives can be useful in this regard.
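For planning purposes, snapshot space can be roughly estimated as in the Python sketch below. The change rate (the fraction of the disk rewritten between snapshots) is an assumption you would need to measure for your own workloads, and the function name is hypothetical. Real delta disks can grow beyond this estimate.

```python
MAX_SNAPSHOTS = 32  # per VMDK in VSAN 6.0, as stated above


def snapshot_space_gb(vmdk_gb: float, snapshots: int, change_rate: float) -> float:
    """Rough space estimate for the delta disks of one VMDK.

    change_rate is the assumed fraction of the VMDK that changes
    between two consecutive snapshots (0.05 means 5%).
    """
    if not 0 <= snapshots <= MAX_SNAPSHOTS:
        raise ValueError(f"snapshots must be between 0 and {MAX_SNAPSHOTS}")
    if not 0.0 <= change_rate <= 1.0:
        raise ValueError("change_rate must be between 0 and 1")
    return vmdk_gb * change_rate * snapshots


# A 100 GB VMDK with the full 32 snapshots and an assumed 5% change rate
print(round(snapshot_space_gb(100, 32, 0.05), 1), "GB of delta disks")
```

Under these assumptions the delta disks end up larger than the base disk itself, and that is before the mirror copies required by the FTT policy are counted.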
6. Blade Server Configuration
Blade server configurations have a very low number of drive slots and are not an ideal fit for VSAN. To overcome this challenge, VSAN 6.0 will support direct attached storage. This direct attached storage needs to be certified by VMware, and it is again important that the enclosure be populated with large-capacity drives to avoid the challenges mentioned above.
7. Host Maintenance
Hosts commonly need some kind of maintenance. In a VSAN-enabled cluster, when we put a host into maintenance, all the existing data on that host needs to be moved to the other available hosts. This migration needs to be planned for in the same way, and larger-capacity drives will come in handy.
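A simple way to reason about this is to check whether the free space on the remaining hosts can absorb the data of the host going into maintenance. The Python sketch below does only that. The host names and figures are made up, and it deliberately ignores placement rules such as keeping the copies of one VM disk on different hosts.

```python
from typing import Dict, Tuple


def can_evacuate(hosts: Dict[str, Tuple[float, float]], target: str) -> bool:
    """Check whether the other hosts have room for the target's data.

    hosts maps a host name to (capacity_tb, used_tb).
    Simplified: only compares total free space with the data to move.
    """
    _, used_on_target = hosts[target]
    free_elsewhere = sum(
        capacity - used
        for name, (capacity, used) in hosts.items()
        if name != target
    )
    return free_elsewhere >= used_on_target


# Hypothetical four-node cluster: (capacity_tb, used_tb) per host
cluster = {
    "esx01": (19.2, 12.0),
    "esx02": (19.2, 14.0),
    "esx03": (19.2, 15.0),
    "esx04": (19.2, 13.0),
}
print(can_evacuate(cluster, "esx01"))
```

If the check fails, the cluster needs either more nodes or more capacity per node before that host can be fully evacuated.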
8. High Capacity Drives
When seeking more capacity, the best approach is to use high-capacity drives. This simple measure mitigates the challenge of growth to a great extent. If further needs arise, additional nodes (servers) can be planned, but implementing higher-capacity drives postpones the point at which more hardware nodes have to be added.
High-Capacity Flash – SanDisk® Implementation
SanDisk’s flash solution portfolio offers a wide range of drives that vary in capacity, endurance, application profile (RI, MU and WI) and storage interface (SAS, SATA and PCIe). When seeking a VSAN solution, I can imagine the following configurations to be a good start:
Moderate Data Growth
For environments where data growth appears to be moderate, a 1.6 TB drive can be considered as a starting point. Assuming the 12 drive slots available in the server are populated with such capacity tier drives, around 20 TB of storage (12 × 1.6 TB = 19.2 TB) can be created in each node, which is a large enough pool to accommodate moderate data growth.
High Data Growth
For environments where data growth is significant, I would propose using our 4 TB drives. Based on the same assumption of 12 drive slots, each node can go up to roughly 50 TB of capacity (12 × 4 TB = 48 TB).
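The effect of drive size on node count can be sketched in Python as follows. The 12 slots and the 1.6 TB and 4 TB drive sizes come from the two scenarios above, while the 200 TB raw capacity target and the function names are hypothetical.

```python
import math


def node_capacity_tb(slots: int, drive_tb: float) -> float:
    """Raw capacity tier size of one node."""
    return slots * drive_tb


def nodes_needed(target_tb: float, slots: int, drive_tb: float,
                 min_nodes: int = 3) -> int:
    """Nodes required to reach target_tb of raw capacity.

    min_nodes reflects the three-node minimum of a basic VSAN cluster.
    """
    per_node = node_capacity_tb(slots, drive_tb)
    return max(min_nodes, math.ceil(target_tb / per_node))


# Hypothetical target of 200 TB raw capacity with 12 capacity slots per node
for drive_tb in (1.6, 4.0):
    per_node = round(node_capacity_tb(12, drive_tb), 1)
    count = nodes_needed(200, 12, drive_tb)
    print(f"{drive_tb} TB drives: {per_node} TB per node, {count} nodes")
```

For this hypothetical target, the 4 TB drives need 5 nodes where the 1.6 TB drives need 11, which is the footprint argument of this post in numbers.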
Conclusion
There are many ways to approach the growth of data, on both the software and the hardware side. From a management perspective, reducing the amount of hardware simplifies operations and improves cost efficiency. Using higher-capacity drives in the available slots can help organizations keep infrastructure growth in check.
In my next blog, I will share more thoughts on what drives (capacity and performance) you should consider for your VSAN deployment.