Data Warehouse in the Era of Flash: 20-28TB DW in a 2U server

Faster insights from your business data – that’s what you expect from your data warehouse to help you achieve your business outcomes. So let’s talk about the systems that enable you to achieve that and how to choose the right one for your workloads.

In the first post in this series, “Data Warehouse in the Era of Flash – Better, Stronger, Faster”, I wrote about the role of storage in data warehouse systems, why some familiar practices are tied to HDDs and don’t apply to flash storage, and how Microsoft’s SQL Server Data Warehouse Fast Track program certifies balanced data warehouse systems.

In this blog post I’ll look at two Fast Track systems rated for 20TB to 28TB that you can deploy in a single 2U server. These systems deliver very similar results, varying somewhat in capacity and performance.

Lenovo System x3650 M5

The first system is based on the Lenovo System x3650 M5, FT-rated for a 20TB data warehouse. (Microsoft’s DWFT certification appears on page 4.) This 2-socket server takes 2U of rack space, and Lenovo classifies it as a Standard data warehouse system with a capacity range of 10-40TB.

This system stores the data warehouse and tempdb on six IBM 1300GB Enterprise io3 Flash Adapters for System x, which is Lenovo’s branding of the SanDisk® Fusion ioMemory PX600-1300.

Note the “measured throughput”, which is 265 queries/hour/TB using SQL Server’s row store, or 1,961 queries/hour/TB using columnstore.

Multiplying the per-terabyte throughput by the rated capacity of 20TB gives us 5,300 queries/hour using SQL Server’s row store features, and 38,220 queries/hour using SQL Server’s in-memory columnstore features. CPU utilization is 86% using row store, and 98% using columnstore.

HP DL380 Gen8 server

The second system is based on the HP DL380 Gen8 server, FT-rated for a 28TB data warehouse. (Disregard the document title; Microsoft’s DWFT certification on page 6 shows that the FT-rated capacity is 28TB.) This is also a 2-socket server that takes 2U of rack space.

This HP system stores the data warehouse on four HP 2.6TB HH/HL Light Endurance (LE) PCIe Workload Accelerators, which are HP’s rebranding of the SanDisk Fusion ioMemory PX600-2600.

The “measured throughput” is 202 queries/hour/TB using SQL Server’s row store, or 1,414 queries/hour/TB using columnstore.

Multiplying the per-terabyte throughput by the rated capacity of 28TB gives us 5,656 queries/hour using SQL Server’s row store features, and 39,592 queries/hour using SQL Server’s in-memory columnstore features. CPU utilization is 96% using row store, and 98% using columnstore.
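The scaling used for these totals can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name total_queries_per_hour is hypothetical and not part of any Fast Track tooling, and the inputs are the rated capacities and measured throughput figures quoted in this post.

```python
def total_queries_per_hour(rated_capacity_tb: int, queries_per_hour_per_tb: int) -> int:
    """Scale a per-terabyte throughput figure to the full rated capacity."""
    return rated_capacity_tb * queries_per_hour_per_tb


# Inputs are (rated capacity in TB, measured queries/hour/TB) as quoted in the post.
lenovo_row_store = total_queries_per_hour(20, 265)
hp_row_store = total_queries_per_hour(28, 202)
hp_columnstore = total_queries_per_hour(28, 1414)

print(f"Lenovo x3650 M5, row store: {lenovo_row_store:,} queries/hour")  # 5,300
print(f"HP DL380 Gen8, row store:   {hp_row_store:,} queries/hour")      # 5,656
print(f"HP DL380 Gen8, columnstore: {hp_columnstore:,} queries/hour")    # 39,592
```

The same function applies to any Fast Track rating sheet that reports throughput per terabyte alongside a rated capacity.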

Points of comparison

The Lenovo system uses six 1.3TB cards, which provide six controllers (one on each card) to access the flash storage. That delivers higher performance, and contributes to the 96% CPU utilization with row store.

The HP system uses four 2.6TB cards, delivering higher capacity at slightly lower performance from only four controllers, which contributes to the 86% CPU utilization with row store.

Both systems reach 98% CPU utilization with columnstore.

What are the Benefits of These Systems?

Balanced. These systems are pre-configured and validated as balanced data warehouse systems, which avoids over- or under-spending in any one area of the system; each is sized just right for its rated capacity.

Complete. These are complete solutions, ordered off the server vendors’ price lists. They include the rebranded SanDisk ioMemory flash, which is covered by the server vendor’s warranty and support services, and SQL Server 2014 Enterprise Edition for a complete set of business intelligence tools.

Performance. These modern servers and CPUs take full advantage of SanDisk flash storage to deliver performance that lets you maximize the benefit from this data warehouse system. More users, more queries – more business insights, better business results.

Simple and Compact. Everything you need is installed in these 2U servers. No additional infrastructure is required to deal with external storage.

Reliable and Economical. Using SanDisk flash instead of HDDs improves system reliability, with up to 10,000 times fewer uncorrectable bit errors. Flash also requires much less power to operate and generates far less heat, so you save on cooling costs.

Either of these systems is a great data warehouse solution for customers needing a 20-28TB data warehouse.

Learn more about data warehouse in the era of flash on our data warehouse hub, and at SQL PASS Summit visit SanDisk in booth #222.
