Columnstore indexes are a powerful memory-optimized column-oriented index intended for data warehouse workloads—but far too few architects know about them or their benefits. Columnstore indexes routinely improve performance by 10x, 100x (such as in the demo below), and even 1,000x—rendering results in seconds that classic row store data structures required minutes to satisfy. Microsoft recommends columnstore indexes as the default choice for data warehouses—and I concur. In this blog I provide:
A compelling video demo of the power of columnstore
A brief introduction to columnstore indexes for data warehousing
Better together: Columnstore on SanDisk® flash (TPC-H world records and Microsoft Data Warehouse Reference Architectures)
The Power of Columnstore
I work in a division of SanDisk called the Data Propulsion Lab (DPL). There we do what I call “flipping the /faster bit”, and columnstore does exactly that for SQL Server data warehouses.
So how much faster is it? See for yourself by viewing the following video. And if you find it compelling that columnstore improves warehouse query performance times from over two minutes down to two seconds, then I invite you to read the rest of the post.
Demo: SQL Server Columnstore Performance
Introduction to Columnstore Indexes
Columnstore indexes were introduced in SQL Server 2012. Even though as stated Microsoft recommends columnstore indexes as the default choice for data warehouses, they remain largely unknown.
Transparent to the app, columnstore has been shown to render queries at sub-second response times which previously took many minutes or longer. In fact, columnstore indexes can increase SQL Server data warehouse query performance by one, two, or even three orders of magnitude.
In SQL Server 2012 some adoption blockers exist, yet columnstore is nonetheless a game changer for many apps.
In SQL Server 2014, columnstore was wholly re-architected and improved, potential blockers have been largely removed, and it is profoundly changing the way we interact with our data.
In SQL Server 2016, Microsoft has continued to enhance this remarkable feature.
How It Works
If we’re looking for a subset of columns from one or a few rows, given the right indexes, SQL Server can do a superlative job of providing an answer. Yet if a query by design needs to hit lots of rows—reporting, aggregations, grouping, scans, etc.—SQL Server has never had a good mechanism—until columnstore.
Subscribe to Blog via Email
By providing your email address, you agree to the terms of Western Digital's Privacy Statement
Contrasted with traditional row stores in which data is physically stored row-by-row, columnstore stores values for all rows for a given column. A given data page will store values only for the given column to which that page is allocated. Storage is pivoted relative to the row stores we’ve all grown up with. Columnstore literally turns row storage on its head—and this isn’t the only thing that’s different.
SanDisk flash is the storage of choice for many configurations, including these seven Data Warehouse Fast Track Reference Architectures (DWFTRA) certified by Microsoft in collaboration with HP and Lenovo:
Table 1. List of publicly available Microsoft Data Warehouse Fast Track Reference Architecture Certifications
Stay tuned for additional information on SQL Server columnstore on SanDisk flash, including real-life performance examples and additional insight using our solutions. We’ll review the architecture, as well as the challenges, workarounds, travails, and big wins at customer sites. See whether columnstore can change your users’ world and why columnstore may be a compelling reason to upgrade.
Columnstore indexes flip the DW /faster bit, manifesting my motto, “Change the world or go home!”
Jimmy May is a SQL Server Certified Master and is a SQL Server Technologist for SanDisk Data Propulsion Lab.Jimmy was a Principal Architect in Microsoft IT and formerly a Senior Program Manager for the SQL Server Customer Advisory Team (SQL CAT) where he managed the Customer Lab which hosts the biggest, fastest, and most interesting SQL Server apps from around the world. His last day job was Senior Database Architect for one of the world’s largest, SQL Server high-throughput OLTP DBs. He was a founder & on the executive committees of both the Indiana Windows User Group (www.iwug.net) & Indianapolis Professional Association for SQL Server (www.indypass.org).His passion is “flipping the /faster bit”; his motto is “change the world or go home”. Visit his SQL Server performance and personal productivity blog at http://sqlblog.com/blogs/jimmy_may.