Hadoop


Better Data Lake with Apache Hadoop® on Object Storage

S3A is Hadoop's new S3 adapter, which lets you connect your Hadoop cluster to any S3-compatible object storage. Here are three advantages of using Apache Hadoop® on Object Storage vs. traditional scale-out storage, and how you can build a better data lake today.

April 10, 2018 • 5 min read Data

Western Digital Unveils New Addition: 8TB Ultrastar® DC HC320

We are now shipping the latest product in our capacity enterprise HDD product line, the 8TB Ultrastar® DC HC320, which provides a new high-capacity point in our lower-capacity lineup of air-based products. As part of the unification of our new Western Digital brand, the Ultrastar DC HC320 is being introduced under the Western Digital […]

March 15, 2018 • 3 min read Data

Ultrastar® 7K6: More Data, Fewer Disks

Western Digital eliminates platters to deliver better value! Even with Western Digital’s advancements in the SSD business, we remain committed to the value we can deliver with drives based on spinning media. Our latest enterprise-class hard disk drive (HDD), the Ultrastar 7K6, leverages improved areal density to provide the same capacity as the previous generation […]

January 10, 2018 • 3 min read Data

Which Hadoop Architecture is Right for Me?

Hadoop technology just turned 10, and has gained tremendous momentum. But some components of the traditional architecture are coming of age, and a new approach to Hadoop architecture is emerging.

July 26, 2016 • 2 min read Data

Accelerate and Optimize Big Data and Hadoop

Organizations trying to embark on the Big Data journey are often confronted with infrastructure challenges. Prasad Venkatachar shares testing results of a new Hadoop solution leveraging Cisco UCS servers and Fusion ioMemory™ from SanDisk.

May 13, 2016 • 9 min read Data

The Third Phase of Big Data

The demand for higher-capacity, higher-performing systems is driving us toward the third phase of the Big Data revolution.

May 4, 2015 • 6 min read Data

SanDisk® is Cloudera Certified! Hadoop with Confidence

Within our SanDisk® labs, I conducted a number of experiments with Apache Hadoop and SanDisk flash, mainly our CloudSpeed Ascend SATA Solid State Drives (SSDs). The initial experiments used standard Hadoop benchmarks, namely the Terasort and TestDFSIO benchmarks. These benchmarks showed me how SanDisk SSDs helped boost the performance of the Terasort […]

November 10, 2014 • 3 min read Data

Flash-Accelerated Apache HBase

In today’s hyper-connected world, a significant amount of data is being collected and later analyzed to make business decisions. This explosion of data has led to various technologies for Big Data analytics that can operate on this “Big Data”. Traditional database systems and data warehouses are being augmented with newer scale-out open-source technologies like Apache […]

November 3, 2014 • 4 min read Data

Faster Hadoop Reads and Writes with SanDisk® SSDs

In my first blog post on the SanDisk® ITBlog, I talked about testing the Terasort benchmark using SanDisk CloudSpeed SSDs within an Apache Hadoop data-analytics environment. That post and the more detailed technical paper that followed described the significant performance and TCO benefits that can be achieved by strategically using SSDs within a Big-Data […]

September 11, 2014 • 4 min read Data