Hadoop

From Data Control to Regulations – 8 Ways to Make Data Work from DataWorks Summit
Data professionals met recently at the DataWorks Summit Europe to talk about the hottest issues relating to big data management and analytics. Whether it’s the data value chain, governance, management, metadata, or standardization, data control and regulation is a perfect storm of all things data. Here are our 8 key takeaways.
April 25, 2018 • 5 min read Data

Better Data Lake with Apache Hadoop® on Object Storage
S3A is Hadoop's new S3 adapter, which lets you connect your Hadoop cluster to any S3-compatible object storage. Here are three advantages of running Apache Hadoop® on object storage rather than traditional scale-out storage, and how you can build a better data lake today.
April 10, 2018 • 5 min read Data

Western Digital Unveils New Addition: 8TB Ultrastar® DC HC320
We are now shipping the latest product in our capacity enterprise HDD product line, the 8TB Ultrastar® DC HC320, providing a new high-capacity point in our lower-capacity lineup of air-based products. As part of the unification of our new Western Digital brand, the Ultrastar DC HC320 is being introduced under the Western Digital […]
March 15, 2018 • 3 min read Data

Ultrastar® 7K6: More Data, Fewer Disks
Western Digital eliminates platters to deliver better value! Even with Western Digital’s advancements in the SSD business, we remain committed to the value we can deliver with drives based on spinning media. Our latest enterprise-class hard disk drive (HDD), the Ultrastar 7K6, leverages improved areal density to provide the same capacity as the previous generation […]
January 10, 2018 • 3 min read Data

Which Hadoop Architecture is Right for Me?
Hadoop technology just turned 10 and has gained tremendous momentum. But some components of the traditional architecture are showing their age, and a new approach to Hadoop architecture is emerging.
July 26, 2016 • 2 min read Data

Accelerate and Optimize Big Data and Hadoop
Organizations trying to embark on the Big Data journey are often confronted with infrastructure challenges. Prasad Venkatachar shares testing results of a new Hadoop solution leveraging Cisco UCS servers and Fusion ioMemory™ from SanDisk.
May 13, 2016 • 9 min read Data

The Third Phase of Big Data
The demand for higher-capacity, higher-performing systems is driving us toward the third phase of the Big Data revolution.
May 4, 2015 • 6 min read Data

SanDisk® is Cloudera Certified! Hadoop with Confidence
Within our SanDisk® labs, I conducted a number of experiments with Apache Hadoop and SanDisk flash, mainly our CloudSpeed Ascend SATA Solid State Drives (SSDs). The initial experiments used standard Hadoop benchmarks, namely the Terasort and TestDFSIO benchmarks. These benchmarks showed me how SanDisk SSDs helped boost the performance of the Terasort […]
November 10, 2014 • 3 min read Data

Flash-Accelerated Apache HBase
In today’s hyper-connected world, there is a significant amount of data being collected, and later analyzed to make business decisions. This explosion of data has led to various technologies that can operate on this “Big-Data”, technologies for Big-Data Analytics. Traditional database systems and data warehouses are being augmented with newer scale-out open-source technologies like Apache […]
November 3, 2014 • 4 min read Data

Faster Hadoop Reads and Writes with SanDisk® SSDs
In my first blog post on the SanDisk® ITBlog, I talked about testing the Terasort benchmark using SanDisk CloudSpeed SSDs within an Apache Hadoop data-analytics environment. The blog and the more detailed technical paper that followed talked about the significant performance and TCO benefits that can be achieved by strategically using SSDs within a Big-Data […]
September 11, 2014 • 4 min read Data