Hadoop and Cassandra to merge in DataStax distro

DataStax's combined Hadoop/Cassandra distribution can undertake both database and data analysis duties

Uniting the seemingly conflicting values of fast data access and deep analysis, open-source software company DataStax is developing a package that will combine its Cassandra non-relational database with the Apache Hadoop data processing framework, the company announced Wednesday.

The distribution, to be called Brisk, combines low-latency data storage and retrieval with the ability to do in-depth analysis of that data, said Matt Pfeil, CEO and co-founder of DataStax, formerly called Riptano.

Cassandra has traditionally been used by Web 2.0 companies that require a fast and scalable way to store simple data sets, while Hadoop has been used for analyzing vast amounts of data across many servers.

Typically, running heavy analytics against live databases has been frowned upon, because it could slow the database's responsiveness. For this distribution, however, DataStax is taking advantage of Cassandra's ability to be distributed across multiple nodes.

In this setup, the data is replicated: one copy is kept on the transactional servers, and another copy is placed on servers dedicated to analytics. "The two parts of your data don't interfere with each other," Pfeil said.
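
As a rough illustration of the replication idea Pfeil describes, the Python sketch below models two groups of servers that each hold a full copy of the data, with lookups routed to one group and scans routed to the other. It is a toy model under stated assumptions, not DataStax's implementation: the names ReplicaGroup, TwoGroupCluster, write, read and scan_total are hypothetical and do not correspond to any Cassandra or Brisk API.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaGroup:
    """A hypothetical group of nodes holding one full copy of the data."""
    name: str
    rows: dict = field(default_factory=dict)
    reads_served: int = 0


class TwoGroupCluster:
    """Toy model: every write is copied to both groups; reads are routed by workload."""

    def __init__(self):
        self.transactional = ReplicaGroup("transactional")
        self.analytics = ReplicaGroup("analytics")

    def write(self, key, value):
        # Replicate the row so each group keeps its own copy.
        for group in (self.transactional, self.analytics):
            group.rows[key] = value

    def read(self, key):
        # Low-latency lookups touch only the transactional copy.
        self.transactional.reads_served += 1
        return self.transactional.rows[key]

    def scan_total(self):
        # A full scan, standing in for an analytics job, touches only the analytics copy.
        self.analytics.reads_served += len(self.analytics.rows)
        return sum(self.analytics.rows.values())


cluster = TwoGroupCluster()
for user, clicks in [("alice", 3), ("bob", 5), ("carol", 2)]:
    cluster.write(user, clicks)

total = cluster.scan_total()   # served entirely by the analytics group
single = cluster.read("bob")   # served entirely by the transactional group
```

In this sketch the scan adds no load to the transactional group's read counter, which is the sense in which the two workloads do not interfere. A real cluster would also have to handle replication lag and node failures, which the model ignores.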

The initial customers might be Internet service companies that already use Cassandra for high-volume data capture and retrieval, Pfeil explained. The company is also marketing the package to enterprises as a potentially lower-cost and speedier alternative to databases and data warehouses.

The initial version of Brisk will use Hadoop version 0.20.2, the Hive data warehouse infrastructure version 0.7, and Cassandra 0.7.4. It will keep Hadoop's MapReduce, job tracker and task tracker functionality, but will replace the underlying Hadoop Distributed File System (HDFS) with a Cassandra interface called CassandraFS, according to a DataStax white paper describing the technology.

DataStax plans to issue this distribution, under the Apache open-source license, within the next two months.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com
