Apache Cassandra 2.0 brings old-school SQL functionality

Relational database administrators will recognize Cassandra's new triggers, lightweight transactions and SQL-like query language

The Apache Cassandra NoSQL distributed data store continues to accumulate features that mimic those of traditional databases, with the newly released version 2.0 of the open source software offering triggers, lightweight transactions and an updated query language similar to SQL.

"A lot of our development is driven [to dissolve] the pain points that our users are feeling the most distress over," said Cassandra vice president Jonathan Ellis, about the release of Cassandra 2.0, which is managed under the Apache Software Foundation.

First created for Facebook, Cassandra is a NoSQL distributed data store that can easily handle large volumes of writes and reads, a quality that has won favor both with high-volume Internet services and with firms running big-data-style analysis.

Organizations such as Adobe, CERN, Comcast, eBay, GoDaddy, Hewlett-Packard, IBM, Instagram, Netflix, and Sony all use the software.

Many of the changes in the new version offer capabilities long enjoyed by relational databases, capabilities that could make Cassandra suitable as a replacement for a traditional database, at least for some use cases.

Perhaps the most notable feature is support for lightweight transactions, which guarantee that any one data store operation isn't interrupted by any other operation. "We're the first eventually consistent database to implement lightweight transactions," Ellis said.

Long a feature of traditional SQL databases, lightweight transactions ensure that, for instance, two accounts with the same user name can't be created at the same time. They essentially lock data that is being read or updated by an operation, so that another operation can't change the data mid-transaction or read data that is about to become outdated.
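The user-name case can be illustrated with a short sketch. In CQL, a conditional write of this kind is expressed by adding an IF NOT EXISTS clause to an INSERT statement. The Python below only simulates that check-then-write behavior inside a single process, using an ordinary lock; it is not how Cassandra coordinates the operation across nodes, and the names UserTable, insert_if_not_exists and race_for_username are hypothetical.

```python
import threading


class UserTable:
    """Toy in-memory table with a conditional insert (hypothetical names)."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def insert_if_not_exists(self, username, email):
        """Insert the row only if the user name is still free.

        Returns True if the write was applied, False if the name was taken.
        """
        with self._lock:  # the check and the write happen as one atomic step
            if username in self._rows:
                return False
            self._rows[username] = email
            return True


def race_for_username(table, username, contenders):
    """Let several threads try to claim the same user name at once."""
    results = []
    results_lock = threading.Lock()

    def claim(email):
        applied = table.insert_if_not_exists(username, email)
        with results_lock:
            results.append((email, applied))

    threads = [threading.Thread(target=claim, args=(e,)) for e in contenders]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

However the threads are scheduled, exactly one contender sees its write applied; the rest are told the name is already taken and can ask the user to pick another.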

The Cassandra project team found that the lack of support for lightweight transactions in Cassandra had motivated some users to run two databases instead of one.

Such users had split off the most frequently accessed portions of their relational databases to run under Cassandra for speedier performance. But they didn't migrate their entire databases to Cassandra because of concerns around lock management. Others used an external locking mechanism such as Apache ZooKeeper, which brought a new set of complexities.

The new version of Cassandra also reintroduces an old database concept called triggers, a form of stored procedure. Two decades ago, triggers were used with traditional databases to centralize calculations in the database itself, in order to improve the consistency of results across the different applications that used the database.

Triggers lost favor over time, however, because running them could slow down the database. Many organizations therefore moved their calculations out to application servers, where they could be scaled more easily.

Cassandra has an advantage over relational databases, however, in that it is a distributed database, which means executing triggers no longer has to be a performance bottleneck; if performance slows, another Cassandra node can be added.

"So when it makes sense to have logic closer to the data, you can do that. It has come full circle to where that makes sense again," Ellis said.

The development team also endeavored to ease the jobs of developers building applications on top of Cassandra, by replacing a notoriously difficult-to-use API (application programming interface) with the Cassandra Query Language (CQL), which resembles the widely understood SQL used for relational databases.

Cassandra has offered CQL since January, but the 2.0 release has corrected some minor issues, making it ready for full production use.

With CQL, "we're comfortable with telling people that this is what you should be building your applications with," Ellis said. "You don't need to use that old API that everyone complained was too difficult to use. You can use CQL. The learning curve is dramatically lower."

Also, as with any major release, Cassandra 2.0 sheds a lot of obsolete functionality. Upgrading to version 2.0 requires an existing installation of at least version 1.2, a requirement that allowed the project's maintainers to clean up old code and drop little-used or problematic code.

The updated software also includes a number of tweaks to speed performance.

It includes a new method of compaction, one that prevents read performance from suffering when large amounts of data are being written to disk. Message processing latencies have been reduced thanks to an implementation of the Disruptor high-performance inter-thread messaging library. The number of timeouts can also be reduced thanks to a new technique of sending redundant requests to other nodes if too much time passes before the original request is fulfilled.
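The redundant-request idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Cassandra's implementation: the replicas are plain callables standing in for nodes, and the function name read_with_speculative_retry and the wait_before_retry parameter are made up for the example.

```python
import concurrent.futures as cf


def read_with_speculative_retry(replicas, key, wait_before_retry):
    """Ask the first replica; if it is slow, also ask the next one.

    `replicas` is a non-empty list of callables that take a key
    (hypothetical stand-ins for nodes). `wait_before_retry` is the number
    of seconds to wait before sending a redundant request. Returns the
    first result to arrive; an error from that replica is re-raised.
    """
    if not replicas:
        raise ValueError("at least one replica is required")

    pool = cf.ThreadPoolExecutor(max_workers=len(replicas))
    try:
        pending = set()
        for replica in replicas:
            pending.add(pool.submit(replica, key))
            # Give the requests sent so far a bounded time to answer.
            done, _ = cf.wait(
                pending,
                timeout=wait_before_retry,
                return_when=cf.FIRST_COMPLETED,
            )
            if done:
                return next(iter(done)).result()
        # Every replica has been asked; wait for whichever answers first.
        done, _ = cf.wait(pending, return_when=cf.FIRST_COMPLETED)
        return next(iter(done)).result()
    finally:
        # Do not block the caller on requests that are still in flight.
        pool.shutdown(wait=False)
```

When the first replica answers promptly, no second request is sent. The extra load is paid only when a node is slow, and the caller receives an answer from the backup instead of waiting for a timeout.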

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com
