MarkLogic ties its database to Hadoop for 'Big Data' support

The XML-powered data store specializes in handling unstructured information

You can add MarkLogic to the growing list of database vendors rushing to embrace the open-source Hadoop programming framework for large-scale data processing.

MarkLogic 5, which became generally available on Tuesday, includes a Hadoop connector that will allow customers to "aggregate data inside MarkLogic for richer analytics, while maintaining the advantages of MarkLogic indexes for performance and accuracy," the company said.

MarkLogic is a "real, enterprise-class database, but it uses XML and XQuery instead of SQL, so it's well-suited for certain classes of applications," said analyst Curt Monash of Monash Research. "They have a nice scale-out story and they're dotting some i's and crossing some t's on industrial-strength performance."

The database's calling card has been its ability to manage, index and serve up large amounts of unstructured data, from text documents to media files.

It makes sense for MarkLogic to support Hadoop, Monash said.

"There are some multi-structured data use cases that are an obvious fit for MarkLogic over Hadoop and vice versa," he said. "Any integration lets you straddle them and get broader reach."

For example, an insurance company may have billions of documents that it wants to pull up one by one, running analytics on each, Monash said. "That would be a great use case for the combination," he said, with MarkLogic handling the retrieval and Hadoop the analytics.

The Hadoop tie-in reflects the broader trend around "Big Data," an industry buzzword that refers to the ever-increasing amount of unstructured information from sources apart from traditional enterprise applications, such as social networking sites and sensors.

Meanwhile, another new feature in MarkLogic 5 tries to make the most of the mix of storage customers might have, said CTO Ron Avnur. "We realized people have rotational drives and network-attached storage, and are starting to play more seriously with solid-state. These have different performance profiles."

System administrators tell MarkLogic what storage options are available and where they are, and the system will "do all the optimization." In this way, more frequently used data can be kept in flash, while older or less frequently accessed information is held elsewhere.

The new release also adds dashboards for overseeing multiple MarkLogic clusters. Customers may have development, test and production systems, and "they want to understand what's going on across those," Avnur said.

Also new are tie-ins to the Nagios open-source monitoring framework and Hewlett-Packard's Operations Manager software, as well as an API (application programming interface) that can be used to integrate with other management systems.

In addition, MarkLogic 5 features the ability to keep a "hot copy" of the database in another data center for quick failover in the event of a disaster, as well as a journal-archiving function that allows a database to be restored to a particular point in time.

The company is also rolling out a new version of its developer edition, with the chief change being that customers can now use it in production. It's limited to a single two-CPU node and 40GB of data.

The company is small compared to database giant Oracle, with US$50 million in revenue through the end of last year, but is growing quickly, according to Bill Veiga, vice president of solutions marketing.

It has 275 distinct customers and more than 500 implementations, Veiga added.

Chris Kanaracus covers enterprise software and general technology breaking news for The IDG News Service. Chris's e-mail address is Chris_Kanaracus@idg.com




