Hadoop gets more search with MapR, Cloudera releases

MapR integrates LucidWorks Search while Cloudera releases its SQL-compliant Impala

Users of the Hadoop data processing platform now have two more search engines to help them sort through their mountains of information.

Hadoop distributor MapR has integrated LucidWorks Search into its own distribution. And Cloudera has launched the first full release of Impala, its open source SQL query engine for Hadoop.

"Using search as the user interface for big data is very interesting. Search is well suited to leveraging a lot of different types of information, especially unstructured information," said Jack Norris, chief marketing officer for MapR. "We're seeing some really interesting applications with search engines at their core, even if a typical user would not think of them as search engine driven."

LucidWorks Search is the commercial version of the open source Apache Lucene/Solr full-text search engine. With the new MapR integration, LucidWorks Search can search through data either on the Hadoop Distributed File System (HDFS) or in files on other file systems.

LucidWorks Search offers snapshots and mirrors for high availability, and eliminates much of the work required to install Lucene/Solr from scratch. It also offers native support for more data sources, a graphical user interface and a security framework.

The search engine could be used in a dynamic Web application to quickly retrieve photos, advertising, product recommendations, and other information that can be used to populate Web sites on the fly. "This isn't a lower cost substitute for data warehouses. This is about leveraging new data sources and doing some things that have a dramatic impact on the business," Norris said.
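As a rough illustration of how a Web application might ask a Solr-based engine for content to populate a page, the sketch below builds a request URL for Solr's standard `/select` query handler. The host, port, collection name (`products`), and field names are hypothetical placeholders, and LucidWorks Search may expose its own endpoints on top of this; treat it as a sketch of the general pattern rather than the product's documented interface.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint: host, port, and collection name are placeholders.
SOLR_BASE = "http://localhost:8983/solr/products"


def build_select_url(text, rows=5, fields=("id", "name")):
    """Build a URL for Solr's /select query handler."""
    params = {
        "q": text,                # main query string
        "rows": rows,             # maximum number of hits to return
        "fl": ",".join(fields),   # stored fields to include in each hit
        "wt": "json",             # ask for a JSON response
    }
    return f"{SOLR_BASE}/select?{urlencode(params)}"


url = build_select_url("running shoes")
print(url)

# Decode the query string again to show what the server would receive.
print(parse_qs(urlsplit(url).query))
```

The application would issue an HTTP GET against this URL and render the returned documents, for example as product recommendations on a page.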

MapR and LucidWorks have been working together on pairing their technologies since 2011, when they formed a joint marketing agreement. Earlier this year, they released a connector that makes it easy to use Lucene/Solr with the MapR Hadoop distribution.

LucidWorks Search works with MapR's newly released M7 distribution, in beta form. In addition to supporting LucidWorks Search, the M7 edition has been re-architected to eliminate compactions and background consistency checks, speeding performance.

Also this week, Cloudera released version 1.0 of Cloudera Impala, an open source SQL-compliant query engine for Hadoop. SQL is the query language used in relational database management systems (RDBMS) and is well known to database administrators.

Impala was designed to execute queries faster than Hadoop's Hive because it doesn't use the MapReduce framework, which requires intermediate results to be written to disk. Instead, users can query data stored in HDFS and HBase directly, either interactively or through batch processes.
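To give a sense of the kind of interactive SQL query described here, the sketch below runs a plain aggregate query. It uses Python's built-in `sqlite3` module with an in-memory table purely as a stand-in, since no Impala cluster is assumed; the `page_views` table and its columns are hypothetical. The `SELECT ... GROUP BY ... ORDER BY` statement is ordinary SQL of the sort a SQL-compliant engine would be expected to accept, but this is not Impala's client API.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a SQL engine.
conn = sqlite3.connect(":memory:")

# Hypothetical table of page-view counts.
conn.execute("CREATE TABLE page_views (url TEXT, country TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [
        ("/home", "US", 120),
        ("/home", "AU", 45),
        ("/pricing", "US", 30),
        ("/pricing", "AU", 15),
        ("/docs", "DE", 70),
    ],
)

# A standard aggregate query: total views per country, largest first.
QUERY = """
SELECT country, SUM(views) AS total_views
FROM page_views
GROUP BY country
ORDER BY total_views DESC
"""

results = conn.execute(QUERY).fetchall()
for country, total in results:
    print(country, total)
```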

Cloudera first released a version of this engine last October as a beta. Since then, the software has been tested by companies such as 37signals and Expedia.

Impala is the core component of the Cloudera Enterprise RTQ (Real-Time Query) supplemental package for the Cloudera Hadoop platform. Impala can be downloaded at no cost.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com
