In Pictures: 18 essential Hadoop tools for crunching big data

Making the most of this powerful MapReduce platform means mastering a vibrant ecosystem of quickly evolving code


ZooKeeper

Once Hadoop runs on more than a few machines, making sense of the cluster requires order, especially when some of the machines start checking out.

ZooKeeper imposes a file-system-like hierarchy on the cluster and stores all of the metadata for the machines so you can synchronize their work. (The image at left shows a simple two-tiered cluster.) The documentation shows how to implement many of the standard techniques for data processing, such as producer-consumer queues, so the data is chopped, cleaned, sifted, and sorted in the right order. The nodes use ZooKeeper to signal each other when they're done so the others can start work on the data.
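To make the producer-consumer idea concrete, here is a minimal sketch in Python. It does not talk to a real ZooKeeper server and does not use any real client library: `TinyZNodeTree`, `produce`, and `consume` are hypothetical names invented for illustration, and the tree is just an in-memory dictionary. What it tries to mirror is the general shape of the queue recipe: producers create sequentially numbered child nodes under a queue path, and a consumer takes the child with the lowest number.

```python
class TinyZNodeTree:
    """In-memory stand-in for a ZooKeeper-style hierarchy (hypothetical, not a real client)."""

    def __init__(self):
        self._data = {"/": b""}   # path -> payload
        self._counters = {}       # parent path -> next sequence number

    def create(self, path, data=b"", sequential=False):
        # Like a file system, a node can only be created under an existing parent.
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self._data:
            raise KeyError(f"parent {parent} does not exist")
        if sequential:
            # Append a zero-padded, per-parent counter so names sort in creation order.
            n = self._counters.get(parent, 0)
            self._counters[parent] = n + 1
            path = f"{path}{n:010d}"
        if path in self._data:
            raise KeyError(f"{path} already exists")
        self._data[path] = data
        return path

    def get_children(self, path):
        # Return the names of direct children only, sorted.
        prefix = path.rstrip("/") + "/"
        names = (p[len(prefix):] for p in self._data if p.startswith(prefix))
        return sorted(name for name in names if name and "/" not in name)

    def get(self, path):
        return self._data[path]

    def delete(self, path):
        del self._data[path]


def produce(tree, queue_path, payload):
    """Producer side: add one work item as a sequential child of the queue node."""
    return tree.create(f"{queue_path}/item-", payload, sequential=True)


def consume(tree, queue_path):
    """Consumer side: remove and return the oldest item, or None if the queue is empty."""
    children = tree.get_children(queue_path)
    if not children:
        return None
    oldest = f"{queue_path}/{children[0]}"
    payload = tree.get(oldest)
    tree.delete(oldest)
    return payload


tree = TinyZNodeTree()
tree.create("/queue")
for step in (b"chop", b"clean", b"sift", b"sort"):
    produce(tree, "/queue", step)

first = consume(tree, "/queue")
second = consume(tree, "/queue")
```

Because the sequence numbers are zero-padded, sorting the child names as plain strings is enough to recover the order in which the items were produced. A real deployment would also have to handle several consumers racing for the same item and machines dropping out mid-task, which this sketch deliberately ignores.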

For more information, documentation, and the latest builds, turn to http://zookeeper.apache.org/.
