In Pictures: 18 essential Hadoop tools for crunching big data

Making the most of this powerful MapReduce platform means mastering a vibrant ecosystem of quickly evolving code


Pig

Once data is stored in nodes in a way that Hadoop can find it, the fun begins. Apache's Pig plows through the data, running code written in its own language, called Pig Latin, which is filled with abstractions for handling the data. This structure steers users toward algorithms that are easy to run in parallel across the cluster.
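To see why such abstractions parallelize well, consider a grouped average. The sketch below is plain Python, not Pig Latin and not how Pig is implemented; the function names and the sample data are hypothetical. It only illustrates the general idea that each node can reduce its own slice of the data to small partial results, which are then merged.

```python
from collections import defaultdict


def partial_sums(records):
    """Per-partition step: one (sum, count) pair per key for a single slice of data."""
    acc = defaultdict(lambda: [0.0, 0])
    for key, value in records:
        acc[key][0] += value
        acc[key][1] += 1
    return {key: (s, c) for key, (s, c) in acc.items()}


def merge(partials):
    """Final step: combine the partial (sum, count) pairs and divide once per key."""
    total = defaultdict(lambda: [0.0, 0])
    for part in partials:
        for key, (s, c) in part.items():
            total[key][0] += s
            total[key][1] += c
    return {key: s / c for key, (s, c) in total.items()}


def grouped_average(partitions):
    """Average per key across partitions, as if each partition lived on its own node."""
    return merge(partial_sums(p) for p in partitions)


# Hypothetical data: two partitions standing in for two nodes.
partitions = [[("a", 1.0), ("b", 4.0)], [("a", 3.0)]]
result = grouped_average(partitions)
```

Because sums and counts can be combined in any order, no partition needs to see another's raw records. That is the kind of algorithm a grouping-and-aggregating language nudges its users toward.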

Pig comes with standard functions for common tasks like averaging data, working with dates, or finding differences between strings. When those aren't enough -- as they often aren't -- you can write your own functions. The image at left shows one elaborate example from Apache's documentation of how you can mix your own code with Pig's to mine the data.
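The shape of that mix, a custom per-record function feeding a standard aggregate, can be sketched in a few lines. This is plain Python for illustration only and does not use Pig's actual function API; `domain_of`, `average_by_domain`, and the sample rows are all hypothetical.

```python
from statistics import mean
from urllib.parse import urlparse


def domain_of(url):
    """Hypothetical custom function: reduce a URL to its lowercase host name."""
    return (urlparse(url).hostname or "").lower()


def average_by_domain(rows):
    """Group (url, milliseconds) rows by host, then apply a standard average."""
    groups = {}
    for url, millis in rows:
        groups.setdefault(domain_of(url), []).append(millis)
    return {domain: mean(values) for domain, values in groups.items()}


# Hypothetical sample rows: (url, response time in milliseconds).
rows = [
    ("http://pig.apache.org/a", 10),
    ("http://PIG.apache.org/b", 30),
    ("http://hadoop.apache.org/", 5),
]
averages = average_by_domain(rows)
```

The custom piece handles only what is specific to the data, here normalizing URLs, while the averaging is left to a stock function.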

The latest version can be found at http://pig.apache.org.
