Amazon Elastic MapReduce
Based on Hadoop, Elastic MapReduce equips users with potent distributed data-processing tools
- Doesn't take long to get the hang of
- Currently available in the US region only
You'll want to be familiar with the Apache Hadoop framework before you jump into Elastic MapReduce. It doesn't take long to get the hang of it, though. Most developers can have a MapReduce application running within a few hours.
These two steps, the map function and the reduce function, make up what Amazon Elastic MapReduce refers to as a "job flow." Admittedly, this is an oversimplification: job flows involve other configuration parameters (such as where you get the input data and where you put the output), and you can define additional steps in the process. But that's the basic idea.
As a result, a programmer building a Hadoop-powered MapReduce system can focus on the comparatively simple job of crafting the individual functions that process a single key/value pair at a time. Hadoop does the legwork of carving the input data into initial key/value pairs; starting multiple map function instances; feeding them input data; gathering, sorting, and ordering the intermediate key/value pairs; launching reduce instances; feeding them the properly arranged intermediate data; and, finally, delivering the output. And all the while, Hadoop monitors the progress of the map and reduce tasks and restarts "dead" ones automatically. Whuf.
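
To make that division of labor concrete, here is a minimal single-process sketch in Python. It is not Hadoop and uses no Hadoop API: the names map_fn, reduce_fn, and run_job are hypothetical, and run_job merely imitates, in memory and on one machine, the carving, grouping, sorting, and feeding that Hadoop performs across a cluster. Word counting is assumed as the example task.

```python
from collections import defaultdict
from itertools import chain


def map_fn(_key, line):
    """Map step (hypothetical example): emit (word, 1) for each word in a line."""
    for word in line.split():
        yield word.lower(), 1


def reduce_fn(word, counts):
    """Reduce step (hypothetical example): total the counts for one word."""
    yield word, sum(counts)


def run_job(lines, mapper, reducer):
    """Imitate, in one process, the legwork the framework does for you."""
    # Carve the input into initial key/value pairs: (line number, line text).
    initial_pairs = enumerate(lines)
    # Feed every pair to the map function and collect the intermediate pairs.
    intermediate = chain.from_iterable(mapper(k, v) for k, v in initial_pairs)
    # Gather the intermediate values under their keys.
    grouped = defaultdict(list)
    for key, value in intermediate:
        grouped[key].append(value)
    # Feed each key and its values to the reduce function in sorted key order.
    output = []
    for key in sorted(grouped):
        output.extend(reducer(key, grouped[key]))
    return output


if __name__ == "__main__":
    for word, count in run_job(["the quick fox", "The lazy dog"], map_fn, reduce_fn):
        print(f"{word}\t{count}")
```

The only parts a programmer would write for a real job are the two small functions at the top; everything inside run_job stands in for work the framework takes off your hands, minus the distribution, monitoring, and restarting of failed tasks.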
Hadoop in the cloud
To access Amazon's Elastic MapReduce, your first stop is your Amazon Web Services account page (assuming you have an account with AWS), where you must sign up for the Elastic MapReduce service. Then, head on over to the AWS Management Console and log in. You'll find that the AWS Console -- which had been a control panel for Amazon's EC2 only -- displays a new Amazon Elastic MapReduce tab. Click the tab, and you are transferred to the Job Flows page, from which you can monitor the status of current job flows, as well as examine details of previous (terminated) job flows.
To define a new job flow, click the Create New Job Flow button. This sends you through a series of windows in step-by-step fashion. You fill in textboxes to define the location of your input data, where you want your output data, and the paths to your map and reduce functions. All of these locations must exist in Amazon S3 buckets. In the case of the output data, the location will exist when the job flow concludes. Consequently, it's a good idea to have a utility for transferring data to and from S3 on hand. I recommend the excellent S3Fox Organizer.
Amazon Elastic MapReduce allows for two kinds of job flows: custom jar and streaming. A custom jar-style job flow expects your map and reduce functions to be in compiled Java classes stored in Java JAR files. The Hadoop framework is Java-based, so a custom jar job flow provides better performance. On the other hand, a streaming-type job flow lets you write your map and reduce functions in non-Java languages such as Python, Ruby, and Perl. The functions of a streaming job flow read the input data from stdin and send the output to stdout. So, data flows in and out of the functions as strings, and, by convention, a tab separates the key and value of each input line.
Once you've specified the whereabouts of your job flow's components, you identify the quantity and processing power of the EC2 instances on which the job will execute. You can select up to 20 EC2 instances; any more than that, and you have to fill out a special request form. Your choice of compute instances ranges from Small to Extra Large High CPU. Check the Amazon documentation for a complete description of the power of each instance type.
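
As an illustration of the streaming convention described above (strings in on stdin, strings out on stdout, a tab between key and value), here is a minimal word-count sketch in Python. The function names map_lines and reduce_lines, and the idea of choosing the role with a "map" or "reduce" command-line argument, are assumptions made for this example, not anything Elastic MapReduce prescribes. The reducer also assumes that its input lines arrive sorted by key, which is the ordering work the framework does between the two steps.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: for each word on each input line, emit 'word<TAB>1'."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts per word; input must already be sorted by key."""
    # Split each non-blank 'key<TAB>value' line at the first tab only.
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# Hypothetical convention: pass "map" or "reduce" as the first argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

Saved under a name of your choosing (say, wordcount.py), the script can be tried locally before uploading anything to S3 by letting a shell pipeline play the framework's part: python wordcount.py map < input.txt | sort | python wordcount.py reduce.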