Saving money by using Amazon EC2 Spot Instances to execute EMR job flows

Amazon EC2 Spot Instances allow you to purchase underutilized EC2 compute resources at a significant discount. The prices of Spot Instances change depending on demand. You submit a bid for Spot Instances and receive the requested compute instances if your bid exceeds the current Spot Instance price. Amazon bills these instances based on the actual Spot Instance price, which can be lower than your bid. Amazon terminates your instances if the Spot Instance price exceeds your bid. However, Amazon does not charge for a partial Spot Instance hour if Amazon terminated your instances. You can find more information on Amazon EC2 Spot Instances at http://aws.amazon.com/ec2/spot-instances/.
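
The billing rules above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not part of the Amazon documentation: the spot price is sampled once per hour, an instance keeps running while the price is at or below the bid, and the function name simulate_spot_billing is hypothetical.

```python
from typing import List, Tuple


def simulate_spot_billing(bid: float, hourly_prices: List[float]) -> Tuple[float, int, bool]:
    """Simulate the cost of one Spot Instance over a series of hourly spot prices.

    Returns (total_cost, full_hours_billed, terminated_by_amazon).
    Assumption: one price sample per hour; real spot prices can change at any time.
    """
    total_cost = 0.0
    hours_billed = 0
    for price in hourly_prices:
        if price > bid:
            # The spot price exceeded the bid: Amazon terminates the instance,
            # and the interrupted partial hour is not charged.
            return total_cost, hours_billed, True
        # The hour is billed at the spot price, not at the (possibly higher) bid.
        total_cost += price
        hours_billed += 1
    return total_cost, hours_billed, False


if __name__ == "__main__":
    cost, hours, terminated = simulate_spot_billing(0.10, [0.03, 0.04, 0.12, 0.03])
    print(f"billed hours: {hours}, cost: {cost:.2f}, terminated: {terminated}")
```

With a bid of 0.10 and the prices shown, the first two hours are billed at the spot prices of 0.03 and 0.04, and the instance is terminated in the third hour when the price rises to 0.12.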

Amazon EMR supports using Spot Instances as both master and worker compute instances. Spot Instances are ideal for executing non-time-critical computations such as batch jobs.

How to do it...

The following steps show you how to use Amazon EC2 Spot Instances with Amazon Elastic MapReduce to execute the WordCount MapReduce application.

  1. Follow steps 1 to 8 of the Running Hadoop MapReduce computations using Amazon Elastic MapReduce (EMR) recipe.
  2. Configure your EMR job flow to use Spot Instances in the Configure EC2 Instances tab (see step 9 of the Running Hadoop MapReduce computations using Amazon Elastic MapReduce (EMR) recipe).
  3. In the Configure EC2 Instances tab, select the Request Spot Instances checkboxes next to the Instance Type drop-down boxes under Master Instance Group and Core Instance Group.
  4. Specify your bid price in the Spot Bid Price textboxes. You can find the Spot Instance pricing history in the Spot Requests window of the Amazon EC2 console (https://console.aws.amazon.com/ec2).
  5. Follow steps 10 to 12 of the Running Hadoop MapReduce computations using Amazon Elastic MapReduce (EMR) recipe.
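
The console steps above could also be expressed programmatically. The following is a minimal sketch, not the method used in this recipe, that builds a request in the shape accepted by the run_job_flow call of the boto3 EMR client. The job flow name, the S3 paths, the instance type, and the helper names are hypothetical placeholders. Account-specific settings such as IAM roles and the EMR release are omitted.

```python
from typing import Any, Dict


def spot_instance_group(role: str, instance_type: str, count: int, bid_price: str) -> Dict[str, Any]:
    """Describe one instance group that requests Spot Instances."""
    return {
        "Name": f"{role.lower()}-spot",
        "InstanceRole": role,      # "MASTER", "CORE" or "TASK"
        "Market": "SPOT",          # request Spot Instances instead of on-demand
        "BidPrice": bid_price,     # maximum price in USD per instance-hour, as a string
        "InstanceType": instance_type,
        "InstanceCount": count,
    }


def build_wordcount_request(bid_price: str) -> Dict[str, Any]:
    """Build a job flow request that runs WordCount on Spot Instances.

    All bucket names and paths below are hypothetical placeholders.
    """
    return {
        "Name": "wordcount-on-spot",
        "Instances": {
            "InstanceGroups": [
                spot_instance_group("MASTER", "m1.small", 1, bid_price),
                spot_instance_group("CORE", "m1.small", 2, bid_price),
            ],
            # Shut the cluster down once the WordCount step has finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [
            {
                "Name": "wordcount",
                "ActionOnFailure": "TERMINATE_JOB_FLOW",
                "HadoopJarStep": {
                    "Jar": "s3://my-bucket/jars/wordcount.jar",
                    "Args": ["s3://my-bucket/input", "s3://my-bucket/output"],
                },
            }
        ],
    }


if __name__ == "__main__":
    request = build_wordcount_request("0.10")
    for group in request["Instances"]["InstanceGroups"]:
        print(group["InstanceRole"], group["Market"], group["BidPrice"])
```

Assuming boto3 is installed and credentials are configured, the request could be completed with the settings your account requires and then submitted with boto3.client("emr").run_job_flow(**request).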

There's more...

You can also run EMR computations on a combination of traditional EC2 on-demand instances and EC2 Spot Instances, safeguarding your computation against possible Spot Instance terminations.
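
One possible layout for such a combination is sketched below, using the instance group format of the boto3 EMR client. Keeping the master and core groups on-demand and placing only an additional task group on Spot Instances is an assumption made for illustration, as are the group names, instance type, and counts.

```python
from typing import Any, Dict, List


def mixed_instance_groups(bid_price: str, task_count: int = 4) -> List[Dict[str, Any]]:
    """Return instance groups that mix on-demand and Spot Instances.

    Assumption: master and core nodes stay on-demand so that a Spot
    termination only removes extra task capacity.
    """
    on_demand = {"Market": "ON_DEMAND", "InstanceType": "m1.small"}
    return [
        {"Name": "master", "InstanceRole": "MASTER", "InstanceCount": 1, **on_demand},
        {"Name": "core", "InstanceRole": "CORE", "InstanceCount": 2, **on_demand},
        {
            "Name": "task-spot",
            "InstanceRole": "TASK",
            "Market": "SPOT",
            "BidPrice": bid_price,  # USD per instance-hour, as a string
            "InstanceType": "m1.small",
            "InstanceCount": task_count,
        },
    ]


if __name__ == "__main__":
    for group in mixed_instance_groups("0.10"):
        print(group["InstanceRole"], group["Market"], group["InstanceCount"])
```

If the Spot price rises above the bid, only the task group is lost in this layout; the on-demand groups keep running.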

As Amazon bills Spot Instances at the current spot price irrespective of your bid price, it is good practice not to set your bid price too low, which avoids the risk of frequent terminations.

See also

  • The Running Hadoop MapReduce computations using Amazon Elastic MapReduce (EMR) recipe from this chapter.