Amazon Elastic MapReduce
Developer Guide (API Version 2009-11-30)

Using the AWS SDK for Java to Create an Amazon EMR Job Flow

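The AWS SDK for Java provides three packages with Amazon Elastic MapReduce (Amazon EMR) functionality:

   com.amazonaws.services.elasticmapreduce
   com.amazonaws.services.elasticmapreduce.model
   com.amazonaws.services.elasticmapreduce.util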

For more information about these packages, go to the AWS SDK for Java API Reference.

The following example illustrates how the SDK can simplify programming with Amazon EMR. The code sample uses the StepFactory object, a helper class for creating common Amazon EMR step types, to create an interactive Hive job flow with debugging enabled.

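   import com.amazonaws.auth.AWSCredentials;
   import com.amazonaws.auth.BasicAWSCredentials;
   import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient;
   import com.amazonaws.services.elasticmapreduce.model.JobFlowInstancesConfig;
   import com.amazonaws.services.elasticmapreduce.model.RunJobFlowRequest;
   import com.amazonaws.services.elasticmapreduce.model.RunJobFlowResult;
   import com.amazonaws.services.elasticmapreduce.model.StepConfig;
   import com.amazonaws.services.elasticmapreduce.util.StepFactory;

   // accessKey and secretKey are assumed to be String variables defined elsewhere.
   AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey);
   AmazonElasticMapReduceClient emr = new AmazonElasticMapReduceClient(credentials);

   // StepFactory builds the HadoopJarStepConfig objects for common step types.
   StepFactory stepFactory = new StepFactory();

   // Step 1: enable debugging for the job flow.
   StepConfig enableDebugging = new StepConfig()
       .withName("Enable Debugging")
       .withActionOnFailure("TERMINATE_JOB_FLOW")
       .withHadoopJarStep(stepFactory.newEnableDebuggingStep());

   // Step 2: install Hive on the cluster.
   StepConfig installHive = new StepConfig()
       .withName("Install Hive")
       .withActionOnFailure("TERMINATE_JOB_FLOW")
       .withHadoopJarStep(stepFactory.newInstallHiveStep());

   // Describe the job flow: its steps, log location, and instances.
   RunJobFlowRequest request = new RunJobFlowRequest()
       .withName("Hive Interactive")
       .withSteps(enableDebugging, installHive)
       .withLogUri("s3://myawsbucket/")
       .withInstances(new JobFlowInstancesConfig()
           .withEc2KeyName("keypair")
           .withHadoopVersion("0.20")
           .withInstanceCount(5)
           // Keep the cluster running after the steps finish so it can be used interactively.
           .withKeepJobFlowAliveWhenNoSteps(true)
           .withMasterInstanceType("m1.small")
           .withSlaveInstanceType("m1.small"));

   // Submit the request; the result carries the new job flow's ID (result.getJobFlowId()).
   RunJobFlowResult result = emr.runJobFlow(request);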
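
The request in the example above is assembled by chaining "with" calls such as withName and withSteps. As a rough illustration of how that chaining style can work, the following sketch uses hypothetical stand-in classes (SketchStep and SketchRequest, which are not part of the SDK) whose setters return the object itself. It uses only the Java standard library and makes no claim about how the SDK implements its own classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins, NOT the SDK's classes: they only mimic
// the chaining style of the "with" setters used in the example above.
class FluentSketch {

    static class SketchStep {
        private String name;
        private String actionOnFailure;

        // Each "with" setter stores the value and returns this,
        // which is what allows calls to be chained.
        SketchStep withName(String name) {
            this.name = name;
            return this;
        }

        SketchStep withActionOnFailure(String actionOnFailure) {
            this.actionOnFailure = actionOnFailure;
            return this;
        }

        String getName() { return name; }
        String getActionOnFailure() { return actionOnFailure; }
    }

    static class SketchRequest {
        private String name;
        private final List<SketchStep> steps = new ArrayList<>();

        SketchRequest withName(String name) {
            this.name = name;
            return this;
        }

        // Varargs lets callers pass any number of steps in one call.
        SketchRequest withSteps(SketchStep... newSteps) {
            this.steps.addAll(Arrays.asList(newSteps));
            return this;
        }

        String getName() { return name; }
        List<SketchStep> getSteps() { return steps; }
    }

    public static void main(String[] args) {
        SketchStep installHive = new SketchStep()
            .withName("Install Hive")
            .withActionOnFailure("TERMINATE_JOB_FLOW");

        SketchRequest request = new SketchRequest()
            .withName("Hive Interactive")
            .withSteps(installHive);

        System.out.println(request.getName() + " has "
            + request.getSteps().size() + " step(s)");
    }
}
```

Because every setter returns the object it was called on, a whole request can be built in a single expression, which is the style the SDK example follows.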