Amazon Elastic MapReduce
Developer Guide (API Version 2009-11-30)

Generating a Query Request Using AWS Ruby Gems

Because AWS releases new features on a continual basis, there may be times when a new API is not available as a wrapper function in the SDK. When that happens, you can use the SDK to sign and send a raw Query request to the web service to access the API. Using the SDK to create the Query request gives you access to functions that simplify the process of formatting and signing the Query. The following code uses the Ruby SDK to create a Query request.

Before you begin, you'll need Ruby (version 1.8.7 or later) and RubyGems (version 1.3.6 or later) installed. You can check whether both are installed, and which versions you have, with the following commands.

ruby -v
gem -v
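
The same check can also be made from inside Ruby. The following is a minimal sketch, assuming only the minimum versions quoted above; the helper name meets_minimum? and the two constants are illustrative and not part of the SDK.

```ruby
require 'rubygems'

# Minimum versions quoted in the text above.
MIN_RUBY = Gem::Version.new('1.8.7')
MIN_RUBYGEMS = Gem::Version.new('1.3.6')

# Hypothetical helper: true when a version string meets the given minimum.
def meets_minimum?(actual, minimum)
  Gem::Version.new(actual) >= minimum
end

ruby_ok = meets_minimum?(RUBY_VERSION, MIN_RUBY)
gems_ok = meets_minimum?(Gem::VERSION, MIN_RUBYGEMS)

puts "Ruby #{RUBY_VERSION}: #{ruby_ok ? 'ok' : 'too old'}"
puts "RubyGems #{Gem::VERSION}: #{gems_ok ? 'ok' : 'too old'}"
```

Gem::Version compares version segments numerically, so '1.10.0' correctly ranks above '1.8.7', which a plain string comparison would get wrong.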
		

Next, you'll need to install the aws-sdk gem. You can do this with the following command.

sudo gem install aws-sdk		
		

Finally, make sure that your AWS security credentials are stored in a JSON file named credentials.json. For information on how to create and format this file, see Create a Credentials File.

Create a file named send_raw_request.rb, containing the following script.

#!/usr/bin/env ruby
require 'rubygems'
require 'aws-sdk'
require 'cgi'
require 'net/http'
require 'credentials'

class EMRClient
  
  def initialize(secret_key, access_key)
    @secret_key = secret_key
    @access_key = access_key
  end
  
  # Builds the Signature Version 2 string to sign from a request URL.
  def signable_string(request_string)
    uri = URI.parse(request_string)
    host = uri.host
    lines = uri.query.split('&')
    path = uri.path
    path = "/" if path == ""
    curtime = CGI::escape(Time.now.gmtime.strftime("%Y-%m-%dT%H:%M:%S"))
    lines.insert(0,"Timestamp=#{curtime}")
    lines.insert(0,"AWSAccessKeyId=#{CGI::escape(@access_key)}")
    lines.insert(0,"SignatureMethod=HmacSHA256")
    lines.insert(0,"SignatureVersion=2")
    # Drop any AuthParams placeholder. reject! is used because deleting
    # from an array while iterating over it with each can skip elements.
    lines.reject! { |line| line =~ /AuthParams/ }
    # Signature Version 2 expects the parameters sorted in byte order.
    lines = lines.sort
    "GET\n#{host}\n#{path}\n" + lines.join('&')
  end
  
  def send_request(request_string)
    # Print the string to sign for debugging.
    puts '--------------'
    puts request_string
    puts '--------------'

    signer = AWS::DefaultSigner.new(@access_key, @secret_key)
    signature = CGI::escape(signer.sign(request_string))
    # The last line of the string to sign is the query string; append the
    # signature to it to form the final request.
    request = (request_string + "&Signature=#{signature}").split("\n")[-1]

    puts '--------------'
    puts request
    puts '--------------'
    Net::HTTP.get_print('elasticmapreduce.amazonaws.com', '/?' + request)
  end

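  def read_request(file_path)
    input = ""
    # Join the lines of the request file into a single URL. The block form
    # of File.open closes the file when the block exits.
    File.open(file_path) do |infile|
      while (inline = infile.gets)
        input += inline.strip
      end
    end
    signable = signable_string input
    send_request(signable)
  end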

end

creds = Credentials.new
options = {}
creds.parse_credentials('credentials.json', options)
client = EMRClient.new(options[:aws_secret_key], options[:aws_access_id])
client.read_request(ARGV[0])
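
The script above delegates signing to the SDK's signer. To illustrate what Signature Version 2 signing involves, the following sketch computes a signature using only the Ruby standard library. It is not the SDK's implementation: the helper names (v2_escape, v2_string_to_sign, v2_signature) and the placeholder keys are assumptions made for this example.

```ruby
require 'openssl'
require 'cgi'

# Percent-encodes a value the way Signature Version 2 expects (RFC 3986):
# spaces become %20 rather than '+', and '~' is left unencoded.
def v2_escape(value)
  CGI.escape(value.to_s).gsub('+', '%20').gsub('%7E', '~')
end

# Builds the string to sign for a GET request. params is a Hash with
# String keys holding unescaped parameter names and values.
def v2_string_to_sign(host, path, params)
  canonical = params.sort.map do |name, value|
    "#{v2_escape(name)}=#{v2_escape(value)}"
  end.join('&')
  "GET\n#{host.downcase}\n#{path.empty? ? '/' : path}\n#{canonical}"
end

# Returns the Base64-encoded HMAC-SHA256 of the string to sign.
def v2_signature(secret_key, string_to_sign)
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new('sha256'),
                                secret_key, string_to_sign)
  [digest].pack('m0') # Base64 without line breaks
end

# Placeholder credentials, for illustration only.
params = {
  'Action'           => 'RunJobFlow',
  'Name'             => 'MyJobFlowName',
  'AWSAccessKeyId'   => 'EXAMPLE-ACCESS-KEY',
  'SignatureMethod'  => 'HmacSHA256',
  'SignatureVersion' => '2',
  'Timestamp'        => Time.now.utc.strftime('%Y-%m-%dT%H:%M:%S')
}
to_sign = v2_string_to_sign('elasticmapreduce.amazonaws.com', '', params)
puts to_sign
puts v2_signature('EXAMPLE-SECRET-KEY', to_sign)
```

Before the request is sent, the resulting signature still has to be URL-encoded and appended as the Signature parameter, as send_request does in the script.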
		

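The input to the script is a text file containing a Query request such as the following. The script strips each line and joins the lines into a single URL, so the request can be split across lines for readability. Parameter values must already be URL-encoded, as LogUri is in this example.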

https://elasticmapreduce.amazonaws.com?Action=RunJobFlow
&Instances.Ec2KeyName=myec2keyname
&Instances.HadoopVersion=0.20
&Instances.InstanceGroups.member.1.InstanceType=m1.small
&Instances.InstanceGroups.member.1.InstanceCount=1
&Instances.InstanceGroups.member.1.BidPrice=.25
&Instances.InstanceGroups.member.1.InstanceRole=MASTER
&Instances.InstanceGroups.member.1.Market=SPOT
&Instances.InstanceGroups.member.2.InstanceType=m1.small
&Instances.InstanceGroups.member.2.InstanceCount=1
&Instances.InstanceGroups.member.2.BidPrice=.03
&Instances.InstanceGroups.member.2.InstanceRole=CORE
&Instances.InstanceGroups.member.2.Market=SPOT
&Instances.InstanceGroups.member.3.InstanceType=m1.small
&Instances.InstanceGroups.member.3.InstanceCount=1
&Instances.InstanceGroups.member.3.BidPrice=.03
&Instances.InstanceGroups.member.3.InstanceRole=TASK
&Instances.InstanceGroups.member.3.Market=SPOT
&Instances.KeepJobFlowAliveWhenNoSteps=true 
&Instances.Placement.AvailabilityZone=us-east-1a
&Instances.TerminationProtected=true
&LogUri=s3n%3A%2F%2Fmybucket%2Fsubdir
&Name=MyJobFlowName 
&Steps.member.1.ActionOnFailure=CONTINUE
&Steps.member.1.HadoopJarStep.Args.member.1=arg1
&Steps.member.1.HadoopJarStep.Args.member.2=arg2 
&Steps.member.1.HadoopJarStep.Jar=MyJarFile
&Steps.member.1.HadoopJarStep.MainClass=MyMainClass
&Steps.member.1.Name=MyStepName
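
A request file in this layout can also be generated rather than written by hand. The following sketch, which assumes a hypothetical helper named build_request_text, URL-encodes each value and writes one parameter per line. The parameter names and values are taken from the example above.

```ruby
require 'cgi'

# Hypothetical helper: formats name/value pairs as request-file text, with
# the endpoint and first parameter on line one and one "&Name=value" line
# for each remaining parameter. Values are URL-encoded; spaces become %20.
def build_request_text(endpoint, pairs)
  encoded = pairs.map do |name, value|
    "#{name}=#{CGI.escape(value.to_s).gsub('+', '%20')}"
  end
  "#{endpoint}?#{encoded.join("\n&")}\n"
end

# An array of pairs keeps the parameters in the order written here.
pairs = [
  ['Action', 'RunJobFlow'],
  ['Instances.InstanceGroups.member.1.InstanceType', 'm1.small'],
  ['Instances.InstanceGroups.member.1.InstanceCount', 1],
  ['Instances.InstanceGroups.member.1.InstanceRole', 'MASTER'],
  ['LogUri', 's3n://mybucket/subdir'],
  ['Name', 'MyJobFlowName']
]

puts build_request_text('https://elasticmapreduce.amazonaws.com', pairs)
```

The order of the parameters in the file does not matter to the script, because it sorts them before signing.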

Call the script and pass in the request file as follows:

ruby send_raw_request.rb Sample_request.txt
		

The script signs the request with AWS Signature Version 2 (the version Amazon Elastic MapReduce (Amazon EMR) currently supports), sends it to the web service, and prints the response.