Amazon Elastic MapReduce
Developer Guide (API Version 2009-11-30)

Protecting a Job Flow from Termination

Termination protection ensures that the Amazon EC2 instances in your job flow are not shut down by accident or error. This protection is especially useful if your cluster contains data in ephemeral instance storage that you need to recover before those instances are terminated.

By default, termination protection is disabled on job flows. When termination protection is not enabled, you can terminate job flows through calls to the TerminateJobFlows API, through the Amazon EMR console, or by using the command-line interface (CLI). In addition, the master node may terminate a task node that has become unresponsive or has returned an error.

When termination protection is enabled, neither calls to TerminateJobFlows nor users working from the Amazon EMR console or CLI can terminate the job flow. If you attempt to terminate a protected job flow, the API returns an error, and the CLI exits with a non-zero return code. In addition, if an error occurs, the job flow ends, but the Amazon EC2 instances persist. Furthermore, if the ActionOnFailure flag for the job flow has been set to “terminate and close,” enabling termination protection changes the job flow’s ActionOnFailure behavior to “close and wait.”
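The ActionOnFailure change described above can be sketched as a small lookup. This is only an illustrative model, not Amazon EMR code: the function name is hypothetical, and the strings are the descriptive phrases used in this section rather than API enumeration values.

```python
def effective_action_on_failure(action_on_failure: str,
                                termination_protected: bool) -> str:
    """Return the failure behavior this section describes for a job flow.

    Hypothetical helper: the labels are the phrases used in the text,
    not values accepted by the API.
    """
    # With protection enabled, "terminate and close" is downgraded so that
    # the Amazon EC2 instances persist after a failure.
    if termination_protected and action_on_failure == "terminate and close":
        return "close and wait"
    # Any other setting is left as configured.
    return action_on_failure
```

For example, a protected job flow configured with "terminate and close" behaves as "close and wait", while the same setting on an unprotected job flow is unchanged.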

Before an instance in a protected job flow can be terminated, you must explicitly remove termination protection from the job flow. For more information about termination protection, go to SetTerminationProtection in the Amazon Elastic MapReduce API Reference.

Note

Use job flow termination protection judiciously because it can lead to additional charges for the persistent Amazon EC2 instances.

Termination Protection in Amazon EMR and Amazon EC2

Termination protection of job flows in Amazon Elastic MapReduce (Amazon EMR) is analogous to setting the disableAPITermination flag on an Amazon EC2 instance. In the event of a conflict between the termination protection set in Amazon EC2 and that set in Amazon EMR, the Amazon EMR job flow protection status overrides that set by Amazon EC2 on the given instance. For example, if you use the Amazon EC2 console to enable termination protection on an Amazon EC2 instance in an Amazon EMR job flow that has termination protection disabled, Amazon EMR will turn off termination protection on that Amazon EC2 instance and shut down the instance when the rest of the job flow terminates.

Termination Protection and Spot Instances

Amazon EMR termination protection does not prevent an Amazon EC2 Spot Instance from terminating when the spot price rises above the maximum bid price. For more information about the behavior of Amazon EC2 Spot Instances in Amazon EMR, go to Lowering Costs with Spot Instances.

Termination Protection and Keep Alive

Enabling termination protection on a job flow is similar to enabling keep alive on a job flow (using the --alive argument in the CLI), but the protections each offers are different. Keep alive causes instances in a job flow to persist after the job flow has successfully completed, but still allows the job flow to be terminated by user actions, calls to TerminateJobFlows, and errors. Termination protection allows the job flow to terminate after successful completion, but keeps it running in the case of user actions, errors, and TerminateJobFlows calls.

The following table compares the protections offered by termination protection and keep alive.

Protects against termination from...   Termination Protection   Keep Alive
Successful completion                  No                       Yes
User actions                           Yes                      No
TerminateJobFlows API                  Yes                      No
Errors                                 Yes                      No
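The comparison above can also be read as a simple decision rule. The following is a minimal sketch of that rule as described in this section; the enum and function names are hypothetical, and the model deliberately ignores Spot Instance terminations, which termination protection does not prevent.

```python
from enum import Enum


class Cause(Enum):
    """Reasons a job flow might be shut down, as listed in the comparison."""
    SUCCESSFUL_COMPLETION = "successful completion"
    USER_ACTION = "user action"
    TERMINATE_JOB_FLOWS_API = "TerminateJobFlows API"
    ERROR = "error"


def instances_persist(cause: Cause,
                      termination_protected: bool = False,
                      keep_alive: bool = False) -> bool:
    """Return True if the instances survive the given cause (illustrative model)."""
    # Keep alive only covers the case where all steps finish successfully.
    if cause is Cause.SUCCESSFUL_COMPLETION:
        return keep_alive
    # User actions, API calls, and errors are covered only by
    # termination protection.
    return termination_protected
```

Under this model, a job flow that should survive every cause in the comparison needs both options enabled.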
 

Protecting a New Job Flow

You can specify that a new job flow be protected from termination during the job flow creation.

You can do this in the CLI by specifying --with-termination-protection during a job flow creation call. This is shown in the following:

elastic-mapreduce --create --alive --instance-type m1.xlarge --num-instances 2 --stream --input s3://elasticmapreduce/samples/wordcount/input --output s3://myawsbucket/wordcount/output/2011-03-25 --mapper s3://elasticmapreduce/samples/wordcount/wordSplitter.py --reducer aggregate --with-termination-protection

Alternatively, you can use the RunJobFlow API and specify a request like the following, in which the Instances.TerminationProtected request argument specifies that the job flow be created with termination protection.

https://elasticmapreduce.amazonaws.com?Operation=RunJobFlow
&Name=MyJobFlowName
&LogUri=s3n%3A%2F%2Fmybucket%2Fsubdir
&Instances.MasterInstanceType=m1.small
&Instances.SlaveInstanceType=m1.small
&Instances.InstanceCount=4
&Instances.Ec2KeyName=myec2keyname
&Instances.Placement.AvailabilityZone=us-east-1a
&Instances.KeepJobFlowAliveWhenNoSteps=true
&Instances.TerminationProtected=true
&Steps.member.1.Name=MyStepName
&Steps.member.1.ActionOnFailure=CONTINUE
&Steps.member.1.HadoopJarStep.Jar=MyJarFile
&Steps.member.1.HadoopJarStep.MainClass=MyMainClass
&Steps.member.1.HadoopJarStep.Args.member.1=arg1
&Steps.member.1.HadoopJarStep.Args.member.2=arg2
&AuthParams
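The instance-related part of the request above could be assembled programmatically along the following lines. This is a rough sketch using only the Python standard library: the function name is hypothetical, the step parameters are omitted for brevity, and the result is unsigned, so the authentication parameters (AuthParams in the request above) would still have to be appended before the request could be sent.

```python
from urllib.parse import urlencode

ENDPOINT = "https://elasticmapreduce.amazonaws.com"


def run_job_flow_url(name: str, log_uri: str, instance_type: str,
                     instance_count: int,
                     termination_protected: bool = True,
                     keep_alive: bool = True) -> str:
    """Build an unsigned RunJobFlow query URL (hypothetical helper)."""
    params = [
        ("Operation", "RunJobFlow"),
        ("Name", name),
        ("LogUri", log_uri),
        ("Instances.MasterInstanceType", instance_type),
        ("Instances.SlaveInstanceType", instance_type),
        ("Instances.InstanceCount", str(instance_count)),
        # Booleans are sent as lowercase strings, as in the request above.
        ("Instances.KeepJobFlowAliveWhenNoSteps", str(keep_alive).lower()),
        ("Instances.TerminationProtected", str(termination_protected).lower()),
    ]
    # urlencode percent-encodes values such as the s3n:// log URI.
    return ENDPOINT + "?" + urlencode(params)


url = run_job_flow_url("MyJobFlowName", "s3n://mybucket/subdir", "m1.small", 4)
```

Passing termination_protected=False produces the same request without protection, which matches the default behavior described at the start of this section.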
		

Protecting an Existing Job Flow

You can add termination protection to an already running job flow.

To use the CLI to enable termination protection on an existing job flow

  • Run the following:

    elastic-mapreduce --set-termination-protection true --jobflow JobFlowID

To use the API to enable termination protection on an existing job flow

  • Call SetTerminationProtection with a request like the following:

    https://elasticmapreduce.amazonaws.com?Operation=SetTerminationProtection
    &JobFlowId=JobFlowID
    &TerminationProtected=true
    					

The JobFlowID is the identifier of the job flow on which to enable termination protection in both of the preceding examples.
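As an illustration, the SetTerminationProtection request shown above could be generated with a few lines of Python. The helper name and the job flow identifier j-EXAMPLE are hypothetical, the parameter names simply mirror the preceding request, and the URL is unsigned, so authentication parameters would still need to be added.

```python
from urllib.parse import urlencode

ENDPOINT = "https://elasticmapreduce.amazonaws.com"


def set_termination_protection_url(job_flow_id: str, protected: bool) -> str:
    """Build an unsigned SetTerminationProtection query URL (hypothetical helper)."""
    query = urlencode([
        ("Operation", "SetTerminationProtection"),
        ("JobFlowId", job_flow_id),
        # The flag is sent as the lowercase string "true" or "false".
        ("TerminationProtected", "true" if protected else "false"),
    ])
    return f"{ENDPOINT}?{query}"


# "j-EXAMPLE" is a placeholder, not a real job flow identifier.
enable_url = set_termination_protection_url("j-EXAMPLE", True)
```

The same helper covers both directions: passing False yields the request used to remove protection.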

Terminating a Protected Job Flow

If you want to terminate a protected job flow, you must first explicitly disable termination protection. After termination protection is disabled, you can terminate the job flow from the Amazon EMR console, CLI, or programmatically using the TerminateJobFlows API.

To use the CLI to disable termination protection for an existing job flow

  • Run the following:

    elastic-mapreduce --set-termination-protection false --jobflow JobFlowID

To use the API to disable termination protection for an existing job flow

  • Call SetTerminationProtection and set the TerminationProtected flag to false.

    https://elasticmapreduce.amazonaws.com?Operation=SetTerminationProtection
    &JobFlowId=JobFlowID
    &TerminationProtected=false
    		

The JobFlowID is the identifier of the job flow on which to disable termination protection in both of the preceding examples.
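The two-step order described in this section (disable protection first, then terminate) might be sketched as follows. The helper name and the identifier j-EXAMPLE are hypothetical, and the JobFlowIds.member.1 parameter for TerminateJobFlows is an assumption based on the list-style parameters used elsewhere in this guide; check the API Reference before relying on it. Both URLs are unsigned.

```python
from urllib.parse import urlencode

ENDPOINT = "https://elasticmapreduce.amazonaws.com"


def termination_requests(job_flow_id: str) -> list:
    """Return the unsigned request URLs, in the order they must be sent."""
    # Step 1: remove termination protection from the job flow.
    disable = urlencode([
        ("Operation", "SetTerminationProtection"),
        ("JobFlowId", job_flow_id),
        ("TerminationProtected", "false"),
    ])
    # Step 2: terminate the now-unprotected job flow.
    # Assumption: TerminateJobFlows takes a list parameter named JobFlowIds.
    terminate = urlencode([
        ("Operation", "TerminateJobFlows"),
        ("JobFlowIds.member.1", job_flow_id),
    ])
    return [f"{ENDPOINT}?{disable}", f"{ENDPOINT}?{terminate}"]


# "j-EXAMPLE" is a placeholder, not a real job flow identifier.
requests_in_order = termination_requests("j-EXAMPLE")
```

Sending the second request before the first would result in the error described earlier for protected job flows.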