Amazon Elastic MapReduce
Developer Guide (API Version 2009-11-30)

JSON Configuration Files

When Amazon Elastic MapReduce (Amazon EMR) creates a Hadoop cluster, it places a pair of JSON files on each node. These files contain configuration information about the node and the currently running job flow. They are in the /mnt/var/lib/info directory and are accessible to scripts running on the node.
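A script running on a node can read these files with any JSON parser. The following is a minimal Python sketch of that idea, not an official Amazon EMR utility. The helper name `load_info` and its `info_dir` parameter are hypothetical and exist only for illustration; the directory and the file names instance.json and job-flow.json are the ones this page describes.

```python
import json
import os

# Directory where Amazon EMR places the files, as stated above.
INFO_DIR = "/mnt/var/lib/info"


def load_info(name, info_dir=INFO_DIR):
    """Return the parsed contents of one JSON file, or None if it is missing.

    `load_info` is a hypothetical helper name used for illustration only.
    """
    path = os.path.join(info_dir, name)
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)


if __name__ == "__main__":
    # On a cluster node this prints both documents; elsewhere it prints None.
    for name in ("instance.json", "job-flow.json"):
        print(name, load_info(name))
```

Passing `info_dir` explicitly makes it possible to try the helper against copies of the files on a machine that is not part of a cluster.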

Node Settings

Settings for an Elastic MapReduce cluster node are contained in the instance.json file.

The following table describes the contents of the instance.json file.

Parameter    Description
isMaster

Indicates that this node is the master node.

Type: Boolean

isRunningNameNode

Indicates that this node is running the Hadoop name node daemon.

Type: Boolean

isRunningDataNode

Indicates that this node is running the Hadoop data node daemon.

Type: Boolean

isRunningJobTracker

Indicates that this node is running the Hadoop job tracker daemon.

Type: Boolean

isRunningTaskTracker

Indicates that this node is running the Hadoop task tracker daemon.

Type: Boolean

The following example shows the contents of an instance.json file:

{
     "instanceGroupId":"Instance_Group_ID",
            "isMaster": Boolean,
   "isRunningNameNode": Boolean,
   "isRunningDataNode": Boolean,
 "isRunningJobTracker": Boolean,
"isRunningTaskTracker": Boolean
}
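As an illustration of how a script might use these flags, the following Python sketch summarizes a parsed instance.json document. The helper name `describe_node` and all sample values (including the instance group ID) are made up for this example; on a real node the document would be read from /mnt/var/lib/info/instance.json.

```python
import json

# Daemon flags documented in the table above, mapped to readable labels.
DAEMON_FLAGS = {
    "isRunningNameNode": "name node",
    "isRunningDataNode": "data node",
    "isRunningJobTracker": "job tracker",
    "isRunningTaskTracker": "task tracker",
}


def describe_node(instance):
    """Summarize a parsed instance.json dict (hypothetical helper).

    Returns a (role, daemons) pair, where daemons lists the labels of
    every flag that is set to true.
    """
    role = "master" if instance.get("isMaster", False) else "non-master"
    daemons = [label for key, label in DAEMON_FLAGS.items() if instance.get(key)]
    return role, daemons


# Illustrative values only; these are not taken from a real cluster.
sample = json.loads("""
{
  "instanceGroupId": "ig-EXAMPLE",
  "isMaster": true,
  "isRunningNameNode": true,
  "isRunningDataNode": false,
  "isRunningJobTracker": true,
  "isRunningTaskTracker": false
}
""")

print(describe_node(sample))
```

Missing keys are treated as false here, which is a design choice of this sketch rather than documented behavior.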

Example: Identifying settings in the JSON file using a bootstrap action

This example demonstrates how to run the echo command to display the string "Running on master node" only on the master node, by evaluating the JSON file parameter instance.isMaster.

If you are using Linux or UNIX, enter the following:

$ ./elasticmapreduce --create --alive --name "RunIf" \
--bootstrap-action s3://elasticmapreduce/bootstrap-actions/run-if \
--bootstrap-name "Run only on master" \
--args "instance.isMaster=true,echo,'Running on master node'"

If you are using Microsoft Windows, enter the following:

c:\> ruby elasticmapreduce --create --alive --name "RunIf" --bootstrap-action s3://elasticmapreduce/bootstrap-actions/run-if --bootstrap-name "Run only on master" --args "instance.isMaster=true,echo,'Running on master node'"
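To make the idea behind the condition argument concrete, the following Python sketch evaluates an expression such as instance.isMaster=true against a parsed JSON document and runs a command only when it holds. This is not the run-if script itself. The function names `condition_holds` and `run_if`, and the reading of the condition as "prefix.key=value", are assumptions inferred from the example argument above.

```python
import json
import subprocess


def condition_holds(condition, documents):
    """Evaluate a condition such as "instance.isMaster=true".

    `documents` maps a prefix (for example "instance") to a parsed JSON dict.
    The "<prefix>.<key>=<value>" reading is an assumption inferred from the
    example argument, not the documented grammar of the run-if script.
    """
    left, expected = condition.split("=", 1)
    prefix, key = left.split(".", 1)
    actual = documents.get(prefix, {}).get(key)
    expected = expected.strip()
    if isinstance(actual, str):
        # Strings are compared as-is.
        return actual == expected
    # Other values are compared by their JSON text, so True matches "true".
    return json.dumps(actual) == expected


def run_if(condition, command, documents):
    """Run `command` (a list of arguments) only when the condition holds.

    Returns the command's exit status, or None when it was skipped.
    """
    if condition_holds(condition, documents):
        return subprocess.call(command)
    return None


# Illustrative document; on a node it would come from instance.json.
docs = {"instance": {"isMaster": True}}
print(condition_holds("instance.isMaster=true", docs))
```

On a node where isMaster is false, `run_if` in this sketch returns None without starting the command.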

Job Flow Configuration

Information about the currently running job flow is contained in the job-flow.json file.

The following table describes the contents of the job-flow.json file.

Parameter    Description
jobFlowId

Contains the ID for the job flow.

Type: String

jobFlowCreationInstant

Contains the time that the job flow was created.

Type: Long

instanceCount

Contains the number of nodes in an instance group.

Type: Integer

masterInstanceId

Contains the ID for the master node.

Type: String

masterPrivateDnsName

Contains the private DNS name of the master node.

Type: String

masterInstanceType

Contains the Amazon EC2 instance type of the master node.

Type: String

slaveInstanceType

Contains the Amazon EC2 instance type of the slave nodes.

Type: String

hadoopVersion

Contains the version of Hadoop running on the cluster.

Type: String

instanceGroups

A list of objects, one for each instance group in the cluster. Each object contains the following fields:

instanceGroupId: the unique identifier for this instance group.

Type: String

instanceGroupName: the user-defined name of the instance group.

Type: String

instanceRole: one of MASTER, CORE, or TASK.

Type: String

instanceType: the Amazon EC2 instance type of the nodes, such as "m1.small".

Type: String

requestedInstanceCount: the target number of nodes for this instance group.

Type: Long
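The instanceGroups list can be traversed like any other JSON array. As a minimal sketch, the following Python code adds up the requested node count for each role. The helper name `requested_nodes_by_role` is hypothetical, and the sample identifiers, names, and counts are made up.

```python
from collections import defaultdict


def requested_nodes_by_role(job_flow):
    """Sum requestedInstanceCount per instanceRole (hypothetical helper)."""
    totals = defaultdict(int)
    for group in job_flow.get("instanceGroups", []):
        totals[group["instanceRole"]] += group["requestedInstanceCount"]
    return dict(totals)


# Illustrative values only; these are not taken from a real cluster.
sample_job_flow = {
    "instanceGroups": [
        {"instanceGroupId": "ig-1", "instanceGroupName": "Master group",
         "instanceRole": "MASTER", "instanceType": "m1.small",
         "requestedInstanceCount": 1},
        {"instanceGroupId": "ig-2", "instanceGroupName": "Core group",
         "instanceRole": "CORE", "instanceType": "m1.small",
         "requestedInstanceCount": 4},
        {"instanceGroupId": "ig-3", "instanceGroupName": "Task group",
         "instanceRole": "TASK", "instanceType": "m1.small",
         "requestedInstanceCount": 2},
    ]
}

print(requested_nodes_by_role(sample_job_flow))
```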

The following example shows the contents of a job-flow.json file:

{
             "jobFlowId":"JobFlowID",
"jobFlowCreationInstant": CreationInstant,
         "instanceCount": Count,
      "masterInstanceId":"MasterInstanceID",
  "masterPrivateDnsName":"Name",
    "masterInstanceType":"Amazon_EC2_Instance_Type",
     "slaveInstanceType":"Amazon_EC2_Instance_Type",
         "hadoopVersion":"Version",
        "instanceGroups":
            [
                {
                 "instanceGroupId":"InstanceGroupID",
               "instanceGroupName":"Name",
                    "instanceRole":"MASTER",
                      "marketType":"Type",
                    "instanceType":"AmazonEC2InstanceType",
          "requestedInstanceCount": Count
                },
                {
                 "instanceGroupId":"InstanceGroupID",
               "instanceGroupName":"Name",
                    "instanceRole":"CORE",
                      "marketType":"Type",
                    "instanceType":"AmazonEC2InstanceType",
          "requestedInstanceCount": Count
                },
                {
                 "instanceGroupId":"InstanceGroupID",
               "instanceGroupName":"Name",
                    "instanceRole":"TASK",
                      "marketType":"Type",
                    "instanceType":"AmazonEC2InstanceType",
          "requestedInstanceCount": Count
                }
           ]
}
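The example above is a template with placeholders. To show how a script might consume a concrete document, the following Python sketch parses a sample with made-up values and converts jobFlowCreationInstant to a date. It assumes the Long value counts milliseconds since the Unix epoch, which this page does not state; the helper name `creation_time` and every value in the sample are hypothetical.

```python
import json
from datetime import datetime, timezone


def creation_time(job_flow):
    """Convert jobFlowCreationInstant to a UTC datetime (hypothetical helper).

    Assumption: the Long value counts milliseconds since the Unix epoch.
    The documentation above gives the type but not the unit.
    """
    millis = job_flow["jobFlowCreationInstant"]
    return datetime.fromtimestamp(millis / 1000.0, tz=timezone.utc)


# Illustrative values only; these are not taken from a real cluster.
SAMPLE = """
{
  "jobFlowId": "j-EXAMPLE",
  "jobFlowCreationInstant": 1262304000000,
  "instanceCount": 3,
  "masterInstanceId": "i-EXAMPLE",
  "masterPrivateDnsName": "ip-10-0-0-1.ec2.internal",
  "masterInstanceType": "m1.small",
  "slaveInstanceType": "m1.small",
  "hadoopVersion": "0.20"
}
"""

job_flow = json.loads(SAMPLE)
print(job_flow["jobFlowId"], job_flow["masterPrivateDnsName"])
print(creation_time(job_flow).isoformat())
```

If the value turns out to be in seconds on a given cluster, the division by 1000 should be dropped.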