Amazon Elastic MapReduce
Developer Guide (API Version 2009-11-30)

Initialize Ganglia on a Job Flow

To set up Ganglia monitoring on a job flow, you must specify the Ganglia bootstrap action when you create the job flow. Amazon Elastic MapReduce (Amazon EMR) then installs the monitoring agents and the aggregator that Ganglia uses to report data.

Note

You cannot add Ganglia monitoring to a job flow that is already running.

When you create a new job flow from the CLI, specify the Ganglia bootstrap action by adding the following parameter to your job flow call:

--bootstrap-action s3://elasticmapreduce/bootstrap-actions/install-ganglia

The following command illustrates the use of the bootstrap-action parameter when starting a new job flow. In this example, you start the Word Count sample job flow provided by Amazon EMR and launch three instances.

elastic-mapreduce --create --alive --instance-type m1.xlarge --num-instances 3 \
--bootstrap-action s3://elasticmapreduce/bootstrap-actions/install-ganglia --stream \
--input s3://elasticmapreduce/samples/wordcount/input \
--output s3://myawsbucket/wordcount/output/2012-04-19 \
--mapper s3://elasticmapreduce/samples/wordcount/wordSplitter.py --reducer aggregate

Where myawsbucket is an Amazon S3 bucket that receives the job flow output.
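If you launch this job flow often, it can help to keep the bucket name and output folder in one place. The following is a minimal sketch, not part of Amazon EMR: the function name build_ganglia_jobflow_cmd and its two arguments are hypothetical, and it uses only the options shown in the command above. It prints the assembled command as a dry run rather than running it, so you can review the command before you start a job flow.

```shell
#!/usr/bin/env bash
# Sketch only: build_ganglia_jobflow_cmd is a hypothetical helper,
# not something provided by Amazon EMR or the elastic-mapreduce CLI.

# Bootstrap action location, taken from the parameter shown earlier.
GANGLIA_ACTION="s3://elasticmapreduce/bootstrap-actions/install-ganglia"

build_ganglia_jobflow_cmd() {
  local bucket="$1"    # S3 bucket that receives the job flow output
  local run_date="$2"  # output subfolder, for example 2012-04-19

  # Print every word of the command followed by a space, then a newline.
  printf '%s ' \
    elastic-mapreduce --create --alive \
    --instance-type m1.xlarge --num-instances 3 \
    --bootstrap-action "$GANGLIA_ACTION" --stream \
    --input s3://elasticmapreduce/samples/wordcount/input \
    --output "s3://${bucket}/wordcount/output/${run_date}" \
    --mapper s3://elasticmapreduce/samples/wordcount/wordSplitter.py \
    --reducer aggregate
  printf '\n'
}

# Dry run: show the command for review instead of executing it.
build_ganglia_jobflow_cmd "myawsbucket" "2012-04-19"
```

To start the job flow, copy the printed command into your shell after you confirm that the bucket name and output folder are correct.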