AWS Import/Export
Developer Guide (API Version 2010-06-03)

Examples of Import Manifest Options

This section describes the commonly used import manifest options. Not all options are supported for both Amazon S3 and Amazon Elastic Block Store (Amazon EBS). If an example does not apply to both, the availability is mentioned at the beginning of the description for the example. For a list of all manifest options and their availability, see Manifest File Options.

Requesting Device Erase After Import

You can ask AWS Import/Export to erase the contents of your storage device after the data is uploaded. You might choose to do this to safeguard the data during return shipment. AWS overwrites all writable blocks on your device with zeros, so you will need to repartition and format the device after we return it to you. To request a device erase, add the eraseDevice option to your import manifest.

eraseDevice: Yes
Note

You are charged data-loading rates for the system time required to erase your data.

For more information, see Import to Amazon S3 Manifest File Options.

Excluding Files and Directories

For Amazon S3 import jobs, only.

You can optionally ask AWS Import/Export not to import some of the files or directories on your storage device. This lets you keep a directory structure intact while avoiding the upload of unwanted files and directories. You make this request in the manifest using the ignore option, which specifies the directories, files, or file types on your storage device that you do not want us to load. To specify naming patterns, use standard Java regular expressions. For information about Java regular expressions, go to http://download.oracle.com/javase/tutorial/essential/regex/. Examples of Java regular expressions commonly used in a manifest are given below.

Excluding Files

You can specify Java regular expressions in the ignore manifest option to exclude files with a specific suffix. The following example uses the ignore option with two Java regular expressions to exclude files whose names end with a tilde (~) or with .swp.

ignore:
  -  .*~$ 
  -  .*\.swp$

The following ignore option excludes from the import all files on the storage device whose names end in .psd or .PSD.

ignore:  
  - \.psd$ 
  - \.PSD$          

The log report includes all ignored files, including the SIGNATURE file you added at the root of your storage device.
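
To check a pattern before you put it in the manifest, you can try it against sample file names. The following Python sketch is one way to do that, under two assumptions that this guide does not confirm: that each expression is searched for anywhere in the file path (a partial match, which is what the unanchored \.psd$ example suggests), and that Python's re module behaves like the Java regular expression engine for these simple patterns. The is_ignored helper and the sample paths are hypothetical.

```python
import re


def is_ignored(path, patterns):
    """Return True if any pattern is found in path (assumed partial-match semantics)."""
    return any(re.search(pattern, path) for pattern in patterns)


# Patterns copied from the manifest examples above.
editor_leftovers = [r".*~$", r".*\.swp$"]
photoshop_files = [r"\.psd$", r"\.PSD$"]

# Hypothetical file paths, relative to the root of the storage device.
for path in ["docs/notes.txt~", "src/.main.c.swp", "docs/notes.txt"]:
    print(path, is_ignored(path, editor_leftovers))

for path in ["art/logo.psd", "art/logo.PSD", "art/logo.Psd", "art/logo.psd.bak"]:
    print(path, is_ignored(path, photoshop_files))
```

Under these assumptions, a mixed-case name such as logo.Psd is not excluded, because the two patterns cover only the all-lowercase and all-uppercase spellings.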

Excluding Directories

The following ignore option specifies that the backup directory at the root of your storage device will be excluded from the import.

ignore:
  - ^backup/

Important

When specifying a path that includes a file separator, for example, images/myImages/sampleImage.jpg, make sure to use a forward slash (/), not a backslash (\).

The following ignore option causes all the content in the images/myImages directory to be excluded from the import.

ignore: 
  - ^images/myImages/
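
The leading caret (^) in these expressions anchors the match to the start of the path, which is why only the directory at the root of the device is excluded. The short Python sketch below illustrates the effect. It assumes that paths are matched relative to the device root with no leading slash, and it uses Python's re module as a stand-in for Java regular expressions; the sample paths are hypothetical.

```python
import re

pattern = r"^backup/"  # pattern from the manifest example above

# Hypothetical paths, relative to the root of the storage device.
paths = [
    "backup/2011/db.dump",  # inside the root-level backup directory
    "data/backup/db.dump",  # a nested directory that happens to be named backup
    "backup.txt",           # a file whose name merely starts with "backup"
]

skipped = [path for path in paths if re.search(pattern, path)]
print(skipped)
```

Only the first path is skipped. To exclude a backup directory at any depth, you would need a different expression, for example one without the leading caret.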

Excluding Recycle Bin

Many storage devices include recycle bins, and you may not want to upload them during the import process. To skip the recycle bin on Windows computers, specify the following ignore option. The first regular expression applies to NTFS file systems formatted for Windows Vista and Windows 7. The second applies to NTFS file systems on Windows 2000, Windows XP, and Windows NT. The third applies to the FAT file system.

ignore: 
  - ^\$Recycle\.bin/
  - ^RECYCLER/
  - ^RECYCLED/

Excluding Lost+Found

The Java regular expression in the following ignore statement prevents the lost+found directory from being uploaded.

ignore: 
  - ^lost\+found/
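
The backslashes in the recycle bin and lost+found expressions matter because $, . and + are regular expression metacharacters. The following Python sketch shows what happens when the escapes are left out. It uses Python's re module as a stand-in for Java regular expressions (the two agree on these constructs) and assumes partial-match semantics; the sample paths are hypothetical.

```python
import re

lost_found_path = "lost+found/inode-1234"
recycle_path = "$Recycle.bin/S-1-5-21/file.txt"

# Escaped forms, as shown in the manifest examples above.
escaped_lost_found = re.search(r"^lost\+found/", lost_found_path) is not None
escaped_recycle = re.search(r"^\$Recycle\.bin/", recycle_path) is not None

# Unescaped "+" means "one or more t", so the literal plus sign never matches.
unescaped_lost_found = re.search(r"^lost+found/", lost_found_path) is not None

# Unescaped "$" asserts end of input, so nothing can follow it.
unescaped_recycle = re.search(r"^$Recycle.bin/", recycle_path) is not None

print(escaped_lost_found, escaped_recycle)
print(unescaped_lost_found, unescaped_recycle)
```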

Handling Gzip Files

For Amazon S3 import jobs, only.

When importing gzip compressed files, you can optionally specify the setContentEncodingForGzFiles option in the manifest with the value set to yes. This adds the Content-Encoding header, in addition to the Content-Type header, when the gzip compressed files are uploaded. The Content-Encoding header helps most browsers render these files correctly.

setContentEncodingForGzFiles: yes

Additionally, if the file has been compressed with gzip, the .gz or .gzip extension is ignored when setting the Content-Type header.

For example, if setContentEncodingForGzFiles is set to "yes", the gzip compressed file, text1.html.gz, would be uploaded with the following HTTP headers:

  • Content-Encoding: gzip

  • Content-Type: text/html

The gzip compressed file, text2.html, would be uploaded with the following HTTP headers:

  • Content-Encoding: gzip

  • Content-Type: text/html

The non-compressed file text3.html would be uploaded with the following HTTP headers:

  • Content-Type: text/html

Note

When setContentEncodingForGzFiles is set to yes, only files that are gzip compressed will contain a Content-Encoding header. We look at the first few bytes of every imported file to see whether it is gzip compressed. If it is, the file gets the Content-Encoding header regardless of its file extension.

The gzip compressed file text.gzip would be uploaded with the following HTTP headers using the defaultContentType specified in the manifest file:

  • Content-Encoding: gzip

  • Content-Type: binary/octet-stream
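
The header rules in this section can be summarized in a short Python sketch. It is an illustration, not the service's implementation: the headers_for function, the two-entry EXTENSION_TYPES table, and the binary/octet-stream default are assumptions made for the example. The only detection rule used is the standard gzip signature, the two bytes 0x1f 0x8b at the start of a file. The sketch models only the case where setContentEncodingForGzFiles is set to yes.

```python
import gzip
import os

GZIP_MAGIC = b"\x1f\x8b"  # standard gzip signature bytes

# Hypothetical, deliberately tiny extension table for the example.
EXTENSION_TYPES = {".html": "text/html", ".txt": "text/plain"}


def headers_for(name, first_bytes, default_content_type="binary/octet-stream"):
    """Sketch of the headers described above, with setContentEncodingForGzFiles: yes."""
    is_gzip = first_bytes[:2] == GZIP_MAGIC

    lookup_name = name
    if is_gzip:
        root, ext = os.path.splitext(name)
        if ext.lower() in (".gz", ".gzip"):
            lookup_name = root  # ignore the gzip extension for Content-Type

    ext = os.path.splitext(lookup_name)[1].lower()
    headers = {"Content-Type": EXTENSION_TYPES.get(ext, default_content_type)}
    if is_gzip:
        headers["Content-Encoding"] = "gzip"
    return headers


compressed = gzip.compress(b"<html></html>")
plain = b"<html></html>"

print(headers_for("text1.html.gz", compressed))
print(headers_for("text2.html", compressed))
print(headers_for("text3.html", plain))
print(headers_for("text.gzip", compressed))
```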

Expediting the Return of Your Storage Device

We offer expedited shipping to U.S. addresses after loading data into US Region buckets. To request expedited shipping of your storage device, add the serviceLevel manifest option with the value expeditedShipping.

serviceLevel: expeditedShipping

For additional information regarding return shipping fees and services, go to the AWS Import/Export Calculator.

Specifying Customs Related Manifest Options

When shipping devices internationally, except within the European Union, you must include the customs option in the manifest. For more information about the customs related manifest options, see Customs Manifest File Options.

Setting ACL on Imported Amazon S3 Objects

For Amazon S3 import jobs, only.

When importing data to Amazon S3, the permissions on the imported objects are set to private. You can use the acl manifest option to specify a different access control list (ACL) for the imported objects. The following manifest option sets the ACL value on the uploaded objects to public-read.

acl: public-read

For more information, see Import to Amazon S3 Manifest File Options.

Specifying Key Prefix

For Amazon S3 import jobs, only.

The AWS Import/Export prefix mechanism allows you to create a logical grouping of the objects in a bucket. The prefix value is similar to a directory name that enables you to store similar data under the same directory in a bucket. For example, if your Amazon S3 bucket name is my-bucket, and you set prefix to my-prefix/, and the file on your storage device is /jpgs/sample.jpg, then sample.jpg would be loaded to http://s3.amazonaws.com/my-bucket/my-prefix/jpgs/sample.jpg. If the prefix is not specified, sample.jpg would be loaded to http://s3.amazonaws.com/my-bucket/jpgs/sample.jpg. You can specify a prefix by adding the prefix option in the manifest.

Important

We do not include a forward slash (/) automatically. If you don't include the slash at the end of the value for prefix, the value is concatenated directly to the file name. For example, if your prefix is images and you import the file sample.jpg, your key would become imagessample.jpg instead of images/sample.jpg.

prefix: my-prefix/
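
The key naming described above is plain string concatenation. The following Python sketch reproduces the examples from this section; the object_key function is hypothetical, and dropping the leading slash of the device path is an assumption inferred from the URLs shown above.

```python
def object_key(prefix, device_path):
    """Build the assumed Amazon S3 key: the prefix followed by the device path.

    No slash is inserted between the two parts.
    """
    return prefix + device_path.lstrip("/")


with_prefix = object_key("my-prefix/", "/jpgs/sample.jpg")
without_prefix = object_key("", "/jpgs/sample.jpg")
missing_slash = object_key("images", "sample.jpg")

print(with_prefix)
print(without_prefix)
print(missing_slash)
```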

Specifying Log Prefix

The AWS Import/Export process generates a log file. The log file name always ends with the phrase import-log- followed by your JobId. There is a remote chance that you already have an object with this name. To avoid a key collision, you can add an optional prefix to the log file by adding the logPrefix option in the manifest. AWS Import/Export takes the string value specified for this option and inserts it between the bucket name and log report name. The following manifest option sets the prefix for the log key.

logPrefix: logs/

For example, if your job ID is 53TX4, the log file is saved to http://s3.amazonaws.com/mybucket/logs/import-log-53TX4.

Note

We do not include a forward slash (/) automatically. If you don't include the slash at the end of the value for logPrefix, the value is concatenated directly to the log file name. For example, if your logPrefix is logs, the log file key would become logsimport-log-JobId instead of logs/import-log-JobId.

logPrefix + import-log-JOBID cannot be longer than 1024 bytes. If it is, AWS Import/Export returns an InvalidManifestField error from the CreateJob action.
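
You can check the length limit before you submit the manifest. The following Python sketch builds the log key the way this section describes and rejects values over 1024 bytes. The log_key function is hypothetical, and measuring the length as UTF-8 bytes is an assumption; the actual validation is performed by the CreateJob action.

```python
MAX_LOG_KEY_BYTES = 1024  # limit stated in this section


def log_key(log_prefix, job_id):
    """Return logPrefix + "import-log-" + job ID, or raise if the key is too long."""
    key = f"{log_prefix}import-log-{job_id}"
    size = len(key.encode("utf-8"))  # assumed encoding for the byte count
    if size > MAX_LOG_KEY_BYTES:
        raise ValueError(f"log key is {size} bytes; the limit is {MAX_LOG_KEY_BYTES}")
    return key


print(log_key("logs/", "53TX4"))
print(log_key("logs", "53TX4"))
```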

Mapping Uppercase Characters to Lowercase Characters

For Amazon S3 import jobs, only.

The substitutions manifest option allows you to specify rules for naming the object keys when importing to Amazon S3 and the file names when exporting objects to a file system. In an import job, you can define rules that replace every uppercase character in your file names with the equivalent lowercase character in the object names. To do this, list every letter of the alphabet (you need to specify each one), with the uppercase letter on the left side of the option parameter and the lowercase letter on the right side. In the following example, "..." represents the characters between C and Y:

substitutions:
    "A" : "a"
    "B" : "b"
    "C" : "c"
       ...
    "Y" : "y"
    "Z" : "z"

For more information, see the substitutions option in Common Manifest File Options.
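
Because each letter must be listed, you might prefer to generate the block rather than type it. The following Python sketch prints all 26 rules in the layout shown above; the lowercase_substitutions function is hypothetical, and the four-space indent simply mirrors the example.

```python
import string


def lowercase_substitutions(indent="    "):
    """Return a substitutions block that maps A-Z to a-z, one rule per line."""
    lines = ["substitutions:"]
    for upper in string.ascii_uppercase:
        lines.append(f'{indent}"{upper}" : "{upper.lower()}"')
    return "\n".join(lines)


block = lowercase_substitutions()
print(block)
```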

Mapping File Directories to the Amazon S3 Root

For Amazon S3 import jobs, only.

Amazon S3 performs well even when there are millions of files in the same bucket. To import your data efficiently into Amazon S3 using AWS Import/Export, you might decide to eliminate your subdirectories. If you name your directories carefully, such that none of the names of the directories are substrings of your file names, you can use the substitutions manifest option to remove the directory from the key name. The following example assumes you have a directory structure that divides your data across the three subdirectories, ZZ1, ZZ2, ZZ3 in your file system.

ZZ1/
ZZ2/
ZZ3/

To remove the directory name from the Amazon S3 key names, define the following substitutions option in your manifest file:

substitutions:
    "ZZ1/" : ""
    "ZZ2/" : ""
    "ZZ3/" : ""

All of the files will be stored in the Amazon S3 bucket root.

Important

None of the files within the subdirectories should contain the substitutions strings in their file names (such as "ZZ1/", "ZZ2/", or "ZZ3/").

If two files have the same name, both files are uploaded to Amazon S3, but you will only retain the bytes of the last file transferred.

Use the forward slash (/) as the file separator character. Don't use the backslash (\) or double backslash (\\).
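
Because files with the same resulting key overwrite one another, it can be worth checking for collisions before you ship the device. The following Python sketch applies the substitutions to a list of paths and reports the keys that more than one file would map to. It assumes that each rule is a literal string replacement applied to the whole path; the function names and sample paths are hypothetical.

```python
from collections import defaultdict


def apply_substitutions(path, substitutions):
    """Apply each rule as a literal replacement (assumed behavior)."""
    for old, new in substitutions.items():
        path = path.replace(old, new)
    return path


def find_collisions(paths, substitutions):
    """Return {resulting key: [source paths]} for keys produced by several files."""
    sources_by_key = defaultdict(list)
    for path in paths:
        sources_by_key[apply_substitutions(path, substitutions)].append(path)
    return {key: srcs for key, srcs in sources_by_key.items() if len(srcs) > 1}


substitutions = {"ZZ1/": "", "ZZ2/": "", "ZZ3/": ""}
paths = ["ZZ1/a.jpg", "ZZ2/b.jpg", "ZZ3/a.jpg"]  # hypothetical device contents

collisions = find_collisions(paths, substitutions)
print(collisions)
```

In this sample, ZZ1/a.jpg and ZZ3/a.jpg both map to the key a.jpg, so only one of them would survive the import.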