Returns metadata about a specific document from the most recent Alexa web crawl. The metadata includes the return code, page size, checksum, and the URLs of links, images, frames, and more. The crawl metadata is based on Alexa's current snapshot of the web. Updates occur after each of Alexa's web-wide crawl cycles completes, which takes approximately two months.
Note that this action does not return any traffic data. See the UrlInfo action for traffic data.
The Crawl Action takes the following parameters. Required parameters must be provided for the request to succeed.
| Name | Description | Required |
|---|---|---|
| Action | Set the value to Crawl. | Yes |
| ResponseGroup | The only valid value is MetaData. | Yes |
| Url | Any valid URL. The Url parameter specifies the URL, host, or domain about which you would like to receive information. | Yes |
| Version | Pass in the current version number, 2005-07-11, to ensure that requests succeed even if the API changes in future versions. | No |
| Start | 1-based index of the result at which to start. Note: An empty document is returned if this value exceeds the total number of available results. | No |
| Count | Number of results to return for this request, beginning from the specified Start number (maximum 20). | No |
| Purify | Whether to canonicalize the URL before requesting its data (true or false). The default is true. | No |
| ResponseCodes | Return metadata only for entries that match one of the HTTP response codes in this comma-separated list (for example, 200,302). | No |
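As an illustration of how the parameters above fit together, the following Python sketch assembles the unsigned part of a Crawl query string. The function names `build_crawl_params` and `build_crawl_url` are hypothetical helpers, not part of the service. The sketch deliberately leaves out AWSAccessKeyId, Timestamp, and Signature, because the signing procedure is not described in this section; a real request needs them added before it is sent.

```python
from urllib.parse import urlencode

ENDPOINT = "http://awis.amazonaws.com"  # endpoint as it appears in the example request
MAX_COUNT = 20                          # documented maximum for Count


def build_crawl_params(url, start=None, count=None, purify=None,
                       response_codes=None, version="2005-07-11"):
    """Return the unsigned Crawl parameters as a dict (hypothetical helper)."""
    if start is not None and start < 1:
        raise ValueError("Start is a 1-based index and must be at least 1")
    if count is not None and not 1 <= count <= MAX_COUNT:
        raise ValueError("Count must be between 1 and %d" % MAX_COUNT)

    # Required parameters, plus the recommended Version.
    params = {
        "Action": "Crawl",
        "ResponseGroup": "MetaData",
        "Url": url,
        "Version": version,
    }
    # Optional parameters are sent only when the caller supplies them.
    if start is not None:
        params["Start"] = str(start)
    if count is not None:
        params["Count"] = str(count)
    if purify is not None:
        params["Purify"] = "true" if purify else "false"
    if response_codes:
        params["ResponseCodes"] = ",".join(str(code) for code in response_codes)
    return params


def build_crawl_url(params):
    """Percent-encode the parameters and append them to the endpoint."""
    return ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    params = build_crawl_params("http://alexa.com/", start=1, count=20,
                                response_codes=[200, 302])
    print(build_crawl_url(params))
```

Validating Start and Count on the client side is a design choice of this sketch; the service's own behavior for out-of-range values is only documented for a Start value that exceeds the number of available results.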
The following example shows a Query-style request and its response:
http://awis.amazonaws.com?Action=Crawl&AWSAccessKeyId=[Your AWS Access Key ID]&Signature=[signature]&Timestamp=[timestamp used in signature]&ResponseGroup=MetaData&Url=[Valid URL]&Start=[number to start at]&Count=[Number of results to return]&Purify=[true | false]&ResponseCodes=[200,302]
<aws:CrawlResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
  <aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11">
    <aws:OperationRequest>
      <aws:RequestId>608de633-e4a0-422e-ab7a-517209bc0df2</aws:RequestId>
    </aws:OperationRequest>
    <aws:CrawlResult>
      <aws:Alexa>
        <aws:CrawlData>
          <aws:MetaData>
            <aws:ResultNumber>1</aws:ResultNumber>
            <aws:RequestInfo>
              <aws:OriginalRequest>http://alexa.com:80/</aws:OriginalRequest>
              <aws:IPAddress>64.213.200.100</aws:IPAddress>
              <aws:RequestDate>20070502195602</aws:RequestDate>
              <aws:ContentType>text/html</aws:ContentType>
              <aws:ResponseCode>200</aws:ResponseCode>
              <aws:Length>58319</aws:Length>
              <aws:Language>en.utf-8 0.907 2829</aws:Language>
            </aws:RequestInfo>
            <aws:Checksums>
              <aws:AppearanceChecksum>db16a79395ad7a0774faf065aee9a794</aws:AppearanceChecksum>
              <aws:ContentChecksum>60e305f16781a67a585647efc158d193</aws:ContentChecksum>
            </aws:Checksums>
            <aws:OtherUrls>
              <aws:OtherUrl source="href">www.alexa.com/favicon.ico</aws:OtherUrl>
              <aws:OtherUrl source="src">purl.org/atom/ns</aws:OtherUrl>
            </aws:OtherUrls>
            <aws:Images>
              <aws:Image>client.alexa.com/common/images/alexa.gif</aws:Image>
              <aws:Image>client.alexa.com/common/images/button_search_arrow.gif</aws:Image>
            </aws:Images>
            <aws:Links>
              <aws:Link>
                <aws:LocationURI>www.alexa.com/</aws:LocationURI>
              </aws:Link>
              <aws:Link>
                <aws:Name>Traffic Rankings</aws:Name>
                <aws:LocationURI>alexa.com/site/ds/top_500?qterm=</aws:LocationURI>
              </aws:Link>
            </aws:Links>
          </aws:MetaData>
        </aws:CrawlData>
      </aws:Alexa>
    </aws:CrawlResult>
    <aws:ResponseStatus xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
      <aws:StatusCode>Success</aws:StatusCode>
    </aws:ResponseStatus>
  </aws:Response>
</aws:CrawlResponse>
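A response like the one above can be read with any namespace-aware XML parser. The following Python sketch uses the standard library's ElementTree on a trimmed copy of the sample response. Note that the `aws` prefix is bound to a different namespace on the inner `aws:Response` element than on the outer `aws:CrawlResponse` element, so the lookup uses the inner namespace. The function name `parse_crawl_response` and the dictionary keys are hypothetical, and reading RequestDate as a YYYYMMDDhhmmss timestamp is an assumption based on the sample value, not a documented format.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Namespace bound to the aws prefix on the inner aws:Response element.
NS = {"aws": "http://awis.amazonaws.com/doc/2005-07-11"}

# Trimmed copy of the sample response, kept inline so the sketch is self-contained.
SAMPLE = """<aws:CrawlResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11">
<aws:CrawlResult><aws:Alexa><aws:CrawlData>
<aws:MetaData>
<aws:ResultNumber>1</aws:ResultNumber>
<aws:RequestInfo>
<aws:OriginalRequest>http://alexa.com:80/</aws:OriginalRequest>
<aws:RequestDate>20070502195602</aws:RequestDate>
<aws:ResponseCode>200</aws:ResponseCode>
<aws:Length>58319</aws:Length>
</aws:RequestInfo>
<aws:Links>
<aws:Link><aws:LocationURI>www.alexa.com/</aws:LocationURI></aws:Link>
<aws:Link><aws:Name>Traffic Rankings</aws:Name>
<aws:LocationURI>alexa.com/site/ds/top_500?qterm=</aws:LocationURI></aws:Link>
</aws:Links>
</aws:MetaData>
</aws:CrawlData></aws:Alexa></aws:CrawlResult>
</aws:Response>
</aws:CrawlResponse>"""


def parse_crawl_response(xml_text):
    """Return one dict per aws:MetaData entry (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    results = []
    for meta in root.iterfind(".//aws:MetaData", NS):
        info = meta.find("aws:RequestInfo", NS)
        # Links without an aws:Name element get an empty name.
        links = [
            (link.findtext("aws:Name", default="", namespaces=NS),
             link.findtext("aws:LocationURI", default="", namespaces=NS))
            for link in meta.iterfind("aws:Links/aws:Link", NS)
        ]
        results.append({
            "url": info.findtext("aws:OriginalRequest", namespaces=NS),
            "response_code": int(info.findtext("aws:ResponseCode", namespaces=NS)),
            "length": int(info.findtext("aws:Length", namespaces=NS)),
            # Assumed format: YYYYMMDDhhmmss, inferred from the sample value.
            "requested_at": datetime.strptime(
                info.findtext("aws:RequestDate", namespaces=NS), "%Y%m%d%H%M%S"),
            "links": links,
        })
    return results


if __name__ == "__main__":
    for entry in parse_crawl_response(SAMPLE):
        print(entry["url"], entry["response_code"], len(entry["links"]))
```

The same approach extends to the `aws:Images` and `aws:OtherUrls` lists. The `aws:ResponseStatus` element switches back to the outer namespace, so reading `aws:StatusCode` would need a second prefix mapping.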