StartSearch

Start an offline search query that will return up to 10,000,000 results in a downloadable text file.

Name | Description | Required

Action

Set the Action parameter to StartSearch to start a search for documents matching the terms in the Query.

Yes

Version

Pass in the version number to ensure that requests succeed even if the API changes in future versions.

Yes

Query

A set of search terms. Phrases should be enclosed in double quotes. Must be URL encoded. See the Query Syntax chapter for more details. The maximum size of the Query string is 120 KB.

Some tips:

  • To limit results to text documents add pagetype:(-irrelevant) to your Query. For example, to search for text documents about cats: Query=cats+pagetype:(-irrelevant)
  • To limit results to documents where the terms appear in the document text, use text:(mysearchterms). Otherwise, a match can also occur in other fields, such as the document title or the anchor text of links pointing to the document. For example, to search for text documents containing the term cats: Query=text:(cats)+pagetype:(-irrelevant)
  • Read the Search Fields Article to learn about all the ways you can filter your results.
  • You cannot deduplicate the results, but you can limit the number of results per site using the SiteThrottle parameter.

Yes
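The encoding described for the Query parameter can be sketched in Python as follows. This is a minimal illustration, not part of the API: the helper name encode_query is hypothetical, and leaving the colon and parentheses unescaped is an assumption based on the example queries shown in the tips. A fully percent-encoded value is also valid URL encoding.

```python
from urllib.parse import quote_plus


def encode_query(terms: str) -> str:
    """URL-encode a search expression for use as the Query parameter value.

    Hypothetical helper. Spaces become '+', double quotes become %22, and
    ':', '(' and ')' are left literal (an assumption, chosen to match the
    style of the example queries in the tips).
    """
    return quote_plus(terms, safe=":()")


# Field-restricted search for text documents about cats.
print("Query=" + encode_query("text:(cats) pagetype:(-irrelevant)"))

# A phrase enclosed in double quotes, as required for phrase searches.
print("Query=" + encode_query('"maine coon" pagetype:(-irrelevant)'))
```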

MaxNumberOfDocuments

Maximum number of lines in the output file. The value must be between 200 and 10000000 (inclusive).

Because you are charged per result returned, MaxNumberOfDocuments also caps the maximum charge for a request.

Yes

CachedDocumentsOnly

Set to true to limit the results to documents that are stored in Alexa's cache. You should set this to true if you want to post-process the results using regular expressions with the StartGrep action. The default value is false, in which case all matching documents are returned.

As of July 2007 the full search index contains about 10 billion documents. The most popular half billion documents are cached and available for post-processing.

No

MaxTime

The maximum runtime in seconds, after which any available results are returned. The default value is 86400 (24 hours). The value must be between 1 and 86400 (inclusive).

A simple query returning one million results usually completes within 30 minutes. Complex queries may take longer.

No

SiteThrottle

Reduces the number of results returned per site. The value must be between 0 and 4 (inclusive). The default value is 0, which means no throttling. Set SiteThrottle=4 for maximum throttling, and thus the fewest results per site.

No
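The limits listed for the request parameters can be checked on the client before a request is sent. The sketch below is only an illustration: the function name build_start_search_params is hypothetical, the Version value is a placeholder supplied by the caller, "120 KB" is assumed to mean 120 * 1024 bytes measured on the unencoded query, and any authentication parameters the service requires are omitted.

```python
from urllib.parse import urlencode

# Assumption: "120 KB" is taken to mean 120 * 1024 bytes of unencoded text.
MAX_QUERY_BYTES = 120 * 1024


def build_start_search_params(query, max_docs, version,
                              cached_only=False, max_time=86400,
                              site_throttle=0):
    """Validate StartSearch parameters and return a URL-encoded query string.

    Hypothetical helper; the ranges are the ones documented in the table.
    """
    if len(query.encode("utf-8")) > MAX_QUERY_BYTES:
        raise ValueError("Query exceeds the 120 KB limit")
    if not 200 <= max_docs <= 10_000_000:
        raise ValueError("MaxNumberOfDocuments must be between 200 and 10000000")
    if not 1 <= max_time <= 86400:
        raise ValueError("MaxTime must be between 1 and 86400 seconds")
    if not 0 <= site_throttle <= 4:
        raise ValueError("SiteThrottle must be between 0 and 4")

    params = {
        "Action": "StartSearch",
        "Version": version,
        "Query": query,
        "MaxNumberOfDocuments": max_docs,
        "CachedDocumentsOnly": "true" if cached_only else "false",
        "MaxTime": max_time,
        "SiteThrottle": site_throttle,
    }
    # urlencode percent-encodes every reserved character, including ':' and
    # parentheses, and turns spaces into '+'.
    return urlencode(params)


# Example: up to 1,000 cached results, one-hour time limit, strongest throttling.
print(build_start_search_params(
    "text:(cats) pagetype:(-irrelevant)",
    max_docs=1000,
    version="VERSION-PLACEHOLDER",  # placeholder, not a real version string
    cached_only=True,
    max_time=3600,
    site_throttle=4,
))
```

Validating locally avoids a round trip for requests the service would be expected to reject, and keeping MaxNumberOfDocuments explicit makes the per-request charge ceiling visible at the call site.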

Name | Description

ActionRequestId

The ID associated with this request. Pass this ID to the GetStatus action to find out whether your search has completed.

The following example shows a Query-style request and response

Use the GetStatus action to get the status of your search query and the download URL where you can pick up your results.