When you create a HIT or a Qualification test, you can include various kinds of content to be displayed to the Worker on the Amazon Mechanical Turk web site, such as text (titles, paragraphs, lists), media (pictures, audio, video) and browser applets (Java or Flash).
You can also include blocks of formatted content. Formatted content lets you include XHTML tags directly in your instructions and your questions for detailed control over the appearance and layout of your data.
You include a block of formatted content by specifying a
FormattedContent element in the appropriate
place in your QuestionForm
data structure. You can specify any number of
FormattedContent elements, and
you can mix them with other kinds of content.
The following example uses other content types (Title, Text) along with FormattedContent to include a table in a HIT:
<Text>
This HIT asks you some questions about a game of Tic-Tac-Toe
currently in progress. Your answers will help decide the next move.
Squares with "-" are available.
</Text>
<Title>The Current Board</Title>
<Text>
The following table shows the board as it currently stands.
</Text>
<FormattedContent><![CDATA[
<table border="1">
<tr>
<td></td>
<td align="center">1</td>
<td align="center">2</td>
<td align="center">3</td>
</tr>
<tr>
<td align="right">A</td>
<td align="center"><b>X</b></td>
<td align="center">-</td>
<td align="center"><b>O</b></td>
</tr>
<tr>
<td align="right">B</td>
<td align="center">-</td>
<td align="center"><b>O</b></td>
<td align="center">-</td>
</tr>
<tr>
<td align="right">C</td>
<td align="center">-</td>
<td align="center">-</td>
<td align="center"><b>X</b></td>
</tr>
<tr>
<td align="center" colspan="4">It is <b>X</b>'s turn.</td>
</tr>
</table>
]]></FormattedContent>
For more information about describing the contents of a HIT or Qualification test, see the QuestionForm data structure.
As you can see in the example above, formatted content is
specified in an XML CDATA block, inside a
FormattedContent element. The CDATA
block contains the text and XHTML markup to display in the
Worker's browser.
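The wrapping described above can be sketched in a few lines of Python. The helper name `wrap_formatted_content` is hypothetical and is not part of any Mechanical Turk library; it only assembles the string and does no validation beyond refusing a fragment that would end the CDATA section early.

```python
def wrap_formatted_content(xhtml: str) -> str:
    """Wrap an XHTML fragment in a FormattedContent element with a CDATA block."""
    # A CDATA section ends at the first "]]>", so the fragment must not contain it.
    if "]]>" in xhtml:
        raise ValueError('fragment must not contain "]]>"')
    return "<FormattedContent><![CDATA[\n" + xhtml + "\n]]></FormattedContent>"


block = wrap_formatted_content("<p>It is <b>X</b>'s turn.</p>")
print(block)
```

The resulting string would then be placed wherever a FormattedContent element is allowed in the QuestionForm data structure.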
Only a subset of the XHTML standard is supported. For a
complete list of supported XHTML elements and attributes, see
the table below. In particular, JavaScript, element IDs,
class and style attributes, and
<div> and <span> elements
are not allowed.
XML comments (<!-- ... -->) are not allowed
in formatted content blocks.
Every XHTML tag in the CDATA block must be closed before the end
of the block. For example, if you start an XHTML paragraph with
a <p> tag, you must end it with a
</p> tag within the same
FormattedContent block.
Note: The tag closure requirement means you cannot open an XHTML tag
in one FormattedContent block and close it in another.
XHTML tags must be nested properly. When tags are
used inside other tags, the inner-most tags must be closed
before outer tags are closed. For example, to specify that some
text should appear in bold italics, you would use the
<b> and <i> tags as
follows:
<b><i>This text appears bold italic.</i></b>
But the following would not be valid, because the closing
</b> tag appears before the closing
</i> tag:
<b><i>These tags don't nest properly!</b></i>
Finally, formatted content must meet other requirements to validate against the XHTML schema. For instance, tag names and attribute names must be all lowercase letters, and attribute values must be surrounded by quotes.
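Some of these requirements can be checked locally before a HIT is submitted. The following Python sketch is one possible pre-check, not the validation Mechanical Turk itself performs: it parses the fragment with the standard library XML parser (which rejects unclosed tags, improper nesting, and unquoted attribute values) and then looks for uppercase letters in tag and attribute names. The function name `precheck_fragment` is hypothetical.

```python
import xml.etree.ElementTree as ET


def precheck_fragment(xhtml: str) -> list[str]:
    """Return a list of problems found in an XHTML fragment (empty if none)."""
    # Wrap the fragment in a single root so text mixed with tags parses as one document.
    try:
        root = ET.fromstring("<root>" + xhtml + "</root>")
    except ET.ParseError as exc:
        # Unclosed tags, bad nesting, and unquoted attribute values all end up here.
        return [f"not well-formed: {exc}"]

    problems = []
    for element in root.iter():
        if element is root:
            continue  # skip the artificial wrapper
        if element.tag != element.tag.lower():
            problems.append(f"tag name not lowercase: {element.tag}")
        for name in element.attrib:
            if name != name.lower():
                problems.append(f"attribute name not lowercase: {name}")
    return problems


print(precheck_fragment("<b><i>This text appears bold italic.</i></b>"))  # []
print(precheck_fragment("<b><i>These tags don't nest properly!</b></i>"))
```

Note that this parser does not know XHTML named entities such as `&nbsp;` and reports them as errors; whether the service accepts them is not covered here.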
For details on how Amazon Mechanical Turk validates XHTML formatted content blocks, see "How XHTML Formatted Content Is Validated," below.
FormattedContent supports a limited
subset of the XHTML
1.0 ("transitional") standard. The complete list of
supported tags and attributes appears in the table below.
Notable differences from the standard include:
JavaScript is not allowed. The <script> tag is not supported, and anchors (<a>) and images (<img>) cannot use javascript: targets in URLs.
CSS is not allowed. The <style> tag is not supported, and the class and style attributes are not supported. The id attribute is also not supported.
XML comments (<!-- ... -->) are not supported.
URLs in anchor targets, image locations, and iframe locations are limited to the following: http:// https:// ftp:// news:// nntp:// mailto:// gopher:// telnet://
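The URL restriction can be checked ahead of time with a small helper. The sketch below uses Python's standard `urllib.parse`; the function name `has_allowed_scheme` is hypothetical, and treating relative URLs (which have no scheme) as not allowed is an assumption of this sketch rather than documented behavior.

```python
from urllib.parse import urlparse

# Schemes taken from the list above.
ALLOWED_SCHEMES = {"http", "https", "ftp", "news", "nntp", "mailto", "gopher", "telnet"}


def has_allowed_scheme(url: str) -> bool:
    """Return True if the URL's scheme is one of the listed schemes."""
    scheme = urlparse(url.strip()).scheme.lower()
    return scheme in ALLOWED_SCHEMES


print(has_allowed_scheme("http://www.slashdot.org"))  # True
print(has_allowed_scheme("javascript:alert(1)"))      # False
```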
Other things to note with regard to supported tags and attributes:
In addition to the attributes listed, the title attribute is supported for all tags, and the dir and lang attributes are supported for all tags except <br>.
The alt attribute is required for <area> and <img> tags.
<iframe> tags cannot be empty. They must contain simple text and cannot contain tags.
The following example is correct:
<iframe src="http://www.slashdot.org">Your browser does not support IFRAMEs. Please return this HIT.</iframe>
The following examples are not correct:
<iframe src="http://www.slashdot.org"/>
<iframe src="http://www.slashdot.org"></iframe>
<iframe src="http://www.slashdot.org"> </iframe>
<iframe src="http://www.slashdot.org">This frame links <a href="http://www.slashdot.org/">here</a></iframe>
<img> tags also require a src attribute.
<map> tags require a name attribute.
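The tag-specific rules above (required attributes and the iframe content rule) could be checked locally along these lines. This is a minimal sketch with a hypothetical function name, `check_special_rules`; it assumes the fragment is already well-formed XML and it is not the check Mechanical Turk performs.

```python
import xml.etree.ElementTree as ET

# Required attributes as stated in the notes above.
REQUIRED_ATTRIBUTES = {"area": ["alt"], "img": ["alt", "src"], "map": ["name"]}


def check_special_rules(xhtml: str) -> list[str]:
    """Return violations of the required-attribute and iframe rules."""
    root = ET.fromstring("<root>" + xhtml + "</root>")
    problems = []
    for element in root.iter():
        for name in REQUIRED_ATTRIBUTES.get(element.tag, []):
            if name not in element.attrib:
                problems.append(f"<{element.tag}> is missing required attribute {name}")
        if element.tag == "iframe":
            if len(element) > 0:
                # Child elements are not allowed, only simple text.
                problems.append("<iframe> must not contain tags")
            elif not (element.text or "").strip():
                # Self-closed, empty, and whitespace-only iframes all land here.
                problems.append("<iframe> must contain text")
    return problems


print(check_special_rules('<iframe src="http://www.slashdot.org"> </iframe>'))
# ['<iframe> must contain text']
```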
The following table lists the supported tags and attributes:
| Tag | Attributes |
|---|---|
| a | accesskey charset coords href hreflang name rel rev shape tabindex target type |
| area | alt coords href nohref shape target |
| b | |
| big | |
| blockquote | cite |
| br | |
| center | |
| cite | |
| code | |
| col | align char charoff span valign width |
| colgroup | align char charoff span valign width |
| dd | |
| del | cite datetime |
| dl | |
| em | |
| font | color face size |
| h1 | align |
| h2 | align |
| h3 | align |
| h4 | align |
| h5 | align |
| h6 | align |
| hr | align noshade size width |
| i | |
| iframe | align frameborder height longdesc marginheight marginwidth name scrolling src width |
| img | align alt border height hspace ismap longdesc src usemap vspace width |
| ins | cite datetime |
| li | type value |
| map | name |
| ol | compact start type |
| p | align |
| pre | width |
| q | cite |
| small | |
| strong | |
| sub | |
| sup | |
| table | align bgcolor border cellpadding cellspacing frame rules summary width |
| tbody | align char charoff valign |
| td | abbr align axis bgcolor char charoff colspan headers height nowrap rowspan scope valign width |
| tfoot | align char charoff valign |
| th | abbr align axis bgcolor char charoff colspan headers height nowrap rowspan scope valign width |
| thead | align char charoff valign |
| tr | align bgcolor char charoff valign |
| u | |
| ul | compact type |
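A table like this lends itself to an allow-list check. The following Python sketch copies only a few rows of the table into a dictionary, so any tag outside that excerpt is reported as unsupported until the remaining rows are added. The names `SUPPORTED` and `unsupported_markup` are hypothetical, and the check is an approximation of the rules stated here, not the schema validation the service runs.

```python
import xml.etree.ElementTree as ET

# Excerpt of the table above; add the remaining rows before relying on this.
SUPPORTED = {
    "b": set(),
    "i": set(),
    "br": set(),
    "p": {"align"},
    "table": {"align", "bgcolor", "border", "cellpadding", "cellspacing",
              "frame", "rules", "summary", "width"},
    "tr": {"align", "bgcolor", "char", "charoff", "valign"},
    "td": {"abbr", "align", "axis", "bgcolor", "char", "charoff", "colspan",
           "headers", "height", "nowrap", "rowspan", "scope", "valign", "width"},
}


def unsupported_markup(xhtml: str) -> list[str]:
    """Report tags and attributes that are not in the allow-list."""
    root = ET.fromstring("<root>" + xhtml + "</root>")
    problems = []
    for element in root.iter():
        if element is root:
            continue  # skip the artificial wrapper
        if element.tag not in SUPPORTED:
            problems.append(f"unsupported tag: {element.tag}")
            continue
        # title is allowed everywhere; dir and lang everywhere except <br>.
        allowed = SUPPORTED[element.tag] | {"title"}
        if element.tag != "br":
            allowed |= {"dir", "lang"}
        for name in element.attrib:
            if name not in allowed:
                problems.append(f"unsupported attribute on {element.tag}: {name}")
    return problems


print(unsupported_markup('<p class="intro">Hello</p>'))
# ['unsupported attribute on p: class']
```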
When you create a HIT or a Qualification test whose content uses
FormattedContent, Amazon Mechanical Turk
attempts to validate the formatted content blocks against a
schema. If the formatted content does not validate against the
schema, the operation call will fail and return an error.
To validate the formatted content, Mechanical Turk takes the
contents of the FormattedContent element
(the text and markup inside the CDATA), then constructs an XML
document with an appropriate XML header,
<FormattedContent> as the root element, and
the text and markup as the element's contents (without the
CDATA). This document is then validated against a schema.
For example, consider the following FormattedContent block:
...
<FormattedContent><![CDATA[
I absolutely <i>love</i> chocolate ice cream!
]]></FormattedContent>
...
To validate this block, Mechanical Turk produces the following XML document:
<?xml version="1.0"?>
<FormattedContent xmlns="http://www.w3.org/1999/xhtml">
I absolutely <i>love</i> chocolate ice cream!
</FormattedContent>
The schema used for validation is called
FormattedContentXHTMLSubset.xsd. For information
on how to download this schema, see WSDL and Schema Locations.
You do not need to specify the namespace of the XHTML tags in your formatted content. The XHTML namespace is assumed automatically during validation.
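The document construction described in this section can be reproduced locally to see how the namespace is applied. The Python sketch below builds the same kind of document and parses it with the standard library. The function name `build_validation_document` is hypothetical, and the sketch stops at parsing: the standard library has no XML Schema validator, so checking against FormattedContentXHTMLSubset.xsd would need a separate tool.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_validation_document(cdata_contents: str) -> str:
    """Wrap CDATA contents in a FormattedContent root in the XHTML namespace."""
    return (
        '<?xml version="1.0"?>\n'
        f'<FormattedContent xmlns="{XHTML_NS}">'
        f"{cdata_contents}"
        "</FormattedContent>"
    )


document = build_validation_document("\nI absolutely <i>love</i> chocolate ice cream!\n")
root = ET.fromstring(document)

# The default namespace on the root applies to every unprefixed child tag.
print(root.tag)     # {http://www.w3.org/1999/xhtml}FormattedContent
print(root[0].tag)  # {http://www.w3.org/1999/xhtml}i
```

Because the root element declares the XHTML namespace as the default, the unprefixed `<i>` tag is read as an XHTML element without any namespace declaration in the formatted content itself.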