Copyright © 2009 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This specification defines the 5th major revision of the core language of the World Wide Web: the Hypertext Markup Language (HTML). In this version, new features are introduced to help Web application authors, new elements are introduced based on research into prevailing authoring practices, and special attention has been given to defining clear conformance criteria for user agents in an effort to improve interoperability.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the most recently formally published revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
The WHATWG version of this specification is available under a license that permits reuse of the specification text.
If you wish to make comments regarding this document, please send them to public-html-comments@w3.org (subscribe, archives) or whatwg@whatwg.org (subscribe, archives), or submit them using our public bug database. All feedback is welcome.
We maintain a list of all e-mails that have not yet been considered and a list of all bug reports that have not yet been resolved.
Implementors should be aware that this specification is not stable. Implementors who are not taking part in the discussions are likely to find the specification changing out from under them in incompatible ways. Vendors interested in implementing this specification before it eventually reaches the Candidate Recommendation stage should join the aforementioned mailing lists and take part in the discussions.
The publication of this document by the W3C as a W3C Working Draft does not imply that all of the participants in the W3C HTML working group endorse the contents of the specification. Indeed, for any section of the specification, one can usually find many members of the working group or of the W3C as a whole who object strongly to the current text, the existence of the section at all, or the idea that the working group should even spend time discussing the concept of that section.
The latest stable version of the editor's draft of this specification is always available on the W3C CVS server and in the WHATWG Subversion repository. The latest editor's working copy (which may contain unfinished text in the process of being prepared) is also available.
There are various ways to follow the change history for the specification:
svn checkout http://svn.whatwg.org/webapps/
The W3C HTML Working Group is the W3C working group responsible for this specification's progress along the W3C Recommendation track. This specification is the 08 August 2009 Editor's Draft.
This specification is also being produced by the WHATWG. The two specifications are identical from the table of contents onwards.
This specification is intended to replace (be a new version of) what was previously the HTML4, XHTML 1.0, and DOM2 HTML specifications.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
Different parts of this specification are at different levels of maturity.
This section is non-normative.
The World Wide Web's markup language has always been HTML. HTML was primarily designed as a language for semantically describing scientific documents, although its general design and adaptations over the years have enabled it to be used to describe a number of other types of documents.
The main area that has not been adequately addressed by HTML is a vague subject referred to as Web Applications. This specification attempts to rectify this, while at the same time updating the HTML specifications to address issues raised in the past few years.
This section is non-normative.
This specification is intended for authors of documents and scripts that use the features defined in this specification, implementors of tools that operate on pages that use the features defined in this specification, and individuals wishing to establish the correctness of documents or implementations with respect to the requirements of this specification.
This document is probably not suited to readers who do not already have at least a passing familiarity with Web technologies, as in places it sacrifices clarity for precision, and brevity for completeness. More approachable tutorials and authoring guides can provide a gentler introduction to the topic.
In particular, familiarity with the basics of DOM Core and DOM Events is necessary for a complete understanding of some of the more technical parts of this specification. An understanding of Web IDL, HTTP, XML, Unicode, character encodings, JavaScript, and CSS will also be helpful in places but is not essential.
This section is non-normative.
This specification is limited to providing a semantic-level markup language and associated semantic-level scripting APIs for authoring accessible pages on the Web ranging from static documents to dynamic applications.
The scope of this specification does not include providing mechanisms for media-specific customization of presentation (although default rendering rules for Web browsers are included at the end of this specification, and several mechanisms for hooking into CSS are provided as part of the language).
The scope of this specification does not include documenting every HTML or DOM feature supported by Web browsers. Browsers support many features that are considered to be very bad for accessibility or that are otherwise inappropriate. For example, the blink element is clearly presentational and authors wishing to cause text to blink should instead use CSS.
The scope of this specification is not to describe an entire operating system. In particular, hardware configuration software, image manipulation tools, and applications that users would be expected to use with high-end workstations on a daily basis are out of scope. In terms of applications, this specification is targeted specifically at applications that would be expected to be used by users on an occasional basis, or regularly but from disparate locations, with low CPU requirements. Examples include online purchasing systems, searching systems, games (especially multiplayer online games), public telephone books or address books, communications software (e-mail clients, instant messaging clients, discussion software), and document editing software.
This section is non-normative.
Work on HTML 5 originally started in late 2003, as a proof of concept to show that it was possible to extend HTML 4's forms to provide many of the features that XForms 1.0 introduced, without requiring browsers to implement rendering engines that were incompatible with existing HTML Web pages. At this early stage, while the draft was already publicly available, and input was already being solicited from all sources, the specification was only under Opera Software's copyright.
In early 2004, some of the principles that underlie this effort, as well as an early draft proposal covering just forms-related features, were presented to the W3C jointly by Mozilla and Opera at a workshop discussing the future of Web Applications on the Web. The proposal was rejected on the grounds that the proposal conflicted with the previously chosen direction for the Web's evolution.
Shortly thereafter, Apple, Mozilla, and Opera jointly announced their intent to continue working on the effort. A public mailing list was created, and the drafts were moved to the WHATWG site. The copyright was subsequently amended to be jointly owned by all three vendors, and to allow reuse of the specifications.
In 2006, the W3C expressed interest in the specification, and created a working group chartered to work with the WHATWG on the development of the HTML 5 specifications. The working group opened in 2007. Apple, Mozilla, and Opera allowed the W3C to publish the specifications under the W3C copyright, while keeping versions with the less restrictive license on the WHATWG site.
Since then, both groups have been working together.
This section is non-normative.
It must be admitted that many aspects of HTML appear at first glance to be nonsensical and inconsistent.
HTML, its supporting DOM APIs, as well as many of its supporting technologies, have been developed over a period of several decades by a wide array of people with different priorities who, in many cases, did not know of each other's existence.
Features have thus arisen from many sources, and have not always been designed in especially consistent ways. Furthermore, because of the unique characteristics of the Web, implementation bugs have often become de-facto, and now de-jure, standards, as content is often unintentionally written in ways that rely on them before they can be fixed.
Despite all this, efforts have been made to adhere to certain design goals. These are described in the next few subsections.
This section is non-normative.
To avoid exposing Web authors to the complexities of multithreading, the HTML and DOM APIs are designed such that no script can ever detect the simultaneous execution of other scripts. Even with workers, the intent is that the behavior of implementations can be thought of as completely serialising the execution of all scripts in all browsing contexts.
The navigator.getStorageUpdates() method, in this model, is equivalent to allowing other scripts to run while the calling script is blocked.
This section is non-normative.
This specification interacts with and relies on a wide variety of other specifications. In certain circumstances, unfortunately, the desire to be compatible with legacy content has led to this specification violating the requirements of these other specifications. Whenever this has occurred, the transgressions have been noted as "willful violations".
This section is non-normative.
This specification describes a new revision of the HTML language and its associated DOM API.
The requirements in this specification for features that were already in HTML 4 and DOM2 HTML are based primarily on the implementation and deployment experience collected over the past ten years. Some features have been removed from the language, based on best current practices; implementation requirements for some of these, as well as for non-standard features that have nonetheless garnered wide use, are still included in this specification to allow implementations to continue supporting legacy content. [HTML4] [DOM2HTML]
A separate document has been published by the W3C HTML working group to provide a more detailed reference of the differences between this specification and the language described in the HTML 4 specification. [HTMLDIFF]
This section is non-normative.
This specification is intended to replace XHTML 1.0 as the normative definition of the XML serialization of the HTML vocabulary. [XHTML10]
While this specification updates the semantics and requirements of the vocabulary defined by XHTML Modularization 1.1 and used by XHTML 1.1, it does not attempt to provide a replacement for the modularization scheme defined and used by those (and other) specifications, and therefore cannot be considered a complete replacement for them. [XHTMLMOD] [XHTML11]
Thus, authors and implementors who do not need such a modularization scheme can consider this specification a replacement for XHTML 1.x, but those who do need such a mechanism are encouraged to continue using the XHTML 1.1 line of specifications.
This section is non-normative.
This specification defines an abstract language for describing documents and applications, and some APIs for interacting with in-memory representations of resources that use this language.
The in-memory representation is known as "DOM5 HTML", or "the DOM" for short.
There are various concrete syntaxes that can be used to transmit resources that use this abstract language, two of which are defined in this specification.
The first such concrete syntax is "HTML5". This is the format recommended for most authors. It is compatible with all legacy Web browsers. If a document is transmitted with the MIME type text/html, then it will be processed as an "HTML5" document by Web browsers.
The second concrete syntax uses XML, and is known as "XHTML5". When a document is transmitted with an XML MIME type, such as application/xhtml+xml, then it is processed by an XML processor by Web browsers, and treated as an "XHTML5" document. Authors are reminded that the processing for XML and HTML differs; in particular, even minor syntax errors will prevent an XML document from being rendered fully, whereas they would be ignored in the "HTML5" syntax.
The "DOM5 HTML", "HTML5", and "XHTML5" representations cannot all represent the same content. For example, namespaces cannot be represented using "HTML5", but they are supported in "DOM5 HTML" and "XHTML5". Similarly, documents that use the noscript feature can be represented using "HTML5", but cannot be represented with "XHTML5" and "DOM5 HTML". Comments that contain the string "-->" can be represented in "DOM5 HTML" but not in "HTML5" and "XHTML5". And so forth.
This section is non-normative.
This specification is divided into the following major sections:
There are also a couple of appendices, defining rendering rules for Web browsers and listing obsolete features and areas that are out of scope for this specification.
This specification should be read like all other specifications. First, it should be read cover-to-cover, multiple times. Then, it should be read backwards at least once. Then it should be read by picking random sections from the contents list and following all the cross-references.
This is a definition, requirement, or explanation.
This is a note.
This is an example.
This is an open issue.
This is a warning.
interface Example {
// this is an IDL definition
};
method([optionalArgument])
This is a note to authors describing the usage of an interface.
/* this is a CSS fragment */
The defining instance of a term is marked up like this. Uses of that term are marked up like this or like this.
The defining instance of an element, attribute, or API is marked up like this. References to that element, attribute, or API are marked up like this.
Other code fragments are marked up like this.
Variables are marked up like this.
This is an implementation requirement.
This section is non-normative.
A basic HTML document looks like this:
<!DOCTYPE HTML>
<html>
 <head>
  <title>Sample page</title>
 </head>
 <body>
  <h1>Sample page</h1>
  <p>This is a <a href="demo.html">simple</a> sample.</p>
  <!-- this is a comment -->
 </body>
</html>
HTML documents consist of a tree of elements and text. Each element is denoted in the source by a start tag, such as "<body>", and an end tag, such as "</body>". (Certain start tags and end tags can in certain cases be omitted and are implied by other tags.)
Tags have to be nested such that elements are all completely within each other, without overlapping:
<p>This is <em>very <strong>wrong</em>!</strong></p>
<p>This <em>is <strong>correct</strong>.</em></p>
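The nesting rule can be checked mechanically with a stack of open elements. The following is a minimal sketch only, not the parsing algorithm of this specification: `isProperlyNested` is a hypothetical helper, and it assumes every start tag has an explicit end tag (omitted tags, void elements, and attribute values containing ">" are not handled).

```javascript
// Hypothetical helper (an assumption for illustration, not part of this specification).
// Returns true when every end tag closes the most recently opened element.
function isProperlyNested(markup) {
  const openElements = [];
  // Matches start tags and end tags; comments and doctypes are skipped.
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g;
  let match;
  while ((match = tagPattern.exec(markup)) !== null) {
    const isEndTag = match[1] === '/';
    const name = match[2].toLowerCase();
    if (!isEndTag) {
      openElements.push(name);
    } else if (openElements.pop() !== name) {
      return false; // the end tag overlaps an element that is still open
    }
  }
  return openElements.length === 0;
}

console.log(isProperlyNested('<p>This is <em>very <strong>wrong</em>!</strong></p>')); // false
console.log(isProperlyNested('<p>This <em>is <strong>correct</strong>.</em></p>'));    // true
```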
This specification defines a set of elements that can be used in HTML, along with rules about the ways in which the elements can be nested.
Elements can have attributes, which control how the elements work. In the example above, there is a hyperlink, formed using the a element and its href attribute:
<a href="demo.html">simple</a>
Attributes are placed inside the start tag, and consist of a name and a value, separated by an "=" character. The attribute value can be left unquoted if it doesn't contain any special characters. Otherwise, it has to be quoted using either single or double quotes. The value, along with the "=" character, can be omitted altogether if the value is the empty string.
<!-- empty attributes -->
<input name=address disabled>
<input name=address disabled="">

<!-- attributes with a value -->
<input name=address maxlength=200>
<input name=address maxlength='200'>
<input name=address maxlength="200">
HTML user agents (e.g. Web browsers) then parse this markup, turning it into a DOM (Document Object Model) tree. A DOM tree is an in-memory representation of a document.
DOM trees contain several kinds of nodes, in particular a DOCTYPE node, elements, text nodes, and comment nodes.
The markup snippet at the top of this section would be turned into the following DOM tree:
The root element of this tree is the html element, which is the element always found at the root of HTML documents. It contains two elements, head and body, as well as a text node between them.
There are many more text nodes in the DOM tree than one would initially expect, because the source contains a number of spaces (presented by "␣") and line breaks ("⏎") that all end up as text nodes in the DOM.
The head element contains a title element, which itself contains a text node with the text "Sample page". Similarly, the body element contains an h1 element, a p element, and a comment.
This DOM tree can be manipulated from scripts in the page. Scripts (typically in JavaScript) are small programs that can be embedded using the script element or using event handler content attributes. For example, here is a form with a script that sets the value of the form's output element to say "Hello World":
<form name="main">
 Result: <output name="result"></output>
 <script>
  document.forms.main.elements.result.value = 'Hello World';
 </script>
</form>
Each element in the DOM tree is represented by an object, and these objects have APIs so that they can be manipulated. For instance, a link (e.g. the a element in the tree above) can have its "href" attribute changed in several ways:
var a = document.links[0]; // obtain the first link in the document
a.href = 'sample.html'; // change the destination URL of the link
a.protocol = 'https'; // change just the scheme part of the URL
a.setAttribute('href', 'http://example.com/'); // change the content attribute directly
Since DOM trees are used as the way to represent HTML documents when they are processed and presented by implementations (especially interactive implementations like Web browsers), this specification is mostly phrased in terms of DOM trees, instead of the markup described above.
HTML documents represent a media-independent description of interactive content. HTML documents might be rendered to a screen, or through a speech synthesizer, or on a braille display. To influence exactly how such rendering takes place, authors can use a styling language such as CSS.
In the following example, the page has been made yellow-on-blue using CSS.
<!DOCTYPE HTML>
<html>
<head>
<title>Sample styled page</title>
<style>
body { background: navy; color: yellow; }
</style>
</head>
<body>
<h1>Sample styled page</h1>
<p>This page is just a demo.</p>
</body>
</html>
For more details on how to use HTML, authors are encouraged to consult tutorials and guides. Some of the examples included in this specification might also be of use, but the novice author is cautioned that this specification, by necessity, defines the language with a level of detail that may be difficult to understand at first.
This specification refers to both HTML and XML attributes and DOM attributes, often in the same context. When it is not clear which is being referred to, they are referred to as content attributes for HTML and XML attributes, and DOM attributes for those from the DOM. Similarly, the term "properties" is used for both JavaScript object properties and CSS properties. When these are ambiguous they are qualified as object properties and CSS properties respectively.
Generally, when the specification states that a feature applies to the HTML syntax or the XHTML syntax, it also includes the other. When a feature specifically only applies to one of the two languages, it is called out by explicitly stating that it does not apply to the other format, as in "for HTML, ... (this does not apply to XHTML)".
This specification uses the term document to refer to any use of HTML, ranging from short static documents to long essays or reports with rich multimedia, as well as to fully-fledged interactive applications.
For simplicity, terms such as shown, displayed, and visible might sometimes be used when referring to the way a document is rendered to the user. These terms are not meant to imply a visual medium; they must be considered to apply to other media in equivalent ways.
When an algorithm B says to return to another algorithm A, it implies that A called B. Upon returning to A, the implementation must continue from where it left off in calling B.
To ease migration from HTML to XHTML, UAs conforming to this specification will place elements in HTML in the http://www.w3.org/1999/xhtml namespace, at least for the purposes of the DOM and CSS. The term "elements in the HTML namespace", or "HTML elements" for short, when used in this specification, thus refers to both HTML and XHTML elements.
Unless otherwise stated, all elements defined or mentioned in this specification are in the http://www.w3.org/1999/xhtml namespace, and all attributes defined or mentioned in this specification have no namespace (they are in the per-element partition).
When an XML name, such as an attribute or element name, is referred to in the form prefix:localName, as in xml:id or svg:rect, it refers to a name with the local name localName and the namespace given by the prefix, as defined by the following table:
xml     http://www.w3.org/XML/1998/namespace
html    http://www.w3.org/1999/xhtml
svg     http://www.w3.org/2000/svg
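The prefix table can be read as a simple lookup. The sketch below is illustrative only; `PREFIX_NAMESPACES` and `resolvePrefixedName` are hypothetical names, and the function covers just the three prefixes listed in the table.

```javascript
// Prefix-to-namespace mapping copied from the table; the names used here are
// assumptions made for illustration, not identifiers defined by this specification.
const PREFIX_NAMESPACES = {
  xml: 'http://www.w3.org/XML/1998/namespace',
  html: 'http://www.w3.org/1999/xhtml',
  svg: 'http://www.w3.org/2000/svg',
};

// Splits a name of the form prefix:localName and looks up the namespace.
// Returns null when there is no colon or the prefix is not in the table.
function resolvePrefixedName(name) {
  const colon = name.indexOf(':');
  if (colon === -1) return null;
  const prefix = name.slice(0, colon);
  const localName = name.slice(colon + 1);
  if (!Object.prototype.hasOwnProperty.call(PREFIX_NAMESPACES, prefix)) return null;
  return { namespace: PREFIX_NAMESPACES[prefix], localName };
}

console.log(resolvePrefixedName('svg:rect'));
// { namespace: 'http://www.w3.org/2000/svg', localName: 'rect' }
```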
Attribute names are said to be XML-compatible if they match the Name production defined in XML, they contain no U+003A COLON (:) characters, and their first three characters are not an ASCII case-insensitive match for the string "xml". [XML]
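The three conditions can be sketched as a predicate. This is an approximation under a stated assumption: the regular expression below accepts only an ASCII subset of the XML Name production, whereas the real production also allows many non-ASCII characters. `isXmlCompatibleName` is a hypothetical name.

```javascript
// ASSUMPTION: ASCII-only approximation of the XML Name production.
// Names containing non-ASCII characters are rejected here even though XML may allow them.
const ASCII_NAME_SUBSET = /^[A-Za-z_:][A-Za-z0-9_:.\-]*$/;

function isXmlCompatibleName(name) {
  if (!ASCII_NAME_SUBSET.test(name)) return false;    // must match Name (approximated)
  if (name.includes(':')) return false;               // no U+003A COLON characters
  return name.slice(0, 3).toLowerCase() !== 'xml';    // must not start with "xml" in any ASCII case
}

console.log(isXmlCompatibleName('href'));     // true
console.log(isXmlCompatibleName('xml:lang')); // false
console.log(isXmlCompatibleName('XMLfoo'));   // false
```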
The term XML MIME type is used to refer to the MIME types text/xml, application/xml, and any MIME type ending with the four characters "+xml". [RFC3023]
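As a rough sketch of this definition, the check below tests the two literal types and the "+xml" suffix. It assumes the input is a bare, lowercase MIME type with no parameters (such as "; charset=utf-8"); `isXmlMimeType` is a hypothetical name.

```javascript
// ASSUMPTION: mimeType is already lowercase and carries no parameters.
function isXmlMimeType(mimeType) {
  return mimeType === 'text/xml' ||
         mimeType === 'application/xml' ||
         mimeType.endsWith('+xml');
}

console.log(isXmlMimeType('application/xhtml+xml')); // true
console.log(isXmlMimeType('text/html'));             // false
```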
The term root element, when not explicitly qualified as referring to the document's root element, means the furthest ancestor element node of whatever node is being discussed, or the node itself if it has no ancestors. When the node is a part of the document, then that is indeed the document's root element; however, if the node is not currently part of the document tree, the root element will be an orphaned node.
A node's home subtree is the subtree rooted at that node's root element.
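The definition of root element amounts to walking up the parent chain while the parent is an element. A minimal sketch follows, using plain objects as stand-ins for DOM nodes (an assumption so the example runs without a browser); `rootElementOf` is a hypothetical name, while the `parentNode` and `nodeType` fields mirror the DOM Core attributes of the same names.

```javascript
const ELEMENT_NODE = 1; // node type value for elements in DOM Core

// Returns the furthest ancestor element of the node, or the node itself
// when it has no element ancestors. A Document parent is not an element,
// so the walk stops just below it.
function rootElementOf(node) {
  let current = node;
  while (current.parentNode && current.parentNode.nodeType === ELEMENT_NODE) {
    current = current.parentNode;
  }
  return current;
}

// Plain-object stand-ins for a document tree and an orphaned subtree.
const documentNode = { nodeType: 9, parentNode: null };
const htmlElement = { nodeType: 1, parentNode: documentNode };
const bodyElement = { nodeType: 1, parentNode: htmlElement };
const orphanDiv = { nodeType: 1, parentNode: null };
const orphanSpan = { nodeType: 1, parentNode: orphanDiv };

console.log(rootElementOf(bodyElement) === htmlElement); // true
console.log(rootElementOf(orphanSpan) === orphanDiv);    // true
```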
The Document of a Node (such as an element) is the Document that the Node's ownerDocument DOM attribute returns. When an element's root element is the root element of a Document, it is said to be in a Document. An element is said to have been inserted into a document when its root element changes and is now the document's root element. Analogously, an element is said to have been removed from a document when its root element changes from being the document's root element to being another element.
If a Node is in a Document then that Document is always the Node's Document, and the Node's ownerDocument DOM attribute thus always returns that Document.
The term tree order means a pre-order, depth-first traversal of DOM nodes involved (through the parentNode/childNodes relationship).
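Tree order can be illustrated with a short recursive generator. The sketch uses plain objects with a `childNodes` array in place of real DOM nodes (an assumption so that it runs outside a browser); `inTreeOrder` and the `name` field are hypothetical.

```javascript
// Yields the node first, then each child subtree in order: a pre-order,
// depth-first traversal.
function* inTreeOrder(node) {
  yield node;
  for (const child of node.childNodes || []) {
    yield* inTreeOrder(child);
  }
}

// Plain-object stand-in for a small document tree.
const sampleTree = {
  name: 'html',
  childNodes: [
    { name: 'head', childNodes: [{ name: 'title', childNodes: [] }] },
    { name: 'body', childNodes: [{ name: 'h1', childNodes: [] }, { name: 'p', childNodes: [] }] },
  ],
};

const visitedNames = [...inTreeOrder(sampleTree)].map((node) => node.name);
console.log(visitedNames.join(' ')); // html head title body h1 p
```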
When it is stated that some element or attribute is ignored , or treated as some other value, or handled as if it was something else, this refers only to the processing of the node after it is in the DOM. A user agent must not mutate the DOM in such situations.
The term text node refers to any Text node, including CDATASection nodes; specifically, any Node with node type TEXT_NODE (3) or CDATA_SECTION_NODE (4). [DOMCORE]
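In code, this definition reduces to a node type test. The sketch below uses the numeric values given above; `isTextNode` is a hypothetical helper, and any object with a `nodeType` field stands in for a DOM node.

```javascript
const TEXT_NODE = 3;
const CDATA_SECTION_NODE = 4;

// True for Text nodes and CDATASection nodes, per the definition above.
function isTextNode(node) {
  return node.nodeType === TEXT_NODE || node.nodeType === CDATA_SECTION_NODE;
}

console.log(isTextNode({ nodeType: 3 })); // true
console.log(isTextNode({ nodeType: 8 })); // false (a comment node)
```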
A content attribute is said to change value only if its new value is different than its previous value; setting an attribute to a value it already has does not change it.
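A small sketch of this rule, using a Map as a stand-in for an element's content attributes. `setContentAttribute` is a hypothetical helper; treating a previously absent attribute as a change is an assumption of this sketch, not something stated above.

```javascript
// Sets the attribute and reports whether its value changed.
// ASSUMPTION: adding an attribute that was absent counts as a change.
function setContentAttribute(attributes, name, value) {
  const changed = !attributes.has(name) || attributes.get(name) !== value;
  attributes.set(name, value);
  return changed;
}

const linkAttributes = new Map([['href', 'demo.html']]);
console.log(setContentAttribute(linkAttributes, 'href', 'demo.html'));   // false
console.log(setContentAttribute(linkAttributes, 'href', 'sample.html')); // true
```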
The construction "a Foo object", where Foo is actually an interface, is sometimes used instead of the more accurate "an object implementing the interface Foo".
A DOM attribute is said to be getting when its value is being retrieved (e.g. by author script), and is said to be setting when a new value is assigned to it.
If a DOM object is said to be live, then that means that any attributes returning that object must always return the same object (not a new object each time), and the attributes and methods on that object must operate on the actual underlying data, not a snapshot of the data.
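The two halves of this requirement (same object on every access, and operation on the underlying data) can be sketched with plain classes. `LiveList` and `ElementLike` are hypothetical stand-ins, not interfaces defined by this specification.

```javascript
// A view over a backing array: it reads the array on every access
// rather than copying it.
class LiveList {
  constructor(backingArray) {
    this.backingArray = backingArray;
  }
  get length() {
    return this.backingArray.length;
  }
  item(index) {
    return index >= 0 && index < this.backingArray.length ? this.backingArray[index] : null;
  }
}

class ElementLike {
  constructor() {
    this.childArray = [];
    this.liveChildren = new LiveList(this.childArray); // created once
  }
  get children() {
    return this.liveChildren; // the same object on every access
  }
}

const container = new ElementLike();
const liveView = container.children;
const snapshot = container.childArray.slice(); // a copy, for contrast
container.childArray.push('first child');

console.log(container.children === liveView); // true
console.log(liveView.length);                 // 1
console.log(snapshot.length);                 // 0
```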
The terms fire and dispatch are used interchangeably in the context of events, as in the DOM Events specifications. [DOMEVENTS]
The term plugin is used to mean any content handler for Web content types that are either not supported by the user agent natively or that do not expose a DOM, which supports rendering the content as part of the user agent's interface.
Typically such content handlers are provided by third parties.
One example of a plugin would be a PDF viewer that is instantiated in a browsing context when the user navigates to a PDF file. This would count as a plugin regardless of whether the party that implemented the PDF viewer component was the same as that which implemented the user agent itself. However, a PDF viewer application that launches separate from the user agent (as opposed to using the same interface) is not a plugin by this definition.
This specification does not define a mechanism for interacting with plugins, as it is expected to be user-agent- and platform-specific. Some UAs might opt to support a plugin mechanism such as the Netscape Plugin API; others might use remote content converters or have built-in support for certain types. [NPAPI]
Browsers should take extreme care when interacting with external content intended for plugins. When third-party software is run with the same privileges as the user agent itself, vulnerabilities in the third-party software become as dangerous as those in the user agent.
An ASCII-compatible character encoding is a single-byte or variable-length encoding in which the bytes 0x09, 0x0A, 0x0C, 0x0D, 0x20 - 0x22, 0x26, 0x27, 0x2C - 0x3F, 0x41 - 0x5A, and 0x61 - 0x7A, ignoring bytes that are the second and later bytes of multibyte sequences, all correspond to single-byte sequences that map to the same Unicode characters as those bytes in ANSI_X3.4-1968 (US-ASCII). [RFC1345]
This includes such encodings as Shift_JIS and variants of ISO-2022, even though it is possible in these encodings for bytes like 0x70 to be part of longer sequences that are unrelated to their interpretation as ASCII. It excludes such encodings as UTF-7, UTF-16, HZ-GB-2312, GSM03.38, and EBCDIC variants.
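The byte set named in the definition can be written out as ranges. The sketch below only tests membership in that set; it does not inspect any real encoding, and `mustMatchAscii` is a hypothetical name.

```javascript
// Inclusive byte ranges copied from the definition of ASCII-compatible
// character encoding above.
const ASCII_COMPATIBILITY_RANGES = [
  [0x09, 0x0A], [0x0C, 0x0D], [0x20, 0x22], [0x26, 0x27],
  [0x2C, 0x3F], [0x41, 0x5A], [0x61, 0x7A],
];

// True when an ASCII-compatible encoding is required to map this byte
// (as a single-byte sequence) to the same character as US-ASCII.
function mustMatchAscii(byte) {
  return ASCII_COMPATIBILITY_RANGES.some(([low, high]) => byte >= low && byte <= high);
}

console.log(mustMatchAscii(0x70)); // true  ("p")
console.log(mustMatchAscii(0x23)); // false ("#" is not in the listed set)
```

Bytes outside the listed ranges, such as 0x23 or 0x5C, are left unconstrained by the definition.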
The specification uses the term supported when referring to whether a user agent has an implementation capable of decoding the semantics of an external resource. A format or type is said to be supported if the implementation can process an external resource of that format or type without critical aspects of the resource being ignored. Whether a specific resource is supported can depend on what features of the resource's format are in use.
For example, a PNG image would be considered to be in a supported format if its pixel data could be decoded and rendered, even if, unbeknownst to the implementation, the image actually also contained animation data.
An MPEG4 video file would not be considered to be in a supported format if the compression format used was not supported, even if the implementation could determine the dimensions of the movie from the file's metadata.
The term MIME type is used to refer to what is sometimes called an Internet media type in protocol literature. The term media type in this specification is used to refer to the type of media intended for presentation, as used by the CSS specifications. [RFC2046] [MQ]
All diagrams, examples, and notes in this specification are non-normative, as are all sections explicitly marked non-normative. Everything else in this specification is normative.
The key words "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative parts of this document are to be interpreted as described in RFC2119. For readability, these words do not appear in all uppercase letters in this specification. [RFC2119]
Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.
This specification describes the conformance criteria for user agents (relevant to implementors) and documents (relevant to authors and authoring tool implementors).
There is no implied relationship between document conformance requirements and implementation conformance requirements. User agents are not free to handle non-conformant documents as they please; the processing model described in this specification applies to implementations regardless of the conformity of the input documents.
User agents fall into several (overlapping) categories with different conformance requirements.
Web browsers that support the XHTML syntax must process elements and attributes from the HTML namespace found in XML documents as described in this specification, so that users can interact with them, unless the semantics of those elements have been overridden by other specifications.
A conforming XHTML processor would, upon finding an XHTML script element in an XML document, execute the script contained in that element. However, if the element is found within a transformation expressed in XSLT (assuming the user agent also supports XSLT), then the processor would instead treat the script element as an opaque element that forms part of the transform.
Web browsers that support the HTML syntax must process documents labeled as text/html as described in this specification, so that users can interact with them.
User agents that support scripting must also be conforming implementations of the IDL fragments in this specification, as described in the Web IDL specification. [WEBIDL]
User agents that process HTML and XHTML documents purely to render non-interactive versions of them must comply with the same conformance criteria as Web browsers, except that they are exempt from requirements regarding user interaction.
Typical examples of non-interactive presentation user agents are printers (static UAs) and overhead displays (dynamic UAs). It is expected that most static non-interactive presentation user agents will also opt to lack scripting support.
A non-interactive but dynamic presentation UA would still execute scripts, allowing forms to be dynamically submitted, and so forth. However, since the concept of "focus" is irrelevant when the user cannot interact with the document, the UA would not need to support any of the focus-related DOM APIs.
Implementations that do not support scripting (or which have their scripting features disabled entirely) are exempt from supporting the events and DOM interfaces mentioned in this specification. For the parts of this specification that are defined in terms of an events model or in terms of the DOM, such user agents must still act as if events and the DOM were supported.
Scripting can form an integral part of an application. Web browsers that do not support scripting, or that have scripting disabled, might be unable to fully convey the author's intent.
Conformance checkers must verify that a document conforms to the applicable conformance criteria described in this specification. Automated conformance checkers are exempt from detecting errors that require interpretation of the author's intent (for example, while a document is non-conforming if the content of a blockquote element is not a quote, conformance checkers running without the input of human judgement do not have to check that blockquote elements only contain quoted material).
Conformance checkers must check that the input document conforms when parsed without a browsing context (meaning that no scripts are run, and that the parser's scripting flag is disabled), and should also check that the input document conforms when parsed with a browsing context in which scripts execute, and that the scripts never cause non-conforming states to occur other than transiently during script execution itself. (This is only a "SHOULD" and not a "MUST" requirement because it has been proven to be impossible. [COMPUTABLE])
The term "HTML5 validator" can be used to refer to a conformance checker that itself conforms to the applicable requirements of this specification.
XML DTDs cannot express all the conformance requirements of this specification. Therefore, a validating XML processor and a DTD cannot constitute a conformance checker. Also, since neither of the two authoring formats defined in this specification is an application of SGML, a validating SGML system cannot constitute a conformance checker either.
To put it another way, there are three types of conformance criteria:
A conformance checker must check for the first two. A simple DTD-based validator only checks for the first class of errors and is therefore not a conforming conformance checker according to this specification.
Applications and tools that process HTML and XHTML documents for reasons other than to either render the documents or check them for conformance should act in accordance with the semantics of the documents that they process.
A tool that generates document outlines but increases the nesting level for each paragraph and does not increase the nesting level for each section would not be conforming.
Authoring tools and markup generators must generate conforming documents. Conformance criteria that apply to authors also apply to authoring tools, where appropriate.
Authoring tools are exempt from the strict requirements of using elements only for their specified purpose, but only to the extent that authoring tools are not yet able to determine author intent.
For example, it is not conforming to use an address element for arbitrary contact information; that element can only be used for marking up contact information for the author of the document or section. However, since an authoring tool is likely unable to determine the difference, an authoring tool is exempt from that requirement.
In terms of conformance checking, an editor is therefore required to output documents that conform to the same extent that a conformance checker will verify.
When an authoring tool is used to edit a non-conforming document, it may preserve the conformance errors in sections of the document that were not edited during the editing session (i.e. an editing tool is allowed to round-trip erroneous content). However, an authoring tool must not claim that the output is conformant if errors have been so preserved.
Authoring tools are expected to come in two broad varieties: tools that work from structure or semantic data, and tools that work on a What-You-See-Is-What-You-Get media-specific editing basis (WYSIWYG).
The former is the preferred mechanism for tools that author HTML, since the structure in the source information can be used to make informed choices regarding which HTML elements and attributes are most appropriate.
However, WYSIWYG tools are legitimate. WYSIWYG tools should use elements they know are appropriate, and should not use elements that they do not know to be appropriate. This might in certain extreme cases mean limiting the use of flow elements to just a few elements, like div, b, i, and span and making liberal use of the style attribute.
All authoring tools, whether WYSIWYG or not, should make a best effort attempt at enabling users to create well-structured, semantically rich, media-independent content.
Some conformance requirements are phrased as requirements on elements, attributes, methods or objects. Such requirements fall into two categories: those describing content model restrictions, and those describing implementation behavior. Those in the former category are requirements on documents and authoring tools. Those in the second category are requirements on user agents.
Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)
User agents may impose implementation-specific limits on otherwise unconstrained inputs, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations.
For compatibility with existing content and prior specifications, this specification describes two authoring formats: one based on XML (referred to as the XHTML syntax), and one using a custom format inspired by SGML (referred to as the HTML syntax). Implementations may support only one of these two formats, although supporting both is encouraged.
The language in this specification assumes that the user agent expands all entity references, and therefore does not include entity reference nodes in the DOM. If user agents do include entity reference nodes in the DOM, then user agents must handle them as if they were fully expanded when implementing this specification. For example, if a requirement talks about an element's child text nodes, then any text nodes that are children of an entity reference that is a child of that element would be used as well. Entity references to unknown entities must be treated as if they contained just an empty text node for the purposes of the algorithms defined in this specification.
This specification relies on several other underlying specifications.
Implementations that support the XHTML syntax must support some version of XML, as well as its corresponding namespaces specification, because that syntax uses an XML serialization with namespaces. [XML] [XMLNAMES]
The Document Object Model (DOM) is a representation — a model — of a document and its content. The DOM is not just an API; the conformance criteria of HTML implementations are defined, in this specification, in terms of operations on the DOM. [DOMCORE]
Implementations must support some version of DOM Core and DOM Events, because this specification is defined in terms of the DOM, and some of the features are defined as extensions to the DOM Core interfaces. [DOMCORE] [DOMEVENTS]
The IDL fragments in this specification must be interpreted as required for conforming IDL fragments, as described in the Web IDL specification. [WEBIDL]
Unless otherwise specified, if a DOM attribute that is a floating point number type (float) is assigned an Infinity or Not-a-Number (NaN) value, a NOT_SUPPORTED_ERR exception must be raised.
Unless otherwise specified, if a method with an argument that is a floating point number type (float) is passed an Infinity or Not-a-Number (NaN) value, a NOT_SUPPORTED_ERR exception must be raised.
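The following non-normative sketch illustrates the check described in the two paragraphs above. It is written in Python rather than against a real DOM binding; the NotSupportedError class and the check_float helper are hypothetical names introduced only for illustration.

```python
import math


class NotSupportedError(Exception):
    """Hypothetical stand-in for a NOT_SUPPORTED_ERR exception."""


def check_float(value: float) -> float:
    """Return value unchanged, or raise if it is Infinity or NaN."""
    if not math.isfinite(value):
        raise NotSupportedError(f"non-finite float rejected: {value!r}")
    return value
```

A binding could call such a helper at the start of every setter or method that accepts a float, unless the definition of that member specifies otherwise.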
Some parts of the language described by this specification only support JavaScript as the underlying scripting language. [ECMA262]
The term "JavaScript" is used to refer to ECMA262, rather than the official term ECMAScript, since the term JavaScript is more widely known. Similarly, the MIME type used to refer to JavaScript in this specification is text/javascript, since that is the most commonly used type, despite it being an officially obsoleted type according to RFC 4329. [RFC4329]
Implementations must support some version of the Media Queries language. [MQ]
This specification does not require support of any particular network transport protocols, style sheet language, scripting language, or any of the DOM and WebAPI specifications beyond those described above. However, the language described by this specification is biased towards CSS as the styling language, JavaScript as the scripting language, and HTTP as the network protocol, and several features assume that those languages and protocols are in use.
This specification might have certain additional requirements on character encodings, image formats, audio formats, and video formats in the respective sections.
Vendor-specific proprietary extensions to this specification are strongly discouraged. Documents must not use such extensions, as doing so reduces interoperability and fragments the user base, allowing only users of specific user agents to access the content in question.
If markup extensions are needed, they should be done using XML, with elements or attributes from custom namespaces. If DOM extensions are needed, the members should be prefixed by vendor-specific strings to prevent clashes with future versions of this specification. Extensions must be defined so that the use of extensions does not contradict nor cause the non-conformance of functionality defined in the specification.
For example, while strongly discouraged to do so, an implementation "Foo Browser" could add a new DOM attribute "fooTypeTime" to a control's DOM interface that returned the time it took the user to select the current value of a control (say). On the other hand, defining a new control that appears in a form's elements array would be in violation of the above requirement, as it would violate the definition of elements given in this specification.
User agents must treat elements and attributes that they do not understand as semantically neutral; leaving them in the DOM (for DOM processors), and styling them according to CSS (for CSS processors), but not inferring any meaning from them.
This specification defines several comparison operators for strings.
Comparing two strings in a case-sensitive manner means comparing them exactly, code point for code point.
Comparing two strings in an ASCII case-insensitive manner means comparing them exactly, code point for code point, except that the characters in the range U+0041 .. U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z) and the corresponding characters in the range U+0061 .. U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z) are considered to also match.
Comparing two strings in a compatibility caseless manner means using the Unicode compatibility caseless match operation to compare the two strings. [UNICODECASE]
Converting a string to ASCII uppercase means replacing all characters in the range U+0061 .. U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z) with the corresponding characters in the range U+0041 .. U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z).
Converting a string to ASCII lowercase means replacing all characters in the range U+0041 .. U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z) with the corresponding characters in the range U+0061 .. U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z).
A string pattern is a prefix match for a string s when pattern is not longer than s and truncating s to pattern's length leaves the two strings as matches of each other.
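The ASCII-only operations defined above can be sketched as follows (non-normative). The function names are illustrative assumptions. The sketch avoids Python's built-in str.lower() and str.upper(), because those also change characters outside the ASCII letter ranges.

```python
def to_ascii_lowercase(s: str) -> str:
    """Map only U+0041..U+005A to U+0061..U+007A; leave all else untouched."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)


def to_ascii_uppercase(s: str) -> str:
    """Map only U+0061..U+007A to U+0041..U+005A; leave all else untouched."""
    return "".join(chr(ord(c) - 0x20) if "a" <= c <= "z" else c for c in s)


def ascii_case_insensitive_match(a: str, b: str) -> bool:
    """Compare code point for code point, treating A-Z and a-z as equal."""
    return to_ascii_lowercase(a) == to_ascii_lowercase(b)


def is_prefix_match(pattern: str, s: str, match=ascii_case_insensitive_match) -> bool:
    """True when pattern is no longer than s and matches s truncated to its length."""
    return len(pattern) <= len(s) and match(pattern, s[: len(pattern)])
```

The match parameter lets the caller choose which comparison operator the prefix match is performed under; the default used here is an arbitrary choice for the example.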
There are various places in HTML that accept particular data types, such as dates or numbers. This section describes what the conformance criteria for content in those formats are, and how to parse them.
Implementors are strongly urged to carefully examine any third-party libraries they might consider using to implement the parsing of syntaxes described below. For example, date libraries are likely to implement error handling behavior that differs from what is required in this specification, since error-handling behavior is often not defined in specifications that describe date syntaxes similar to those used in this specification, and thus implementations tend to vary greatly in how they handle errors.
The space characters, for the purposes of this specification, are U+0020 SPACE, U+0009 CHARACTER TABULATION (tab), U+000A LINE FEED (LF), U+000C FORM FEED (FF), and U+000D CARRIAGE RETURN (CR).
The White_Space characters are those that have the Unicode property "White_Space". [UNICODE]
The alphanumeric ASCII characters are those in the ranges U+0030 DIGIT ZERO .. U+0039 DIGIT NINE, U+0041 LATIN CAPITAL LETTER A .. U+005A LATIN CAPITAL LETTER Z, U+0061 LATIN SMALL LETTER A .. U+007A LATIN SMALL LETTER Z.
Some of the micro-parsers described below follow the pattern of having an input variable that holds the string being parsed, and having a position variable pointing at the next character to parse in input .
For parsers based on this pattern, a step that requires the user agent to collect a sequence of characters means that the following algorithm must be run, with characters being the set of characters that can be collected:
Let input and position be the same variables as those of the same name in the algorithm that invoked these steps.
Let result be the empty string.
While position doesn't point past the end of input and the character at position is one of the characters , append that character to the end of result and advance position to the next character in input .
Return result .
The step skip whitespace means that the user agent must collect a sequence of characters that are space characters . The step skip White_Space characters means that the user agent must collect a sequence of characters that are White_Space characters. In both cases, the collected characters are not used. [UNICODE]
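The collection steps above can be sketched as follows (non-normative). Because Python integers are immutable, this sketch returns the advanced position alongside the result instead of updating a shared variable; the function names and that calling convention are assumptions of the example.

```python
from typing import Callable, Tuple

# The space characters as defined by this specification.
SPACE_CHARACTERS = frozenset("\u0020\u0009\u000a\u000c\u000d")


def collect_sequence(
    input_: str, position: int, is_allowed: Callable[[str], bool]
) -> Tuple[str, int]:
    """Collect characters accepted by is_allowed, starting at position.

    Returns the collected string and the advanced position.
    """
    result = ""
    while position < len(input_) and is_allowed(input_[position]):
        result += input_[position]
        position += 1
    return result, position


def skip_whitespace(input_: str, position: int) -> int:
    """Advance past space characters; the collected text is discarded."""
    _, position = collect_sequence(input_, position, lambda c: c in SPACE_CHARACTERS)
    return position
```

Note that U+00A0 NO-BREAK SPACE has the White_Space property but is not one of the space characters, so skip_whitespace does not advance past it.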
When a user agent is to strip line breaks from a string, the user agent must remove any U+000A LINE FEED (LF) and U+000D CARRIAGE RETURN (CR) characters from that string.
The code-point length of a string is the number of Unicode code points in that string.
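A non-normative sketch of the two string operations above. In Python 3, len() on a str already counts Unicode code points, so the code-point length needs no special handling. The UTF-16 helper is included only to show how the result can differ in environments that count 16-bit code units; all function names are illustrative.

```python
def strip_line_breaks(s: str) -> str:
    """Remove every U+000A LINE FEED and U+000D CARRIAGE RETURN character."""
    return s.replace("\n", "").replace("\r", "")


def code_point_length(s: str) -> int:
    """Number of Unicode code points in s."""
    return len(s)


def utf16_code_unit_length(s: str) -> int:
    """Number of 16-bit code units, shown for contrast only."""
    return len(s.encode("utf-16-le")) // 2
```

For a string consisting of the single character U+1D11E MUSICAL SYMBOL G CLEF, the code-point length is 1 while the UTF-16 code unit count is 2.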
A number of attributes are boolean attributes . The presence of a boolean attribute on an element represents the true value, and the absence of the attribute represents the false value.
If the attribute is present, its value must either be the empty string or a value that is an ASCII case-insensitive match for the attribute's canonical name, with no leading or trailing whitespace.
The values "true" and "false" are not allowed on boolean attributes. To represent a false value, the attribute has to be omitted altogether.
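The distinction between what a boolean attribute represents and whether its value is conforming can be sketched as follows (non-normative). The attribute map and the function names are assumptions made for the example.

```python
from typing import Mapping


def _ascii_lower(s: str) -> str:
    """Lowercase only the ASCII letters A-Z."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)


def boolean_attribute_is_true(attributes: Mapping[str, str], name: str) -> bool:
    """Presence alone represents true; the value is never consulted."""
    return name in attributes


def is_conforming_boolean_value(name: str, value: str) -> bool:
    """Conforming values are the empty string or the attribute's own name."""
    return value == "" or _ascii_lower(value) == _ascii_lower(name)
```

With these definitions, an attribute written with the value "false" still represents the true value, because it is present, but the value itself is flagged as non-conforming.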
Some attributes are defined as taking one of a finite set of keywords. Such attributes are called enumerated attributes . The keywords are each defined to map to a particular state (several keywords might map to the same state, in which case some of the keywords are synonyms of each other; additionally, some of the keywords can be said to be non-conforming, and are only in the specification for historical reasons). In addition, two default states can be given. The first is the invalid value default , the second is the missing value default .
If an enumerated attribute is specified, the attribute's value must be an ASCII case-insensitive match for one of the given keywords that are not said to be non-conforming, with no leading or trailing whitespace.
When the attribute is specified, if its value is an ASCII case-insensitive match for one of the given keywords then that keyword's state is the state that the attribute represents. If the attribute value matches none of the given keywords, but the attribute has an invalid value default , then the attribute represents that state. Otherwise, if the attribute value matches none of the keywords but there is a missing value default state defined, then that is the state represented by the attribute. Otherwise, there is no default, and invalid values must be ignored.
When the attribute is not specified, if there is a missing value default state defined, then that is the state represented by the (missing) attribute. Otherwise, the absence of the attribute means that there is no state represented.
The empty string can be a valid keyword.
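The state resolution rules for enumerated attributes can be sketched as follows (non-normative). A value of None stands for an attribute that is not specified, and a returned None stands for "no state represented" or "value ignored". The function name, parameter names, and the keywords used in the usage note are hypothetical.

```python
from typing import Mapping, Optional


def _ascii_lower(s: str) -> str:
    """Lowercase only the ASCII letters A-Z."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)


def enumerated_state(
    value: Optional[str],
    keywords: Mapping[str, str],
    invalid_default: Optional[str] = None,
    missing_default: Optional[str] = None,
) -> Optional[str]:
    """Resolve the state represented by an enumerated attribute."""
    if value is None:
        # Attribute not specified: only the missing value default applies.
        return missing_default
    needle = _ascii_lower(value)
    for keyword, state in keywords.items():
        if _ascii_lower(keyword) == needle:
            return state
    # Specified but unrecognised: invalid default first, then missing default.
    if invalid_default is not None:
        return invalid_default
    return missing_default
```

For instance, with the hypothetical keywords {"on": "enabled", "off": "disabled"} and a missing value default of "enabled", the value "OFF" resolves to "disabled", while both an absent attribute and the value "maybe" resolve to "enabled".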
A string is a valid non-negative integer if it consists of one or more characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9).
A valid non-negative integer represents the number that is represented in base ten by that string of digits.
The rules for parsing non-negative integers are as given in the following algorithm. When invoked, the steps must be followed in the order given, aborting at the first step that returns a value. This algorithm will either return zero, a positive integer, or an error. Leading spaces are ignored. Trailing spaces and any trailing garbage characters are ignored.
Let input be the string being parsed.
Let position be a pointer into input , initially pointing at the start of the string.
Let value have the value 0.
If position is past the end of input , return an error.
If the next character is a U+002B PLUS SIGN character (+), advance position to the next character.
If position is past the end of input , return an error.
If the next character is not one of U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9), then return an error.
Loop : If the next character is one of U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9):
Return value .
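A non-normative sketch of the rules for parsing non-negative integers. Two points are assumptions rather than text from the steps above: leading space characters are skipped before the first check (inferred from "Leading spaces are ignored"), and the loop accumulates each digit in base ten. Errors are represented by None.

```python
from typing import Optional

SPACE_CHARACTERS = " \t\n\f\r"
DIGITS = "0123456789"


def parse_non_negative_integer(input_: str) -> Optional[int]:
    """Return zero or a positive integer, or None on error."""
    position = 0
    value = 0
    # Assumed step: skip leading space characters.
    while position < len(input_) and input_[position] in SPACE_CHARACTERS:
        position += 1
    if position >= len(input_):
        return None
    if input_[position] == "+":
        position += 1
    if position >= len(input_):
        return None
    if input_[position] not in DIGITS:
        return None
    # Assumed loop body: accumulate digits in base ten.
    while position < len(input_) and input_[position] in DIGITS:
        value = value * 10 + DIGITS.index(input_[position])
        position += 1
    # Anything after the digits is trailing garbage and is ignored.
    return value
```

Under these assumptions, "  42px" parses to 42 and "-1" is an error.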
A string is a valid integer if it consists of one or more characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), optionally prefixed with a U+002D HYPHEN-MINUS ("-") character.
A valid integer without a U+002D HYPHEN-MINUS ("-") prefix represents the number that is represented in base ten by that string of digits. A valid integer with a U+002D HYPHEN-MINUS ("-") prefix represents the number represented in base ten by the string of digits that follows the U+002D HYPHEN-MINUS, subtracted from zero.
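Validity, as opposed to parsing, can be checked with a strict pattern (non-normative sketch; function names are illustrative). The character class [0-9] is used instead of \d because, in Python 3, \d also matches decimal digits outside the ASCII range.

```python
import re

_VALID_NON_NEGATIVE_INTEGER = re.compile(r"[0-9]+")
_VALID_INTEGER = re.compile(r"-?[0-9]+")


def is_valid_non_negative_integer(s: str) -> bool:
    """One or more ASCII digits and nothing else."""
    return _VALID_NON_NEGATIVE_INTEGER.fullmatch(s) is not None


def is_valid_integer(s: str) -> bool:
    """One or more ASCII digits, optionally prefixed with U+002D."""
    return _VALID_INTEGER.fullmatch(s) is not None
```

A string such as "+12" or " 12" is not a valid integer, even though the parsing rules tolerate a leading plus sign and leading spaces.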
The rules for parsing integers are similar to the rules for non-negative integers , and are as given in the following algorithm. When invoked, the steps must be followed in the order given, aborting at the first step that returns a value. This algorithm will either return an integer or an error. Leading spaces are ignored. Trailing spaces and trailing garbage characters are ignored.
Let input be the string being parsed.
Let position be a pointer into input , initially pointing at the start of the string.
Let value have the value 0.
Let sign have the value "positive".
If position is past the end of input , return an error.
If the character indicated by position (the first character) is a U+002D HYPHEN-MINUS ("-") character:
Otherwise, if the character indicated by position (the first character) is a U+002B PLUS SIGN character (+), then advance position to the next character. (The "+" is ignored, but it is not conforming.)
If the next character is not one of U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9), then return an error.
If the next character is one of U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9):
If sign is "positive", return value , otherwise return the result of subtracting value from zero.
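A non-normative sketch of the rules for parsing integers. The substeps for the hyphen-minus case and for the digit loop are not reproduced above, so the sketch assumes that a leading "-" sets sign to "negative" and advances position, that running out of input after a sign is an error, and that digits are accumulated in base ten. Skipping leading space characters is likewise inferred from "Leading spaces are ignored". Errors are represented by None.

```python
from typing import Optional

SPACE_CHARACTERS = " \t\n\f\r"
DIGITS = "0123456789"


def parse_integer(input_: str) -> Optional[int]:
    """Return an integer, or None on error."""
    position = 0
    value = 0
    sign = "positive"
    # Assumed step: skip leading space characters.
    while position < len(input_) and input_[position] in SPACE_CHARACTERS:
        position += 1
    if position >= len(input_):
        return None
    if input_[position] == "-":
        # Assumed substeps: remember the sign and move past it.
        sign = "negative"
        position += 1
    elif input_[position] == "+":
        # Ignored by the parser, although not conforming.
        position += 1
    if position >= len(input_) or input_[position] not in DIGITS:
        return None
    while position < len(input_) and input_[position] in DIGITS:
        value = value * 10 + DIGITS.index(input_[position])
        position += 1
    return value if sign == "positive" else 0 - value
```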
A string is a valid floating point number if it consists of:
A valid floating point number represents the number obtained by multiplying the significand by ten raised to the power of the exponent, where the significand is the first number, interpreted as base ten (including the decimal point and the number after the decimal point, if any, and interpreting the significand as a negative number if the whole string starts with a U+002D HYPHEN-MINUS ("-") character and the number is not zero), and where the exponent is the number after the E, if any (interpreted as a negative number if there is a U+002D HYPHEN-MINUS ("-") character between the E and the number and the number is not zero, or else ignoring a U+002B PLUS SIGN ("+") character between the E and the number if there is one). If there is no E, then the exponent is treated as zero.
The Infinity and Not-a-Number (NaN) values are not valid floating point numbers .
The best representation of the floating point number n is the string obtained from applying the JavaScript operator ToString to n .
The rules for parsing floating point number values are as given in the following algorithm. As with the previous algorithms, when this one is invoked, the steps must be followed in the order given, aborting at the first step that returns something. This algorithm will either return a number or an error. Leading spaces are ignored. Trailing spaces and garbage characters are ignored.
Let input be the string being parsed.
Let position be a pointer into input , initially pointing at the start of the string.
Let value have the value 1.
Let divisor have the value 1.
Let exponent have the value 1.
If position is past the end of input , return an error.
If the character indicated by position is a U+002D HYPHEN-MINUS ("-") character:
If the character indicated by position is not one of U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9), then return an error.
Collect a sequence of characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), and interpret the resulting sequence as a base-ten integer. Multiply value by that integer.
If the character indicated by position is a U+002E FULL STOP ("."), run these substeps:
Advance position to the next character.
If position is past the end of input , or if the character indicated by position is not one of U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9), then return value .
Fraction loop : Multiply divisor by ten.
Advance position to the next character.
If position is past the end of input , then return value .
If the character indicated by position is one of U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9), return to the step labeled fraction loop in these substeps.
If the character indicated by position is a U+0065 LATIN SMALL LETTER E character or a U+0045 LATIN CAPITAL LETTER E character, run these substeps:
Advance position to the next character.
If position is past the end of input , then return value .
If the character indicated by position is a U+002D HYPHEN-MINUS ("-") character:
If position is past the end of input , then return value .
Otherwise, if the character indicated by position is a U+002B PLUS SIGN ("+") character:
If position is past the end of input , then return value .
If the character indicated by position is not one of U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9), then return value .
Collect a sequence of characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), and interpret the resulting sequence as a base-ten integer. Multiply exponent by that integer.
Multiply value by ten raised to the exponent th power.
Return value .
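A non-normative sketch of the floating point parsing rules. It deliberately differs from the literal steps in two ways, both assumptions of the example: the sign is tracked separately and applied at the end instead of being folded into value, and the fraction digits are collected as a group and divided once instead of digit by digit. The sign substeps, which are not reproduced above, are assumed to advance past the sign character. Skipping leading space characters is inferred from "Leading spaces are ignored". Errors are represented by None, and very large exponents are not handled specially.

```python
from typing import Optional, Tuple

SPACE_CHARACTERS = " \t\n\f\r"
DIGITS = "0123456789"


def _collect_digits(s: str, position: int) -> Tuple[str, int]:
    """Collect a run of ASCII digits; return it with the advanced position."""
    start = position
    while position < len(s) and s[position] in DIGITS:
        position += 1
    return s[start:position], position


def parse_float(input_: str) -> Optional[float]:
    """Return a float, or None on error."""
    position = 0
    while position < len(input_) and input_[position] in SPACE_CHARACTERS:
        position += 1
    if position >= len(input_):
        return None
    negative = input_[position] == "-"
    if negative:
        position += 1
    if position >= len(input_) or input_[position] not in DIGITS:
        return None
    digits, position = _collect_digits(input_, position)
    magnitude = float(int(digits))

    def signed(m: float) -> float:
        return -m if negative else m

    if position < len(input_) and input_[position] == ".":
        position += 1
        fraction, position = _collect_digits(input_, position)
        if not fraction:
            # No digit after the full stop: return what has been parsed.
            return signed(magnitude)
        magnitude += int(fraction) / 10 ** len(fraction)
    if position < len(input_) and input_[position] in "eE":
        position += 1
        exponent_sign = 1
        if position < len(input_) and input_[position] in "+-":
            if input_[position] == "-":
                exponent_sign = -1
            position += 1
        exponent_digits, position = _collect_digits(input_, position)
        if exponent_digits:
            magnitude *= 10.0 ** (exponent_sign * int(exponent_digits))
    return signed(magnitude)
```

Tracking the sign separately means that fractional digits of a negative number move the result away from zero, which is what the definition of a valid floating point number describes.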
The algorithms described in this section are used by the progress and meter elements.
A valid denominator punctuation character is one of the characters from the table below. There is a value associated with each denominator punctuation character , as shown in the table below.
| Denominator Punctuation Character | Glyph | Value |
|---|---|---|
| U+0025 PERCENT SIGN | % | 100 |
| U+066A ARABIC PERCENT SIGN | ٪ | 100 |
| U+FE6A SMALL PERCENT SIGN | ﹪ | 100 |
| U+FF05 FULLWIDTH PERCENT SIGN | ％ | 100 |
| U+2030 PER MILLE SIGN | ‰ | 1000 |
| U+2031 PER TEN THOUSAND SIGN | ‱ | 10000 |
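The table above maps directly onto a lookup structure. The following non-normative sketch uses escape sequences for the code points so that the mapping does not depend on how the glyphs are rendered; the names are illustrative.

```python
from typing import Optional

# Denominator punctuation characters and their associated values.
DENOMINATOR_PUNCTUATION = {
    "\u0025": 100,    # PERCENT SIGN
    "\u066a": 100,    # ARABIC PERCENT SIGN
    "\ufe6a": 100,    # SMALL PERCENT SIGN
    "\uff05": 100,    # FULLWIDTH PERCENT SIGN
    "\u2030": 1000,   # PER MILLE SIGN
    "\u2031": 10000,  # PER TEN THOUSAND SIGN
}


def denominator_value(character: str) -> Optional[int]:
    """Return the associated value, or None if not a valid denominator character."""
    return DENOMINATOR_PUNCTUATION.get(character)
```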
The steps for finding one or two numbers of a ratio in a string are as follows:
The algorithm to find a number is as follows. It is given a string and a starting position, and returns either nothing, a number, or an error condition.
The rules for parsing dimension values are as given in the following algorithm. When invoked, the steps must be followed in the order given, aborting at the first step that returns a value. This algorithm will either return a number greater than or equal to 1.0, or an error; if a number is returned, then it is further categorized as either a percentage or a length.
Let input be the string being parsed.
Let position be a pointer into input , initially pointing at the start of the string.
If position is past the end of input , return an error.
If the next character is a U+002B PLUS SIGN character (+), advance position to the next character.
Collect a sequence of characters that are U+0030 DIGIT ZERO (0) characters, and discard them.
If position is past the end of input , return an error.
If the next character is not one of U+0031 DIGIT ONE (1) .. U+0039 DIGIT NINE (9), then return an error.
Collect a sequence of characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), and interpret the resulting sequence as a base-ten integer. Let value be that number.
If position is past the end of input , return value as an integer.
If the next character is a U+002E FULL STOP character (.):
Advance position to the next character.
If the next character is not one of U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9), then return value as an integer.
Collect a sequence of characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). Let length be the number of characters collected. Let fraction be the result of interpreting the collected characters as a base-ten integer, and then dividing that number by 10 raised to the power of length.
Increment value by fraction .
If position is past the end of input , return value as a length.
If the next character is a U+0025 PERCENT SIGN character (%), return value as a percentage.
Return value as a length.
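A non-normative sketch of the rules for parsing dimension values, following the steps as listed. The result is a pair of the number and a label; the labels "integer", "length", and "percentage" mirror the wording of the return steps above. No leading space characters are skipped, since the listed steps do not mention it. Errors are represented by None; the function name is illustrative.

```python
from typing import Optional, Tuple, Union

DIGITS = "0123456789"


def parse_dimension_value(input_: str) -> Optional[Tuple[Union[int, float], str]]:
    """Return (value, label), or None on error."""
    position = 0
    end = len(input_)
    if position >= end:
        return None
    if input_[position] == "+":
        position += 1
    # Leading zeros are collected and discarded.
    while position < end and input_[position] == "0":
        position += 1
    if position >= end:
        return None
    if input_[position] not in "123456789":
        return None
    start = position
    while position < end and input_[position] in DIGITS:
        position += 1
    value: Union[int, float] = int(input_[start:position])
    if position >= end:
        return value, "integer"
    if input_[position] == ".":
        position += 1
        if position >= end or input_[position] not in DIGITS:
            return value, "integer"
        start = position
        while position < end and input_[position] in DIGITS:
            position += 1
        digits = input_[start:position]
        value += int(digits) / 10 ** len(digits)
    if position >= end:
        return value, "length"
    if input_[position] == "%":
        return value, "percentage"
    return value, "length"
```

Because leading zeros are discarded before the first non-zero digit is required, the input "0" is an error under these rules.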
A valid list of integers is a number of valid integers separated by U+002C COMMA characters, with no other characters (e.g. no space characters ). In addition, there might be restrictions on the number of integers that can be given, or on the range of values allowed.
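A non-normative sketch of a validity check for a list of integers. It assumes that at least one integer is required, so the empty string is rejected; the definition above does not settle that case. Restrictions on the count or range of the integers are left to the caller. The function name is illustrative.

```python
import re

# One valid integer, then zero or more ",integer" groups, and nothing else.
_VALID_INTEGER_LIST = re.compile(r"-?[0-9]+(?:,-?[0-9]+)*")


def is_valid_list_of_integers(s: str) -> bool:
    """True when s is valid integers separated only by U+002C COMMA."""
    return _VALID_INTEGER_LIST.fullmatch(s) is not None
```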
The rules for parsing a list of integers are as follows:
Let input be the string being parsed.
Let position be a pointer into input , initially pointing at the start of the string.
Let numbers be an initially empty list of integers. This list will be the result of this algorithm.
If there is a character in the string input at position position , and it is either a U+0020 SPACE, U+002C COMMA, or U+003B SEMICOLON character, then advance position to the next character in input , or to beyond the end of the string if there are no more characters.
If position points to beyond the end of input , return numbers and abort.
If the character in the string input at position position is a U+0020 SPACE, U+002C COMMA, or U+003B SEMICOLON character, then return to step 4.
Let negated be false.
Let value be 0.
Let started be false. This variable is set to true when the parser sees a number or a U+002D HYPHEN-MINUS ("-") character.
Let got number be false. This variable is set to true when the parser sees a number.
Let finished be false. This variable is set to true to switch parser into a mode where it ignores characters until the next separator.
Let bogus be false.
Parser : If the character in the string input at position position is:
Follow these substeps:
Follow these substeps:
Follow these substeps:
1,2,x,4
".
Follow these substeps:
Follow these substeps:
Advance position to the next character in input , or to beyond the end of the string if there are no more characters.
If position points to a character (and not to beyond the end of input ), jump to the big Parser step above.
If negated is true, then negate value .
If got number is true, then append value to the numbers list.
Return the numbers list and abort.
The rules for parsing a list of dimensions are as follows. These rules return a list of zero or more pairs consisting of a number and a unit, the unit being one of percentage , relative , and absolute .
Let raw input be the string being parsed.
If the last character in raw input is a U+002C COMMA character (","), then remove that character from raw input .
Split the string raw input on commas . Let raw tokens be the resulting list of tokens.
Let result be an empty list of number/unit pairs.
For each token in raw tokens , run the following substeps:
Let input be the token.
Let position be a pointer into input , initially pointing at the start of the string.
Let value be the number 0.
Let unit be absolute .
If position is past the end of input , set unit to relative and jump to the last substep.
If the character at position is a character in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), collect a sequence of characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), interpret the resulting sequence as an integer in base ten, and increment value by that integer.
If the character at position is a U+002E FULL STOP character (.), run these substeps:
Collect a sequence of characters consisting of space characters and characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). Let s be the resulting sequence.
Remove all space characters in s .
If s is not the empty string, run these subsubsteps:
Let length be the number of characters in s (after the spaces were removed).
Let fraction be the result of interpreting s as a base-ten integer, and then dividing that number by 10 raised to the power of length.
Increment value by fraction .
If the character at position is a U+0025 PERCENT SIGN (%) character, then set unit to percentage .
Otherwise, if the character at position is a U+002A ASTERISK character (*), then set unit to relative .
Add an entry to result consisting of the number given by value and the unit given by unit .
Return the list result .
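A non-normative sketch of the rules for parsing a list of dimensions. Three points are assumptions of the example rather than text from the steps above: splitting on commas is taken to mean splitting on U+002C and stripping space characters from each token; the full stop is stepped over before the fraction digits are collected; and a position past the end of the token is treated as matching no character. The function name is illustrative.

```python
from typing import List, Tuple, Union

SPACE_CHARACTERS = " \t\n\f\r"
DIGITS = "0123456789"


def parse_dimension_list(raw_input: str) -> List[Tuple[Union[int, float], str]]:
    """Return a list of (number, unit) pairs."""
    if raw_input.endswith(","):
        raw_input = raw_input[:-1]
    # Assumed meaning of "split on commas": split, then trim space characters.
    raw_tokens = [token.strip(SPACE_CHARACTERS) for token in raw_input.split(",")]
    result: List[Tuple[Union[int, float], str]] = []
    for input_ in raw_tokens:
        position = 0
        value: Union[int, float] = 0
        unit = "absolute"
        if position >= len(input_):
            result.append((value, "relative"))
            continue
        start = position
        while position < len(input_) and input_[position] in DIGITS:
            position += 1
        if position > start:
            value += int(input_[start:position])
        if position < len(input_) and input_[position] == ".":
            position += 1  # Assumption: step over the full stop itself.
            start = position
            while position < len(input_) and input_[position] in SPACE_CHARACTERS + DIGITS:
                position += 1
            s = "".join(c for c in input_[start:position] if c in DIGITS)
            if s:
                value += int(s) / 10 ** len(s)
        if position < len(input_) and input_[position] == "%":
            unit = "percentage"
        elif position < len(input_) and input_[position] == "*":
            unit = "relative"
        result.append((value, unit))
    return result
```

Under these assumptions, the input "10, 25%, *, 1.5*," yields an absolute 10, a percentage 25, a relative 0, and a relative 1.5.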
In the algorithms below, the number of days in month month of year year is: 31 if month is 1, 3, 5, 7, 8, 10, or 12; 30 if month is 4, 6, 9, or 11; 29 if month is 2 and year is a number divisible by 400, or if year is a number divisible by 4 but not by 100; and 28 otherwise. This takes into account leap years in the Gregorian calendar. [GREGORIAN]
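The definition above translates directly into code. In this non-normative sketch the function name is illustrative, and rejecting a month outside 1 to 12 with ValueError is an assumption of the example.

```python
def days_in_month(month: int, year: int) -> int:
    """Number of days in the given month of the given Gregorian year."""
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {month}")
    if month in (1, 3, 5, 7, 8, 10, 12):
        return 31
    if month in (4, 6, 9, 11):
        return 30
    # month is 2: apply the Gregorian leap year rule.
    if year % 400 == 0 or (year % 4 == 0 and year % 100 != 0):
        return 29
    return 28
```

For example, February has 29 days in 2000 and 2008 but 28 days in 1900 and 2009.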
The digits in the date and time syntaxes defined in this section must be characters in the range U+0030 DIGIT ZERO to U+0039 DIGIT NINE, used to express numbers in base ten.
While the formats described here are intended to be subsets of the corresponding ISO8601 formats, this specification defines parsing rules in much more detail than ISO8601. Implementors are therefore encouraged to carefully examine any date parsing libraries before using them to implement the parsing rules described below; ISO8601 libraries might not parse dates and times in exactly the same manner. [ISO8601]
A month consists of a specific proleptic Gregorian date with no time-zone information and no date information beyond a year and a month. [GREGORIAN]
A string is a valid month string representing a year year and month month if it consists of the following components in the given order:
The rules to parse a month string are as follows. This will either return a year and month, or nothing. If at any point the algorithm says that it "fails", this means that it is aborted at that point and returns nothing.
Let input be the string being parsed.
Let position be a pointer into input , initially pointing at the start of the string.
Parse a month component to obtain year and month . If this returns nothing, then fail.
If position is not beyond the end of input , then fail.
Return year and month .
The rules to parse a month component , given an input string and a position , are as follows. This will either return a year and a month, or nothing. If at any point the algorithm says that it "fails", this means that it is aborted at that point and returns nothing.
Collect a sequence of characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected sequence is not at least four characters long, then fail. Otherwise, interpret the resulting sequence as a base-ten integer. Let that number be the year .
If year is not a number greater than zero, then fail.
If position is beyond the end of input or if the character at position is not a U+002D HYPHEN-MINUS character, then fail. Otherwise, move position forwards one character.
Collect a sequence of characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected sequence is not exactly two characters long, then fail. Otherwise, interpret the resulting sequence as a base-ten integer. Let that number be the month .
If month is not a number in the range 1 ≤ month ≤ 12, then fail.
Return year and month .
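A minimal, non-normative sketch of the two month algorithms above, in Python. The names collect_digits, parse_month_component and parse_month_string are illustrative assumptions; the position pointer is modelled as an integer index that each function returns alongside its result, and "nothing" is modelled as None.

```python
from typing import Optional, Tuple

ASCII_DIGITS = "0123456789"

def collect_digits(s: str, pos: int) -> Tuple[str, int]:
    """Collect a run of ASCII digits starting at pos; return (digits, new position)."""
    end = pos
    while end < len(s) and s[end] in ASCII_DIGITS:
        end += 1
    return s[pos:end], end

def parse_month_component(s: str, pos: int) -> Optional[Tuple[int, int, int]]:
    """Return (year, month, new position), or None on failure."""
    digits, pos = collect_digits(s, pos)
    if len(digits) < 4:
        return None
    year = int(digits)
    if year <= 0:
        return None
    if pos >= len(s) or s[pos] != "-":
        return None
    pos += 1
    digits, pos = collect_digits(s, pos)
    if len(digits) != 2:
        return None
    month = int(digits)
    if not 1 <= month <= 12:
        return None
    return year, month, pos

def parse_month_string(s: str) -> Optional[Tuple[int, int]]:
    """Parse a whole string as a month; trailing characters cause failure."""
    parsed = parse_month_component(s, 0)
    if parsed is None:
        return None
    year, month, pos = parsed
    if pos != len(s):
        return None
    return year, month
```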
A date consists of a specific proleptic Gregorian date with no time-zone information, consisting of a year, a month, and a day. [GREGORIAN]
A string is a valid date string representing a year year , month month , and day day if it consists of the following components in the given order:
The rules to parse a date string are as follows. This will either return a date, or nothing. If at any point the algorithm says that it "fails", this means that it is aborted at that point and returns nothing.
Let input be the string being parsed.
Let position be a pointer into input , initially pointing at the start of the string.
Parse a date component to obtain year , month , and day . If this returns nothing, then fail.
If position is not beyond the end of input , then fail.
Let date be the date with year year , month month , and day day .
Return date .
The rules to parse a date component , given an input string and a position , are as follows. This will either return a year, a month, and a day, or nothing. If at any point the algorithm says that it "fails", this means that it is aborted at that point and returns nothing.
Parse a month component to obtain year and month . If this returns nothing, then fail.
Let maxday be the number of days in month month of year year .
If position is beyond the end of input or if the character at position is not a U+002D HYPHEN-MINUS character, then fail. Otherwise, move position forwards one character.
Collect a sequence of characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected sequence is not exactly two characters long, then fail. Otherwise, interpret the resulting sequence as a base-ten integer. Let that number be the day .
If day is not a number in the range 1 ≤ day ≤ maxday , then fail.
Return year , month , and day .
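As a non-normative illustration, parsing a complete date string could be sketched in Python as below. Note that this sketch swaps the position-pointer steps for a regular expression, which gives the same outcome only when the whole string is being parsed; the names parse_date_string and days_in_month are assumptions made for the example.

```python
import re
from typing import Optional, Tuple

# Four or more ASCII digits, a hyphen, two digits, a hyphen, two digits.
_DATE_RE = re.compile(r"([0-9]{4,})-([0-9]{2})-([0-9]{2})")

def days_in_month(year: int, month: int) -> int:
    """Days in the month, using the Gregorian leap-year rule; month is 1..12."""
    if month == 2:
        is_leap = year % 400 == 0 or (year % 4 == 0 and year % 100 != 0)
        return 29 if is_leap else 28
    return 30 if month in (4, 6, 9, 11) else 31

def parse_date_string(s: str) -> Optional[Tuple[int, int, int]]:
    """Return (year, month, day), or None if the string is not a date."""
    match = _DATE_RE.fullmatch(s)
    if match is None:
        return None
    year, month, day = (int(group) for group in match.groups())
    if year <= 0 or not 1 <= month <= 12:
        return None
    if not 1 <= day <= days_in_month(year, month):
        return None
    return year, month, day
```

The day is checked against the length of the specific month, so "2009-02-29" is rejected while "2008-02-29" is accepted.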
A time consists of a specific time with no time-zone information, consisting of an hour, a minute, a second, and a fraction of a second.
A string is a valid time string representing an hour hour , a minute minute , and a second second if it consists of the following components in the given order:
The second component cannot be 60 or 61; leap seconds cannot be represented.
The rules to parse a time string are as follows. This will either return a time, or nothing. If at any point the algorithm says that it "fails", this means that it is aborted at that point and returns nothing.
Let input be the string being parsed.
Let position be a pointer into input , initially pointing at the start of the string.
Parse a time component to obtain hour , minute , and second . If this returns nothing, then fail.
If position is not beyond the end of input , then fail.
Let time be the time with hour hour , minute minute , and second second .
Return time .
The rules to parse a time component , given an input string and a position , are as follows. This will either return an hour, a minute, and a second, or nothing. If at any point the algorithm says that it "fails", this means that it is aborted at that point and returns nothing.
Collect a sequence of characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected sequence is not exactly two characters long, then fail. Otherwise, interpret the resulting sequence as a base-ten integer. Let that number be the hour .
If position is beyond the end of input or if the character at position is not a U+003A COLON character, then fail. Otherwise, move position forwards one character.
Collect a sequence of characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected sequence is not exactly two characters long, then fail. Otherwise, interpret the resulting sequence as a base-ten integer. Let that number be the minute .
Let second be a string with the value "0".
If position is not beyond the end of input and the character at position is a U+003A COLON, then run these substeps:
Advance position to the next character in input .
If position is beyond the end of input , or at the last character in input , or if the next two characters in input starting at position are not two characters both in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), then fail.
Collect a sequence of characters that are either characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9) or U+002E FULL STOP characters. If the collected sequence has more than one U+002E FULL STOP character, or if the last character in the sequence is a U+002E FULL STOP character, then fail. Otherwise, let the collected string be second instead of its previous value.
Interpret second as a base-ten number (possibly with a fractional part). Let second be that number instead of the string version.
If second is not a number in the range 0 ≤ second < 60, then fail.
Return hour , minute , and second .
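The steps for parsing a time component can be sketched in Python as follows. This is a non-normative illustration that follows the steps exactly as listed above, so it range-checks only the second; the names _collect and parse_time_component are assumptions, and the position pointer is returned as an integer index.

```python
from typing import Optional, Tuple

DIGITS = "0123456789"

def _collect(s: str, pos: int, allowed: str) -> Tuple[str, int]:
    """Collect a run of characters drawn from `allowed`, starting at pos."""
    end = pos
    while end < len(s) and s[end] in allowed:
        end += 1
    return s[pos:end], end

def parse_time_component(s: str, pos: int = 0) -> Optional[Tuple[int, int, float, int]]:
    """Return (hour, minute, second, new position), or None on failure."""
    text, pos = _collect(s, pos, DIGITS)
    if len(text) != 2:
        return None
    hour = int(text)
    if pos >= len(s) or s[pos] != ":":
        return None
    pos += 1
    text, pos = _collect(s, pos, DIGITS)
    if len(text) != 2:
        return None
    minute = int(text)
    second_text = "0"
    if pos < len(s) and s[pos] == ":":
        pos += 1
        # The next two characters must both be digits.
        if pos + 1 >= len(s) or s[pos] not in DIGITS or s[pos + 1] not in DIGITS:
            return None
        second_text, pos = _collect(s, pos, DIGITS + ".")
        if second_text.count(".") > 1 or second_text.endswith("."):
            return None
    second = float(second_text)
    if not 0 <= second < 60:
        return None
    return hour, minute, second, pos
```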
A local date and time consists of a specific proleptic Gregorian date, consisting of a year, a month, and a day, and a time, consisting of an hour, a minute, a second, and a fraction of a second, but expressed without a time zone. [GREGORIAN]
A string is a valid local date and time string representing a date and time if it consists of the following components in the given order:
The rules to parse a local date and time string are as follows. This will either return a date and time, or nothing. If at any point the algorithm says that it "fails", this means that it is aborted at that point and returns nothing.
Let input be the string being parsed.
Let position be a pointer into input , initially pointing at the start of the string.
Parse a date component to obtain year , month , and day . If this returns nothing, then fail.
If position is beyond the end of input or if the character at position is not a U+0054 LATIN CAPITAL LETTER T character then fail. Otherwise, move position forwards one character.
Parse a time component to obtain hour , minute , and second . If this returns nothing, then fail.
If position is not beyond the end of input , then fail.
Let date be the date with year year , month month , and day day .
Let time be the time with hour hour , minute minute , and second second .
Return date and time .
A global date and time consists of a specific proleptic Gregorian date, consisting of a year, a month, and a day, and a time, consisting of an hour, a minute, a second, and a fraction of a second, expressed with a time zone, consisting of a number of hours and minutes. [GREGORIAN]
A string is a valid global date and time string representing a date, time, and a time-zone offset if it consists of the following components in the given order:
This format allows for time zone offsets from -23:59 to +23:59. In practice, however, the range of actual time zones is -12:00 to +14:00, and the minutes component of actual time zones is always either 00, 30, or 45.
The following are some examples of dates written as valid global date and time strings .
"0037-12-13T00:00Z"
"1979-10-14T12:00:00.001-04:00"
"8592-01-01T02:09+02:09"
Several things are notable about these dates:
The rules to parse a global date and time string are as follows. This will either return a time in UTC, with associated time-zone information for round tripping or display purposes, or nothing. If at any point the algorithm says that it "fails", this means that it is aborted at that point and returns nothing.
Let input be the string being parsed.
Let position be a pointer into input , initially pointing at the start of the string.
Parse a date component to obtain year , month , and day . If this returns nothing, then fail.
If position is beyond the end of input or if the character at position is not a U+0054 LATIN CAPITAL LETTER T character then fail. Otherwise, move position forwards one character.
Parse a time component to obtain hour , minute , and second . If this returns nothing, then fail.
If position is beyond the end of input , then fail.
Parse a time-zone component to obtain timezone hours and timezone minutes . If this returns nothing, then fail.
If position is not beyond the end of input , then fail.
Let time be the moment in time at year year , month month , day day , hours hour , minute minute , second second , subtracting timezone hours hours and timezone minutes minutes. That moment in time is a moment in the UTC time zone.
Let timezone be timezone hours hours and timezone minutes minutes from UTC.
Return time and timezone .
The rules to parse a time-zone component , given an input string and a position , are as follows. This will either return time-zone hours and time-zone minutes, or nothing. If at any point the algorithm says that it "fails", this means that it is aborted at that point and returns nothing.
If the character at position is a U+005A LATIN CAPITAL LETTER Z, then:
Let timezone hours be 0.
Let timezone minutes be 0.
Advance position to the next character in input .
Otherwise, if the character at position is either a U+002B PLUS SIGN ("+") or a U+002D HYPHEN-MINUS ("-"), then:
If the character at position is a U+002B PLUS SIGN ("+"), let sign be "positive". Otherwise, it's a U+002D HYPHEN-MINUS ("-"); let sign be "negative".
Advance position to the next character in input .
Collect a sequence of characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected sequence is not exactly two characters long, then fail. Otherwise, interpret the resulting sequence as a base-ten integer. Let that number be the timezone hours .
If position is beyond the end of input or if the character at position is not a U+003A COLON character, then fail. Otherwise, move position forwards one character.
Collect a sequence of characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected sequence is not exactly two characters long, then fail. Otherwise, interpret the resulting sequence as a base-ten integer. Let that number be the timezone minutes .
Return timezone hours and timezone minutes .
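A non-normative Python sketch of the time-zone component steps follows. It makes three assumptions that the steps as listed above do not state: input that starts with neither "Z" nor a sign fails; a "negative" sign negates both the hours and the minutes; and hours above 23 or minutes above 59 fail, in line with the -23:59 to +23:59 range given earlier. The name parse_time_zone_component is also illustrative.

```python
from typing import Optional, Tuple

def parse_time_zone_component(s: str, pos: int = 0) -> Optional[Tuple[int, int, int]]:
    """Return (timezone hours, timezone minutes, new position), or None."""
    if pos >= len(s):
        return None
    if s[pos] == "Z":
        return 0, 0, pos + 1
    if s[pos] not in ("+", "-"):
        return None  # assumption: anything else is a failure
    sign = 1 if s[pos] == "+" else -1
    pos += 1
    fields = []
    for index in range(2):
        # Collect a run of ASCII digits; it must be exactly two long.
        end = pos
        while end < len(s) and s[end] in "0123456789":
            end += 1
        if end - pos != 2:
            return None
        fields.append(int(s[pos:end]))
        pos = end
        if index == 0:
            # A colon must separate the hours from the minutes.
            if pos >= len(s) or s[pos] != ":":
                return None
            pos += 1
    hours, minutes = fields
    if hours > 23 or minutes > 59:
        return None  # assumption: range taken from the stated -23:59..+23:59
    # Assumption: a negative sign applies to both fields.
    return sign * hours, sign * minutes, pos
```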
A week consists of a week-year number and a week number representing a seven-day period. Each week-year in this calendaring system has either 52 weeks or 53 weeks, as defined below. The week starting on the Gregorian date Monday December 29th 1969 (1969-12-29) is defined as week number 1 in week-year 1970. Consecutive weeks are numbered sequentially. The week before the number 1 week in a week-year is the last week in the previous week-year, and vice versa. [GREGORIAN]
A week-year with a number year has 53 weeks if it corresponds to either a year year in the proleptic Gregorian calendar that has a Thursday as its first day (January 1st), or a year year in the proleptic Gregorian calendar that has a Wednesday as its first day (January 1st) and where year is a number divisible by 400, or a number divisible by 4 but not by 100. All other week-years have 52 weeks.
The week number of the last day of a week-year with 53 weeks is 53; the week number of the last day of a week-year with 52 weeks is 52.
The week-year number of a particular day can be different than the number of the year that contains that day in the proleptic Gregorian calendar. The first week in a week-year y is the week that contains the first Thursday of the Gregorian year y .
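The 52-or-53 rule above can be sketched in Python as follows. This is a non-normative illustration; the function name weeks_in_week_year is an assumption, and because it relies on the standard datetime module it only covers years 1 to 9999.

```python
from datetime import date

def weeks_in_week_year(year: int) -> int:
    """Number of weeks (52 or 53) in the week-year with the given number."""
    first_weekday = date(year, 1, 1).weekday()  # Monday is 0, Thursday is 3
    is_leap = year % 400 == 0 or (year % 4 == 0 and year % 100 != 0)
    # 53 weeks if January 1st is a Thursday, or a Wednesday in a leap year.
    if first_weekday == 3 or (first_weekday == 2 and is_leap):
        return 53
    return 52
```

For example, 2020 began on a Wednesday and was a leap year, so week-year 2020 has 53 weeks.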
A string is a valid week string representing a week-year year and week week if it consists of the following components in the given order:
The rules to parse a week string are as follows. This will either return a week-year number and week number, or nothing. If at any point the algorithm says that it "fails", this means that it is aborted at that point and returns nothing.
Let input be the string being parsed.
Let position be a pointer into input , initially pointing at the start of the string.
Collect a sequence of characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected sequence is not at least four characters long, then fail. Otherwise, interpret the resulting sequence as a base-ten integer. Let that number be the year .
If year is not a number greater than zero, then fail.
If position is beyond the end of input or if the character at position is not a U+002D HYPHEN-MINUS character, then fail. Otherwise, move position forwards one character.
If position is beyond the end of input or if the character at position is not a U+0057 LATIN CAPITAL LETTER W character, then fail. Otherwise, move position forwards one character.
Collect a sequence of characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9). If the collected sequence is not exactly two characters long, then fail. Otherwise, interpret the resulting sequence as a base-ten integer. Let that number be the week .
Let maxweek be the week number of the last day of week-year year .
If week is not a number in the range 1 ≤ week ≤ maxweek , then fail.
If position is not beyond the end of input , then fail.
Return the week-year number year and the week number week .
A date or time string consists of either a date , a time , or a global date and time .
A string is a valid date or time string if it is also one of the following:
A string is a valid date or time string in content if it consists of zero or more White_Space characters, followed by a valid date or time string , followed by zero or more further White_Space characters.
The rules to parse a date or time string are as follows. The algorithm is invoked with a flag indicating if the in attribute variant or the in content variant is to be used. The algorithm will either return a date , a time , a global date and time , or nothing. If at any point the algorithm says that it "fails", this means that it is aborted at that point and returns nothing.
Let input be the string being parsed.
Let position be a pointer into input , initially pointing at the start of the string.
For the in content variant: skip White_Space characters .
Set start position to the same position as position .
Set the date present and time present flags to true.
Parse a date component to obtain year , month , and day . If this fails, then set the date present flag to false.
If date present is true, and position is not beyond the end of input , and the character at position is a U+0054 LATIN CAPITAL LETTER T character, then advance position to the next character in input .
Otherwise, if date present is true, and either position is beyond the end of input or the character at position is not a U+0054 LATIN CAPITAL LETTER T character, then set time present to false.
Otherwise, if date present is false, set position back to the same position as start position .
If the time present flag is true, then parse a time component to obtain hour , minute , and second . If this returns nothing, then set the time present flag to false.
If both the date present and time present flags are false, then fail.
If the date present and time present flags are both true, but position is beyond the end of input , then fail.
If the date present and time present flags are both true, parse a time-zone component to obtain timezone hours and timezone minutes . If this returns nothing, then fail.
For the in content variant: skip White_Space characters .
If position is not beyond the end of input , then fail.
If the date present flag is true and the time present flag is false, then let date be the date with year year , month month , and day day , and return date .
Otherwise, if the time present flag is true and the date present flag is false, then let time be the time with hour hour , minute minute , and second second , and return time .
Otherwise, let time be the moment in time at year year , month month , day day , hours hour , minute minute , second second , subtracting timezone hours hours and timezone minutes minutes, that moment in time being a moment in the UTC time zone; let timezone be timezone hours hours and timezone minutes minutes from UTC; and return time and timezone .
A simple color consists of three 8-bit numbers in the range 0..255, representing the red, green, and blue components of the color respectively, in the sRGB color space. [SRGB]
A string is a valid simple color if it is exactly seven characters long, and the first character is a U+0023 NUMBER SIGN (#) character, and the remaining six characters are all in the range U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9), U+0041 LATIN CAPITAL LETTER A .. U+0046 LATIN CAPITAL LETTER F, U+0061 LATIN SMALL LETTER A .. U+0066 LATIN SMALL LETTER F, with the first two digits representing the red component, the middle two digits representing the green component, and the last two digits representing the blue component, in hexadecimal.
A string is a valid lowercase simple color if it is a valid simple color and doesn't use any characters in the range U+0041 LATIN CAPITAL LETTER A .. U+0046 LATIN CAPITAL LETTER F.
The rules for parsing simple color values are as given in the following algorithm. When invoked, the steps must be followed in the order given, aborting at the first step that returns a value. This algorithm will either return a simple color or an error.
Let input be the string being parsed.
If input is not exactly seven characters long, then return an error.
If the first character in input is not a U+0023 NUMBER SIGN (#) character, then return an error.
If the last six characters of input are not all in the range U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9), U+0041 LATIN CAPITAL LETTER A .. U+0046 LATIN CAPITAL LETTER F, U+0061 LATIN SMALL LETTER A .. U+0066 LATIN SMALL LETTER F, then return an error.
Let result be a simple color .
Interpret the second and third characters as a hexadecimal number and let the result be the red component of result .
Interpret the fourth and fifth characters as a hexadecimal number and let the result be the green component of result .
Interpret the sixth and seventh characters as a hexadecimal number and let the result be the blue component of result .
Return result .
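A non-normative Python sketch of the rules for parsing simple color values. The name parse_simple_color is illustrative, a simple color is modelled as a (red, green, blue) tuple, and an error is modelled as None.

```python
from typing import Optional, Tuple

HEX_DIGITS = "0123456789abcdefABCDEF"

def parse_simple_color(s: str) -> Optional[Tuple[int, int, int]]:
    """Return (red, green, blue), or None if the string is not a simple color."""
    if len(s) != 7 or s[0] != "#":
        return None
    if any(ch not in HEX_DIGITS for ch in s[1:]):
        return None
    # Each pair of hexadecimal digits is one 8-bit component.
    return int(s[1:3], 16), int(s[3:5], 16), int(s[5:7], 16)
```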
The rules for serializing simple color values given a simple color are as given in the following algorithm:
Let result be a string consisting of a single U+0023 NUMBER SIGN (#) character.
Convert the red, green, and blue components in turn to two-digit hexadecimal numbers using the digits U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9) and U+0061 LATIN SMALL LETTER A .. U+0066 LATIN SMALL LETTER F, zero-padding if necessary, and append these numbers to result , in the order red, green, blue.
Return result , which will be a valid lowercase simple color .
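Serialization can be sketched in the same way. In this non-normative Python example the name serialize_simple_color is an assumption, and the simple color is passed as a (red, green, blue) tuple of integers in the range 0..255.

```python
from typing import Tuple

def serialize_simple_color(color: Tuple[int, int, int]) -> str:
    """Return the valid lowercase simple color string for the given components."""
    red, green, blue = color
    # Two lowercase hexadecimal digits per component, zero-padded.
    return "#{:02x}{:02x}{:02x}".format(red, green, blue)
```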
Some obsolete legacy attributes parse colors in a more complicated manner, using the rules for parsing a legacy color value , which are given in the following algorithm. When invoked, the steps must be followed in the order given, aborting at the first step that returns a value. This algorithm will either return a simple color or an error.
Let input be the string being parsed.
If input is the empty string, then return an error.
If input is an ASCII case-insensitive match for the string "transparent", then return an error.
If input is an ASCII case-insensitive match for one of the keywords listed in the SVG color keywords or CSS2 System Colors sections of the CSS3 Color specification, then return the simple color corresponding to that keyword. [CSSCOLOR]
If input is four characters long, and the first character in input is a U+0023 NUMBER SIGN (#) character, and the last three characters of input are all in the range U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9), U+0041 LATIN CAPITAL LETTER A .. U+0046 LATIN CAPITAL LETTER F, and U+0061 LATIN SMALL LETTER A .. U+0066 LATIN SMALL LETTER F, then run these substeps:
Let result be a simple color .
Interpret the second character of input as a hexadecimal digit; let the red component of result be the resulting number multiplied by 17.
Interpret the third character of input as a hexadecimal digit; let the green component of result be the resulting number multiplied by 17.
Interpret the fourth character of input as a hexadecimal digit; let the blue component of result be the resulting number multiplied by 17.
Return result .
Replace any characters in input that have a Unicode code point greater than U+FFFF (i.e. any characters that are not in the basic multilingual plane) with the two-character string "00".
If input is longer than 128 characters, truncate input , leaving only the first 128 characters.
If the first character in input is a U+0023 NUMBER SIGN character (#), remove it.
Replace any character in input that is not in the range U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9), U+0041 LATIN CAPITAL LETTER A .. U+0046 LATIN CAPITAL LETTER F, and U+0061 LATIN SMALL LETTER A .. U+0066 LATIN SMALL LETTER F with the character U+0030 DIGIT ZERO (0).
While input 's length is zero or not a multiple of three, append a U+0030 DIGIT ZERO (0) character to input .
Split input into three strings of equal length, to obtain three components. Let length be the length of those components (one third the length of input ).
If length is greater than 8, then remove the leading ( length - 8) characters in each component, and let length be 8.
While length is greater than two and the first character in each component is a U+0030 DIGIT ZERO (0) character, remove that character and reduce length by one.
If length is still greater than two, truncate each component, leaving only the first two characters in each.
Let result be a simple color .
Interpret the first component as a hexadecimal number; let the red component of result be the resulting number.
Interpret the second component as a hexadecimal number; let the green component of result be the resulting number.
Interpret the third component as a hexadecimal number; let the blue component of result be the resulting number.
Return result .
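The rules for parsing a legacy color value can be sketched in Python as below. This is a non-normative illustration: the names ascii_lower and parse_legacy_color are assumptions, an error is modelled as None, and NAMED_COLORS is a hypothetical two-entry stand-in for the keyword tables of the CSS3 Color specification, which a real implementation would carry in full.

```python
import string
from typing import Optional, Tuple

HEX = set(string.hexdigits)
_TO_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)
# Hypothetical stand-in for the full keyword tables.
NAMED_COLORS = {"red": (255, 0, 0), "white": (255, 255, 255)}

def ascii_lower(s: str) -> str:
    """Lowercase only the letters A to Z, for ASCII case-insensitive matching."""
    return s.translate(_TO_LOWER)

def parse_legacy_color(value: str) -> Optional[Tuple[int, ...]]:
    if value == "" or ascii_lower(value) == "transparent":
        return None
    keyword = NAMED_COLORS.get(ascii_lower(value))
    if keyword is not None:
        return keyword
    if len(value) == 4 and value[0] == "#" and all(c in HEX for c in value[1:]):
        # Three-digit form: each digit is scaled by 17 (0xF becomes 0xFF).
        return tuple(int(c, 16) * 17 for c in value[1:])
    # Characters outside the basic multilingual plane become "00".
    value = "".join("00" if ord(c) > 0xFFFF else c for c in value)
    value = value[:128]
    if value.startswith("#"):
        value = value[1:]
    value = "".join(c if c in HEX else "0" for c in value)
    while len(value) == 0 or len(value) % 3 != 0:
        value += "0"
    length = len(value) // 3
    parts = [value[i * length:(i + 1) * length] for i in range(3)]
    if length > 8:
        parts = [p[length - 8:] for p in parts]
        length = 8
    # Drop leading zeros shared by all three components.
    while length > 2 and all(p[0] == "0" for p in parts):
        parts = [p[1:] for p in parts]
        length -= 1
    if length > 2:
        parts = [p[:2] for p in parts]
    return tuple(int(p, 16) for p in parts)
```

With this sketch the string "chucknorris" becomes "c00c00000000" after replacement and padding, splits into "c00c", "0000" and "0000", and is truncated to the components c0, 00 and 00.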
The 2D graphics context has a separate color syntax that also handles opacity.
A set of space-separated tokens is a set of zero or more words separated by one or more space characters , where words consist of any string of one or more characters, none of which are space characters .
A string containing a set of space-separated tokens may have leading or trailing space characters .
An unordered set of unique space-separated tokens is a set of space-separated tokens where none of the words are duplicated.
An ordered set of unique space-separated tokens is a set of space-separated tokens where none of the words are duplicated but where the order of the tokens is meaningful.
Sets of space-separated tokens sometimes have a defined set of allowed values. When a set of allowed values is defined, the tokens must all be from that list of allowed values; other values are non-conforming. If no such set of allowed values is provided, then all values are conforming.
When a user agent has to split a string on spaces , it must use the following algorithm:
Let input be the string being parsed.
Let position be a pointer into input , initially pointing at the start of the string.
Let tokens be a list of tokens, initially empty.
Skip whitespace .
While position is not past the end of input :
Collect a sequence of characters that are not space characters .
Add the string collected in the previous step to tokens .
Skip whitespace .
Return tokens .
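A non-normative Python sketch of splitting a string on spaces. It assumes that whitespace is skipped before the first token and after each collected token, and that the space characters are U+0020, U+0009, U+000A, U+000C and U+000D as defined elsewhere in this specification; the name split_on_spaces is illustrative.

```python
from typing import List

SPACE_CHARACTERS = " \t\n\f\r"

def split_on_spaces(s: str) -> List[str]:
    """Return the list of tokens separated by runs of space characters."""
    tokens = []
    pos = 0
    # Skip leading whitespace.
    while pos < len(s) and s[pos] in SPACE_CHARACTERS:
        pos += 1
    while pos < len(s):
        start = pos
        # Collect a run of characters that are not space characters.
        while pos < len(s) and s[pos] not in SPACE_CHARACTERS:
            pos += 1
        tokens.append(s[start:pos])
        # Skip the whitespace that follows the token.
        while pos < len(s) and s[pos] in SPACE_CHARACTERS:
            pos += 1
    return tokens
```

Python's built-in str.split() with no arguments is deliberately not used here, since it also splits on characters such as U+00A0 NO-BREAK SPACE that are not space characters in the sense used above.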
When a user agent has to remove a token from a string , it must use the following algorithm:
Let input be the string being modified.
Let token be the token being removed. It will not contain any space characters .
Let output be the output string, initially empty.
Let position be a pointer into input , initially pointing at the start of the string.
If position is beyond the end of input , set the string being modified to output , and abort these steps.
If the character at position is a space character :
Append the character at position to the end of output .
Increment position so it points at the next character in input .
Return to step 5 in the overall set of steps.
Otherwise, the character at position is the first character of a token. Collect a sequence of characters that are not space characters , and let that be s .
If s is exactly equal to token , then:
Skip whitespace (in input ).
Remove any space characters currently at the end of output .
If position is not past the end of input , and output is not the empty string, append a single U+0020 SPACE character at the end of output .
Otherwise, append s to the end of output .
Return to step 5 in the overall set of steps.
This causes any occurrences of the token to be removed from the string, and any spaces that were surrounding the token to be collapsed to a single space, except at the start and end of the string, where such spaces are removed.
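The token-removal steps can be sketched in Python as follows. This non-normative example returns the new string rather than modifying one in place; the name remove_token is an assumption, and SPACE_CHARACTERS lists the space characters as defined elsewhere in this specification.

```python
SPACE_CHARACTERS = " \t\n\f\r"

def remove_token(s: str, token: str) -> str:
    """Return s with every occurrence of token removed and spaces collapsed."""
    output = ""
    pos = 0
    while pos < len(s):
        if s[pos] in SPACE_CHARACTERS:
            # Whitespace between tokens is copied through unchanged.
            output += s[pos]
            pos += 1
            continue
        start = pos
        while pos < len(s) and s[pos] not in SPACE_CHARACTERS:
            pos += 1
        word = s[start:pos]
        if word == token:
            # Skip whitespace after the removed token.
            while pos < len(s) and s[pos] in SPACE_CHARACTERS:
                pos += 1
            # Drop whitespace before it, then rejoin with a single space.
            output = output.rstrip(SPACE_CHARACTERS)
            if pos < len(s) and output != "":
                output += " "
        else:
            output += word
    return output
```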
A set of comma-separated tokens is a set of zero or more tokens each separated from the next by a single U+002C COMMA character (,), where tokens consist of any string of zero or more characters, neither beginning nor ending with space characters, nor containing any U+002C COMMA characters (,), and optionally surrounded by space characters.
For instance, the string "a ,b,,d d" consists of four tokens: "a", "b", the empty string, and "d d". Leading and trailing whitespace around each token doesn't count as part of the token, and the empty string can be a token.
Sets of comma-separated tokens sometimes have further restrictions on what constitutes a valid token. When such restrictions are defined, the tokens must all fit within those restrictions; other values are non-conforming. If no such restrictions are specified, then all values are conforming.
When a user agent has to split a string on commas , it must use the following algorithm:
Let input be the string being parsed.
Let position be a pointer into input , initially pointing at the start of the string.
Let tokens be a list of tokens, initially empty.
Token : If position is past the end of input , jump to the last step.
Collect a sequence of characters that are not U+002C COMMA characters (,). Let s be the resulting sequence (which might be the empty string).
Remove any leading or trailing sequence of space characters from s .
Add s to tokens .
If position is not past the end of input, then the character at position is a U+002C COMMA character (,); advance position past that character.
Jump back to the step labeled token .
Return tokens .
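A non-normative Python sketch of splitting a string on commas. The name split_on_commas is illustrative, and SPACE_CHARACTERS lists the space characters as defined elsewhere in this specification.

```python
from typing import List

SPACE_CHARACTERS = " \t\n\f\r"

def split_on_commas(s: str) -> List[str]:
    """Return the comma-separated tokens of s, each stripped of space characters."""
    tokens = []
    pos = 0
    while pos < len(s):
        # Collect everything up to the next comma (possibly nothing).
        end = s.find(",", pos)
        if end == -1:
            end = len(s)
        tokens.append(s[pos:end].strip(SPACE_CHARACTERS))
        pos = end
        if pos < len(s):
            pos += 1  # advance past the comma
    return tokens
```

As in the steps above, a trailing comma does not produce a final empty token, because the position is already past the end of the input when the loop is re-entered.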
A valid reversed DNS identifier is a string that consists of a series of IDNA labels in reverse order (i.e. starting with the top-level domain), the prefix of which, when reversed and converted to ASCII, corresponds to a registered domain.
For instance, the string "com.example.xn--74h" is a valid reversed DNS identifier because the string "example.com" is a registered domain.
To check if a string is a valid reversed DNS identifier , conformance checkers must run the following algorithm:
Apply the IDNA ToASCII algorithm to the string, with both the AllowUnassigned and UseSTD3ASCIIRules flags set, but between steps 2 and 3 of the general ToASCII/ToUnicode algorithm (i.e. after splitting the domain name into individual labels), reverse the order of the labels.
If ToASCII fails to convert one of the components of the string, e.g. because it is too long or because it contains invalid characters, then the string is not valid; abort these steps. [RFC3490]
Check that the end of the resulting string matches a suffix in the Public Suffix List, and that there is at least one domain label before the matching substring. If it does not, or if there is not, then the string is not valid; abort these steps. [PSL]
Check that the domain name up to the label before the suffix that was matched in the previous step is a registered domain name.
A valid hash-name reference to an element of type type is a string consisting of a U+0023 NUMBER SIGN (#) character followed by a string which exactly matches the value of the name attribute of an element in the document with type type.
The rules for parsing a hash-name reference to an element of type type are as follows:
If the string being parsed does not contain a U+0023 NUMBER SIGN character, or if the first such character in the string is the last character in the string, then return null and abort these steps.
Let s be the string from the character immediately after the first U+0023 NUMBER SIGN character in the string being parsed up to the end of that string.
Return the first element of type type that has an id attribute whose value is a case-sensitive match for s or a name attribute whose value is a compatibility caseless match for s.
This section is unstable, violates existing RFCs and may be updated in future versions of this specification.
There is a forthcoming IRI specification that would challenge the normative nature of this section. The terminology used in this section is a willful violation of RFC 3986 and is expected to be challenged by members of the IETF.
A URL is a string used to identify a resource.
A URL is a valid URL if it is a valid Web address as defined by the Web addresses specification. [WEBADDRESSES]
A URL is an absolute URL if it is an absolute Web address as defined by the Web addresses specification. [WEBADDRESSES]
To parse a URL url into its component parts, the user agent must use the parse a Web address algorithm defined by the Web addresses specification. [WEBADDRESSES]
Parsing a URL results in the following components, again as defined by the Web addresses specification:
To resolve a URL to an absolute URL relative to either another absolute URL or an element, the user agent must use the resolve a Web address algorithm defined by the Web addresses specification. [WEBADDRESSES]
The document base URL of a Document object is the document base Web address as defined by the Web addresses specification. [WEBADDRESSES]
The term "URL" in this specification is used in a manner distinct from the precise technical meaning it is given in RFC 3986. Readers familiar with that RFC will find it easier to read this specification if they pretend the term "URL" as used herein is really called something else altogether. This is a willful violation of RFC 3986. [RFC3986]
When an xml:base attribute changes, the attribute's element, and all descendant elements, are affected by a base URL change.
When a document's document base URL changes, all elements in that document are affected by a base URL change .
When an element is moved from one document to another, if the two documents have different base URLs , then that element and all its descendants are affected by a base URL change .
When an element is affected by a base URL change , it must act as described in the following list:
If the absolute URL identified by the hyperlink is being shown to the user, or if any data derived from that URL is affecting the display, then the href attribute should be re-resolved relative to the element and the UI updated appropriately. For example, the CSS :link / :visited pseudo-classes might have been affected.
If the hyperlink has a ping attribute and its absolute URL(s) are being shown to the user, then the ping attribute's tokens should be re-resolved relative to the element and the UI updated appropriately.
q, blockquote, section, article, ins, or del element with a cite attribute
If the absolute URL identified by the cite attribute is being shown to the user, or if any data derived from that URL is affecting the display, then the URL should be re-resolved relative to the element and the UI updated appropriately.
The element is not directly affected.
Changing the base URL doesn't affect the image displayed by img elements, although subsequent accesses of the src DOM attribute from script will return a new absolute URL that might no longer correspond to the image being shown.
An interface that has a complement of URL decomposition attributes will have seven attributes with the following definitions:
attribute DOMString protocol;
attribute DOMString host;
attribute DOMString hostname;
attribute DOMString port;
attribute DOMString pathname;
attribute DOMString search;
attribute DOMString hash;
protocol [ = value ]
Returns the current scheme of the underlying URL.
Can be set, to change the underlying URL's scheme.
host [ = value ]
Returns the current host and port (if it's not the default port) in the underlying URL.
Can be set, to change the underlying URL's host and port.
The host and the port are separated by a colon. The port part, if omitted, will be assumed to be the current scheme's default port.
hostname [ = value ]
Returns the current host in the underlying URL.
Can be set, to change the underlying URL's host.
port [ = value ]
Returns the current port in the underlying URL.
Can be set, to change the underlying URL's port.
pathname [ = value ]
Returns the current path in the underlying URL.
Can be set, to change the underlying URL's path.
search [ = value ]
Returns the current query component in the underlying URL.
Can be set, to change the underlying URL's query component.
hash [ = value ]
Returns the current fragment identifier in the underlying URL.
Can be set, to change the underlying URL's fragment identifier.
The attributes defined to be URL decomposition attributes must act as described for the attributes with the same corresponding names in this section.
In addition, an interface with a complement of URL decomposition attributes will define an input , which is a URL that the attributes act on, and a common setter action , which is a set of steps invoked when any of the attributes' setters are invoked.
The seven URL decomposition attributes have similar requirements.
On getting, if the input is an absolute URL that fulfills the condition given in the "getter condition" column corresponding to the attribute in the table below, the user agent must return the part of the input URL given in the "component" column, with any prefixes specified in the "prefix" column appropriately added to the start of the string and any suffixes specified in the "suffix" column appropriately added to the end of the string. Otherwise, the attribute must return the empty string.
On setting, the new value must first be mutated as described by the "setter preprocessor" column, then mutated by %-escaping any characters in the new value that are not valid in the relevant component as given by the "component" column. Then, if the input is an absolute URL and the resulting new value fulfills the condition given in the "setter condition" column, the user agent must make a new string output by replacing the component of the URL given by the "component" column in the input URL with the new value; otherwise, the user agent must let output be equal to the input . Finally, the user agent must invoke the common setter action with the value of output .
When replacing a component in the URL, if the component is part of an optional group in the URL syntax consisting of a character followed by the component, the component (including its prefix character) must be included even if the new value is the empty string.
The previous paragraph applies in particular to the ":" before a <port> component, the "?" before a <query> component, and the "#" before a <fragment> component.
For the purposes of the above definitions, URLs must be parsed using the URL parsing rules defined in this specification.
| Attribute | Component | Getter Condition | Prefix | Suffix | Setter Preprocessor | Setter Condition |
|---|---|---|---|---|---|---|
| protocol | <scheme> | — | — | U+003A COLON (":") | Remove all trailing U+003A COLON (":") characters | The new value is not the empty string |
| host | <hostport> | input is hierarchical and uses a server-based naming authority | — | — | — | The new value is not the empty string and input is hierarchical and uses a server-based naming authority |
| hostname | <host> | input is hierarchical and uses a server-based naming authority | — | — | Remove all leading U+002F SOLIDUS ("/") characters | The new value is not the empty string and input is hierarchical and uses a server-based naming authority |
| port | <port> | input is hierarchical, uses a server-based naming authority, and contained a <port> component (possibly an empty one) | — | — | Remove any characters in the new value that are not in the range U+0030 DIGIT ZERO .. U+0039 DIGIT NINE. If the resulting string is empty, set it to a single U+0030 DIGIT ZERO character ('0'). | input is hierarchical and uses a server-based naming authority |
| pathname | <path> | input is hierarchical | — | — | If it has no leading U+002F SOLIDUS ("/") character, prepend a U+002F SOLIDUS ("/") character to the new value | — |
| search | <query> | input is hierarchical, and contained a <query> component (possibly an empty one) | U+003F QUESTION MARK ("?") | — | Remove one leading U+003F QUESTION MARK ("?") character, if any | — |
| hash | <fragment> | input contained a <fragment> component (possibly an empty one) | U+0023 NUMBER SIGN ("#") | — | Remove one leading U+0023 NUMBER SIGN ("#") character, if any | — |
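The "setter preprocessor" column of the table above can be sketched as a single function. This is only an illustration of that one column: the function name `preprocess` is an assumption, and the later %-escaping and "setter condition" checks are not modelled.

```python
def preprocess(attribute, value):
    """Apply the "setter preprocessor" column for one decomposition attribute."""
    if attribute == "protocol":
        return value.rstrip(":")          # remove all trailing colons
    if attribute == "hostname":
        return value.lstrip("/")          # remove all leading slashes
    if attribute == "port":
        digits = "".join(ch for ch in value if "0" <= ch <= "9")
        return digits or "0"              # an empty result becomes "0"
    if attribute == "pathname":
        return value if value.startswith("/") else "/" + value
    if attribute == "search":
        return value[1:] if value.startswith("?") else value  # one leading "?"
    if attribute == "hash":
        return value[1:] if value.startswith("#") else value  # one leading "#"
    return value                          # "host" has no preprocessor
```

For example, `preprocess("port", "80a")` yields `"80"`, and `preprocess("search", "??q")` removes only the first question mark.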
The table below demonstrates how the getter condition for search results in different results depending on the exact original syntax of the URL:
| Input URL | search value | Explanation |
|---|---|---|
| http://example.com/ | empty string | No <query> component in input URL. |
| http://example.com/? | ? | There is a <query> component, but it is empty. The question mark in the resulting value is the prefix. |
| http://example.com/?test | ?test | The <query> component has the value "test". |
| http://example.com/?test# | ?test | The (empty) <fragment> component is not part of the <query> component. |
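The distinction between a missing and an empty <query> component can be sketched with plain string operations. The sketch assumes its argument is already an absolute, hierarchical URL; the name `search_getter` is illustrative only and the code does not implement the full Web address parsing algorithm.

```python
def search_getter(url):
    """Return the `search` value for an absolute hierarchical URL string."""
    # The fragment starts at the first "#" and is never part of the query.
    without_fragment = url.split("#", 1)[0]
    if "?" not in without_fragment:
        return ""  # no <query> component at all
    query = without_fragment.split("?", 1)[1]
    return "?" + query  # the "?" prefix is added even for an empty query
```

Hand-written splitting is used deliberately: a parser that collapses "no query" and "empty query" into the same empty string cannot reproduce the second row of the table.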
When a user agent is to fetch a resource, the following steps must be run:
If the resource is identified by the URL about:blank, then return the empty string and abort these steps.
Perform the remaining steps asynchronously.
If the resource is identified by an absolute URL, and the resource is to be obtained using an idempotent action (such as an HTTP GET or equivalent), and it is already being downloaded for other reasons (e.g. another invocation of this algorithm), and the user agent is configured such that it is to reuse the data from the existing download instead of initiating a new one, then use the results of the existing download instead of starting a new one.
Otherwise, at a time convenient to the user and the user agent, download (or otherwise obtain) the resource, applying the semantics of the relevant specifications (e.g. performing an HTTP GET or POST operation, or reading the file from disk, following redirects, dereferencing javascript: URLs, etc).
For purposes of generating the address of the resource from which Request-URIs are obtained as required by HTTP for the Referer (sic) header, the user agent must use the document's current address of the appropriate Document as given by this list. [HTTP]
Document
.
If there are cookies to be set, then the user agent must run the following substeps:
Wait until ownership of the storage mutex can be taken by this instance of the fetching algorithm.
Take ownership of the storage mutex .
Update the cookies. [COOKIES]
Release the storage mutex so that it is once again free.
When the resource is available, or if there is an error of some description, queue a task that uses the resource as appropriate. If the resource can be processed incrementally, as, for instance, with a progressively interlaced JPEG or an HTML file, additional tasks may be queued to process the data as it is downloaded. The task source for these tasks is the networking task source .
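The cookie substeps in the algorithm above (wait for the storage mutex, take it, update the cookies, release it) map naturally onto a lock. The following is a rough sketch only: `storage_mutex` and `cookie_jar` are hypothetical stand-ins, and a real user agent's storage mutex spans more than cookie updates.

```python
import threading

storage_mutex = threading.Lock()  # stands in for the storage mutex
cookie_jar = {}                   # hypothetical in-memory cookie store


def update_cookies(new_cookies):
    """Wait for the mutex, take it, update the cookies, then release it."""
    with storage_mutex:  # blocks until ownership can be taken
        cookie_jar.update(new_cookies)
    # Leaving the `with` block releases the mutex so it is free again.
```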
The application cache processing model introduces some changes to the networking model to handle the returning of cached resources.
The navigation processing model handles redirects itself, overriding the redirection handling that would be done by the fetching algorithm.
Whether the type sniffing rules apply to the fetched resource depends on the algorithm that invokes the rules — they are not always applicable.
This section is pending an update based on progress that is being made on [ MIMESNIFF ].
User agents can implement a variety of transfer protocols, but this specification mostly defines behavior in terms of HTTP. [HTTP]
The HTTP GET method is equivalent to the default retrieval action of the protocol. For example, RETR in FTP. Such actions are idempotent and safe, in HTTP terms.
The HTTP response codes are equivalent to statuses in other protocols that have the same basic meanings. For example, a "file not found" error is equivalent to a 404 code, a server error is equivalent to a 5xx code, and so on.
The HTTP headers are equivalent to fields in other protocols that have the same basic meaning. For example, the HTTP authentication headers are equivalent to the authentication aspects of the FTP protocol.
Anything in this specification that refers to HTTP also applies to HTTP-over-TLS, as represented by URLs representing the https scheme.
User agents should report certificate errors to the user and must either refuse to download resources sent with erroneous certificates or must act as if such resources were in fact served with no encryption.
User agents should warn the user that there is a potential problem whenever the user visits a page that the user has previously visited, if the page uses less secure encryption on the second visit.
Not doing so can result in users not noticing man-in-the-middle attacks.
If a user connects to a server with a self-signed certificate, the user agent could allow the connection but just act as if there had been no encryption. If the user agent instead allowed the user to override the problem and then displayed the page as if it was fully and safely encrypted, the user could be easily tricked into accepting man-in-the-middle connections.
If a user connects to a server with full encryption, but the page then refers to an external resource that has an expired certificate, then the user agent will act as if the resource was unavailable, possibly also reporting the problem to the user. If the user agent instead allowed the resource to be used, then an attacker could just look for "secure" sites that used resources from a different host and only apply man-in-the-middle attacks to that host, for example taking over scripts in the page.
If a user bookmarks a site that uses a CA-signed certificate, and then later revisits that site directly but the site has started using a self-signed certificate, the user agent could warn the user that a man-in-the-middle attack is likely underway, instead of simply acting as if the page was not encrypted.
The Content-Type metadata of a resource must be obtained and interpreted in a manner consistent with the requirements of the Content-Type Processing Model specification. [MIMESNIFF]
The algorithm for extracting an encoding from a Content-Type , given a string s , is given in the Content-Type Processing Model specification. It either returns an encoding or nothing. [MIMESNIFF]
The sniffed type of a resource must be found in a manner consistent with the requirements given in the Content-Type Processing Model specification for finding that sniffed type . [MIMESNIFF]
The rules for sniffing images specifically are also defined in the Content-Type Processing Model specification. [MIMESNIFF]
It is imperative that the rules in the Content-Type Processing Model specification be followed exactly. When a user agent uses different heuristics for content type detection than the server expects, security problems can occur. For more details, see the Content-Type Processing Model specification. [MIMESNIFF]
User agents must at a minimum support the UTF-8 and Windows-1252 encodings, but may support more.
It is not unusual for Web browsers to support dozens if not upwards of a hundred distinct character encodings.
User agents must support the preferred MIME name of every character encoding they support that has a preferred MIME name, and should support all the IANA-registered aliases. [IANACHARSET]
When comparing a string specifying a character encoding with the name or alias of a character encoding to determine if they are equal, user agents must use the Charset Alias Matching rules defined in Unicode Technical Standard #22. [UTS22]
For instance, "GB_2312-80" and "g.b.2312(80)" are considered equivalent names.
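The Charset Alias Matching rules of UTS #22 reduce each label to a comparison key: drop everything except ASCII letters and digits, lowercase the letters, and delete each zero that is not preceded by a digit. A minimal sketch follows; the function names are illustrative.

```python
import string

_ALNUM = set(string.ascii_letters + string.digits)


def charset_match_key(label):
    """Reduce an encoding label to its UTS #22 alias-matching key."""
    out = []
    for ch in label:
        if ch not in _ALNUM:
            continue          # step 1: drop everything but a-z, A-Z, 0-9
        ch = ch.lower()       # step 2: map A-Z to a-z
        if ch == "0" and not (out and out[-1].isdigit()):
            continue          # step 3: drop a 0 not preceded by a digit
        out.append(ch)
    return "".join(out)


def same_charset(a, b):
    """True if two labels name the same encoding under alias matching."""
    return charset_match_key(a) == charset_match_key(b)
```

Both "GB_2312-80" and "g.b.2312(80)" reduce to the same key, which is why they are considered equivalent.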
In addition, user agents must support the aliases given in the following table, so that labels from the first column are treated as equivalent to the labels given in the corresponding cell from the second column on the same row.
| Alias | Corresponding encoding | References |
|---|---|---|
| x-sjis | windows-31J | [SHIFTJIS] [WIN31J] |
| windows-932 | windows-31J | [WIN31J] |
| x-x-big5 | Big5 | [BIG5] |
When a user agent would otherwise use an encoding given in the first column of the following table to either convert content to Unicode characters or convert Unicode characters to bytes, it must instead use the encoding given in the cell in the second column of the same row. When a byte or sequence of bytes is treated differently due to this encoding aliasing, it is said to have been misinterpreted for compatibility .
| Input encoding | Replacement encoding | References |
|---|---|---|
| EUC-KR | windows-949 | [EUCKR] [WIN949] |
| GB2312 | GBK | [RFC1345] [GBK] |
| GB_2312-80 | GBK | [RFC1345] [GBK] |
| ISO-8859-1 | windows-1252 | [RFC1345] [WIN1252] |
| ISO-8859-9 | windows-1254 | [RFC1345] [WIN1254] |
| ISO-8859-11 | windows-874 | [ISO885911] [WIN874] |
| KS_C_5601-1987 | windows-949 | [RFC1345] [WIN949] |
| Shift_JIS | windows-31J | [SHIFTJIS] [WIN31J] |
| TIS-620 | windows-874 | [TIS620] [WIN874] |
| US-ASCII | windows-1252 | [RFC1345] [WIN1252] |
The requirement to treat certain encodings as other encodings according to the table above is a willful violation of the W3C Character Model specification, motivated by a desire for compatibility with legacy content. [CHARMOD]
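The replacement table above amounts to a lookup applied before any conversion takes place. The sketch below uses a plain case-insensitive dictionary lookup for brevity, which is an assumption of the example; a user agent would compare labels with the alias matching rules described earlier. The name `effective_encoding` is illustrative.

```python
# Input encoding (lowercased) -> replacement encoding, as listed in the table.
REPLACEMENTS = {
    "euc-kr": "windows-949",
    "gb2312": "GBK",
    "gb_2312-80": "GBK",
    "iso-8859-1": "windows-1252",
    "iso-8859-9": "windows-1254",
    "iso-8859-11": "windows-874",
    "ks_c_5601-1987": "windows-949",
    "shift_jis": "windows-31J",
    "tis-620": "windows-874",
    "us-ascii": "windows-1252",
}


def effective_encoding(label):
    """Return the encoding actually used for content labelled `label`."""
    return REPLACEMENTS.get(label.lower(), label)
```

The practical effect shows up in bytes such as 0x80: content labelled ISO-8859-1 is decoded as windows-1252, where that byte is U+20AC EURO SIGN rather than a C1 control character.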
When a user agent is to use the UTF-16 encoding but no BOM has been found, user agents must default to UTF-16LE.
The requirement to default UTF-16 to LE rather than BE is a willful violation of RFC 2781, motivated by a desire for compatibility with legacy content. [CHARMOD]
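The BOM-less default can be sketched as follows. The function name `decode_utf16` is an assumption of the example, and the BOM handling shown is only the simple two-byte check.

```python
def decode_utf16(data):
    """Decode UTF-16 bytes, defaulting to little-endian when no BOM is found."""
    if data.startswith(b"\xff\xfe"):
        return data[2:].decode("utf-16-le")  # little-endian BOM
    if data.startswith(b"\xfe\xff"):
        return data[2:].decode("utf-16-be")  # big-endian BOM
    return data.decode("utf-16-le")          # no BOM: default to UTF-16LE
```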
User agents must not support the CESU-8, UTF-7, BOCU-1 and SCSU encodings. [CESU8] [UTF7] [BOCU1] [SCSU]
Support for encodings based on EBCDIC is not recommended. These encodings are rarely used for publicly-facing Web content.
Support for UTF-32 is not recommended. This encoding is rarely used, and frequently implemented incorrectly.
This specification does not make any attempt to support EBCDIC-based encodings and UTF-32 in its algorithms; support and use of these encodings can thus lead to unexpected behavior in implementations of this specification.
This section is controversial and does not enjoy broad consensus.
This section seems to be providing normative language requiring UA manufacturers to not support certain character encodings. The section also provides implementation advice, as normative language, which would require a conforming HTML5 UA to violate currently published specifications.
While these "willful violations" of current W3C and IETF specifications are performed to ensure backward compatibility for legacy content, the act of violating known specifications is controversial and must be discussed with both the W3C and IETF. There may be a possibility to harmonize this specification with the specifications that are violated.
Some DOM attributes are defined to reflect a particular content attribute . This means that on getting, the DOM attribute returns the current value of the content attribute, and on setting, the DOM attribute changes the value of the content attribute to the given value.
A list of reflecting DOM attributes and their corresponding content attributes is given in the index.
In general, on getting, if the content attribute is not present, the DOM attribute must act as if the content attribute's value is the empty string; and on setting, if the content attribute is not present, it must first be added.
If a reflecting DOM attribute is a DOMString attribute whose content attribute is defined to contain a URL, then on getting, the DOM attribute must resolve the value of the content attribute relative to the element and return the resulting absolute URL if that was successful, or the empty string otherwise; and on setting, must set the content attribute to the specified literal value. If the content attribute is absent, the DOM attribute must return the default value, if the content attribute has one, or else the empty string.
If a reflecting DOM attribute is a DOMString attribute whose content attribute is defined to contain one or more URLs, then on getting, the DOM attribute must split the content attribute on spaces and return the concatenation of resolving each token URL to an absolute URL relative to the element, with a single U+0020 SPACE character between each URL, ignoring any tokens that did not resolve successfully. If the content attribute is absent, the DOM attribute must return the default value, if the content attribute has one, or else the empty string. On setting, the DOM attribute must set the content attribute to the specified literal value.
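The getter for an attribute holding one or more URLs can be sketched as below. Two assumptions are made and should be read as such: `urllib.parse.urljoin` stands in for the "resolve a Web address" algorithm, and a plain `base_url` string stands in for resolving "relative to the element". The name `reflect_url_list` is illustrative.

```python
import re
from urllib.parse import urljoin


def reflect_url_list(content_value, base_url, default=""):
    """Getter sketch: resolve each space-separated token against base_url."""
    if content_value is None:
        return default  # content attribute absent
    # Split on runs of space characters, discarding empty tokens.
    tokens = [t for t in re.split(r"[ \t\n\f\r]+", content_value) if t]
    resolved = []
    for token in tokens:
        try:
            resolved.append(urljoin(base_url, token))
        except ValueError:
            continue  # ignore tokens that did not resolve successfully
    return " ".join(resolved)
```

The setter needs no counterpart here, since it stores the specified literal value unchanged.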
If a reflecting DOM attribute is a DOMString whose content attribute is an enumerated attribute, and the DOM attribute is limited to only known values, then, on getting, the DOM attribute must return the conforming value associated with the state the attribute is in (in its canonical case), or the empty string if the attribute is in a state that has no associated keyword value; and on setting, if the new value is an ASCII case-insensitive match for one of the keywords given for that attribute, then the content attribute must be set to the conforming value associated with the state that the attribute would be in if set to the given new value, otherwise, if the new value is the empty string, then the content attribute must be removed, otherwise, the setter must raise a SYNTAX_ERR exception.
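A sketch of this behaviour for a hypothetical two-keyword attribute follows. The keyword and state tables, the dict-based attribute store, and the `SyntaxErr` class are all assumptions of the example; in particular, the example attribute has no invalid-value or missing-value default state, so unknown or absent values read back as the empty string.

```python
import string

_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


class SyntaxErr(Exception):
    """Stand-in for the DOM SYNTAX_ERR exception."""


# Hypothetical enumerated attribute: keyword -> state, state -> conforming keyword.
KEYWORD_TO_STATE = {"ltr": "left-to-right", "rtl": "right-to-left"}
STATE_TO_KEYWORD = {"left-to-right": "ltr", "right-to-left": "rtl"}


def get_enumerated(attrs, name):
    """Return the canonical keyword for the current state, or ""."""
    raw = attrs.get(name)
    if raw is None:
        return ""
    state = KEYWORD_TO_STATE.get(raw.translate(_ASCII_LOWER))
    return STATE_TO_KEYWORD.get(state, "")


def set_enumerated(attrs, name, value):
    """Store the conforming keyword, remove on "", otherwise raise."""
    state = KEYWORD_TO_STATE.get(value.translate(_ASCII_LOWER))
    if state is not None:
        attrs[name] = STATE_TO_KEYWORD[state]
    elif value == "":
        attrs.pop(name, None)
    else:
        raise SyntaxErr(value)
```

An explicit ASCII-only lowercase table is used rather than `str.lower()`, so that non-ASCII characters with ASCII lowercase forms do not match a keyword by accident.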
If a reflecting DOM attribute is a DOMString but doesn't fall into any of the above categories, then the getting and setting must be done in a transparent, case-preserving manner.
If a reflecting DOM attribute is a boolean attribute, then on getting the DOM attribute must return true if the attribute is set, and false if it is absent. On setting, the content attribute must be removed if the DOM attribute is set to false, and must be set to have the same value as its name if the DOM attribute is set to true. (This corresponds to the rules for boolean content attributes .)
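Boolean reflection is small enough to show in full. In this sketch a dict stands in for an element's content attributes, and the function names are illustrative.

```python
def get_boolean(attrs, name):
    """True if the content attribute is set, False if it is absent."""
    return name in attrs


def set_boolean(attrs, name, value):
    """Set the attribute to its own name when true; remove it when false."""
    if value:
        attrs[name] = name
    else:
        attrs.pop(name, None)
```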
If a reflecting DOM attribute is a signed integer type (long) then, on getting, the content attribute must be parsed according to the rules for parsing signed integers, and if that is successful, and the value is in the range of the DOM attribute's type, the resulting value must be returned. If, on the other hand, it fails or returns an out of range value, or if the attribute is absent, then the default value must be returned instead, or 0 if there is no default value. On setting, the given value must be converted to the shortest possible string representing the number as a valid integer and then that string must be used as the new content attribute value.
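The signed integer case can be sketched as below. The regular expression is a simplified stand-in for the rules for parsing signed integers (leading space characters, an optional minus sign, then digits, with trailing characters ignored), which is an assumption of this example, as is the 32-bit range used for long.

```python
import re

# Simplified stand-in for the rules for parsing signed integers.
_SIGNED_INTEGER = re.compile(r"[ \t\n\f\r]*(-?[0-9]+)")
LONG_MIN, LONG_MAX = -2**31, 2**31 - 1


def get_long(attrs, name, default=0):
    """Getter sketch: parse, range-check, and fall back to the default."""
    raw = attrs.get(name)
    if raw is None:
        return default  # attribute absent
    match = _SIGNED_INTEGER.match(raw)
    if match is None:
        return default  # parsing failed
    value = int(match.group(1))
    if not LONG_MIN <= value <= LONG_MAX:
        return default  # out of range for the DOM attribute's type
    return value


def set_long(attrs, name, value):
    """Setter sketch: store the shortest decimal representation."""
    attrs[name] = str(value)
```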
If a reflecting DOM attribute is an unsigned integer type (unsigned long) then, on getting, the content attribute must be parsed according to the rules for parsing non-negative integers, and if that is successful, and the value is in the range of the DOM attribute's type, the resulting value must be returned. If, on the other hand, it fails or returns an out of range value, or if the attribute is absent, the default value must be returned instead, or 0 if there is no default value. On setting, the given value must be converted to the shortest possible string representing the number as a valid non-negative integer and then that string must be used as the new content attribute value.
If a reflecting DOM attribute is an unsigned integer type (unsigned long) that is limited to only positive non-zero numbers, then the behavior is similar to the previous case, but zero is not allowed. On getting, the content attribute must first be parsed according to the rules for parsing non-negative integers, and if that is successful, and the value is in the range of the DOM attribute's type, the resulting value must be returned. If, on the other hand, it fails or returns an out of range value, or if the attribute is absent, the default value must be returned instead, or 1 if there is no default value. On setting, if the value is zero, the user agent must fire an INDEX_SIZE_ERR exception. Otherwise, the given value must be converted to the shortest possible string representing the number as a valid non-negative integer and then that string must be used as the new content attribute value.
If a reflecting DOM attribute is a floating point number type (float) and it doesn't fall into one of the earlier categories, then, on getting, the content attribute must be parsed according to the rules for parsing floating point number values, and if that is successful, and the value is in the range of the DOM attribute's type, the resulting value must be returned. If, on the other hand, it fails or returns an out of range value, or if the attribute is absent, the default value must be returned instead, or 0.0 if there is no default value. On setting, the given value must be converted to the best representation of the floating point number and then that string must be used as the new content attribute value.
The values Infinity and Not-a-Number (NaN) throw an exception on setting, as defined earlier.
If a reflecting DOM attribute is of the type DOMTokenList or DOMSettableTokenList, then on getting it must return a DOMTokenList or DOMSettableTokenList object (as appropriate) whose underlying string is the element's corresponding content attribute. When the object mutates its underlying string, the content attribute must itself be immediately mutated. When the attribute is absent, then the string represented by the object is the empty string; when the object mutates this empty string, the user agent must first add the corresponding content attribute, and then mutate that attribute instead. The same DOMTokenList object must be returned every time for each attribute.
If a reflecting DOM attribute has the type HTMLElement, or an interface that descends from HTMLElement, then, on getting, it must run the following algorithm (stopping at the first point where a value is returned):
document.getElementById() method would find if it was passed as its argument the current value of the corresponding content attribute.
On setting, if the given element has an id attribute, then the content attribute must be set to the value of that id attribute. Otherwise, the DOM attribute must be set to the empty string.
The HTMLCollection, HTMLAllCollection, HTMLFormControlsCollection, HTMLOptionsCollection, and HTMLPropertyCollection interfaces represent various lists of DOM nodes. Collectively, objects implementing these interfaces are called collections.
When a collection is created, a filter and a root are associated with the collection.
For example, when the HTMLCollection object for the document.images attribute is created, it is associated with a filter that selects only img elements, and rooted at the root of the document.
The collection then represents a live view of the subtree rooted at the collection's root, containing only nodes that match the given filter. The view is linear. In the absence of specific requirements to the contrary, the nodes within the collection must be sorted in tree order .
The rows list is not in tree order.
An attribute that returns a collection must return the same object every time it is retrieved.
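The idea of a collection as a live, filtered view can be sketched with a tiny tree model. The `Node` and `Collection` classes below are hypothetical and exist only for this illustration; the point is that the node list is recomputed from the root on every access, so later tree changes are reflected without creating a new object.

```python
class Node:
    """Minimal tree node used only for this illustration."""

    def __init__(self, tag, children=()):
        self.tag = tag
        self.children = list(children)


def tree_order(node):
    """Yield a node and its descendants in tree (preorder, depth-first) order."""
    yield node
    for child in node.children:
        yield from tree_order(child)


class Collection:
    """A live view of the subtree at `root`, restricted by `node_filter`."""

    def __init__(self, root, node_filter):
        self.root = root
        self.node_filter = node_filter

    def _nodes(self):
        # Recomputed on every access, so the view is always current.
        return [n for n in tree_order(self.root) if self.node_filter(n)]

    @property
    def length(self):
        return len(self._nodes())

    def item(self, index):
        nodes = self._nodes()
        return nodes[index] if 0 <= index < len(nodes) else None
```

A production implementation would cache and invalidate rather than rescan the tree, but the observable behaviour is the same.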
The HTMLCollection interface represents a generic collection of elements.
interface HTMLCollection {
readonly attribute unsigned long length;
caller getter Element item(in unsigned long index);
caller getter Element namedItem(in DOMString name);
HTMLAllCollection tags(in DOMString tagName);
};
length
Returns the number of elements in the collection.
item(index)
Returns the item with index index from the collection. The items are sorted in tree order .
Returns null if index is out of range.
namedItem(name)
Returns the first item with ID or name name from the collection.
Returns null if no element with that ID or name could be found.
Only a, applet, area, embed, form, frame, frameset, iframe, img, and object elements can have a name for the purpose of this method; their name is given by the value of their name attribute.
tags(tagName)
Returns a collection that is a filtered view of the current collection, containing only elements with the given tag name.
The object's indices of the supported indexed properties are the numbers in the range zero to one less than the number of nodes represented by the collection . If there are no such elements, then there are no supported indexed properties .
The length attribute must return the number of nodes represented by the collection.
The item(index) method must return the indexth node in the collection. If there is no indexth node in the collection, then the method must return null.
The names of the supported named properties consist of the values of the name attributes of each a, applet, area, embed, form, frame, frameset, iframe, img, and object element represented by the collection with a name attribute, plus the list of IDs that the elements represented by the collection have.
The namedItem(key) method must return the first node in the collection that matches the following requirements:
a, applet, area, embed, form, frame, frameset, iframe, img, or object element with a name attribute equal to key, or,
If no such elements are found, then the method must return null.
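A sketch of this lookup over a list of dict-based nodes follows. The second requirement is cut off in the text above, so the ID branch here is an assumption drawn from the method summary ("the first item with ID or name name"); the function name and node model are likewise illustrative.

```python
# Elements that can have a name for the purpose of namedItem().
NAMEABLE = {
    "a", "applet", "area", "embed", "form",
    "frame", "frameset", "iframe", "img", "object",
}


def named_item(nodes, key):
    """Return the first node matching `key` by name or ID, or None.

    `nodes` is a hypothetical list of dicts in collection order, each with
    a "tag" key and optional "id" and "name" keys.
    """
    for node in nodes:
        if node.get("tag") in NAMEABLE and node.get("name") == key:
            return node
        if node.get("id") == key:  # assumed from the method summary
            return node
    return None
```

Note that a name attribute on an element outside the listed set, such as a div, does not count as a match.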
The tags(tagName) method must return an HTMLAllCollection rooted at the same node as the HTMLCollection object on which the method was invoked, whose filter matches only HTML elements whose local name is the tagName argument and that already match the filter of the HTMLCollection object on which the method was invoked. In HTML documents, the argument must first be converted to ASCII lowercase.
The HTMLAllCollection interface represents a generic collection of elements just like HTMLCollection, with the exception that its namedItem() method returns an HTMLAllCollection object when there are multiple matching elements.
interface HTMLAllCollection {
readonly attribute unsigned long length;
caller getter Element item(in unsigned long index);
caller getter Object namedItem(in DOMString name);
HTMLAllCollection tags(in DOMString tagName);
};
length
Returns the number of elements in the collection.
item(index)
Returns the item with index index from the collection. The items are sorted in tree order .
Returns null if index is out of range.
namedItem(name)
Returns the item with ID or name name from the collection.
If there are multiple matching items, then an HTMLAllCollection object containing all those elements is returned.
Returns null if no element with that ID or name could be found.
Only a, applet, area, embed, form, frame, frameset, iframe, img, and object elements can have a name for the purpose of this method; their name is given by the value of their name attribute.
tags(tagName)
Returns a collection that is a filtered view of the current collection, containing only elements with the given tag name.
The object's indices of the supported indexed properties are the numbers in the range zero to one less than the number of nodes represented by the collection . If there are no such elements, then there are no supported indexed properties .
The length attribute must return the number of nodes represented by the collection.
The item(index) method must return the indexth node in the collection. If there is no indexth node in the collection, then the method must return null.
The names of the supported named properties consist of the values of the name attributes of each a, applet, area, embed, form, frame, frameset, iframe, img, and object element represented by the collection with a name attribute, plus the list of IDs that the elements represented by the collection have.
The namedItem(key) method must act according to the following algorithm:
Let collection be an HTMLAllCollection object rooted at the same node as the HTMLAllCollection object on which the method was invoked, whose filter matches only elements that already match the filter of the HTMLAllCollection object on which the method was invoked and that are either:
The tags(tagName) method must return an HTMLAllCollection rooted at the same node as the HTMLAllCollection object on which the method was invoked, whose filter matches only HTML elements whose local name is the tagName argument and that already match the filter of the HTMLAllCollection object on which the method was invoked. In HTML documents, the argument must first be converted to ASCII lowercase.
The HTMLFormControlsCollection interface represents a collection of listed elements in form and fieldset elements.
interface HTMLFormControlsCollection {
readonly attribute unsigned long length;
caller getter HTMLElement item(in unsigned long index);
caller getter Object namedItem(in DOMString name);
};
interface RadioNodeList : NodeList {
attribute DOMString value;
};
length
Returns the number of elements in the collection.
item(index)
Returns the item with index index from the collection. The items are sorted in tree order .
Returns null if index is out of range.
namedItem
(
name
)
namedItem
(
name
)
Returns
the
item
with
ID
or
name
name
from
the
collection.
If
there
are
multiple
matching
items,
then
a
RadioNodeList
object
containing
all
those
elements
is
returned.
Returns
null
if
no
element
with
that
ID
or
name
could
be
found.
value [ = value ]
Returns the value of the first checked radio button represented by the object.
Can be set, to check the first radio button with the given value represented by the object.
The object's indices of the supported indexed properties are the numbers in the range zero to one less than the number of nodes represented by the collection. If there are no such elements, then there are no supported indexed properties.
The length attribute must return the number of nodes represented by the collection.
The item(index) method must return the indexth node in the collection. If there is no indexth node in the collection, then the method must return null.
The names of the supported named properties consist of the values of all the id and name attributes of all the elements represented by the collection.
The namedItem(name) method must act according to the following algorithm:
If there is exactly one node in the collection that has either an id attribute or a name attribute equal to name, then return that node and stop the algorithm.
Otherwise, if there are no nodes in the collection that have either an id attribute or a name attribute equal to name, then return null and stop the algorithm.
Otherwise, create a RadioNodeList object representing a live view of the HTMLFormControlsCollection object, further filtered so that the only nodes in the RadioNodeList object are those that have either an id attribute or a name attribute equal to name. The nodes in the RadioNodeList object must be sorted in tree order.
Return that RadioNodeList object.
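The three outcomes of the namedItem() algorithm above can be sketched as follows. This is only an illustration under stated assumptions: nodes are modelled as plain objects with optional id and name properties, the function name namedItemSketch is hypothetical, and a plain array stands in for the RadioNodeList (so the result is a snapshot, not a live view).

```javascript
// Hypothetical model: the collection is an array of plain objects in tree order.
function namedItemSketch(nodes, name) {
  // Filtering an ordered list keeps the matches in tree order.
  const matches = nodes.filter((node) => node.id === name || node.name === name);
  if (matches.length === 1) return matches[0]; // exactly one match: return that node
  if (matches.length === 0) return null;       // no match: return null
  return matches;                              // several matches: stands in for a RadioNodeList
}

const controls = [
  { id: "email", name: "email" },
  { name: "plan", value: "free" },
  { name: "plan", value: "pro" },
];
```

With this model, looking up "email" yields the single node, "plan" yields both radio-style entries in order, and an unknown name yields null.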
Members of the RadioNodeList interface inherited from the NodeList interface must behave as they would on a NodeList object.
The value DOM attribute on the RadioNodeList object, on getting, must return the value returned by running the following steps:
Let element be the first element in tree order represented by the RadioNodeList object that is an input element whose type attribute is in the Radio Button state and whose checkedness is true. Otherwise, let it be null.
If element is null, or if it is an element with no value attribute, return the empty string.
Otherwise, return the value of element's value attribute.
On setting, the value DOM attribute must run the following steps:
Let element be the first element in tree order represented by the RadioNodeList object that is an input element whose type attribute is in the Radio Button state and whose value content attribute is present and equal to the new value, if any. Otherwise, let it be null.
If element is not null, then set its checkedness to true.
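The getting and setting steps for value can be sketched as below. The model is an assumption made for illustration: each node is a plain object of the form { type, value, checked }, an absent value attribute is represented by value being undefined, and the function names are hypothetical. The sketch does not model the rest of radio button group behavior (for instance, unchecking other buttons in the group).

```javascript
// Hypothetical predicate: an input element in the Radio Button state.
function isRadioButton(node) {
  return node.type === "radio";
}

// Getter steps: first checked radio button in tree order, or the empty string.
function getRadioValue(nodes) {
  const element = nodes.find((node) => isRadioButton(node) && node.checked) || null;
  if (element === null || element.value === undefined) return "";
  return element.value;
}

// Setter steps: check the first radio button whose value attribute equals newValue.
function setRadioValue(nodes, newValue) {
  const element =
    nodes.find(
      (node) => isRadioButton(node) && node.value !== undefined && node.value === newValue
    ) || null;
  if (element !== null) element.checked = true;
}
```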
The HTMLOptionsCollection interface represents a list of option elements. It is always rooted on a select element and has attributes and methods that manipulate that element's descendants.
interface HTMLOptionsCollection {
  attribute unsigned long length;
  caller getter HTMLOptionElement item(in unsigned long index);
  caller getter Object namedItem(in DOMString name);
  void add(in HTMLElement element, optional in HTMLElement before);
  void add(in HTMLElement element, in long before);
  void remove(in long index);
};
length [ = value ]
Returns the number of elements in the collection.
When set to a smaller number, truncates the number of option elements in the corresponding container.
When set to a greater number, adds new blank option elements to that container.
item(index)
Returns the item with index index from the collection. The items are sorted in tree order.
Returns null if index is out of range.
namedItem(name)
Returns the item with ID or name name from the collection.
If there are multiple matching items, then a NodeList object containing all those elements is returned.
Returns null if no element with that ID could be found.
add(element [, before])
Inserts element before the node given by before.
The before argument can be a number, in which case element is inserted before the item with that number, or an element from the collection, in which case element is inserted before that element.
If before is omitted, null, or a number out of range, then element will be added at the end of the list.
This method will throw a HIERARCHY_REQUEST_ERR exception if element is an ancestor of the element into which it is to be inserted. If element is not an option or optgroup element, then the method does nothing.
The object's indices of the supported indexed properties are the numbers in the range zero to one less than the number of nodes represented by the collection. If there are no such elements, then there are no supported indexed properties.
On getting, the length attribute must return the number of nodes represented by the collection.
On setting, the behavior depends on whether the new value is equal to, greater than, or less than the number of nodes represented by the collection at that time. If the number is the same, then setting the attribute must do nothing. If the new value is greater, then n new option elements with no attributes and no child nodes must be appended to the select element on which the HTMLOptionsCollection is rooted, where n is the difference between the two numbers (new value minus old value). If the new value is lower, then the last n nodes in the collection must be removed from their parent nodes, where n is the difference between the two numbers (old value minus new value).
Setting length never removes or adds any optgroup elements, and never adds new children to existing optgroup elements (though it can remove children from them).
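The effect of setting length can be sketched on a small tree model. Everything about the model is an assumption for illustration: nodes are plain objects of the form { tag, children }, the collection is taken to contain option children of the select element and of its optgroup children, and the function names are hypothetical.

```javascript
// Gather the option nodes in tree order, remembering each node's parent.
function collectOptions(select) {
  const found = [];
  for (const child of select.children) {
    if (child.tag === "option") found.push({ node: child, parent: select });
    if (child.tag === "optgroup") {
      for (const grandchild of child.children) {
        if (grandchild.tag === "option") found.push({ node: grandchild, parent: child });
      }
    }
  }
  return found;
}

function setOptionsLength(select, newLength) {
  const options = collectOptions(select);
  const oldLength = options.length;
  if (newLength > oldLength) {
    // Append n blank option elements to the select element itself.
    for (let i = 0; i < newLength - oldLength; i++) {
      select.children.push({ tag: "option", children: [] });
    }
  } else if (newLength < oldLength) {
    // Remove the last n options from whatever parent holds them.
    for (const { node, parent } of options.slice(newLength)) {
      parent.children.splice(parent.children.indexOf(node), 1);
    }
  }
  // Equal lengths: nothing to do.
}
```

Note how shrinking can empty an optgroup without removing the optgroup itself, and how growing only ever appends to the select element, matching the last paragraph above.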
The item(index) method must return the indexth node in the collection. If there is no indexth node in the collection, then the method must return null.
The names of the supported named properties consist of the values of all the id and name attributes of all the elements represented by the collection.
The namedItem(name) method must act according to the following algorithm:
If there is exactly one node in the collection that has either an id attribute or a name attribute equal to name, then return that node and stop the algorithm.
Otherwise, if there are no nodes in the collection that have either an id attribute or a name attribute equal to name, then return null and stop the algorithm.
Otherwise, create a NodeList object representing a live view of the HTMLOptionsCollection object, further filtered so that the only nodes in the NodeList object are those that have either an id attribute or a name attribute equal to name. The nodes in the NodeList object must be sorted in tree order.
Return that NodeList object.
The add(element, before) method must act according to the following algorithm:
If element is not an option or optgroup element, then return and abort these steps.
If element is an ancestor of the select element on which the HTMLOptionsCollection is rooted, then throw a HIERARCHY_REQUEST_ERR exception.
If before is an element, but that element isn't a descendant of the select element on which the HTMLOptionsCollection is rooted, then throw a NOT_FOUND_ERR exception.
If element and before are the same element, then return and abort these steps.
If before is a node, then let reference be that node. Otherwise, if before is an integer, and there is a beforeth node in the collection, let reference be that node. Otherwise, let reference be null.
If reference is not null, let parent be the parent node of reference. Otherwise, let parent be the select element on which the HTMLOptionsCollection is rooted.
Act as if the DOM Core insertBefore() method was invoked on the parent node, with element as the first argument and reference as the second argument.
The remove(index) method must act according to the following algorithm:
If the number of nodes represented by the collection is zero, abort these steps.
If index is not a number greater than or equal to 0 and less than the number of nodes represented by the collection, let element be the first element in the collection. Otherwise, let element be the indexth element in the collection.
Remove element from its parent node.
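The remove() steps above, including the fallback to the first element for an out-of-range index, can be sketched as follows. As a simplifying assumption the collection is modelled as a flat array that also plays the role of the parent node; the name removeOptionSketch is hypothetical.

```javascript
function removeOptionSketch(options, index) {
  // Empty collection: abort.
  if (options.length === 0) return;
  // Out-of-range (or non-numeric) index falls back to the first element.
  const inRange = Number.isInteger(index) && index >= 0 && index < options.length;
  const element = inRange ? options[index] : options[0];
  // Remove element from its parent (here, the array itself).
  options.splice(options.indexOf(element), 1);
}
```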
The HTMLPropertyCollection interface represents a collection of elements that add name-value pairs to a particular item in the microdata model.
interface HTMLPropertyCollection {
  readonly attribute unsigned long length;
  readonly attribute DOMStringList names;
  caller getter HTMLElement item(in unsigned long index);
  caller getter PropertyNodeList namedItem(in DOMString name);
};
typedef sequence<any> PropertyValueArray;
interface PropertyNodeList : NodeList {
  attribute PropertyValueArray contents;
};
length
Returns the number of elements in the collection.
names
Returns a DOMStringList with the property names of the elements in the collection.
item(index)
Returns the element with index index from the collection. The items are sorted in tree order.
Returns null if index is out of range.
namedItem(name)
Returns a PropertyNodeList object containing any elements that add a property named name.
contents
Returns an array of the various values that the relevant elements have.
The object's indices of the supported indexed properties are the numbers in the range zero to one less than the number of nodes represented by the collection. If there are no such elements, then there are no supported indexed properties.
The length attribute must return the number of nodes represented by the collection.
The item(index) method must return the indexth node in the collection. If there is no indexth node in the collection, then the method must return null.
The names of the supported named properties consist of the property names of all the elements represented by the collection.
The names attribute must return a live DOMStringList object giving the property names of all the elements represented by the collection, listed in tree order, but with duplicates removed, leaving only the first occurrence of each name. The same object must be returned each time.
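The ordering and de-duplication rule for names can be sketched as below. The model is an assumption: each element is a plain object carrying a hypothetical propertyNames array, and the result is a plain array rather than a live DOMStringList.

```javascript
// Collect property names in tree order, keeping only the first occurrence of each.
function propertyNamesSketch(elements) {
  const names = [];
  for (const element of elements) {
    for (const name of element.propertyNames) {
      if (!names.includes(name)) names.push(name);
    }
  }
  return names;
}
```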
The namedItem(name) method must return a PropertyNodeList object representing a live view of the HTMLPropertyCollection object, further filtered so that the only nodes in the PropertyNodeList object are those that have a property name equal to name. The nodes in the PropertyNodeList object must be sorted in tree order, and the same object must be returned each time a particular name is queried.
Members of the PropertyNodeList interface inherited from the NodeList interface must behave as they would on a NodeList object.
The contents DOM attribute on the PropertyNodeList object, on getting, must return a newly constructed array whose values are the values obtained from the content DOM property of each of the elements represented by the object, in tree order.
The DOMTokenList interface represents an interface to an underlying string that consists of a set of space-separated tokens.
DOMTokenList objects are always case-sensitive, even when the underlying string might ordinarily be treated in a case-insensitive manner.
interface DOMTokenList {
  readonly attribute unsigned long length;
  getter DOMString item(in unsigned long index);
  boolean contains(in DOMString token);
  void add(in DOMString token);
  void remove(in DOMString token);
  boolean toggle(in DOMString token);
  stringifier DOMString ();
};
length
Returns the number of tokens in the string.
item(index)
Returns the token with index index. The tokens are returned in the order they are found in the underlying string.
Returns null if index is out of range.
contains(token)
Returns true if the token is present; false otherwise.
Throws a SYNTAX_ERR exception if token is empty.
Throws an INVALID_CHARACTER_ERR exception if token contains any spaces.
add(token)
Adds token, unless it is already present.
Throws a SYNTAX_ERR exception if token is empty.
Throws an INVALID_CHARACTER_ERR exception if token contains any spaces.
remove(token)
Removes token if it is present.
Throws a SYNTAX_ERR exception if token is empty.
Throws an INVALID_CHARACTER_ERR exception if token contains any spaces.
toggle(token)
Adds token if it is not present, or removes it if it is.
Throws a SYNTAX_ERR exception if token is empty.
Throws an INVALID_CHARACTER_ERR exception if token contains any spaces.
The length attribute must return the number of tokens that result from splitting the underlying string on spaces. This is the length.
The object's indices of the supported indexed properties are the numbers in the range zero to length-1, unless the length is zero, in which case there are no supported indexed properties.
The item(index) method must split the underlying string on spaces, preserving the order of the tokens as found in the underlying string, and then return the indexth item in this list. If index is equal to or greater than the number of tokens, then the method must return null.
For example, if the string is "a b a c" then there are four tokens: the token with index 0 is "a", the token with index 1 is "b", the token with index 2 is "a", and the token with index 3 is "c".
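The contains(token) method must run the following algorithm:
If the token argument is the empty string, then raise a SYNTAX_ERR exception and stop the algorithm.
If the token argument contains any spaces, then raise an INVALID_CHARACTER_ERR exception and stop the algorithm.
Otherwise, return true if the given token is one of the tokens in the DOMTokenList object's underlying string, and false otherwise.
The add(token) method must run the following algorithm:
If the token argument is the empty string, then raise a SYNTAX_ERR exception and stop the algorithm.
If the token argument contains any spaces, then raise an INVALID_CHARACTER_ERR exception and stop the algorithm.
If the given token is already one of the tokens in the DOMTokenList object's underlying string then stop the algorithm.
Otherwise, if the DOMTokenList object's underlying string is not the empty string and the last character of that string is not a space character, then append a U+0020 SPACE character to the end of that string.
Append the value of token to the end of the DOMTokenList object's underlying string.
The remove(token) method must run the following algorithm:
If the token argument is the empty string, then raise a SYNTAX_ERR exception and stop the algorithm.
If the token argument contains any spaces, then raise an INVALID_CHARACTER_ERR exception and stop the algorithm.
Otherwise, remove the given token from the underlying string.
The toggle(token) method must run the following algorithm:
If the token argument is the empty string, then raise a SYNTAX_ERR exception and stop the algorithm.
If the token argument contains any spaces, then raise an INVALID_CHARACTER_ERR exception and stop the algorithm.
If the given token is already one of the tokens in the DOMTokenList object's underlying string then remove the given token from the underlying string and stop the algorithm, returning false.
Otherwise, if the DOMTokenList object's underlying string is not the empty string and the last character of that string is not a space character, then append a U+0020 SPACE character to the end of that string.
Append the value of token to the end of the DOMTokenList object's underlying string.
Return true.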
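The token list methods can be sketched over a plain string as follows. This is an illustration under stated assumptions, not a conforming implementation: the class name TokenListSketch is hypothetical, exceptions are ordinary Error objects whose message carries the exception name, the space characters are taken to be space, tab, LF, FF and CR, and remove() simply rebuilds the string from the remaining tokens joined by single spaces rather than editing the string in place.

```javascript
const SPACE_CHARACTER = /[ \t\n\f\r]/;

class TokenListSketch {
  constructor(underlying = "") {
    this.value = underlying; // the underlying string
  }
  // Split the underlying string on spaces, preserving order.
  tokens() {
    return this.value.split(/[ \t\n\f\r]+/).filter((token) => token !== "");
  }
  get length() {
    return this.tokens().length;
  }
  item(index) {
    const tokens = this.tokens();
    return index >= 0 && index < tokens.length ? tokens[index] : null;
  }
  // Shared argument checks for contains, add, remove and toggle.
  static validate(token) {
    if (token === "") throw new Error("SYNTAX_ERR");
    if (SPACE_CHARACTER.test(token)) throw new Error("INVALID_CHARACTER_ERR");
  }
  contains(token) {
    TokenListSketch.validate(token);
    return this.tokens().includes(token);
  }
  add(token) {
    TokenListSketch.validate(token);
    if (this.tokens().includes(token)) return;
    const lastIsSpace = SPACE_CHARACTER.test(this.value.slice(-1));
    if (this.value !== "" && !lastIsSpace) this.value += " ";
    this.value += token;
  }
  remove(token) {
    TokenListSketch.validate(token);
    if (!this.tokens().includes(token)) return;
    // Simplification: drops every occurrence and normalizes the separators.
    this.value = this.tokens().filter((other) => other !== token).join(" ");
  }
  toggle(token) {
    if (this.contains(token)) {
      this.remove(token);
      return false;
    }
    this.add(token);
    return true;
  }
  toString() {
    return this.value;
  }
}
```

For the string "a b a c" this model reports four tokens, with item(2) giving "a" and item(4) giving null, in line with the example given for item().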
Objects implementing the DOMTokenList interface must stringify to the object's underlying string representation.
The DOMSettableTokenList interface is the same as the DOMTokenList interface, except that it allows the underlying string to be directly changed.
interface DOMSettableTokenList : DOMTokenList {
  attribute DOMString value;
};
value
Returns the underlying string.
Can be set, to change the underlying string.
An object implementing the DOMSettableTokenList interface must act as defined for the DOMTokenList interface, except for the value attribute defined here.
The value attribute must return the underlying string on getting, and must replace the underlying string with the new value on setting.
When a user agent is required to obtain a structured clone of an object, it must run the following algorithm, which either returns a separate object, or throws an exception.
Let input be the object being cloned.
Let memory be a list of objects, initially empty. (This is used to catch cycles.)
Let output be the object resulting from calling the internal structured cloning algorithm with input and memory.
Return output.
The internal structured cloning algorithm is always called with two arguments, input and memory, and its behavior depends on the type of input, as follows:
Return the undefined value.
Return the null value.
Return the false value.
Return the true value.
Return a newly constructed Number object with the same value as input.
Return a newly constructed String object with the same value as input.
Date object
Return a newly constructed Date object with the same value as input.
RegExp object
Return a newly constructed RegExp object with the same pattern and flags as input.
The value of the lastIndex property is not copied.
ImageData object
Return a newly constructed ImageData object with the same width and height as input, and with a newly constructed CanvasPixelArray for its data attribute, with the same length and pixel values as the input's.
File object
Return a newly constructed File object corresponding to the same underlying data.
FileData object
Return a newly constructed FileData object corresponding to the same underlying data.
FileList object
Return a newly constructed FileList object containing a list of newly constructed File objects corresponding to the same underlying data as those in input, maintaining their relative order.
Return the null value.
If input is in memory, then throw a NOT_SUPPORTED_ERR exception and abort the overall structured clone algorithm.
Otherwise, let new memory be a list consisting of the items in memory with the addition of input.
Create a new object, output, of the same type as input: either an Array or an Object.
For each enumerable property in input, add a corresponding property to output having the same name, and having a value created from invoking the internal structured cloning algorithm recursively with the value of the property as the "input" argument and new memory as the "memory" argument. The order of the properties in the input and output objects must be the same.
This does not walk the prototype chain.
Return output.
Another native object type (e.g. Error)
Return the null value.
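The recursion and the role of memory can be sketched as follows. This covers only part of the algorithm and rests on assumptions that are not in the text above: the function names are hypothetical, primitive numbers and strings are returned unchanged, the ImageData and File cases are omitted, anything that is not an Array or a plain Object (including functions) is mapped to null, and the exception is an ordinary Error carrying the name NOT_SUPPORTED_ERR.

```javascript
function structuredCloneSketch(input) {
  // memory starts empty; it is used to catch cycles.
  return internalCloneSketch(input, []);
}

function internalCloneSketch(input, memory) {
  if (input === undefined || input === null || typeof input === "boolean") return input;
  // Assumption: primitive numbers and strings are passed through as-is.
  if (typeof input === "number" || typeof input === "string") return input;
  if (input instanceof Number) return new Number(input.valueOf());
  if (input instanceof String) return new String(input.valueOf());
  if (input instanceof Date) return new Date(input.getTime());
  // A new RegExp starts with lastIndex 0, so lastIndex is not copied.
  if (input instanceof RegExp) return new RegExp(input.source, input.flags);

  const isArray = Array.isArray(input);
  const isPlainObject =
    typeof input === "object" && Object.getPrototypeOf(input) === Object.prototype;
  // Assumption: every other kind of object (e.g. Error) becomes null.
  if (!isArray && !isPlainObject) return null;

  if (memory.includes(input)) throw new Error("NOT_SUPPORTED_ERR");
  const newMemory = memory.concat([input]);

  const output = isArray ? [] : {};
  // Object.keys lists own enumerable properties only, in order,
  // so the prototype chain is not walked.
  for (const key of Object.keys(input)) {
    output[key] = internalCloneSketch(input[key], newMemory);
  }
  return output;
}
```

Because new memory is only passed down to the properties of the object being cloned, an object that is referenced twice without forming a cycle is cloned twice rather than rejected; only a genuine cycle raises the exception.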
The DOMStringMap interface represents a set of name-value pairs. It exposes these using the scripting language's native mechanisms for property access.
When a DOMStringMap object is instantiated, it is associated with three algorithms, one for getting the list of name-value pairs, one for setting names to certain values, and one for deleting names.
interface DOMStringMap {
  getter DOMString (in DOMString name);
  setter void (in DOMString name, in DOMString value);
  creator void (in DOMString name);
  deleter void (in DOMString name);
};
The names of the supported named properties on a DOMStringMap object at any instant are the names of each pair returned from the algorithm for getting the list of name-value pairs at that instant.
When a DOMStringMap object is indexed to retrieve a named property name, the value returned must be the value component of the name-value pair whose name component is name in the list returned by the algorithm for getting the list of name-value pairs.
When a DOMStringMap object is indexed to create or modify a named property name with value value, the algorithm for setting names to certain values must be run, passing name as the name and the result of converting value to a DOMString as the value.
When a DOMStringMap object is indexed to delete a named property named name, the algorithm for deleting names must be run, passing name as the name.
The DOMStringMap interface definition here is only intended for JavaScript environments. Other language bindings will need to define how DOMStringMap is to be implemented for those languages.
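One way to picture how the three associated algorithms drive property access is the sketch below, which uses a JavaScript Proxy. It is only an illustration: the factory name createStringMapSketch and the getPairs, setName and deleteName callbacks are hypothetical stand-ins for the three algorithms, and a Map stands in for the backing store.

```javascript
function createStringMapSketch({ getPairs, setName, deleteName }) {
  // Find the name-value pair for a name in the current list, if any.
  const lookup = (name) => getPairs().find(([key]) => key === name);
  return new Proxy({}, {
    get(target, name) {
      const pair = lookup(name);
      return pair ? pair[1] : undefined;
    },
    set(target, name, value) {
      setName(name, String(value)); // value is converted to a string first
      return true;
    },
    deleteProperty(target, name) {
      deleteName(name);
      return true;
    },
    // Supported property names are whatever the getter algorithm returns right now.
    ownKeys() {
      return getPairs().map(([key]) => key);
    },
    getOwnPropertyDescriptor(target, name) {
      const pair = lookup(name);
      return pair
        ? { value: pair[1], writable: true, enumerable: true, configurable: true }
        : undefined;
    },
  });
}

// Usage: a Map plays the role of the underlying name-value store.
const backingStore = new Map([["hp", "46"]]);
const stringMap = createStringMapSketch({
  getPairs: () => [...backingStore.entries()],
  setName: (name, value) => backingStore.set(name, value),
  deleteName: (name) => backingStore.delete(name),
});
```

Because every access goes back to the callbacks, the object holds no state of its own, which is the property the prose above relies on when it speaks of the names "at that instant".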
The dataset attribute on elements exposes the data-* attributes on the element.
Given the following fragment and elements with similar constructions:
<img class="tower" id="tower5" data-x="12" data-y="5"
     data-ai="robotarget" data-hp="46" data-ability="flames"
     src="towers/rocket.png" alt="Rocket Tower">
...one could imagine a function splashDamage() that takes some arguments, the first of which is the element to process:
function splashDamage(node, x, y, damage) {
  if (node.classList.contains('tower') && // checking the 'class' attribute
      node.dataset.x == x && // reading the 'data-x' attribute
      node.dataset.y == y) { // reading the 'data-y' attribute
    var hp = parseInt(node.dataset.hp, 10); // reading the 'data-hp' attribute (base ten)
    hp = hp - damage;
    if (hp < 0) {
      hp = 0;
      node.dataset.ai = 'dead'; // setting the 'data-ai' attribute
      delete node.dataset.ability; // removing the 'data-ability' attribute
    }
    node.dataset.hp = hp; // setting the 'data-hp' attribute
  }
}
DOM3 Core defines mechanisms for checking for interface support, and for obtaining implementations of interfaces, using feature strings. [DOMCORE]
Authors are strongly discouraged from using these, as they are notoriously unreliable and imprecise. Authors are encouraged to rely on explicit feature testing or the graceful degradation behavior intrinsic to some of the features in this specification.
For historical reasons, user agents should return the true value when the hasFeature(feature, version) method of the DOMImplementation interface is invoked with feature set to either "HTML" or "XHTML" and version set to either "1.0" or "2.0".
The following DOMException codes are defined in DOM Core. [DOMCORE]
INDEX_SIZE_ERR
DOMSTRING_SIZE_ERR
HIERARCHY_REQUEST_ERR
WRONG_DOCUMENT_ERR
INVALID_CHARACTER_ERR
NO_DATA_ALLOWED_ERR
NO_MODIFICATION_ALLOWED_ERR
NOT_FOUND_ERR
NOT_SUPPORTED_ERR
INUSE_ATTRIBUTE_ERR
INVALID_STATE_ERR
SYNTAX_ERR
INVALID_MODIFICATION_ERR
NAMESPACE_ERR
INVALID_ACCESS_ERR
VALIDATION_ERR
TYPE_MISMATCH_ERR
SECURITY_ERR
NETWORK_ERR
ABORT_ERR
URL_MISMATCH_ERR
QUOTA_EXCEEDED_ERR
PARSE_ERR
SERIALISE_ERR
There is an implied strong reference from any DOM attribute that returns a pre-existing object to that object.
For example, the document.location attribute means that there is a strong reference from a Document object to its Location object. Similarly, there is always a strong reference from a Document to any descendant nodes, and from any node to its owner Document.
This section is controversial and does not enjoy broad consensus.
It is arguable that implementation advice for the DOM regarding garbage collection techniques should be in the HTML5 specification as normative language. The purpose of this section is also vague and should either be removed, or clarified.
This section is non-normative.
An introduction to marking up a document.
Every XML and HTML document in an HTML UA is represented by a Document object. [DOMCORE]
The document's address is an absolute URL that is set when the Document is created. The document's current address is an absolute URL that can change during the lifetime of the Document, for example when the user navigates to a fragment identifier on the page. The document's current address must be set to the document's address when the Document is created.
Interactive user agents typically expose the document's current address in their user interface.
When a Document is created by a script using the createDocument() API, the document's address is the same as the document's address of the active document of the script's browsing context.
Document objects are assumed to be XML documents unless they are flagged as being HTML documents when they are created. Whether a document is an HTML document or an XML document affects the behavior of certain APIs, as well as a few CSS rendering rules. [CSS]
A Document object created by the createDocument() API on the DOMImplementation object is initially an XML document, but can be made into an HTML document by calling document.open() on it.
All Document objects (in user agents implementing this specification) must also implement the HTMLDocument interface, available using binding-specific methods. (This is the case whether or not the document in question is an HTML document or indeed whether it contains any HTML elements at all.) Document objects must also implement the document-level interface of any other namespaces found in the document that the UA supports. For example, if an HTML implementation also supports SVG, then the Document object implements both HTMLDocument and SVGDocument.
Because the HTMLDocument interface is now obtained using binding-specific casting methods instead of simply being the primary interface of the document object, it is no longer defined as inheriting from Document.
[OverrideBuiltins]
interface HTMLDocument {
// resource metadata management
[PutForwards=href] readonly attribute Location location;
readonly attribute DOMString URL;
attribute DOMString domain;
readonly attribute DOMString referrer;
attribute DOMString cookie;
readonly attribute DOMString lastModified;
readonly attribute DOMString compatMode;
attribute DOMString charset;
readonly attribute DOMString characterSet;
readonly attribute DOMString defaultCharset;
readonly attribute DOMString readyState;
// DOM tree accessors
attribute DOMString title;
attribute DOMString dir;
attribute HTMLElement body;
readonly attribute HTMLCollection images;
readonly attribute HTMLCollection embeds;
readonly attribute HTMLCollection plugins;
readonly attribute HTMLCollection links;
readonly attribute HTMLCollection forms;
readonly attribute HTMLCollection scripts;
NodeList getElementsByName(in DOMString elementName);
NodeList getElementsByClassName(in DOMString classNames);
NodeList getItems(optional in DOMString typeNames);
getter any (in DOMString name);
// dynamic markup insertion
attribute DOMString innerHTML;
HTMLDocument open(optional in DOMString type, optional in DOMString replace);
WindowProxy open(in DOMString url, in DOMString name, in DOMString features, optional in boolean replace);
void close();
void write(in DOMString... text);
void writeln(in DOMString... text);
// user interaction
Selection getSelection();
readonly attribute Element activeElement;
boolean hasFocus();
attribute DOMString designMode;
boolean execCommand(in DOMString commandId);
boolean execCommand(in DOMString commandId, in boolean showUI);
boolean execCommand(in DOMString commandId, in boolean showUI, in DOMString value);
boolean queryCommandEnabled(in DOMString commandId);
boolean queryCommandIndeterm(in DOMString commandId);
boolean queryCommandState(in DOMString commandId);
boolean queryCommandSupported(in DOMString commandId);
DOMString queryCommandValue(in DOMString commandId);
readonly attribute HTMLCollection commands;
// event handler DOM attributes
attribute Function onabort;
attribute Function onblur;
attribute Function oncanplay;
attribute Function oncanplaythrough;
attribute Function onchange;
attribute Function onclick;
attribute Function oncontextmenu;
attribute Function ondblclick;
attribute Function ondrag;
attribute Function ondragend;
attribute Function ondragenter;
attribute Function ondragleave;
attribute Function ondragover;
attribute Function ondragstart;
attribute Function ondrop;
attribute Function ondurationchange;
attribute Function onemptied;
attribute Function onended;
attribute Function onerror;
attribute Function onfocus;
attribute Function onformchange;
attribute Function onforminput;
attribute Function oninput;
attribute Function oninvalid;
attribute Function onkeydown;
attribute Function onkeypress;
attribute Function onkeyup;
attribute Function onload;
attribute Function onloadeddata;
attribute Function onloadedmetadata;
attribute Function onloadstart;
attribute Function onmousedown;
attribute Function onmousemove;
attribute Function onmouseout;
attribute Function onmouseover;
attribute Function onmouseup;
attribute Function onmousewheel;
attribute Function onpause;
attribute Function onplay;
attribute Function onplaying;
attribute Function onprogress;
attribute Function onratechange;
attribute Function onreadystatechange;
attribute Function onscroll;
attribute Function onseeked;
attribute Function onseeking;
attribute Function onselect;
attribute Function onshow;
attribute Function onstalled;
attribute Function onsubmit;
attribute Function onsuspend;
attribute Function ontimeupdate;
attribute Function onvolumechange;
attribute Function onwaiting;
};
Document implements HTMLDocument;
Since the HTMLDocument interface holds methods and attributes related to a number of disparate features, the members of this interface are described in various different sections.
User agents must raise a SECURITY_ERR exception whenever any of the members of an HTMLDocument object are accessed by scripts whose effective script origin is not the same as the Document's effective script origin.
URL
Returns the document's address.
referrer
Returns the address of the Document from which the user navigated to this one, unless it was blocked or there was no such document, in which case it returns the empty string.
The noreferrer link type can be used to block the referrer.
The URL attribute must return the document's address.
The referrer attribute must return either the current address of the active document of the source browsing context at the time the navigation was started (that is, the page which navigated the browsing context to the current document), or the empty string if there is no such originating page, or if the UA has been configured not to report referrers in this case, or if the navigation was initiated for a hyperlink with a noreferrer keyword.
In the case of HTTP, the referrer DOM attribute will match the Referer (sic) header that was sent when fetching the current page.
Typically user agents are configured to not report referrers in the case where the referrer uses an encrypted protocol and the current page does not (e.g. when navigating from an https: page to an http: page).
cookie [ = value ]
Returns the HTTP cookies that apply to the Document. If there are no cookies or cookies can't be applied to this resource, the empty string will be returned.
Can be set, to add a new cookie to the document's set of HTTP cookies.
If the Document has no browsing context an INVALID_STATE_ERR exception will be thrown. If the contents are sandboxed into a unique origin, a SECURITY_ERR exception will be thrown.
The cookie attribute represents the cookies of the resource.
On getting, if the document is not associated with a browsing context then the user agent must raise an INVALID_STATE_ERR exception. Otherwise, if the sandboxed origin browsing context flag was set on the browsing context of the Document when the Document was created, the user agent must raise a SECURITY_ERR exception. Otherwise, if the document's address does not use a server-based naming authority, it must return the empty string. Otherwise, it must first obtain the storage mutex and then return the same string as the value of the Cookie HTTP header it would include if fetching the resource indicated by the document's address over HTTP, as per RFC 2109 section 4.3.4 or later specifications, excluding HTTP-only cookies. [RFC2109] [COOKIES]
On setting, if the document is not associated with a browsing context then the user agent must raise an INVALID_STATE_ERR exception. Otherwise, if the sandboxed origin browsing context flag was set on the browsing context of the Document when the Document was created, the user agent must raise a SECURITY_ERR exception. Otherwise, if the document's address does not use a server-based naming authority, it must do nothing. Otherwise, the user agent must obtain the storage mutex and then act as it would when processing cookies if it had just attempted to fetch the document's address over HTTP, and had received a response with a Set-Cookie header whose value was the specified value, as per RFC 2109 sections 4.3.1, 4.3.2, and 4.3.3 or later specifications, but without overwriting the values of HTTP-only cookies. [RFC2109] [COOKIES]
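The order of the checks made on getting can be sketched as below. The document model is entirely hypothetical (the properties hasBrowsingContext, sandboxedOrigin, serverBased and cookies are invented for illustration), exceptions are ordinary Error objects carrying the exception name, the storage mutex is not modelled, and the returned string uses a simplified "name=value; name=value" form rather than the full RFC 2109 header syntax.

```javascript
function getCookieSketch(doc) {
  // No browsing context: INVALID_STATE_ERR.
  if (!doc.hasBrowsingContext) throw new Error("INVALID_STATE_ERR");
  // Sandboxed into a unique origin: SECURITY_ERR.
  if (doc.sandboxedOrigin) throw new Error("SECURITY_ERR");
  // Address without a server-based naming authority: empty string.
  if (!doc.serverBased) return "";
  // Otherwise, the cookies that would be sent, excluding HTTP-only ones.
  return doc.cookies
    .filter((cookie) => !cookie.httpOnly)
    .map((cookie) => `${cookie.name}=${cookie.value}`)
    .join("; ");
}
```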
This specification does not define what makes an HTTP-only cookie, and at the time of publication the editor is not aware of any reference for HTTP-only cookies. They are a feature supported by some Web browsers wherein an "httponly" parameter added to the cookie string causes the cookie to be hidden from script.
Since the cookie attribute is accessible across frames, the path restrictions on cookies are only a tool to help manage which cookies are sent to which parts of the site, and are not in any way a security feature.
lastModified
Returns the date of the last modification to the document, as reported by the server, in the form "MM/DD/YYYY hh:mm:ss".
If the last modification date is not known, the current time is returned instead.
The lastModified attribute, on getting, must return the date and time of the Document's source file's last modification, in the user's local time zone, in the following format:
All the numeric components above, other than the year, must be given as two digits in the range U+0030 DIGIT ZERO to U+0039 DIGIT NINE representing the number in base ten, zero-padded if necessary. The year must be given as four or more digits in the range U+0030 DIGIT ZERO to U+0039 DIGIT NINE representing the number in base ten, zero-padded if necessary.
The Document's source file's last modification date and time must be derived from relevant features of the networking protocols used, e.g. from the value of the HTTP Last-Modified header of the document, or from metadata in the file system for local files. If the last modification date and time are not known, the attribute must return the current date and time in the above format.
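The formatting requirements above can be illustrated with a short sketch. The function names formatLastModified and lastModifiedOrNow are hypothetical, and the sketch assumes the modification time is already available as a JavaScript Date (or null when it is not known); how a user agent derives that time from the network or file system is not shown.

```javascript
// Zero-pad a non-negative integer to at least `width` digits.
function pad(number, width) {
  return String(number).padStart(width, "0");
}

// Format a Date in the user's local time zone as "MM/DD/YYYY hh:mm:ss".
// All components are two digits except the year, which is four or more.
function formatLastModified(date) {
  const datePart = [
    pad(date.getMonth() + 1, 2), // getMonth() is zero-based
    pad(date.getDate(), 2),
    pad(date.getFullYear(), 4),
  ].join("/");
  const timePart = [
    pad(date.getHours(), 2),
    pad(date.getMinutes(), 2),
    pad(date.getSeconds(), 2),
  ].join(":");
  return datePart + " " + timePart;
}

// Hypothetical wrapper: fall back to the current time when the
// modification time is not known (modelled here as null).
function lastModifiedOrNow(knownDate, now = new Date()) {
  return formatLastModified(knownDate === null ? now : knownDate);
}
```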
compatMode
In a conforming document, returns the string "CSS1Compat". (In quirks mode documents, returns the string "BackCompat", but a conforming document can never trigger quirks mode.)
A Document is always set to one of three modes: no quirks mode, the default; quirks mode, used typically for legacy documents; and limited quirks mode, also known as "almost standards" mode. The mode is only ever changed from the default by the HTML parser, based on the presence, absence, or value of the DOCTYPE string.
The compatMode DOM attribute must return the literal string "CSS1Compat" unless the document has been set to quirks mode by the HTML parser, in which case it must instead return the literal string "BackCompat".
charset [ = value ]
Returns the document's character encoding.
Can be set, to dynamically change the document's character encoding.
New values that are not IANA-registered aliases supported by the user agent are ignored.
characterSet
Returns the document's character encoding.
defaultCharset
Returns what might be the user agent's default character encoding.
Documents have an associated character encoding. When a Document object is created, the document's character encoding must be initialized to UTF-16. Various algorithms during page loading affect this value, as does the charset setter. [IANACHARSET]
The charset DOM attribute must, on getting, return the preferred MIME name of the document's character encoding. On setting, if the new value is an IANA-registered alias for a character encoding supported by the user agent, the document's character encoding must be set to that character encoding. (Otherwise, nothing happens.)
The characterSet DOM attribute must, on getting, return the preferred MIME name of the document's character encoding.
The defaultCharset DOM attribute must, on getting, return the preferred MIME name of a character encoding, possibly the user's default encoding, or an encoding associated with the user's current geographical location, or any arbitrary encoding name.
readyState
Returns "loading" while the Document is loading, and "complete" once it has loaded.
The readystatechange event fires on the Document object when this value changes.
Each document has a current document readiness. When a Document object is created, it must have its current document readiness set to the string "loading" if the document is associated with an HTML parser or an XML parser, or to the string "complete" otherwise. Various algorithms during page loading affect this value. When the value is set, the user agent must fire a simple event called readystatechange at the Document object.
A Document is said to have an active parser if it is associated with an HTML parser or an XML parser that has not yet been stopped or aborted.
The readyState DOM attribute must, on getting, return the current document readiness.
The html element of a document is the document's root element, if there is one and it's an html element, or null otherwise.
The head element of a document is the first head element that is a child of the html element, if there is one, or null otherwise.
title [ = value ]
Returns the document's title, as given by the title element.
Can be set, to update the document's title. If there is no head element, the new value is ignored.
In SVG documents, the SVGDocument interface's title attribute takes precedence.
The title element of a document is the first title element in the document (in tree order), if there is one, or null otherwise.
The title attribute must, on getting, run the following algorithm:
If the root element is an svg element in the "http://www.w3.org/2000/svg" namespace, and the user agent supports SVG, then return the value that would have been returned by the DOM attribute of the same name on the SVGDocument interface. [SVG]
Otherwise, let value be a concatenation of the data of all the child text nodes of the title element, in tree order, or the empty string if the title element is null.
Replace any sequence of two or more consecutive space characters in value with a single U+0020 SPACE character.
Remove any leading or trailing space characters in value.
Return value.
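The non-SVG steps of the getter can be sketched as follows. The function name titleFromTextData is hypothetical; its argument models the data of the title element's child text nodes as an array of strings, with null standing for a null title element. The sketch assumes that "space characters" means U+0020, U+0009, U+000A, U+000C, and U+000D, a definition given elsewhere in this specification, and it omits the SVG branch entirely.

```javascript
// Assumed set of space characters: space, tab, LF, FF, CR.
const SPACE_RUN = /[ \t\n\f\r]{2,}/g;
const EDGE_SPACES = /^[ \t\n\f\r]+|[ \t\n\f\r]+$/g;

// childTextData: array of text node data strings, or null when there is
// no title element (a modelling assumption made for this sketch).
function titleFromTextData(childTextData) {
  // Concatenate the data of the child text nodes, in tree order.
  let value = childTextData === null ? "" : childTextData.join("");
  // Collapse runs of two or more space characters into one U+0020 SPACE.
  value = value.replace(SPACE_RUN, " ");
  // Remove leading and trailing space characters.
  value = value.replace(EDGE_SPACES, "");
  return value;
}
```

Note that only runs of two or more space characters are replaced, so a single tab between two words is preserved as a tab rather than converted to a space.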
On setting, the following algorithm must be run. Mutation events must be fired as appropriate.
If the root element is an svg element in the "http://www.w3.org/2000/svg" namespace, and the user agent supports SVG, then the setter must defer to the setter for the DOM attribute of the same name on the SVGDocument interface (if it is readonly, then this will raise an exception). Stop the algorithm here. [SVG]
If the title element is null and the head element is null, then the attribute must do nothing. Stop the algorithm here.
If the title element is null, then a new title element must be created and appended to the head element. Let element be that element. Otherwise, let element be the title element.
A Text node whose data is the new value being assigned must be appended to element.
The title attribute on the HTMLDocument interface should shadow the attribute of the same name on the SVGDocument interface when the user agent supports both HTML and SVG. [SVG]
body [ = value ]
Returns the body element.
Can be set, to replace the body element.
If the new value is not a body or frameset element, this will throw a HIERARCHY_REQUEST_ERR exception.
The body element of a document is the first child of the html element that is either a body element or a frameset element. If there is no such element, it is null. If the body element is null, then when the specification requires that events be fired at "the body element", they must instead be fired at the Document object.
The body attribute, on getting, must return the body element of the document (either a body element, a frameset element, or null). On setting, the following algorithm must be run:
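If the new value is not a body or frameset element, then raise a HIERARCHY_REQUEST_ERR exception and abort these steps.
Otherwise, if the body element is not null, then replace that element with the new value in the DOM, as if the root element's replaceChild() method had been called with the new value and the incumbent body element as its two arguments respectively, then abort these steps.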
images
Returns an HTMLCollection of the img elements in the Document.
embeds
plugins
Returns an HTMLCollection of the embed elements in the Document.
links
Returns an HTMLCollection of the a and area elements in the Document that have href attributes.
forms
Returns an HTMLCollection of the form elements in the Document.
scripts
Returns an HTMLCollection of the script elements in the Document.
The images attribute must return an HTMLCollection rooted at the Document node, whose filter matches only img elements.
The embeds attribute must return an HTMLCollection rooted at the Document node, whose filter matches only embed elements.
The plugins attribute must return the same object as that returned by the embeds attribute.
The links attribute must return an HTMLCollection rooted at the Document node, whose filter matches only a elements with href attributes and area elements with href attributes.
The forms attribute must return an HTMLCollection rooted at the Document node, whose filter matches only form elements.
The scripts attribute must return an HTMLCollection rooted at the Document node, whose filter matches only script elements.
getElementsByName(name)
Returns a NodeList of elements in the Document that have a name attribute with the value name.
getElementsByClassName(classes)
Returns a NodeList of the elements in the object on which the method was invoked (a Document or an Element) that have all the classes given by classes.
The classes argument is interpreted as a space-separated list of classes.
The getElementsByName(name) method takes a string name, and must return a live NodeList containing all the HTML elements in that document that have a name attribute whose value is equal to the name argument (in a case-sensitive manner), in tree order.
The getElementsByClassName(classNames) method takes a string that contains an unordered set of unique space-separated tokens representing classes. When called, the method must return a live NodeList object containing all the elements in the document, in tree order, that have all the classes specified in that argument, having obtained the classes by splitting a string on spaces. If there are no tokens specified in the argument, then the method must return an empty NodeList. If the document is in quirks mode, then the comparisons for the classes must be done in an ASCII case-insensitive manner; otherwise, the comparisons must be done in a case-sensitive manner.
The getElementsByClassName(classNames) method on the HTMLElement interface must return a live NodeList with the nodes that the HTMLDocument getElementsByClassName() method would return when passed the same argument(s), excluding any elements that are not descendants of the HTMLElement object on which the method was invoked.
HTML, SVG, and MathML elements define which classes they are in by having an attribute in the per-element partition with the name class containing a space-separated list of classes to which the element belongs. Other specifications may also allow elements in their namespaces to be labeled as being in specific classes.
Given the following XHTML fragment:
<div id="example">
 <p id="p1" class="aaa bbb"/>
 <p id="p2" class="aaa ccc"/>
 <p id="p3" class="bbb ccc"/>
</div>
A call to document.getElementById('example').getElementsByClassName('aaa') would return a NodeList with the two paragraphs p1 and p2 in it.
A call to getElementsByClassName('ccc bbb') would only return one node, however, namely p3. A call to document.getElementById('example').getElementsByClassName('bbb ccc ') would return the same thing.
A call to getElementsByClassName('aaa,bbb') would return no nodes; none of the elements above are in the "aaa,bbb" class.
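The matching rule used in the example above can be sketched without a DOM. The helper names splitOnSpaces, asciiLowercase, hasAllClasses, and matchingIds are hypothetical, the three paragraphs are modelled as plain objects, and the set of space characters (space, tab, LF, FF, CR) is assumed from its definition elsewhere in this specification. Liveness of the returned NodeList is not modelled.

```javascript
// Split a string on runs of space characters, dropping empty tokens.
function splitOnSpaces(input) {
  return input.split(/[ \t\n\f\r]+/).filter((token) => token !== "");
}

// Lowercase only A-Z, for ASCII case-insensitive comparison.
function asciiLowercase(s) {
  return s.replace(/[A-Z]/g, (c) => c.toLowerCase());
}

// True if the element's class attribute contains every requested class.
function hasAllClasses(classAttribute, classNames, quirksMode = false) {
  const normalize = quirksMode ? asciiLowercase : (s) => s;
  const wanted = splitOnSpaces(classNames).map(normalize);
  if (wanted.length === 0) return false; // no tokens: nothing matches
  const present = new Set(splitOnSpaces(classAttribute).map(normalize));
  return wanted.every((token) => present.has(token));
}

// Plain-object stand-ins for the paragraphs in the fragment above.
const paragraphs = [
  { id: "p1", className: "aaa bbb" },
  { id: "p2", className: "aaa ccc" },
  { id: "p3", className: "bbb ccc" },
];

function matchingIds(classNames) {
  return paragraphs
    .filter((p) => hasAllClasses(p.className, classNames))
    .map((p) => p.id);
}
```

With these stand-ins, matchingIds('aaa') yields p1 and p2, matchingIds('ccc bbb') and matchingIds('bbb ccc ') both yield p3, and matchingIds('aaa,bbb') yields nothing, because the comma is not a separator.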
The HTMLDocument interface supports named properties. The names of the supported named properties at any moment consist of the values of the name content attributes of all the applet, embed, form, iframe, img, and fallback-free object elements in the Document that have name content attributes, and the values of the id content attributes of all the applet and fallback-free object elements in the Document that have id content attributes, and the values of the id content attributes of all the img elements in the Document that have both name content attributes and id content attributes.
When the HTMLDocument object is indexed for property retrieval using a name name, then the user agent must return the value obtained using the following steps:
Let elements be the list of named elements with the name name in the Document.
There will be at least one such element, by definition.
If elements has only one element, and that element is an iframe element, then return the WindowProxy object of the nested browsing context represented by that iframe element, and abort these steps.
Otherwise, if elements has only one element, return that element and abort these steps.
Otherwise return an HTMLCollection rooted at the Document node, whose filter matches only named elements with the name name.
Named elements with the name name, for the purposes of the above algorithm, are those that are either:
applet, embed, form, iframe, img, or fallback-free object elements that have a name content attribute whose value is name, or
applet or fallback-free object elements that have an id content attribute whose value is name, or
img elements that have an id content attribute whose value is name, and that have a name content attribute present also.
An object element is said to be fallback-free if it has no object or embed descendants.
The dir attribute on the HTMLDocument interface is defined along with the dir content attribute.
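The named property lookup described above can be modelled on plain objects. Everything in this sketch is a simplification made for illustration: elements are objects with tag, name, id, and hasObjectOrEmbedDescendant fields, the function names are hypothetical, and the WindowProxy and HTMLCollection results are represented by wrapper objects rather than real DOM objects.

```javascript
// An object element is fallback-free if it has no object or embed descendants.
function isFallbackFree(el) {
  return !el.hasObjectOrEmbedDescendant;
}

// Does `el` count as a named element with the name `name`?
function isNamedElement(el, name) {
  if (el.tag === "object" && !isFallbackFree(el)) return false;
  const matchedByName = ["applet", "embed", "form", "iframe", "img", "object"];
  if (matchedByName.includes(el.tag) && el.name === name) return true;
  if ((el.tag === "applet" || el.tag === "object") && el.id === name) return true;
  // img elements match by id only when a name attribute is also present.
  return el.tag === "img" && el.id === name && el.name !== undefined;
}

// Model of property retrieval on the document for `name`.
function getNamedProperty(allElements, name) {
  const elements = allElements.filter((el) => isNamedElement(el, name));
  if (elements.length === 0) return undefined; // not a supported name
  if (elements.length === 1 && elements[0].tag === "iframe") {
    return { windowProxyOf: elements[0] }; // stand-in for the WindowProxy
  }
  if (elements.length === 1) return elements[0];
  return { collection: elements }; // stand-in for the HTMLCollection
}
```

The undefined result for an unmatched name is an addition of the sketch; in the text above that case cannot arise, since the steps only run for supported property names.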
Elements, attributes, and attribute values in HTML are defined (by this specification) to have certain meanings (semantics). For example, the ol element represents an ordered list, and the lang attribute represents the language of the content.
Authors must not use elements, attributes, and attribute values for purposes other than their appropriate intended semantic purpose. Authors must not use elements, attributes, and attribute values that are not permitted by this specification or other applicable specifications.
For example, the following document is non-conforming, despite being syntactically correct:
<!DOCTYPE html>
<html lang="en-GB">
<head> <title> Demonstration </title> </head>
<body>
<table>
<tr> <td> My favourite animal is the cat. </td> </tr>
<tr>
<td>
—<a href="http://example.org/~ernest/"><cite>Ernest</cite></a>,
in an essay from 1992
</td>
</tr>
</table>
</body>
</html>
...because the data placed in the cells is clearly not tabular data (and the cite element mis-used). A corrected version of this document might be:
<!DOCTYPE html>
<html lang="en-GB">
<head> <title> Demonstration </title> </head>
<body>
<blockquote>
<p> My favourite animal is the cat. </p>
</blockquote>
<p>
—<a href="http://example.org/~ernest/">Ernest</a>,
in an essay from 1992
</p>
</body>
</html>
This next document fragment, intended to represent the heading of a corporate site, is similarly non-conforming because the second line is not intended to be a heading of a subsection, but merely a subheading or subtitle (a subordinate heading for the same section).
<body> <h1>ABC Company</h1> <h2>Leading the way in widget design since 1432</h2> ...
The hgroup element should be used in these kinds of situations:
<body> <hgroup> <h1>ABC Company</h1> <h2>Leading the way in widget design since 1432</h2> </hgroup> ...
In the next example, there is a non-conforming attribute value ("carpet") and a non-conforming attribute ("texture"), which is not permitted by this specification:
<label>Carpet: <input type="carpet" name="c" texture="deep pile"></label>
Here would be an alternative and correct way to mark this up:
<label>Carpet: <input type="text" class="carpet" name="c" data-texture="deep pile"></label>
Through scripting and using other mechanisms, the values of attributes, text, and indeed the entire structure of the document may change dynamically while a user agent is processing it. The semantics of a document at an instant in time are those represented by the state of the document at that instant in time, and the semantics of a document can therefore change over time. User agents must update their presentation of the document as this occurs.
HTML has a progress element that describes a progress bar. If its "value" attribute is dynamically updated by a script, the UA would update the rendering to show the progress changing.
The nodes representing HTML elements in the DOM must implement, and expose to scripts, the interfaces listed for them in the relevant sections of this specification. This includes HTML elements in XML documents, even when those documents are in another context (e.g. inside an XSLT transform).
Elements in the DOM represent things; that is, they have intrinsic meaning, also known as semantics.
For example, an ol element represents an ordered list.
The basic interface, from which all the HTML elements' interfaces inherit, and which must be used by elements that have no additional requirements, is the HTMLElement interface.
interface HTMLElement : Element {
  // DOM tree accessors
  NodeList getElementsByClassName(in DOMString classNames);
  // dynamic markup insertion
  attribute DOMString innerHTML;
  attribute DOMString outerHTML;
  void insertAdjacentHTML(in DOMString position, in DOMString text);
  // metadata attributes
  attribute DOMString id;
  attribute DOMString title;
  attribute DOMString lang;
  attribute DOMString dir;
  attribute DOMString className;
  readonly attribute DOMTokenList classList;
  readonly attribute DOMStringMap dataset;
  // microdata
  [PutForwards=value] readonly attribute DOMSettableTokenList item;
  [PutForwards=value] readonly attribute DOMSettableTokenList itemprop;
  readonly attribute HTMLPropertyCollection properties;
  attribute DOMString content;
  attribute HTMLElement subject;
  // user interaction
  attribute boolean hidden;
  void click();
  void scrollIntoView();
  void scrollIntoView(in boolean top);
  attribute long tabIndex;
  void focus();
  void blur();
  attribute DOMString accessKey;
  readonly attribute DOMString accessKeyLabel;
  attribute boolean draggable;
  attribute DOMString contentEditable;
  readonly attribute boolean isContentEditable;
  attribute HTMLMenuElement contextMenu;
  attribute DOMString spellcheck;
  // command API
  readonly attribute DOMString commandType;
  readonly attribute DOMString label;
  readonly attribute DOMString icon;
  readonly attribute boolean disabled;
  readonly attribute boolean checked;
  // styling
  readonly attribute CSSStyleDeclaration style;
  // event handler DOM attributes
  attribute Function onabort;
  attribute Function onblur;
  attribute Function oncanplay;
  attribute Function oncanplaythrough;
  attribute Function onchange;
  attribute Function onclick;
  attribute Function oncontextmenu;
  attribute Function ondblclick;
  attribute Function ondrag;
  attribute Function ondragend;
  attribute Function ondragenter;
  attribute Function ondragleave;
  attribute Function ondragover;
  attribute Function ondragstart;
  attribute Function ondrop;
  attribute Function ondurationchange;
  attribute Function onemptied;
  attribute Function onended;
  attribute Function onerror;
  attribute Function onfocus;
  attribute Function onformchange;
  attribute Function onforminput;
  attribute Function oninput;
  attribute Function oninvalid;
  attribute Function onkeydown;
  attribute Function onkeypress;
  attribute Function onkeyup;
  attribute Function onload;
  attribute Function onloadeddata;
  attribute Function onloadedmetadata;
  attribute Function onloadstart;
  attribute Function onmousedown;
  attribute Function onmousemove;
  attribute Function onmouseout;
  attribute Function onmouseover;
  attribute Function onmouseup;
  attribute Function onmousewheel;
  attribute Function onpause;
  attribute Function onplay;
  attribute Function onplaying;
  attribute Function onprogress;
  attribute Function onratechange;
  attribute Function onreadystatechange;
  attribute Function onscroll;
  attribute Function onseeked;
  attribute Function onseeking;
  attribute Function onselect;
  attribute Function onshow;
  attribute Function onstalled;
  attribute Function onsubmit;
  attribute Function onsuspend;
  attribute Function ontimeupdate;
  attribute Function onvolumechange;
  attribute Function onwaiting;
};
interface HTMLUnknownElement : HTMLElement { };
The HTMLElement interface holds methods and attributes related to a number of disparate features, and the members of this interface are therefore described in various different sections of this specification.
The HTMLUnknownElement interface must be used for HTML elements that are not defined by this specification.
The following attributes are common to and may be specified on all HTML elements (even those not defined in this specification):
accesskey
class
contenteditable
contextmenu
dir
draggable
id
item
hidden
lang
itemprop
spellcheck
style
subject
tabindex
title
In addition, unless otherwise specified, the following event handler content attributes may be specified on any HTML element:
onabort
onblur*
oncanplay
oncanplaythrough
onchange
onclick
oncontextmenu
ondblclick
ondrag
ondragend
ondragenter
ondragleave
ondragover
ondragstart
ondrop
ondurationchange
onemptied
onended
onerror*
onfocus*
onformchange
onforminput
oninput
oninvalid
onkeydown
onkeypress
onkeyup
onload*
onloadeddata
onloadedmetadata
onloadstart
onmousedown
onmousemove
onmouseout
onmouseover
onmouseup
onmousewheel
onpause
onplay
onplaying
onprogress
onratechange
onreadystatechange
onscroll
onseeked
onseeking
onselect
onshow
onstalled
onsubmit
onsuspend
ontimeupdate
onvolumechange
onwaiting
The attributes marked with an asterisk cannot be specified on body elements as those elements expose event handler attributes of the Window object with the same names.
Also, custom data attributes (e.g. data-foldername or data-msgid) can be specified on any HTML element, to store custom data specific to the page.
In HTML documents, elements in the HTML namespace may have an xmlns attribute specified, if, and only if, it has the exact value "http://www.w3.org/1999/xhtml". This does not apply to XML documents.
In HTML, the xmlns attribute has absolutely no effect. It is basically a talisman. It is allowed merely to make migration to and from XHTML mildly easier. When parsed by an HTML parser, the attribute ends up in no namespace, not the "http://www.w3.org/2000/xmlns/" namespace like namespace declaration attributes in XML do.
In XML, an xmlns attribute is part of the namespace declaration mechanism, and an element cannot actually have an xmlns attribute in no namespace specified.
id attribute
The id attribute represents its element's unique identifier. The value must be unique in the element's home subtree and must contain at least one character. The value must not contain any space characters.
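The two syntactic requirements on an id value (at least one character, no space characters) can be checked with a short function. The name isValidIdValue is hypothetical, the set of space characters (space, tab, LF, FF, CR) is assumed from its definition elsewhere in this specification, and uniqueness within the home subtree is deliberately not checked because it depends on the rest of the document.

```javascript
// Checks only the per-value requirements: non-empty and free of
// space characters. Uniqueness in the home subtree is out of scope.
function isValidIdValue(value) {
  return value.length > 0 && !/[ \t\n\f\r]/.test(value);
}
```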
An element's unique identifier can be used for a variety of purposes, most notably as a way to link to specific parts of a document using fragment identifiers, as a way to target an element when scripting, and as a way to style a specific element from CSS.
If the value is not the empty string, user agents must associate the element with the given value (exactly, including any space characters) for the purposes of ID matching within the element's home subtree (e.g. for selectors in CSS or for the getElementById() method in the DOM).
Identifiers are opaque strings. Particular meanings should not be derived from the value of the id attribute.
This specification doesn't preclude an element having multiple IDs, if other mechanisms (e.g. DOM Core methods) can set an element's ID in a way that doesn't conflict with the id attribute.
title attribute
The title attribute represents advisory information for the element, such as would be appropriate for a tooltip. On a link, this could be the title or a description of the target resource; on an image, it could be the image credit or a description of the image; on a paragraph, it could be a footnote or commentary on the text; on a citation, it could be further information about the source; and so forth. The value is text.
If this attribute is omitted from an element, then it implies that the title attribute of the nearest ancestor HTML element with a title attribute set is also relevant to this element. Setting the attribute overrides this, explicitly stating that the advisory information of any ancestors is not relevant to this element. Setting the attribute to the empty string indicates that the element has no advisory information.
If the title attribute's value contains U+000A LINE FEED (LF) characters, the content is split into multiple lines. Each U+000A LINE FEED (LF) character represents a line break.
Caution is advised with respect to the use of newlines in title attributes.
For instance, the following snippet actually defines an abbreviation's expansion with a line break in it:
<p>My logs show that there was some interest in <abbr title="Hypertext
Transport Protocol">HTTP</abbr> today.</p>
Some elements, such as link, abbr, and input, define additional semantics for the title attribute beyond the semantics described above.
lang and xml:lang attributes
The lang attribute (in no namespace) specifies the primary language for the element's contents and for any of the element's attributes that contain text. Its value must be a valid RFC 3066 language code, or the empty string. [RFC3066]
The lang attribute in the XML namespace is defined in XML. [XML]
If these attributes are omitted from an element, then the language of this element is the same as the language of its parent element, if any. Setting the attribute to the empty string indicates that the primary language is unknown.
The lang attribute in no namespace may be used on any HTML element.
The lang attribute in the XML namespace may be used on HTML elements in XML documents, as well as elements in other namespaces if the relevant specifications allow it (in particular, MathML and SVG allow lang attributes in the XML namespace to be specified on their elements). If both the lang attribute in no namespace and the lang attribute in the XML namespace are specified on the same element, they must have exactly the same value when compared in an ASCII case-insensitive manner.
Authors must not use the lang attribute in the XML namespace in HTML documents. To ease migration to and from XHTML, authors may specify an attribute in no namespace with no prefix and with the literal localname "xml:lang" on HTML elements in HTML documents, but such attributes must only be specified if a lang attribute in no namespace is also specified, and both attributes must have the same value when compared in an ASCII case-insensitive manner.
To determine the language of a node, user agents must look at the nearest ancestor element (including the element itself if the node is an element) that has a lang attribute in the XML namespace set or is an HTML element and has a lang attribute in no namespace set. That attribute specifies the language of the node.
If both the lang attribute in no namespace and the lang attribute in the XML namespace are set on an element, user agents must use the lang attribute in the XML namespace, and the lang attribute in no namespace must be ignored for the purposes of determining the element's language.
If no explicit language is given for the root element, but there is a document-wide default language set, then that is the language of the node.
If there is no document-wide default language, then language information from a higher-level protocol (such as HTTP), if any, must be used as the final fallback language. In the absence of any language information, the default value is unknown (the empty string).
If the resulting value is not a recognized language code, then it must be treated as an unknown language (as if the value was the empty string).
User agents may use the element's language to determine proper processing or rendering (e.g. in the selection of appropriate fonts or pronunciations, or for dictionary selection).
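The lookup order for a node's language can be sketched on plain objects. This is a simplified model, not user agent code: the function name languageOf and the fields parent, isHTML, lang (no namespace), and xmlLang (XML namespace) are assumptions made for the sketch, non-element nodes are modelled as objects without either attribute, and the step that treats unrecognized language codes as unknown is omitted.

```javascript
// node: { parent, isHTML, lang, xmlLang }; absent attributes are undefined.
// documentDefault and protocolLanguage are undefined when not available.
function languageOf(node, documentDefault, protocolLanguage) {
  for (let el = node; el; el = el.parent) {
    // The attribute in the XML namespace wins when both are present.
    if (el.xmlLang !== undefined) return el.xmlLang;
    // lang in no namespace only counts on HTML elements.
    if (el.isHTML && el.lang !== undefined) return el.lang;
  }
  if (documentDefault !== undefined) return documentDefault;
  if (protocolLanguage !== undefined) return protocolLanguage;
  return ""; // unknown
}
```

Checking xmlLang before lang on each element is what implements the rule that the attribute in no namespace is ignored when both are set on the same element.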
The lang DOM attribute must reflect the lang content attribute in no namespace.
xml:base attribute (XML only)
The xml:base attribute is defined in XML Base. [XMLBASE]
The xml:base attribute may be used on elements of XML documents. Authors must not use the xml:base attribute in HTML documents.
dir attribute
The dir attribute specifies the element's text directionality. The attribute is an enumerated attribute with the keyword ltr mapping to the state ltr, and the keyword rtl mapping to the state rtl. The attribute has no defaults.
The processing of this attribute is primarily performed by the presentation layer. For example, the rendering section in this specification defines a mapping from this attribute to the CSS 'direction' and 'unicode-bidi' properties, and CSS defines rendering in terms of those properties.
The directionality of an element, which is used in particular by the canvas element's text rendering API, is either 'ltr' or 'rtl'. If the user agent supports CSS and the 'direction' property on this element has a computed value of either 'ltr' or 'rtl', then that is the directionality of the element. Otherwise, if the element is being rendered, then the directionality of the element is the directionality used by the presentation layer, potentially determined from the value of the dir attribute on the element. Otherwise, if the element's dir attribute has the state ltr, the element's directionality is 'ltr' (left-to-right); if the attribute has the state rtl, the element's directionality is 'rtl' (right-to-left); and otherwise, the element's directionality is the same as its parent element, or 'ltr' if there is no parent element.
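The last fallback in the paragraph above, which applies when neither CSS nor the presentation layer determines the direction, can be sketched as a walk up the tree. The function name directionalityFromDir and the fields dirState and parent are hypothetical; dirState is assumed to already hold the state of the dir attribute ("ltr", "rtl", or undefined when the attribute is missing or invalid).

```javascript
// Models only the non-CSS, non-rendered branch of the definition.
// el: { dirState, parent }, where parent is null for a parentless element.
function directionalityFromDir(el) {
  if (el.dirState === "ltr") return "ltr";
  if (el.dirState === "rtl") return "rtl";
  // No state: inherit from the parent element, or default to 'ltr'.
  return el.parent ? directionalityFromDir(el.parent) : "ltr";
}
```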
dir [ = value ]
Returns the html element's dir attribute's value, if any.
Can be set, to either "ltr" or "rtl", to replace the html element's dir attribute's value.
If there is no html element, returns the empty string and ignores new values.
The dir DOM attribute on an element must reflect the dir content attribute of that element, limited to only known values.
The dir DOM attribute on HTMLDocument objects must reflect the dir content attribute of the html element, if any, limited to only known values. If there is no such element, then the attribute must return the empty string and do nothing on setting.
Authors are strongly encouraged to use the dir attribute to indicate text direction rather than using CSS, since that way their documents will continue to render correctly even in the absence of CSS (e.g. as interpreted by search engines).
class attribute
Every HTML element may have a class attribute specified.
The attribute, if specified, must have a value that is an unordered set of unique space-separated tokens representing the various classes that the element belongs to.
The classes that an HTML element has assigned to it consist of all the classes returned when the value of the class attribute is split on spaces.
Assigning classes to an element affects class matching in selectors in CSS, the getElementsByClassName() method in the DOM, and other such features.
Authors may use any value in the class attribute, but are encouraged to use the values that describe the nature of the content, rather than values that describe the desired presentation of the content.
style attribute
All HTML elements may have the style content attribute set. If specified, the attribute must contain only a list of zero or more semicolon-separated (;) CSS declarations. [CSS]
In user agents that support CSS, the attribute's value must be parsed when the attribute is added or has its value changed, with its value treated as the body (the part inside the curly brackets) of a declaration block in a rule whose selector matches just the element on which the attribute is set. All URLs in the value must be resolved relative to the element when the attribute is parsed. For the purposes of the CSS cascade, the attribute must be considered to be a 'style' attribute at the author level.
Documents that use style attributes on any of their elements must still be comprehensible and usable if those attributes were removed.
In particular, using the style attribute to hide and show content, or to convey meaning that is otherwise not included in the document, is non-conforming. (To hide and show content, use the hidden attribute.)
style
Returns a CSSStyleDeclaration object for the element's style attribute.
The style DOM attribute must return a CSSStyleDeclaration whose value represents the declarations specified in the attribute, if present. Mutating the CSSStyleDeclaration object must create a style attribute on the element (if there isn't one already) and then change its value to be a value representing the serialized form of the CSSStyleDeclaration object. [CSSOM]
In the following example, the words that refer to colors are marked up using the span element and the style attribute to make those words show up in the relevant colors in visual media.
<p>My sweat suit is <span style="color: green; background: transparent">green</span> and my eyes are <span style="color: blue; background: transparent">blue</span>.</p>
A custom data attribute is an attribute in no namespace whose name starts with the string "data-", has at least one character after the hyphen, is XML-compatible, and contains no characters in the range U+0041 .. U+005A (LATIN CAPITAL LETTER A .. LATIN CAPITAL LETTER Z).
Attribute names on HTML elements in HTML documents get lowercased automatically, so the restriction on uppercase letters doesn't affect such documents.
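The naming rule can be sketched as a small predicate. The function name isCustomDataAttributeName is hypothetical, and the XML-compatibility check is an assumption: it is approximated with a simplified ASCII-only pattern rather than the full XML Name production.

```javascript
// Hypothetical helper: checks a name against the custom data attribute rule.
// Assumption: XML-compatibility is approximated by an ASCII-only pattern;
// the real XML Name production also admits many non-ASCII characters.
function isCustomDataAttributeName(name) {
  if (!name.startsWith("data-")) return false;     // must start with "data-"
  if (name.length <= "data-".length) return false; // at least one character after the hyphen
  if (/[A-Z]/.test(name)) return false;            // no U+0041 .. U+005A
  return /^[a-z_][a-z0-9_.\-]*$/.test(name);       // simplified XML-compatible check
}
```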
Custom data attributes are intended to store custom data private to the page or application, for which there are no more appropriate attributes or elements.
These attributes are not intended for use by software that is independent of the site that uses the attributes.
For instance, a site about music could annotate list items representing tracks in an album with custom data attributes containing the length of each track. This information could then be used by the site itself to allow the user to sort the list by track length, or to filter the list for tracks of certain lengths.
<ol> <li data-length="2m11s">Beyond The Sea</li> ... </ol>
It would be inappropriate, however, for the user to use generic software not associated with that music site to search for tracks of a certain length by looking at this data.
This is because these attributes are intended for use by the site's own scripts, and are not a generic extension mechanism for publicly-usable metadata.
Every HTML element may have any number of custom data attributes specified, with any value.
dataset
Returns a DOMStringMap object for the element's data-* attributes.
The dataset DOM attribute provides convenient accessors for all the data-* attributes on an element. On getting, the dataset DOM attribute must return a DOMStringMap object, associated with the following algorithms, which expose these attributes on their element:
"data-", add a name-value pair to list whose name is the attribute's name with the first five characters removed and whose value is the attribute's value.
data- and the name passed to the algorithm.
setAttribute() would have raised an exception when setting an attribute with the name name, then this must raise the same exception.
data- and the name passed to the algorithm.
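The mapping between data-* attribute names and dataset names can be sketched as follows. This is an illustration under stated assumptions, not the DOMStringMap interface itself: a plain Map stands in for the element's content attributes, the function names are hypothetical, and the setter and deleter are inferred from the surviving algorithm text (the name is the string data- concatenated with the name passed in).

```javascript
// Hypothetical model: `attributes` is a plain Map standing in for an
// element's content attributes; it is not a real DOM object.
function getDataPairs(attributes) {
  const list = [];
  for (const [name, value] of attributes) {
    if (name.startsWith("data-")) {
      list.push([name.slice(5), value]); // drop the first five characters
    }
  }
  return list;
}

// Assumed setter: writes the attribute named "data-" + name.
function setDataValue(attributes, name, value) {
  attributes.set("data-" + name, String(value)); // replaces any previous value
}

// Assumed deleter: removes the attribute named "data-" + name, if any.
function deleteDataName(attributes, name) {
  attributes.delete("data-" + name);
}
```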
If a Web page wanted an element to represent a space ship, e.g. as part of a game, it would have to use the class attribute along with data-* attributes:
<div class="spaceship" data-id="92432"
data-weapons="laser 2" data-shields="50%"
data-x="30" data-y="10" data-z="90">
<button class="fire"
onclick="spaceships[this.parentNode.dataset.id].fire()">
Fire
</button>
</div>
Authors should carefully design such extensions so that when the attributes are ignored and any associated CSS dropped, the page is still usable.
User agents must not derive any implementation behavior from these attributes or values. Specifications intended for user agents must not define these attributes to have any meaningful values.
Each element in this specification has a definition that includes the following information:
This is then followed by a description of what the element represents , along with any additional normative conformance criteria that may apply to authors and implementations . Examples are sometimes also included.
All the elements in this specification have a defined content model, which describes what nodes are allowed inside the elements, and thus what the structure of an HTML document or fragment must look like.
As noted in the conformance and terminology sections, for the purposes of determining if an element matches its content model or not, CDATASection nodes in the DOM are treated as equivalent to Text nodes, and entity reference nodes are treated as if they were expanded in place.
The space characters are always allowed between elements. User agents represent these characters between elements in the source markup as text nodes in the DOM. Empty text nodes and text nodes consisting of just sequences of those characters are considered inter-element whitespace .
Inter-element whitespace , comment nodes, and processing instruction nodes must be ignored when establishing whether an element matches its content model or not, and must be ignored when following algorithms that define document and element semantics.
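The filtering described above can be sketched in a few lines. The node shape and function names are hypothetical, and the set of space characters (U+0020, U+0009, U+000A, U+000C, U+000D) is an assumption, since it is defined elsewhere in the specification.

```javascript
// Assumption: "space characters" are U+0020, U+0009, U+000A, U+000C and U+000D.
const SPACE_ONLY = /^[ \t\n\f\r]*$/;

// Hypothetical node shape: { type: "text" | "comment" | "pi" | "element", data?: string }
function isInterElementWhitespace(node) {
  // Empty text nodes and text nodes made only of space characters qualify.
  return node.type === "text" && SPACE_ONLY.test(node.data);
}

// Returns the children that count when checking a content model.
function significantChildren(children) {
  return children.filter(
    (node) =>
      !isInterElementWhitespace(node) &&
      node.type !== "comment" &&
      node.type !== "pi"
  );
}
```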
An element A is said to be preceded or followed by a second element B if A and B have the same parent node and there are no other element nodes or text nodes (other than inter-element whitespace ) between them.
Authors must not use elements in the HTML namespace anywhere except where they are explicitly allowed, as defined for each element, or as explicitly required by other specifications. For XML compound documents, these contexts could be inside elements from other namespaces, if those elements are defined as providing the relevant contexts.
The Atom specification defines the Atom content element, when its type attribute has the value xhtml, as requiring that it contains a single HTML div element. Thus, a div element is allowed in that context, even though this is not explicitly normatively stated by this specification. [ATOM]
In addition, elements in the HTML namespace may be orphan nodes (i.e. without a parent node).
For example, creating a td element and storing it in a global variable in a script is conforming, even though td elements are otherwise only supposed to be used inside tr elements.
var data = {
name: "Banana",
cell: document.createElement('td'),
};
Each element in HTML falls into zero or more categories that group elements with similar characteristics together. The following broad categories are used in this specification:
Some elements also fall into other categories, which are defined in other parts of this specification.
These categories are related as follows:
In addition, certain elements are categorized as form-associated elements and further subcategorized to define their role in various form-related processing models.
Some elements have unique requirements and do not fit into any particular category.
Metadata content is content that sets up the presentation or behavior of the rest of the content, or that sets up the relationship of the document with other documents, or that conveys other "out of band" information.
Elements from other namespaces whose semantics are primarily metadata-related (e.g. RDF) are also metadata content .
Thus, in the XML serialization, one can use RDF, like this:
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:r="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<head>
<title>Hedral's Home Page</title>
<r:RDF>
<Person xmlns="http://www.w3.org/2000/10/swap/pim/contact#"
r:about="http://hedral.example.com/#">
<fullName>Cat Hedral</fullName>
<mailbox r:resource="mailto:hedral@damowmow.com"/>
<personalTitle>Sir</personalTitle>
</Person>
</r:RDF>
</head>
<body>
<h1>My home page</h1>
<p>I like playing with string, I guess. Sister says squirrels are fun
too so sometimes I follow her to play with them.</p>
</body>
</html>
This isn't possible in the HTML serialization, however.
Most elements that are used in the body of documents and applications are categorized as flow content .
a
abbr
address
area (if it is a descendant of a map element)
article
aside
audio
b
bb
bdo
blockquote
br
button
canvas
cite
code
command
datalist
del
details
dfn
dialog
div
dl
em
embed
fieldset
figure
footer
form
h1
h2
h3
h4
h5
h6
header
hgroup
hr
i
iframe
img
input
ins
kbd
keygen
label
link (if the itemprop attribute is present)
map
mark
math
menu
meta (if the itemprop attribute is present)
meter
nav
noscript
object
ol
output
p
pre
progress
q
ruby
samp
script
section
select
small
span
strong
style (if the scoped attribute is present)
sub
sup
svg
table
textarea
time
ul
var
video
As a general rule, elements whose content model allows any flow content should have either at least one descendant text node that is not inter-element whitespace, or at least one descendant element node that is embedded content. For the purposes of this requirement, del elements and their descendants must not be counted as contributing to the ancestors of the del element.
This requirement is not a hard requirement, however, as there are many cases where an element can be empty legitimately, for example when it is used as a placeholder which will later be filled in by a script, or when the element is part of a template and would on most pages be filled in but on some pages is not relevant.
Sectioning content is content that defines the scope of headings and footers .
Each sectioning content element potentially has a heading and an outline . See the section on headings and sections for further details.
There are also certain elements that are sectioning roots . These are distinct from sectioning content , but they can also have an outline .
Heading content defines the header of a section (whether explicitly marked up using sectioning content elements, or implied by the heading content itself).
Phrasing content is the text of the document, as well as elements that mark up that text at the intra-paragraph level. Runs of phrasing content form paragraphs .
a (if it contains only phrasing content)
abbr
area (if it is a descendant of a map element)
audio
b
bb
bdo
br
button
canvas
cite
code
command
datalist
del (if it contains only phrasing content)
dfn
em
embed
i
iframe
img
input
ins (if it contains only phrasing content)
kbd
keygen
label
link (if the itemprop attribute is present)
map (if it contains only phrasing content)
mark
math
meta (if the itemprop attribute is present)
meter
noscript
object
output
progress
q
ruby
samp
script
select
small
span
strong
sub
sup
svg
textarea
time
var
video
As a general rule, elements whose content model allows any phrasing content should have either at least one descendant text node that is not inter-element whitespace, or at least one descendant element node that is embedded content. For the purposes of this requirement, nodes that are descendants of del elements must not be counted as contributing to the ancestors of the del element.
Most elements that are categorized as phrasing content can only contain elements that are themselves categorized as phrasing content, not any flow content.
Text , in the context of content models, means text nodes . Text is sometimes used as a content model on its own, but is also phrasing content , and can be inter-element whitespace (if the text nodes are empty or contain just space characters ).
Embedded content is content that imports another resource into the document, or content from another vocabulary that is inserted into the document.
Elements that are from namespaces other than the HTML namespace and that convey content but not metadata, are embedded content for the purposes of the content models defined in this specification. (For example, MathML, or SVG.)
Some embedded content elements can have fallback content : content that is to be used when the external resource cannot be used (e.g. because it is of an unsupported format). The element definitions state what the fallback is, if any.
Interactive content is content that is specifically intended for user interaction.
a
audio (if the controls attribute is present)
bb
button
details
embed
iframe
img (if the usemap attribute is present)
input (if the type attribute is not in the Hidden state)
keygen
label
menu (if the type attribute is in the tool bar state)
object (if the usemap attribute is present)
select
textarea
video (if the controls attribute is present)
Certain elements in HTML have an activation behavior, which means that the user can activate them. This triggers a sequence of events dependent on the activation mechanism, and normally culminating in a click event followed by a DOMActivate event, as described below.
The user agent should allow the user to manually trigger elements that have an activation behavior , for instance using keyboard or voice input, or through mouse clicks. When the user triggers an element with a defined activation behavior in a manner other than clicking it, the default action of the interaction event must be to run synthetic click activation steps on the element.
When a user agent is to run synthetic click activation steps on an element, the user agent must run pre-click activation steps on the element, then fire a click event at the element. The default action of this click event must be to run post-click activation steps on the element. If the event is canceled, the user agent must run canceled activation steps on the element instead.
Given an element target , the nearest activatable element is the element returned by the following algorithm:
If target has a defined activation behavior , then return target and abort these steps.
If target has a parent element, then set target to that parent element and return to the first step.
Otherwise, there is no nearest activatable element .
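The three steps above amount to walking up the ancestor chain. A minimal sketch, assuming a hypothetical element shape in which parent links to the parent element and an element has a defined activation behavior when its activationBehavior property is a function:

```javascript
// Hypothetical element shape: { parent, activationBehavior }.
function nearestActivatableElement(target) {
  while (target) {
    // Step 1: an element with a defined activation behavior is returned.
    if (typeof target.activationBehavior === "function") return target;
    // Step 2: otherwise move to the parent element, if there is one.
    target = target.parent;
  }
  // Step 3: there is no nearest activatable element.
  return null;
}
```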
When a pointing device is clicked, the user agent must run these steps:
Let e be the nearest activatable element of the element designated by the user, if any.
If there is an element e , run pre-click activation steps on it.
Dispatch the required click event.
If there is an element e , then the default action of the click event must be to run post-click activation steps on element e .
If there is an element e but the event is canceled, the user agent must run canceled activation steps on element e .
The above doesn't happen for arbitrary synthetic events dispatched by author script. However, the click() method can be used to make it happen programmatically.
When a user agent is to run pre-click activation steps on an element, it must run the pre-click activation steps defined for that element, if any.
When a user agent is to run post-click activation steps on an element, the user agent must fire a simple event called DOMActivate that is cancelable at that element. The default action of this event must be to run final activation steps on that element. If the event is canceled, the user agent must run canceled activation steps on the element instead.
When a user agent is to run canceled activation steps on an element, it must run the canceled activation steps defined for that element, if any.
When a user agent is to run final activation steps on an element, it must run the activation behavior defined for that element. Activation behaviors can refer to the click and DOMActivate events that were fired by the steps above leading up to this point.
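The ordering of the activation steps can be traced with a small simulation. Everything here is a hypothetical model: the element is a plain object with optional hook functions, and dispatch(type) stands in for firing an event, returning false when that event was canceled.

```javascript
// Hypothetical model of synthetic click activation steps. Returns a log of
// the steps and events in the order they occurred.
function runSyntheticClickActivationSteps(element, dispatch) {
  const log = [];
  const run = (step) => {
    log.push(step);
    if (typeof element[step] === "function") element[step]();
  };

  run("preClickActivationSteps");
  log.push("click");
  if (!dispatch("click")) {
    run("canceledActivationSteps"); // click was canceled
    return log;
  }
  // Post-click activation steps: fire a cancelable DOMActivate event.
  log.push("DOMActivate");
  if (!dispatch("DOMActivate")) {
    run("canceledActivationSteps"); // DOMActivate was canceled
    return log;
  }
  // Final activation steps: run the element's activation behavior.
  run("activationBehavior");
  return log;
}
```

Passing a dispatch function that cancels only the click event shows the canceled activation steps running in place of the DOMActivate event and the activation behavior.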
Some elements are described as transparent ; they have "transparent" in the description of their content model.
When a content model includes a part that is "transparent", those parts must not contain content that would not be conformant if all transparent elements in the tree were replaced, in their parent element, by the children in the "transparent" part of their content model, retaining order.
When a transparent element has no parent, then the part of its content model that is "transparent" must instead be treated as accepting any flow content .
The term paragraph as defined in this section is distinct from (though related to) the p element defined later. The paragraph concept defined here is used to describe how to interpret documents.
A paragraph is typically a run of phrasing content that forms a block of text with one or more sentences that discuss a particular topic, as in typography, but can also be used for more general thematic grouping. For instance, an address is also a paragraph, as is a part of a form, a byline, or a stanza in a poem.
In the following example, there are two paragraphs in a section. There is also a heading, which contains phrasing content that is not a paragraph. Note how the comments and inter-element whitespace do not form paragraphs.
<section> <h1>Example of paragraphs</h1> This is the <em>first</em> paragraph in this example. <p>This is the second.</p> <!-- This is not a paragraph. --> </section>
Paragraphs in flow content are defined relative to what the document looks like without the a, ins, del, and map elements complicating matters, since those elements, with their hybrid content models, can straddle paragraph boundaries, as shown in the first two examples below.
Generally, having elements straddle paragraph boundaries is best avoided. Maintaining such markup can be difficult.
The following example takes the markup from the earlier example and puts ins and del elements around some of the markup to show that the text was changed (though in this case, the changes admittedly don't make much sense). Notice how this example has exactly the same paragraphs as the previous one, despite the ins and del elements: the ins element straddles the heading and the first paragraph, and the del element straddles the boundary between the two paragraphs.
<section> <ins><h1>Example of paragraphs</h1> This is the <em>first</em> paragraph in</ins> this example<del>. <p>This is the second.</p></del> <!-- This is not a paragraph. --> </section>
Let view be a view of the DOM that replaces all a, ins, del, and map elements in the document with their contents. Then, in view, for each run of sibling phrasing content nodes uninterrupted by other types of content, in an element that accepts content other than phrasing content, let first be the first node of the run, and let last be the last node of the run. For each such run that consists of at least one node that is neither embedded content nor inter-element whitespace, a paragraph exists in the original DOM from immediately before first to immediately after last. (Paragraphs can thus span across a, ins, del, and map elements.)
Conformance checkers may warn authors of cases where they have paragraphs that overlap each other (this can happen with object, video, audio, and canvas elements).
A paragraph is also formed explicitly by p elements.
The p element can be used to wrap individual paragraphs when there would otherwise not be any content other than phrasing content to separate the paragraphs from each other.
In the following example, the link spans half of the first paragraph, all of the heading separating the two paragraphs, and half of the second paragraph. It straddles the paragraphs and the heading.
<aside> Welcome! <a href="about.html"> This is home of... <h1>The Falcons!</h1> The Lockheed Martin multirole jet fighter aircraft! </a> This page discusses the F-16 Fighting Falcon's innermost secrets. </aside>
Here is another way of marking this up, this time showing the paragraphs explicitly, and splitting the one link element into three:
<aside> <p>Welcome! <a href="about.html">This is home of...</a></p> <h1><a href="about.html">The Falcons!</a></h1> <p><a href="about.html">The Lockheed Martin multirole jet fighter aircraft!</a> This page discusses the F-16 Fighting Falcon's innermost secrets.</p> </aside>
It is possible for paragraphs to overlap when using certain elements that define fallback content. For example, in the following section:
<section> <h1>My Cats</h1> You can play with my cat simulator. <object data="cats.sim"> To see the cat simulator, use one of the following links: <ul> <li><a href="cats.sim">Download simulator file</a> <li><a href="http://sims.example.com/watch?v=LYds5xY4INU">Use online simulator</a> </ul> Alternatively, upgrade to the Mellblom Browser. </object> I'm quite proud of it. </section>
There are five paragraphs:
object element.
The first paragraph is overlapped by the other four. A user agent that supports the "cats.sim" resource will only show the first one, but a user agent that shows the fallback will confusingly show the first sentence of the first paragraph as if it was in the same paragraph as the second one, and will show the last paragraph as if it was at the start of the second sentence of the first paragraph.
To avoid this confusion, explicit p elements can be used.
For HTML documents, and for HTML elements in HTML documents, certain APIs defined in DOM Core become case-insensitive or case-changing, as sometimes defined in DOM Core, and as summarized or required below. [DOMCORE]
This does not apply to XML documents or to elements that are not in the HTML namespace despite being in HTML documents .
Element.tagName and Node.nodeName
These attributes must return element names converted to ASCII uppercase , regardless of the case with which they were created.
Document.createElement()
The canonical form of HTML markup is all-lowercase; thus, this method will lowercase the argument before creating the requisite element. Also, the element created must be in the HTML namespace .
This doesn't apply to Document.createElementNS(). Thus, it is possible, by passing this last method a tag name in the wrong case, to create an element that claims to have the tag name of an element defined in this specification, but doesn't support its interfaces, because it really has another tag name not accessible from the DOM APIs.
Element.setAttribute()
Element.setAttributeNode()
Attribute names are converted to ASCII lowercase .
Specifically: when an attribute is set on an HTML element using Element.setAttribute(), the name argument must be converted to ASCII lowercase before the element is affected; and when an Attr node is set on an HTML element using Element.setAttributeNode(), it must have its name converted to ASCII lowercase before the element is affected.
This doesn't apply to Element.setAttributeNS() and Element.setAttributeNodeNS().
Element.getAttribute()
Element.getAttributeNode()
Attribute names are converted to ASCII lowercase .
Specifically: when the Element.getAttribute() method or the Element.getAttributeNode() method is invoked on an HTML element, the name argument must be converted to ASCII lowercase before the element's attributes are examined.
This doesn't apply to Element.getAttributeNS() and Element.getAttributeNodeNS().
Document.getElementsByTagName()
Element.getElementsByTagName()
HTML elements match by lower-casing the argument before comparison; elements from other namespaces are treated as in XML (case-sensitively).
Specifically, these methods (but not their namespaced counterparts) must compare the given argument in a case-sensitive manner, but when looking at HTML elements , the argument must first be converted to ASCII lowercase .
Thus, in an HTML document with nodes in multiple namespaces, these methods will effectively be both case-sensitive and case-insensitive at the same time.
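The case rules in this section can be sketched without a DOM. The helper names and the element shape are hypothetical. The conversion functions only touch the letters A to Z and a to z, which is the point of "ASCII" case conversion: other characters are left untouched.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";

// ASCII-only case conversion: only A-Z / a-z are changed.
function asciiLowercase(s) {
  return s.replace(/[A-Z]/g, (c) => String.fromCharCode(c.charCodeAt(0) + 32));
}
function asciiUppercase(s) {
  return s.replace(/[a-z]/g, (c) => String.fromCharCode(c.charCodeAt(0) - 32));
}

// Hypothetical element shape: { namespace, localName }.
// Models the comparison used by getElementsByTagName(): the argument is
// lowercased only when it is compared against an HTML element.
function matchesTagName(element, argument) {
  const wanted =
    element.namespace === HTML_NS ? asciiLowercase(argument) : argument;
  return element.localName === wanted;
}
```

With these definitions, an HTML div matches the argument "DIV", while an element from another namespace with the local name textPath matches only "textPath" exactly.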
Implementations of XPath 1.0 that operate on HTML documents parsed or created in the manners described in this specification (e.g. as part of the document.evaluate() API) are affected as follows:
In addition to the cases where a name expression would match a node per XPath 1.0, a name expression must evaluate to matching a node when all the following conditions are also met:
These requirements are a willful violation of the XPath 1.0 specification, motivated by desire to have implementations be compatible with legacy content while still supporting the changes that this specification introduces to HTML regarding which namespace is used for HTML elements. [XPATH10]
XSLT 1.0 processors outputting to a DOM when the output method is "html" (either explicitly or via the defaulting rule in XSLT 1.0) are affected as follows:
If the transformation program outputs an element in no namespace, the processor must, prior to constructing the corresponding DOM element node, change the namespace of the element to the HTML namespace , ASCII-lowercase the element's local name, and ASCII-lowercase the names of any non-namespaced attributes on the element.
This requirement is a willful violation of the XSLT 1.0 specification, required because this specification changes the namespaces and case-sensitivity rules of HTML in a manner that would otherwise be incompatible with DOM-based XSLT transformations. (Processors that serialize the output are unaffected.) [XSLT10]
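The XSLT adjustment can be sketched as a pure function over a description of the output element. The node shape and the function name are hypothetical; a real processor would apply this before constructing the DOM node.

```javascript
// Hypothetical output-node shape:
// { namespace, localName, attributes: [{ namespace, name, value }] }
function normalizeXsltHtmlOutput(element) {
  if (element.namespace !== null) return element; // only no-namespace elements are affected
  const lower = (s) => s.replace(/[A-Z]/g, (c) => c.toLowerCase()); // ASCII-only
  return {
    namespace: "http://www.w3.org/1999/xhtml",
    localName: lower(element.localName),
    attributes: element.attributes.map((attr) =>
      // Only non-namespaced attributes have their names lowercased.
      attr.namespace === null ? { ...attr, name: lower(attr.name) } : attr
    ),
  };
}
```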
APIs for dynamically inserting markup into the document interact with the parser, and thus their behavior varies depending on whether they are used with HTML documents (and the HTML parser) or XHTML in XML documents (and the XML parser).
The open() method comes in several variants with different numbers of arguments.
open( [ type [, replace ] ] )
Causes the Document to be replaced in-place, as if it was a new Document object, but reusing the previous object, which is then returned.
If the type argument is omitted or has the value "text/html", then the resulting Document has an HTML parser associated with it, which can be given data to parse using document.write(). Otherwise, all content passed to document.write() will be parsed as plain text.
If the replace argument is absent or false, a new entry is added to the session history to represent this entry, and the previous entries for this Document are all collapsed into one entry with a new Document object.
The method has no effect if the Document is still being parsed.
open( url, name, features [, replace ] )
Works like the window.open() method.
close()
Closes the input stream that was opened by the document.open() method.
When called with two or fewer arguments, the method must act as follows:
Let type be the value of the first argument, if there is one, or "text/html" otherwise.
Let replace be true if there is a second argument and it is an ASCII case-insensitive match for the value "replace", and false otherwise.
If the document has an active parser that isn't a script-created parser, and the insertion point associated with that parser's input stream is not undefined (that is, it does point to somewhere in the input stream), then the method does nothing. Abort these steps and return the Document object on which the method was invoked.
This basically causes document.open() to be ignored when it's called in an inline script found during the parsing of data sent over the network, while still letting it have an effect when called asynchronously or on a document that is itself being spoon-fed using these APIs.
Unload the Document object, with the recycle parameter set to true. If the user refused to allow the document to be unloaded, then these steps must be aborted.
If the document has an active parser , then abort that parser, and throw away any pending content in the input stream .
Unregister all event listeners registered on the Document node and its descendants.
Remove any tasks associated with the Document in any task source.
Remove all child nodes of the document, without firing any mutation events.
Replace the Document's singleton objects with new instances of those objects. (This includes in particular the Window, Location, History, ApplicationCache, UndoManager, Navigator, and Selection objects, the various BarProp objects, the two Storage objects, and the various HTMLCollection objects. It also includes all the Web IDL prototypes in the JavaScript binding, including the Document object's prototype.)
Change the document's character encoding to UTF-16.
Change the document's address to the first script's browsing context's active document's address.
Create a new HTML parser and associate it with the document. This is a script-created parser (meaning that it can be closed by the document.open() and document.close() methods, and that the tokenizer will wait for an explicit call to document.close() before emitting an end-of-file token). The encoding confidence is irrelevant.
If the type string contains a U+003B SEMICOLON (;) character, remove the first such character and all characters from it up to the end of the string.
Strip all leading and trailing space characters from type .
If type is not now an ASCII case-insensitive match for the string "text/html", then act as if the tokenizer had emitted a start tag token with the tag name "pre", then set the HTML parser's tokenization stage's content model flag to PLAINTEXT.
If replace is false, then:
Document's History object
Document
Document object, as well as the state of the document at the start of these steps. (This allows the user to step backwards in the session history to see the page before it was blown away by the document.open() call.)
Finally, set the insertion point to point at just before the end of the input stream (which at this point will be empty).
Return the Document on which the method was invoked.
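The argument handling in these steps can be sketched in isolation. The helper name readOpenArguments and its return shape are hypothetical; the sketch covers only how type and replace are derived and whether the content would end up treated as plain text, and it does not unload or modify any document.

```javascript
// Hypothetical helper modelling only the argument handling of the
// two-or-fewer-argument form of open().
function readOpenArguments(...args) {
  const asciiLower = (s) => s.replace(/[A-Z]/g, (c) => c.toLowerCase());

  // First argument, or "text/html" if there is none.
  let type = args.length >= 1 ? String(args[0]) : "text/html";
  // replace is true only for an ASCII case-insensitive match for "replace".
  const replace = args.length >= 2 && asciiLower(String(args[1])) === "replace";

  // Drop the first ";" and everything after it, then strip space characters
  // (assumed here to be U+0020, U+0009, U+000A, U+000C and U+000D).
  const semicolon = type.indexOf(";");
  if (semicolon !== -1) type = type.slice(0, semicolon);
  type = type.replace(/^[ \t\n\f\r]+|[ \t\n\f\r]+$/g, "");

  // Anything other than text/html ends up tokenized as plain text.
  const plaintext = asciiLower(type) !== "text/html";
  return { type, replace, plaintext };
}
```

Note that a boolean second argument does not count as a match for "replace" under this reading, because the comparison is made against the string value.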
When called with three or more arguments, the open() method on the HTMLDocument object must call the open() method on the Window object of the HTMLDocument object, with the same arguments as the original call to the open() method, and return whatever that method returned. If the HTMLDocument object has no Window object, then the method must raise an INVALID_ACCESS_ERR exception.
The close() method must do nothing if there is no script-created parser associated with the document. If there is such a parser, then, when the method is called, the user agent must insert an explicit "EOF" character at the end of the parser's input stream.
document.write()
write(text...)
Adds the given string(s) to the Document's input stream. If necessary, calls the open() method implicitly first.
This method throws an INVALID_ACCESS_ERR exception when invoked on XML documents.
The document.write(...) method must act as follows:
If the method was invoked on an XML document, throw an INVALID_ACCESS_ERR exception and abort these steps.
If the insertion point is undefined, the open() method must be called (with no arguments) on the document object. If the user refused to allow the document to be unloaded, then these steps must be aborted. Otherwise, the insertion point will point at just before the end of the (empty) input stream.
The string consisting of the concatenation of all the arguments to the method must be inserted into the input stream just before the insertion point .
If there is a pending external script , then the method must now return without further processing of the input stream .
Otherwise, the tokenizer must process the characters that were inserted, one at a time, processing resulting tokens as they are emitted, and stopping when the tokenizer reaches the insertion point or when the processing of the tokenizer is aborted by the tree construction stage (this can happen if a script end tag token is emitted by the tokenizer).
If the document.write() method was called from script executing inline (i.e. executing because the parser parsed a set of script tags), then this is a reentrant invocation of the parser.
Finally, the method must return.
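The insertion step can be modelled with a string and an index. The InputStream class is hypothetical and covers only the concatenate-and-insert part of the steps above; it does no tokenizing and has no notion of pending scripts.

```javascript
// Hypothetical model of a parser input stream with an insertion point.
class InputStream {
  constructor() {
    this.text = "";
    this.insertionPoint = 0; // index into this.text
  }

  // Mirrors the insertion step of document.write(): the arguments are
  // concatenated and placed just before the insertion point.
  write(...args) {
    const chunk = args.map(String).join("");
    this.text =
      this.text.slice(0, this.insertionPoint) +
      chunk +
      this.text.slice(this.insertionPoint);
    // The insertion point now sits just after the newly inserted text.
    this.insertionPoint += chunk.length;
  }
}
```

Because text is inserted before the insertion point, successive calls appear in call order, ahead of whatever was already waiting in the stream after that point.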
document.writeln()
writeln(text...)
Adds the given string(s) to the Document's input stream, followed by a newline character. If necessary, calls the open() method implicitly first.
This method throws an INVALID_ACCESS_ERR exception when invoked on XML documents.
The document.writeln(...) method, when invoked, must act as if the document.write() method had been invoked with the same argument(s), plus an extra argument consisting of a string containing a single line feed character (U+000A).
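This relationship is small enough to state directly in code. In this sketch the write function is passed in as a parameter, which is an assumption made so that the example does not depend on a real Document object.

```javascript
// Hypothetical wrapper: `write` is any function with document.write()-like
// semantics (it accepts any number of string arguments).
function writeln(write, ...args) {
  return write(...args, "\n"); // U+000A LINE FEED as one extra argument
}
```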
innerHTML
The innerHTML DOM attribute represents the markup of the node's contents.
innerHTML [ = value ]
Returns a fragment of HTML or XML that represents the Document.
Can be set, to replace the Document's contents with the result of parsing the given string.
In the case of XML documents, will throw a SYNTAX_ERR if the Document cannot be serialized to XML, or if the given string is not well-formed.
innerHTML [ = value ]
Returns a fragment of HTML or XML that represents the element's contents.
Can be set, to replace the contents of the element with nodes parsed from the given string.
In the case of XML documents, will throw a SYNTAX_ERR if the element cannot be serialized to XML, or if the given string is not well-formed.
On getting, if the node's document is an HTML document , then the attribute must return the result of running the HTML fragment serialization algorithm on the node; otherwise, the node's document is an XML document , and the attribute must return the result of running the XML fragment serialization algorithm on the node instead (this might raise an exception instead of returning a string).
On setting, the following steps must be run:
If the node's document is an HTML document : Invoke the HTML fragment parsing algorithm .
If the node's document is an XML document : Invoke the XML fragment parsing algorithm .
In
either
case,
the
algorithm
must
be
invoked
with
the
string
being
assigned
into
the
innerHTML
attribute
as
the
input
.
If
the
node
is
an
Element
node,
then,
in
addition,
that
element
must
be
passed
as
the
context
element.
If this raises an exception, then abort these steps.
Otherwise, let new children be the nodes returned.
If
the
attribute
is
being
set
on
a
Document
node,
and
that
document
has
an
active
parser
,
then
abort
that
parser.
Remove
the
child
nodes
of
the
node
whose
innerHTML
attribute
is
being
set,
firing
appropriate
mutation
events.
If
the
attribute
is
being
set
on
a
Document
node,
let
target
document
be
that
Document
node.
Otherwise,
the
attribute
is
being
set
on
an
Element
node;
let
target
document
be
the
ownerDocument
of
that
Element
.
Set
the
ownerDocument
of
all
the
nodes
in
new
children
to
the
target
document
.
Append
all
the
new
children
nodes
to
the
node
whose
innerHTML
attribute
is
being
set,
preserving
their
order,
and
firing
mutation
events
as
if
a
DocumentFragment
containing
the
new
children
had
been
inserted.
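The ordering of the setter steps above matters: parsing happens before any existing child is removed, so a parse failure leaves the node untouched. The following is a minimal sketch on a toy node model, not a DOM implementation. The node shape and the parseFragment callback are assumptions made for illustration, and mutation events are not modelled.

```javascript
// Sketch of the innerHTML setter steps on a toy node model.
// `parseFragment(input, contextElement)` is a hypothetical stand-in for the
// HTML or XML fragment parsing algorithm: it returns an array of new nodes
// or throws.
function setInnerHTML(node, value, parseFragment) {
  // Parse first. Element nodes are also passed as the context element.
  // If parsing throws, the exception propagates and the node is unchanged.
  const context = node.nodeType === "element" ? node : null;
  const newChildren = parseFragment(value, context);

  // A Document with an active parser has that parser aborted.
  if (node.nodeType === "document" && node.activeParser) {
    node.activeParser.aborted = true;
    node.activeParser = null;
  }

  // Remove the existing children.
  for (const child of node.children) child.parent = null;
  node.children = [];

  // Pick the target document, adopt the new nodes, and append them in order.
  const targetDocument =
    node.nodeType === "document" ? node : node.ownerDocument;
  for (const child of newChildren) {
    child.ownerDocument = targetDocument;
    child.parent = node;
    node.children.push(child);
  }
}
```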
outerHTML
The
outerHTML
DOM
attribute
represents
the
markup
of
the
element
and
its
contents.
outerHTML
[
=
value
]
Returns a fragment of HTML or XML that represents the element and its contents.
Can be set, to replace the element with nodes parsed from the given string.
In
the
case
of
XML
documents
,
will
throw
a
SYNTAX_ERR
if
the
element
cannot
be
serialized
to
XML,
or
if
the
given
string
is
not
well-formed.
On getting, if the node's document is an HTML document , then the attribute must return the result of running the HTML fragment serialization algorithm on a fictional node whose only child is the node on which the attribute was invoked; otherwise, the node's document is an XML document , and the attribute must return the result of running the XML fragment serialization algorithm on that fictional node instead (this might raise an exception instead of returning a string).
On setting, the following steps must be run:
Let
target
be
the
element
whose
outerHTML
attribute
is
being
set.
If target has no parent node, then abort these steps. There would be no way to obtain a reference to the nodes created even if the remaining steps were run.
If
target
's
parent
node
is
a
Document
object,
throw
a
NO_MODIFICATION_ALLOWED_ERR
exception
and
abort
these
steps.
Let
parent
be
target
's
parent
node,
unless
that
is
a
DocumentFragment
node,
in
which
case
let
parent
be
an
arbitrary
body
element.
If target 's document is an HTML document : Invoke the HTML fragment parsing algorithm .
If target 's document is an XML document : Invoke the XML fragment parsing algorithm .
In
either
case,
the
algorithm
must
be
invoked
with
the
string
being
assigned
into
the
outerHTML
attribute
as
the
input
,
and
parent
as
the
context
element.
If this raises an exception, then abort these steps.
Otherwise, let new children be the nodes returned.
Set
the
ownerDocument
of
all
the
nodes
in
new
children
to
target
's
document.
Remove
target
from
its
parent
node,
firing
mutation
events
as
appropriate,
and
then
insert
in
its
place
all
the
new
children
nodes,
preserving
their
order,
and
again
firing
mutation
events
as
if
a
DocumentFragment
containing
the
new
children
had
been
inserted.
insertAdjacentHTML()
insertAdjacentHTML
(
position
,
text
)
Parses the given string text as HTML or XML and inserts the resulting nodes into the tree in the position given by the position argument, as follows:
Throws a SYNTAX_ERR exception if the arguments have invalid values (e.g., in the case of XML documents, if the given string is not well-formed).
Throws
a
NO_MODIFICATION_ALLOWED_ERR
exception
if
the
given
position
isn't
possible
(e.g.
inserting
elements
after
the
root
element
of
a
Document
).
The
insertAdjacentHTML(
position
,
text
)
method,
when
invoked,
must
run
the
following
algorithm:
Let position and text be the method's first and second arguments, respectively.
Let target be the element on which the method was invoked.
Use the first matching item from this list:
If target has no parent node, then abort these steps.
If
target
's
parent
node
is
a
Document
object,
then
throw
a
NO_MODIFICATION_ALLOWED_ERR
exception
and
abort
these
steps.
Otherwise, let context be the parent node of target .
Let context be the same as target .
Throw
a
SYNTAX_ERR
exception.
If target 's document is an HTML document : Invoke the HTML fragment parsing algorithm .
If target 's document is an XML document : Invoke the XML fragment parsing algorithm .
In either case, the algorithm must be invoked with text as the input, and the element selected by the previous step as the context element.
If this raises an exception, then abort these steps.
Otherwise, let new children be the nodes returned.
Set
the
ownerDocument
of
all
the
nodes
in
new
children
to
target
's
document.
Use the first matching item from this list:
Insert all the new children nodes immediately before target .
Insert all the new children nodes before the first child of target , if there is one. If there is no such child, append them all to target .
Append all the new children nodes to target .
Insert all the new children nodes immediately after target .
The
new
children
nodes
must
be
inserted
in
a
manner
that
preserves
their
order
and
fires
mutation
events
as
if
a
DocumentFragment
containing
the
new
children
had
been
inserted.
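The two "first matching item" lists above select a context element and an insertion point from the position argument. The sketch below illustrates that dispatch on a toy node model. The four position keywords and their case-insensitive matching are assumptions taken from how the method is commonly implemented, since the conditions are not listed in the text above; the node shape is likewise hypothetical.

```javascript
// Sketch: choose the context element for a given position.
// Returns null when the algorithm would abort without throwing.
function selectContext(position, target) {
  switch (String(position).toLowerCase()) {
    case "beforebegin": // assumed keyword
    case "afterend": // assumed keyword
      if (target.parent === null) return null;
      if (target.parent.nodeType === "document") {
        throw new Error("NO_MODIFICATION_ALLOWED_ERR");
      }
      return target.parent;
    case "afterbegin": // assumed keyword
    case "beforeend": // assumed keyword
      return target;
    default:
      throw new Error("SYNTAX_ERR");
  }
}

// Sketch: insert the parsed nodes, preserving their order.
function insertNodes(position, target, newChildren) {
  const p = String(position).toLowerCase();
  const outside = p === "beforebegin" || p === "afterend";
  const container = outside ? target.parent : target;
  let index;
  if (p === "beforebegin") index = container.children.indexOf(target);
  else if (p === "afterend") index = container.children.indexOf(target) + 1;
  else if (p === "afterbegin") index = 0;
  else index = container.children.length; // beforeend
  container.children.splice(index, 0, ...newChildren);
  for (const n of newChildren) n.parent = container;
}
```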
html
element
head
element
followed
by
a
body
element.
manifest
interface
HTMLHtmlElement
:
HTMLElement
{};
The
html
element
represents
the
root
of
an
HTML
document.
The
manifest
attribute
gives
the
address
of
the
document's
application
cache
manifest
,
if
there
is
one.
If
the
attribute
is
present,
the
attribute's
value
must
be
a
valid
URL
.
The
manifest
attribute
only
has
an
effect
during
the
early
stages
of
document
load.
Changing
the
attribute
dynamically
thus
has
no
effect
(and
thus,
no
DOM
API
is
provided
for
this
attribute).
For
the
purposes
of
application
cache
selection
,
later
base
elements
cannot
affect
the
resolving
of
relative
URLs
in
manifest
attributes,
as
the
attributes
are
processed
before
those
elements
are
seen.
head
element
The
removal
of
the
profile
attribute
from
the
head
element
is
controversial
and
currently
does
not
enjoy
consensus.
Microformats
and
GRDDL
require
the
existence
of
the
profile
attribute.
There
is
currently
an
active
issue
in
HTML
WG
that
is
tracking
the
discussion
of
the
profile
attribute
on
the
head
element
in
HTML5.
html
element.
title
element.
interface
HTMLHeadElement
:
HTMLElement
{};
The
head
element
represents
a
collection
of
metadata
for
the
Document
.
title
element
head
element
containing
no
other
title
elements.
interface HTMLTitleElement : HTMLElement {
attribute DOMString text;
};
The
title
element
represents
the
document's
title
or
name.
Authors
should
use
titles
that
identify
their
documents
even
when
they
are
used
out
of
context,
for
example
in
a
user's
history
or
bookmarks,
or
in
search
results.
The
document's
title
is
often
different
from
its
first
heading,
since
the
first
heading
does
not
have
to
stand
alone
when
taken
out
of
context.
There
must
be
no
more
than
one
title
element
per
document.
The
title
element
must
not
contain
any
elements.
The
text
DOM
attribute
must
return
the
same
value
as
the
textContent
DOM
attribute
on
the
element.
Here are some examples of appropriate titles, contrasted with the top-level headings that might be used on those same pages.
<title>Introduction to The Mating Rituals of Bees</title>
...
<h1>Introduction</h1>
<p>This companion guide to the highly successful
<cite>Introduction to Medieval Bee-Keeping</cite> book is...
The next page might be a part of the same site. Note how the title describes the subject matter unambiguously, while the first heading assumes the reader knows what the context is and therefore won't wonder if the dances are Salsa or Waltz:
<title>Dances used during bee mating rituals</title>
...
<h1>The Dances</h1>
The
string
to
use
as
the
document's
title
is
given
by
the
document.title
DOM
attribute.
User
agents
should
use
the
document's
title
when
referring
to
the
document
in
their
user
interface.
base
element
head
element
containing
no
other
base
elements.
href
target
interface HTMLBaseElement : HTMLElement {
attribute DOMString href;
attribute DOMString target;
};
The
base
element
allows
authors
to
specify
the
document
base
URL
for
the
purposes
of
resolving
relative
URLs
,
and
the
name
of
the
default
browsing
context
for
the
purposes
of
following
hyperlinks
.
The
element
does
not
represent
any
content
beyond
this
information.
There
must
be
no
more
than
one
base
element
per
document.
A
base
element
must
have
either
an
href
attribute,
a
target
attribute,
or
both.
The
href
content
attribute,
if
specified,
must
contain
a
valid
URL
.
A
base
element,
if
it
has
an
href
attribute,
must
come
before
any
other
elements
in
the
tree
that
have
attributes
defined
as
taking
URLs
,
except
the
html
element
(its
manifest
attribute
isn't
affected
by
base
elements).
The
target
attribute,
if
specified,
must
contain
a
valid
browsing
context
name
or
keyword
,
which
specifies
which
browsing
context
is
to
be
used
as
the
default
when
hyperlinks
and
forms
in
the
Document
cause
navigation
.
A
base
element,
if
it
has
a
target
attribute,
must
come
before
any
elements
in
the
tree
that
represent
hyperlinks
.
If
there
are
multiple
base
elements
with
target
attributes,
all
but
the
first
are
ignored.
The
href
and
target
DOM
attributes
must
reflect
the
respective
content
attributes
of
the
same
name.
link
element
itemprop
attribute
is
present:
flow
content
.
itemprop
attribute
is
present:
phrasing
content
.
noscript
element
that
is
a
child
of
a
head
element.
itemprop
attribute
is
present:
where
phrasing
content
is
expected.
href
rel
media
hreflang
type
sizes
title
attribute
has
special
semantics
on
this
element.
interface HTMLLinkElement : HTMLElement {
attribute boolean disabled;
attribute DOMString href;
attribute DOMString rel;
readonly attribute DOMTokenList relList;
attribute DOMString media;
attribute DOMString hreflang;
attribute DOMString type;
attribute DOMString sizes;
};
HTMLLinkElement
implements
LinkStyle
;
The
link
element
allows
authors
to
link
their
document
to
other
resources.
The
destination
of
the
link(s)
is
given
by
the
href
attribute,
which
must
be
present
and
must
contain
a
valid
URL
.
If
the
href
attribute
is
absent,
then
the
element
does
not
define
a
link.
The
types
of
link
indicated
(the
relationships)
are
given
by
the
value
of
the
rel
attribute,
which
must
be
present,
and
must
have
a
value
that
is
a
set
of
space-separated
tokens
.
The
allowed
values
and
their
meanings
are
defined
in
a
later
section.
If
the
rel
attribute
is
absent,
or
if
the
values
used
are
not
allowed
according
to
the
definitions
in
this
specification,
then
the
element
does
not
define
a
link.
Two
categories
of
links
can
be
created
using
the
link
element.
Links
to
external
resources
are
links
to
resources
that
are
to
be
used
to
augment
the
current
document,
and
hyperlink
links
are
links
to
other
documents
.
The
link
types
section
defines
whether
a
particular
link
type
is
an
external
resource
or
a
hyperlink.
One
element
can
create
multiple
links
(of
which
some
might
be
external
resource
links
and
some
might
be
hyperlinks);
exactly
which
and
how
many
links
are
created
depends
on
the
keywords
given
in
the
rel
attribute.
User
agents
must
process
the
links
on
a
per-link
basis,
not
a
per-element
basis.
Each
link
is
handled
separately.
For
instance,
if
there
are
two
link
elements
with
rel="stylesheet"
,
they
each
count
as
a
separate
external
resource,
and
each
is
affected
by
its
own
attributes
independently.
The
exact
behavior
for
links
to
external
resources
depends
on
the
exact
relationship,
as
defined
for
the
relevant
link
type.
Some
of
the
attributes
control
whether
or
not
the
external
resource
is
to
be
applied
(as
defined
below).
For
external
resources
that
are
represented
in
the
DOM
(for
example,
style
sheets),
the
DOM
representation
must
be
made
available
even
if
the
resource
is
not
applied.
To
obtain
the
resource,
the
user
agent
must
resolve
the
URL
given
by
the
href
attribute,
relative
to
the
element,
and
then
fetch
the
resulting
absolute
URL
.
User
agents
may
opt
to
only
fetch
such
resources
when
they
are
needed,
instead
of
pro-actively
fetching
all
the
external
resources
that
are
not
applied.
The semantics of the protocol used (e.g. HTTP) must be followed when fetching external resources. (For example, redirects must be followed and 404 responses must cause the external resource to not be applied.)
Fetching external resources must delay the load event of the element's document until the task that is queued by the networking task source once the resource has been fetched (defined below) has been run.
The
task
that
is
queued
by
the
networking
task
source
once
the
resource
has
been
fetched
must,
if
the
loads
were
successful,
queue
a
task
to
fire
a
simple
event
called
load
at
the
link
element;
otherwise,
if
the
resource
or
one
of
its
subresources
failed
to
completely
load
for
any
reason
(e.g.
DNS
error,
HTTP
404
response,
a
connection
being
prematurely
closed,
unsupported
Content-Type),
it
must
instead
queue
a
task
to
fire
a
simple
event
called
error
at
the
link
element.
Non-network
errors
in
processing
the
resource
or
its
subresources
(e.g.
CSS
parse
errors,
PNG
decoding
errors)
are
not
failures
for
the
purposes
of
this
paragraph.
The task source for these tasks is the DOM manipulation task source .
Interactive
user
agents
should
provide
users
with
a
means
to
follow
the
hyperlinks
created
using
the
link
element,
somewhere
within
their
user
interface.
The
exact
interface
is
not
defined
by
this
specification,
but
it
should
include
the
following
information
(obtained
from
the
element's
attributes,
again
as
defined
below),
in
some
form
or
another
(possibly
simplified),
for
each
hyperlink
created
with
each
link
element
in
the
document:
rel
attribute)
title
attribute).
href
attribute).
hreflang
attribute).
media
attribute).
User
agents
may
also
include
other
information,
such
as
the
type
of
the
resource
(as
given
by
the
type
attribute).
Hyperlinks
created
with
the
link
element
and
its
rel
attribute
apply
to
the
whole
page.
This
contrasts
with
the
rel
attribute
of
a
and
area
elements,
which
indicates
the
type
of
a
link
whose
context
is
given
by
the
link's
location
within
the
document.
The
media
attribute
says
which
media
the
resource
applies
to.
The
value
must
be
a
valid
media
query
.
[MQ]
If
the
link
is
a
hyperlink
then
the
media
attribute
is
purely
advisory,
and
describes
for
which
media
the
document
in
question
was
designed.
However,
if
the
link
is
an
external
resource
link
,
then
the
media
attribute
is
prescriptive.
The user agent must apply the external resource to views while their state matches the listed media and the other relevant conditions apply, and must not apply it otherwise.
The
external
resource
might
have
further
restrictions
defined
within
that
limit
its
applicability.
For
example,
a
CSS
style
sheet
might
have
some
@media
blocks.
This
specification
does
not
override
such
further
restrictions
or
requirements.
The
default,
if
the
media
attribute
is
omitted,
is
all
,
meaning
that
by
default
links
apply
to
all
media.
The
hreflang
attribute
on
the
link
element
has
the
same
semantics
as
the
hreflang
attribute
on
hyperlink
elements
.
The
type
attribute
gives
the
MIME
type
of
the
linked
resource.
It
is
purely
advisory.
The
value
must
be
a
valid
MIME
type,
optionally
with
parameters.
[RFC2046]
For
external
resource
links
,
the
type
attribute
is
used
as
a
hint
to
user
agents
so
that
they
can
avoid
fetching
resources
they
do
not
support.
If
the
attribute
is
present,
then
the
user
agent
must
assume
that
the
resource
is
of
the
given
type.
If
the
attribute
is
omitted,
but
the
external
resource
link
type
has
a
default
type
defined,
then
the
user
agent
must
assume
that
the
resource
is
of
that
type.
If
the
UA
does
not
support
the
given
MIME
type
for
the
given
link
relationship,
then
the
UA
should
not
fetch
the
resource;
if
the
UA
does
support
the
given
MIME
type
for
the
given
link
relationship,
then
the
UA
should
fetch
the
resource.
If
the
attribute
is
omitted,
and
the
external
resource
link
type
does
not
have
a
default
type
defined,
but
the
user
agent
would
fetch
the
resource
if
the
type
was
known
and
supported,
then
the
user
agent
should
fetch
the
resource
under
the
assumption
that
it
will
be
supported.
User
agents
must
not
consider
the
type
attribute
authoritative
—
upon
fetching
the
resource,
user
agents
must
not
use
the
type
attribute
to
determine
its
actual
type.
Only
the
actual
type
(as
defined
in
the
next
paragraph)
is
used
to
determine
whether
to
apply
the
resource,
not
the
aforementioned
assumed
type.
If the external resource link type defines rules for processing the resource's Content-Type metadata , then those rules apply. Otherwise, if the resource is expected to be an image, user agents may apply the image sniffing rules , with the official type being the type determined from the resource's Content-Type metadata , and use the resulting sniffed type of the resource as if it was the actual type. Otherwise, if neither of these conditions apply or if the user agent opts not to apply the image sniffing rules, then the user agent must use the resource's Content-Type metadata to determine the type of the resource. If there is no type metadata, but the external resource link type has a default type defined, then the user agent must assume that the resource is of that type.
The
stylesheet
link
type
defines
rules
for
processing
the
resource's
Content-Type
metadata
.
Once the user agent has established the type of the resource, the user agent must apply the resource if it is of a supported type and the other relevant conditions apply, and must ignore the resource otherwise.
If a document contains style sheet links labeled as follows:
<link rel="stylesheet" href="A" type="text/plain">
<link rel="stylesheet" href="B" type="text/css">
<link rel="stylesheet" href="C">
...then
a
compliant
UA
that
supported
only
CSS
style
sheets
would
fetch
the
B
and
C
files,
and
skip
the
A
file
(since
text/plain
is
not
the
MIME
type
for
CSS
style
sheets).
For
files
B
and
C,
it
would
then
check
the
actual
types
returned
by
the
server.
For
those
that
are
sent
as
text/css
,
it
would
apply
the
styles,
but
for
those
labeled
as
text/plain
,
or
any
other
type,
it
would
not.
If one of the two files was returned without Content-Type metadata, or with a syntactically incorrect type like Content-Type: "null", then the default type for stylesheet links would kick in.
Since
that
default
type
is
text/css
,
the
style
sheet
would
nonetheless
be
applied.
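The two decisions in this example (whether to fetch, based on the advisory type attribute, and whether to apply, based on the actual type) can be sketched as below. This is a simplified illustration, not the full processing model: the tables describing a CSS-only user agent and the rough MIME type syntax check are assumptions, and image sniffing is not modelled.

```javascript
// Assumptions for this sketch: the user agent supports only CSS for
// rel="stylesheet", and text/css is that link type's default type.
const DEFAULT_TYPE = { stylesheet: "text/css" };
const SUPPORTED_TYPES = { stylesheet: ["text/css"] };

// Drop any parameters and normalise case:
// "Text/CSS; charset=utf-8" becomes "text/css".
function essence(mimeType) {
  return mimeType.split(";")[0].trim().toLowerCase();
}

// Rough syntactic check; a value such as "null" has no type/subtype pair.
function looksLikeMimeType(value) {
  return typeof value === "string" &&
    /^[^\s\/;"]+\/[^\s\/;"]+\s*(;.*)?$/.test(value.trim());
}

// Fetch decision: the type attribute (or the default type) is only a hint.
function shouldFetch(rel, typeAttribute) {
  const assumed = typeAttribute ?? DEFAULT_TYPE[rel];
  // No hint at all: fetch on the assumption that the type will be supported.
  if (assumed === undefined) return true;
  return (SUPPORTED_TYPES[rel] ?? []).includes(essence(assumed));
}

// Apply decision: only the actual type counts, never the type attribute.
function shouldApply(rel, contentType) {
  const actual = looksLikeMimeType(contentType)
    ? essence(contentType)
    : DEFAULT_TYPE[rel];
  return (SUPPORTED_TYPES[rel] ?? []).includes(actual);
}
```

With these tables, file A is never fetched, while B and C are fetched and then applied only if the server labels them text/css or gives no usable type.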
The
title
attribute
gives
the
title
of
the
link.
With
one
exception,
it
is
purely
advisory.
The
value
is
text.
The
exception
is
for
style
sheet
links,
where
the
title
attribute
defines
alternative
style
sheet
sets
.
The
title
attribute
on
link
elements
differs
from
the
global
title
attribute
of
most
other
elements
in
that
a
link
without
a
title
does
not
inherit
the
title
of
the
parent
element:
it
merely
has
no
title.
The
sizes
attribute
is
used
with
the
icon
link
type.
The
attribute
must
not
be
specified
on
link
elements
that
do
not
have
a
rel
attribute
that
specifies
the
icon
keyword.
Some
versions
of
HTTP
defined
a
Link:
header,
to
be
processed
like
a
series
of
link
elements.
If supported, for the purposes of ordering, links defined by HTTP headers must be assumed to come before any links in the document, in the order that they were given in the HTTP entity header.
(URIs
in
these
headers
are
to
be
processed
and
resolved
according
to
the
rules
given
in
HTTP;
the
rules
of
this
specification
don't
apply.)
[HTTP]
[WEBLINK]
The DOM attributes href, rel, media, hreflang, type, and sizes must each reflect the respective content attributes of the same name.
The
DOM
attribute
relList
must
reflect
the
rel
content
attribute.
The
DOM
attribute
disabled
only
applies
to
style
sheet
links.
When
the
link
element
defines
a
style
sheet
link,
then
the
disabled
attribute
behaves
as
defined
for
the
alternative
style
sheets
DOM
.
For all other link elements it always returns false and does nothing on setting.
The LinkStyle interface is also implemented by this element; the styling processing model defines how. [CSSOM]
meta
element
itemprop
attribute
is
present:
flow
content
.
itemprop
attribute
is
present:
phrasing
content
.
charset
attribute
is
present,
or
if
the
element's
http-equiv
attribute
is
in
the
Encoding
declaration
state
:
in
a
head
element.
http-equiv
attribute
is
present
and
in
the
Encoding
declaration
state
:
in
a
head
element.
http-equiv
attribute
is
present
but
not
in
the
Encoding
declaration
state
:
in
a
noscript
element
that
is
a
child
of
a
head
element.
name
attribute
is
present:
where
metadata
content
is
expected.
itemprop
attribute
is
present:
where
phrasing
content
is
expected.
name
http-equiv
content
charset
interface HTMLMetaElement : HTMLElement {
attribute DOMString name;
attribute DOMString httpEquiv;
};
The
meta
element
represents
various
kinds
of
metadata
that
cannot
be
expressed
using
the
title
,
base
,
link
,
style
,
and
script
elements.
The
meta
element
can
represent
document-level
metadata
with
the
name
attribute,
pragma
directives
with
the
http-equiv
attribute,
and
the
file's
character
encoding
declaration
when
an
HTML
document
is
serialized
to
string
form
(e.g.
for
transmission
over
the
network
or
for
disk
storage)
with
the
charset
attribute.
Exactly
one
of
the
name
,
http-equiv
,
charset
,
and
itemprop
attributes
must
be
specified.
If
either
name
,
http-equiv
,
or
itemprop
is
specified,
then
the
content
attribute
must
also
be
specified.
Otherwise,
it
must
be
omitted.
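These attribute constraints lend themselves to a mechanical check. The following sketch shows one way a conformance checker might test them. The function name and the plain-object representation of the element's attributes are hypothetical, and only the two rules stated above are covered.

```javascript
// Sketch: check the attribute constraints on a meta element.
// `attrs` maps attribute names to values for the attributes that are present.
function checkMetaAttributes(attrs) {
  const has = (name) => Object.prototype.hasOwnProperty.call(attrs, name);
  const errors = [];

  // Exactly one of name, http-equiv, charset, and itemprop must be specified.
  const kinds = ["name", "http-equiv", "charset", "itemprop"].filter(has);
  if (kinds.length !== 1) {
    errors.push("exactly one of name, http-equiv, charset, itemprop is required");
  }

  // content is required with name, http-equiv, or itemprop,
  // and must be omitted otherwise.
  const needsContent = has("name") || has("http-equiv") || has("itemprop");
  if (needsContent && !has("content")) errors.push("content must be specified");
  if (!needsContent && has("content")) errors.push("content must be omitted");

  return errors;
}
```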
The
charset
attribute
specifies
the
character
encoding
used
by
the
document.
This
is
a
character
encoding
declaration
.
If
the
attribute
is
present
in
an
XML
document
,
its
value
must
be
an
ASCII
case-insensitive
match
for
the
string
"
UTF-8
"
(and
the
document
is
therefore
required
to
use
UTF-8
as
its
encoding).
The
charset
attribute
on
the
meta
element
has
no
effect
in
XML
documents,
and
is
only
allowed
in
order
to
facilitate
migration
to
and
from
XHTML.
There
must
not
be
more
than
one
meta
element
with
a
charset
attribute
per
document.
The
content
attribute
gives
the
value
of
the
document
metadata
or
pragma
directive
when
the
element
is
used
for
those
purposes.
The
allowed
values
depend
on
the
exact
context,
as
described
in
subsequent
sections
of
this
specification.
If
a
meta
element
has
a
name
attribute,
it
sets
document
metadata.
Document
metadata
is
expressed
in
terms
of
name/value
pairs,
the
name
attribute
on
the
meta
element
giving
the
name,
and
the
content
attribute
on
the
same
element
giving
the
value.
The
name
specifies
what
aspect
of
metadata
is
being
set;
valid
names
and
the
meaning
of
their
values
are
described
in
the
following
sections.
If
a
meta
element
has
no
content
attribute,
then
the
value
part
of
the
metadata
name/value
pair
is
the
empty
string.
The
name
DOM
attribute
must
reflect
the
content
attribute
of
the
same
name.
The
DOM
attribute
httpEquiv
must
reflect
the
content
attribute
http-equiv
.
This
specification
defines
a
few
names
for
the
name
attribute
of
the
meta
element.
The value must be a short free-form string giving the name of the Web application that the page represents.
If
the
page
is
not
a
Web
application,
the
application-name
metadata
name
must
not
be
used.
User
agents
may
use
the
application
name
in
UI
in
preference
to
the
page's
title
,
since
the
title
might
include
status
messages
and
the
like
relevant
to
the
status
of
the
page
at
a
particular
moment
in
time
instead
of
just
being
the
name
of
the
application.
The value must be a free-form string that describes the page. The value must be appropriate for use in a directory of pages, e.g. in a search engine.
The value must be a free-form string that identifies the software used to generate the document. This value must not be used on hand-authored pages.
This section is marked as controversial and does not enjoy broad consensus.
The WHATWG may not be the ideal community to provide an Internet-wide registry of metadata names. The IETF, IANA and W3C have proven that they are capable of operating these types of registries and have the organizational and legal backing to provide a lasting (multiple decades) metadata registry. While consensus may form around the WHAT WG as the primary keeper of the metadata registry, this approach has yet to be heavily debated outside the WHAT WG community.
The notion that there would be a central metadata registry is controversial by itself as the HTML specifications have traditionally not needed a central metadata registry. The approach of operating a centralized metadata registry must be discussed further.
Extensions to the predefined set of metadata names may be registered in the WHATWG Wiki MetaExtensions page .
Anyone is free to edit the WHATWG Wiki MetaExtensions page at any time to add a type. These new names must be specified with the following information:
The actual name being defined. The name should not be confusingly similar to any other defined name (e.g. differing only in case).
A short description of what the metadata name's meaning is, including the format the value is required to be in.
A list of other names that have exactly the same processing requirements. Authors should not use the names defined to be synonyms; they are only intended to allow user agents to support legacy content.
One of the following:
If a metadata name is added with the "proposal" status and found to be redundant with existing values, it should be removed and listed as a synonym for the existing value.
Conformance checkers must use the information given on the WHATWG Wiki MetaExtensions page to establish if a value not explicitly defined in this specification is allowed or not. Conformance checkers may cache this information (e.g. for performance reasons or to avoid the use of unreliable network connectivity).
When an author uses a new type not defined by either this specification or the Wiki page, conformance checkers should offer to add the value to the Wiki, with the details described above, with the "proposal" status.
This specification does not define how new values will get approved. It is expected that the Wiki will have a community that addresses this.
Metadata
names
whose
values
are
to
be
URLs
must
not
be
proposed
or
accepted.
Links
must
be
represented
using
the
link
element,
not
the
meta
element.
When
the
http-equiv
attribute
is
specified
on
a
meta
element,
the
element
is
a
pragma
directive.
The
http-equiv
attribute
is
an
enumerated
attribute
.
The
following
table
lists
the
keywords
defined
for
this
attribute.
The
states
given
in
the
first
cell
of
the
rows
with
keywords
give
the
states
to
which
those
keywords
map.
Some
of
the
keywords
are
non-conforming,
as
noted
in
the
last
column.
| State | Keywords | Notes |
|---|---|---|
| Content Language | content-language | Non-conforming |
| Encoding declaration | content-type | |
| Default style | default-style | |
| Refresh | refresh | |
When
a
meta
element
is
inserted
into
the
document
,
if
its
http-equiv
attribute
is
present
and
represents
one
of
the
above
states,
then
the
user
agent
must
run
the
algorithm
appropriate
for
that
state,
as
described
in
the
following
list:
Content Language state (http-equiv="content-language")
This non-conforming pragma sets the document-wide default language . Until the pragma is successfully processed, there is no document-wide default language .
If
another
meta
element
with
an
http-equiv
attribute
in
the
Content
Language
state
has
already
been
successfully
processed
(i.e.
when
it
was
inserted
the
user
agent
processed
it
and
reached
the
last
step
of
this
list
of
steps),
then
abort
these
steps.
If
the
meta
element
has
no
content
attribute,
or
if
that
attribute's
value
is
the
empty
string,
then
abort
these
steps.
Let
input
be
the
value
of
the
element's
content
attribute.
Let position point at the first character of input .
Collect a sequence of characters that are neither space characters nor a U+002C COMMA character (",").
Let the document-wide default language be the string that resulted from the previous step.
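The steps above can be sketched as a small function. This is an illustration only: the state object is a hypothetical stand-in for per-document bookkeeping, and the set of space characters (space, tab, line feed, form feed, carriage return) is assumed from its definition elsewhere in this specification.

```javascript
// Assumed set of space characters: U+0020, U+0009, U+000A, U+000C, U+000D.
const SPACE_CHARACTERS = " \t\n\f\r";

// Sketch: process a Content Language pragma.
// `state` is { processed: boolean, defaultLanguage: string | null }.
function processContentLanguage(state, content) {
  // A pragma of this kind has already been successfully processed.
  if (state.processed) return;
  // No content attribute, or an empty value.
  if (content === null || content === undefined || content === "") return;

  // Collect characters that are neither space characters nor a comma,
  // starting at the first character of the input.
  let position = 0;
  let collected = "";
  while (
    position < content.length &&
    !SPACE_CHARACTERS.includes(content[position]) &&
    content[position] !== ","
  ) {
    collected += content[position];
    position++;
  }

  state.defaultLanguage = collected;
  state.processed = true;
}
```

Only the first language of a comma-separated list is kept, and a second pragma is ignored once one has been processed.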
For
meta
elements
with
an
http-equiv
attribute
in
the
Content
Language
state
,
the
content
attribute
must
have
a
value
consisting
of
a
valid
RFC
3066
language
code.
[RFC3066]
This pragma is not exactly equivalent to the HTTP Content-Language header; for instance, it only supports one language. [HTTP]
Encoding declaration state (http-equiv="content-type")
The
Encoding
declaration
state
is
just
an
alternative
form
of
setting
the
charset
attribute:
it
is
a
character
encoding
declaration
.
This
state's
user
agent
requirements
are
all
handled
by
the
parsing
section
of
the
specification.
For
meta
elements
with
an
http-equiv
attribute
in
the
Encoding
declaration
state
,
the
content
attribute
must
have
a
value
that
is
an
ASCII
case-insensitive
match
for
a
string
that
consists
of:
the
literal
string
"
text/html;
",
optionally
followed
by
any
number
of
space
characters
,
followed
by
the
literal
string
"
charset=
",
followed
by
the
character
encoding
name
of
the
character
encoding
declaration
.
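The required shape of the content attribute in this state can be checked with a short function such as the following sketch. The function names are hypothetical, and the set of space characters is assumed from its definition elsewhere in this specification.

```javascript
// Lowercase only A-Z, for ASCII case-insensitive comparison.
function asciiLower(s) {
  return s.replace(/[A-Z]/g, (c) => c.toLowerCase());
}

// Sketch: does `content` consist of "text/html;", optional space characters,
// "charset=", and then the given character encoding name?
function matchesEncodingDeclaration(content, encodingName) {
  const match = /^text\/html;[ \t\n\f\r]*charset=([\s\S]*)$/.exec(
    asciiLower(content)
  );
  return match !== null && match[1] === asciiLower(encodingName);
}
```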
If
the
document
contains
a
meta
element
with
an
http-equiv
attribute
in
the
Encoding
declaration
state
,
then
the
document
must
not
contain
a
meta
element
with
the
charset
attribute
present.
The
Encoding
declaration
state
may
be
used
in
HTML
documents
,
but
elements
with
an
http-equiv
attribute
in
that
state
must
not
be
used
in
XML
documents
.
Default style state (http-equiv="default-style")
This pragma sets the name of the default alternative style sheet set .
Refresh state (http-equiv="refresh")
This pragma acts as a timed redirect.
If
another
meta
element
with
an
http-equiv
attribute
in
the
Refresh
state
has
already
been
successfully
processed
(i.e.
when
it
was
inserted
the
user
agent
processed
it
and
reached
the
last
step
of
this
list
of
steps),
then
abort
these
steps.
If
the
meta
element
has
no
content
attribute,
or
if
that
attribute's
value
is
the
empty
string,
then
abort
these
steps.
Let
input
be
the
value
of
the
element's
content
attribute.
Let position point at the first character of input .
Collect a sequence of characters in the range U+0030 DIGIT ZERO to U+0039 DIGIT NINE, and parse the resulting string using the rules for parsing non-negative integers . If the sequence of characters collected is the empty string, then no number will have been parsed; abort these steps. Otherwise, let time be the parsed number.
Collect
a
sequence
of
characters
in
the
range
U+0030
DIGIT
ZERO
to
U+0039
DIGIT
NINE
and
U+002E
FULL
STOP
("
.
").
Ignore
any
collected
characters.
Let url be the address of the current page.
If
the
character
in
input
pointed
to
by
position
is
a
U+003B
SEMICOLON
("
;
"),
then
advance
position
to
the
next
character.
Otherwise,
jump
to
the
last
step.
If the character in input pointed to by position is one of U+0055 LATIN CAPITAL LETTER U or U+0075 LATIN SMALL LETTER U, then advance position to the next character. Otherwise, jump to the last step.
If the character in input pointed to by position is one of U+0052 LATIN CAPITAL LETTER R or U+0072 LATIN SMALL LETTER R, then advance position to the next character. Otherwise, jump to the last step.
If the character in input pointed to by position is one of U+004C LATIN CAPITAL LETTER L or U+006C LATIN SMALL LETTER L, then advance position to the next character. Otherwise, jump to the last step.
If
the
character
in
input
pointed
to
by
position
is
a
U+003D
EQUALS
SIGN
("
=
"),
then
advance
position
to
the
next
character.
Otherwise,
jump
to
the
last
step.
If the character in input pointed to by position is either a U+0027 APOSTROPHE character (') or U+0022 QUOTATION MARK character ("), then let quote be that character, and advance position to the next character. Otherwise, let quote be the empty string.
Let url be equal to the substring of input from the character at position to the end of the string.
If quote is not the empty string, and there is a character in url equal to quote , then truncate url at that character, so that it and all subsequent characters are removed.
Strip any trailing space characters from the end of url .
Strip any U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF), and U+000D CARRIAGE RETURN (CR) characters from url .
Resolve the url value to an absolute URL, relative to the meta element. If this fails, abort these steps.
Perform one or more of the following steps:
Set a timer so that in time seconds, adjusted to take into account user or user agent preferences, if the user has not canceled the redirect, the user agent navigates the document's browsing context to url , with replacement enabled , and with the document's browsing context as the source browsing context .
Provide the user with an interface that, when selected, navigates a browsing context to url , with the document's browsing context as the source browsing context .
Do nothing.
In addition, the user agent may, as with anything, inform the user of any and all aspects of its operation, including the state of any timers, the destinations of any timed redirects, and so forth.
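The parsing steps above can be sketched roughly as follows. This is an illustrative sketch, not a normative implementation: the function name parseRefresh, its return shape, and the use of the URL constructor as a stand-in for "resolve relative to the meta element" are all assumptions. The sketch also skips space characters around the number, the semicolon and the equals sign, which is an assumption not among the steps listed above.

```javascript
// Characters treated as "space characters" in this sketch.
const SPACE = /[ \t\n\f\r]/;

// Returns null where the steps say "abort these steps", otherwise
// { time, url } describing the timed redirect (or reload).
function parseRefresh(content, pageUrl) {
  if (!content) return null; // no content attribute, or empty value
  const input = content;
  let pos = 0;
  const skipSpaces = () => {
    while (pos < input.length && SPACE.test(input[pos])) pos++;
  };

  skipSpaces(); // assumption, see lead-in
  // Collect digits; an empty run means no number was parsed.
  let digits = "";
  while (pos < input.length && /[0-9]/.test(input[pos])) digits += input[pos++];
  if (digits === "") return null;
  const time = parseInt(digits, 10);
  // Collect and ignore further digits and full stops (the ".5" in "2.5").
  while (pos < input.length && /[0-9.]/.test(input[pos])) pos++;
  skipSpaces();

  // "Jump to the last step" with url still set to the current page.
  const reload = { time, url: pageUrl };
  if (input[pos] !== ";") return reload;
  pos++;
  skipSpaces();
  for (const letter of "url") {
    if (pos >= input.length || input[pos].toLowerCase() !== letter) return reload;
    pos++;
  }
  skipSpaces();
  if (input[pos] !== "=") return reload;
  pos++;
  skipSpaces();

  // Optional quote, then the rest of the string is the URL.
  let quote = "";
  if (input[pos] === "'" || input[pos] === '"') quote = input[pos++];
  let url = input.slice(pos);
  if (quote !== "") {
    const end = url.indexOf(quote);
    if (end !== -1) url = url.slice(0, end);
  }
  // Strip trailing space characters, then any tab, LF and CR characters.
  url = url.replace(/[ \t\n\f\r]+$/, "").replace(/[\t\n\r]/g, "");
  try {
    return { time, url: new URL(url, pageUrl).href };
  } catch (e) {
    return null; // resolving failed: abort
  }
}
```

What the user agent then does with the result (set a timer, offer a link, or do nothing) is left open by the final step, so the sketch stops at returning the parsed values.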
For meta elements with an http-equiv attribute in the Refresh state, the content attribute must have a value consisting either of:
just a valid non-negative integer, or
a valid non-negative integer, followed by a U+003B SEMICOLON (;), followed by one or more space characters, followed by either a U+0055 LATIN CAPITAL LETTER U or a U+0075 LATIN SMALL LETTER U, a U+0052 LATIN CAPITAL LETTER R or a U+0072 LATIN SMALL LETTER R, a U+004C LATIN CAPITAL LETTER L or a U+006C LATIN SMALL LETTER L, a U+003D EQUALS SIGN (=), and then a valid URL.
In the former case, the integer represents a number of seconds before the page is to be reloaded; in the latter case the integer represents a number of seconds before the page is to be replaced by the page at the given URL .
There must not be more than one meta element with any particular state in the document at a time.
Extensions to the predefined set of pragma directives may, under certain conditions, be registered in the WHATWG Wiki PragmaExtensions page .
Such extensions must use a name that is identical to a previously-registered HTTP header defined in an RFC, and must have behavior identical to that described for the HTTP header. Pragma directives corresponding to headers describing metadata, or not requiring specific user agent processing, must not be registered; instead, use metadata names . Pragma directives corresponding to headers that affect the HTTP processing model (e.g. caching) must not be registered, as they would result in HTTP-level behavior being different for user agents that implement HTML than for user agents that do not.
Anyone is free to edit the WHATWG Wiki PragmaExtensions page at any time to add a pragma directive satisfying these conditions. Such registrations must specify the following information:
The actual name being defined.
A short description of the purpose of the pragma directive.
Conformance checkers must use the information given on the WHATWG Wiki PragmaExtensions page to establish if a value not explicitly defined in this specification is allowed or not. Conformance checkers may cache this information (e.g. for performance reasons or to avoid the use of unreliable network connectivity).
A character encoding declaration is a mechanism by which the character encoding used to store or transmit a document is specified.
The following restrictions apply to character encoding declarations:
If an HTML document does not start with a BOM, and if its encoding is not explicitly given by Content-Type metadata, then the character encoding used must be an ASCII-compatible character encoding, and, in addition, if that encoding isn't US-ASCII itself, then the encoding must be specified using a meta element with a charset attribute or a meta element with an http-equiv attribute in the Encoding declaration state.
If an HTML document contains a meta element with a charset attribute or a meta element with an http-equiv attribute in the Encoding declaration state, then the character encoding used must be an ASCII-compatible character encoding.
Authors should not use JIS-X-0208 (JIS_C6226-1983), JIS-X-0212 (JIS_X0212-1990), encodings based on ISO-2022 , and encodings based on EBCDIC. Authors should not use UTF-32. Authors must not use the CESU-8, UTF-7, BOCU-1 and SCSU encodings. [RFC1345] [RFC1468] [RFC2237] [RFC1554] [RFC1922] [RFC1557] [UTF32] [CESU8] [UTF7] [BOCU1] [SCSU]
Authors are encouraged to use UTF-8. Conformance checkers may advise against authors using legacy encodings.
Authoring tools should default to using UTF-8 for newly-created documents.
Using non-UTF-8 encodings can have unexpected results on form submission and URL encodings, which use the document's character encoding by default.
In XHTML, the XML declaration should be used for inline character encoding information, if necessary.
style element
If the scoped attribute is present: flow content.
If the scoped attribute is absent: where metadata content is expected.
If the scoped attribute is absent: in a noscript element that is a child of a head element.
If the scoped attribute is present: where flow content is expected, but before any other flow content other than other style elements and inter-element whitespace.
Depends on the value of the type attribute.
media
type
scoped
The title attribute has special semantics on this element.
interface HTMLStyleElement : HTMLElement {
attribute boolean disabled;
attribute DOMString media;
attribute DOMString type;
attribute boolean scoped;
};
HTMLStyleElement implements LinkStyle;
The style element allows authors to embed style information in their documents. The style element is one of several inputs to the styling processing model. The element does not represent content for the user.
If the type attribute is given, it must contain a valid MIME type, optionally with parameters, that designates a styling language. [RFC2046] If the attribute is absent, the type defaults to text/css. [RFC2138]
When examining types to determine if they support the language, user agents must not ignore unknown MIME parameters — types with unknown parameters must be assumed to be unsupported.
The media attribute says which media the styles apply to. The value must be a valid media query. [MQ] User agents must apply the styles to views while their state matches the listed media, and must not apply them otherwise.
The styles might be further limited in scope, e.g. in CSS with the use of @media blocks. This specification does not override such further restrictions or requirements.
The default, if the media attribute is omitted, is all, meaning that by default styles apply to all media.
The scoped attribute is a boolean attribute. If set, it indicates that the styles are intended just for the subtree rooted at the style element's parent element, as opposed to the whole Document.
If the scoped attribute is present, then the user agent must apply the specified style information only to the style element's parent element (if any), and that element's child nodes. Otherwise, the specified styles must, if applied, be applied to the entire document.
The title attribute on style elements defines alternative style sheet sets. If the style element has no title attribute, then it has no title; the title attribute of ancestors does not apply to the style element. [CSSOM]
The title attribute on style elements, like the title attribute on link elements, differs from the global title attribute in that a style block without a title does not inherit the title of the parent element: it merely has no title.
All descendant elements must be processed, according to their semantics, before the style element itself is evaluated. For styling languages that consist of pure text, user agents must evaluate style elements by passing the concatenation of the contents of all the text nodes that are direct children of the style element (not any other nodes such as comments or elements), in tree order, to the style system. For XML-based styling languages, user agents must pass all the child nodes of the style element to the style system.
All URLs found by the styling language's processor must be resolved , relative to the element (or as defined by the styling language), when the processor is invoked.
Once the element has been evaluated, if it had no subresources or once all the subresources it uses have been fetched, the user agent must queue a task to fire a simple event called load at the style element. If the resource has a subresource that fails to completely load for any reason (e.g. DNS error, HTTP 404 response, the connection being prematurely closed, unsupported Content-Type), the user agent must instead queue a task to fire a simple event called error at the style element. Non-network errors in the processing of the element's contents or its subresources (e.g. CSS parse errors) are not failures for the purposes of this paragraph.
The style element must delay the load event of the element's document until one of these tasks has been queued.
The task source for these tasks is the DOM manipulation task source .
This specification does not specify a style system, but CSS is expected to be supported by most Web browsers. [CSS]
The media, type and scoped DOM attributes must reflect the respective content attributes of the same name.
The DOM disabled attribute behaves as defined for the alternative style sheets DOM.
The LinkStyle interface is also implemented by this element; the styling processing model defines how. [CSSOM]
The link and style elements can provide styling information for the user agent to use when rendering the document. The DOM Styling specification specifies what styling information is to be used by the user agent and how it is to be used. [CSSOM]
The style and link elements implement the LinkStyle interface. [CSSOM]
For style elements, if the user agent does not support the specified styling language, then the sheet attribute of the element's LinkStyle interface must return null. Similarly, link elements that do not represent external resource links that contribute to the styling processing model (i.e. that do not have a stylesheet keyword in their rel attribute), and link elements whose specified resource has not yet been fetched, or is not in a supported styling language, must have their LinkStyle interface's sheet attribute return null.
Otherwise, the LinkStyle interface's sheet attribute must return a StyleSheet object with the following properties: [CSSOM]
The style sheet type must be the same as the style's specified type. For style elements, this is the same as the type content attribute's value, or text/css if that is omitted. For link elements, this is the Content-Type metadata of the specified resource.
For link elements, the location must be the result of resolving the URL given by the element's href content attribute, relative to the element, or the empty string if that fails. For style elements, there is no location.
The media must be the same as the value of the element's media content attribute, or the empty string, if the attribute is omitted.
The title must be the same as the value of the element's title content attribute, if the attribute is present and has a non-empty value. If the attribute is absent or its value is the empty string, then the style sheet does not have a title (it is the empty string). The title is used for defining alternative style sheet sets.
For link elements, true if the link is an alternative stylesheet. In all other cases, false.
The disabled DOM attribute on link and style elements must return false and do nothing on setting, if the sheet attribute of their LinkStyle interface is null. Otherwise, it must return the value of the StyleSheet interface's disabled attribute on getting, and forward the new value to that same attribute on setting.
The rules for handling alternative style sheets are defined in the CSS object model specification. [CSSOM]
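The forwarding behavior of the disabled DOM attribute can be illustrated with a small model. This is only a sketch: StyleHost is a hypothetical stand-in for a link or style element, and its sheet property stands in for the sheet attribute of the LinkStyle interface (either null or an object with a boolean disabled property).

```javascript
// Hypothetical model of an element implementing LinkStyle.
class StyleHost {
  constructor(sheet) {
    // null, or a StyleSheet-like object: { disabled: boolean }
    this.sheet = sheet;
  }

  get disabled() {
    // With no sheet, the attribute always reads as false.
    return this.sheet === null ? false : this.sheet.disabled;
  }

  set disabled(value) {
    // With no sheet, setting does nothing; otherwise forward to the sheet.
    if (this.sheet !== null) this.sheet.disabled = Boolean(value);
  }
}
```

In this model, setting disabled on a host whose sheet is null leaves it reading false, while a host with a sheet simply mirrors that sheet's own disabled flag.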
Scripts allow authors to add interactivity to their documents.
Authors are encouraged to use declarative alternatives to scripting where possible, as declarative mechanisms are often more maintainable, and many users disable scripting.
For example, instead of using script to show or hide a section to show more details, the details element could be used.
Authors are also encouraged to make their applications degrade gracefully in the absence of scripting support.
For example, if an author provides a link in a table header to dynamically resort the table, the link could also be made to function without scripts by requesting the sorted table from the server.
script element
If there is no src attribute, depends on the value of the type attribute.
If there is a src attribute, the element must be either empty or contain only script documentation.
src
async
defer
type
charset
interface HTMLScriptElement : HTMLElement {
attribute DOMString src;
attribute boolean async;
attribute boolean defer;
attribute DOMString type;
attribute DOMString charset;
attribute DOMString text;
};
The script element allows authors to include dynamic script and data blocks in their documents. The element does not represent content for the user.
When used to include dynamic scripts, the scripts may either be embedded inline or may be imported from an external file using the src attribute. If the language is not that described by "text/javascript", then the type attribute must be present. If the type attribute is present, its value must be the type of the script's language.
When used to include data blocks, the data must be embedded inline, the format of the data must be given using the type attribute, and the src attribute must not be specified.
The type attribute gives the language of the script or format of the data. If the attribute is present, its value must be a valid MIME type, optionally with parameters. The charset parameter must not be specified. (The default, which is used if the attribute is absent, is "text/javascript".) [RFC2046]
The src attribute, if specified, gives the address of the external script resource to use. The value of the attribute must be a valid URL identifying a script resource of the type given by the type attribute, if the attribute is present, or of the type "text/javascript", if the attribute is absent.
The charset attribute gives the character encoding of the external script resource. The attribute must not be specified if the src attribute is not present. If the attribute is set, its value must be a valid character encoding name, must be the preferred name for that encoding, and must match the encoding given in the charset parameter of the Content-Type metadata of the external file, if any. [IANACHARSET]
The async and defer attributes are boolean attributes that indicate how the script should be executed. There are three possible modes that can be selected using these attributes. If the async attribute is present, then the script will be executed asynchronously, as soon as it is available. If the async attribute is not present but the defer attribute is present, then the script is executed when the page has finished parsing. If neither attribute is present, then the script is fetched and executed immediately, before the user agent continues parsing the page. The exact processing details for these attributes are described below.
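The three modes described above reduce to a simple precedence rule, sketched here. The function name and the mode labels ("async", "defer", "blocking") are illustrative assumptions, and attrs is a hypothetical plain object whose boolean properties record whether each attribute is present.

```javascript
// Pick the execution mode from the presence of the async and defer attributes.
function scriptExecutionMode(attrs) {
  if (attrs.async) return "async";   // run as soon as the script is available
  if (attrs.defer) return "defer";   // run when the page has finished parsing
  return "blocking";                 // fetch and run before parsing continues
}
```

Note that async takes precedence when both attributes are present; defer then only matters to user agents that do not support async.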
The defer attribute may be specified even if the async attribute is specified, to cause legacy Web browsers that only support defer (and not async) to fall back to the defer behavior instead of the synchronous blocking behavior that is the default.
If the defer attribute is specified, the src attribute must also be specified.
Changing the src, type, charset, async, and defer attributes dynamically has no direct effect; these attributes are only used at specific times described below (namely, when the element is inserted into the document).
script elements have four associated pieces of metadata. The first is a flag indicating whether or not the script block has been "already executed". Initially, script elements must have this flag unset (script blocks, when created, are not "already executed"). When a script element is cloned, the "already executed" flag, if set, must be propagated to the clone when it is created. The second is a flag indicating whether the element was "parser-inserted". This flag is set by the HTML parser and is used to handle document.write() calls. The third and fourth pieces of metadata are the script block's type and the script block's character encoding. They are determined when the script is run, based on the attributes on the element at that time.
When a script element that is neither marked as having "already executed" nor marked as being "parser-inserted" experiences one of the events listed in the following list, the user agent must run the script element:
The script element gets inserted into a document.
The script element's child nodes are changed.
The script element has a src attribute set where previously the element had no such attribute.
Running a script: When a script element is to be run, the user agent must act as follows:
If either:
the script element has a type attribute and its value is the empty string, or
the script element has no type attribute but it has a language attribute and that attribute's value is the empty string, or
the script element has neither a type attribute nor a language attribute, then
...let the script block's type for this script element be "text/javascript".
Otherwise, if the script element has a type attribute, let the script block's type for this script element be the value of that attribute.
Otherwise, the element has a non-empty language attribute; let the script block's type for this script element be the concatenation of the string "text/" followed by the value of the language attribute.
The language attribute is never conforming, and is always ignored if there is a type attribute present.
If the script element has a charset attribute, then let the script block's character encoding for this script element be the encoding given by the charset attribute. Otherwise, let the script block's character encoding for this script element be the same as the encoding of the document itself.
If scripting is disabled for the script element, or if the user agent does not support the scripting language given by the script block's type for this script element, then the user agent must abort these steps at this point. The script is not executed.
If the element has no src attribute, and its child nodes consist only of comment nodes and empty text nodes, then the user agent must abort these steps at this point. The script is not executed.
The user agent must set the element's "already executed" flag.
If the element has a src attribute, then the value of that attribute must be resolved relative to the element, and if that is successful, the specified resource must then be fetched.
For historical reasons, if the URL is a javascript: URL, then the user agent must not, despite the requirements in the definition of the fetching algorithm, actually execute the given script; instead the user agent must act as if it had received an empty HTTP 400 response.
Once the resource's Content Type metadata is available, if it ever is, apply the algorithm for extracting an encoding from a Content-Type to it. If this returns an encoding, and the user agent supports that encoding, then let the script block's character encoding be that encoding.
Once the fetching process has completed, and the script has completed loading , the user agent will have to complete the steps described below . (If the parser is still active at that time, those steps defer to the parser to handle the execution of pending scripts.)
For performance reasons, user agents may start fetching the script as soon as the attribute is set, instead, in the hope that the element will be inserted into the document. Either way, once the element is inserted into the document, the load must have started. If the UA performs such prefetching, but the element is never inserted in the document, or the src attribute is dynamically changed, then the user agent will not execute the script, and the fetching process will have been effectively wasted.
Then, the first of the following options that describes the situation must be followed:
defer attribute, and the element has a src attribute, and the element does not have an async attribute
async attribute and a src attribute
async attribute but no src attribute, and the list of scripts that will execute asynchronously is not empty
src attribute and has been flagged as "parser-inserted"
src attribute
Fetching an external script must delay the load event of the element's document until the task that is queued by the networking task source once the resource has been fetched (defined below) has been run.
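The first two steps of the "running a script" algorithm above, which settle the script block's type and character encoding, might be sketched as follows. The function names are assumptions, and attrs is a hypothetical plain object mapping attribute names to string values, with absent attributes left undefined.

```javascript
// Determine the script block's type from the type and language attributes.
function scriptBlockType(attrs) {
  const hasType = attrs.type !== undefined;
  const hasLanguage = attrs.language !== undefined;
  if ((hasType && attrs.type === "") ||
      (!hasType && hasLanguage && attrs.language === "") ||
      (!hasType && !hasLanguage)) {
    return "text/javascript";
  }
  // A type attribute always wins over language.
  if (hasType) return attrs.type;
  // Only a non-empty language attribute remains.
  return "text/" + attrs.language;
}

// Determine the script block's character encoding before any fetching happens.
function scriptBlockEncoding(attrs, documentEncoding) {
  return attrs.charset !== undefined ? attrs.charset : documentEncoding;
}
```

The encoding chosen here is only the initial value; later steps may override it from the Content-Type metadata or from a byte order mark.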
When a script completes loading: If the script element was added to one of the lists mentioned above and the document is still being parsed, then the parser handles it. Otherwise, the UA must run the following steps as the task that the networking task source places on the task queue:
If the script element was added to the list of scripts that will execute when the document has finished parsing:
If the script element is not the first element in the list, then do nothing yet. Stop going through these steps.
Otherwise, execute the script block (the first element in the list).
Remove the script element from the list (i.e. shift out the first entry in the list).
If there are any more entries in the list, and if the script associated with the element that is now the first in the list is already loaded, then jump back to step 2 to execute it.
The scripts in the list of scripts that will execute when the document has finished parsing can also get executed prematurely if the innerHTML attribute is set on a node in the document.
If the script element was added to the list of scripts that will execute asynchronously:
If the script is not the first element in the list, then do nothing yet. Stop going through these steps.
Execute the script block (the first element in the list).
Remove the script element from the list (i.e. shift out the first entry in the list).
If there are any more scripts in the list, and the element now at the head of the list had no src attribute when it was added to the list, or had one, but its associated script has finished loading, then jump back to step 2 to execute the script associated with this element.
If the script element was added to the list of scripts that will execute as soon as possible:
Remove the script element from the list.
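The in-order behavior of the list of scripts that will execute when the document has finished parsing can be modelled with a plain array. This is a sketch under assumptions: entries are hypothetical objects with a loaded flag, and the execute callback stands in for "execute the script block".

```javascript
// list: entries in the order their elements were added.
// entry: the element whose script has just completed loading.
// execute: callback standing in for "execute the script block".
function onDeferredScriptLoaded(list, entry, execute) {
  entry.loaded = true;
  // Step 1: if this is not the first element in the list, do nothing yet.
  if (list[0] !== entry) return;
  // Steps 2 to 4: execute the head, shift it out, and repeat while the
  // element that is now first has already finished loading.
  do {
    execute(list[0]);
    list.shift();
  } while (list.length > 0 && list[0].loaded);
}
```

The effect is that scripts run in list order regardless of the order in which their fetches complete: a script that loads early simply waits until everything ahead of it has run.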
Executing a script block : When the steps above require that the script block be executed, the user agent must act as follows:
Executing the script block must just consist of firing a simple event called error at the element.
Initialize the script block's source as follows:
The contents of that file, interpreted as a string of Unicode characters, are the script source.
For each of the rows in the following table, starting with the first one and going down, if the file has as many or more bytes available than the number of bytes in the first column, and the first bytes of the file match the bytes given in the first column, then set the script block's character encoding to the encoding given in the cell in the second column of that row, irrespective of any previous value:
| Bytes in Hexadecimal | Encoding |
|---|---|
| FE FF | UTF-16BE |
| FF FE | UTF-16LE |
| EF BB BF | UTF-8 |
This step looks for Unicode Byte Order Marks (BOMs).
The file must then be converted to Unicode using the character encoding given by the script block's character encoding .
The value of the DOM text attribute at the time the "running a script" algorithm was first invoked is the script source.
The child nodes of the script element at the time the "running a script" algorithm was first invoked are the script source.
Pause until either any applicable style sheets have been fetched and applied, or the user agent has timed out and decided to not wait for those style sheets.
Create a script from the script element node, using the script block's source and the script block's type.
This is where the script is compiled and actually executed.
Fire a simple event called load at the script element.
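The byte order mark check applied to external script files in the steps above can be sketched as a table lookup. The function name and the choice of an array of byte values (or a Uint8Array) as input are assumptions made for illustration.

```javascript
// Rows of the BOM table, in the order given.
const BOM_TABLE = [
  { bytes: [0xFE, 0xFF], encoding: "UTF-16BE" },
  { bytes: [0xFF, 0xFE], encoding: "UTF-16LE" },
  { bytes: [0xEF, 0xBB, 0xBF], encoding: "UTF-8" },
];

// Returns the script block's character encoding after the BOM check.
function sniffScriptEncoding(fileBytes, currentEncoding) {
  let encoding = currentEncoding;
  for (const row of BOM_TABLE) {
    const longEnough = fileBytes.length >= row.bytes.length;
    if (longEnough && row.bytes.every((b, i) => fileBytes[i] === b)) {
      // A matching row overrides any previous value.
      encoding = row.encoding;
    }
  }
  return encoding;
}
```

A file without a recognised BOM keeps whatever encoding was established earlier, from the charset attribute, the document, or the Content-Type metadata.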
The DOM attributes src, type, charset, async, and defer, each must reflect the respective content attributes of the same name.
text [ = value ]
Returns the contents of the element, ignoring child nodes that aren't text nodes .
Can be set, to replace the element's children with the given value.
The DOM attribute text must return a concatenation of the contents of all the text nodes that are direct children of the script element (ignoring any other nodes such as comments or elements), in tree order. On setting, it must act the same way as the textContent DOM attribute.
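The getter and setter behavior of the text attribute can be illustrated on a simplified tree. In this sketch, nodes are hypothetical plain objects of the form { kind, data, children } rather than real DOM nodes, and the setter models textContent as replacing all children with at most one text node.

```javascript
// Getter: concatenate the data of direct text-node children only.
function getScriptText(element) {
  return element.children
    .filter((node) => node.kind === "text")
    .map((node) => node.data)
    .join("");
}

// Setter: behave like textContent in this model, replacing every child
// with a single text node (or with nothing for the empty string).
function setScriptText(element, value) {
  element.children = value === "" ? [] : [{ kind: "text", data: value }];
}
```

Text inside nested elements and the contents of comments do not contribute to the returned value, because only direct text-node children are considered.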
In this example, two script elements are used. One embeds an external script, and the other includes some data.
<script src="game-engine.js"></script>
<script type="text/x-game-map">
........U.........e
o............A....e
.....A.....AAA....e
.A..AAA...AAAAA...e
</script>
The data in this case might be used by the script to generate the map of a video game. The data doesn't have to be used that way, though; maybe the map data is actually embedded in other parts of the page's markup, and the data block here is just used by the site's search engine to help users who are looking for particular features in their game maps.
When inserted using the document.write() method, script elements execute (typically synchronously), but when inserted using innerHTML and outerHTML attributes, they do not execute at all.
A user agent is said to support the scripting language if the script block's type matches the MIME type of a scripting language that the user agent implements.
The following lists some MIME types and the languages to which they refer:
application/ecmascript
application/javascript
application/x-ecmascript
application/x-javascript
text/ecmascript
text/javascript
text/javascript1.0
text/javascript1.1
text/javascript1.2
text/javascript1.3
text/javascript1.4
text/javascript1.5
text/jscript
text/livescript
text/x-ecmascript
text/x-javascript
text/javascript;e4x=1
User agents may support other MIME types and other languages.
When examining types to determine if they support the language, user agents must not ignore unknown MIME parameters — types with unknown parameters must be assumed to be unsupported.
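The rule that unknown MIME parameters make a type unsupported can be sketched with an exact-match lookup. This assumes a hypothetical user agent that supports exactly the types listed above; the function names, and the decision to lowercase the whole string and trim whitespace around semicolons before comparing, are also assumptions.

```javascript
// Types this hypothetical user agent supports, including one with a known parameter.
const SUPPORTED_SCRIPT_TYPES = new Set([
  "application/ecmascript", "application/javascript",
  "application/x-ecmascript", "application/x-javascript",
  "text/ecmascript", "text/javascript",
  "text/javascript1.0", "text/javascript1.1", "text/javascript1.2",
  "text/javascript1.3", "text/javascript1.4", "text/javascript1.5",
  "text/jscript", "text/livescript",
  "text/x-ecmascript", "text/x-javascript",
  "text/javascript;e4x=1",
]);

// Lowercase, and remove whitespace at the ends and around ";".
function normalizeScriptType(type) {
  return type.trim().toLowerCase().split(";").map((part) => part.trim()).join(";");
}

// Parameters are never stripped: a type with an unknown parameter
// fails the lookup and is therefore treated as unsupported.
function supportsScriptType(type) {
  return SUPPORTED_SCRIPT_TYPES.has(normalizeScriptType(type));
}
```

Because the comparison covers the whole string, "text/javascript;version=1.8" is rejected even though "text/javascript" on its own is accepted.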
If a script element's src attribute is specified, then the contents of the script element, if any, must be such that the value of the DOM text attribute, which is derived from the element's contents, matches the documentation production in the following ABNF, the character set for which is Unicode. [ABNF]
documentation = *( *( space / tab / comment ) [ line-comment ] newline )
comment = slash star *( not-star / star not-slash ) 1*star slash
line-comment = slash slash *not-newline
; characters
tab = %x0009 ; U+0009 TAB
newline = %x000A ; U+000A LINE FEED
space = %x0020 ; U+0020 SPACE
star = %x002A ; U+002A ASTERISK
slash = %x002F ; U+002F SOLIDUS
not-newline = %x0000-0009 / %x000B-10FFFF
; a Unicode character other than U+000A LINE FEED
not-star = %x0000-0029 / %x002B-10FFFF
; a Unicode character other than U+002A ASTERISK
not-slash = %x0000-002E / %x0030-10FFFF
; a Unicode character other than U+002F SOLIDUS
This allows authors to include documentation, such as license information or API information, inside their documents while still referring to external script files. The syntax is constrained so that authors don't accidentally include what looks like valid script while also providing a src attribute.
<script src="cool-effects.js">
 // create new instances using:
 //    var e = new Effect();
 // start the effect using .play, stop using .stop:
 //    e.play();
 //    e.stop();
</script>
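A conformance checker could test the documentation production with a regular expression along these lines. This is a sketch, not a normative translation of the ABNF: the pattern and the function name are assumptions, and the pattern works on UTF-16 code units rather than Unicode code points, which does not matter for the delimiter characters involved.

```javascript
// documentation = *( *( space / tab / comment ) [ line-comment ] newline )
// comment       = "/*" *( not-star / "*" not-slash ) 1*"*" "/"
// line-comment  = "//" *not-newline
const DOCUMENTATION =
  /^(?:(?:[ \t]|\/\*(?:[^*]|\*[^\/])*\*+\/)*(?:\/\/[^\n]*)?\n)*$/;

// value: the DOM text attribute of a script element that has a src attribute.
function matchesScriptDocumentation(value) {
  return DOCUMENTATION.test(value);
}
```

Every line, including the last, has to end with a LINE FEED, so a trailing comment without a final newline does not match, and neither does anything that looks like executable script.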
noscript element
In a head element of an HTML document, if there are no ancestor noscript elements.
Where phrasing content is expected in HTML documents, if there are no ancestor noscript elements.
When scripting is disabled, in a head element: in any order, zero or more link elements, zero or more style elements, and zero or more meta elements.
When scripting is disabled, not in a head element: transparent, but there must be no noscript element descendants.
Uses HTMLElement.
The noscript element represents nothing if scripting is enabled, and represents its children if scripting is disabled. It is used to present different markup to user agents that support scripting and those that don't support scripting, by affecting how the document is parsed.
When used in HTML documents , the allowed content model is as follows:
In a head element, if scripting is disabled for the noscript element
The noscript element must contain only link, style, and meta elements.
In a head element, if scripting is enabled for the noscript element
The noscript element must contain only text, except that invoking the HTML fragment parsing algorithm with the noscript element as the context element and the text contents as the input must result in a list of nodes that consists only of link, style, and meta elements, and no parse errors.
Outside of head elements, if scripting is disabled for the noscript element
The noscript element's content model is transparent, with the additional restriction that a noscript element must not have a noscript element as an ancestor (that is, noscript can't be nested).
Outside of head elements, if scripting is enabled for the noscript element
The noscript element must contain only text, except that the text must be such that running the following algorithm results in a conforming document with no noscript elements and no script elements, and such that no step in the algorithm causes an HTML parser to flag a parse error:
script element from the document.
noscript element in the document. For every noscript element in that list, perform the following steps:
noscript element.
noscript element, and call these elements the before children.
noscript element, and call these elements the after children.
noscript element.
innerHTML attribute of the parent element to the value of s. (This, as a side-effect, causes the noscript element to be removed from the document.)
All these contortions are required because, for historical reasons, the noscript element is handled differently by the HTML parser based on whether scripting was enabled or not when the parser was invoked. The element is not allowed in XML, because in XML the parser is not affected by such state, and thus the element would not have the desired effect.
The noscript element must not be used in XML documents.
The noscript element is only effective in the HTML syntax; it has no effect in the XHTML syntax.
The noscript element has no other requirements. In particular, children of the noscript element are not exempt from form submission, scripting, and so forth, even when scripting is enabled for the element.
There are a number of new elements in this section that are controversial and that do not enjoy broad consensus.
Microsoft and others have asserted that there are a number of language design issues with the section, nav, article, aside, and hgroup. It has been suggested that either the class attribute may perform the same functionality, or the role attribute could be adopted from the XHTML2 work.
body element
html element.
onafterprint
onbeforeprint
onbeforeunload
onblur
onerror
onfocus
onhashchange
onload
onmessage
onoffline
ononline
onpopstate
onredo
onresize
onstorage
onundo
onunload
interface HTMLBodyElement : HTMLElement {
  attribute Function onafterprint;
  attribute Function onbeforeprint;
  attribute Function onbeforeunload;
  attribute Function onblur;
  attribute Function onerror;
  attribute Function onfocus;
  attribute Function onhashchange;
  attribute Function onload;
  attribute Function onmessage;
  attribute Function onoffline;
  attribute Function ononline;
  attribute Function onpopstate;
  attribute Function onredo;
  attribute Function onresize;
  attribute Function onstorage;
  attribute Function onundo;
  attribute Function onunload;
};
The
body
element
represents
the
main
content
of
the
document.
In
conforming
documents,
there
is
only
one
body
element.
The
document.body
DOM
attribute
provides
scripts
with
easy
access
to
a
document's
body
element.
Some
DOM
operations
(for
example,
parts
of
the
drag
and
drop
model)
are
defined
in
terms
of
"
the
body
element
".
This
refers
to
a
particular
element
in
the
DOM,
as
per
the
definition
of
the
term,
and
not
any
arbitrary
body
element.
The
body
element
exposes
as
event
handler
content
attributes
a
number
of
the
event
handler
attributes
of
the
Window
object.
It
also
mirrors
their
event
handler
DOM
attributes
.
The
onblur
,
onerror
,
onfocus
,
and
onload
event
handler
attributes
of
the
Window
object,
exposed
on
the
body
element,
shadow
the
generic
event
handler
attributes
with
the
same
names
normally
supported
by
HTML
elements
.
Thus, for example, a bubbling error event fired on a child of the body element of a Document would first trigger the onerror event handler content attribute of that element, then that of the root html element, and only then would it trigger the onerror event handler content attribute on the body element. This is because the event would bubble from the target, to the body, to the html, to the Document, to the Window, and the event handler attribute on the body is watching the Window, not the body. A regular event listener attached to the body using addEventListener(), however, would fire when the event bubbled through the body and not when it reached the Window object.
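The ordering described above can be illustrated with a small simulation. This is only a sketch: it models the propagation path with plain objects rather than a real DOM, and the names makeNode and simulateBubble are hypothetical helpers introduced for this illustration.

```javascript
// Hypothetical model: each object on the propagation path has a name and
// a list of labels standing in for registered handlers.
function makeNode(name) {
  return { name: name, listeners: [] };
}

// Visit the path from the target outwards, recording which handler runs where.
function simulateBubble(path) {
  var log = [];
  path.forEach(function (node) {
    node.listeners.forEach(function (label) {
      log.push(label + " @ " + node.name);
    });
  });
  return log;
}

var target = makeNode("target");
var body = makeNode("body");
var html = makeNode("html");
var doc = makeNode("Document");
var win = makeNode("Window");

target.listeners.push("onerror attribute of target");
html.listeners.push("onerror attribute of html");
// The onerror content attribute written on <body> is really a Window handler.
win.listeners.push("onerror attribute of body");
// A listener added with addEventListener() sits on the body itself.
body.listeners.push("addEventListener listener on body");

var order = simulateBubble([target, body, html, doc, win]);
console.log(order.join("\n"));
```

In this model the listener added with addEventListener() runs second, while the handler written as an attribute on the body runs last, because it is registered on the Window.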
section
element
formatBlock
candidate
.
cite
interface HTMLSectionElement : HTMLElement {
  attribute DOMString cite;
};
The
section
element
represents
a
generic
document
or
application
section.
A
section,
in
this
context,
is
a
thematic
grouping
of
content,
typically
with
a
heading,
possibly
with
a
footer.
Examples of sections would be chapters, the various tabbed pages in a tabbed dialog box, or the numbered sections of a thesis. A Web site's home page could be split into sections for an introduction, news items, and contact information.
The
cite
attribute
may
be
used
if
the
content
of
the
section
was
taken
from
another
page
(e.g.
syndicating
content
from
multiple
sources
on
one
page).
The
attribute,
if
present,
must
contain
a
valid
URL
referencing
the
original
source.
To
obtain
the
corresponding
citation
link,
the
value
of
the
attribute
must
be
resolved
relative
to
the
element.
User
agents
should
allow
users
to
follow
such
citation
links.
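As a rough illustration of obtaining a citation link, the sketch below resolves a cite value against a base URL using the standard URL constructor. The helper name resolveCitation and the example addresses are assumptions made for this illustration, and the sketch assumes the element's base URL is simply the address of the page.

```javascript
// Hypothetical helper: resolve a cite attribute value against a base URL.
// Returns the absolute URL as a string, or null if resolution fails.
function resolveCitation(citeValue, baseUrl) {
  try {
    return new URL(citeValue, baseUrl).href;
  } catch (e) {
    return null; // the value could not be resolved against this base
  }
}

// Example addresses are made up for illustration.
var link = resolveCitation(
  "../sources/apples.html",
  "http://example.com/news/today.html"
);
console.log(link); // → http://example.com/sources/apples.html
```

A user agent could then offer the resolved address to the user as a link to the original source.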
The
section
element
is
not
a
generic
container
element.
When
an
element
is
needed
for
styling
purposes
or
as
a
convenience
for
scripting,
authors
are
encouraged
to
use
the
div
element
instead.
A
general
rule
is
that
the
section
element
is
appropriate
only
if
the
element's
contents
would
be
listed
explicitly
in
the
document's
outline
.
In the following example, we see an article (part of a larger Web page) about apples, containing two short sections.
<article> <hgroup> <h1>Apples</h1> <h2>Tasty, delicious fruit!</h2> </hgroup> <p>The apple is the pomaceous fruit of the apple tree.</p> <section> <h1>Red Delicious</h1> <p>These bright red apples are the most common found in many supermarkets.</p> </section> <section> <h1>Granny Smith</h1> <p>These juicy, green apples make a great filling for apple pies.</p> </section> </article>
Notice
how
the
use
of
section
means
that
the
author
can
use
h1
elements
throughout,
without
having
to
worry
about
whether
a
particular
section
is
at
the
top
level,
the
second
level,
the
third
level,
and
so
on.
nav
element
formatBlock
candidate
.
HTMLElement
.
The
nav
element
represents
a
section
of
a
page
that
links
to
other
pages
or
to
parts
within
the
page:
a
section
with
navigation
links.
Not
all
groups
of
links
on
a
page
need
to
be
in
a
nav
element
—
only
sections
that
consist
of
major
navigation
blocks
are
appropriate
for
the
nav
element.
In
particular,
it
is
common
for
footers
to
have
a
list
of
links
to
various
key
parts
of
a
site,
but
the
footer
element
is
more
appropriate
in
such
cases,
and
no
nav
element
is
necessary
for
those
links.
In the following example, the page has several places where links are present, but only one of those places is considered a navigation section.
<body>
<header>
<h1>Wake up sheeple!</h1>
<p><a href="news.html">News</a> -
<a href="blog.html">Blog</a> -
<a href="forums.html">Forums</a></p>
<p>Last Modified: <time>2009-04-01</time></p>
<nav>
<h1>Navigation</h1>
<ul>
<li><a href="articles.html">Index of all articles</a></li>
<li><a href="today.html">Things sheeple need to wake up for today</a></li>
<li><a href="successes.html">Sheeple we have managed to wake</a></li>
</ul>
</nav>
</header>
<article>
<p>...page content would be here...</p>
</article>
<footer>
<p>Copyright © 2006 The Example Company</p>
<p><a href="about.html">About</a> -
<a href="policy.html">Privacy Policy</a> -
<a href="contact.html">Contact Us</a></p>
</footer>
</body>
In
the
following
example,
there
are
two
nav
elements,
one
for
primary
navigation
around
the
site,
and
one
for
secondary
navigation
around
the
page
itself.
<body>
<h1>The Wiki Center Of Exampland</h1>
<nav>
<ul>
<li><a href="/">Home</a></li>
<li><a href="/events">Current Events</a></li>
...more...
</ul>
</nav>
<article>
<header>
<h1>Demos in Exampland</h1>
<nav>
<ul>
<li><a href="#public">Public demonstrations</a></li>
<li><a href="#destroy">Demolitions</a></li>
...more...
</ul>
</nav>
</header>
<section id="public">
<h1>Public demonstrations</h1>
<p>...more...</p>
</section>
<section id="destroy">
<h1>Demolitions</h1>
<p>...more...</p>
</section>
...more...
<footer>
<p><a href="?edit">Edit</a> | <a href="?delete">Delete</a> | <a href="?Rename">Rename</a></p>
</footer>
</article>
<footer>
<p><small>© copyright 1998 Exampland Emperor</small></p>
</footer>
</body>
article
element
formatBlock
candidate
.
cite
pubdate
interface HTMLArticleElement : HTMLElement {
  attribute DOMString cite;
  attribute DOMString pubDate;
};
The
article
element
represents
a
section
of
a
page
that
consists
of
a
composition
that
forms
an
independent
part
of
a
document,
page,
or
site.
This
could
be
a
forum
post,
a
magazine
or
newspaper
article,
a
Web
log
entry,
a
user-submitted
comment,
or
any
other
independent
item
of
content.
An
article
element
is
"independent"
in
the
sense
that
its
contents
could
stand
alone,
for
example
in
syndication.
When
article
elements
are
nested,
the
inner
article
elements
represent
articles
that
are
in
principle
related
to
the
contents
of
the
outer
article.
For
instance,
a
Web
log
entry
on
a
site
that
accepts
user-submitted
comments
could
represent
the
comments
as
article
elements
nested
within
the
article
element
for
the
Web
log
entry.
Author
information
associated
with
an
article
element
(q.v.
the
address
element)
does
not
apply
to
nested
article
elements.
The
cite
attribute
may
be
used
if
the
content
of
the
article
was
taken
from
another
page
(e.g.
syndicating
content
from
multiple
sources
on
one
page).
The
attribute,
if
present,
must
contain
a
valid
URL
referencing
the
original
source.
To
obtain
the
corresponding
citation
link,
the
value
of
the
attribute
must
be
resolved
relative
to
the
element.
User
agents
should
allow
users
to
follow
such
citation
links.
The
pubdate
attribute
may
be
used
to
specify
the
time
and
date
that
the
article
was
first
published.
If
present,
the
pubdate
attribute
must
be
a
valid
global
date
and
time
string
value.
The
cite
DOM
attribute
must
reflect
the
element's
cite
content
attribute.
The
pubDate
DOM
attribute
must
reflect
the
element's
pubdate
content
attribute.
aside
element
formatBlock
candidate
.
HTMLElement
.
The
aside
element
represents
a
section
of
a
page
that
consists
of
content
that
is
tangentially
related
to
the
content
around
the
aside
element,
and
which
could
be
considered
separate
from
that
content.
Such
sections
are
often
represented
as
sidebars
in
printed
typography.
The element can also be used for typographical effects like pull quotes.
It's
not
appropriate
to
use
the
aside
element
just
for
parentheticals,
since
those
are
part
of
the
main
flow
of
the
document.
The following example shows how an aside is used to mark up background material on Switzerland in a much longer news story on Europe.
<aside> <h1>Switzerland</h1> <p>Switzerland, a land-locked country in the middle of geographic Europe, has not joined the geopolitical European Union, though it is a signatory to a number of European treaties.</p> </aside>
The following example shows how an aside is used to mark up a pull quote in a longer article.
... <p>He later joined a large company, continuing on the same work. <q>I love my job. People ask me what I do for fun when I'm not at work. But I'm paid to do my hobby, so I never know what to answer. Some people wonder what they would do if they didn't have to work... but I know what I would do, because I was unemployed for a year, and I filled that time doing exactly what I do now.</q></p> <aside> <q> People ask me what I do for fun when I'm not at work. But I'm paid to do my hobby, so I never know what to answer. </q> </aside> <p>Of course his work — or should that be hobby? — isn't his only passion. He also enjoys other pleasures.</p> ...
h1
,
h2
,
h3
,
h4
,
h5
,
and
h6
elements
formatBlock
candidate
.
interface HTMLHeadingElement : HTMLElement {};
These elements represent headings for their sections.
The semantics and meaning of these elements are defined in the section on headings and sections .
These
elements
have
a
rank
given
by
the
number
in
their
name.
The
h1
element
is
said
to
have
the
highest
rank,
the
h6
element
has
the
lowest
rank,
and
two
elements
with
the
same
name
have
equal
rank.
hgroup
element
formatBlock
candidate
.
h1
,
h2
,
h3
,
h4
,
h5
,
and/or
h6
elements.
HTMLElement
.
The
hgroup
element
represents
the
heading
of
a
section.
The
element
is
used
to
group
a
set
of
h1
–
h6
elements
when
the
heading
has
multiple
levels,
such
as
subheadings,
alternative
titles,
or
taglines.
The
point
of
hgroup
is
to
mask
an
h2
element
(that
acts
as
a
secondary
title)
from
the
outline
algorithm.
For the purposes of document summaries, outlines, and the like, the text of an hgroup element is defined to be the text of its highest-ranked h1–h6 element descendant, or of the first such element if several descendants share that rank. If there are no such elements, then the text of the hgroup element is the empty string.
Other elements of heading content in the hgroup element indicate subheadings or subtitles.
The rank of an hgroup element is the rank of the highest-ranked h1–h6 element descendant of the hgroup element, if there are any such elements, or otherwise the same as for an h1 element (the highest rank).
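The rules for the text and rank of an hgroup element can be sketched as follows. This is an illustrative sketch only: elements are modelled as plain objects of the assumed shape { tag, text, children } rather than DOM nodes, and the names el, level, hgroupHeading, hgroupText and hgroupLevel are hypothetical.

```javascript
// Minimal stand-in for DOM elements (assumed shape): { tag, text, children }.
function el(tag, text, children) {
  return { tag: tag, text: text || "", children: children || [] };
}

// Collect every h1-h6 descendant of a node, in tree order.
function headingDescendants(node, out) {
  out = out || [];
  node.children.forEach(function (child) {
    if (/^h[1-6]$/.test(child.tag)) out.push(child);
    headingDescendants(child, out);
  });
  return out;
}

// Number taken from the element name: 1 for h1 (highest rank), 6 for h6 (lowest).
function level(heading) {
  return Number(heading.tag.charAt(1));
}

// The highest-ranked h1-h6 descendant, or null if there is none.
function hgroupHeading(hgroup) {
  var best = null;
  headingDescendants(hgroup).forEach(function (h) {
    // The strict comparison keeps the first element among equals.
    if (best === null || level(h) < level(best)) best = h;
  });
  return best;
}

function hgroupText(hgroup) {
  var h = hgroupHeading(hgroup);
  return h ? h.text : ""; // empty string when there are no h1-h6 descendants
}

function hgroupLevel(hgroup) {
  var h = hgroupHeading(hgroup);
  return h ? level(h) : 1; // falls back to the rank of h1
}

var film = el("hgroup", "", [
  el("h1", "Dr. Strangelove"),
  el("h2", "Or: How I Learned to Stop Worrying and Love the Bomb")
]);
console.log(hgroupText(film), hgroupLevel(film));
```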
The
section
on
headings
and
sections
defines
how
hgroup
elements
are
assigned
to
individual
sections.
Here are some examples of valid headings. In each case, the emphasized text represents the text that would be used as the heading in an application extracting heading data and ignoring subheadings.
<hgroup> <h1>The reality dysfunction</h1> <h2>Space is not the only void</h2> </hgroup>
<hgroup> <h1>Dr. Strangelove</h1> <h2>Or: How I Learned to Stop Worrying and Love the Bomb</h2> </hgroup>
header
element
formatBlock
candidate
.
header
or
footer
element
descendants.
HTMLElement
.
The
header
element
represents
a
group
of
introductory
or
navigational
aids.
A header element usually contains the section's heading (an h1–h6 element or an hgroup element), but this is not required.
The
header
element
can
also
be
used
to
wrap
a
section's
table
of
contents,
a
search
form,
or
any
relevant
logos.
Here are some sample headers. This first one is for a game:
<header> <p>Welcome to...</p> <h1>Voidwars!</h1> </header>
The following snippet shows how the element can be used to mark up a specification's header:
<header> <hgroup> <h1>Scalable Vector Graphics (SVG) 1.2</h1> <h2>W3C Working Draft 27 October 2004</h2> </hgroup> <dl> <dt>This version:</dt> <dd><a href="http://www.w3.org/TR/2004/WD-SVG12-20041027/">http://www.w3.org/TR/2004/WD-SVG12-20041027/</a></dd> <dt>Previous version:</dt> <dd><a href="http://www.w3.org/TR/2004/WD-SVG12-20040510/">http://www.w3.org/TR/2004/WD-SVG12-20040510/</a></dd> <dt>Latest version of SVG 1.2:</dt> <dd><a href="http://www.w3.org/TR/SVG12/">http://www.w3.org/TR/SVG12/</a></dd> <dt>Latest SVG Recommendation:</dt> <dd><a href="http://www.w3.org/TR/SVG/">http://www.w3.org/TR/SVG/</a></dd> <dt>Editor:</dt> <dd>Dean Jackson, W3C, <a href="mailto:dean@w3.org">dean@w3.org</a></dd> <dt>Authors:</dt> <dd>See <a href="#authors">Author List</a></dd> </dl> <p class="copyright"><a href="http://www.w3.org/Consortium/Legal/ipr-notic ... </header>
The
header
element
is
not
sectioning
content
;
it
doesn't
introduce
a
new
section.
In
this
example,
the
page
has
a
page
heading
given
by
the
h1
element,
and
two
subsections
whose
headings
are
given
by
h2
elements.
The
content
after
the
header
element
is
still
part
of
the
last
subsection
started
in
the
header
element,
because
the
header
element
doesn't
take
part
in
the
outline
algorithm.
<body>
<header>
<h1>Little Green Guys With Guns</h1>
<nav>
<ul>
<li><a href="/games">Games</a> |
<li><a href="/forum">Forum</a> |
<li><a href="/download">Download</a>
</ul>
</nav>
<h2>Important News</h2> <!-- this starts a second subsection -->
<!-- this is part of the subsection entitled "Important News" -->
<p>To play today's games you will need to update your client.</p>
<h2>Games</h2> <!-- this starts a third subsection -->
</header>
<p>You have three active games:</p>
<!-- this is still part of the subsection entitled "Games" -->
...
footer
element
formatBlock
candidate
.
header
or
footer
element
descendants.
HTMLElement
.
The
footer
element
represents
a
footer
for
its
nearest
ancestor
sectioning
content
.
A
footer
typically
contains
information
about
its
section
such
as
who
wrote
it,
links
to
related
documents,
copyright
data,
and
the
like.
Contact
information
belongs
in
an
address
element,
possibly
itself
inside
a
footer
.
Footers don't necessarily have to appear at the end of a section, though they usually do.
The
footer
element
is
inappropriate
for
containing
entire
sections.
For
appendices,
indexes,
long
colophons,
verbose
license
agreements,
and
other
such
content
which
needs
sectioning
with
headings
and
so
forth,
regular
section
elements
should
be
used,
not
a
footer
.
Here is a page with two footers, one at the top and one at the bottom, with the same content:
<body> <footer><a href="../">Back to index...</a></footer> <hgroup> <h1>Lorem ipsum</h1> <h2>The ipsum of all lorems</h2> </hgroup> <p>A dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p> <footer><a href="../">Back to index...</a></footer> </body>
address
element
formatBlock
candidate
.
header
,
footer
,
or
address
element
descendants.
HTMLElement
.
The
address
element
represents
the
contact
information
for
its
nearest
article
or
body
element
ancestor.
If
that
is
the
body
element
,
then
the
contact
information
applies
to
the
document
as
a
whole.
For example, a page at the W3C Web site related to HTML might include the following contact information:
<ADDRESS> <A href="../People/Raggett/">Dave Raggett</A>, <A href="../People/Arnaud/">Arnaud Le Hors</A>, contact persons for the <A href="Activity">W3C HTML Activity</A> </ADDRESS>
The
address
element
must
not
be
used
to
represent
arbitrary
addresses
(e.g.
postal
addresses),
unless
those
addresses
are
in
fact
the
relevant
contact
information.
(The
p
element
is
the
appropriate
element
for
marking
up
postal
addresses
in
general.)
The
address
element
must
not
contain
information
other
than
contact
information.
For
example,
the
following
is
non-conforming
use
of
the
address
element:
<ADDRESS>Last Modified: 1999/12/24 23:37:50</ADDRESS>
Typically,
the
address
element
would
be
included
along
with
other
information
in
a
footer
element.
The contact information for a node node is a collection of address elements defined by the first applicable entry from the following list:
If node is an article element or a body element: The contact information consists of all the address elements that have node as an ancestor and do not have another body or article element ancestor that is a descendant of node.
If node has an ancestor element that is an article element or a body element: The contact information of node is the same as the contact information of the nearest article or body element ancestor, whichever is nearest.
If node's Document has a body element: The contact information of node is the same as the contact information of the body element of the Document.
Otherwise: There is no contact information for node.
User agents may expose the contact information of a node to the user, or use it for other purposes, such as indexing sections based on the sections' contact information.
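The lookup for a node's contact information might be sketched like this. It is not a definitive implementation: elements are modelled as plain objects with tag, id, parent and children fields, and the names el, isScope, collectAddresses and contactInformation, as well as the documentBody parameter, are assumptions made for the illustration.

```javascript
// Minimal stand-in for DOM elements (assumed shape); children get a parent link.
function el(tag, id, children) {
  var node = { tag: tag, id: id, parent: null, children: children || [] };
  node.children.forEach(function (child) { child.parent = node; });
  return node;
}

// article and body elements are the elements that own contact information.
function isScope(node) {
  return node.tag === "article" || node.tag === "body";
}

// Gather address descendants, without descending into nested article or
// body elements, whose addresses belong to them instead.
function collectAddresses(node, out) {
  node.children.forEach(function (child) {
    if (child.tag === "address") out.push(child);
    if (!isScope(child)) collectAddresses(child, out);
  });
  return out;
}

// documentBody is a hypothetical parameter standing in for the body element
// of the node's Document; it is used only when no article or body is found.
function contactInformation(node, documentBody) {
  for (var n = node; n !== null; n = n.parent) {
    if (isScope(n)) return collectAddresses(n, []);
  }
  return documentBody ? collectAddresses(documentBody, []) : [];
}

var comment = el("article", "comment", [el("address", "commenter")]);
var post = el("article", "post", [el("p", "text"), el("address", "author"), comment]);
var body = el("body", "body", [post, el("footer", "footer", [el("address", "site")])]);

function ids(list) { return list.map(function (a) { return a.id; }); }
console.log(ids(contactInformation(post)));
console.log(ids(contactInformation(body)));
```

In this model the address inside the nested comment article is not part of the outer article's contact information, and the address in the page footer applies to the body only.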
Contact
information
for
one
sectioning
content
element,
e.g.
an
aside
element,
does
not
apply
to
its
ancestor
elements,
e.g.
the
page's
body
.
The
h1
–
h6
elements
and
the
hgroup
element
are
headings.
The first element of heading content in an element of sectioning content represents the heading for that section. Subsequent headings of equal or higher rank start new (implied) sections, headings of lower rank start implied subsections that are part of the previous one. In both cases, the element represents the heading of the implied section.
Sectioning content elements are always considered subsections of their nearest ancestor element of sectioning content , regardless of what implied sections other headings may have created.
Certain
elements
are
said
to
be
sectioning
roots
,
including
blockquote
and
td
elements.
These
elements
can
have
their
own
outlines,
but
the
sections
and
headings
inside
these
elements
do
not
contribute
to
the
outlines
of
their
ancestors.
For the following fragment:
<body> <h1>Foo</h1> <h2>Bar</h2> <blockquote> <h3>Bla</h3> </blockquote> <p>Baz</p> <h2>Quux</h2> <section> <h3>Thud</h3> </section> <p>Grunt</p> </body>
...the structure would be:
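1. Foo (heading of explicit body section, containing the "Grunt" paragraph)
   1. Bar (heading starting implied section, containing a block quote and the "Baz" paragraph)
   2. Quux (heading starting implied section)
   3. Thud (heading of explicit section section)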
Notice
how
the
section
ends
the
earlier
implicit
section
so
that
a
later
paragraph
("Grunt")
is
back
at
the
top
level.
Sections
may
contain
headings
of
any
rank
,
but
authors
are
strongly
encouraged
to
either
use
only
h1
elements,
or
to
use
elements
of
the
appropriate
rank
for
the
section's
nesting
level.
Authors are also encouraged to explicitly wrap sections in elements of sectioning content , instead of relying on the implicit sections generated by having multiple headings in one element of sectioning content .
For example, the following is correct:
<body> <h4>Apples</h4> <p>Apples are fruit.</p> <section> <h2>Taste</h2> <p>They taste lovely.</p> <h6>Sweet</h6> <p>Red apples are sweeter than green ones.</p> <h1>Color</h1> <p>Apples come in various colors.</p> </section> </body>
However, the same document would be more clearly expressed as:
<body> <h1>Apples</h1> <p>Apples are fruit.</p> <section> <h2>Taste</h2> <p>They taste lovely.</p> <section> <h3>Sweet</h3> <p>Red apples are sweeter than green ones.</p> </section> </section> <section> <h2>Color</h2> <p>Apples come in various colors.</p> </section> </body>
Both of the documents above are semantically identical and would produce the same outline in compliant user agents.
This section defines an algorithm for creating an outline for a sectioning content element or a sectioning root element. It is defined in terms of a walk over the nodes of a DOM tree, in tree order, with each node being visited when it is entered and when it is exited during the walk.
The
outline
for
a
sectioning
content
element
or
a
sectioning
root
element
consists
of
a
list
of
one
or
more
potentially
nested
sections
.
A
section
is
a
container
that
corresponds
to
some
nodes
in
the
original
DOM
tree.
Each
section
can
have
one
heading
associated
with
it,
and
can
contain
any
number
of
further
nested
sections.
The
algorithm
for
the
outline
also
associates
each
node
in
the
DOM
tree
with
a
particular
section
and
potentially
a
heading.
(The
sections
in
the
outline
aren't
section
elements,
though
some
may
correspond
to
such
elements
—
they
are
merely
conceptual
sections.)
The following markup fragment:
<body> <h1>A</h1> <p>B</p> <h2>C</h2> <p>D</p> <h2>E</h2> <p>F</p> </body>
...results
in
the
following
outline
being
created
for
the
body
node
(and
thus
the
entire
document):
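1. Section created for body node. Associated with heading "A". Also associated with paragraph "B". Nested sections:
   1. Section implied for first h2 element. Associated with heading "C". Also associated with paragraph "D". No nested sections.
   2. Section implied for second h2 element. Associated with heading "E". Also associated with paragraph "F". No nested sections.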
The algorithm that must be followed during a walk of a DOM subtree rooted at a sectioning content element or a sectioning root element to determine that element's outline is as follows:
Let current outlinee be null. (It holds the element whose outline is being created.)
Let current section be null. (It holds a pointer to a section , so that elements in the DOM can all be associated with a section.)
Create a stack to hold elements, which is used to handle nesting. Initialize this stack to empty.
As you walk over the DOM in tree order , trigger the first relevant step below for each element as you enter and exit it.
If the top of the stack is an element, and you are exiting that element:
    Note: The element being exited is a heading content element.
    Pop that element from the stack.
If the top of the stack is a heading content element:
    Do nothing.
When entering a sectioning content element or a sectioning root element:
    If current outlinee is not null, push current outlinee onto the stack.
    Let current outlinee be the element that is being entered.
    Let current section be a newly created section for the current outlinee element.
    Let there be a new outline for the new current outlinee, initialized with just the new current section as the only section in the outline.
When exiting a sectioning content element, if the stack is not empty:
    Pop the top element from the stack, and let the current outlinee be that element.
    Let current section be the last section in the outline of the current outlinee element.
    Append the outline of the sectioning content element being exited to the current section. (This does not change which section is the last section in the outline.)
When exiting a sectioning root element, if the stack is not empty:
    Run these steps:
    1. Pop the top element from the stack, and let the current outlinee be that element.
    2. Let current section be the last section in the outline of the current outlinee element.
    3. Finding the deepest child: If current section has no child sections, stop these steps.
    4. Let current section be the last child section of the current section.
    5. Go back to the substep labeled finding the deepest child.
When exiting a sectioning content element or a sectioning root element:
    Note: The current outlinee is the element being exited.
    Let current section be the first section in the outline of the current outlinee element.
    Skip to the next step in the overall set of steps. (The walk is over.)
If the current outlinee is null:
    Do nothing.
When entering a heading content element:
    If the current section has no heading, let the element being entered be the heading for the current section.
    Otherwise, if the element being entered has a rank equal to or greater than the heading of the last section of the outline of the current outlinee, then create a new section and append it to the outline of the current outlinee element, so that this new section is the new last section of that outline. Let current section be that new section. Let the element being entered be the new heading for the current section.
    Otherwise, run these substeps:
    1. Let candidate section be current section.
    2. If the element being entered has a rank lower than the rank of the heading of the candidate section, then create a new section, and append it to candidate section. (This does not change which section is the last section in the outline.) Let current section be this new section. Let the element being entered be the new heading for the current section. Abort these substeps.
    3. Let new candidate section be the section that contains candidate section in the outline of current outlinee.
    4. Let candidate section be new candidate section.
    5. Return to step 2.
    Push the element being entered onto the stack. (This causes the algorithm to skip any descendants of the element.)
    Note: Recall that h1 has the highest rank, and h6 has the lowest rank.
Otherwise:
    Do nothing.
In addition, whenever you exit a node, after doing the steps above, if current section is not null, associate the node with the section current section .
If the current outlinee is null, then there was no sectioning content element or sectioning root element in the DOM. There is no outline . Abort these steps.
Associate any nodes that were not associated with a section in the steps above with current outlinee as their section.
Associate all nodes with the heading of the section with which they are associated, if any.
If current outlinee is the body element , then the outline created for that element is the outline of the entire document.
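The following sketch shows one way the section-building part of the algorithm could be written. It makes several assumptions and is not a complete implementation: elements are plain objects of the assumed shape { tag, text, children }, only a subset of sectioning roots is listed, nodes are not associated with sections, and a section without a heading is compared as if its heading were an h1. All function names are hypothetical.

```javascript
// Minimal stand-in for DOM elements (assumed shape): { tag, text, children }.
function el(tag, text, children) {
  return { tag: tag, text: text || "", children: children || [] };
}
var SECTIONING_CONTENT = ["section", "nav", "article", "aside"];
var SECTIONING_ROOTS = ["body", "blockquote", "td"]; // assumption: subset only
function isSectioning(n) { return SECTIONING_CONTENT.indexOf(n.tag) !== -1; }
function isRoot(n) { return SECTIONING_ROOTS.indexOf(n.tag) !== -1; }
function isHeading(n) { return /^(h[1-6]|hgroup)$/.test(n.tag); }
function last(list) { return list[list.length - 1]; }

// Level 1 (h1) is the highest rank, level 6 (h6) the lowest.
function level(n) {
  if (n.tag !== "hgroup") return Number(n.tag.charAt(1));
  var levels = [];
  (function visit(m) {
    m.children.forEach(function (c) {
      if (/^h[1-6]$/.test(c.tag)) levels.push(level(c));
      visit(c);
    });
  })(n);
  return levels.length ? Math.min.apply(null, levels) : 1;
}
// Assumption: a section with no heading is compared as if it had an h1.
function sectionLevel(s) { return s.heading ? level(s.heading) : 1; }

function createOutline(root) {
  var outlinee = null, current = null, stack = [], outlines = new Map();
  function newSection(parent) {
    var s = { heading: null, sections: [], parent: parent };
    if (parent) parent.sections.push(s);
    return s;
  }
  function enter(node) {
    if (stack.length && isHeading(last(stack))) return; // inside a heading
    if (isSectioning(node) || isRoot(node)) {
      if (outlinee !== null) stack.push(outlinee);
      outlinee = node;
      current = newSection(null);
      outlines.set(node, [current]);
    } else if (outlinee !== null && isHeading(node)) {
      var outline = outlines.get(outlinee);
      if (current.heading !== null) {
        if (level(node) <= sectionLevel(last(outline))) {
          current = newSection(null); // equal or higher rank: new top-level section
          outline.push(current);
        } else {
          var candidate = current; // lower rank: find the section to nest under
          while (level(node) <= sectionLevel(candidate)) candidate = candidate.parent;
          current = newSection(candidate);
        }
      }
      current.heading = node;
      stack.push(node); // skip the heading's descendants
    }
  }
  function exit(node) {
    if (stack.length && last(stack) === node) {
      stack.pop(); // leaving the heading pushed on entry
    } else if (stack.length && isHeading(last(stack))) {
      // Descendant of a heading: do nothing.
    } else if (isSectioning(node) && stack.length) {
      var inner = outlines.get(node);
      outlinee = stack.pop();
      current = last(outlines.get(outlinee));
      inner.forEach(function (s) { s.parent = current; current.sections.push(s); });
    } else if (isRoot(node) && stack.length) {
      outlinee = stack.pop();
      current = last(outlines.get(outlinee));
      while (current.sections.length) current = last(current.sections);
    } else if (isSectioning(node) || isRoot(node)) {
      current = outlines.get(outlinee)[0]; // the walk is over
    }
  }
  (function walk(node) {
    enter(node);
    node.children.forEach(walk);
    exit(node);
  })(root);
  return outlinee === null ? null : outlines.get(outlinee);
}

// Reduce an outline to heading texts for display.
function titles(sections) {
  return sections.map(function (s) {
    return { title: s.heading ? s.heading.text : null, sub: titles(s.sections) };
  });
}
var doc = el("body", "", [
  el("h1", "A"), el("p", "B"), el("h2", "C"),
  el("p", "D"), el("h2", "E"), el("p", "F")
]);
console.log(JSON.stringify(titles(createOutline(doc))));
```

For the earlier fragment with headings "A", "C" and "E", this yields a single top-level section for "A" with two nested sections for "C" and "E".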
The tree of sections created by the algorithm above, or a proper subset thereof, must be used when generating document outlines, for example when generating tables of contents.
When creating an interactive table of contents, entries should jump the user to the relevant sectioning content element, if the section was created for a real element in the original document, or to the relevant heading content element, if the section in the tree was generated for a heading in the above process.
Selecting
the
first
section
of
the
document
therefore
always
takes
the
user
to
the
top
of
the
document,
regardless
of
where
the
first
heading
in
the
body
is
to
be
found.
The following JavaScript function shows how the tree walk could be implemented. The root argument is the root of the tree to walk, and the enter and exit arguments are callbacks that are called with the nodes as they are entered and exited. [ECMA262]
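function walk(root, enter, exit) {
  var node = root;
  start: while (node) {
    // Entering a node: descend to its first child if it has one.
    enter(node);
    if (node.firstChild) {
      node = node.firstChild;
      continue start;
    }
    while (node) {
      // Exiting a node: stop at the root (so the walk never escapes to the
      // root's siblings), otherwise move to the next sibling or the parent.
      exit(node);
      if (node == root) {
        node = null;
      } else if (node.nextSibling) {
        node = node.nextSibling;
        continue start;
      } else {
        node = node.parentNode;
      }
    }
  }
}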
Given
the
outline
of
a
document,
but
ignoring
any
sections
created
for
nav
and
aside
elements,
and
any
of
their
descendants,
if
the
only
root
of
the
tree
is
the
body
element
's
section
,
and
it
has
only
a
single
subsection
which
is
created
by
an
article
element,
then
the
heading
of
the
body
element
should
be
assumed
to
be
a
site-wide
heading,
and
the
heading
of
the
article
element
should
be
assumed
to
be
the
page's
heading.
If
a
page
starts
with
a
heading
that
is
common
to
the
whole
site,
the
document
must
be
authored
such
that,
in
the
document's
outline
,
ignoring
any
sections
created
for
nav
and
aside
elements
and
any
of
their
descendants,
the
tree
has
only
one
root
section
,
the
body
element
's
section,
its
heading
is
the
site-wide
heading,
the
body
element
has
just
one
subsection,
that
subsection
is
created
by
an
article
element,
and
that
article
's
heading
is
the
page
heading.
If
a
page
does
not
contain
a
site-wide
heading,
then
the
page
must
be
authored
such
that,
in
the
document's
outline
,
ignoring
any
sections
created
for
nav
and
aside
elements
and
any
of
their
descendants,
either
the
body
element
has
no
subsections,
or
it
has
more
than
one
subsection,
or
it
has
a
single
subsection
but
that
subsection
is
not
created
by
an
article
element,
or
there
is
more
than
one
section
at
the
root
of
the
outline.
Conceptually, a site is thus a document with many articles — when those articles are split into many pages, the heading of the original single page becomes the heading of the site, repeated on every page.
p
element
formatBlock
candidate
.
interface HTMLParagraphElement : HTMLElement {};
The
p
element
represents
a
paragraph
.
The following examples are conforming HTML fragments:
<p>The little kitten gently seated himself on a piece of carpet. Later in his life, this would be referred to as the time the cat sat on the mat.</p>
<fieldset> <legend>Personal information</legend> <p> <label>Name: <input name="n"></label> <label><input name="anon" type="checkbox"> Hide from other users</label> </p> <p><label>Address: <textarea name="a"></textarea></label></p> </fieldset>
<p>There was once an example from Femley,<br> Whose markup was of dubious quality.<br> The validator complained,<br> So the author was pained,<br> To move the error from the markup to the rhyming.</p>
The
p
element
should
not
be
used
when
a
more
specific
element
is
more
appropriate.
The following example is technically correct:
<section> <!-- ... --> <p>Last modified: 2001-04-23</p> <p>Author: fred@example.com</p> </section>
However, it would be better marked-up as:
<section> <!-- ... --> <footer>Last modified: 2001-04-23</footer> <address>Author: fred@example.com</address> </section>
Or:
<section> <!-- ... --> <footer> <p>Last modified: 2001-04-23</p> <address>Author: fred@example.com</address> </footer> </section>
hr
element
interface HTMLHRElement : HTMLElement {};
The
hr
element
represents
a
paragraph
-level
thematic
break,
e.g.
a
scene
change
in
a
story,
or
a
transition
to
another
topic
within
a
section
of
a
reference
book.
br
element
interface HTMLBRElement : HTMLElement {};
The
br
element
represents
a
line
break.
br
elements
must
be
empty.
Any
content
inside
br
elements
must
not
be
considered
part
of
the
surrounding
text.
br
elements
must
be
used
only
for
line
breaks
that
are
actually
part
of
the
content,
as
in
poems
or
addresses.
The
following
example
is
correct
usage
of
the
br
element:
<p>P. Sherman<br> 42 Wallaby Way<br> Sydney</p>
br
elements
must
not
be
used
for
separating
thematic
groups
in
a
paragraph.
The
following
examples
are
non-conforming,
as
they
abuse
the
br
element:
<p><a ...>34 comments.</a><br> <a ...>Add a comment.</a></p>
<p><label>Name: <input name="name"></label><br> <label>Address: <input name="address"></label></p>
Here are alternatives to the above, which are correct:
<p><a ...>34 comments.</a></p> <p><a ...>Add a comment.</a></p>
<p><label>Name: <input name="name"></label></p> <p><label>Address: <input name="address"></label></p>
If
a
paragraph
consists
of
nothing
but
a
single
br
element,
it
represents
a
placeholder
blank
line
(e.g.
as
in
a
template).
Such
blank
lines
must
not
be
used
for
presentation
purposes.
pre
element
formatBlock
candidate
.
interface HTMLPreElement : HTMLElement {};
The
pre
element
represents
a
block
of
preformatted
text,
in
which
structure
is
represented
by
typographic
conventions
rather
than
by
elements.
In the HTML syntax, a leading newline character immediately following the pre element start tag is stripped.
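Cases where the pre element could be used include fragments of computer code, samples of computer output, ASCII art, and poems whose formatting forms an intrinsic part of the work.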
Authors are encouraged to consider how preformatted text will be experienced when the formatting is lost, as will be the case for users of speech synthesizers, braille displays, and the like. For cases like ASCII art, it is likely that an alternative presentation, such as a textual description, would be more universally accessible to the readers of the document.
To
represent
a
block
of
computer
code,
the
pre
element
can
be
used
with
a
code
element;
to
represent
a
block
of
computer
output
the
pre
element
can
be
used
with
a
samp
element.
Similarly,
the
kbd
element
can
be
used
within
a
pre
element
to
indicate
text
that
the
user
is
to
enter.
In the following snippet, a sample of computer code is presented.
<p>This is the <code>Panel</code> constructor:</p>
<pre><code>function Panel(element, canClose, closeHandler) {
this.element = element;
this.canClose = canClose;
this.closeHandler = function () { if (closeHandler) closeHandler() };
}</code></pre>
In
the
following
snippet,
samp
and
kbd
elements
are
mixed
in
the
contents
of
a
pre
element
to
show
a
session
of
Zork
I.
<pre><samp>You are in an open field west of a big white house with a boarded front door. There is a small mailbox here. ></samp> <kbd>open mailbox</kbd> <samp>Opening the mailbox reveals: A leaflet. ></samp></pre>
The following shows a contemporary poem that uses the pre element to preserve its unusual formatting, which forms an intrinsic part of the poem itself.
<pre> maxling
it is with a heart
heavy
that i admit loss of a feline
so loved
a friend lost to the
unknown
(night)
~cdr
11dec07</pre>
dialog element
dt element followed by one dd element.
HTMLElement.
The dialog element represents a conversation, meeting minutes, a chat transcript, a dialog in a screenplay, an instant message log, or some other construct in which different players take turns in discourse.
Each part of the conversation must have an explicit talker (or speaker) given by a dt element, and a discourse (or quote) given by a dd element.
This example demonstrates this using an extract from Abbott and Costello's famous sketch, Who's on first:
<dialog> <dt> Costello <dd> Look, you gotta first baseman? <dt> Abbott <dd> Certainly. <dt> Costello <dd> Who's playing first? <dt> Abbott <dd> That's right. <dt> Costello <dd> When you pay off the first baseman every month, who gets the money? <dt> Abbott <dd> Every dollar of it. </dialog>
Text in a dt element in a dialog element is implicitly the source of the text given in the following dd element, and the contents of the dd element are implicitly a quote from that speaker.
There is thus no need to include cite, q, or blockquote elements in this markup. Indeed, a q element inside a dd element in a conversation would actually imply the people talking were themselves quoting another work. See the cite, q, and blockquote elements for other ways to cite or quote.
blockquote element
formatBlock candidate.
cite
interface HTMLQuoteElement : HTMLElement {
attribute DOMString cite;
};
The HTMLQuoteElement interface is also used by the q element.
The blockquote element represents a section that is quoted from another source.
Content inside a blockquote must be quoted from another source, whose address, if it has one, should be cited in the cite attribute.
If the cite attribute is present, it must be a valid URL. To obtain the corresponding citation link, the value of the attribute must be resolved relative to the element. User agents should allow users to follow such citation links.
The cite DOM attribute must reflect the element's cite content attribute.
The best way to represent a conversation is not with the cite and blockquote elements, but with the dialog element.
This next example shows the use of cite alongside blockquote:
<p>His next piece was the aptly named <cite>Sonnet 130</cite>:</p> <blockquote cite="http://quotes.example.org/s/sonnet130.html"> <p>My mistress' eyes are nothing like the sun,<br> Coral is far more red, than her lips red,<br> ...
ol element
li elements.
reversed
start
interface HTMLOListElement : HTMLElement {
attribute boolean reversed;
attribute long start;
};
The ol element represents a list of items, where the items have been intentionally ordered, such that changing the order would change the meaning of the document.
The items of the list are the li element child nodes of the ol element, in tree order.
The reversed attribute is a boolean attribute. If present, it indicates that the list is a descending list (..., 3, 2, 1). If the attribute is omitted, the list is an ascending list (1, 2, 3, ...).
The start attribute, if present, must be a valid integer giving the ordinal value of the first list item.
If the start attribute is present, user agents must parse it as an integer, in order to determine the attribute's value. The default value, used if the attribute is missing or if the value cannot be converted to a number according to the referenced algorithm, is 1 if the element has no reversed attribute, and is the number of child li elements otherwise.
The first item in the list has the ordinal value given by the ol element's start attribute, unless that li element has a value attribute with a value that can be successfully parsed, in which case it has the ordinal value given by that value attribute.
Each subsequent item in the list has the ordinal value given by its value attribute, if it has one, or, if it doesn't, the ordinal value of the previous item, plus one if the reversed attribute is absent, or minus one if it is present.
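The ordinal-value rules above can be sketched in script. This is a minimal illustration, not a normative algorithm: the function name and the plain-object input are hypothetical, and the attributes are assumed to have been parsed already (null standing for a missing or unparseable attribute).

```javascript
// Sketch of the ordinal-value rules described above. The function name and
// the plain-object input shape are illustrative, not part of any DOM API.
// `start` is the parsed start attribute, or null when missing or unparseable;
// each entry of `itemValues` is the parsed value attribute of one li, or null.
function computeOrdinals({ start = null, reversed = false, itemValues }) {
  const step = reversed ? -1 : 1;
  // Default start: 1 for ascending lists, the number of li children otherwise.
  const first = start !== null ? start : (reversed ? itemValues.length : 1);
  const ordinals = [];
  itemValues.forEach((value, index) => {
    if (value !== null) {
      ordinals.push(value); // an explicit value attribute always wins
    } else if (index === 0) {
      ordinals.push(first); // first item falls back to the start value
    } else {
      ordinals.push(ordinals[index - 1] + step); // previous item plus or minus one
    }
  });
  return ordinals;
}
```

Note that an explicit value on one item also shifts the numbering of every later item that has no value of its own, in either direction.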
The reversed DOM attribute must reflect the value of the reversed content attribute.
The start DOM attribute must reflect the value of the start content attribute.
The following markup shows a list where the order matters, and where the ol element is therefore appropriate. Compare this list to the equivalent list in the ul section to see an example of the same items using the ul element.
<p>I have lived in the following countries (given in the order of when I first lived there):</p> <ol> <li>Switzerland <li>United Kingdom <li>United States <li>Norway </ol>
Note how changing the order of the list changes the meaning of the document. In the following example, changing the relative order of the first two items has changed the birthplace of the author:
<p>I have lived in the following countries (given in the order of when I first lived there):</p> <ol> <li>United Kingdom <li>Switzerland <li>United States <li>Norway </ol>
ul element
li elements.
interface HTMLUListElement : HTMLElement {};
The ul element represents a list of items, where the order of the items is not important; that is, where changing the order would not materially change the meaning of the document.
The items of the list are the li element child nodes of the ul element.
The following markup shows a list where the order does not matter, and where the ul element is therefore appropriate. Compare this list to the equivalent list in the ol section to see an example of the same items using the ol element.
<p>I have lived in the following countries:</p> <ul> <li>Norway <li>Switzerland <li>United Kingdom <li>United States </ul>
Note that changing the order of the list does not change the meaning of the document. The items in the snippet above are given in alphabetical order, but in the snippet below they are given in order of the size of their current account balance in 2007, without changing the meaning of the document whatsoever:
<p>I have lived in the following countries:</p> <ul> <li>Switzerland <li>Norway <li>United Kingdom <li>United States </ul>
li element
ol elements.
ul elements.
menu elements.
ol element: value
interface HTMLLIElement : HTMLElement {
attribute long value;
};
The li element represents a list item. If its parent element is an ol, ul, or menu element, then the element is an item of the parent element's list, as defined for those elements. Otherwise, the list item has no defined list-related relationship to any other li element.
The value attribute, if present, must be a valid integer giving the ordinal value of the list item.
If the value attribute is present, user agents must parse it as an integer, in order to determine the attribute's value. If the attribute's value cannot be converted to a number, the attribute must be treated as if it was absent. The attribute has no default value.
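As a rough illustration of the "treated as if it was absent" behavior, the sketch below parses a value attribute string. The integer-parsing rules themselves are defined elsewhere and are not reproduced in this section, so the accepted syntax here (optional leading whitespace, an optional "-", decimal digits, anything after the digits ignored) is an assumption, and the function name is hypothetical.

```javascript
// Simplified stand-in for the integer-parsing rules referenced above.
// Assumed behavior: skip leading whitespace, accept an optional "-", read
// decimal digits, and ignore whatever follows the digits.
function parseValueAttribute(attributeValue) {
  if (attributeValue === null || attributeValue === undefined) return null;
  const match = /^[ \t\n\f\r]*(-?[0-9]+)/.exec(attributeValue);
  // No digits found: the attribute is treated as if it were absent (null).
  return match ? parseInt(match[1], 10) : null;
}
```

Returning null rather than a number for unparseable input keeps "absent" distinct from any real ordinal, including zero.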
The value attribute is processed relative to the element's parent ol element (q.v.), if there is one. If there is not, the attribute has no effect.
The value DOM attribute must reflect the value of the value content attribute.
In the following example, the top ten movies are listed (in reverse order). Note the way the list is given a title by using a figure element and its legend.
<figure> <legend>The top 10 movies of all time</legend> <ol> <li value="10"><cite>Josie and the Pussycats</cite>, 2001</li> <li value="9"><cite lang="sh">Црна мачка, бели мачор</cite>, 1998</li> <li value="8"><cite>A Bug's Life</cite>, 1998</li> <li value="7"><cite>Toy Story</cite>, 1995</li> <li value="6"><cite>Monsters, Inc</cite>, 2001</li> <li value="5"><cite>Cars</cite>, 2006</li> <li value="4"><cite>Toy Story 2</cite>, 1999</li> <li value="3"><cite>Finding Nemo</cite>, 2003</li> <li value="2"><cite>The Incredibles</cite>, 2004</li> <li value="1"><cite>Ratatouille</cite>, 2007</li> </ol> </figure>
The markup could also be written as follows, using the reversed attribute on the ol element:
<figure> <legend>The top 10 movies of all time</legend> <ol reversed> <li><cite>Josie and the Pussycats</cite>, 2001</li> <li><cite lang="sh">Црна мачка, бели мачор</cite>, 1998</li> <li><cite>A Bug's Life</cite>, 1998</li> <li><cite>Toy Story</cite>, 1995</li> <li><cite>Monsters, Inc</cite>, 2001</li> <li><cite>Cars</cite>, 2006</li> <li><cite>Toy Story 2</cite>, 1999</li> <li><cite>Finding Nemo</cite>, 2003</li> <li><cite>The Incredibles</cite>, 2004</li> <li><cite>Ratatouille</cite>, 2007</li> </ol> </figure>
If the li element is the child of a menu element and itself has a child that defines a command, then the li element will match the :enabled and :disabled pseudo-classes in the same way as the first such child element does.
dl element
dt elements followed by one or more dd elements.
interface HTMLDListElement : HTMLElement {};
The dl element represents an association list consisting of zero or more name-value groups (a description list). Each group must consist of one or more names (dt elements) followed by one or more values (dd elements).
Name-value groups may be terms and definitions, metadata topics and values, or any other groups of name-value data.
The values within a group are alternatives; multiple paragraphs forming part of the same value must all be given within the same dd element.
The order of the list of groups, and of the names and values within each group, may be significant.
If a dl element is empty, it contains no groups.
If a dl element contains non-whitespace text nodes, or elements other than dt and dd, then those elements or text nodes do not form part of any groups in that dl.
If a dl element contains only dt elements, then it consists of one group with names but no values.
If a dl element contains only dd elements, then it consists of one group with values but no names.
If a dl element starts with one or more dd elements, then the first group has no associated name.
If a dl element ends with one or more dt elements, then the last group has no associated value.
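The grouping cases above can be sketched as a single pass over the children. This is an illustration under stated assumptions: the function name is hypothetical, the input is an array of tag names rather than DOM nodes, and children other than dt and dd are simply skipped without ending the current group (the text says only that they belong to no group).

```javascript
// Illustrative sketch: `children` is an array of lowercase tag names standing
// in for the element children of a dl, in tree order. Names other than "dt"
// and "dd" are skipped, since they form part of no group.
function groupDlChildren(children) {
  const groups = [];
  let current = null;
  for (const tag of children) {
    if (tag !== "dt" && tag !== "dd") continue;
    // A new group starts at the first dt/dd, and at any dt that follows a dd.
    const startsGroup =
      current === null || (tag === "dt" && current.values > 0);
    if (startsGroup) {
      current = { names: 0, values: 0 };
      groups.push(current);
    }
    if (tag === "dt") current.names += 1;
    else current.values += 1;
  }
  return groups;
}
```

A group reported with zero names or zero values corresponds to one of the non-conforming cases listed above, which is the kind of result a conformance checker could flag.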
When a dl element doesn't match its content model, it is often due to accidentally using dd elements in the place of dt elements and vice versa. Conformance checkers can spot such mistakes and might be able to advise authors how to correctly use the markup.
In the following example, one entry ("Authors") is linked to two values ("John" and "Luke").
<dl> <dt> Authors <dd> John <dd> Luke <dt> Editor <dd> Frank </dl>
In the following example, one definition is linked to two terms.
<dl> <dt lang="en-US"> <dfn>color</dfn> </dt> <dt lang="en-GB"> <dfn>colour</dfn> </dt> <dd> A sensation which (in humans) derives from the ability of the fine structure of the eye to distinguish three differently filtered analyses of a view. </dd> </dl>
The following example illustrates the use of the dl element to mark up metadata of sorts. At the end of the example, one group has two metadata labels ("Authors" and "Editors") and two values ("Robert Rothman" and "Daniel Jackson").
<dl> <dt> Last modified time </dt> <dd> 2004-12-23T23:33Z </dd> <dt> Recommended update interval </dt> <dd> 60s </dd> <dt> Authors </dt> <dt> Editors </dt> <dd> Robert Rothman </dd> <dd> Daniel Jackson </dd> </dl>
The following example shows the dl element used to give a set of instructions. The order of the instructions here is important (in the other examples, the order of the blocks was not important).
<p>Determine the victory points as follows (use the first matching case):</p> <dl> <dt> If you have exactly five gold coins </dt> <dd> You get five victory points </dd> <dt> If you have one or more gold coins, and you have one or more silver coins </dt> <dd> You get two victory points </dd> <dt> If you have one or more silver coins </dt> <dd> You get one victory point </dd> <dt> Otherwise </dt> <dd> You get no victory points </dd> </dl>
The following snippet shows a dl element being used as a glossary. Note the use of dfn to indicate the word being defined.
<dl> <dt><dfn>Apartment</dfn>, n.</dt> <dd>An execution context grouping one or more threads with one or more COM objects.</dd> <dt><dfn>Flat</dfn>, n.</dt> <dd>A deflated tire.</dd> <dt><dfn>Home</dfn>, n.</dt> <dd>The user's login directory.</dd> </dl>
The dl element is inappropriate for marking up dialogue. For an example of how to mark up dialogue, see the dialog element.
dt element
dd or dt elements inside dl elements.
dd element inside a dialog element.
HTMLElement.
The dt element represents the term, or name, part of a term-description group in a description list (dl element), and the talker, or speaker, part of a talker-discourse pair in a conversation (dialog element).
The dt element itself, when used in a dl element, does not indicate that its contents are a term being defined, but this can be indicated using the dfn element.
If the dt element is the child of a dialog element, and it further contains a time element, then that time element represents a timestamp for when the associated discourse (dd element) was said, and is not part of the name of the talker.
The following extract shows how an IM conversation log could be marked up.
<dialog> <dt> <time>14:22</time> egof <dd> I'm not that nerdy, I've only seen 30% of the star trek episodes <dt> <time>14:23</time> kaj <dd> if you know what percentage of the star trek episodes you have seen, you are inarguably nerdy <dt> <time>14:23</time> egof <dd> it's unarguably <dt> <time>14:24</time> kaj <dd> you are not helping your case </dialog>
dd element
dt or dd elements inside dl elements.
dt element inside a dialog element.
HTMLElement.
The dd element represents the description, definition, or value, part of a term-description group in a description list (dl element), and the discourse, or quote, part in a conversation (dialog element).
A dl can be used to define a vocabulary list, like in a dictionary. In the following example, each entry, given by a dt with a dfn, has several dds, showing the various parts of the definition.
<dl> <dt><dfn>happiness</dfn></dt> <dd class="pronunciation">/'hæ p. nes/</dd> <dd class="part-of-speech"><i><abbr>n.</abbr></i></dd> <dd>The state of being happy.</dd> <dd>Good fortune; success. <q>Oh <b>happiness</b>! It worked!</q></dd> <dt><dfn>rejoice</dfn></dt> <dd class="pronunciation">/ri jois'/</dd> <dd><i class="part-of-speech"><abbr>v.intr.</abbr></i> To be delighted oneself.</dd> <dd><i class="part-of-speech"><abbr>v.tr.</abbr></i> To cause one to be delighted.</dd> </dl>
This specification does not define any markup specifically for marking up lists of keywords that apply to a group of pages (also known as tag clouds). In general, authors are encouraged to either mark up such lists using ul elements with explicit inline counts that are then hidden and turned into a presentational effect using a style sheet, or to use SVG.
Here, three tags are included in a short tag cloud:
<style>
@media screen, print, handheld, tv {
/* should be ignored by non-visual browsers */
.tag-cloud > li > span { display: none; }
.tag-cloud > li { display: inline; }
.tag-cloud-1 { font-size: 0.7em; }
.tag-cloud-2 { font-size: 0.9em; }
.tag-cloud-3 { font-size: 1.1em; }
.tag-cloud-4 { font-size: 1.3em; }
.tag-cloud-5 { font-size: 1.5em; }
}
</style>
...
<ul class="tag-cloud">
<li class="tag-cloud-4"><a title="28 instances" href="/t/apple">apple</a> <span>(popular)</span>
<li class="tag-cloud-2"><a title="6 instances" href="/t/kiwi">kiwi</a> <span>(rare)</span>
<li class="tag-cloud-5"><a title="41 instances" href="/t/pear">pear</a> <span>(very popular)</span>
</ul>
The actual frequency of each tag is given using the title attribute. A CSS style sheet is provided to convert the markup into a cloud of differently-sized words, but for user agents that do not support CSS or are not visual, the markup contains annotations like "(popular)" or "(rare)" to categorize the various tags by frequency, thus enabling all users to benefit from the information.
The ul element is used (rather than ol) because the order is not particularly important: while the list is in fact ordered alphabetically, it would convey the same information if ordered by, say, the length of the tag.
The tag rel-keyword is not used on these a elements because they do not represent tags that apply to the page itself; they are just part of an index listing the tags themselves.
a element
href
target
ping
rel
media
hreflang
type
interface HTMLAnchorElement : HTMLElement {
stringifier attribute DOMString href;
attribute DOMString target;
attribute DOMString ping;
attribute DOMString rel;
readonly attribute DOMTokenList relList;
attribute DOMString media;
attribute DOMString hreflang;
attribute DOMString type;
// URL decomposition attributes
attribute DOMString protocol;
attribute DOMString host;
attribute DOMString hostname;
attribute DOMString port;
attribute DOMString pathname;
attribute DOMString search;
attribute DOMString hash;
};
If the a element has an href attribute, then it represents a hyperlink (a hypertext anchor).
If the a element has no href attribute, then the element represents a placeholder for where a link might otherwise have been placed, if it had been relevant.
The target, ping, rel, media, hreflang, and type attributes must be omitted if the href attribute is not present.
If a site uses a consistent navigation tool bar on every page, then the link that would normally link to the page itself could be marked up using an a element:
<nav> <ul> <li> <a href="/">Home</a> </li> <li> <a href="/news">News</a> </li> <li> <a>Examples</a> </li> <li> <a href="/legal">Legal</a> </li> </ul> </nav>
Interactive user agents should allow users to follow hyperlinks created using the a element. The href, target and ping attributes decide how the link is followed. The rel, media, hreflang, and type attributes may be used to indicate to the user the likely nature of the target resource before the user follows the link.
The activation behavior of a elements that represent hyperlinks is to run the following steps:
If the DOMActivate event in question is not trusted (i.e. a click() method call was the reason for the event being dispatched), and the a element's target attribute is such that applying the rules for choosing a browsing context given a browsing context name, using the value of the target attribute as the browsing context name, would result in there not being a chosen browsing context, then raise an INVALID_ACCESS_ERR exception and abort these steps.
If the target of the click event is an img element with an ismap attribute specified, then server-side image map processing must be performed, as follows:
If the DOMActivate event was dispatched as the result of a real pointing-device-triggered click event on the img element, then let x be the distance in CSS pixels from the left edge of the image's left border, if it has one, or the left edge of the image otherwise, to the location of the click, and let y be the distance in CSS pixels from the top edge of the image's top border, if it has one, or the top edge of the image otherwise, to the location of the click. Otherwise, let x and y be zero.
Finally, the user agent must follow the hyperlink defined by the a element. If the steps above defined a hyperlink suffix, then take that into account when following the hyperlink.
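The steps above compute x and y and then refer to a hyperlink suffix, but the text here does not show how the suffix is built from the two values. The sketch below assumes the conventional server-side image map form (a question mark, x, a comma, then y) and rounds to whole pixels; both the format and the rounding are assumptions, as are the function name and the input shapes.

```javascript
// Hypothetical helper; the "?x,y" suffix format is an assumption based on
// conventional server-side image maps, not something stated in the text above.
function imageMapSuffix(click, imageBox) {
  // click: { x, y } location of a real pointing-device click in CSS pixels,
  //        or null when the activation was not caused by such a click.
  // imageBox: { left, top } of the image's border edge, or of the image
  //           itself when it has no border.
  const x = click ? Math.round(click.x - imageBox.left) : 0;
  const y = click ? Math.round(click.y - imageBox.top) : 0;
  return `?${x},${y}`;
}
```

Under these assumptions, a synthetic activation (for example one caused by a click() call) always yields a suffix with both coordinates at zero.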
The DOM attributes href, ping, target, rel, media, hreflang, and type, must reflect the respective content attributes of the same name.
The DOM attribute relList must reflect the rel content attribute.
The a element also supports the complement of URL decomposition attributes, protocol, host, port, hostname, pathname, search, and hash. These must follow the rules given for URL decomposition attributes, with the input being the result of resolving the element's href attribute relative to the element, if there is such an attribute and resolving it is successful, or the empty string otherwise; and the common setter action being the same as setting the element's href attribute to the new output value.
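To give a feel for what the decomposition attributes contain, the sketch below uses the standard URL class found in current browsers and Node.js as a stand-in. It is not the specification's own algorithm, and the helper name and base address are made up for the illustration.

```javascript
// Uses the standard URL class as a stand-in for the URL decomposition rules;
// the helper name and the example base address are hypothetical.
function decompose(href, base) {
  let url;
  try {
    url = new URL(href, base); // resolve href relative to the base address
  } catch (e) {
    return null; // resolution failed
  }
  const { protocol, host, hostname, port, pathname, search, hash } = url;
  return { protocol, host, hostname, port, pathname, search, hash };
}

// A relative href resolved against an assumed document address.
const parts = decompose("/news?page=2#latest", "http://example.com:8080/a/b");
```

Here parts.host carries the port ("example.com:8080") while parts.hostname does not ("example.com"), which is the usual source of confusion between the two.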
The a element may be wrapped around entire paragraphs, lists, tables, and so forth, even entire sections, so long as there is no interactive content within (e.g. buttons or other links). This example shows how this can be used to make an entire advertising block into a link:
<aside class="advertising"> <h1>Advertising</h1> <a href="http://ad.example.com/?adid=1929&pubid=1422"> <section> <h1>Mellblomatic 9000!</h1> <p>Turn all your widgets into mellbloms!</p> <p>Only $9.99 plus shipping and handling.</p> </section> </a> <a href="http://ad.example.com/?adid=375&pubid=1422"> <section> <h1>The Mellblom Browser</h1> <p>Web browsing at the speed of light.</p> <p>No other browser goes faster!</p> </section> </a> </aside>
q element
cite
The q element uses the HTMLQuoteElement interface.
The q element represents some phrasing content quoted from another source.
Quotation punctuation (such as quotation marks) must not appear immediately before, after, or inside q elements; they will be inserted into the rendering by the user agent.
Content inside a q element must be quoted from another source, whose address, if it has one, should be cited in the cite attribute. The source may be fictional, as when quoting characters in a novel or screenplay.
If the cite attribute is present, it must be a valid URL. To obtain the corresponding citation link, the value of the attribute must be resolved relative to the element. User agents should allow users to follow such citation links.
The q element must not be used in place of quotation marks that do not represent quotes; for example, it is inappropriate to use the q element for marking up sarcastic statements.
The use of q elements to mark up quotations is entirely optional; using explicit quotation punctuation without q elements is just as correct.
Here is a simple example of the use of the q element:
<p>The man said <q>Things that are impossible just take longer</q>. I disagreed with him.</p>
Here is an example with both an explicit citation link in the q element, and an explicit citation outside:
<p>The W3C page <cite>About W3C</cite> says the W3C's mission is <q cite="http://www.w3.org/Consortium/">To lead the World Wide Web to its full potential by developing protocols and guidelines that ensure long-term growth for the Web</q>. I disagree with this mission.</p>
In the following example, the quotation itself contains a quotation:
<p>In <cite>Example One</cite>, he writes <q>The man said <q>Things that are impossible just take longer</q>. I disagreed with him</q>. Well, I disagree even more!</p>
In the following example, quotation marks are used instead of the q element:
<p>His best argument was ❝I disagree❞, which I thought was laughable.</p>
In the following example, there is no quote; the quotation marks are used to name a word. Use of the q element in this case would be inappropriate.
<p>The word "ineffable" could have been used to describe the disaster resulting from the campaign's mismanagement.</p>
cite element
HTMLElement.
The cite element represents the title of a work (e.g. a book, a paper, an essay, a poem, a score, a song, a script, a film, a TV show, a game, a sculpture, a painting, a theatre production, a play, an opera, a musical, an exhibition, etc). This can be a work that is being quoted or referenced in detail (i.e. a citation), or it can just be a work that is mentioned in passing.
A person's name is not the title of a work, even if people call that person a piece of work, and the element must therefore not be used to mark up people's names. (In some cases, the b element might be appropriate for names; e.g. in a gossip article where the names of famous people are keywords rendered with a different style to draw attention to them. In other cases, if an element is really needed, the span element can be used.)
A ship is similarly not a work, and the element must not be used to mark up ship names (the i element can be used for that purpose).
This next example shows a typical use of the cite element:
<p>My favorite book is <cite>The Reality Dysfunction</cite> by Peter F. Hamilton. My favorite comic is <cite>Pearls Before Swine</cite> by Stephan Pastis. My favorite track is <cite>Jive Samba</cite> by the Cannonball Adderley Sextet.</p>
This is correct usage:
<p>According to the Wikipedia article <cite>HTML</cite>, as it stood in mid-February 2008, leaving attribute values unquoted is unsafe. This is obviously an over-simplification.</p>
The following, however, is incorrect usage, as the cite element here is containing far more than the title of the work:
<!-- do not copy this example, it is an example of bad usage! --> <p>According to <cite>the Wikipedia article on HTML</cite>, as it stood in mid-February 2008, leaving attribute values unquoted is unsafe. This is obviously an over-simplification.</p>
The cite element is obviously a key part of any citation in a bibliography, but it is only used to mark the title:
<p><cite>Universal Declaration of Human Rights</cite>, United Nations, December 1948. Adopted by General Assembly resolution 217 A (III).</p>
A citation is not a quote (for which the q element is appropriate).
This is incorrect usage, because cite is not for quotes:
<p><cite>This is wrong!</cite>, said Ian.</p>
This is also incorrect usage, because a person is not a work:
<p><q>This is still wrong!</q>, said <cite>Ian</cite>.</p>
The correct usage does not use a cite element:
<p><q>This is correct</q>, said Ian.</p>
As mentioned above, the b element might be relevant for marking names as being keywords in certain kinds of documents:
<p>And then <b>Ian</b> said <q>this might be right, in a gossip column, maybe!</q>.</p>
em element
HTMLElement.
The em element represents stress emphasis of its contents.
The level of emphasis that a particular piece of content has is given by its number of ancestor em elements.
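Counting ancestor em elements can be sketched as a walk up the tree. The example is illustrative only: it uses plain objects with a parent reference instead of real DOM nodes, and the function name is hypothetical.

```javascript
// Illustrative only: nodes are plain objects of the form
// { tag: "em", parent: <node or null> } rather than real DOM nodes.
function emphasisLevel(node) {
  let level = 0;
  // Walk up the ancestor chain, counting every em element on the way.
  for (let ancestor = node.parent; ancestor; ancestor = ancestor.parent) {
    if (ancestor.tag === "em") level += 1;
  }
  return level;
}

// Models markup such as <p><em>Cats are <em>cute</em> animals!</em></p>
const p = { tag: "p", parent: null };
const outerEm = { tag: "em", parent: p };
const innerEm = { tag: "em", parent: outerEm };
const cute = { tag: "#text", parent: innerEm };
const cats = { tag: "#text", parent: outerEm };
```

In this model the text "cute" has two em ancestors and the text "Cats are" has one, so the nested word carries the higher level of emphasis.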
The placement of emphasis changes the meaning of the sentence. The element thus forms an integral part of the content. The precise way in which emphasis is used in this way depends on the language.
These examples show how changing the emphasis changes the meaning. First, a general statement of fact, with no emphasis:
<p>Cats are cute animals.</p>
By emphasizing the first word, the statement implies that the kind of animal under discussion is in question (maybe someone is asserting that dogs are cute):
<p><em>Cats</em> are cute animals.</p>
Moving the emphasis to the verb, one highlights that the truth of the entire sentence is in question (maybe someone is saying cats are not cute):
<p>Cats <em>are</em> cute animals.</p>
By moving it to the adjective, the exact nature of the cats is reasserted (maybe someone suggested cats were mean animals):
<p>Cats are <em>cute</em> animals.</p>
Similarly, if someone asserted that cats were vegetables, someone correcting this might emphasize the last word:
<p>Cats are cute <em>animals</em>.</p>
By emphasizing the entire sentence, it becomes clear that the speaker is fighting hard to get the point across. This kind of emphasis also typically affects the punctuation, hence the exclamation mark here.
<p><em>Cats are cute animals!</em></p>
Anger mixed with emphasizing the cuteness could lead to markup such as:
<p><em>Cats are <em>cute</em> animals!</em></p>
The em element isn't a generic "italics" element. Sometimes, text is intended to stand out from the rest of the paragraph, as if it was in a different mood or voice. For this, the i element is more appropriate.
The em element also isn't intended to convey importance; for that purpose, the strong element is more appropriate.
strong element
HTMLElement.
The strong element represents strong importance for its contents.
The relative level of importance of a piece of content is given by its number of ancestor strong elements; each strong element increases the importance of its contents.
Changing the importance of a piece of text with the strong element does not change the meaning of the sentence.
Here is an example of a warning notice in a game, with the various parts marked up according to how important they are:
<p><strong>Warning.</strong> This dungeon is dangerous. <strong>Avoid the ducks.</strong> Take any gold you find. <strong><strong>Do not take any of the diamonds</strong>, they are explosive and <strong>will destroy anything within ten meters.</strong></strong> You have been warned.</p>
small element
HTMLElement.
The small element represents small print or other side comments.
Small print typically features disclaimers, caveats, legal restrictions, or copyrights. Small print is also sometimes used for attribution, or for satisfying licensing requirements.
The small element does not "de-emphasize" or lower the importance of text emphasized by the em element or marked as important with the strong element.
In this example the footer contains contact information and a copyright notice.
<footer> <address> For more details, contact <a href="mailto:js@example.com">John Smith</a>. </address> <p><small>© copyright 2038 Example Corp.</small></p> </footer>
In this second example, the small element is used for a side comment in an article.
<p>Example Corp today announced record profits for the second quarter <small>(Full Disclosure: Foo News is a subsidiary of Example Corp)</small>, leading to speculation about a third quarter merger with Demo Group.</p>
This is distinct from a sidebar, which might be multiple paragraphs long and is removed from the main flow of text. In the following example, we see a sidebar from the same article. This sidebar also has small print, indicating the source of the information in the sidebar.
<aside> <h1>Example Corp</h1> <p>This company mostly creates small software and Web sites.</p> <p>The Example Corp company mission is "To provide entertainment and news on a sample basis".</p> <p><small>Information obtained from <a href="http://example.com/about.html">example.com</a> home page.</small></p> </aside>
In this last example, the small element is marked as being important small print.
<p><strong><small>Continued use of this service will result in a kiss.</small></strong></p>
mark element
HTMLElement.
The mark element represents a run of text in one document marked or highlighted for reference purposes, due to its relevance in another context. When used in a quotation or other block of text referred to from the prose, it indicates a highlight that was not originally present but which has been added to bring the reader's attention to a part of the text that might not have been considered important by the original author when the block was originally written, but which is now under previously unexpected scrutiny. When used in the main prose of a document, it indicates a part of the document that has been highlighted due to its likely relevance to the user's current activity.
This example shows how the mark element can be used to bring attention to a particular part of a quotation:
<p lang="en-US">Consider the following quote:</p> <blockquote lang="en-GB"> <p>Look around and you will find, no-one's really <mark>colour</mark> blind.</p> </blockquote> <p lang="en-US">As we can tell from the <em>spelling</em> of the word, the person writing this quote is clearly not American.</p>
Another example of the mark element is highlighting parts of a document that are matching some search string. If someone looked at a document, and the server knew that the user was searching for the word "kitten", then the server might return the document with one paragraph modified as follows:
<p>I also have some <mark>kitten</mark>s who are visiting me these days. They're really cute. I think they like my garden! Maybe I should adopt a <mark>kitten</mark>.</p>
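One way a server might produce markup like the paragraph above is sketched below. The function names are hypothetical, the input is assumed to be plain text (so it is escaped before any markup is added), and matching is case-insensitive by choice rather than by any requirement stated here.

```javascript
// Hypothetical server-side helper: wraps each occurrence of `term` in a mark
// element. The input is treated as plain text, so it is HTML-escaped first.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function highlightTerm(plainText, term) {
  if (term === "") return escapeHtml(plainText);
  // Escape regular-expression metacharacters so the term matches literally.
  const pattern = new RegExp(term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"), "gi");
  let result = "";
  let lastIndex = 0;
  for (const match of plainText.matchAll(pattern)) {
    // Copy the text before the match, then the match wrapped in mark.
    result += escapeHtml(plainText.slice(lastIndex, match.index));
    result += `<mark>${escapeHtml(match[0])}</mark>`;
    lastIndex = match.index + match[0].length;
  }
  return result + escapeHtml(plainText.slice(lastIndex));
}
```

Because only the matched characters are wrapped, a plural such as "kittens" comes out with the trailing "s" outside the mark element, as in the paragraph above.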
In the following snippet, a paragraph of text refers to a specific part of a code fragment.
<p>The highlighted part below is where the error lies:</p> <pre><code>var i: Integer; begin i := <mark>1.1</mark>; end.</code></pre>
This is another example showing the use of mark to highlight a part of quoted text that was originally not emphasized. In this example, common typographic conventions have led the author to explicitly style mark elements in quotes to render in italics.
<article>
<style>
blockquote mark, q mark {
font: inherit; font-style: italic;
text-decoration: none;
background: transparent; color: inherit;
}
.bubble em {
font: inherit; font-size: larger;
text-decoration: underline;
}
</style>
<h1>She knew</h1>
<p>Did you notice the subtle joke in the joke on panel 4?</p>
<blockquote>
<p class="bubble">I didn't <em>want</em> to believe. <mark>Of course
on some level I realized it was a known-plaintext attack.</mark> But I
couldn't admit it until I saw for myself.</p>
</blockquote>
<p>(Emphasis mine.) I thought that was great. It's so pedantic, yet it
explains everything neatly.</p>
</article>
Note, incidentally, the distinction between the em element in this example, which is part of the original text being quoted, and the mark element, which is highlighting a part for comment.
The following example shows the difference between denoting the importance of a span of text (strong) as opposed to denoting the relevance of a span of text (mark). It is an extract from a textbook, where the extract has had the parts relevant to the exam highlighted. The safety warnings, important though they may be, are apparently not relevant to the exam.
<h3>Wormhole Physics Introduction</h3> <p><mark>A wormhole in normal conditions can be held open for a maximum of just under 39 minutes.</mark> Conditions that can increase the time include a powerful energy source coupled to one or both of the gates connecting the wormhole, and a large gravity well (such as a black hole).</p> <p><mark>Momentum is preserved across the wormhole. Electromagnetic radiation can travel in both directions through a wormhole, but matter cannot.</mark></p> <p>When a wormhole is created, a vortex normally forms. <strong>Warning: The vortex caused by the wormhole opening will annihilate anything in its path.</strong> Vortexes can be avoided when using sufficiently advanced dialing technology.</p> <p><mark>An obstruction in a gate will prevent it from accepting a wormhole connection.</mark></p>
dfn element
dfn elements.
title attribute has special semantics on this element.
HTMLElement.
The dfn element represents the defining instance of a term. The paragraph, description list group, or section that is the nearest ancestor of the dfn element must also contain the definition(s) for the term given by the dfn element.
Defining term: If the dfn element has a title attribute, then the exact value of that attribute is the term being defined. Otherwise, if it contains exactly one element child node and no child text nodes, and that child element is an abbr element with a title attribute, then the exact value of that attribute is the term being defined. Otherwise, it is the exact textContent of the dfn element that gives the term being defined.
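The three cases above can be sketched as a small function. The name definingTerm is invented for illustration, and the argument is assumed to be a DOM element (or any object exposing the same hasAttribute, getAttribute, childNodes, and textContent members).

```javascript
// Sketch only: definingTerm is a hypothetical helper, not a DOM API.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

function definingTerm(dfn) {
  // Case 1: a title attribute on the dfn element itself wins.
  if (dfn.hasAttribute("title")) return dfn.getAttribute("title");

  // Case 2: exactly one element child, no child text nodes, and that child
  // is an abbr element carrying a title attribute.
  const children = Array.from(dfn.childNodes);
  const elements = children.filter((n) => n.nodeType === ELEMENT_NODE);
  const hasText = children.some((n) => n.nodeType === TEXT_NODE);
  if (elements.length === 1 && !hasText) {
    const only = elements[0];
    if (only.localName === "abbr" && only.hasAttribute("title")) {
      return only.getAttribute("title");
    }
  }

  // Case 3: fall back to the exact text content.
  return dfn.textContent;
}
```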
If the title attribute of the dfn element is present, then it must contain only the term being defined.
The title attribute of ancestor elements does not affect dfn elements.
An a element that links to a dfn element represents an instance of the term defined by the dfn element.
In the following fragment, the term "GDO" is first defined in the first paragraph, then used in the second.
<p>The <dfn><abbr title="Garage Door Opener">GDO</abbr></dfn> is a device that allows off-world teams to open the iris.</p> <!-- ... later in the document: --> <p>Teal'c activated his <abbr title="Garage Door Opener">GDO</abbr> and so Hammond ordered the iris to be opened.</p>
With the addition of an a element, the reference can be made explicit:
<p>The <dfn id=gdo><abbr title="Garage Door Opener">GDO</abbr></dfn> is a device that allows off-world teams to open the iris.</p> <!-- ... later in the document: --> <p>Teal'c activated his <a href=#gdo><abbr title="Garage Door Opener">GDO</abbr></a> and so Hammond ordered the iris to be opened.</p>
abbr element
title attribute has special semantics on this element.
HTMLElement.
The abbr element represents an abbreviation or acronym, optionally with its expansion. The title attribute may be used to provide an expansion of the abbreviation. The attribute, if specified, must contain an expansion of the abbreviation, and nothing else.
The paragraph below contains an abbreviation marked up with the abbr element. This paragraph defines the term "Web Hypertext Application Technology Working Group".
<p>The <dfn id=whatwg><abbr title="Web Hypertext Application Technology Working Group">WHATWG</abbr></dfn> is a loose unofficial collaboration of Web browser manufacturers and interested parties who wish to develop new technologies designed to allow authors to write and deploy Applications over the World Wide Web.</p>
An alternative way to write this would be:
<p>The <dfn id=whatwg>Web Hypertext Application Technology Working Group</dfn> (<abbr title="Web Hypertext Application Technology Working Group">WHATWG</abbr>) is a loose unofficial collaboration of Web browser manufacturers and interested parties who wish to develop new technologies designed to allow authors to write and deploy Applications over the World Wide Web.</p>
This paragraph has two abbreviations. Notice how only one is defined; the other, with no expansion associated with it, does not use the abbr element.
<p>The <abbr title="Web Hypertext Application Technology Working Group">WHATWG</abbr> started working on HTML 5 in 2004.</p>
This paragraph links an abbreviation to its definition.
<p>The <a href="#whatwg"><abbr title="Web Hypertext Application Technology Working Group">WHATWG</abbr></a> community does not have much representation from Asia.</p>
This paragraph marks up an abbreviation without giving an expansion, possibly as a hook to apply styles for abbreviations (e.g. smallcaps).
<p>Philip` and Dashiva both denied that they were going to get the issue counts from past revisions of the specification to backfill the <abbr>WHATWG</abbr> issue graph.</p>
If an abbreviation is pluralized, the expansion's grammatical number (plural vs singular) must match the grammatical number of the contents of the element.
Here the plural is outside the element, so the expansion is in the singular:
<p>Two <abbr title="Working Group">WG</abbr>s worked on this specification: the <abbr>WHATWG</abbr> and the <abbr>HTMLWG</abbr>.</p>
Here the plural is inside the element, so the expansion is in the plural:
<p>Two <abbr title="Working Groups">WGs</abbr> worked on this specification: the <abbr>WHATWG</abbr> and the <abbr>HTMLWG</abbr>.</p>
Abbreviations do not have to be marked up using this element. It is expected to be useful in the following cases:
abbr element with a title attribute is an alternative to including the expansion inline (e.g. in parentheses).
abbr element with a title attribute or include the expansion inline in the text the first time the abbreviation is used.
abbr element can be used without a title attribute.
Providing an expansion in a title attribute once will not necessarily cause other abbr elements in the same document with the same contents but without a title attribute to behave as if they had the same expansion. Every abbr element is independent.
time element
datetime
interface HTMLTimeElement : HTMLElement {
  attribute DOMString dateTime;
  readonly attribute Date date;
  readonly attribute Date time;
  readonly attribute Date timezone;
};
The time element represents a precise date and/or a time in the proleptic Gregorian calendar. [GREGORIAN]
This element is intended as a way to encode modern dates and times in a machine-readable way so that user agents can offer to add them to the user's calendar. For example, adding birthday reminders or scheduling events.
The time element is not intended for encoding times for which a precise date or time cannot be established. For example, it would be inappropriate for encoding times like "one millisecond after the big bang", "the early part of the Jurassic period", or "a winter around 250 BCE".
For dates before the introduction of the Gregorian calendar, authors are encouraged to not use the time element, or else to be very careful about converting dates and times from the period to the Gregorian calendar. This is complicated by the manner in which the Gregorian calendar was phased in, which occurred at different times in different countries, ranging from partway through the 16th century all the way to early in the 20th.
The datetime attribute, if present, must contain a valid date or time string that identifies the date or time being specified.
If the datetime attribute is not present, then the date or time must be specified in the content of the element, such that the element's textContent is a valid date or time string in content, and the date, if any, must be expressed using the Gregorian calendar.
If the datetime attribute is present, then the element may be empty, in which case the user agent should convey the attribute's value to the user when rendering the element.
The time element can be used to encode dates, for example in Microformats. The following shows a hypothetical way of encoding an event using a variant on hCalendar that uses the time element:
<div class="vevent"> <a class="url" href="http://www.web2con.com/">http://www.web2con.com/</a> <span class="summary">Web 2.0 Conference</span>: <time class="dtstart" datetime="2007-10-05">October 5</time> - <time class="dtend" datetime="2007-10-20">19</time>, at the <span class="location">Argent Hotel, San Francisco, CA</span> </div>
The time element is not necessary for encoding dates or times. In the following snippet, the time is encoded using time, so that it can be restyled (e.g. using XBL2) to match local conventions, while the year is not marked up at all, since marking it up would not be particularly useful.
<p>I usually have a snack at <time>16:00</time>.</p> <p>I've liked model trains since at least 1983.</p>
Using a styling technology that supports restyling times, the first paragraph from the above snippet could be rendered as follows:
I usually have a snack at 4pm.
Or it could be rendered as follows:
I usually have a snack at 16h00.
The dateTime DOM attribute must reflect the datetime content attribute.
User agents, to obtain the date, time, and time zone represented by a time element, must follow these steps:
If the datetime attribute is present, then use the rules to parse a date or time string with the flag in attribute from the value of that attribute, and let the result be result.
textContent, and let the result be result.
date
Returns a Date object representing the date component of the element's value, at midnight in the UTC time zone.
Returns null if there is no date.
time
Returns a Date object representing the time component of the element's value, on 1970-01-01 in the UTC time zone.
Returns null if there is no time.
timezone
Returns a Date object representing the time corresponding to 1970-01-01 00:00 UTC in the time zone given by the element's value.
Returns null if there is no time zone.
The date DOM attribute must return null if the date is unknown, and otherwise must return the time corresponding to midnight UTC (i.e. the first second) of the given date.
The time DOM attribute must return null if the time is unknown, and otherwise must return the time corresponding to the given time of 1970-01-01, with the time zone UTC.
The timezone DOM attribute must return null if the time zone is unknown, and otherwise must return the time corresponding to 1970-01-01 00:00 UTC in the given time zone, with the time zone set to UTC (i.e. the time corresponding to 1970-01-01 at 00:00 UTC plus the offset corresponding to the time zone).
In the following snippet:
<p>Our first date was <time datetime="2006-09-23">a Saturday</time>.</p>
...the time element's date attribute would have the value 1,158,969,600,000ms, and the time and timezone attributes would return null.
In the following snippet:
<p>We stopped talking at <time datetime="2006-09-24T05:00-07:00">5am the next morning</time>.</p>
...the time element's date attribute would have the value 1,159,056,000,000ms, the time attribute would have the value 18,000,000ms, and the timezone attribute would return −25,200,000ms. To obtain the actual time, the three attributes can be added together, obtaining 1,159,048,800,000, which is the specified date and time in UTC.
Finally, in the following snippet:
<p>Many people get up at <time>08:00</time>.</p>
...the time element's date attribute would have the value null, the time attribute would have the value 28,800,000ms, and the timezone attribute would return null.
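The millisecond figures quoted in these three examples can be reproduced with a short script. The sketch below is an illustration only: timeComponents is an invented name, it accepts just the three string forms used in the examples rather than every valid date or time string, and it returns plain millisecond counts where the DOM attributes would return Date objects.

```javascript
// Hypothetical helper; covers only YYYY-MM-DD, HH:MM, and
// YYYY-MM-DDTHH:MM followed by Z or a +HH:MM / -HH:MM offset.
const DATE_ONLY = /^(\d{4})-(\d{2})-(\d{2})$/;
const TIME_ONLY = /^(\d{2}):(\d{2})$/;
const FULL = /^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2})(Z|[+-]\d{2}:\d{2})$/;

function timeComponents(value) {
  const unknown = { date: null, time: null, timezone: null };
  // Midnight UTC of the given date, in milliseconds since 1970-01-01.
  const dateMs = (y, mo, d) => Date.UTC(Number(y), Number(mo) - 1, Number(d));
  // The given time of day on 1970-01-01 UTC, in milliseconds.
  const timeMs = (h, mi) => (Number(h) * 60 + Number(mi)) * 60000;

  let m;
  if ((m = DATE_ONLY.exec(value))) {
    return { ...unknown, date: dateMs(m[1], m[2], m[3]) };
  }
  if ((m = TIME_ONLY.exec(value))) {
    return { ...unknown, time: timeMs(m[1], m[2]) };
  }
  if ((m = FULL.exec(value))) {
    const zone = m[6];
    const sign = zone[0] === "-" ? -1 : 1;
    // The signed time zone offset, in milliseconds.
    const offset = zone === "Z"
      ? 0
      : sign * timeMs(zone.slice(1, 3), zone.slice(4, 6));
    return {
      date: dateMs(m[1], m[2], m[3]),
      time: timeMs(m[4], m[5]),
      timezone: offset,
    };
  }
  return unknown; // parsing failed: everything is unknown
}
```

For the second snippet, adding the three returned numbers gives the combined figure quoted in the text.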
progress element
value
max
interface HTMLProgressElement : HTMLElement {
  attribute float value;
  attribute float max;
  readonly attribute float position;
};
The progress element represents the completion progress of a task. The progress is either indeterminate, indicating that progress is being made but that it is not clear how much more work remains to be done before the task is complete (e.g. because the task is waiting for a remote host to respond), or the progress is a number in the range zero to a maximum, giving the fraction of work that has so far been completed.
There are two attributes that determine the current task completion represented by the element. The value attribute specifies how much of the task has been completed, and the max attribute specifies how much work the task requires in total. The units are arbitrary and not specified.
Instead of using the attributes, authors are recommended to include the current value and the maximum value inline as text inside the element.
Here is a snippet of a Web application that shows the progress of some automated task:
<section>
 <h2>Task Progress</h2>
 <p>Progress: <progress><span id="p">0</span>%</progress></p>
 <script>
  var progressBar = document.getElementById('p');
  function updateProgress(newValue) {
    progressBar.textContent = newValue;
  }
 </script>
</section>
(The updateProgress() method in this example would be called by some other code on the page to update the actual progress bar as the task progressed.)
Author requirements: The max and value attributes, when present, must have values that are valid floating point numbers. The max attribute, if present, must have a value greater than zero. The value attribute, if present, must have a value equal to or greater than zero, and less than or equal to the value of the max attribute, if present, or 1, otherwise.
The progress element is the wrong element to use for something that is just a gauge, as opposed to task progress. For instance, indicating disk space usage using progress would be inappropriate. Instead, the meter element is available for such use cases.
User agent requirements: User agents must parse the max and value attributes' values according to the rules for parsing floating point number values.
If the value attribute is omitted, then user agents must also parse the textContent of the progress element in question using the steps for finding one or two numbers of a ratio in a string. These steps will return nothing, one number, one number with a denominator punctuation character, or two numbers.
Using the results of this processing, user agents must determine whether the progress bar is an indeterminate progress bar, or whether it is a determinate progress bar, and in the latter case, what its current and maximum values are, all as follows:
If the max attribute is omitted, and the value is omitted, and the results of parsing the textContent was nothing, then the progress bar is an indeterminate progress bar. Abort these steps.
If the max attribute is included, then, if a value could be parsed out of it, then the maximum value is that value.
If the max attribute is absent but the value attribute is present, or, if the max attribute is present but no value could be parsed from it, then the maximum is 1.
textContent contained one number with an associated denominator punctuation character, then the maximum value is the value associated with that denominator punctuation character; otherwise, if the textContent contained two numbers, the maximum value is the higher of the two values; otherwise, the maximum value is 1.
If the value attribute is present on the element and a value could be parsed out of it, that value is the current value of the progress bar. Otherwise, if the attribute is present but no value could be parsed from it, the current value is zero.
If the value attribute is absent and the max attribute is present, then, if the textContent was parsed and found to contain just one number, with no associated denominator punctuation character, then the current value is that number. Otherwise, if the value attribute is absent and the max attribute is present then the current value is zero.
textContent of the element.
UA requirements for showing the progress bar: When representing a progress element to the user, the UA should indicate whether it is a determinate or indeterminate progress bar, and in the former case, should indicate the relative position of the current value relative to the maximum value.
The max and value DOM attributes must reflect the respective content attributes of the same name. When the relevant content attributes are absent, the DOM attributes must return zero. The value parsed from the textContent never affects the DOM values.
position
For a determinate progress bar (one with known current and maximum values), returns the result of dividing the current value by the maximum value.
For an indeterminate progress bar, returns −1.
If the progress bar is an indeterminate progress bar, then the position DOM attribute must return −1. Otherwise, it must return the result of dividing the current value by the maximum value.
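The way the current value, the maximum value, and the position fit together can be sketched as follows. Several things here are assumptions made for illustration: the names progressState and progressPosition are invented, parseFloat stands in for the rules for parsing floating point number values, and the result of the ratio-finding steps is passed in ready-made as an object with a numbers array and a denominator (null when no denominator punctuation character was found). When neither attribute is present, the sketch takes the lower of the numbers found in the text as the current value.

```javascript
// Hypothetical helpers; attrs.max and attrs.value are attribute strings,
// or null when the attribute is absent.
function progressState(attrs, parsed) {
  const num = (s) => {
    const n = parseFloat(s);
    return Number.isNaN(n) ? null : n;
  };
  const hasMax = attrs.max != null;
  const hasValue = attrs.value != null;
  const found = parsed.numbers;

  // Nothing to go on at all: indeterminate.
  if (!hasMax && !hasValue && found.length === 0) {
    return { indeterminate: true };
  }

  let max;
  if (hasMax && num(attrs.max) !== null) max = num(attrs.max);
  else if (hasMax || hasValue) max = 1;
  else if (found.length === 1 && parsed.denominator !== null) max = parsed.denominator;
  else if (found.length === 2) max = Math.max(...found);
  else max = 1;

  let current;
  if (hasValue) {
    current = num(attrs.value) ?? 0;
  } else if (hasMax) {
    current = found.length === 1 && parsed.denominator === null ? found[0] : 0;
  } else {
    current = Math.min(...found); // assumption: lower of the numbers found
  }
  return { indeterminate: false, current, max };
}

function progressPosition(state) {
  return state.indeterminate ? -1 : state.current / state.max;
}
```

For example, text content of "75%" would be handed in as one number, 75, with a denominator of 100, which gives a position of 0.75.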
meter element
value
min
low
high
max
optimum
interface HTMLMeterElement : HTMLElement {
  attribute float value;
  attribute float min;
  attribute float max;
  attribute float low;
  attribute float high;
  attribute float optimum;
};
The meter element represents a scalar measurement within a known range, or a fractional value; for example disk usage, the relevance of a query result, or the fraction of a voting population to have selected a particular candidate. This is also known as a gauge.
The meter element should not be used to indicate progress (as in a progress bar). For that role, HTML provides a separate progress element.
The meter element also does not represent a scalar value of arbitrary range; for example, it would be wrong to use this to report a weight, or height, unless there is a known maximum value.
There are six attributes that determine the semantics of the gauge represented by the element.
The min attribute specifies the lower bound of the range, and the max attribute specifies the upper bound. The value attribute specifies the value to have the gauge indicate as the "measured" value.
The other three attributes can be used to segment the gauge's range into "low", "medium", and "high" parts, and to indicate which part of the gauge is the "optimum" part. The low attribute specifies the range that is considered to be the "low" part, and the high attribute specifies the range that is considered to be the "high" part. The optimum attribute gives the position that is "optimum"; if that is higher than the "high" value then this indicates that the higher the value, the better; if it's lower than the "low" mark then it indicates that lower values are better, and naturally if it is in between then it indicates that neither high nor low values are good.
Authoring requirements: The recommended way of giving the value is to include it as contents of the element, either as two numbers (the higher number represents the maximum, the other number the current value, and the minimum is assumed to be zero), or as a percentage or similar (using one of the characters such as "%"), or as a fraction. However, it is also possible to use the attributes to specify these values.
One of the following conditions, along with all the requirements that are listed with that condition, must be met:
value, min, and max attributes are all omitted
If specified, the low, high, and optimum attributes must have values greater than or equal to zero and less than or equal to the bigger of the two numbers in the contents of the element.
If both the low and high attributes are specified, then the low attribute's value must be less than or equal to the value of the high attribute.
value, min, and max attributes are all omitted
If specified, the low, high, and optimum attributes must have values greater than or equal to zero and less than or equal to the value associated with the denominator punctuation character.
If both the low and high attributes are specified, then the low attribute's value must be less than or equal to the value of the high attribute.
value attribute is omitted
value attribute is specified
If the min attribute is specified, then the minimum is that attribute's value; otherwise, it is 0.
If the max attribute is specified, then the maximum is that attribute's value; otherwise, it is 1.
If there is exactly one number in the contents of the element, then value is that number; otherwise, value is the value of the value attribute.
The following inequalities must hold, as applicable:
low ≤ maximum (if low is specified)
high ≤ maximum (if high is specified)
optimum ≤ maximum (if optimum is specified)
If both the low and high attributes are specified, then the low attribute's value must be less than or equal to the value of the high attribute.
For the purposes of these requirements, a number is a sequence of characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), optionally including a single U+002E FULL STOP character (.), and separated from other numbers by at least one character that isn't any of those; interpreted as a base ten number.
The value, min, low, high, max, and optimum attributes, when present, must have values that are valid floating point numbers.
If no minimum or maximum is specified, then the range is assumed to be 0..1, and the value thus has to be within that range.
The following examples all represent a measurement of three quarters (of the maximum of whatever is being measured):
<meter>75%</meter>
<meter>750‰</meter>
<meter>3/4</meter>
<meter>6 blocks used (out of 8 total)</meter>
<meter>max: 100; current: 75</meter>
<meter><object data="graph75.png">0.75</object></meter>
<meter min="0" max="100" value="75"></meter>
The following example is incorrect use of the element, because it doesn't give a range (and since the default maximum is 1, both of the gauges would end up looking maxed out):
<p>The grapefruit pie had a radius of <meter>12cm</meter> and a height of <meter>2cm</meter>.</p> <!-- BAD! -->
Instead, one would either not include the meter element, or use the meter element with a defined range to give the dimensions in context compared to other pies:
<p>The grapefruit pie had a radius of 12cm and a height of 2cm.</p> <dl> <dt>Radius: <dd> <meter min=0 max=20 value=12>12cm</meter> <dt>Height: <dd> <meter min=0 max=10 value=2>2cm</meter> </dl>
There is no explicit way to specify units in the meter element, but the units may be specified in the title attribute in free-form text.
The example above could be extended to mention the units:
<dl> <dt>Radius: <dd> <meter min=0 max=20 value=12 title="centimeters">12cm</meter> <dt>Height: <dd> <meter min=0 max=10 value=2 title="centimeters">2cm</meter> </dl>
User agent requirements: User agents must parse the min, max, value, low, high, and optimum attributes using the rules for parsing floating point number values.
If the value attribute has been omitted, the user agent must also process the textContent of the element according to the steps for finding one or two numbers of a ratio in a string. These steps will return nothing, one number, one number with a denominator punctuation character, or two numbers.
User agents must then use all these numbers to obtain values for six points on the gauge, as follows. (The order in which these are evaluated is important, as some of the values refer to earlier ones.)
If the min attribute is specified and a value could be parsed out of it, then the minimum value is that value. Otherwise, the minimum value is zero.
If the max attribute is specified and a value could be parsed out of it, the maximum value is that value.
Otherwise, if the max attribute is specified but no value could be parsed out of it, or if it was not specified, but either or both of the min or value attributes were specified, then the maximum value is 1.
Otherwise, none of the max, min, and value attributes were specified. If the result of processing the textContent of the element was either nothing or just one number with no denominator punctuation character, then the maximum value is 1; if the result was one number but it had an associated denominator punctuation character, then the maximum value is the value associated with that denominator punctuation character; and finally, if there were two numbers parsed out of the textContent, then the maximum is the higher of those two numbers.
If the above machinations result in a maximum value less than the minimum value, then the maximum value is actually the same as the minimum value.
If the value attribute is specified and a value could be parsed out of it, then that value is the actual value.
If the value attribute is not specified but the max attribute is specified and the result of processing the textContent of the element was one number with no associated denominator punctuation character, then that number is the actual value.
If neither of the value and max attributes are specified, then, if the result of processing the textContent of the element was one number (with or without an associated denominator punctuation character), then that is the actual value, and if the result of processing the textContent of the element was two numbers, then the actual value is the lower of the two numbers found.
Otherwise, if none of the above apply, the actual value is zero.
If the above procedure results in an actual value less than the minimum value, then the actual value is actually the same as the minimum value.
If, on the other hand, the result is an actual value greater than the maximum value, then the actual value is the maximum value.
If the low attribute is specified and a value could be parsed out of it, then the low boundary is that value. Otherwise, the low boundary is the same as the minimum value.
If the low boundary is then less than the minimum value, then the low boundary is actually the same as the minimum value. Similarly, if the low boundary is greater than the maximum value, then it is actually the maximum value instead.
If the high attribute is specified and a value could be parsed out of it, then the high boundary is that value. Otherwise, the high boundary is the same as the maximum value.
If the high boundary is then less than the low boundary, then the high boundary is actually the same as the low boundary. Similarly, if the high boundary is greater than the maximum value, then it is actually the maximum value instead.
If the optimum attribute is specified and a value could be parsed out of it, then the optimum point is that value. Otherwise, the optimum point is the midpoint between the minimum value and the maximum value.
If the optimum point is then less than the minimum value, then the optimum point is actually the same as the minimum value. Similarly, if the optimum point is greater than the maximum value, then it is actually the maximum value instead.
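All of which will result in the following inequalities all being true: minimum value ≤ actual value ≤ maximum value; minimum value ≤ low boundary ≤ high boundary ≤ maximum value; minimum value ≤ optimum point ≤ maximum value.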
UA requirements for regions of the gauge : If the optimum point is equal to the low boundary or the high boundary, or anywhere in between them, then the region between the low and high boundaries of the gauge must be treated as the optimum region, and the low and high parts, if any, must be treated as suboptimal. Otherwise, if the optimum point is less than the low boundary, then the region between the minimum value and the low boundary must be treated as the optimum region, the region between the low boundary and the high boundary must be treated as a suboptimal region, and the region between the high boundary and the maximum value must be treated as an even less good region. Finally, if the optimum point is higher than the high boundary, then the situation is reversed; the region between the high boundary and the maximum value must be treated as the optimum region, the region between the high boundary and the low boundary must be treated as a suboptimal region, and the remaining region between the low boundary and the minimum value must be treated as an even less good region.
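The derivation of the six points, and the classification of the actual value into a region, can be sketched as below. This is an illustration under stated assumptions: meterPoints and meterRegion are invented names, parseFloat stands in for the rules for parsing floating point number values, the outcome of the ratio-finding steps is supplied as an object with a numbers array and a denominator (null if none), and a value sitting exactly on the low or high boundary is counted as part of the middle region.

```javascript
// Hypothetical helpers; attrs holds attribute strings, with absent
// attributes left undefined or null.
function meterPoints(attrs, parsed) {
  const num = (s) => {
    if (s == null) return null;
    const n = parseFloat(s);
    return Number.isNaN(n) ? null : n;
  };
  const clamp = (x, lo, hi) => Math.min(Math.max(x, lo), hi);
  const found = parsed.numbers;

  const min = num(attrs.min) ?? 0;

  let max;
  if (num(attrs.max) !== null) max = num(attrs.max);
  else if (attrs.max != null || attrs.min != null || attrs.value != null) max = 1;
  else if (found.length === 2) max = Math.max(...found);
  else if (found.length === 1 && parsed.denominator !== null) max = parsed.denominator;
  else max = 1;
  if (max < min) max = min;

  let value = 0; // used when none of the cases below applies
  if (num(attrs.value) !== null) {
    value = num(attrs.value);
  } else if (attrs.value == null && attrs.max != null) {
    if (found.length === 1 && parsed.denominator === null) value = found[0];
  } else if (attrs.value == null && attrs.max == null) {
    if (found.length > 0) value = Math.min(...found);
  }
  value = clamp(value, min, max);

  const low = clamp(num(attrs.low) ?? min, min, max);
  const high = clamp(num(attrs.high) ?? max, low, max);
  const optimum = clamp(num(attrs.optimum) ?? (min + max) / 2, min, max);
  return { min, max, value, low, high, optimum };
}

function meterRegion(p) {
  const band = p.value < p.low ? "low" : p.value > p.high ? "high" : "middle";
  const best = p.optimum < p.low ? "low" : p.optimum > p.high ? "high" : "middle";
  if (band === best) return "optimum";
  // Next to the best band, or either side of a best middle band.
  if (best === "middle" || band === "middle") return "suboptimal";
  return "even less good";
}
```

Because each point is clamped against the ones computed before it, the order of evaluation matters, as the text notes.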
UA requirements for showing the gauge: When representing a meter element to the user, the UA should indicate the relative position of the actual value to the minimum and maximum values, and the relationship between the actual value and the three regions of the gauge.
The following markup:
<h3>Suggested groups</h3>
<menu type="toolbar">
<a href="?cmd=hsg" onclick="hideSuggestedGroups()">Hide suggested groups</a>
</menu>
<ul>
<li>
<p><a href="/group/comp.infosystems.www.authoring.stylesheets/view">comp.infosystems.www.authoring.stylesheets</a> -
<a href="/group/comp.infosystems.www.authoring.stylesheets/subscribe">join</a></p>
<p>Group description: <strong>Layout/presentation on the WWW.</strong></p>
<p><meter value="0.5">Moderate activity,</meter> Usenet, 618 subscribers</p>
</li>
<li>
<p><a href="/group/netscape.public.mozilla.xpinstall/view">netscape.public.mozilla.xpinstall</a> -
<a href="/group/netscape.public.mozilla.xpinstall/subscribe">join</a></p>
<p>Group description: <strong>Mozilla XPInstall discussion.</strong></p>
<p><meter value="0.25">Low activity,</meter> Usenet, 22 subscribers</p>
</li>
<li>
<p><a href="/group/mozilla.dev.general/view">mozilla.dev.general</a> -
<a href="/group/mozilla.dev.general/subscribe">join</a></p>
<p><meter value="0.25">Low activity,</meter> Usenet, 66 subscribers</p>
</li>
</ul>
Might be rendered as follows:
User agents may combine the value of the title attribute and the other attributes to provide context-sensitive help or inline text detailing the actual values.
For example, the following snippet:
<meter min=0 max=60 value=23.2 title=seconds></meter>
...might cause the user agent to display a gauge with a tooltip saying "Value: 23.2 out of 60." on one line and "seconds" on a second line.
The min, max, value, low, high, and optimum DOM attributes must reflect the respective content attributes of the same name. When the relevant content attributes are absent, the DOM attributes must return zero. The value parsed from the textContent never affects the DOM values.
code element
HTMLElement.
The code element represents a fragment of computer code. This could be an XML element name, a filename, a computer program, or any other string that a computer would recognize.
Although there is no formal way to indicate the language of computer code being marked up, authors who wish to mark code elements with the language used, e.g. so that syntax highlighting scripts can use the right rules, may do so by adding a class prefixed with "language-" to the element.
The following example shows how the element can be used in a paragraph to mark up element names and computer code, including punctuation.
<p>The <code>code</code> element represents a fragment of computer code.</p> <p>When you call the <code>activate()</code> method on the <code>robotSnowman</code> object, the eyes glow.</p> <p>The example below uses the <code>begin</code> keyword to indicate the start of a statement block. It is paired with an <code>end</code> keyword, which is followed by the <code>.</code> punctuation character (full stop) to indicate the end of the program.</p>
The following example shows how a block of code could be marked up using the pre and code elements.
<pre><code class="language-pascal">var i: Integer;
begin
   i := 1;
end.</code></pre>
A class is used in that example to indicate the language used.
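A syntax highlighting script could read the class convention along these lines. This is a minimal Python sketch rather than anything the specification defines: the function name code_language is hypothetical, and picking the first matching class when several are present is an assumption.

```python
from typing import Optional

PREFIX = "language-"


def code_language(class_attr: str) -> Optional[str]:
    """Return the language named by the first "language-" class, if any."""
    # The class attribute holds space-separated tokens.
    for token in class_attr.split():
        # Require at least one character after the prefix.
        if token.startswith(PREFIX) and len(token) > len(PREFIX):
            return token[len(PREFIX):]
    return None
```

For the pre/code example above, code_language("language-pascal") yields "pascal", which a script could use to select its highlighting rules.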
See the pre element for more details.
The var element
DOM interface: uses HTMLElement.
The var element represents a variable. This could be an actual variable in a mathematical expression or programming context, or it could just be a term used as a placeholder in prose.
In the paragraph below, the letter "n" is being used as a variable in prose:
<p>If there are <var>n</var> pipes leading to the ice cream factory then I expect at <em>least</em> <var>n</var> flavors of ice cream to be available for purchase!</p>
For mathematics, in particular for anything beyond the simplest of expressions, MathML is more appropriate. However, the var element can still be used to refer to specific variables that are then mentioned in MathML expressions.
In this example, an equation is shown, with a legend that references the variables in the equation. The expression itself is marked up with MathML, but the variables are mentioned in the figure's legend using var.
<figure> <math> <mi>a</mi> <mo>=</mo> <msqrt> <msup><mi>b</mi><mn>2</mn></msup> <mo>+</mo> <msup><mi>c</mi><mn>2</mn></msup> </msqrt> </math> <legend> Using Pythagoras' theorem to solve for the hypotenuse <var>a</var> of a triangle with sides <var>b</var> and <var>c</var> </legend> </figure>
The samp element
DOM interface: uses HTMLElement.
The samp element represents (sample) output from a program or computing system.
See the pre and kbd elements for more details.
This example shows the samp element being used inline:
<p>The computer said <samp>Too much cheese in tray two</samp> but I didn't know what that meant.</p>
This second example shows a block of sample output. Nested samp and kbd elements allow for the styling of specific elements of the sample output using a style sheet.
<pre><samp><span class="prompt">jdoe@mowmow:~$</span> <kbd>ssh demo.example.com</kbd>
Last login: Tue Apr 12 09:10:17 2005 from mowmow.example.com on pts/1
Linux demo 2.6.10-grsec+gg3+e+fhs6b+nfs+gr0501+++p3+c4a+gr2b-reslog-v6.189 #1 SMP Tue Feb 1 11:22:36 PST 2005 i686 unknown

<span class="prompt">jdoe@demo:~$</span> <span class="cursor">_</span></samp></pre>
The kbd element
DOM interface: uses HTMLElement.
The kbd element represents user input (typically keyboard input, although it may also be used to represent other input, such as voice commands).
When the kbd element is nested inside a samp element, it represents the input as it was echoed by the system.
When the kbd element contains a samp element, it represents input based on system output, for example invoking a menu item.
When the kbd element is nested inside another kbd element, it represents an actual key or other single unit of input as appropriate for the input mechanism.
Here the kbd element is used to indicate keys to press:
<p>To make George eat an apple, press <kbd><kbd>Shift</kbd>+<kbd>F3</kbd></kbd></p>
In this second example, the user is told to pick a particular menu item. The outer kbd element marks up a block of input, with the inner kbd elements representing each individual step of the input, and the samp elements inside them indicating that the steps are input based on something being displayed by the system, in this case menu labels:
<p>To make George eat an apple, select
<kbd><kbd><samp>File</samp></kbd>|<kbd><samp>Eat Apple...</samp></kbd></kbd>
</p>
The sub and sup elements
DOM interface: uses HTMLElement.
The sup element represents a superscript and the sub element represents a subscript.
These elements must be used only to mark up typographical conventions with specific meanings, not for typographical presentation for presentation's sake. For example, it would be inappropriate for the sub and sup elements to be used in the name of the LaTeX document preparation system. In general, authors should use these elements only if the absence of those elements would change the meaning of the content.
When the sub element is used inside a var element, it represents the subscript that identifies the variable in a family of variables.
<p>The coordinate of the <var>i</var>th point is (<var>x<sub><var>i</var></sub></var>, <var>y<sub><var>i</var></sub></var>). For example, the 10th point has coordinate (<var>x<sub>10</sub></var>, <var>y<sub>10</sub></var>).</p>
In certain languages, superscripts are part of the typographical conventions for some abbreviations.
<p>The most beautiful women are <span lang="fr"><abbr>M<sup>lle</sup></abbr> Gwendoline</span> and <span lang="fr"><abbr>M<sup>me</sup></abbr> Denise</span>.</p>
Mathematical expressions often use subscripts and superscripts. Authors are encouraged to use MathML for marking up mathematics, but authors may opt to use sub and sup if detailed mathematical markup is not desired. [MATHML]
<var>E</var>=<var>m</var><var>c</var><sup>2</sup>
f(<var>x</var>, <var>n</var>) = log<sub>4</sub><var>x</var><sup><var>n</var></sup>
The span element
interface HTMLSpanElement : HTMLElement {};
The span element doesn't mean anything on its own, but can be useful when used together with other attributes, e.g. class, lang, or dir. It represents its children.
The i element
DOM interface: uses HTMLElement.
The i element represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, a ship name, or some other prose whose typical typographic presentation is italicized.
Terms in languages different from the main text should be annotated with lang attributes (or, in XML, lang attributes in the XML namespace).
The examples below show uses of the i element:
<p>The <i class="taxonomy">Felis silvestris catus</i> is cute.</p> <p>The term <i>prose content</i> is defined above.</p> <p>There is a certain <i lang="fr">je ne sais quoi</i> in the air.</p>
In the following example, a dream sequence is marked up using i elements.
<p>Raymond tried to sleep.</p> <p><i>The ship sailed away on Thursday</i>, he dreamt. <i>The ship had many people aboard, including a beautiful princess called Carey. He watched her, day-in, day-out, hoping she would notice him, but she never did.</i></p> <p><i>Finally one night he picked up the courage to speak with her—</i></p> <p>Raymond woke with a start as the fire alarm rang out.</p>
Authors are encouraged to use the class attribute on the i element to identify why the element is being used, so that if the style of a particular use (e.g. dream sequences as opposed to taxonomic terms) is to be changed at a later date, the author doesn't have to go through the entire document (or series of related documents) annotating each use. Similarly, authors are encouraged to consider whether other elements might be more applicable than the i element, for instance the em element for marking up stress emphasis, or the dfn element to mark up the defining instance of a term.
Style sheets can be used to format i elements, just like any other element can be restyled. Thus, it is not the case that content in i elements will necessarily be italicized.
The b element
DOM interface: uses HTMLElement.
The b element represents a span of text to be stylistically offset from the normal prose without conveying any extra importance, such as key words in a document abstract, product names in a review, or other spans of text whose typical typographic presentation is boldened.
The following example shows a use of the b element to highlight key words without marking them up as important:
<p>The <b>frobonitor</b> and <b>barbinator</b> components are fried.</p>
In the following example, objects in a text adventure are highlighted as being special by use of the b element.
<p>You enter a small room. Your <b>sword</b> glows brighter. A <b>rat</b> scurries past the corner wall.</p>
Another case where the b element is appropriate is in marking up the lede (or lead) sentence or paragraph. The following example shows how a BBC article about kittens adopting a rabbit as their own could be marked up:
<article> <h2>Kittens 'adopted' by pet rabbit</h2> <p><b>Six abandoned kittens have found an unexpected new mother figure — a pet rabbit.</b></p> <p>Veterinary nurse Melanie Humble took the three-week-old kittens to her Aberdeen home.</p> [...]
The b element should be used as a last resort when no other element is more appropriate. In particular, headings should use the h1 to h6 elements, stress emphasis should use the em element, importance should be denoted with the strong element, and text marked or highlighted should use the mark element.
The following would be incorrect usage:
<p><b>WARNING!</b> Do not frob the barbinator!</p>
In the previous example, the correct element to use would have been strong, not b.
Style sheets can be used to format b elements, just like any other element can be restyled. Thus, it is not the case that content in b elements will necessarily be boldened.
The bdo element
Content attributes: the dir global attribute has special semantics on this element.
DOM interface: uses HTMLElement.
The bdo element represents explicit text directionality formatting control for its children. It allows authors to override the Unicode bidi algorithm by explicitly specifying a direction override. [BIDI]
Authors must specify the dir attribute on this element, with the value ltr to specify a left-to-right override and with the value rtl to specify a right-to-left override.
If the element has the dir attribute set to the exact value ltr, then for the purposes of the bidi algorithm, the user agent must act as if there was a U+202D LEFT-TO-RIGHT OVERRIDE character at the start of the element, and a U+202C POP DIRECTIONAL FORMATTING at the end of the element.
If the element has the dir attribute set to the exact value rtl, then for the purposes of the bidi algorithm, the user agent must act as if there was a U+202E RIGHT-TO-LEFT OVERRIDE character at the start of the element, and a U+202C POP DIRECTIONAL FORMATTING at the end of the element.
The requirements on handling the bdo element for the bidi algorithm may be implemented indirectly through the style layer. For example, an HTML+CSS user agent should implement these requirements by implementing the CSS 'unicode-bidi' property. [CSS]
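The effect of the two dir values can be illustrated by building the string that the bidi algorithm would see. The following Python sketch is illustrative only: the function name bdo_equivalent_text is hypothetical, and returning the text unchanged for any other dir value is an assumption, since such values are outside the two cases described above.

```python
LRO = "\u202D"  # U+202D LEFT-TO-RIGHT OVERRIDE
RLO = "\u202E"  # U+202E RIGHT-TO-LEFT OVERRIDE
PDF = "\u202C"  # U+202C POP DIRECTIONAL FORMATTING


def bdo_equivalent_text(text: str, dir_value: str) -> str:
    """Wrap text in the override characters implied by a bdo element."""
    # The comparison is exact (case-sensitive), matching "the exact value".
    if dir_value == "ltr":
        return LRO + text + PDF
    if dir_value == "rtl":
        return RLO + text + PDF
    # Assumption: any other value implies no override.
    return text
```

A user agent that implements the requirement through CSS would not insert these characters literally; the sketch only shows which characters the element stands in for.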
The ruby element
Content model: phrasing content followed either by a single rt element, or an rp element, an rt element, and another rp element.
DOM interface: uses HTMLElement.
The ruby element allows one or more spans of phrasing content to be marked with ruby annotations. Ruby annotations are short runs of text presented alongside base text, primarily used in East Asian typography as a guide for pronunciation or to include other annotations. In Japanese, this form of typography is also known as furigana.
A ruby element represents the spans of phrasing content it contains, ignoring all the child rt and rp elements and their descendants. Those spans of phrasing content have associated annotations created using the rt element.
In this example, each ideograph in the Japanese text 漢字 is annotated with its kanji reading.
...
<ruby>
漢 <rt> かん </rt>
字 <rt> じ </rt>
</ruby>
...
This might be rendered as:
In this example, each ideograph in the traditional Chinese text 漢字 is annotated with its bopomofo reading.
<ruby>
漢 <rt> ㄏㄢˋ </rt>
字 <rt> ㄗˋ </rt>
</ruby>
This might be rendered as:
In this example, each ideograph in the simplified Chinese text 汉字 is annotated with its pinyin reading.
...
<ruby>
汉 <rt> hàn </rt>
字 <rt> zì </rt>
</ruby>
...
This might be rendered as:
The rt element
Contexts in which this element may be used: as a child of a ruby element.
DOM interface: uses HTMLElement.
The rt element marks the ruby text component of a ruby annotation.
An rt element that is a child of a ruby element represents an annotation (given by its children) for the zero or more nodes of phrasing content that immediately precedes it in the ruby element, ignoring rp elements.
The rp element
Contexts in which this element may be used: as a child of a ruby element, either immediately before or immediately after an rt element.
DOM interface: uses HTMLElement.
The rp element can be used to provide parentheses around a ruby text component of a ruby annotation, to be shown by user agents that don't support ruby annotations.
An rp element that is a child of a ruby element represents nothing and its contents must be ignored. An rp element whose parent element is not a ruby element represents its children.
The example above, in which each ideograph in the text 漢字 is annotated with its kanji reading, could be expanded to use rp so that in legacy user agents the readings are in parentheses:
...
<ruby>
漢 <rp>(</rp><rt>かん</rt><rp>)</rp>
字 <rp>(</rp><rt>じ</rt><rp>)</rp>
</ruby>
...
In conforming user agents the rendering would be as above, but in user agents that do not support ruby, the rendering would be:
... 漢 (かん) 字 (じ) ...
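The relationship between the rp markup and the legacy rendering can be reproduced with a short script. This Python sketch is an illustration under stated assumptions: ruby_markup and legacy_text are hypothetical helper names, and a legacy user agent is modelled simply as one that ignores unknown tags and shows all text content in order.

```python
from html import escape
from html.parser import HTMLParser
from typing import List, Tuple


def ruby_markup(pairs: List[Tuple[str, str]]) -> str:
    """Build a ruby element with rp fallback parentheses around each reading."""
    parts = ["<ruby>"]
    for base, reading in pairs:
        parts.append(f"{escape(base)}<rp>(</rp><rt>{escape(reading)}</rt><rp>)</rp>")
    parts.append("</ruby>")
    return "".join(parts)


class _TextOnly(HTMLParser):
    """Collects character data and ignores every tag."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: List[str] = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)


def legacy_text(markup: str) -> str:
    """Approximate what a user agent without ruby support would display."""
    parser = _TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)
```

Because this sketch emits no whitespace between the base text and the rp element, the legacy text comes out as 漢(かん)字(じ); the spacing in the rendering shown above comes from the whitespace in the example markup.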
We need to summarize the various elements, in particular to distinguish b/i/em/strong/var/q/mark/cite.
HTML does not have a dedicated mechanism for marking up footnotes. Here are the recommended alternatives.
For short inline annotations, the title attribute should be used.
In this example, two parts of a dialog are annotated.
<dialog> <dt>Customer <dd>Hello! I wish to register a complaint. Hello. Miss? <dt>Shopkeeper <dd><span title="Colloquial pronunciation of 'What do you'" >Watcha</span> mean, miss? <dt>Customer <dd>Uh, I'm sorry, I have a cold. I wish to make a complaint. <dt>Shopkeeper <dd>Sorry, <span title="This is, of course, a lie.">we're closing for lunch</span>. </dialog>
For longer annotations, the a element should be used, pointing to an element later in the document. The convention is that the contents of the link be a number in square brackets.
In this example, a footnote in the dialog links to a paragraph below the dialog. The paragraph then reciprocally links back to the dialog, allowing the user to return to the location of the footnote.
<dialog> <dt>Announcer <dd>Number 16: The <i>hand</i>. <dt>Interviewer <dd>Good evening. I have with me in the studio tonight Mr Norman St John Polevaulter, who for the past few years has been contradicting people. Mr Polevaulter, why <em>do</em> you contradict people? <dt>Norman <dd>I don't. <a href="#fn1" id="r1">[1]</a> <dt>Interviewer <dd>You told me you did! </dialog> <section> <p id="fn1"><a href="#r1">[1]</a> This is, naturally, a lie, but paradoxically if it were true he could not say so without contradicting the interviewer and thus making it false.</p> </section>
For side notes, longer annotations that apply to entire sections of the text rather than just specific words or sentences, the aside element should be used.
In this example, a sidebar is given after a dialog, giving some context to the dialog.
<dialog> <dt>Customer <dd>I will not buy this record, it is scratched. <dt>Shopkeeper <dd>I'm sorry? <dt>Customer <dd>I will not buy this record, it is scratched. <dt>Shopkeeper <dd>No no no, this's'a tobacconist's. </dialog> <aside> <p>In 1970, the British Empire lay in ruins, and foreign nationalists frequented the streets — many of them Hungarians (not the streets — the foreign nationals). Sadly, Alexander Yalt has been publishing incompetently-written phrase books. </aside>
For figures or tables, footnotes can be included in the relevant legend or caption element, or in surrounding prose.
In this example, a table has cells with footnotes that are given in prose. A figure element is used to give a single legend to the combination of the table and its footnotes.
<figure> <legend>Table 1. Alternative activities for knights.</legend> <table> <tr> <th> Activity <th> Location <th> Cost <tr> <td> Dance <td> Wherever possible <td> £0<sup><a href="#fn1">1</a></sup> <tr> <td> Routines, chorus scenes<sup><a href="#fn2">2</a></sup> <td> Undisclosed <td> Undisclosed <tr> <td> Dining<sup><a href="#fn3">3</a></sup> <td> Camelot <td> Cost of ham, jam, and spam<sup><a href="#fn4">4</a></sup> </table> <p id="fn1">1. Assumed.</p> <p id="fn2">2. Footwork impeccable.</p> <p id="fn3">3. Quality described as "well".</p> <p id="fn4">4. A lot.</p> </figure>
The ins and del elements represent edits to the document.
The ins element
Content attributes: cite, datetime.
DOM interface: uses the HTMLModElement interface.
The ins element represents an addition to the document.
The following represents the addition of a single paragraph:
<aside> <ins> <p> I like fruit. </p> </ins> </aside>
As does this, because everything in the aside element here counts as phrasing content and therefore there is just one paragraph:
<aside> <ins> Apples are <em>tasty</em>. </ins> <ins> So are pears. </ins> </aside>
ins elements should not cross implied paragraph boundaries.
The following example represents the addition of two paragraphs, the second of which was inserted in two parts. The first ins element in this example thus crosses a paragraph boundary, which is considered poor form.
<aside> <ins datetime="2005-03-16T00:00Z"> <p> I like fruit. </p> Apples are <em>tasty</em>. </ins> <ins datetime="2007-12-19T00:00Z"> So are pears. </ins> </aside>
Here is a better way of marking this up. It uses more elements, but none of the elements cross implied paragraph boundaries.
<aside> <ins datetime="2005-03-16T00:00Z"> <p> I like fruit. </p> </ins> <ins datetime="2005-03-16T00:00Z"> Apples are <em>tasty</em>. </ins> <ins datetime="2007-12-19T00:00Z"> So are pears. </ins> </aside>
The del element
Content attributes: cite, datetime.
DOM interface: uses the HTMLModElement interface.
The del element represents a removal from the document.
del elements should not cross implied paragraph boundaries.
Attributes common to ins and del elements
The cite attribute may be used to specify the address of a document that explains the change. When that document is long, for instance the minutes of a meeting, authors are encouraged to include a fragment identifier pointing to the specific part of that document that discusses the change.
If the cite attribute is present, it must be a valid URL that explains the change. To obtain the corresponding citation link, the value of the attribute must be resolved relative to the element. User agents should allow users to follow such citation links.
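Resolving a cite value can be sketched with a general-purpose URL library. The Python below is an approximation, not the specification's resolution algorithm: citation_link is a hypothetical name, the element's base URL is passed in by the caller, and urljoin from the standard library stands in for "resolved relative to the element".

```python
from typing import Optional
from urllib.parse import urljoin


def citation_link(cite: Optional[str], element_base_url: str) -> Optional[str]:
    """Return the absolute citation link for an ins or del element, if any."""
    if cite is None:
        # No cite attribute, so there is no citation link.
        return None
    # urljoin keeps any fragment identifier in the cite value.
    return urljoin(element_base_url, cite)
```

A fragment identifier in the cite value survives resolution, so a link into the relevant part of long meeting minutes still points at that part.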
The datetime attribute may be used to specify the time and date of the change.
If present, the datetime attribute must be a valid global date and time string value.
User agents must parse the datetime attribute according to the parse a global date and time string algorithm. If that doesn't return a time, then the modification has no associated timestamp (the value is non-conforming; it is not a valid global date and time string). Otherwise, the modification is marked as having been made at the given datetime. User agents should use the associated time-zone information to determine which time zone to present the given datetime in.
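The "no associated timestamp" fallback can be sketched as a parser that returns nothing on failure. The following Python is a simplified stand-in for the parse a global date and time string algorithm, not a faithful implementation of it: the function name parse_modification_time and the accepted pattern (date, "T", time, then "Z" or a signed hh:mm offset) are assumptions, and fractional seconds are accepted but discarded.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed shape: YYYY-MM-DDThh:mm[:ss[.fff]] followed by Z or +hh:mm / -hh:mm.
_PATTERN = re.compile(
    r"(\d{4,})-(\d{2})-(\d{2})T(\d{2}):(\d{2})(?::(\d{2})(?:\.\d+)?)?"
    r"(Z|[+-]\d{2}:\d{2})"
)


def parse_modification_time(value: str) -> Optional[datetime]:
    """Return an aware datetime, or None if there is no associated timestamp."""
    match = _PATTERN.fullmatch(value)
    if match is None:
        return None
    year, month, day, hour, minute = (int(match.group(i)) for i in range(1, 6))
    second = int(match.group(6) or 0)
    zone = match.group(7)
    try:
        if zone == "Z":
            tz = timezone.utc
        else:
            sign = 1 if zone[0] == "+" else -1
            offset = timedelta(hours=int(zone[1:3]), minutes=int(zone[4:6]))
            tz = timezone(sign * offset)
        # datetime() rejects out-of-range fields such as 30 February.
        return datetime(year, month, day, hour, minute, second, tzinfo=tz)
    except ValueError:
        return None
```

Keeping the time-zone offset on the returned value lets a user agent choose which time zone to present the modification time in.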
The ins and del elements must implement the HTMLModElement interface:
interface HTMLModElement : HTMLElement {
  attribute DOMString cite;
  attribute DOMString dateTime;
};
The cite DOM attribute must reflect the element's cite content attribute. The dateTime DOM attribute must reflect the element's datetime content attribute.
Since the ins and del elements do not affect paragraphing, it is possible, in some cases where paragraphs are implied (without explicit p elements), for an ins or del element to span both an entire paragraph or other non-phrasing content elements and part of another paragraph.
For example:
<section> <ins> <p> This is a paragraph that was inserted. </p> This is another paragraph whose first sentence was inserted at the same time as the paragraph above. </ins> This is a second sentence, which was there all along. </section>
By only wrapping some paragraphs in p elements, one can even get the end of one paragraph, a whole second paragraph, and the start of a third paragraph to be covered by the same ins or del element (though this is very confusing, and not considered good practice):
<section> This is the first paragraph. <ins>This sentence was inserted. <p>This second paragraph was inserted.</p> This sentence was inserted too.</ins> This is the third paragraph in this example. </section>
However, due to the way implied paragraphs are defined, it is not possible to mark up the end of one paragraph and the start of the very next one using the same ins or del element. You instead have to use one (or two) p element(s) and two ins or del elements:
For example:
<section> <p>This is the first paragraph. <del>This sentence was deleted.</del></p> <p><del>This sentence was deleted too.</del> That sentence needed a separate &lt;del&gt; element.</p> </section>
Partly because of the confusion described above, authors are strongly recommended to always mark up all paragraphs with the p element, and to not have any ins or del elements that cross implied paragraph boundaries.
The content models of the ol and ul elements do not allow ins and del elements as children. Lists always represent all their items, including items that would otherwise have been marked as deleted.
To indicate that an item is inserted or deleted, an ins or del element can be wrapped around the contents of the li element. To indicate that an item has been replaced by another, a single li element can have one or more del elements followed by one or more ins elements.
In the following example, a list that started empty had items added and removed from it over time. The bits in the example that have been emphasized show the parts that are the "current" state of the list. The list item numbers don't take into account the edits, though.
<h1>Stop-ship bugs</h1> <ol> <li><ins datetime="2008-02-12T15:20Z">Bug 225: Rain detector doesn't work in snow</ins></li> <li><del datetime="2008-03-01T20:22Z"><ins datetime="2008-02-14T12:02Z">Bug 228: Water buffer overflows in April</ins></del></li> <li><ins datetime="2008-02-16T13:50Z">Bug 230: Water heater doesn't use renewable fuels</ins></li> <li><del datetime="2008-02-20T21:15Z"><ins datetime="2008-02-16T14:25Z">Bug 232: Carbon dioxide emissions detected after startup</ins></del></li> </ol>
In the following example, a list that started with just fruit was replaced by a list with just colors.
<h1>List of <del>fruits</del><ins>colors</ins></h1> <ul> <li><del>Lime</del><ins>Green</ins></li> <li><del>Apple</del></li> <li>Orange</li> <li><del>Pear</del></li> <li><ins>Teal</ins></li> <li><del>Lemon</del><ins>Yellow</ins></li> <li>Olive</li> <li><ins>Purple</ins> </ul>
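The figure element
Content model: either one legend element followed by flow content, or flow content followed by one legend element, or flow content alone.
DOM interface: uses HTMLElement.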
The figure element represents some flow content, optionally with a caption, that is self-contained and is typically referenced as a single unit from the main flow of the document.
The element can thus be used to annotate illustrations, diagrams, photos, code listings, etc, that are referred to from the main content of the document, but that could, without affecting the flow of the document, be moved away from that primary content, e.g. to the side of the page, to dedicated pages, or to an appendix.
The first legend element child of the element, if any, represents the caption of the figure element's contents. If there is no child legend element, then there is no caption.
The remainder of the element's contents, if any, represents the content.
This example shows the figure element to mark up a code listing.
<p>In <a href="#l4">listing 4</a> we see the primary core interface
API declaration.</p>
<figure id="l4">
<legend>Listing 4. The primary core interface API declaration.</legend>
<pre><code>interface PrimaryCore {
boolean verifyDataLine();
void sendData(in sequence&lt;byte> data);
void initSelfDestruct();
}</code></pre>
</figure>
<p>The API is designed to use UTF-8.</p>
Here we see a figure element to mark up a photo.
<figure>
<img src="bubbles-work.jpeg"
alt="Bubbles, sitting in his office chair, works on his
latest project intently.">
<legend>Bubbles at work</legend>
</figure>
In this example, we see an image that is not a figure, as well as an image and a video that are.
<h2>Malinko's comics</h2> <p>This case centered on some sort of "intellectual property" infringement related to a comic (see Exhibit A). The suit started after a trailer ending with these words:</p> <img src="promblem-packed-action.png" alt="ROUGH COPY! Promblem-Packed Action!"> <p>...was aired. A lawyer, armed with a Bigger Notebook, launched a preemptive strike using snowballs. A complete copy of the trailer is included with Exhibit B.</p> <figure> <img src="ex-a.png" alt="Two squiggles on a dirty piece of paper."> <legend>Exhibit A. The alleged <cite>rough copy</cite> comic.</legend> </figure> <figure> <video src="ex-b.mov"></video> <legend>Exhibit B. The <cite>Rough Copy</cite> trailer.</legend> </figure> <p>The case was resolved out of court.</p>
Here, a part of a poem is marked up using figure.
<figure> <p>'Twas brillig, and the slithy toves<br> Did gyre and gimble in the wabe;<br> All mimsy were the borogoves,<br> And the mome raths outgrabe.</p> <legend><cite>Jabberwocky</cite> (first verse). Lewis Carroll, 1832-98</legend> </figure>
In this example, which could be part of a much larger work discussing a castle, the figure has three images in it.
<figure>
<img src="castle1423.jpeg" title="Etching. Anonymous, ca. 1423."
alt="The castle has one tower, and a tall wall around it.">
<img src="castle1858.jpeg" title="Oil-based paint on canvas. Maria Towle, 1858."
alt="The castle now has two towers and two walls.">
<img src="castle1999.jpeg" title="Film photograph. Peter Jankle, 1999."
alt="The castle lies in ruins, the original tower all that remains in one piece.">
<legend>The castle through the ages: 1423, 1858, and 1999 respectively.</legend>
</figure>
The img element
Categories: if the element has a usemap attribute, interactive content.
Content attributes: alt, src, usemap, ismap, width, height.
[NamedConstructor=Image(),
 NamedConstructor=Image(in unsigned long width),
 NamedConstructor=Image(in unsigned long width, in unsigned long height)]
interface HTMLImageElement : HTMLElement {
  attribute DOMString alt;
  attribute DOMString src;
  attribute DOMString useMap;
  attribute boolean isMap;
  attribute unsigned long width;
  attribute unsigned long height;
  readonly attribute boolean complete;
};
An img element represents an image.
The image given by the src attribute is the embedded content, and the value of the alt attribute is the img element's fallback content.
The src attribute must be present, and must contain a valid URL referencing a non-interactive, optionally animated, image resource that is neither paged nor scripted. If the base URI of the element is the same as the document's address, then the src attribute's value must not be the empty string.
Images can thus be static bitmaps (e.g. PNGs, GIFs, JPEGs), single-page vector documents (single-page PDFs, XML files with an SVG root element), animated bitmaps (APNGs, animated GIFs), animated vector graphics (XML files with an SVG root element that use declarative SMIL animation), and so forth. However, this also precludes SVG files with script, multipage PDF files, interactive MNG files, HTML documents, plain text documents, and so forth.
The requirements on the alt attribute's value are described in the next section.
The img element must not be used as a layout tool. In particular, img elements should not be used to display transparent images, as they rarely convey meaning and rarely add anything useful to the document.
Unless the user agent cannot support images, or its support for images has been disabled, or the user agent only fetches elements on demand, or the element's src attribute has a value that is an ignored self-reference, then, when an img is created with a src attribute, and whenever the src attribute is set subsequently, the user agent must resolve the value of that attribute, relative to the element, and if that is successful must then fetch that resource.
The src attribute's value is an ignored self-reference if its value is the empty string, and the base URI of the element is the same as the document's address.
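The conditions in the two paragraphs above can be written as a pair of predicates. This Python sketch is illustrative only: UserAgentState and both function names are hypothetical, and comparing the base URI and the document's address as plain strings is a simplification of real URL comparison.

```python
from dataclasses import dataclass


@dataclass
class UserAgentState:
    """Hypothetical summary of the user agent settings named in the text."""
    supports_images: bool = True
    images_enabled: bool = True
    fetch_on_demand_only: bool = False


def is_ignored_self_reference(src: str, base_uri: str, document_address: str) -> bool:
    """True if src is empty and the element's base URI is the document's address."""
    return src == "" and base_uri == document_address


def must_fetch_on_src_set(ua: UserAgentState, src: str,
                          base_uri: str, document_address: str) -> bool:
    """Whether setting src obliges the user agent to resolve and fetch it."""
    if not ua.supports_images or not ua.images_enabled:
        return False
    if ua.fetch_on_demand_only:
        return False
    if is_ignored_self_reference(src, base_uri, document_address):
        return False
    return True
```

Note that an empty src is only ignored when the base URI matches the document's address; with a different base URI the obligation to fetch still applies in this sketch.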
Fetching the image must delay the load event of the element's document until the task that is queued by the networking task source once the resource has been fetched (defined below) has been run.
This, unfortunately, can be used to perform a rudimentary port scan of the user's local network (especially in conjunction with scripting, though scripting isn't actually necessary to carry out such an attack). User agents may implement cross-origin access control policies that mitigate this attack.
If the image is in a supported image type and its dimensions are known, then the image is said to be available (this affects exactly what the element represents, as defined below). This can be true even before the image is completely downloaded, if the user agent supports incremental rendering of images; in such cases, each task that is queued by the networking task source while the image is being fetched must update the presentation of the image appropriately. It can also stop being true, e.g. if the user agent finds, after obtaining the image's dimensions, that the image data is actually fatally corrupted.
If the image was not fetched (e.g. because the UA's image support is disabled, or because the src attribute's value is an ignored self-reference), or if the conditions in the previous paragraph are not met, then the image is not available.
An image might be available in one view but not another. For instance, a Document could be rendered by a screen reader providing a speech synthesis view of the output of a Web browser using the screen media. In this case, the image would be available in the Web browser's screen view, but not available in the screen reader's view.
Whether the image is fetched successfully or not (e.g. whether the response code was a 2xx code or equivalent) must be ignored when determining the image's type and whether it is a valid image.
This allows servers to return images with error responses, and have them displayed.
The user agent should apply the image sniffing rules to determine the type of the image, with the image's associated Content-Type headers giving the official type. If these rules are not applied, then the type of the image must be the type given by the image's associated Content-Type headers.
User agents must not support non-image resources with the img element (e.g. XML files whose root element is an HTML element). User agents must not run executable code (e.g. scripts) embedded in the image resource. User agents must only display the first page of a multipage resource (e.g. a PDF file). User agents must not allow the resource to act in an interactive fashion, but should honor any animation in the resource.
This specification does not specify which image types are to be supported.
The task that is queued by the networking task source once the resource has been fetched must act as appropriate given the following alternatives:
If the image is available: queue a task to fire a simple event called load at the img element (this happens after complete starts returning true).
Otherwise: queue a task to fire a simple event called error on the img element.
The task source for these tasks is the DOM manipulation task source.
What
an
img
element
represents
depends
on
the
src
attribute
and
the
alt
attribute.
src
attribute
is
set
and
the
alt
attribute
is
set
to
the
empty
string
The image is either decorative or supplemental to the rest of the content, redundant with some other information in the document.
If
the
image
is
available
and
the
user
agent
is
configured
to
display
that
image,
then
the
element
represents
the
image
specified
by
the
src
attribute.
Otherwise, the element represents nothing, and may be omitted completely from the rendering. User agents may provide the user with a notification that an image is present but has been omitted from the rendering.
src
attribute
is
set
and
the
alt
attribute
is
set
to
a
value
that
isn't
empty
The
image
is
a
key
part
of
the
content;
the
alt
attribute
gives
a
textual
equivalent
or
replacement
for
the
image.
If the image is available and the user agent is configured to display that image, then the element represents the image specified by the src attribute.
Otherwise, the element represents the text given by the alt attribute. User agents may provide the user with a notification that an image is present but has been omitted from the rendering.
If the src attribute is set and the alt attribute is not
The image might be a key part of the content, and there is no textual equivalent of the image available.
In a conforming document, the absence of the alt attribute indicates that the image is a key part of the content but that a textual replacement for the image was not available when the image was generated.
If the image is available, the element represents the image specified by the src attribute.
If the image is not available or if the user agent is not configured to display the image, then the user agent should display some sort of indicator that there is an image that is not being rendered, and may, if requested by the user, or if so configured, or when required to provide contextual information in response to navigation, provide caption information for the image, derived as follows:
If the image has a title attribute whose value is not the empty string, then the value of that attribute is the caption information; abort these steps.
If the image is the child of a figure element that has a child legend element, then the contents of the first such legend element are the caption information; abort these steps.
Run the algorithm to create the outline for the document.
If the img element did not end up associated with a heading in the outline, or if there are any other images that are lacking an alt attribute and that are associated with the same heading in the outline as the img element in question, then there is no caption information; abort these steps.
The caption information is the heading with which the image is associated according to the outline.
If the src attribute is not set and either the alt attribute is set to the empty string or the alt attribute is not set at all
The element represents nothing.
Otherwise, the element represents the text given by the alt attribute.
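The cases above can be summarized in a short, non-normative sketch. It assumes a plain object in place of a real img element, with src and alt set to null when the attribute is absent; the function name and the available and showImages parameters are hypothetical stand-ins for user agent state.

```javascript
// Illustrative model only, not part of any API.
// img: { src: string|null, alt: string|null } (null means "attribute absent").
// available: the image was fetched and decoded.
// showImages: the user agent is configured to display images.
function imgRepresents(img, available, showImages) {
  const hasSrc = img.src !== null;
  const hasAlt = img.alt !== null;

  if (!hasSrc) {
    // No src: non-empty alt text is represented, anything else is nothing.
    return hasAlt && img.alt !== ""
      ? { kind: "text", text: img.alt }
      : { kind: "nothing" };
  }
  if (available && showImages) {
    return { kind: "image", src: img.src };
  }
  if (!hasAlt) {
    // Key content with no replacement text: the user agent should show
    // some indicator that an image is not being rendered.
    return { kind: "indicator" };
  }
  // Empty alt: decorative or supplemental. Non-empty alt: replacement text.
  return img.alt === "" ? { kind: "nothing" } : { kind: "text", text: img.alt };
}
```

For the case where alt is absent, the sketch reports what the user is shown (an indicator) rather than what the element formally represents.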
The alt attribute does not represent advisory information. User agents must not present the contents of the alt attribute in the same way as content of the title attribute.
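The caption-derivation steps given earlier for images with no alt attribute draw only on title, legend, and heading text, never on alt. The following non-normative sketch walks through those steps. It assumes the DOM lookups and the outline algorithm have already been run and their results passed in as plain data; all property names are hypothetical.

```javascript
// Illustrative model only. info has these assumed fields:
//   title:        value of the title attribute, or null if absent
//   figureLegend: contents of the first child legend of a parent figure, or null
//   heading:      heading associated with the img in the outline, or null
//   otherAltlessImagesOnHeading: number of other images lacking alt that are
//                 associated with the same heading
function captionInformation(info) {
  // Step 1: a non-empty title attribute wins.
  if (info.title !== null && info.title !== "") return info.title;
  // Step 2: otherwise, the first legend of a parent figure.
  if (info.figureLegend !== null) return info.figureLegend;
  // Step 3 (creating the outline) is assumed to have produced `heading`.
  // Step 4: no heading, or the heading is shared with other alt-less images.
  if (info.heading === null || info.otherAltlessImagesOnHeading > 0) return null;
  // Step 5: the associated heading is the caption information.
  return info.heading;
}
```

A null result means there is no caption information.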
User agents may always provide the user with the option to display any image, or to prevent any image from being displayed. User agents may also apply image analysis heuristics to help the user make sense of the image when the user is unable to make direct use of the image, e.g. due to a visual disability or because they are using a text terminal with no graphics capabilities.
The contents of img elements, if any, are ignored for the purposes of rendering.
The usemap attribute, if present, can indicate that the image has an associated image map.
The ismap attribute, when used on an element that is a descendant of an a element with an href attribute, indicates by its presence that the element provides access to a server-side image map. This affects how events are handled on the corresponding a element.
The ismap attribute is a boolean attribute. The attribute must not be specified on an element that does not have an ancestor a element with an href attribute.
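As a non-normative illustration, the ancestor requirement for ismap could be checked as sketched below. The node shape ({ name, attrs, parent }) is a hypothetical stand-in for real DOM nodes, and the function name is an assumption.

```javascript
// Illustrative model only: nodes are plain objects of the assumed form
// { name: string, attrs: object, parent: node|null }.
function ismapAllowed(node) {
  // Walk up the ancestor chain looking for an a element with an href attribute.
  for (let p = node.parent; p; p = p.parent) {
    if (p.name === "a" && Object.prototype.hasOwnProperty.call(p.attrs, "href")) {
      return true;
    }
  }
  return false;
}
```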
The img element supports dimension attributes.
The DOM attributes alt, src, useMap, and isMap each must reflect the respective content attributes of the same name.
width [ = value ]
height [ = value ]
These attributes return the actual rendered dimensions of the image, or zero if the dimensions are not known.
They can be set, to change the corresponding content attributes.
complete
Returns true if the image has been downloaded, decoded, and found to be valid; otherwise, returns false.
Image( [ width [, height ] ] )
Returns a new img element, with the width and height attributes set to the values passed in the relevant arguments, if applicable.
The DOM attributes width and height must return the rendered width and height of the image, in CSS pixels, if the image is being rendered, and is being rendered to a visual medium; or else the intrinsic width and height of the image, in CSS pixels, if the image is available but not being rendered to a visual medium; or else 0, if the image is not available or its dimensions are not known. [CSS]
On setting, they must act as if they reflected the respective content attributes of the same name.
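The three-way fallback on getting can be modelled as in the following non-normative sketch. The state object and its property names are assumptions made for illustration; a real user agent derives these values from its rendering and fetching state.

```javascript
// Illustrative model only. state has these assumed fields:
//   rendered:      the image is currently being rendered
//   visualMedium:  the rendering target is a visual medium
//   available:     the image has been fetched and decoded
//   renderedSize:  rendered dimension in CSS pixels (used when rendered)
//   intrinsicSize: intrinsic dimension in CSS pixels, or null if not known
function domDimension(state) {
  if (state.rendered && state.visualMedium) {
    return state.renderedSize;
  }
  if (state.available && state.intrinsicSize !== null) {
    return state.intrinsicSize;
  }
  // Image not available, or its dimensions are not known.
  return 0;
}
```

The same function applies to width and to height; only the values passed in differ.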
The DOM attribute complete must return true if the user agent has fetched the image specified in the src attribute, and it is in a supported image type (i.e. it was decoded without fatal errors), even if the final task queued by the networking task source for the fetching of the image resource has not yet been processed. Otherwise, the attribute must return false.
The value of complete can thus change while a script is executing.
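Because complete can become true before the load event has been dispatched, a script that wants to act once an image is usable may check complete first and fall back to event listeners. The sketch below is non-normative; the function name is hypothetical, and img is assumed to be any object exposing a boolean complete property and an addEventListener(type, handler) method, as an img element does.

```javascript
// Illustrative helper only. Calls onLoad once the image is usable,
// or onError if fetching or decoding fails.
function whenImageSettles(img, onLoad, onError) {
  if (img.complete) {
    // Already fetched and decoded; the load event may or may not
    // have been dispatched yet, so do not wait for it.
    onLoad(img);
    return;
  }
  // Not complete yet: any load or error event is dispatched by a later
  // task, so listeners registered now will still receive it.
  img.addEventListener("load", () => onLoad(img));
  img.addEventListener("error", () => onError(img));
}
```

Under the definition given here, complete is false for an image that has already failed to load, so a caller that attaches this helper after the error event has fired would not be notified.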
Three constructors are provided for creating HTMLImageElement objects (in addition to the factory methods from DOM Core such as createElement()): Image(), Image(width), and Image(width, height). When invoked as constructors, these must return a new HTMLImageElement object (a new img element). If the width argument is present, the new object's width content attribute must be set to width. If the height argument is also present, the new object's height content attribute must be set to height.
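The effect of the optional arguments can be modelled without a DOM, as in this non-normative sketch. The function name is hypothetical, and the returned plain object merely stands in for the content attributes of the new element.

```javascript
// Illustrative model only: returns the content attributes that the
// constructor would set, as a plain object of strings.
function imageConstructorAttributes(width, height) {
  const attrs = {};
  if (width !== undefined) {
    attrs.width = String(width);
    // height is only considered when width is also present.
    if (height !== undefined) {
      attrs.height = String(height);
    }
  }
  return attrs;
}
```

In a browser, the corresponding calls would be new Image(), new Image(640), and new Image(640, 480).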
A single image can have different appropriate alternative text depending on the context.
In each of the following cases, the same image is used, yet the alt text is different each time. The image is the coat of arms of the Canton Geneva in Switzerland.
Here it is used as a supplementary icon:
<p>I lived in <img src="carouge.svg" alt=""> Carouge.</p>
Here it is used as an icon representing the town:
<p>Home town: <img src="carouge.svg" alt="Carouge"></p>
Here it is used as part of a text on the town:
<p>Carouge has a coat of arms.</p> <p><img src="carouge.svg" alt="The coat of arms depicts a lion, sitting in front of a tree."></p> <p>It is used as decoration all over the town.</p>
Here it is used as a way to support a similar text where the description is given as well as, instead of as an alternative to, the image:
<p>Carouge has a coat of arms.</p> <p><img src="carouge.svg" alt=""></p> <p>The coat of arms depicts a lion, sitting in front of a tree. It is used as decoration all over the town.</p>
Here it is used as part of a story:
<p>He picked up the folder and a piece of paper fell out.</p> <p><img src="carouge.svg" alt="Shaped like a shield, the paper had a red background, a green tree, and a yellow lion with its tongue hanging out and whose tail was shaped like an S."></p> <p>He stared at the folder. S! The answer he had been looking for all this time was simply the letter S! How had he not seen that before? It all came together now. The phone call where Hector had referred to a lion's tail, the time Marco had stuck his tongue out...</p>
Here it is not known at the time of publication what the image will be, only that it will be a coat of arms of some kind, and thus no replacement text can be provided, and instead only a brief caption for the image is provided, in the title attribute:
<p>The last user to have uploaded a coat of arms uploaded this one:</p> <p><img src="last-uploaded-coat-of-arms.cgi" title="User-uploaded coat of arms."></p>
Ideally, the author would find a way to provide real replacement text even in this case, e.g. by asking the previous user. Not providing replacement text makes the document more difficult to use for people who are unable to view images, e.g. blind users, users on very low-bandwidth connections or who pay by the byte, or users who are forced to use a text-only Web browser.
Here are some more examples showing the same picture used in different contexts, with different appropriate alternate texts each time.
<article> <h1>My cats</h1> <h2>Fluffy</h2> <p>Fluffy is my favorite.</p> <img src="fluffy.jpg" alt="She likes playing with a ball of yarn."> <p>She's just too cute.</p> <h2>Miles</h2> <p>My other cat, Miles just eats and sleeps.</p> </article>
<article> <h1>Photography</h1> <h2>Shooting moving targets indoors</h2> <p>The trick here is to know how to anticipate; to know at what speed and what distance the subject will pass by.</p> <img src="fluffy.jpg" alt="A cat flying by, chasing a ball of yarn, can be photographed quite nicely using this technique."> <h2>Nature by night</h2> <p>To achieve this, you'll need either an extremely sensitive film, or immense flash lights.</p> </article>
<article> <h1>About me</h1> <h2>My pets</h2> <p>I've got a cat named Fluffy and a dog named Miles.</p> <img src="fluffy.jpg" alt="Fluffy, my cat, tends to keep itself busy."> <p>My dog Miles and I like go on long walks together.</p> <h2>music</h2> <p>After our walks, having emptied my mind, I like listening to Bach.</p> </article>
<article> <h1>Fluffy and the Yarn</h1> <p>Fluffy was a cat who liked to play with yarn. He also liked to jump.</p> <aside><img src="fluffy.jpg" alt="" title="Fluffy"></aside> <p>He would play in the morning, he would play in the evening.</p> </article>
The requirements for the alt attribute depend on what the image is intended to represent, as described in the following sections.
Some of the following sections are controversial and do not enjoy broad consensus.
The notion that there may be images in the document that contain an alt attribute that is blank may cause a number of usability concerns for AT. The WAI PFWG has raised an issue with the HTML WG regarding the usability issues created by not making the alt tag mandatory on all img elements.
When an a element that is a hyperlink, or a button element, has no textual content but contains one or more images, the alt attributes must contain text that together convey the purpose of the link or button.
In this example, a user is asked to pick his preferred color from a list of three. Each color is given by an image, but for users who have configured their user agent not to display images, the color names are used instead:
<h1>Pick your color</h1> <ul> <li><a href="green.html"><img src="green.jpeg" alt="Green"></a></li> <li><a href="blue.html"><img src="blue.jpeg" alt="Blue"></a></li> <li><a href="red.html"><img src="red.jpeg" alt="Red"></a></li> </ul>
In this example, each button has a set of images to indicate the kind of color output desired by the user. The first image is used in each case to give the alternative text.
<button name="rgb"><img src="red" alt="RGB"><img src="green" alt=""><img src="blue" alt=""></button> <button name="cmyk"> <img src="cyan" alt="CMYK"><img src="magenta" alt=""><img src="yellow" alt=""><img src="black" alt=""> </button>
Since each image represents one part of the text, it could also be written like this:
<button name="rgb"><img src="red" alt="R"><img src="green" alt="G"><img src="blue" alt="B"></button> <button name="cmyk"> <img src="cyan" alt="C"><img src="magenta" alt="M"><img src="yellow" alt="Y"><img src="black" alt="K"> </button>
However, with other alternative text, this might not work, and putting all the alternative text into one image in each case might make more sense:
<button name="rgb"><img src="red" alt="sRGB profile"><img src="green" alt=""><img src="blue" alt=""></button> <button name="cmyk"> <img src="cyan" alt="CMYK profile"><img src="magenta" alt=""><img src="yellow" alt=""><img src="black" alt=""> </button>
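A conformance checker cannot judge whether text conveys a purpose, but it can at least confirm that the images in an image-only link or button contribute some text between them. The following non-normative sketch concatenates the alt values in document order; the function names are hypothetical, and alt values are passed as an array with null for a missing attribute.

```javascript
// Illustrative helpers only.
// alts: alt values of the images inside the link or button, in order;
//       null stands for an img with no alt attribute.
function combinedAltText(alts) {
  return alts.map((a) => (a === null ? "" : a)).join("").trim();
}

// True when the images together supply at least some text.
function imagesSupplyLabel(alts) {
  return combinedAltText(alts) !== "";
}
```

With the button examples above, ["RGB", "", ""] and ["R", "G", "B"] both combine to "RGB".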
Sometimes something can be more clearly stated in graphical form, for example as a flowchart, a diagram, a graph, or a simple map showing directions. In such cases, an image can be given using the img element, but the lesser textual version must still be given, so that users who are unable to view the image (e.g. because they have a very slow connection, or because they are using a text-only browser, or because they are listening to the page being read out by a hands-free automobile voice Web browser, or simply because they are blind) are still able to understand the message being conveyed.
The text must be given in the alt attribute, and must convey the same message as the image specified in the src attribute.
It is important to realize that the alternative text is a replacement for the image, not a description of the image.
In the following example we have a flowchart in image form, with text in the alt attribute rephrasing the flowchart in prose form:
<p>In the common case, the data handled by the tokenization stage comes from the network, but it can also come from script.</p> <p><img src="images/parsing-model-overview.png" alt="The network passes data to the Tokenizer stage, which passes data to the Tree Construction stage. From there, data goes to both the DOM and to Script Execution. Script Execution is linked to the DOM, and, using document.write(), passes data to the Tokenizer."> </p>
Here's another example, showing a good solution and a bad solution to the problem of including an image in a description.
First, here's the good solution. This sample shows how the alternative text should just be what you would have put in the prose if the image had never existed.
<!-- This is the correct way to do things. --> <p> You are standing in an open field west of a house. <img src="house.jpeg" alt="The house is white, with a boarded front door."> There is a small mailbox here. </p>
Second, here's the bad solution. In this incorrect way of doing things, the alternative text is simply a description of the image, instead of a textual replacement for the image. It's bad because when the image isn't shown, the text doesn't flow as well as in the first example.
<!-- This is the wrong way to do things. --> <p> You are standing in an open field west of a house. <img src="house.jpeg" alt="A white house, with a boarded front door."> There is a small mailbox here. </p>
Text such as "Photo of white house with boarded door" would be equally bad alternative text (though it could be suitable for the title attribute or in the legend element of a figure with this image).
A document can contain information in iconic form. The icon is intended to help users of visual browsers to recognize features at a glance.
In some cases, the icon is supplemental to a text label conveying the same meaning. In those cases, the alt attribute must be present but must be empty.
Here the icons are next to text that conveys the same meaning, so they have an empty alt attribute:
<nav> <p><a href="/help/"><img src="/icons/help.png" alt=""> Help</a></p> <p><a href="/configure/"><img src="/icons/configuration.png" alt=""> Configuration Tools</a></p> </nav>
In other cases, the icon has no text next to it describing what it means; the icon is supposed to be self-explanatory. In those cases, an equivalent textual label must be given in the alt attribute.
Here, posts on a news site are labeled with an icon indicating their topic.
<body> <article> <header> <h1>Ratatouille wins <i>Best Movie of the Year</i> award</h1> <p><img src="movies.png" alt="Movies"></p> </header> <p>Pixar has won yet another <i>Best Movie of the Year</i> award, making this its 8th win in the last 12 years.</p> </article> <article> <header> <h1>Latest TWiT episode is online</h1> <p><img src="podcasts.png" alt="Podcasts"></p> </header> <p>The latest TWiT episode has been posted, in which we hear several tech news stories as well as learning much more about the iPhone. This week, the panelists compare how reflective their iPhones' Apple logos are.</p> </article> </body>
Many pages include logos, insignia, flags, or emblems, which stand for a particular entity such as a company, organization, project, band, software package, country, or some such.
If the logo is being used to represent the entity, e.g. as a page heading, the alt attribute must contain the name of the entity being represented by the logo. The alt attribute must not contain text like the word "logo", as it is not the fact that it is a logo that is being conveyed, it's the entity itself.
If the logo is being used next to the name of the entity that it represents, then the logo is supplemental, and its alt attribute must instead be empty.
If the logo is merely used as decorative material (as branding, or, for example, as a side image in an article that mentions the entity to which the logo belongs), then the entry below on purely decorative images applies. If the logo is actually being discussed, then it is being used as a phrase or paragraph (the description of the logo) with an alternative graphical representation (the logo itself), and the first entry above applies.
In the following snippets, all four of the above cases are present. First, we see a logo used to represent a company:
<h1> <img src="XYZ.gif" alt="The XYZ company"> </h1>
Next, we see a paragraph which uses a logo right next to the company name, and so doesn't have any alternative text:
<article> <h2>News</h2> <p>We have recently been looking at buying the <img src="alpha.gif" alt=""> ΑΒΓ company, a small Greek company specializing in our type of product.</p>
In this third snippet, we have a logo being used in an aside, as part of the larger article discussing the acquisition:
<aside><p><img src="alpha-large.gif" alt=""></p></aside> <p>The ΑΒΓ company has had a good quarter, and our pie chart studies of their accounts suggest a much bigger blue slice than its green and orange slices, which is always a good sign.</p> </article>
Finally, we have an opinion piece talking about a logo, and the logo is therefore described in detail in the alternative text.
<p>Consider for a moment their logo:</p> <p><img src="/images/logo" alt="It consists of a green circle with a green question mark centered inside it."></p> <p>How unoriginal can you get? I mean, oooooh, a question mark, how <em>revolutionary</em>, how utterly <em>ground-breaking</em>, I'm sure everyone will rush to adopt those specifications now! They could at least have tried for some sort of, I don't know, sequence of rounded squares with varying shades of green and bold white outlines, at least that would look good on the cover of a blue book.</p>
This example shows how the alternative text should be written such that if the image isn't available, and the text is used instead, the text flows seamlessly into the surrounding text, as if the image had never been there in the first place.
Sometimes, an image just consists of text, and the purpose of the image is not to highlight the actual typographic effects used to render the text, but just to convey the text itself.
In such cases, the alt attribute must be present but must consist of the same text as written in the image itself.
Consider a graphic containing the text "Earth Day", but with the letters all decorated with flowers and plants. If the text is merely being used as a heading, to spice up the page for graphical users, then the correct alternative text is just the same text "Earth Day", and no mention need be made of the decorations:
<h1> <img src="earthdayheading.png" alt="Earth Day"> </h1>
In many cases, the image is actually just supplementary, and its presence merely reinforces the surrounding text. In these cases, the alt attribute must be present but its value must be the empty string.
In general, an image falls into this category if removing the image doesn't make the page any less useful, but including the image makes it a lot easier for users of visual browsers to understand the concept.
A flowchart that repeats the previous paragraph in graphical form:
<p>The network passes data to the Tokenizer stage, which passes data to the Tree Construction stage. From there, data goes to both the DOM and to Script Execution. Script Execution is linked to the DOM, and, using document.write(), passes data to the Tokenizer.</p> <p><img src="images/parsing-model-overview.png" alt=""></p>
In these cases, it would be wrong to include alternative text that consists of just a caption. If a caption is to be included, then either the title attribute can be used, or the figure and legend elements can be used. In the latter case, the image would in fact be a phrase or paragraph with an alternative graphical representation, and would thus require alternative text.
<!-- Using the title="" attribute --> <p>The network passes data to the Tokenizer stage, which passes data to the Tree Construction stage. From there, data goes to both the DOM and to Script Execution. Script Execution is linked to the DOM, and, using document.write(), passes data to the Tokenizer.</p> <p><img src="images/parsing-model-overview.png" alt="" title="Flowchart representation of the parsing model."> </p>
<!-- Using <figure> and <legend> --> <p>The network passes data to the Tokenizer stage, which passes data to the Tree Construction stage. From there, data goes to both the DOM and to Script Execution. Script Execution is linked to the DOM, and, using document.write(), passes data to the Tokenizer.</p> <figure> <img src="images/parsing-model-overview.png" alt="The Network leads to the Tokenizer, which leads to the Tree Construction. The Tree Construction leads to two items. The first is Script Execution, which leads via document.write() back to the Tokenizer. The second item from which Tree Construction leads is the DOM. The DOM is related to the Script Execution."> <legend>Flowchart representation of the parsing model.</legend> </figure>
<!-- This is WRONG. Do not do this. Instead, do what the above examples do. -->
<p>The network passes data to the Tokenizer stage, which
passes data to the Tree Construction stage. From there, data goes
to both the DOM and to Script Execution. Script Execution is
linked to the DOM, and, using document.write(), passes data to
the Tokenizer.</p>
<p><img src="images/parsing-model-overview.png"
alt="Flowchart representation of the parsing model."></p>
<!-- Never put the image's caption in the alt="" attribute! -->
A graph that repeats the previous paragraph in graphical form:
<p>According to a study covering several billion pages, about 62% of documents on the Web in 2007 triggered the Quirks rendering mode of Web browsers, about 30% triggered the Almost Standards mode, and about 9% triggered the Standards mode.</p> <p><img src="rendering-mode-pie-chart.png" alt=""></p>
In general, if an image is decorative but isn't especially page-specific, for example an image that forms part of a site-wide design scheme, the image should be specified in the site's CSS, not in the markup of the document.
However, a decorative image that isn't discussed by the surrounding text but still has some relevance can be included in a page using the img element. Such images are decorative, but still form part of the content. In these cases, the alt attribute must be present but its value must be the empty string.
Examples where the image is purely decorative despite being relevant would include things like a photo of the Black Rock City landscape in a blog post about an event at Burning Man, or an image of a painting inspired by a poem, on a page reciting that poem. The following snippet shows an example of the latter case (only the first verse is included in this snippet):
<h1>The Lady of Shalott</h1> <p><img src="shalott.jpeg" alt=""></p> <p>On either side the river lie<br> Long fields of barley and of rye,<br> That clothe the wold and meet the sky;<br> And through the field the road run by<br> To many-tower'd Camelot;<br> And up and down the people go,<br> Gazing where the lilies blow<br> Round an island there below,<br> The island of Shalott.</p>
When a picture has been sliced into smaller image files that are then displayed together to form the complete picture again, one of the images must have its alt attribute set as per the relevant rules that would be appropriate for the picture as a whole, and then all the remaining images must have their alt attribute set to the empty string.
In the following example, a picture representing a company logo for XYZ Corp has been split into two pieces, the first containing the letters "XYZ" and the second with the word "Corp". The alternative text ("XYZ Corp") is all in the first image.
<h1> <img src="logo1.png" alt="XYZ Corp"><img src="logo2.png" alt=""> </h1>
In the following example, a rating is shown as three filled stars and two empty stars. While the alternative text could have been "★★★☆☆", the author has instead decided to more helpfully give the rating in the form "3 out of 5". That is the alternative text of the first image, and the rest have blank alternative text.
<p>Rating: <meter max=5 value=3><img src="1" alt="3 out of 5" ><img src="1" alt=""><img src="1" alt=""><img src="0" alt="" ><img src="0" alt=""> </meter></p>
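The rule for sliced pictures lends itself to a simple mechanical check: at most one slice may carry anything other than the empty string. The sketch below is non-normative, with a hypothetical function name; alt values are given in document order, with null for a missing attribute.

```javascript
// Illustrative check only.
// alts: alt values of the slices; null stands for a missing alt attribute.
// One slice carries the alternative text for the whole picture (which may
// itself be empty for a decorative picture); every other slice must have
// alt set to the empty string.
function slicedAltsFollowRule(alts) {
  const carriers = alts.filter((a) => a !== "");
  return carriers.length <= 1;
}
```

For the logo example above, ["XYZ Corp", ""] passes, whereas splitting the text as ["XYZ", "Corp"] would not.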
Generally, image maps should be used instead of slicing an image for links.
However, if an image is indeed sliced and any of the components of the sliced picture are the sole contents of links, then one image per link must have alternative text in its alt attribute representing the purpose of the link.
In the following example, a picture representing the flying spaghetti monster emblem has been sliced so that the left noodly appendages and the right noodly appendages are in different images, letting the user pick the left side or the right side in an adventure.
<h1>The Church</h1> <p>You come across a flying spaghetti monster. Which side of His Noodliness do you wish to reach out for?</p> <p><a href="?go=left" ><img src="fsm-left.png" alt="Left side. "></a ><img src="fsm-middle.png" alt="" ><a href="?go=right"><img src="fsm-right.png" alt="Right side."></a></p>
In some cases, the image is a critical part of the content. This could be the case, for instance, on a page that is part of a photo gallery. The image is the whole point of the page containing it.
How to provide alternative text for an image that is a key part of the content depends on the image's provenance.
When it is possible for detailed alternative text to be provided, for example if the image is part of a series of screenshots in a magazine review, or part of a comic strip, or is a photograph in a blog entry about that photograph, text that can serve as a substitute for the image must be given as the contents of the alt attribute.
A screenshot in a gallery of screenshots for a new OS, with some alternative text:
<figure>
<img src="KDE%20Light%20desktop.png"
alt="The desktop is blue, with icons along the left hand side in
two columns, reading System, Home, K-Mail, etc. A window is
open showing that menus wrap to a second line if they
cannot fit in the window. The window has a list of icons
along the top, with an address bar below it, a list of
icons for tabs along the left edge, a status bar on the
bottom, and two panes in the middle. The desktop has a bar
at the bottom of the screen with a few buttons, a pager, a
list of open applications, and a clock.">
<legend>Screenshot of a KDE desktop.</legend>
</figure>
A graph in a financial report:
<img src="sales.gif"
title="Sales graph"
alt="From 1998 to 2005, sales increased by the following percentages
with each year: 624%, 75%, 138%, 40%, 35%, 9%, 21%">
Note that "sales graph" would be inadequate alternative text for a sales graph. Text that would be a good caption is not generally suitable as replacement text.
In certain cases, the nature of the image might be such that providing thorough alternative text is impractical. For example, the image could be indistinct, or could be a complex fractal, or could be a detailed topographical map.
In these cases, the alt attribute must contain some suitable alternative text, but it may be somewhat brief.
Sometimes there simply is no text that can do justice to an image. For example, there is little that can be said to usefully describe a Rorschach inkblot test. However, a description, even if brief, is still better than nothing:
<figure> <img src="/commons/a/a7/Rorschach1.jpg" alt="A shape with left-right symmetry with indistinct edges, with a small gap in the center, two larger gaps offset slightly from the center, with two similar gaps under them. The outline is wider in the top half than the bottom half, with the sides extending upwards higher than the center, and the center extending below the sides."> <legend>A black outline of the first of the ten cards in the Rorschach inkblot test.</legend> </figure>
Note that the following would be a very bad use of alternative text:
<!-- This example is wrong. Do not copy it. --> <figure> <img src="/commons/a/a7/Rorschach1.jpg" alt="A black outline of the first of the ten cards in the Rorschach inkblot test."> <legend>A black outline of the first of the ten cards in the Rorschach inkblot test.</legend> </figure>
Including the caption in the alternative text like this isn't useful because it effectively duplicates the caption for users who don't have images, taunting them twice yet not helping them any more than if they had only read or heard the caption once.
Another example of an image that defies full description is a fractal, which, by definition, is infinite in complexity.
The following example shows one possible way of providing alternative text for the full view of an image of the Mandelbrot set.
<img src="ms1.jpeg" alt="The Mandelbrot set appears as a cardioid with its cusp on the real axis in the positive direction, with a smaller bulb aligned along the same center line, touching it in the negative direction, and with these two shapes being surrounded by smaller bulbs of various sizes.">
In some unfortunate cases, there might be no alternative text available at all, either because the image is obtained in some automated fashion without any associated alternative text (e.g. a Webcam), or because the page is being generated by a script using user-provided images where the user did not provide suitable or usable alternative text (e.g. photograph sharing sites), or because the author does not himself know what the images represent (e.g. a blind photographer sharing an image on his blog).
In such cases, the alt attribute's value may be omitted, but one of the following conditions must be met as well:
The title attribute is present and has a non-empty value.
The img element is in a figure element that contains a legend element that contains content other than inter-element whitespace.
The img element is part of the only paragraph directly in its section, and is the only img element without an alt attribute in its section, and its section has an associated heading.
Such cases are to be kept to an absolute minimum. If there is even the slightest possibility of the author having the ability to provide real alternative text, then it would not be acceptable to omit the alt attribute.
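The three conditions under which alt may be omitted can be expressed as a small predicate. This is a non-normative sketch: the function and property names are hypothetical, and the inputs are assumed to have been computed from the document beforehand. It checks only the structural conditions, not whether alternative text was genuinely unobtainable.

```javascript
// Illustrative check only. ctx has these assumed fields:
//   title:      value of the title attribute, or null if absent
//   legendText: text content of a legend in an enclosing figure, or null
//   onlyParagraphInSection:  the img is part of the only paragraph
//                            directly in its section
//   onlyAltlessImgInSection: no other img in the section lacks alt
//   sectionHasHeading:       the section has an associated heading
function altMayBeOmitted(ctx) {
  // Condition 1: a non-empty title attribute.
  if (ctx.title !== null && ctx.title !== "") return true;
  // Condition 2: a legend with content other than whitespace.
  if (ctx.legendText !== null && ctx.legendText.trim() !== "") return true;
  // Condition 3: the section heading can serve as the caption.
  return Boolean(
    ctx.onlyParagraphInSection &&
    ctx.onlyAltlessImgInSection &&
    ctx.sectionHasHeading
  );
}
```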
A photo on a photo-sharing site, if the site received the image with no metadata other than the caption:
<figure> <img src="1100670787_6a7c664aef.jpg"> <legend>Bubbles traveled everywhere with us.</legend> </figure>
It could also be marked up like this:
<article> <h1>Bubbles traveled everywhere with us.</h1> <img src="1100670787_6a7c664aef.jpg"> </article>
In either case, though, it would be better if a detailed description of the important parts of the image were obtained from the user and included on the page.
A blind user's blog in which a photo taken by the user is shown. Initially, the user might not have any idea what the photo he took shows:
<article> <h1>I took a photo</h1> <p>I went out today and took a photo!</p> <figure> <img src="photo2.jpeg"> <legend>A photograph taken blindly from my front porch.</legend> </figure> </article>
Eventually though, the user might obtain a description of the image from his friends and could then include alternative text:
<article> <h1>I took a photo</h1> <p>I went out today and took a photo!</p> <figure> <img src="photo2.jpeg" alt="The photograph shows my hummingbird feeder hanging from the edge of my roof. It is half full, but there are no birds around. In the background, out-of-focus trees fill the shot. The feeder is made of wood with a metal grate, and it contains peanuts. The edge of the roof is wooden too, and is painted white with light blue streaks."> <legend>A photograph taken blindly from my front porch.</legend> </figure> </article>
Sometimes the entire point of the image is that a textual description is not available, and the user is to provide the description. For instance, the point of a CAPTCHA image is to see if the user can literally read the graphic.
Here is one way to mark up a CAPTCHA (note the title attribute):
<p><label>What does this image say? <img src="captcha.cgi?id=8934" title="CAPTCHA"> <input type=text name=captcha></label> (If you cannot see the image, you can use an <a href="?audio">audio</a> test instead.)</p>
Another example would be software that displays images and asks for alternative text precisely for the purpose of then writing a page with correct alternative text. Such a page could have a table of images, like this:
<table> <thead> <tr> <th> Image <th> Description <tbody> <tr> <td> <img src="2421.png" title="Image 640 by 100, filename 'banner.gif'"> <td> <input name="alt2421"> <tr> <td> <img src="2422.png" title="Image 200 by 480, filename 'ad3.gif'"> <td> <input name="alt2422"> </table>
Notice that even in this example, as much useful information as possible is still included in the title attribute.
Since some users cannot use images at all (e.g. because they have a very slow connection, or because they are using a text-only browser, or because they are listening to the page being read out by a hands-free automobile voice Web browser, or simply because they are blind), the alt attribute is only allowed to be omitted rather than being provided with replacement text when no alternative text is available and none can be made available, as in the above examples. Lack of effort on the part of the author is not an acceptable reason for omitting the alt attribute.
Generally authors should avoid using img elements for purposes other than showing images.
If an img element is being used for purposes other than showing an image, e.g. as part of a service to count page views, then the alt attribute must be the empty string. In such cases, the width and height attributes should both be set to zero.
This section does not apply to documents that are publicly accessible, or whose target audience is not necessarily personally known to the author, such as documents on a Web site, e-mails sent to public mailing lists, or software documentation.
When an image is included in a private communication (such as an HTML e-mail) aimed at a specific person who is known to be able to view images, the alt attribute may be omitted. However, even in such cases it is strongly recommended that alternative text be included (as appropriate according to the kind of image involved, as described in the above entries), so that the e-mail is still usable should the user use a mail client that does not support images, or should the document be forwarded on to other users whose abilities might not include easily seeing images.
The most general rule to consider when writing alternative text is the following: the intent is that replacing every image with the text of its alt attribute not change the meaning of the page.
So, in general, alternative text can be written by considering what one would have written had one not been able to include the image.
A corollary to this is that the alt attribute's value should never contain text that could be considered the image's caption, title, or legend. It is supposed to contain replacement text that could be used by users instead of the image; it is not meant to supplement the image. The title attribute can be used for supplemental information.
One way to think of alternative text is to think about how you would read the page containing the image to someone over the phone, without mentioning that there is an image present. Whatever you say instead of the image is typically a good start for writing the alternative text.
Markup generators (such as WYSIWYG authoring tools) should, wherever possible, obtain alternative text from their users. However, it is recognized that in many cases, this will not be possible.
For images that are the sole contents of links, markup generators should examine the link target to determine the title of the target, or the URL of the target, and use information obtained in this manner as the alternative text.
As a last resort, implementors should either set the alt attribute to the empty string, under the assumption that the image is a purely decorative image that doesn't add any information but is still specific to the surrounding content, or omit the alt attribute altogether, under the assumption that the image is a key part of the content.
Markup generators should generally avoid using the image's own file name as the alternative text.
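The fallback order described above for markup generators can be sketched as a small function. This is only an illustration under stated assumptions: the function name chooseAltText and its option names (userText, isSoleLinkContent, linkTitle, linkUrl, assumeDecorative) are hypothetical and do not come from this specification. A return value of null stands for "omit the alt attribute".

```javascript
// Hypothetical helper for a markup generator (not defined by the specification).
// Returns the string to use as the alt attribute, or null to omit the attribute.
function chooseAltText({ userText, isSoleLinkContent, linkTitle, linkUrl, assumeDecorative } = {}) {
  // 1. Alternative text obtained from the user is always preferred.
  if (typeof userText === "string" && userText.trim() !== "") {
    return userText.trim();
  }
  // 2. For an image that is the sole contents of a link, fall back to the
  //    title of the link target, then to the URL of the target.
  if (isSoleLinkContent) {
    if (linkTitle) return linkTitle;
    if (linkUrl) return linkUrl;
  }
  // 3. Last resort: empty alt (image assumed purely decorative) or no alt at all
  //    (image assumed to be a key part of the content).
  //    The image's own file name is deliberately never used.
  return assumeDecorative ? "" : null;
}
```

Which of the two last-resort branches is appropriate depends on what the tool knows about the image; the boolean assumeDecorative merely makes that choice explicit.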
Conformance checkers must report the lack of an alt attribute as an error unless the conditions listed above for images whose contents are not known apply, or they have been configured to assume that the document is an e-mail or document intended for a specific person who is known to be able to view images.
The iframe element
Content attributes: src, name, sandbox, seamless, width, height
interface HTMLIFrameElement : HTMLElement {
attribute DOMString src;
attribute DOMString name;
attribute DOMString sandbox;
attribute boolean seamless;
attribute DOMString width;
attribute DOMString height;
readonly attribute Document contentDocument;
readonly attribute WindowProxy contentWindow;
};
The iframe element represents a nested browsing context.
The src attribute gives the address of a page that the nested browsing context is to contain. The attribute, if present, must be a valid URL. When the browsing context is created, if the attribute is present, the user agent must resolve the value of that attribute, relative to the element, and if that is successful, must then navigate the element's browsing context to the resulting absolute URL, with replacement enabled, and with the iframe element's document's browsing context as the source browsing context.
If the user navigates away from this page, the iframe's corresponding WindowProxy object will proxy new Window objects for new Document objects, but the src attribute will not change.
Whenever the src attribute is set, the user agent must resolve the value of that attribute, relative to the element, and if that is successful, the nested browsing context must be navigated to the resulting absolute URL, with the iframe element's document's browsing context as the source browsing context.
If the src attribute is not set when the element is created, or if its value cannot be resolved, the browsing context will remain at the initial about:blank page.
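As a rough illustration of the src handling described above, the following sketch computes the address a newly created nested browsing context would end up at. The function name initialIframeAddress and the baseUrl parameter are assumptions made for this example; the specification resolves the value "relative to the element", which is approximated here by passing an explicit base URL to the standard URL constructor.

```javascript
// Hypothetical sketch: which address does a new iframe's browsing context show?
// src     - value of the src attribute, or null if the attribute is absent
// baseUrl - assumed base URL of the element (stand-in for "relative to the element")
function initialIframeAddress(src, baseUrl) {
  // No src attribute: the browsing context stays at the initial about:blank page.
  if (src === null || src === undefined) {
    return "about:blank";
  }
  try {
    // Resolving succeeded: the browsing context is navigated to the absolute URL.
    return new URL(src, baseUrl).href;
  } catch (e) {
    // The value could not be resolved: remain at about:blank.
    return "about:blank";
  }
}
```

Note that this only models the initial address; as stated above, later navigation by the user changes the document shown but never changes the src attribute itself.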
The name attribute, if present, must be a valid browsing context name. The given value is used to name the nested browsing context. When the browsing context is created, if the attribute is present, the browsing context name must be set to the value of this attribute; otherwise, the browsing context name must be set to the empty string.
Whenever the name attribute is set, the nested browsing context's name must be changed to the new value. If the attribute is removed, the browsing context name must be set to the empty string.
When content loads in an iframe, after any load events are fired within the content itself, the user agent must fire a simple event called load at the iframe element. When content fails to load (e.g. due to a network error), then the user agent must fire a simple event called error at the element instead.
When there is an active parser in the iframe, and when anything in the iframe is delaying the load event of the iframe's browsing context's active document, the iframe must delay the load event of its document.
If, during the handling of the load event, the browsing context in the iframe is again navigated, that will further delay the load event.
The sandbox attribute, when specified, enables a set of extra restrictions on any content hosted by the iframe. Its value must be an unordered set of unique space-separated tokens. The allowed values are allow-same-origin, allow-forms, and allow-scripts.
When the attribute is set, the content is treated as being from a unique origin, forms and scripts are disabled, links are prevented from targeting other browsing contexts, and plugins are disabled. The allow-same-origin token allows the content to be treated as being from the same origin instead of forcing it into a unique origin, and the allow-forms and allow-scripts tokens re-enable forms and scripts respectively (though scripts are still prevented from creating popups).
While the sandbox attribute is specified, the iframe element's nested browsing context, and all the browsing contexts nested within it (either directly or indirectly through other nested browsing contexts) must have the following flags set:
This flag prevents content from navigating browsing contexts other than the sandboxed browsing context itself (or browsing contexts further nested inside it).
This flag also prevents content from creating new auxiliary browsing contexts, e.g. using the target attribute or the window.open() method.
This flag prevents content from instantiating plugins, whether using the embed element, the object element, the applet element, or through navigation of a nested browsing context.
Unless the sandbox attribute's value, when split on spaces, is found to have the allow-same-origin keyword set:
This flag forces content into a unique origin for the purposes of the same-origin policy.
This flag also prevents script from reading the document.cookie DOM attribute.
The allow-same-origin keyword is intended for two cases.
First, it can be used to allow content from the same site to be sandboxed to disable scripting, while still allowing access to the DOM of the sandboxed content.
Second, it can be used to embed content from a third-party site, sandboxed to prevent that site from opening popup windows, etc, without preventing the embedded page from communicating back to its originating site, using the database APIs to store data, etc.
This flag only takes effect when the nested browsing context of the iframe is navigated.
Unless the sandbox attribute's value, when split on spaces, is found to have the allow-forms keyword set:
This flag blocks form submission.
Unless the sandbox attribute's value, when split on spaces, is found to have the allow-scripts keyword set:
This flag blocks script execution.
If the sandbox attribute is dynamically added after the iframe has loaded a page, scripts already compiled by that page (whether in script elements, or in event handler attributes, or elsewhere) will continue to run. Only new scripts will be prevented from executing by this flag.
These flags must not be set unless the conditions listed above define them as being set.
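The mapping from the sandbox attribute's value to the set of flags can be sketched as follows. This is a minimal sketch, not a normative algorithm: the function name sandboxFlags and the short flag labels ("navigation", "plugins", "origin", "forms", "scripts") are names chosen for this example, and exact, case-sensitive token matching is an assumption.

```javascript
// Hypothetical sketch: derive the sandbox flags from the attribute value.
// sandboxValue - the attribute's value, or null if the attribute is absent
// Returns a Set of short, made-up flag labels.
function sandboxFlags(sandboxValue) {
  const flags = new Set();
  // Attribute not specified: none of the flags are set.
  if (sandboxValue === null || sandboxValue === undefined) {
    return flags;
  }
  // Split the value on space characters and drop empty tokens.
  const tokens = new Set(sandboxValue.split(/[ \t\n\f\r]+/).filter((t) => t !== ""));
  // These two flags are set whenever the attribute is specified.
  flags.add("navigation");
  flags.add("plugins");
  // Each remaining flag is set unless the matching keyword is present.
  if (!tokens.has("allow-same-origin")) flags.add("origin");
  if (!tokens.has("allow-forms")) flags.add("forms");
  if (!tokens.has("allow-scripts")) flags.add("scripts");
  return flags;
}
```

With an empty attribute value all five flags are set; with all three keywords present only the navigation and plugin restrictions remain, which matches the two examples that follow.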
In this example, some completely-unknown, potentially hostile, user-provided HTML content is embedded in a page. Because it is sandboxed, it is treated by the user agent as being from a unique origin, despite the content being served from the same site. Thus it is affected by all the normal cross-site restrictions. In addition, the embedded page has scripting disabled, plugins disabled, forms disabled, and it cannot navigate any frames or windows other than itself (or any frames or windows it itself embeds).
<p>We're not scared of you! Here is your content, unedited:</p> <iframe sandbox src="getusercontent.cgi?id=12193"></iframe>
Note that cookies are still sent to the server in the getusercontent.cgi request, though they are not visible in the document.cookie DOM attribute.
In this example, a gadget from another site is embedded. The gadget has scripting and forms enabled, and the origin sandbox restrictions are lifted, allowing the gadget to communicate with its originating server. The sandbox is still useful, however, as it disables plugins and popups, thus reducing the risk of the user being exposed to malware and other annoyances.
<iframe sandbox="allow-same-origin allow-forms allow-scripts" src="http://maps.example.com/embedded.html"></iframe>
The seamless attribute is a boolean attribute. When specified, it indicates that the iframe element's browsing context is to be rendered in a manner that makes it appear to be part of the containing document (seamlessly included in the parent document). Specifically, when the attribute is set on an element and while the browsing context's active document has the same origin as the iframe element's document, or the browsing context's active document's address has the same origin as the iframe element's document, the following requirements apply:
The user agent must set the seamless browsing context flag to true for that browsing context. This will cause links to open in the parent browsing context.
In a CSS-supporting user agent: the user agent must add all the style sheets that apply to the iframe element to the cascade of the active document of the iframe element's nested browsing context, at the appropriate cascade levels, before any style sheets specified by the document itself.
In a CSS-supporting user agent: the user agent must, for the purpose of CSS property inheritance only, treat the root element of the active document of the iframe element's nested browsing context as being a child of the iframe element. (Thus inherited properties on the root element of the document in the iframe will inherit the computed values of those properties on the iframe element instead of taking their initial values.)
In visual media, in a CSS-supporting user agent: the user agent should set the intrinsic width of the iframe to the width that the element would have if it was a non-replaced block-level element with 'width: auto'.
In visual media, in a CSS-supporting user agent: the user agent should set the intrinsic height of the iframe to the height of the bounding box around the content rendered in the iframe at its current width (as given in the previous bullet point), as it would be if the scrolling position was such that the top of the viewport for the content rendered in the iframe was aligned with the origin of that content's canvas.
In visual media, in a CSS-supporting user agent: the user agent must force the height of the initial containing block of the active document of the nested browsing context of the iframe to zero.
This is intended to get around the otherwise circular dependency of percentage dimensions that depend on the height of the containing block, thus affecting the height of the document's bounding box, thus affecting the height of the viewport, thus affecting the size of the initial containing block.
In speech media, the user agent should render the nested browsing context without announcing that it is a separate document.
User agents should, in general, act as if the active document of the iframe's nested browsing context was part of the document that the iframe is in.
For example if the user agent supports listing all the links in a document, links in "seamlessly" nested documents would be included in that list without being significantly distinguished from links in the document itself.
If the attribute is not specified, or if the origin conditions listed above are not met, then the user agent should render the nested browsing context in a manner that is clearly distinguishable as a separate browsing context, and the seamless browsing context flag must be set to false for that browsing context.
It is important that user agents recheck the above conditions whenever the active document of the nested browsing context of the iframe changes, such that the seamless browsing context flag gets unset if the nested browsing context is navigated to another origin.
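The condition that has to be rechecked can be illustrated with a simplified sketch. The function name isSeamless and its parameters are hypothetical, origins are approximated by the origin property of the standard URL object, and documents with an opaque origin (reported as the string "null") are treated as never matching, which is an assumption of this example rather than something stated above.

```javascript
// Hypothetical sketch of the seamless browsing context flag.
// hasSeamlessAttribute - whether the iframe has the seamless attribute
// parentDocumentUrl    - address of the iframe element's document
// nestedDocumentUrl    - address of the nested browsing context's active document
function isSeamless(hasSeamlessAttribute, parentDocumentUrl, nestedDocumentUrl) {
  if (!hasSeamlessAttribute) return false;
  try {
    const parentOrigin = new URL(parentDocumentUrl).origin;
    const nestedOrigin = new URL(nestedDocumentUrl).origin;
    // Assumption: opaque origins ("null") are never considered the same origin.
    if (parentOrigin === "null" || nestedOrigin === "null") return false;
    return parentOrigin === nestedOrigin;
  } catch (e) {
    // Unparseable addresses cannot satisfy the origin condition.
    return false;
  }
}
```

A user agent would re-evaluate such a check each time the nested browsing context's active document changes, so that navigating the frame to another origin clears the flag.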
The attribute can be set or removed dynamically, with the rendering updating in tandem.
In this example, the site's navigation is embedded using a client-side include using an iframe. Any links in the iframe will, in new user agents, be automatically opened in the iframe's parent browsing context; for legacy user agents, the site could also include a base element with a target attribute with the value _parent. Similarly, in new user agents the styles of the parent page will be automatically applied to the contents of the frame, but to support legacy user agents authors might wish to include the styles explicitly.
<nav><iframe seamless src="nav.include.html"></iframe></nav>
The iframe element supports dimension attributes for cases where the embedded content has specific dimensions (e.g. ad units have well-defined dimensions).
An iframe element never has fallback content, as it will always create a nested browsing context, regardless of whether the specified initial contents are successfully used.
Descendants of iframe elements represent nothing. (In legacy user agents that do not support iframe elements, the contents would be parsed as markup that could act as fallback content.)
When used in HTML documents, the allowed content model of iframe elements is text, except that invoking the HTML fragment parsing algorithm with the iframe element as the context element and the text contents as the input must result in a list of nodes that are all phrasing content, with no parse errors having occurred, with no script elements being anywhere in the list or as descendants of elements in the list, and with all the elements in the list (including their descendants) being themselves conforming.
The iframe element must be empty in XML documents.
The HTML parser treats markup inside iframe elements as text.
The DOM attributes src, name, sandbox, and seamless must reflect the respective content attributes of the same name.
The contentDocument DOM attribute must return the Document object of the active document of the iframe element's nested browsing context.
The contentWindow DOM attribute must return the WindowProxy object of the iframe element's nested browsing context.
The embed element
Content attributes: src, type, width, height
interface HTMLEmbedElement : HTMLElement {
attribute DOMString src;
attribute DOMString type;
attribute DOMString width;
attribute DOMString height;
};
Depending on the type of content instantiated by the embed element, the node may also support other interfaces.
The embed element represents an integration point for an external (typically non-HTML) application or interactive content.
The src attribute gives the address of the resource being embedded. The attribute, if present, must contain a valid URL.
The type attribute, if present, gives the MIME type of the plugin to instantiate. The value must be a valid MIME type, optionally with parameters. If both the type attribute and the src attribute are present, then the type attribute must specify the same type as the explicit Content-Type metadata of the resource given by the src attribute. [RFC2046]
When the element is created with neither a src attribute nor a type attribute, and when attributes are removed such that neither attribute is present on the element anymore, and when the element has a media element ancestor, and when the element has an ancestor object element that is not showing its fallback content, any plugins instantiated for the element must be removed, and the embed element represents nothing.
When the sandboxed plugins browsing context flag is set on the browsing context for which the embed element's document is the active document, then the user agent must render the embed element in a manner that conveys that the plugin was disabled. The user agent may offer the user the option to override the sandbox and instantiate the plugin anyway; if the user invokes such an option, the user agent must act as if the sandboxed plugins browsing context flag was not set for the purposes of this element.
Plugins are disabled in sandboxed browsing contexts because they might not honor the restrictions imposed by the sandbox (e.g. they might allow scripting even when scripting in the sandbox is disabled). User agents should convey the danger of overriding the sandbox to the user if an option to do so is provided.
When the element is created with a src attribute, and whenever the src attribute is subsequently set, and whenever the type attribute is set or removed while the element has a src attribute, if the element is not in a sandboxed browsing context, not a descendant of a media element, and not a descendant of an object element that is not showing its fallback content, the user agent must resolve the value of the attribute, relative to the element, and if that is successful, should fetch the resulting absolute URL. The task that is queued by the networking task source once the resource has been fetched must find and instantiate an appropriate plugin based on the content's type, and hand that plugin the content of the resource, replacing any previously instantiated plugin for the element.
Fetching the resource must delay the load event of the element's document.
The type of the content being embedded is defined as follows:
If the element has a type attribute, and that attribute's value is a type that a plugin supports, then the value of the type attribute is the content's type.
Otherwise, if the <path> component of the URL of the specified resource (after any redirects) matches a pattern that a plugin supports, then the content's type is the type that that plugin can handle.
For example, a plugin might say that it can handle resources with <path> components that end with the four character string ".swf".
Otherwise, if the specified resource has explicit Content-Type metadata, then that is the content's type.
Otherwise, the content has no type and there can be no appropriate plugin for it.
Whether the resource is fetched successfully or not (e.g. whether the response code was a 2xx code or equivalent) must be ignored when determining the resource's type and when handing the resource to the plugin.
This allows servers to return data for plugins even with error responses (e.g. HTTP 500 Internal Server Error codes can still contain plugin data).
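The order in which the content's type is determined can be sketched as a function. Everything named here is hypothetical: the function embedContentType, the shape of its first argument (typeAttr, finalUrl, contentTypeHeader), and the plugin registry format (objects with a types array and an optional pathPattern regular expression) are assumptions for illustration only. MIME type parameters are ignored for simplicity.

```javascript
// Hypothetical sketch of how the type of embedded content might be chosen.
// plugins - assumed registry: [{ types: ["mime/type", ...], pathPattern: RegExp? }]
// Returns a MIME type string, or null if the content has no type.
function embedContentType({ typeAttr, finalUrl, contentTypeHeader } = {}, plugins = []) {
  // 1. A type attribute naming a type that some plugin supports wins.
  if (typeAttr && plugins.some((p) => p.types.includes(typeAttr))) {
    return typeAttr;
  }
  // 2. Otherwise, match the <path> component of the URL (after any redirects)
  //    against the patterns that plugins claim to support.
  if (finalUrl) {
    const path = new URL(finalUrl).pathname;
    const match = plugins.find((p) => p.pathPattern && p.pathPattern.test(path));
    if (match) return match.types[0];
  }
  // 3. Otherwise, use the resource's explicit Content-Type metadata, if any.
  if (contentTypeHeader) return contentTypeHeader;
  // 4. Otherwise the content has no type and no plugin is appropriate.
  return null;
}
```

The response status is deliberately not an input: as stated above, whether the fetch succeeded is ignored when determining the type.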
When the element is created with a type attribute and no src attribute, and whenever the type attribute is subsequently set, so long as no src attribute is set, and whenever the src attribute is removed when the element has a type attribute, if the element is not in a sandboxed browsing context, user agents should find and instantiate an appropriate plugin based on the value of the type attribute.
Any (namespace-less) attribute may be specified on the embed element, so long as its name is XML-compatible and contains no characters in the range U+0041 .. U+005A (LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z).
All attributes in HTML documents get lowercased automatically, so the restriction on uppercase letters doesn't affect such documents.
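A simplified check of the attribute-name restriction might look like the sketch below. The function name isAllowedEmbedAttributeName is hypothetical, and "XML-compatible" is approximated by an ASCII-only pattern (a letter or underscore followed by letters, digits, underscores, periods or hyphens, with no colon); the real definition also admits many non-ASCII characters, so this is a deliberately narrow assumption.

```javascript
// Hypothetical, ASCII-only approximation of the embed attribute-name rule.
function isAllowedEmbedAttributeName(name) {
  // Assumed stand-in for "XML-compatible": ASCII name characters, no colon.
  const xmlCompatibleAscii = /^[A-Za-z_][A-Za-z0-9_.\-]*$/;
  // No characters in the range U+0041 .. U+005A (A to Z).
  const hasUppercaseAscii = /[A-Z]/;
  return xmlCompatibleAscii.test(name) && !hasUppercaseAscii.test(name);
}
```

In an HTML document the uppercase test never fails in practice, because attribute names are lowercased by the parser; it matters for attributes created in XML documents or through the DOM.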
The user agent should pass the names and values of all the attributes of the embed element that have no namespace to the plugin used, when it is instantiated.
If the plugin instantiated for the embed element supports a scriptable interface, the HTMLEmbedElement object representing the element should expose that interface while the element is instantiated.
The embed element has no fallback content. If the user agent can't find a suitable plugin, then the user agent must use a default plugin. (This default could be as simple as saying "Unsupported Format".)
The embed element supports dimension attributes.
The DOM attributes src and type each must reflect the respective content attributes of the same name.
The object element
usemap attribute: Interactive content.
param elements, then, transparent.
Content attributes: data, type, name, usemap, form, width, height
interface HTMLObjectElement : HTMLElement {
attribute DOMString data;
attribute DOMString type;
attribute DOMString name;
attribute DOMString useMap;
readonly attribute HTMLFormElement form;
attribute DOMString width;
attribute DOMString height;
readonly attribute Document contentDocument;
readonly attribute WindowProxy contentWindow;
};
Depending on the type of content instantiated by the object element, the node also supports other interfaces.
The object element can represent an external resource, which, depending on the type of the resource, will either be treated as an image, as a nested browsing context, or as an external resource to be processed by a plugin.
The data attribute, if present, specifies the address of the resource. If present, the attribute must be a valid URL.
The type attribute, if present, specifies the type of the resource. If present, the attribute must be a valid MIME type, optionally with parameters. [RFC2046]
One or both of the data and type attributes must be present.
The name attribute, if present, must be a valid browsing context name. The given value is used to name the nested browsing context, if applicable.
When the element is created, and subsequently whenever the classid attribute changes or is removed, or, if the classid attribute is not present, whenever the data attribute changes or is removed, or, if neither the classid attribute nor the data attribute is present, whenever the type attribute changes or is removed, the user agent must run the following steps to determine what the object element represents:
If the element has an ancestor media element, or has an ancestor object element that is not showing its fallback content, then jump to the last step in the overall set of steps (fallback).
If the classid attribute is present, and has a value that isn't the empty string, then: if the user agent can find a plugin suitable according to the value of the classid attribute, and plugins aren't being sandboxed, then that plugin should be used, and the value of the data attribute, if any, should be passed to the plugin. If no suitable plugin can be found, or if the plugin reports an error, jump to the last step in the overall set of steps (fallback).
If the data attribute is present, then:
If the type attribute is present and its value is not a type that the user agent supports, and is not a type that the user agent can find a plugin for, then the user agent may jump to the last step in the overall set of steps (fallback) without fetching the content to examine its real type.
Resolve the URL specified by the data attribute, relative to the element.
If that is successful, fetch the resulting absolute URL.
Fetching the resource must delay the load event of the element's document until the task that is queued by the networking task source once the resource has been fetched (defined next) has been run.
If the resource is not yet available (e.g. because the resource was not available in the cache, so that loading the resource required making a request over the network), then jump to the last step in the overall set of steps (fallback). The task that is queued by the networking task source once the resource is available must restart this algorithm from this step. Resources can load incrementally; user agents may opt to consider a resource "available" whenever enough data has been obtained to begin processing the resource.
If the load failed (e.g. the URL could not be resolved, there was an HTTP 404 error, there was a DNS error), fire a simple event called error at the element, then jump to the last step in the overall set of steps (fallback).
Determine the resource type, as follows:
Let the resource type be unknown.
If the resource has associated Content-Type metadata, then let the resource type be the type specified in the resource's Content-Type metadata.
If the resource type is unknown or "application/octet-stream"