Entities are a mechanism for assigning names to chunks of content. As an XML parser processes a document, any entities it finds are replaced by the content of the entity.
This is a good way to have re-usable, easily changeable chunks of content in XML documents. It is also the only way to include one marked up file inside another using XML.
There are two types of entities for two different situations: general entities and parameter entities.
General entities are used to assign names to reusable chunks of text. These entities can only be used in the document. They cannot be used in an XML context.
To include the text of a general entity in the document, include &entity-name; in the text. For example, consider a general entity called current.version which expands to the current version number of a product. To use it in the document, write:
<para>The current version of our product is &current.version;.</para>
When the version number changes, edit the definition of the general entity, replacing the value. Then reprocess the document.
General entities can also be used to enter characters that could not otherwise be included in an XML document. For example, < and & cannot normally appear in an XML document. The XML parser sees the < symbol as the start of a tag. Likewise, when the & symbol is seen, the next text is expected to be an entity name.
These symbols can be included by using two predefined general entities: &lt; and &amp;.
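As a small illustration, a paragraph that needs a literal less-than sign and a literal ampersand might be written as follows. The sentence itself is invented for this sketch; only the two predefined entities come from the text above.

```xml
<!-- &lt; stands for a literal "<" and &amp; for a literal "&" -->
<para>The expression 1 &lt; 2 is true, and AT&amp;T contains an ampersand.</para>
```

After parsing, the paragraph text reads: The expression 1 < 2 is true, and AT&T contains an ampersand.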
General entities can only be defined within an XML context. Such definitions are usually done immediately after the DOCTYPE declaration.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" [
<!ENTITY current.version "3.0-RELEASE">
<!ENTITY last.version "2.2.7-RELEASE">
]>
The DOCTYPE declaration has been extended by adding a square bracket at the end of the first line. The two entities are then defined over the next two lines, the square bracket is closed, and then the DOCTYPE declaration is closed.
The square brackets are necessary to indicate that the DTD indicated by the DOCTYPE declaration is being extended.
Parameter entities, like general entities, are used to assign names to reusable chunks of text. But parameter entities can only be used within an XML context.
Parameter entity definitions are similar to those for general entities. However, parameter entities are included with %entity-name;. The definition also includes the % between the ENTITY keyword and the name of the entity.
For a mnemonic, think “Parameter entities use the Percent symbol”.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" [
<!ENTITY % param.some "some">
<!ENTITY % param.text "text">
<!ENTITY % param.new "%param.some; more %param.text;">

<!-- %param.new now contains "some more text" -->
]>
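Another way a parameter entity is sometimes used is to hold a complete declaration and then reference it between the other declarations in the XML context. The following is a minimal sketch under that assumption; the names param.decl and version are hypothetical and chosen only for illustration.

```xml
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" [
<!-- Hypothetical name: param.decl holds a complete general entity declaration -->
<!ENTITY % param.decl '<!ENTITY version "1.1">'>

<!-- Referencing the parameter entity here inserts that declaration -->
%param.decl;
]>
```

With this in place, a reference to &version; in the body of the document should expand to 1.1, just as if the general entity had been declared directly.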
Add a general entity to example.xml.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" [
<!ENTITY version "1.1">
]>

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>An Example XHTML File</title>
  </head>

  <!-- There may be some comments in here as well -->

  <body>
    <p>This is a paragraph containing some text.</p>

    <p>This paragraph contains some more text.</p>

    <p align="right">This paragraph might be right-justified.</p>

    <p>The current version of this document is: &version;</p>
  </body>
</html>
Validate the document using xmllint.
Load example.xml into a web browser. It may have to be copied to example.html before the browser recognizes it as an XHTML document.
Older browsers with simple parsers may not render this file as expected. The entity reference &version; may not be replaced by the version number, or the XML context closing ]> may not be recognized and instead shown in the output.
The solution is to normalize the document with an XML normalizer. The normalizer reads valid XML and writes equally valid XML which has been transformed in some way. One way the normalizer transforms the input is by expanding all the entity references in the document, replacing the entities with the text that they represent.
xmllint can be used for this. It also has an option to drop the initial DTD section so that the closing ]> does not confuse browsers:
% xmllint --noent --dropdtd example.xml > example.html
A normalized copy of the document with entities expanded is produced in example.html, ready to load into a web browser.
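To illustrate the expansion, the last paragraph of the example would be expected to read as follows in example.html, assuming the entity version is defined as "1.1" as in the example document. Only the affected paragraph is shown; the rest of the normalized output is omitted here.

```xml
<!-- &version; has been replaced by the text of the entity -->
<p>The current version of this document is: 1.1</p>
```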