When you are including graphic images in your documents, you need to manage the locations of the graphics files. It helps to know that the handling of graphics files is quite different for HTML and FO outputs.
An XSLT processor cannot copy graphics files to an output location. Any file copying that needs to be done must be done outside of the stylesheet process, using a tool such as Make or Ant. To help identify the filenames to be copied, you can use a contributed utility stylesheet named xmldepend.xsl
available from the DocBook SourceForge SVN repository. When you process a DocBook document with this stylesheet, it lists all the image pathnames in the file.
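As one way to handle the copying step, the following is a minimal sketch of an Ant build file that copies image files next to the generated HTML. The project name, property names, directory names, and file extensions are all assumptions made for illustration; adjust them to match your own layout.

```xml
<project name="docbook-graphics" default="copy-graphics" basedir=".">
  <!-- Hypothetical locations: source images and the HTML output area. -->
  <property name="src.images.dir" value="images"/>
  <property name="html.out.dir" value="build/html"/>

  <target name="copy-graphics">
    <!-- Copy the graphics into an images/ directory beside the HTML files,
         preserving any subdirectory structure under the source directory. -->
    <copy todir="${html.out.dir}/images">
      <fileset dir="${src.images.dir}">
        <include name="**/*.png"/>
        <include name="**/*.jpg"/>
        <include name="**/*.gif"/>
      </fileset>
    </copy>
  </target>
</project>
```

This copies every matching file rather than only those the document references. A list of pathnames such as the one produced by xmldepend.xsl could be used to narrow the set.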
When a DocBook XML file with an imagedata
or graphic
element is processed with one of the HTML stylesheets, the graphics file is not opened and read; only the file pathname is passed through to the HTML IMG
tag. The image file itself does not have to be present when the HTML is generated, so no error is generated during processing if the graphics file is not present. But it does need to be present at the address specified in the IMG
tag when the HTML file is viewed.
For this reason, managing graphics files for HTML output means managing their locations in the output, relative to the HTML files that are generated. When you generate HTML files and place them on a server or other accessible location, you also need to manually place the graphics files with them. The XSLT processor will not copy image files to the output location.
Where you place a graphics file in the output area depends on the pathname used to reference it in the HTML IMG
tag. That pathname comes from the imagedata
or graphic
element in the XML document. Those elements let you specify an image path in two ways: with a fileref
attribute or an entityref
attribute.
A fileref
attribute value is interpreted as a literal pathname string. It can be modified in three ways before it is output as the src
attribute.
If the fileref
value does not have a filename extension that indicates the format, then one is appended to the filename. The graphic element must have a format
attribute for this to work.
If the img.src.path
parameter is set, its value is prepended to each
fileref
value if it is not an
absolute path. This parameter lets you specify the path to the image
files when you build the HTML. If its value is images/
then a fileref
value
of caution.png
is written to the HTML file as src="images/caution.png"
. Be sure to include the trailing slash. This parameter
permits you to specify just the filename in your graphics elements,
without specifying the details of the location. If you later move
the output directory, you can just change the parameter value, and
not have to edit every graphics instance in your document.
If your document uses XIncludes, then the path may be altered by xml:base
attributes inserted by the XInclude processor. See the section “XIncludes and graphics files” for details.
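The img.src.path setting described above can be sketched as a small customization layer. This is an illustration only: the import URL is an assumption (it is commonly mapped to a local copy of the stylesheets through an XML catalog), and images/ is the example value used in this section.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Import the stock HTML stylesheet. The URL is an assumption;
       substitute the path to your own copy of the DocBook stylesheets. -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl"/>

  <!-- Prepended to each fileref that is not an absolute path.
       Note the trailing slash. -->
  <xsl:param name="img.src.path">images/</xsl:param>
</xsl:stylesheet>
```

With this layer, an element such as <imagedata fileref="caution.png"/> should be written to the HTML as src="images/caution.png", so the file must be placed in an images directory relative to the HTML output.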
When you build your HTML, you must place the image file in the location specified by the fileref
, as modified by the above points. If the result is a relative pathname, then the graphics file must be placed relative to the final output location of the HTML files. If it is an absolute pathname, then the graphics file should be placed relative to the document root of the HTTP server for the HTML files. The fileref
attribute value or the img.src.path
parameter can also be an absolute URI that points to the same
or a different website.
If you require more flexibility in handling a graphics file, then consider using an entityref
attribute with an XML catalog instead. An entityref
attribute has an XML attribute type of ENTITY
in its declaration. This means the attribute value is not interpreted as a literal pathname string, but as an entity name. The entity name must correspond to a system entity declared in the current document's DTD.
Typically, such system entities are declared in the internal subset of the DTD within the DOCTYPE declaration of the document. The following is an example.
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
    "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY screenshot3 SYSTEM "/usr/local/graphics/tutorial3.png" NDATA PNG>
...
]>
<book>
...
<imagedata entityref="screenshot3"/>
The HTML output from processing this example file will include:
<IMG src="/usr/local/graphics/tutorial3.png">
An important difference from fileref
is that an entityref
is always resolved to an absolute URI. If you enter a relative path, then it is resolved relative to the absolute path of the document that declares the entity. That could be the current document or a DTD customization file. This behavior comes from the use of the unparsed-entity-uri()
XSL function in the DocBook template, and the XSL standard says that function always returns an absolute URI.
Absolute paths in HTML src
attributes are a problem if you put the HTML files on a webserver. It is likely the absolute path will not match the document root of the HTTP server, so such references will result in missing graphics when the HTML file is viewed. Relative paths are preferred, but there is no way to get relative paths when using entityref
. Because those paths are always absolute, the img.src.path
parameter has no effect on entityref
paths; it is only prepended to paths that are not absolute.
However, if you put your entity declarations in a separate file, and use an XML catalog to find the declarations file, then you can substitute different pathnames at runtime by using a different catalog. For example, if you move the above entity declaration to a file named mygraphics.ent
, you can reference it as follows:
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
    "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % graphicset SYSTEM "graphics/mygraphics.ent">
%graphicset;
...
]>
<book>
...
<imagedata entityref="screenshot3"/>
This arrangement uses a parameter entity to specify the location of the file containing the declarations, then it immediately uses a parameter entity reference %graphicset;
to pull in the file's contents at that point in the DTD.
You can swap declarations files at runtime by using a catalog entry such as the following:
<system systemId="graphics/mygraphics.ent" uri="../graphics/myothergraphics.ent"/>
You just need to make sure that your alternate graphics declarations file declares the same set of entity names, and that they resolve to full pathnames that work for the HTML output.
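For context, the catalog entry above would sit inside a complete catalog file. The following is a minimal sketch; the entity declaration shown in the comment, including its URL, is a hypothetical example of what the alternate declarations file might contain.

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Redirect the declarations file named in the DOCTYPE to an alternate file. -->
  <system systemId="graphics/mygraphics.ent"
          uri="../graphics/myothergraphics.ent"/>

  <!-- myothergraphics.ent must declare the same entity names, for example
       (hypothetical URL):
       <!ENTITY screenshot3 SYSTEM "http://www.example.com/graphics/tutorial3.png" NDATA PNG>
  -->
</catalog>
```

How a catalog file is selected at runtime depends on the processor. For example, xsltproc reads the XML_CATALOG_FILES environment variable, so pointing that variable at a different catalog file switches the set of graphics paths without editing the document.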
Because a system entity uses a SYSTEM identifier and an optional PUBLIC identifier to specify the pathname to the graphics file, you might think that you could use a catalog entry for each graphics file. Unfortunately, this does not work for HTML output. A catalog resolver is triggered only when a requested file is to be opened. During HTML processing, the graphics files themselves are never opened. Only their pathnames are passed through to the HTML, so such catalog entries would not be used.
Generating PDF from a DocBook file is a two-step process. First the DocBook FO stylesheet is applied to the XML document to generate an intermediate XSL-FO file. Then the XSL-FO is converted to PDF by an XSL-FO processor such as FOP. In the first step, each imagedata
and graphic
element is handled in a manner similar to the HTML processing described above. That is, the pathname in a fileref
or entityref
attribute is passed through to an XSL-FO graphics element:
<fo:external-graphic src="url(graphics/tutorial.png)">
The path can be modified in three ways before output:
If the fileref
value does not have a filename extension that indicates the format, then one is appended to the filename. The graphic element must have a format
attribute for this to work.
If you set the stylesheet parameter img.src.path
, then its value is prepended to any fileref
that is not an absolute path. This allows you to store your images in a central location rather than with individual documents, for example.
If your document uses XIncludes, then the path may be altered by xml:base
attributes inserted by the XInclude processor. See the section “XIncludes and graphics files” for details.
As with HTML processing, the graphics file itself is not opened during the stylesheet processing, so the graphics file does not actually need to be present. However, in the second phase, the XSL-FO processor must open such graphics references to incorporate the graphics data into the PDF file. So it is during the XSL-FO processing phase that the file must be readable at the graphics element's address, possibly modified by the above points.
Once the second stage is completed, the PDF file contains the graphics data, so access to the graphics files is no longer needed. The PDF file can be moved as needed without losing the graphics.
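The two-step process can be sketched as an Ant build file. This is an illustration under several assumptions: the stylesheet path, file names, and property names are hypothetical, the fop command is assumed to be on the PATH, and the XSLT processor that Ant selects by default is assumed to handle the DocBook stylesheets. The sketch passes an absolute graphics directory as img.src.path so that the references in the XSL-FO file do not depend on where the XSL-FO processor is run; paths are written in Unix style.

```xml
<project name="docbook-pdf" default="pdf" basedir=".">
  <!-- Hypothetical locations; adjust to your installation. -->
  <property name="fo.stylesheet" value="/path/to/docbook-xsl/fo/docbook.xsl"/>
  <property name="graphics.dir" value="${basedir}/graphics/"/>

  <!-- Step 1: apply the FO stylesheet. Graphics files are not opened here. -->
  <target name="fo">
    <xslt in="book.xml" out="build/book.fo" style="${fo.stylesheet}">
      <!-- Prepended to each fileref that is not an absolute path. -->
      <param name="img.src.path" expression="${graphics.dir}"/>
    </xslt>
  </target>

  <!-- Step 2: convert XSL-FO to PDF. Graphics files must be readable now. -->
  <target name="pdf" depends="fo">
    <exec executable="fop" failonerror="true">
      <arg value="-fo"/>
      <arg value="build/book.fo"/>
      <arg value="-pdf"/>
      <arg value="build/book.pdf"/>
    </exec>
  </target>
</project>
```

If a graphics file is missing, the first target would still be expected to succeed; the problem would surface only when the second target runs.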
DocBook XSL: The Complete Guide - 4th Edition | Copyright © 2002-2007 Sagehill Enterprises