3.3 The DOCTYPE declaration

The beginning of each document that you write must specify the name of the DTD that the document conforms to. This is so that SGML parsers can determine the DTD and ensure that the document does conform to it.

This information is generally expressed on one line, in the DOCTYPE declaration.

A typical declaration for a document written to conform with version 4.0 of the HTML DTD looks like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN">

That line contains a number of different components.

<!

Is the indicator that indicates that this is an SGML declaration. This line is declaring the document type.

DOCTYPE

Shows that this is an SGML declaration for the document type.

html

Names the first element that will appear in the document.

PUBLIC "-//W3C//DTD HTML 4.0//EN"

Lists the Formal Public Identifier (FPI) for the DTD that this document conforms to. Your SGML parser will use this to find the correct DTD when processing this document.

PUBLIC is not a part of the FPI, but indicates to the SGML processor how to find the DTD referenced in the FPI. Other ways of telling the SGML parser how to find the DTD are shown later.

>

Returns to the document.

3.3.1 Formal Public Identifiers (FPIs)

Note: You do not need to know this, but it is useful background, and might help you debug problems when your SGML processor can not locate the DTD you are using.

FPIs must follow a specific syntax. This syntax is as follows:

"Owner//Keyword Description//Language"
Owner

This indicates the owner of the FPI.

If this string starts with “ISO” then this is an ISO owned FPI. For example, the FPI "ISO 8879:1986//ENTITIES Greek Symbols//EN" lists ISO 8879:1986 as being the owner for the set of entities for Greek symbols. ISO 8879:1986 is the ISO number for the SGML standard.

Otherwise, this string will either look like -//Owner or +//Owner (notice the only difference is the leading + or -).

If the string starts with - then the owner information is unregistered, with a + it identifies it as being registered.

ISO 9070:1991 defines how registered names are generated; it might be derived from the number of an ISO publication, an ISBN code, or an organization code assigned according to ISO 6523. In addition, a registration authority could be created in order to assign registered names. The ISO council delegated this to the American National Standards Institute (ANSI).

Because the FreeBSD Project has not been registered the owner string is -//FreeBSD. And as you can see, the W3C are not a registered owner either.

Keyword

There are several keywords that indicate the type of information in the file. Some of the most common keywords are DTD, ELEMENT, ENTITIES, and TEXT. DTD is used only for DTD files, ELEMENT is usually used for DTD fragments that contain only entity or element declarations. TEXT is used for SGML content (text and tags).

Description

Any description you want to supply for the contents of this file. This may include version numbers or any short text that is meaningful to you and unique for the SGML system.

Language

This is an ISO two-character code that identifies the native language for the file. EN is used for English.

3.3.1.1 catalog files

If you use the syntax above and process this document using an SGML processor, the processor will need to have some way of turning the FPI into the name of the file on your computer that contains the DTD.

In order to do this it can use a catalog file. A catalog file (typically called catalog) contains lines that map FPIs to filenames. For example, if the catalog file contained the line:

PUBLIC "-//W3C//DTD HTML 4.0//EN"             "4.0/strict.dtd"

The SGML processor would know to look up the DTD from strict.dtd in the 4.0 subdirectory of whichever directory held the catalog file that contained that line.

Look at the contents of /usr/local/share/sgml/html/catalog. This is the catalog file for the HTML DTDs that will have been installed as part of the textproc/docproj port.

3.3.1.2 SGML_CATALOG_FILES

In order to locate a catalog file, your SGML processor will need to know where to look. Many of them feature command line parameters for specifying the path to one or more catalogs.

In addition, you can set SGML_CATALOG_FILES to point to the files. This environment variable should consist of a colon-separated list of catalog files (including their full path).

Typically, you will want to include the following files:

  • /usr/local/share/sgml/docbook/4.1/catalog

  • /usr/local/share/sgml/html/catalog

  • /usr/local/share/sgml/iso8879/catalog

  • /usr/local/share/sgml/jade/catalog

You should already have done this.

3.3.2 Alternatives to FPIs

Instead of using an FPI to indicate the DTD that the document conforms to (and therefore, which file on the system contains the DTD) you can explicitly specify the name of the file.

The syntax for this is slightly different:

<!DOCTYPE html SYSTEM "/path/to/file.dtd">

The SYSTEM keyword indicates that the SGML processor should locate the DTD in a system specific fashion. This typically (but not always) means the DTD will be provided as a filename.

Using FPIs is preferred for reasons of portability. You do not want to have to ship a copy of the DTD around with your document, and if you used the SYSTEM identifier then everyone would need to keep their DTDs in the same place.

This, and other documents, can be downloaded from ftp://ftp.FreeBSD.org/pub/FreeBSD/doc/.

For questions about FreeBSD, read the documentation before contacting <[email protected]>.
For questions about this documentation, e-mail <[email protected]>.