1.9. The Structure of a DocBook File

A DocBook document always has a single top-level element. In most cases it is book or article, but it can also be set, part, chapter, and a few others.

The structure of the two most common top-level (more correctly, root) elements, book and article, is detailed next. The book structure also applies to chapter, which you may use as the root element when contributing to a book.

1.9.1. Structure of a Book (or chapter)

A book is structured in the following way:


   book
      meta information
      chapter
        sect1
        sect1
      chapter
        sect1
      appendix
        sect1
      appendix
        sect1
      …
      glossary
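
As a minimal sketch, the outline above might map onto DocBook 4.2 markup as follows. The id values, titles, and the glossary entry are invented for illustration, and the system identifier is assumed to be the standard OASIS URL rather than a site-specific path.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<book id="sample-book" lang="en">

  <!-- meta information -->
  <bookinfo>
    <title>Sample book</title>
  </bookinfo>

  <chapter id="first-chapter">
    <title>First chapter</title>
    <sect1 id="first-section">
      <title>First section</title>
      <para>Text of the first section.</para>
    </sect1>
  </chapter>

  <appendix id="first-appendix">
    <title>First appendix</title>
    <sect1 id="appendix-section">
      <title>Appendix section</title>
      <para>Text of the appendix section.</para>
    </sect1>
  </appendix>

  <!-- hypothetical glossary with a single entry -->
  <glossary id="book-glossary">
    <title>Glossary</title>
    <glossentry id="gloss-dtd">
      <glossterm>DTD</glossterm>
      <glossdef>
        <para>Document Type Definition.</para>
      </glossdef>
    </glossentry>
  </glossary>

</book>
```

When you contribute a single chapter, the same markup starting at the chapter start tag can serve as the whole file, with chapter in place of book in the DOCTYPE line.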

1.9.2. Structure of an Article

An article has a somewhat simpler structure than a book:


    article
      meta information
      sect1
      sect1
        sect2
      sect1
      …
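
A minimal sketch of how this outline might look as DocBook 4.2 markup is given below. The id values and titles are invented for illustration, and the system identifier is assumed to be the standard OASIS URL.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<article id="sample-article" lang="en">

  <!-- meta information -->
  <articleinfo>
    <title>Sample article</title>
  </articleinfo>

  <sect1 id="overview">
    <title>Overview</title>
    <para>Text of the first section.</para>
  </sect1>

  <sect1 id="details">
    <title>Details</title>
    <para>Text of the second section.</para>
    <!-- a subsection nested inside the second sect1 -->
    <sect2 id="details-background">
      <title>Background</title>
      <para>Text of the subsection.</para>
    </sect2>
  </sect1>

</article>
```

Note that the sections sit directly under the root element, with no chapter level in between.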

1.9.3. Book elements step by step

Example 1.7. Chapters and sections

  1 <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
       "/afs/cern.ch/sw/XML/XMLBIN/share/www.oasis-open.org/docbook/xmldtd-4.2/docbookx.dtd"
    >
    <book id="hello-world" lang="en">
  5  
    <bookinfo>
    <title>Hello, world</title>
    </bookinfo>
     
 10 <chapter id="introduction">
    <title>Introduction</title>
     
    <para>This is the introduction. It has two sections</para>
     
 15 <sect1 id="about-this-book">
    <title>About this book</title>
    
    <para>This is my first DocBook file.</para>
    
 20 </sect1>
    
    <sect1 id="work-in-progress">
    <title>Warning</title>
    
 25 <para>This is still under construction.</para>
    
    </sect1>
    
    </chapter>
 30 </book>

Example 1.7 shows a skeleton of the structural tags. Lines 1 to 3 contain the document type declaration. It gives a public identifier on line 1, which needs a catalog to be resolved, and an explicit system path on line 2, which, of course, works only at CERN when connected to AFS. This information is described more fully in Section 1.10.
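
For use outside CERN, one possible variant of the declaration replaces the AFS path with the system identifier published by OASIS for DocBook XML V4.2. This is a sketch only: whether the DTD is then fetched over the network or resolved locally through a catalog depends on your tools.

```xml
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
```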

Next comes the root element (line 4), which contains the complete document. The name of the root element must be identical to the element name following the DOCTYPE keyword on line 1. Note the use of the lang attribute inside the <book> start tag. It is good practice always to use a lang attribute to indicate clearly the (main) language in which the document is written.

After the <book> tag comes the meta information for the document, which is enclosed in the <bookinfo> element (lines 6 to 8). This information is described in more detail in Chapter 2.

Then come the chapters of the book (lines 10 to 29), which can contain one or more section elements (<sect1> through <sect5>). It is important to associate a label with each of the structural elements of your document using the id attribute of <chapter>, <sect1>, etc. This makes cross-references between structural elements possible, and it allows the output processors to use the value of the id attribute to name the generated output files. To make the development and maintenance of your documents easier, it is advisable to assign meaningful values (i.e., not merely numbers) to these id attributes.
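
To illustrate how id values enable cross-references, the fragment below is a sketch based on the first section of Example 1.7, which is made to point at the second section. It uses the standard DocBook <xref> and <link> elements; the added sentence is invented for illustration.

```xml
<sect1 id="about-this-book">
  <title>About this book</title>

  <!-- xref: the processor generates the reference text from the target -->
  <para>This is my first DocBook file. Its status is described
  in <xref linkend="work-in-progress"/>.</para>

  <!-- link: you supply the reference text yourself -->
  <para>See also <link linkend="work-in-progress">the warning
  section</link>.</para>
</sect1>
```

In both cases the linkend value must match the id of an element elsewhere in the same document, here the second <sect1> of Example 1.7.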

Structural elements, such as chapters and sections, must contain at least a <title> (lines 11, 16, 23) and a (possibly empty) <para> element (lines 13, 18, 25).

The content model for the various elements, i.e., where and how many times each one can occur at a given point in a document, is defined by the DocBook DTD (or Schema).

Running text in DocBook is contained within a <para> element, which is very similar to the <p> tag in HTML and LinuxDoc, except that it must always be closed with a </para> tag. Wherever text can contain a paragraph break (for example, inside a list item), that text has to be enclosed in <para> tags.
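
The list-item case can be sketched as follows, using the standard <itemizedlist> and <listitem> elements; the item texts are invented for illustration.

```xml
<itemizedlist>
  <listitem>
    <!-- even a one-line item needs its own para -->
    <para>First item.</para>
  </listitem>
  <listitem>
    <!-- an item may hold several paragraphs -->
    <para>Second item, first paragraph.</para>
    <para>Second item, second paragraph.</para>
  </listitem>
</itemizedlist>
```

Placing the text directly inside <listitem>, without a <para>, would not be valid against the DocBook DTD.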