Red Hat Enterprise Linux 4: Using ld, the Gnu Linker
Chapter 4. Linker Scripts

4.6. SECTIONS Command

The SECTIONS command tells the linker how to map input sections into output sections, and how to place the output sections in memory.

The format of the SECTIONS command is:

SECTIONS
{
  sections-command
  sections-command
  …
}

Each sections-command may be one of the following:

an ENTRY command (refer to Section 4.4.1 Setting the Entry Point)
a symbol assignment (refer to Section 4.5 Assigning Values to Symbols)
an output section description
an overlay description

The ENTRY command and symbol assignments are permitted inside the SECTIONS command for convenience in using the location counter in those commands. This can also make the linker script easier to understand because you can use those commands at meaningful points in the layout of the output file.
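As a minimal sketch of this, the fragment below places an ENTRY command and two symbol assignments at the points in the layout where they apply. The symbol names _start and _etext and the address 0x10000 are assumptions chosen for illustration.

```
SECTIONS
{
  ENTRY(_start)          /* assumed entry symbol, defined in an input file */
  . = 0x10000;           /* set the location counter before the first section */
  .text : { *(.text) }
  _etext = .;            /* address just past the end of .text */
  .data : { *(.data) }
  .bss  : { *(.bss) }
}
```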

Output section descriptions and overlay descriptions are described below.

If you do not use a SECTIONS command in your linker script, the linker will place each input section into an identically named output section in the order that the sections are first encountered in the input files. If all input sections are present in the first file, for example, the order of sections in the output file will match the order in the first input file. The first section will be at address zero.

4.6.1. Output Section Description

The full description of an output section looks like this:

section [address] [(type)] : [AT(lma)]
  {
    output-section-command
    output-section-command
    …
  } [>region] [AT>lma_region] [:phdr :phdr …] [=fillexp]

Most output sections do not use most of the optional section attributes.

The whitespace around section is required, so that the section name is unambiguous. The colon and the curly braces are also required. The line breaks and other white space are optional.

Each output-section-command may be one of the following:

a symbol assignment (refer to Section 4.5 Assigning Values to Symbols)
an input section description (refer to Section 4.6.4 Input Section Description)
data values to include directly (refer to Section 4.6.5 Output Section Data)
a special output section keyword (refer to Section 4.6.6 Output Section Keywords)

4.6.2. Output Section Name

The name of the output section is section. section must meet the constraints of your output format. In formats which only support a limited number of sections, such as a.out, the name must be one of the names supported by the format (a.out, for example, allows only .text, .data or .bss). If the output format supports any number of sections, but with numbers and not names (as is the case for Oasys), the name should be supplied as a quoted numeric string. A section name may consist of any sequence of characters, but a name which contains any unusual characters such as commas must be quoted.

The output section name /DISCARD/ is special; refer to Section 4.6.7 Output Section Discarding.

4.6.3. Output Section Address

The address is an expression for the VMA (the virtual memory address) of the output section. If you do not provide address, the linker will set it based on region if present, or otherwise based on the current value of the location counter.

If you provide address, the address of the output section will be set to precisely that. If you provide neither address nor region, then the address of the output section will be set to the current value of the location counter aligned to the alignment requirements of the output section. The alignment requirement of the output section is the strictest alignment of any input section contained within the output section.

For example,

.text . : { *(.text) }

and

.text : { *(.text) }

are subtly different. The first will set the address of the .text output section to the current value of the location counter. The second will set it to the current value of the location counter aligned to the strictest alignment of a .text input section.

The address may be an arbitrary expression; refer to Section 4.10 Expressions in Linker Scripts. For example, if you want to align the section on a 0x10 byte boundary, so that the lowest four bits of the section address are zero, you could do something like this:

.text ALIGN(0x10) : { *(.text) }

This works because ALIGN returns the current location counter aligned upward to the specified value.

Specifying address for a section will change the value of the location counter.
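The following sketch illustrates the effect on the location counter. The addresses and the symbol name _end are assumptions for illustration only.

```
SECTIONS
{
  .text 0x1000 : { *(.text) }
  /* The location counter now sits just past .text, so .data follows it. */
  .data : { *(.data) }
  /* An explicit address moves the location counter to 0x8000. */
  .bss 0x8000 : { *(.bss) }
  _end = .;              /* 0x8000 plus the size of .bss */
}
```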

4.6.4. Input Section Description

The most common output section command is an input section description.

The input section description is the most basic linker script operation. You use output sections to tell the linker how to lay out your program in memory. You use input section descriptions to tell the linker how to map the input files into your memory layout.

4.6.4.1. Input Section Basics

An input section description consists of a file name optionally followed by a list of section names in parentheses.

The file name and the section name may be wildcard patterns, which we describe further below (refer to Section 4.6.4.2 Input Section Wildcard Patterns).

The most common input section description is to include all input sections with a particular name in the output section. For example, to include all input .text sections, you would write:

*(.text)

Here the * is a wildcard which matches any file name. To exclude a list of files from matching the file name wildcard, EXCLUDE_FILE may be used to match all files except the ones specified in the EXCLUDE_FILE list. For example:

*(EXCLUDE_FILE (*crtend.o *otherfile.o) .ctors)

will cause all .ctors sections from all files except crtend.o and otherfile.o to be included.

There are two ways to include more than one section:

*(.text .rdata)
*(.text) *(.rdata)

The difference between these is the order in which the .text and .rdata input sections will appear in the output section. In the first example, they will be intermingled, appearing in the same order as they are found in the linker input. In the second example, all .text input sections will appear first, followed by all .rdata input sections.
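To make the ordering concrete, assume two hypothetical input files, a.o and b.o, linked in that order, each containing a .text section followed by a .rdata section. Following the description above, the two forms would lay out the output section as shown in the comments. The two forms are alternatives; only one of them would appear in a real script.

```
/* Alternative 1: sections are taken in link order, file by file.
   Resulting order: a.o(.text) a.o(.rdata) b.o(.text) b.o(.rdata) */
.text : { *(.text .rdata) }

/* Alternative 2: all .text sections first, then all .rdata sections.
   Resulting order: a.o(.text) b.o(.text) a.o(.rdata) b.o(.rdata) */
.text : { *(.text) *(.rdata) }
```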

You can specify a file name to include sections from a particular file. You would do this if one or more of your files contain special data that needs to be at a particular location in memory. For example:

data.o(.data)

If you use a file name without a list of sections, then all sections in the input file will be included in the output section. This is not commonly done, but it may be useful on occasion. For example:

data.o

When you use a file name which does not contain any wildcard characters, the linker will first see if you also specified the file name on the linker command line or in an INPUT command. If you did not, the linker will attempt to open the file as an input file, as though it appeared on the command line. Note that this differs from an INPUT command, because the linker will not search for the file in the archive search path.

4.6.4.2. Input Section Wildcard Patterns

In an input section description, either the file name or the section name or both may be wildcard patterns.

The file name of * seen in many examples is a simple wildcard pattern for the file name.

The wildcard patterns are like those used by the Unix shell.

*: matches any number of characters
?: matches any single character
[chars]: matches a single instance of any of the chars; the - character may be used to specify a range of characters, as in [a-z] to match any lower case letter
\: quotes the following character

When a file name is matched with a wildcard, the wildcard characters will not match a / character (used to separate directory names on Unix). A pattern consisting of a single * character is an exception; it will always match any file name, whether it contains a / or not. In a section name, the wildcard characters will match a / character.

File name wildcard patterns only match files which are explicitly specified on the command line or in an INPUT command. The linker does not search directories to expand wildcards.

If a file name matches more than one wildcard pattern, or if a file name appears explicitly and is also matched by a wildcard pattern, the linker will use the first match in the linker script. For example, this sequence of input section descriptions is probably in error, because the data.o rule will not be used:

.data : { *(.data) }
.data1 : { data.o(.data) }
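Because the first match wins, one way to get the intended result is to list the more specific rule first. A sketch of the reordered descriptions:

```
SECTIONS
{
  .data1 : { data.o(.data) }   /* matched first, so data.o goes here */
  .data  : { *(.data) }        /* every other file's .data section */
}
```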

Normally, the linker will place files and sections matched by wildcards in the order in which they are seen during the link. You can change this by using the SORT keyword, which appears before a wildcard pattern in parentheses (e.g., SORT(.text*)). When the SORT keyword is used, the linker will sort the files or sections into ascending order by name before placing them in the output file.
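The sketch below shows SORT applied to a section-name pattern and to a file-name pattern. The input section names .text.a and .text.b mentioned in the comment are hypothetical.

```
SECTIONS
{
  /* Sort input sections by name: .text.a is placed before .text.b. */
  .text  : { *(SORT(.text*)) }
  /* Sort by file name instead: SORT wraps the file-name wildcard. */
  .ctors : { SORT(*)(.ctors) }
}
```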

If you ever get confused about where input sections are going, use the -M linker option to generate a map file. The map file shows precisely how input sections are mapped to output sections.

This example shows how wildcard patterns might be used to partition files. This linker script directs the linker to place all .text sections in .text and all .bss sections in .bss. The linker will place the .data section from all files beginning with an upper case character in .DATA; for all other files, the linker will place the .data section in .data.

SECTIONS {
  .text : { *(.text) }
  .DATA : { [A-Z]*(.data) }
  .data : { *(.data) }
  .bss : { *(.bss) }
}

4.6.4.3. Input Section for Common Symbols

A special notation is needed for common symbols, because in many object file formats common symbols do not have a particular input section. The linker treats common symbols as though they are in an input section named COMMON.

You may use file names with the COMMON section just as with any other input sections. You can use this to place common symbols from a particular input file in one section while common symbols from other input files are placed in another section.
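For instance, the following sketch routes the common symbols of one file to its own output section. The file name fast.o and the output section name .fastbss are hypothetical.

```
SECTIONS
{
  /* Common symbols from fast.o only; listed first so that it matches first. */
  .fastbss : { fast.o(COMMON) }
  /* Common symbols from all remaining files, together with .bss. */
  .bss     : { *(.bss) *(COMMON) }
}
```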

In most cases, common symbols in input files will be placed in the .bss section in the output file. For example:

.bss : { *(.bss) *(COMMON) }

Some object file formats have more than one type of common symbol. For example, the MIPS ELF object file format distinguishes standard common symbols and small common symbols. In this case, the linker will use a different special section name for other types of common symbols. In the case of MIPS ELF, the linker uses COMMON for standard common symbols and .scommon for small common symbols. This permits you to map the different types of common symbols into memory at different locations.

You will sometimes see [COMMON] in old linker scripts. This notation is now considered obsolete. It is equivalent to *(COMMON).

4.6.4.4. Input Section and Garbage Collection

When link-time garbage collection is in use (--gc-sections), it is often useful to mark sections that should not be eliminated. This is accomplished by surrounding an input section's wildcard entry with KEEP(), as in KEEP(*(.init)) or KEEP(SORT(*)(.ctors)).
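In context, the two KEEP forms might appear in a script like this sketch. The surrounding output section layout is an assumption for illustration.

```
SECTIONS
{
  .text :
  {
    KEEP(*(.init))       /* retained even if nothing references it */
    *(.text)             /* unreferenced .text input sections may be dropped */
  }
  .ctors : { KEEP(SORT(*)(.ctors)) }
}
```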

4.6.4.5. Input Section Example

The following example is a complete linker script. It tells the linker to read all of the sections from file all.o and place them at the start of output section outputa which starts at location 0x10000. All of section .input1 from file foo.o follows immediately, in the same output section. All of section .input2 from foo.o goes into output section outputb, followed by section .input1 from foo1.o. All of the remaining .input1 and .input2 sections from any files are written to output section outputc.

SECTIONS {
  outputa 0x10000 :
    {
    all.o
    foo.o (.input1)
    }
  outputb :
    {
    foo.o (.input2)
    foo1.o (.input1)
    }
  outputc :
    {
    *(.input1)
    *(.input2)
    }
}

4.6.5. Output Section Data

You can include explicit bytes of data in an output section by using BYTE, SHORT, LONG, QUAD, or SQUAD as an output section command. Each keyword is followed by an expression in parentheses providing the value to store (refer to Section 4.10 Expressions in Linker Scripts). The value of the expression is stored at the current value of the location counter.

The BYTE, SHORT, LONG, and QUAD commands store one, two, four, and eight bytes (respectively). After storing the bytes, the location counter is incremented by the number of bytes stored.

For example, this will store the byte 1 followed by the four byte value of the symbol addr:

BYTE(1)
LONG(addr)
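A slightly longer sketch shows how the location counter advances as each value is stored. The output section name .header and the symbol _header_end are hypothetical; addr is the symbol from the example above.

```
.header :
{
  BYTE(1)            /* offset 0, 1 byte  */
  SHORT(0x0203)      /* offset 1, 2 bytes */
  LONG(addr)         /* offset 3, 4 bytes */
  QUAD(0)            /* offset 7, 8 bytes */
  _header_end = .;   /* 15 bytes past the start of .header */
}
```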

When using a 64 bit host or target, QUAD and SQUAD are the same; they both store an 8 byte, or 64 bit, value. When both host and target are 32 bits, an expression is computed as 32 bits. In this case QUAD stores a 32 bit value zero extended to 64 bits, and SQUAD stores a 32 bit value sign extended to 64 bits.

If the object file format of the output file has an explicit endianness, which is the normal case, the value will be stored in that endianness. When the object file format does not have an explicit endianness, as is true of, for example, S-records, the value will be stored in the endianness of the first input object file.

Note that these commands only work inside a section description and not between them, so the following will produce an error from the linker:

SECTIONS { .text : { *(.text) } LONG(1) .data : { *(.data) } }

whereas this will work:

SECTIONS { .text : { *(.text) ; LONG(1) } .data : { *(.data) } }

You may use the FILL command to set the fill pattern for the current section. It is followed by an expression in parentheses. Any otherwise unspecified regions of memory within the section (for example, gaps left due to the required alignment of input sections) are filled with the value of the expression, repeated as necessary. A FILL statement covers memory locations after the point at which it occurs in the section definition; by including more than one FILL statement, you can have different fill patterns in different parts of an output section.

This example shows how to fill unspecified regions of memory with the value 0x90:

FILL(0x90909090)

The FILL command is similar to the =fillexp output section attribute, but it only affects the part of the section following the FILL command, rather than the entire section. If both are used, the FILL command takes precedence. Refer to Section 4.6.8.5 Output Section Fill for details on the fill expression.
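The sketch below uses two FILL statements so that different parts of one output section get different padding. The 0x100 alignment and the choice of input sections are assumptions for illustration.

```
.text :
{
  FILL(0x90909090)
  *(.text)
  . = ALIGN(0x100);      /* padding up to this point uses 0x90909090 */
  FILL(0x00000000)
  *(.rdata)
  . = ALIGN(0x100);      /* gaps after the second FILL use zero bytes */
}
```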

4.6.6. Output Section Keywords

There are a couple of keywords which can appear as output section commands.

CREATE_OBJECT_SYMBOLS

The command tells the linker to create a symbol for each input file. The name of each symbol will be the name of the corresponding input file. The section of each symbol will be the output section in which the CREATE_OBJECT_SYMBOLS command appears.

This is conventional for the a.out object file format. It is not normally used for any other object file format.

CONSTRUCTORS

When linking using the a.out object file format, the linker uses an unusual set construct to support C++ global constructors and destructors. When linking object file formats which do not support arbitrary sections, such as ECOFF and XCOFF, the linker will automatically recognize C++ global constructors and destructors by name. For these object file formats, the CONSTRUCTORS command tells the linker to place constructor information in the output section where the CONSTRUCTORS command appears. The CONSTRUCTORS command is ignored for other object file formats.

The symbol __CTOR_LIST__ marks the start of the global constructors, and the symbol __CTOR_END__ marks the end. Similarly, __DTOR_LIST__ and __DTOR_END__ mark the start and end of the global destructors. The first word in the list is the number of entries, followed by the address of each constructor or destructor, followed by a zero word. The compiler must arrange to actually run the code. For these object file formats GNU C++ normally calls constructors from a subroutine __main; a call to __main is automatically inserted into the startup code for main. GNU C++ normally runs destructors either by using atexit, or directly from the function exit.

For object file formats such as COFF or ELF which support arbitrary section names, GNU C++ will normally arrange to put the addresses of global constructors and destructors into the .ctors and .dtors sections. Placing the following sequence into your linker script will build the sort of table which the GNU C++ runtime code expects to see.

      __CTOR_LIST__ = .;
      LONG((__CTOR_END__ - __CTOR_LIST__) / 4 - 2)
      *(.ctors)
      LONG(0)
      __CTOR_END__ = .;
      __DTOR_LIST__ = .;
      LONG((__DTOR_END__ - __DTOR_LIST__) / 4 - 2)
      *(.dtors)
      LONG(0)
      __DTOR_END__ = .;

If you are using the GNU C++ support for initialization priority, which provides some control over the order in which global constructors are run, you must sort the constructors at link time to ensure that they are executed in the correct order. When using the CONSTRUCTORS command, use SORT(CONSTRUCTORS) instead. When using the .ctors and .dtors sections, use *(SORT(.ctors)) and *(SORT(.dtors)) instead of just *(.ctors) and *(.dtors).

Normally the compiler and linker will handle these issues automatically, and you will not need to concern yourself with them. However, you may need to consider this if you are using C++ and writing your own linker scripts.

4.6.7. Output Section Discarding

The linker will not create output sections that do not have any contents. This is for convenience when referring to input sections that may or may not be present in any of the input files. For example:

.foo : { *(.foo) }

will only create a .foo section in the output file if there is a .foo section in at least one input file.

If you use anything other than an input section description as an output section command, such as a symbol assignment, then the output section will always be created, even if there are no matching input sections.

The special output section name /DISCARD/ may be used to discard input sections. Any input sections which are assigned to an output section named /DISCARD/ are not included in the output file.
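The three cases described in this section can be sketched together. The section names .bar, .comment and .note and the symbol _bar_start are hypothetical.

```
SECTIONS
{
  .text : { *(.text) }
  /* Created only if some input file has a .foo section. */
  .foo  : { *(.foo) }
  /* Always created, because it contains a symbol assignment. */
  .bar  : { _bar_start = .; *(.bar) }
  /* Input sections matched here are left out of the output file. */
  /DISCARD/ : { *(.comment) *(.note) }
}
```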

4.6.8. Output Section Attributes

We showed above that the full description of an output section looked like this:

section [address] [(type)] : [AT(lma)]
  {
    output-section-command
    output-section-command
    …
  } [>region] [AT>lma_region] [:phdr :phdr …] [=fillexp]

We've already described section, address, and output-section-command. In this section we will describe the remaining section attributes.

4.6.8.1. Output Section Type

Each output section may have a type. The type is a keyword in parentheses. The following types are defined:

NOLOAD: The section should be marked as not loadable, so that it will not be loaded into memory when the program is run.
DSECT, COPY, INFO, OVERLAY: These type names are supported for backward compatibility, and are rarely used. They all have the same effect: the section should be marked as not allocatable, so that no memory is allocated for the section when the program is run.

The linker normally sets the attributes of an output section based on the input sections which map into it. You can override this by using the section type. For example, in the script sample below, the ROM section is addressed at memory location 0 and does not need to be loaded when the program is run. The contents of the ROM section will appear in the linker output file as usual.

SECTIONS {
  ROM 0 (NOLOAD) : { … }
  …
}

4.6.8.2. Output Section LMA

Every section has a virtual address (VMA) and a load address (LMA); refer to Section 4.1 Basic Linker Script Concepts. The address expression which may appear in an output section description sets the VMA (refer to Section 4.6.3 Output Section Address).

The linker will normally set the LMA equal to the VMA. You can change that by using the AT keyword. The expression lma that follows the AT keyword specifies the load address of the section. Alternatively, with the AT>lma_region expression, you may specify a memory region for the section's load address. Refer to Section 4.7 MEMORY Command.

This feature is designed to make it easy to build a ROM image. For example, the following linker script creates three output sections: .text, which starts at 0x1000; .mdata, which is loaded at the end of the .text section even though its VMA is 0x2000; and .bss, which holds uninitialized data at address 0x3000. The symbol _data is defined with the value 0x2000, which shows that the location counter holds the VMA value, not the LMA value.

SECTIONS
{
  .text 0x1000 : { *(.text) _etext = . ; }
  .mdata 0x2000 :
    AT ( ADDR (.text) + SIZEOF (.text) )
    { _data = . ; *(.data); _edata = . ;  }
  .bss 0x3000 :
    { _bstart = . ;  *(.bss) *(COMMON) ; _bend = . ;}
}

The run-time initialization code for use with a program generated with this linker script would include something like the following, to copy the initialized data from the ROM image to its runtime address. Notice how this code takes advantage of the symbols defined by the linker script.

extern char _etext, _data, _edata, _bstart, _bend;
char *src = &_etext;
char *dst = &_data;

/* ROM has data at end of text; copy it. */
while (dst < &_edata) {
  *dst++ = *src++;
}

/* Zero bss */
for (dst = &_bstart; dst < &_bend; dst++)
  *dst = 0;
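The AT>lma_region form can express a similar ROM layout through memory regions instead of explicit addresses. This is a sketch only: the region names rom and ram, their origins and lengths, and the symbol _data_load are assumptions for illustration.

```
MEMORY
{
  rom : ORIGIN = 0x1000, LENGTH = 0x1000
  ram : ORIGIN = 0x2000, LENGTH = 0x1000
}
SECTIONS
{
  .text  : { *(.text) _etext = . ; } >rom
  /* The VMA is allocated in ram; the load address is allocated in rom. */
  .mdata : { _data = . ; *(.data) ; _edata = . ; } >ram AT>rom
  _data_load = LOADADDR (.mdata);   /* where the copy loop should read from */
  .bss   : { _bstart = . ; *(.bss) *(COMMON) ; _bend = . ; } >ram
}
```

With this layout, start-up code could use _data_load as the copy source rather than relying on .mdata being loaded exactly at _etext.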

4.6.8.3. Output Section Region

You can assign a section to a previously defined region of memory by using >region. Refer to Section 4.7 MEMORY Command.

Here is a simple example:

MEMORY { rom : ORIGIN = 0x1000, LENGTH = 0x1000 }
SECTIONS { ROM : { *(.text) } >rom }

4.6.8.4. Output Section Phdr

You can assign a section to a previously defined program segment by using :phdr. Refer to Section 4.8 PHDRS Command. If a section is assigned to one or more segments, then all subsequent allocated sections will be assigned to those segments as well, unless they use an explicit :phdr modifier. You can use :NONE to tell the linker to not put the section in any segment at all.

Here is a simple example:

PHDRS { text PT_LOAD ; }
SECTIONS { .text : { *(.text) } :text }
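A somewhat larger sketch shows how later sections inherit the segment assignment and how :NONE opts out. The segment name data and the section names .rodata and .scratch are hypothetical.

```
PHDRS
{
  text PT_LOAD ;
  data PT_LOAD ;
}
SECTIONS
{
  .text    : { *(.text) }    :text
  .rodata  : { *(.rodata) }          /* no :phdr, so it stays in text */
  .data    : { *(.data) }    :data
  .bss     : { *(.bss) }             /* no :phdr, so it stays in data */
  .scratch : { *(.scratch) } :NONE   /* not placed in any segment */
}
```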

4.6.8.5. Output Section Fill

You can set the fill pattern for an entire section by using =fillexp. fillexp is an expression (refer to Section 4.10 Expressions in Linker Scripts). Any otherwise unspecified regions of memory within the output section (for example, gaps left due to the required alignment of input sections) will be filled with the value, repeated as necessary. If the fill expression is a simple hex number, i.e., a string of hex digits starting with 0x and without a trailing k or M, then an arbitrarily long sequence of hex digits can be used to specify the fill pattern; leading zeros become part of the pattern too. For all other cases, including extra parentheses or a unary +, the fill pattern is the four least significant bytes of the value of the expression. In all cases, the number is big-endian.

You can also change the fill value with a FILL command in the output section commands (refer to Section 4.6.5 Output Section Data).

Here is a simple example:

SECTIONS { .text : { *(.text) } =0x90909090 }
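The sketch below applies the rules stated above to three fill expressions. The section names .a, .b and .c are hypothetical, and the byte patterns in the comments are derived from those rules rather than from an actual link.

```
SECTIONS
{
  .a : { *(.a) } =0x90         /* simple hex number: pattern is the byte 90 */
  .b : { *(.b) } =0x0000beef   /* leading zeros count: pattern is 00 00 be ef */
  .c : { *(.c) } =(0x90)       /* parenthesized, so four bytes: 00 00 00 90 */
}
```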

4.6.9. Overlay Description

An overlay description provides an easy way to describe sections which are to be loaded as part of a single memory image but are to be run at the same memory address. At run time, some sort of overlay manager will copy the overlaid sections in and out of the runtime memory address as required, perhaps by simply manipulating addressing bits. This approach can be useful, for example, when a certain region of memory is faster than another.

Overlays are described using the OVERLAY command. The OVERLAY command is used within a SECTIONS command, like an output section description. The full syntax of the OVERLAY command is as follows:

OVERLAY [start] : [NOCROSSREFS] [AT ( ldaddr )]
  {
    secname1
      {
        output-section-command
        output-section-command
        …
      } [:phdr…] [=fill]
    secname2
      {
        output-section-command
        output-section-command
        …
      } [:phdr…] [=fill]
    …
  } [>region] [:phdr…] [=fill]

Everything is optional except OVERLAY (a keyword), and each section must have a name (secname1 and secname2 above). The section definitions within the OVERLAY construct are identical to those within the general SECTIONS construct (refer to Section 4.6 SECTIONS Command), except that no addresses and no memory regions may be defined for sections within an OVERLAY.

The sections are all defined with the same starting address. The load addresses of the sections are arranged such that they are consecutive in memory starting at the load address used for the OVERLAY as a whole (as with normal section definitions, the load address is optional, and defaults to the start address; the start address is also optional, and defaults to the current value of the location counter).

If the NOCROSSREFS keyword is used, and there are any references among the sections, the linker will report an error. Since the sections all run at the same address, it normally does not make sense for one section to refer directly to another.

For each section within the OVERLAY, the linker automatically defines two symbols. The symbol __load_start_secname is defined as the starting load address of the section. The symbol __load_stop_secname is defined as the final load address of the section. Any characters within secname which are not legal within C identifiers are removed. C (or assembler) code may use these symbols to move the overlaid sections around as necessary.

At the end of the overlay, the value of the location counter is set to the start address of the overlay plus the size of the largest section.

Here is an example. Remember that this would appear inside a SECTIONS construct.

  OVERLAY 0x1000 : AT (0x4000)
   {
     .text0 { o1/*.o(.text) }
     .text1 { o2/*.o(.text) }
   }

This will define both .text0 and .text1 to start at address 0x1000. .text0 will be loaded at address 0x4000, and .text1 will be loaded immediately after .text0. The following symbols will be defined: __load_start_text0, __load_stop_text0, __load_start_text1, __load_stop_text1.

C code to copy overlay .text1 into the overlay area might look like the following.

  extern char __load_start_text1, __load_stop_text1;
  memcpy ((char *) 0x1000, &__load_start_text1,
          &__load_stop_text1 - &__load_start_text1);

Note that the OVERLAY command is just syntactic sugar, since everything it does can be done using the more basic commands. The above example could have been written identically as follows.

  .text0 0x1000 : AT (0x4000) { o1/*.o(.text) }
  __load_start_text0 = LOADADDR (.text0);
  __load_stop_text0 = LOADADDR (.text0) + SIZEOF (.text0);
  .text1 0x1000 : AT (0x4000 + SIZEOF (.text0)) { o2/*.o(.text) }
  __load_start_text1 = LOADADDR (.text1);
  __load_stop_text1 = LOADADDR (.text1) + SIZEOF (.text1);
  . = 0x1000 + MAX (SIZEOF (.text0), SIZEOF (.text1));
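
As a variation on the OVERLAY example above, the following sketch adds NOCROSSREFS and records the location counter after the overlay. The symbol name _overlay_end and the trailing .data section are assumptions for illustration.

```
SECTIONS
{
  OVERLAY 0x1000 : NOCROSSREFS AT (0x4000)
  {
    .text0 { o1/*.o(.text) }
    .text1 { o2/*.o(.text) }
  }
  /* The location counter is now 0x1000 plus the size of the larger section. */
  _overlay_end = .;
  .data : { *(.data) }
}
```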