Creating a Database Cluster

Before you can do anything else, you must initialize a database storage area on disk. This is known as a database cluster. (The SQL standard uses the term catalog cluster.) A database cluster is a collection of databases that will be accessible through a single instance of a running database server. After initialization, a database cluster will contain one database named template1. As the name suggests, this will be used as a template for any subsequently created database; do not use it for actual work and do not drop it.

In file system terms, a database cluster is a single directory under which all data will be stored. We call this the data directory or data area. You can store your data anywhere; there is no default location, although locations such as /usr/local/pgsql/data or /var/lib/pgsql/data are popular.

The instructions that follow describe how to create a database cluster area.

  1. Log in as root.

  2. Create a directory that will be the database cluster area and transfer ownership of it to the Postgres user account:
    root# mkdir /usr/local/pgsql/data
    root# chown postgres /usr/local/pgsql/data

  3. Log in as the Postgres user account:
    root# su postgres

    If you want indexes to be usable for LIKE and regular-expression searches, set your current locale to "C" (instead of the default en_US) before running initdb. To set the current locale, change the value of the environment variable LC_ALL or LANG.

    If you choose to use a locale setting other than "C", you will see the following message while running initdb:
    NOTICE:  Initializing database with en_US collation order.
             This locale setting will prevent use of index
             optimization for LIKE and regexp searches.  
             If you are concerned about speed of
             such queries, you may wish to set LC_COLLATE 
             to "C" and re-initdb.  For more information see 
             the Red Hat Database Administrator and User's Guide.
    This notice warns you that the currently selected locale causes indexes to be sorted in an order that prevents them from being used for LIKE and regular-expression searches. The sort order used within a particular database cluster is set by initdb and can only be changed later if you dump all data, rerun initdb, and reload the data.

  4. To initialize a database cluster, use the initdb command. Specify the location of the data directory with the -D option. For example:
    postgres> initdb -D /usr/local/pgsql/data

    Tip

    As an alternative to the -D option, you can set the environment variable PGDATA.

    initdb will refuse to run if the data directory appears to belong to an already initialized installation.

    Because the data directory contains all the data stored in the database, it is essential that it be well secured from unauthorized access. initdb therefore revokes access permissions from everyone but the Postgres user account.
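
The checks described in the steps above can be sketched as a small shell script. This is a minimal sketch under stated assumptions, not part of the procedure itself: the helper names is_initialized, is_private, and init_cluster are hypothetical, and the "already initialized" test assumes that an initialized data directory contains a PG_VERSION file. The demonstration at the end runs only the two checks on a scratch directory, so it does not need initdb to be installed.

```shell
#!/bin/sh
# Sketch only: the function names here are hypothetical helpers.

# Succeeds if the directory looks like an initialized cluster.
# Assumption: an initialized data directory contains a PG_VERSION file.
is_initialized() {
    [ -f "$1/PG_VERSION" ]
}

# Succeeds if group and others have no permissions on the directory.
is_private() {
    # Characters 5-10 of the mode string are the group and other bits.
    [ "$(ls -ld "$1" | cut -c5-10)" = "------" ]
}

# Initialize a cluster in the "C" locale unless one already exists.
init_cluster() {
    if is_initialized "$1"; then
        echo "already initialized: $1"
        return 1
    fi
    LC_ALL=C initdb -D "$1"
}

# Demonstrate the two checks on a scratch directory (no initdb needed).
demo=$(mktemp -d)
chmod 700 "$demo"
is_initialized "$demo" || echo "not initialized"
is_private "$demo" && echo "private"
rm -rf "$demo"
```

Run as the Postgres user account, a call such as init_cluster /usr/local/pgsql/data would correspond to step 4, and is_private on the same directory afterward gives a quick check that the access permissions are as restrictive as expected.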