Populating a Database

One may need to do a large number of table insertions when first populating a database. Here are some tips and techniques for making that as efficient as possible.

Disable Auto-commit

Turn off auto-commit and just do one commit at the end. Otherwise PostgreSQL is doing a lot of work for each record added. In general when you are doing bulk inserts, you want to turn off some of the database features to gain speed.

Use COPY FROM

Use COPY FROM STDIN to load all the records in one command, instead of a series of INSERT commands. This reduces parsing, planning, and related overhead a great deal. If you do this, it is not necessary to experiment with autocommit, as it is only one command.

Remove Indexes

If you are loading a freshly created table, the fastest way is to create the table, bulk-load with COPY, then create any indexes needed for the table. Creating an index on pre-existing data is quicker than updating it incrementally as each record is loaded.

If you are augmenting an existing table, you can DROP INDEX, load the table, then recreate the index. Of course, the database performance for other users may be adversely affected during the time that the index is missing.