Chapter 6. Importing Data

Importing existing data into Red Hat CMS is a common deployment requirement. This chapter describes several approaches to data import and the tradeoffs of each. Because the import process depends heavily on the format of the existing data, Red Hat CMS provides tools that developers can configure and extend to meet their needs.

Tip

The database invariants test suites can be used to verify data integrity. Use these suites as part of your QA process prior to importing the actual data into a production system.

6.1. Java

The Red Hat CMS Java APIs can be used to create content directly. This approach offers the most flexibility: the developer can import content items and manage them directly through the domain APIs. For example, it is simple to import a specific content item, assign it a specific lifecycle and workflow, and publish it. This technique is slower than accessing the database directly, because a considerable amount of computation occurs at the Java layer.
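The per-item flow described above (create the item, assign a lifecycle and workflow, then publish) can be sketched as follows. This is only an illustration of the shape of such an import loop: the Item and LegacyRecord classes and the importAll method are hypothetical stand-ins invented for this example, not Red Hat CMS classes, and a real import would call the domain APIs where this sketch sets plain fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ImportSketch {

    /** Hypothetical stand-in for a CMS content item; not a Red Hat CMS class. */
    public static final class Item {
        public final String title;
        public String lifecycle;
        public String workflow;
        public boolean published;

        public Item(String title) {
            this.title = title;
        }
    }

    /** Hypothetical record as read from the legacy system. */
    public static final class LegacyRecord {
        public final String title;
        public final boolean live;

        public LegacyRecord(String title, boolean live) {
            this.title = title;
            this.live = live;
        }
    }

    /**
     * Converts each legacy record into an item, assigns the given lifecycle
     * and workflow names, and marks the item published only if the legacy
     * record was live. A real import would use the domain APIs here.
     */
    public static List<Item> importAll(List<LegacyRecord> records,
                                       String lifecycle, String workflow) {
        List<Item> imported = new ArrayList<>();
        for (LegacyRecord record : records) {
            Item item = new Item(record.title);
            item.lifecycle = lifecycle;
            item.workflow = workflow;
            item.published = record.live;
            imported.add(item);
        }
        return imported;
    }

    public static void main(String[] args) {
        List<LegacyRecord> records = Arrays.asList(
                new LegacyRecord("Press release", true),
                new LegacyRecord("Draft memo", false));
        for (Item item : importAll(records, "standard", "editorial")) {
            System.out.println(item.title + " published=" + item.published);
        }
    }
}
```

Keeping the per-item decisions in one loop is what makes this approach flexible: each record can be given a different lifecycle, workflow, or publication state, at the cost of doing that work in Java for every item.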

Two parts of Red Hat CMS provide good starting points for data imports using this approach: the package com.arsdigita.cms.populate and the class com.arsdigita.cms.installer.xml.ContentItemLoader. The populate package does not provide any data import capability per se; it is designed to programmatically generate large numbers of content items of various types for scale testing. The ContentItemLoader imports data from an XML file into the database; the data must conform to a specific XML schema.
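Because the loader requires the input to conform to a schema, it can help to check a file before attempting an import. The sketch below uses only the standard JDK javax.xml.validation API. The schema and documents shown are hypothetical and purely illustrative; they are not the schema that ContentItemLoader expects, and the sketch does not call ContentItemLoader itself.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class XmlPreflight {

    /** Hypothetical schema for illustration only: an item with a single title. */
    public static final String SAMPLE_XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
          + "<xs:element name=\"item\"><xs:complexType><xs:sequence>"
          + "<xs:element name=\"title\" type=\"xs:string\"/>"
          + "</xs:sequence></xs:complexType></xs:element>"
          + "</xs:schema>";

    /** Returns true if the XML document validates against the given XSD. */
    public static boolean conformsTo(String xsd, String xml) {
        SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema;
        try {
            schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            // The schema itself is broken; this is a caller error.
            throw new IllegalArgumentException("Invalid schema", e);
        }
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The document is malformed or does not conform to the schema.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<item><title>Hello</title></item>";
        String bad = "<item><body>Hello</body></item>";
        System.out.println("good conforms: " + conformsTo(SAMPLE_XSD, good));
        System.out.println("bad conforms: " + conformsTo(SAMPLE_XSD, bad));
    }
}
```

Running a check like this over the export files as part of QA surfaces malformed input early, before any rows are written to the database.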