Chapter 1. Introduction to Berkeley DB Java Edition

Table of Contents

Features
The JE Application
Databases and Database Environments
Database Records
Putting and Getting Database Records
Duplicate Data
Replacing and Deleting Entries
Secondary Databases
Transactions
JE Resources
Application Considerations
JE Backup and Restore
Getting and Using JE
JE Exceptions

Welcome to Berkeley DB Java Edition (JE). JE is a general-purpose, transactionally protected, embedded database written in 100% Java (JE makes no JNI calls). As such, it offers the Java developer safe and efficient in-process storage and management of arbitrary data.

JE requires J2SE 1.4.2 or later.

Features

JE provides an enterprise-class Java-based data management solution. You use JE through a series of Java APIs. All you need to get started is to add a single jar file to your application's classpath. See Getting and Using JE for more information.

JE offers the following major features:

  • Large database support. JE databases efficiently scale from one to millions of records. The size of your JE databases is likely to be limited more by physical constraints than by any limits imposed upon you by JE.

    Databases are described in Databases.

  • Multiple thread and process support. JE is designed from the ground up for multiple threads of control. Both read and write operations can be performed by multiple threads. JE uses record-level locking for high concurrency in threaded applications. Further, JE uses robust deadlock detection to help you ensure that two threads of control do not deadlock indefinitely.

    Moreover, JE allows multiple processes to access the same databases. However, in this configuration JE requires that no more than one process be allowed to write to the database. Read-only processes are guaranteed a consistent, although potentially out-of-date, view of the stored data.

  • Database records. All database records are organized as simple key/data pairs. Both keys and data can be anything from primitive Java types to the most complex of Java objects.

    Database records are described in Database Records.

  • Transactions. Transactions allow you to treat one or more operations on one or more databases as a single unit of work. JE transactions provide recoverability, atomicity, and isolation for your database operations.

    Note that transaction protection is optional. Transactions are described in Transactions.

  • Indexes. JE allows you to easily create and maintain secondary indexes for your primary data through the use of secondary databases. In this way, you can obtain rapid access to your data through the use of an alternative, or secondary, key.

    Indexes are described in Secondary Databases.

  • In-memory cache. The cache allows high-speed database access for both read and write operations by avoiding unnecessary disk I/O. The cache grows on demand up to a preconfigured maximum size. To improve your application's performance immediately after startup, you can preload the cache so that production requests for your data avoid disk I/O.

    Cache management is described in The Evictor Thread and in Sizing the Cache.

  • Log files. For data persistence, JE databases are stored in one or more log files on disk. The log files are write-once and are portable across platforms with different endian-ness.

    Note that unlike other database implementations, there are no change records or change logs in JE. Instead, JE employs write-ahead logging to protect database modifications. Before any change is made to a database, JE writes information about the change to the log file.

    Note that JE's log files are not binary compatible with Berkeley DB's database files. However, both products provide dump and load utilities, and the files that these operate on are compatible across product lines.

    JE's log files are described in more detail in Backing up and Restoring Berkeley DB Java Edition Applications. For information on using JE's dump and load utilities, see The Command Line Tools.

  • Background threads. JE provides several threads that manage internal resources for you. The evictor thread is responsible for keeping the in-memory cache within a preconfigured maximum size by removing unneeded records from it. The checkpointer is responsible for flushing to disk the database data that was written to the cache as the result of a transaction commit (this is done in order to shorten recovery time). Finally, the cleaner thread is responsible for cleaning and removing unneeded log files, thereby helping you to save on disk space.

    Background thread management is described in Managing the Background Threads.

  • Database environments. Database environments provide a unit of encapsulation and management for one or more databases. In addition, the environment is the unit of management for internal resources such as the in-memory cache and the background threads. Note that all applications using JE are required to use database environments.

    Database environments are described in Database Environments.

  • Backup and restore. JE's backup mechanism consists of simply copying JE's log files to a safe location for storage. To recover from a catastrophic failure, you copy your archived log files back to your production location on disk and reopen the JE environment.

    Note that JE always performs normal recovery when it opens a database environment. Normal recovery brings the database to a consistent state based on change information found in the database log files.

    JE's backup and recovery mechanisms are described in Backing up and Restoring Berkeley DB Java Edition Applications.
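
The log file, backup, and recovery features above fit together in a way that a small sketch may make easier to picture. The following toy class is a hypothetical illustration only: ToyLogStore, LogEntry, and every method shown are invented for this example and are not part of JE's API, file format, or recovery algorithm. It assumes String keys and an in-memory list standing in for the log files. Each change is appended to the log before the in-memory view is updated, a backup is a copy of the log, and reopening the store over that copy replays the entries to rebuild a consistent view.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/**
 * Toy illustration of an append-only log that is replayed on open.
 * Hypothetical class: this is NOT JE's API, file format, or recovery algorithm.
 */
public class ToyLogStore {

    /** One immutable log entry; data == null marks a delete. */
    public static final class LogEntry {
        final String key;
        final byte[] data;

        public LogEntry(String key, byte[] data) {
            this.key = key;
            this.data = data;
        }
    }

    // Stand-in for the on-disk log files: entries are only ever appended.
    private final List<LogEntry> log;
    // Stand-in for the in-memory cache: the current key/data view.
    private final Map<String, byte[]> view = new TreeMap<String, byte[]>();

    /** "Opens" the store over an existing log and replays every entry in order. */
    public ToyLogStore(List<LogEntry> log) {
        this.log = log;
        for (LogEntry e : log) {
            apply(e);
        }
    }

    public void put(String key, byte[] data) {
        append(new LogEntry(key, data.clone()));
    }

    public void delete(String key) {
        append(new LogEntry(key, null));
    }

    /** Returns a copy of the stored data, or null if the key is absent. */
    public byte[] get(String key) {
        byte[] data = view.get(key);
        return data == null ? null : data.clone();
    }

    /** A "backup" here is simply a copy of the log entries. */
    public List<LogEntry> backup() {
        return new ArrayList<LogEntry>(log);
    }

    private void append(LogEntry e) {
        log.add(e); // record the change in the log first
        apply(e);   // then update the in-memory view
    }

    private void apply(LogEntry e) {
        if (e.data == null) {
            view.remove(e.key);
        } else {
            view.put(e.key, e.data);
        }
    }

    public static void main(String[] args) {
        ToyLogStore store = new ToyLogStore(new ArrayList<LogEntry>());
        store.put("apple", "red".getBytes(StandardCharsets.UTF_8));
        store.put("pear", "green".getBytes(StandardCharsets.UTF_8));
        store.put("apple", "green".getBytes(StandardCharsets.UTF_8)); // replace by appending
        store.delete("pear");

        // "Restore": reopen a new store over the archived copy of the log.
        ToyLogStore restored = new ToyLogStore(store.backup());
        System.out.println(new String(restored.get("apple"), StandardCharsets.UTF_8));
        System.out.println(restored.get("pear") == null);
    }
}
```

Running main prints green and then true: the restored store sees only the latest value for "apple" and no record for "pear", even though the log still holds all four entries. Reclaiming the space taken by superseded entries is the kind of work the features list assigns to the cleaner thread; this sketch does not attempt it.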