The KahaDB message store is the default persistence store used by Fuse Message Broker. It is a file-based persistence adapter that is optimized for maximum performance. The main features of KahaDB are:
journal-based storage so that messages can be rapidly written to disk
rapid restart of the broker
message references stored in a B-tree index, which can be rapidly updated at run time
full support for JMS transactions
various strategies to enable recovery after a disorderly shutdown of the broker
The KahaDB message store is an embeddable, transactional message store that is fast and reliable. It is an evolution of the AMQ message store used by Fuse Message Broker 5.0 to 5.3. It uses a transactional journal to store message data and a B-tree index to store message locations for quick retrieval.
Figure 2.1 shows a high-level view of the KahaDB message store.
Messages are stored in file-based data logs. When all of the messages in a data log have been successfully consumed, the data log is marked as deletable. At a predetermined clean-up interval, logs marked as deletable are either removed from the system or moved to an archive.
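As a rough illustration of the clean-up rule described above, the following sketch decides what to do with each data log at a clean-up pass. The function name, the `archive_logs` flag, and the per-log counts of unconsumed messages are hypothetical names chosen for this example; they are not part of the KahaDB API, and the sketch considers only the condition stated above.

```python
def plan_cleanup(unconsumed_per_log, archive_logs):
    """Return the action to take for each fully consumed data log.

    unconsumed_per_log maps a data log name to the number of messages in it
    that have not yet been consumed (hypothetical bookkeeping for this sketch).
    """
    action = "archive" if archive_logs else "delete"
    return {
        name: action
        for name, pending in unconsumed_per_log.items()
        if pending == 0  # only logs with no unconsumed messages are deletable
    }

# Illustrative state: the second log still holds unconsumed messages.
pending = {"db-1.log": 0, "db-2.log": 3, "db-3.log": 0}
plan = plan_cleanup(pending, archive_logs=False)
```

A log that still holds at least one unconsumed message is left out of the plan, so it survives the clean-up pass.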
An index of message locations is cached in memory to facilitate quick retrieval of message data. At configurable checkpoint intervals, the references are inserted into the metadata store.
The data logs store data in the form of journals: events of all kinds (messages, acknowledgments, subscriptions, subscription cancellations, transaction boundaries, and so on) are stored in a rolling log. Because new events are always appended to the end of the log, a data log file can be updated extremely rapidly.
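The append-only idea can be sketched in a few lines. The record layout below (a one-byte event type, a four-byte length, then the payload) is an assumption made purely for illustration and is not KahaDB's actual on-disk format; an in-memory buffer stands in for a data log file.

```python
import io
import struct

# Hypothetical record layout for illustration only (not KahaDB's format):
# 1-byte event type, 4-byte big-endian payload length, then the payload.
HEADER = struct.Struct(">BI")
MESSAGE, ACK = 1, 2

def append_event(log, event_type, payload):
    """Append one event at the end of the log and return its starting offset."""
    log.seek(0, io.SEEK_END)
    offset = log.tell()
    log.write(HEADER.pack(event_type, len(payload)))
    log.write(payload)
    return offset

def read_event(log, offset):
    """Read back the event that starts at the given offset."""
    log.seek(offset)
    event_type, length = HEADER.unpack(log.read(HEADER.size))
    return event_type, log.read(length)

log = io.BytesIO()  # stands in for a data log file opened in binary mode
first = append_event(log, MESSAGE, b"order-1")
second = append_event(log, ACK, b"order-1")
```

Because every write goes to the end of the file, no existing bytes are moved or rewritten, and the returned offset is all that is needed to find the event again later.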
Implicitly, the data logs contain all of the message data and all of the information about destinations, subscriptions, transactions, and so on. This data, however, is stored in no particular order: events appear simply in the sequence in which they were written. To facilitate rapid access to the content of the logs, the message store constructs metadata that references the data embedded in the logs.
The metadata cache is an in-memory cache consisting mainly of destinations and message references. That is, for each JMS destination, the metadata cache holds a tree of message references, giving the location of every message in the data log files. Each message reference maps a message ID to a particular offset in one of the data log files. The tree of message references is maintained using a B-tree algorithm, which enables rapid searching, insertion, and deletion operations on an ordered list of messages.
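The shape of this cache can be sketched as one ordered index per destination. Python's standard library has no B-tree, so a sorted list maintained with `bisect` stands in for it here; the class, the destination name, and the log file name are hypothetical and serve only to illustrate the mapping from message ID to log location.

```python
import bisect
from collections import namedtuple

# Location of a message: which data log file, and the byte offset inside it.
Location = namedtuple("Location", ["log_file", "offset"])

class DestinationIndex:
    """Ordered message-ID -> Location map; a sorted list stands in for the B-tree."""

    def __init__(self):
        self._ids = []        # message IDs, kept sorted
        self._locations = {}  # message ID -> Location

    def put(self, message_id, location):
        if message_id not in self._locations:
            bisect.insort(self._ids, message_id)
        self._locations[message_id] = location

    def get(self, message_id):
        return self._locations.get(message_id)

    def remove(self, message_id):
        if self._locations.pop(message_id, None) is not None:
            self._ids.pop(bisect.bisect_left(self._ids, message_id))

    def first(self):
        """Return the lowest message ID still referenced, or None if empty."""
        return self._ids[0] if self._ids else None

# One index per JMS destination, as in the metadata cache described above.
cache = {"queue://orders": DestinationIndex()}
orders = cache["queue://orders"]
orders.put(2, Location("db-1.log", 512))
orders.put(1, Location("db-1.log", 0))
orders.remove(1)  # for example, after message 1 has been consumed
```

A real B-tree keeps insertion and deletion logarithmic even for very large indexes, which the sorted list in this sketch does not; only the interface is meant to carry over.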
The metadata cache is periodically written to the metadata store on the file system. This procedure is known as checkpointing, and the length of time between checkpoints is configurable using the checkpointInterval configuration attribute. For details on how to configure the metadata cache, see Optimizing the Metadata Cache.
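A minimal sketch of interval-based checkpointing follows. It assumes a cache that can be serialized as JSON and a hypothetical file name; the real metadata store uses its own file format, and the write-then-rename step is a common technique for replacing a snapshot safely rather than a description of how KahaDB does it. The interval value is arbitrary.

```python
import json
import os
import tempfile

def write_checkpoint(cache, store_path):
    """Persist the in-memory cache, replacing the previous snapshot atomically."""
    tmp_path = store_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(cache, f)
        f.flush()
        os.fsync(f.fileno())       # make sure the bytes reach the disk
    os.replace(tmp_path, store_path)  # swap in the new snapshot in one step

def checkpoint_due(last_checkpoint_ms, now_ms, checkpoint_interval_ms):
    """True once at least one full interval has passed since the last checkpoint."""
    return now_ms - last_checkpoint_ms >= checkpoint_interval_ms

# Hypothetical file name and cache contents, for illustration only.
store_path = os.path.join(tempfile.mkdtemp(), "metadata.json")
cache = {"queue://orders": {"1": ["db-1.log", 0]}}
if checkpoint_due(last_checkpoint_ms=0, now_ms=5000, checkpoint_interval_ms=5000):
    write_checkpoint(cache, store_path)
```

A shorter interval means less metadata has to be reconstructed after a crash, at the cost of more frequent writes to the metadata store.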
The metadata store contains the complete broker metadata, consisting mainly of a B-tree index giving the message locations in the data logs. The metadata store is written to a file called db.data, which is periodically updated from the metadata cache.
The metadata store duplicates data that is already stored in the data logs (in a raw, unordered form). The presence of the metadata store, however, enables the broker instance to restart rapidly, because the index can be loaded directly instead of being reconstructed. If the metadata store is damaged or accidentally deleted, the broker can still recover by reading the data logs and rebuilding the index, but the restart then takes considerably longer.
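The slower recovery path can be pictured as a replay of every event in the data logs. In this sketch the journal is a list of tuples with hypothetical fields (offset, event kind, destination, message ID), and only message and acknowledgment events are handled; a real journal also carries subscriptions and transaction boundaries.

```python
def rebuild_index(events):
    """Replay journal events in order and return {destination: {message_id: offset}}."""
    index = {}
    for offset, kind, destination, message_id in events:
        refs = index.setdefault(destination, {})
        if kind == "message":
            refs[message_id] = offset      # remember where the message lives
        elif kind == "ack":
            refs.pop(message_id, None)     # consumed messages need no reference
    return index

# Hypothetical journal contents: two messages, the first of which was acknowledged.
journal = [
    (0, "message", "queue://orders", "m1"),
    (40, "message", "queue://orders", "m2"),
    (80, "ack", "queue://orders", "m1"),
]
index = rebuild_index(journal)
```

The replay has to visit every event in every data log, which is why recovery without the metadata store scales with the total size of the logs rather than with the size of the index.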