$Revision: 3630 $
Copyright 2002-2010 Fred Toussi. Permission is granted to distribute this document without any alteration under the terms of the HSQLDB license. Additional permission is granted to the HSQL Development Group to distribute this document with or without alterations under the terms of the HSQLDB license.
$Date: 2010-06-06 10:44:27 -0400 (Sun, 06 Jun 2010) $
HyperSQL has many modes of operation and features that allow it to be used in very different scenarios. Levels of memory usage, speed and accessibility by different applications are influenced by how HyperSQL is deployed.
The decision to run HyperSQL as a separate server process or as an in-process database should be based on the following:
When HyperSQL is run as a server on a separate machine, it is isolated from hardware failures and crashes on the hosts running the application.
When HyperSQL is run as a server on the same machine, it is isolated from application crashes and memory leaks.
Server connections are slower than in-process connections due to the overhead of streaming the data for each JDBC call.
You can reduce client/server traffic by using SQL stored procedures, which cut the number of JDBC execute calls.
During development, it is better to use a Server with server.silent=false, which displays the statements sent to the server on the console window.
To improve speed of execution for statements that are executed repeatedly, reuse a parameterized PreparedStatement for the lifetime of the connection.
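A minimal sketch of this advice, assuming an already open JDBC connection to a HyperSQL database: the statement is prepared once when the helper is created and reused for every insert. The class name CustomerWriter, the table customer and its column name are hypothetical; only standard java.sql calls are used.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class CustomerWriter implements AutoCloseable {
    // Hypothetical table and column names, used only for illustration.
    private static final String INSERT_SQL = "INSERT INTO customer (name) VALUES (?)";

    private final PreparedStatement insert;

    CustomerWriter(Connection conn) throws SQLException {
        // Prepared once, when the writer is created for this connection.
        this.insert = conn.prepareStatement(INSERT_SQL);
    }

    /** Inserts one row by reusing the prepared statement; returns the update count. */
    int add(String name) throws SQLException {
        insert.setString(1, name); // only the parameter value changes between calls
        return insert.executeUpdate();
    }

    @Override
    public void close() throws SQLException {
        insert.close();
    }
}
```

An application would typically create one such helper per connection and close it just before closing the connection itself.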
TEXT tables are designed for special applications where the data has to be in an interchangeable format, such as CSV (comma separated values). TEXT tables should not be used for routine storage of data.
MEMORY tables and CACHED tables are generally used for data storage. The difference between the two is as follows:
The data for all MEMORY tables is read from the *.script file when the database is started and stored in memory. In contrast, the data for CACHED tables is not read into memory until the table is accessed. Furthermore, only part of the data for each CACHED table is held in memory, allowing tables with more data than can be held in memory.
When the database is shut down in the normal way, all the data for MEMORY tables is written out to the disk. In comparison, the data in CACHED tables that has changed is written out during operation and at shutdown.
The size and capacity of the data cache for all the CACHED tables is configurable. This makes it possible to allow all the data in CACHED tables to be cached in memory. In this case, speed of access is good, but slightly slower than MEMORY tables.
For normal applications it is recommended that MEMORY tables are used for small amounts of data, leaving CACHED tables for large data sets. For special applications in which speed is paramount and a large amount of free memory is available, MEMORY tables can be used for large tables as well.
HyperSQL 2.0 supports dedicated storage and access to BLOB and CLOB objects. These objects can have huge sizes. BLOB or CLOB is specified as the type of a column of the table. Afterwards, rows can be inserted into the table using a PreparedStatement for efficient transfer of large LOB data to the database. In mem: catalogs, CLOB and BLOB data is stored in memory. In file: catalogs, this data is stored in a single separate file which has the extension *.lobs. The size of this file can grow to huge, terabyte figures.
LOB data should be stored in the database using a JDBC PreparedStatement object. The streaming methods send the LOB to the database in one operation as a binary or character stream. Inside the database, the disk space is allocated as needed and the data is saved as it is being received. LOB data should be retrieved from the database using a JDBC ResultSet method. When a streaming method is used to retrieve a LOB, it is retrieved in large chunks in a transparent manner. LOB data can also be stored by calling a JDBC method with a String or byte[] argument, but these methods limit the size of the LOB that can be stored or retrieved.
LOB data is not duplicated in the database when a LOB is copied from one table to another. The disk space is reused when a LOB is deleted and is not contained in any table.
By using a dedicated LOB store, HyperSQL achieves consistently high speeds (usually over 20MB / s) for both storage and retrieval of LOBs.
The LOB catalog is stored in the database as a memory table. Therefore the amount of JVM memory should be increased when more than tens of thousands of LOBs are stored in the database.
The files used for storing HyperSQL database data are all in the same directory. New files are always created and deleted by the database engine. Two simple principles must be observed:
The Java process running HyperSQL must have full privileges on the directory where the files are stored. This includes create and delete privileges.
The file system must have enough spare room both for the 'permanent' and 'temporary' files. The default maximum size of the *.log file is 50MB. The *.data file can grow to up to 16GB (more if the default has been increased). The .backup file can be up to the size of the *.data file. The *.lobs file can grow to several terabytes. The temporary files created at the time of a SHUTDOWN can be equal in size to the *.script file and the .data file.
A file: catalog can be made readonly permanently, or it can be opened as readonly. To make the database readonly, the property, value pair, readonly=true can be added to the .properties file of the database.
It is also possible to open a normal database as readonly. For this, the property can be included in the URL of the first connection to the database.
There is another option which allows MEMORY tables to be writable, but without persisting the changes at SHUTDOWN. This option is activated with the property, value pair, files_readonly=true, which can be added to the .properties file of the database, or included in the URL of the first connection to the database. This option is useful for running application tests which operate on a predefined dataset.
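As an illustration of passing these properties on the connection URL, the sketch below appends name=value pairs to a JDBC URL separated by semicolons. The helper name withProperties and the database path /tmp/testdb are hypothetical; the property names readonly and files_readonly are the ones described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class JdbcUrls {
    /** Appends ;name=value pairs to a HyperSQL JDBC URL, in map iteration order. */
    static String withProperties(String baseUrl, Map<String, String> props) {
        StringBuilder url = new StringBuilder(baseUrl);
        for (Map.Entry<String, String> e : props.entrySet()) {
            url.append(';').append(e.getKey()).append('=').append(e.getValue());
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Hypothetical database path; the property only takes effect on the
        // first connection that opens the database.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("files_readonly", "true");
        System.out.println(withProperties("jdbc:hsqldb:file:/tmp/testdb", props));
    }
}
```

The resulting string can be passed to DriverManager.getConnection in the usual way.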
Memory used by the program can be thought of as two distinct pools: memory used for table data, which is not released unless the data is deleted, and memory that can be released or is released automatically, including memory used for caching, building result sets and other internal operations such as storing the information needed to roll back a transaction.
Most JVM implementations allocate up to a maximum amount of memory (usually 64 MB by default). This amount is generally not adequate when large memory tables are used, or when the average size of rows in cached tables is larger than a few hundred bytes. The maximum amount of allocated memory can be set on the Java command line that is used for running HyperSQL. For example, with Sun JVM, parameter -Xmx256m increases the amount to 256 MB.
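To confirm which limit is actually in effect, an application can ask the JVM at startup. This is a small sketch using only the standard Runtime API; the class name HeapCheck is hypothetical.

```java
final class HeapCheck {
    /** Maximum heap the JVM will attempt to use, in megabytes (reflects -Xmx). */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMb() + " MB");
    }
}
```

Running it with and without an -Xmx parameter shows whether the setting reached the JVM that hosts HyperSQL.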
The memory used for a MEMORY table is the sum of memory used by each row. Each MEMORY table row is a Java object that has 2 int or reference variables. It contains an array of objects for the fields in the row. Each field is an object such as Integer, Long, String, etc. In addition, each index on the table adds a node object to the row. Each node object has 6 int or reference variables. As a result, a table with just one column of type INTEGER will have four objects per row, with a total of 10 variables of 4 bytes each, currently taking up 80 bytes per row. Beyond this, each extra column in the table adds at least a few bytes to the size of each row.
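The per-row figure can be turned into a rough sizing estimate. The sketch below simply multiplies the row count by the 80 bytes quoted above for a one-column INTEGER table; for wider tables this is only a lower bound. The class and method names are hypothetical.

```java
final class MemoryTableEstimate {
    // Figure quoted above for a one-column INTEGER table; a lower bound for wider rows.
    static final long BYTES_PER_ROW = 80;

    static long estimateBytes(long rows) {
        return rows * BYTES_PER_ROW;
    }

    public static void main(String[] args) {
        long bytes = estimateBytes(1_000_000);
        System.out.println(bytes + " bytes, about " + bytes / (1024 * 1024) + " MB");
    }
}
```

One million such rows come to 80,000,000 bytes, roughly 76 MB, which already exceeds a 64 MB heap.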
By default, all the rows in the result set are built in memory, so it may not be possible to build very large result sets. In server mode databases, by default, the result set memory is released from the server once the database server has returned the result set. In-process databases release the memory when the application program releases the java.sql.ResultSet object. Server modes require additional memory for returning result sets, as they convert the full result set into an array of bytes which is then transmitted to the client.
HyperSQL 2.0 supports disk-based result sets. The commands SET SESSION RESULT MEMORY ROWS <integer> and SET DATABASE DEFAULT RESULT MEMORY ROWS <integer> specify a threshold for the number of rows. Results with row counts above the threshold are stored on disk. These settings also apply to temporary tables and subquery tables.
When the setFetchSize() method of the Statement interface is used to limit the number of rows fetched, the whole result is held by the engine and is returned to the JDBC ResultSet in blocks of rows of the specified fetch size. Disk-based result sets slow down the database operations and should be used only when absolutely necessary, perhaps with result sets that are larger than tens of thousands of rows.
When UPDATE and DELETE queries are performed on CACHED tables, the full set of rows that are affected, including those affected due to ON UPDATE actions, is held in memory for the duration of the operation. This means it may not be possible to perform deletes or updates involving very large numbers of rows of CACHED tables. Such operations should be performed in smaller sets.
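One way to perform such an operation in smaller sets is to split a key range into chunks and run one DELETE per chunk. The sketch below assumes a numeric key column; the table audit_log, the column id and the class name are hypothetical, and the statement is reused for every chunk.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class ChunkedDelete {
    /** Splits the inclusive range [minId, maxId] into sub-ranges of at most chunkSize ids. */
    static List<long[]> chunks(long minId, long maxId, long chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<long[]> ranges = new ArrayList<>();
        for (long lo = minId; lo <= maxId; lo += chunkSize) {
            long hi = Math.min(lo + chunkSize - 1, maxId);
            ranges.add(new long[] { lo, hi });
        }
        return ranges;
    }

    /** Deletes the rows chunk by chunk, committing after each one if auto-commit is off. */
    static int deleteInChunks(Connection conn, long minId, long maxId, long chunkSize)
            throws SQLException {
        int deleted = 0;
        // Hypothetical table and key column.
        try (PreparedStatement ps =
                 conn.prepareStatement("DELETE FROM audit_log WHERE id BETWEEN ? AND ?")) {
            for (long[] range : chunks(minId, maxId, chunkSize)) {
                ps.setLong(1, range[0]);
                ps.setLong(2, range[1]);
                deleted += ps.executeUpdate();
                if (!conn.getAutoCommit()) {
                    conn.commit(); // ends the transaction for this chunk
                }
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        // Prints the ranges only; no database is needed for this part.
        for (long[] range : chunks(1, 10, 4)) {
            System.out.println(range[0] + ".." + range[1]);
        }
    }
}
```

The chunk size is a tuning choice: smaller chunks hold fewer affected rows in memory at once, at the cost of more execute calls.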
When transaction support is enabled with SET AUTOCOMMIT FALSE, lists of all insert, delete or update operations are stored in memory so that they can be undone when ROLLBACK is issued. For CACHED tables, only the transaction information is held in memory, not the actual rows that have changed. Transactions that span thousands of modifications to data will take up a lot of memory until the next COMMIT or ROLLBACK clears the list. Each row modification uses less than 100 bytes until COMMIT.
When subqueries or views are used in SELECT and other statements, transient tables are created and populated by the engine. If the SET SESSION RESULT MEMORY ROWS <integer> statement has been used, these transient tables are stored on disk when they are larger than the threshold.
With CACHED tables, the data is stored on disk and only up to a maximum number of rows are held in memory at any time. The default is up to 50,000 rows. The SET FILES CACHE ROWS command or the hsqldb.cache_rows connection property can be set to alter this amount. As any random subset of the rows in any of the CACHED tables can be held in the cache, the amount of memory needed by cached rows can reach the sum of the rows containing the largest field data. For example, if a table with 100,000 rows contains 40,000 rows with 1,000 bytes of data in each row and 60,000 rows with 100 bytes in each, the cache can grow to contain 50,000 of the smaller rows, but as explained below, only 10,000 of the large rows.
An additional property, hsqldb.cache_size, is used in conjunction with the hsqldb.cache_rows property. This puts a limit in bytes on the total size of rows that are cached. The default value is 10,000KB. (This is the size of binary images of the rows and indexes. It translates to more actual memory, typically 2-4 times, used for the cache because the data is represented by Java objects.)
If memory is limited, the hsqldb.cache_rows or hsqldb.cache_size database properties can be reduced. In the example above, if the hsqldb.cache_size is reduced from 10,000 to 5,000, it will allow the number of cached rows to reach 50,000 small rows, but only 5,000 of the larger rows.
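The interaction of the two limits in this example can be checked with a little arithmetic. The sketch below takes the smaller of the row limit and the number of rows that fit in the size limit; it treats 1 KB as 1,000 bytes, which is an assumption made here so that the numbers match the text. The class and method names are hypothetical.

```java
final class CacheCapacity {
    /** Rows of the given size that fit in the cache under both limits. */
    static long maxCachedRows(long cacheRows, long cacheSizeKb, long rowBytes) {
        // KB is taken as 1,000 bytes (assumption, chosen to match the example figures).
        long rowsBySize = cacheSizeKb * 1000L / rowBytes;
        return Math.min(cacheRows, rowsBySize);
    }

    public static void main(String[] args) {
        System.out.println(maxCachedRows(50_000, 10_000, 1_000)); // 10000 large rows
        System.out.println(maxCachedRows(50_000, 10_000, 100));   // 50000 small rows
        System.out.println(maxCachedRows(50_000, 5_000, 1_000));  // 5000 large rows
        System.out.println(maxCachedRows(50_000, 5_000, 100));    // 50000 small rows
    }
}
```

With the smaller rows the row limit is reached first; with the larger rows the byte limit is reached first.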
Data for CLOB and BLOB columns is not cached and does not affect the CACHED table memory cache.
The use of Java nio file access method also increases memory usage. Access with nio improves database update speed and is used by default for data files up to 256 MB. For minimal memory use, nio access should be disabled.
HyperSQL uses a set of fast pools for immutable objects such as Integer, Long and short String objects that are stored in the database. In most circumstances, this reduces the memory footprint still further as fewer copies of the most frequently-used objects are kept in memory. The object pools are shared among all databases in the JVM. The size of each pool can be modified only by altering and recompiling the org.hsqldb.store.ValuePool class.
Access to lobs is always performed in chunks, so it is perfectly possible to store and access a CLOB or BLOB that is larger than the JVM memory allocation. Early versions of HyperSQL 2.0 use memory-based tables for the lob catalog (not the data). Therefore it is practical to store about 100,000 individual lobs in the database with the default JVM memory allocation. More lobs can be stored with larger JVM memory allocations. The realistic maximum number of lobs stored in the database is probably about a million. The actual total size of lobs is almost unlimited. We have tested with over 100 GB of lobs without any loss of performance.
In all running modes (server or in-process) multiple connections to the database engine are supported. In-process (standalone) mode supports connections from the client in the same Java Virtual Machine, while server modes support connections over the network from several different clients.
Connection pooling software can be used to connect to the database but it is not generally necessary. Connection pools may be used for the following reasons.
To allow new queries to be performed while a time-consuming query is being performed in the background. In HyperSQL, blocking depends on the transaction control model, the isolation level, and the current activity by other sessions.
To limit the maximum number of simultaneous connections to the database for performance reasons. With HSQLDB this can be useful if your application is designed in a way that opens and closes connections for each small task. Also, the overall performance may be higher when fewer simultaneous connections are used. If you want to reduce the number of simultaneous sessions, you can use a connection pool with fewer pooled connections.
An application that is not both multi-threaded and transactional, such as an application for recording user login and logout actions, does not need more than one connection. The connection can stay open indefinitely and be reopened only when it is dropped due to network problems.
When using an in-process database, when the last connection to the database is closed, the database still remains open. An explicit SHUTDOWN command, with or without an argument, is required to close the database. A connection property on the connection URL or in a properties object can be used to shut down the database when the last connection is closed.
When using a server database (and to some extent, an in-process database), care must be taken to avoid creating and dropping JDBC Connections too frequently. Failure to observe this will result in poor performance when the application is under heavy load.
A common error made by users in load-test simulations is to use a single client machine to open and close thousands of connections to a HyperSQL server instance. The connection attempts will fail after a few thousand because of OS restrictions on opening sockets and the delay that is built into the OS in closing them.
Different modes of operation and settings are used for different purposes. Some scenarios are discussed below:
For application unit testing you can use an all-in-memory, in-process database.
If the tests are all run in one process, then the contents of a mem: database survive between tests. To release the contents you can use the SHUTDOWN command (an SQL command). You can even use multiple mem: databases in your tests and SHUTDOWN each one separately.
If the tests are in different processes and you want to keep the data between the tests, the best solution is to use a Server instance that has a mem: database. After the tests are done, you can SHUTDOWN this database, which will shut down the server.
The Server has an option that allows databases to be created as needed by making a connection (see the Listeners Chapter). This option is useful for testing, as your server is never shut down. Each time you connect to the mem: database that is served by the Server, the database is created if it does not exist (i.e. has been previously shut down).
If you do not want to run a Server instance, and you need persistence between tests in different processes, then you should use a file: database. You can use the shutdown=true connection property to ensure the database is persisted fully after the connections are closed. An alternative option is to use the hsqldb.write_delay=false connection property, but this is slightly slower than the other option.
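A sketch of the first option, passing shutdown=true in a properties object rather than on the URL. It assumes hsqldb.jar is on the classpath and a new database with the default SA user and empty password; the class name and the database path given to open are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

final class TestDatabase {
    /** Connection properties for a database that shuts down with its last connection. */
    static Properties connectionProperties() {
        Properties props = new Properties();
        props.setProperty("user", "SA");       // default admin user of a new database
        props.setProperty("password", "");     // default password is empty
        props.setProperty("shutdown", "true"); // persist fully when the last connection closes
        return props;
    }

    /** Opens an in-process file: database; requires the HyperSQL driver at run time. */
    static Connection open(String dbPath) throws SQLException {
        return DriverManager.getConnection("jdbc:hsqldb:file:" + dbPath, connectionProperties());
    }
}
```

A test would call TestDatabase.open with its own path, use the connection, and close it; closing the last connection then triggers the shutdown.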
It has been reported that some data access frameworks do not close all their connections to the database after the tests. In such situations, you need to use zero WRITE DELAY if you want the data to persist at the end of the tests.
You may actually want to use a file: database, or a server instance that serves a file: database in preference to a mem: database. As HyperSQL logs the DDL and DML statements in the .log file, this file can be used to check what is being sent to the database. Note that UPDATE statements are represented by a DELETE followed by an INSERT statement. Statements are written out when the connection commits. The write delay also has an effect on how soon the statements are written out.
Some types of tests start with a database that already contains the tables and data, and perform various operations on it during the tests. You can create and populate the initial database, then set the property files_readonly=true in the .properties file of the database. The tests can then modify the database, but these modifications are not persisted after the tests have completed.
In this usage, the amount of data change is often limited and there is often a requirement to persist the data immediately. You can use the property write_delay=false to force a disk sync after each commit. Before the application is closed, you should perform the SHUTDOWN command to ensure the database opens instantly the next time it is used.
This usage involves a server application, such as a web application, connecting to an embedded HyperSQL instance. In this usage, the database is often accessed heavily, so performance and latency are considerations. If the database is updated heavily, the default value of the WRITE DELAY property (1 sec) is often enough, as it is assumed the server or the application does not go down frequently. If necessary, you can reduce the WRITE DELAY to a small value (20 ms) without impacting the update speed. If you reduce WRITE DELAY to zero, performance drops to the speed of the disk file sync operation.
Alternatively, a server application can use an all-in-mem database instance for fast access, while sending the data changes to a persistent, disk based instance either periodically or in real time.
Since you won't be able to access in-process database instances from other processes, you will often want to run a Listener in your server applications with embedded databases. You can do this by starting up a Server or WebServer instance programmatically, but you could also use the class org.hsqldb.util.MainInvoker to start up your application and a Server or WebServer without any programming.
Example 11.1. MainInvoker Example
java -cp path/to/your/app.jar:path/to/hsqldb.jar org.hsqldb.util.MainInvoker your.App "" org.hsqldb.server.Server
The database served by the Server is configured in its server.properties file. You can then connect to the database from outside using a JDBC URL like jdbc:hsqldb:hsql://hostname.
This tactic can be used to run off-the-shelf server applications with an embedded HyperSQL Server, without doing any coding.
MainInvoker can be used to run any number of Java class main method invocations in a single JVM. See the API spec for MainInvoker for details on its usage.
All file: databases that are not readonly write changes to the .log file. There are scenarios where writing to the .log file can be turned off to improve performance, especially with larger databases. For these applications you can set the property hsqldb.log_data=false to disable the recovery log and speed up data change performance. The equivalent SQL command is SET FILES LOG FALSE.
With this setting, no data is logged, but all the changes to cached tables are written to the .data file. To persist all the data changes up to date, you can use the CHECKPOINT command. If you perform SHUTDOWN, the data is also persisted correctly. If you do not use CHECKPOINT or SHUTDOWN, all the changes are lost and the database reverts to its original state when it is next opened.
Your server applications can use a database as a temporary disk data cache which is not persisted past the lifetime of the application. For this usage, delete the database files when the application ends.
On some platforms, such as embedded devices which are reliable, this is also a useful option. Your application issues CHECKPOINT to save the changes made so far. This method of use reduces write operations on SSD devices. For this usage, the lock file should also be disabled with the connection property hsqldb.lock_file=false.
Any database that is not produced with the release version of HyperSQL 2.0 must be upgraded to this version. Most catalogs created with 1.8.x can be upgraded simply by opening with HyperSQL 2. When this is not possible due to errors, the rest of the procedures below should be followed.
Once a database is upgraded to 2.0, it can no longer be used with previous versions of HyperSQL.
If your database has been created with version 1.7.x, first upgrade to version 1.8.1 and perform a SHUTDOWN COMPACT with this version. You can then open and upgrade the database with version 2.0.
To upgrade from version 1.8.x with the default TEXT format script files, simply open the database with 2.0. If the version 1.8.x files have database script format set to BINARY or COMPRESSED (ZIPPED) you must issue the SET SCRIPTFORMAT TEXT and SHUTDOWN SCRIPT commands with the old version, then open with the new version of the engine. In most cases the upgrade is successful and complete.
It is strongly recommended to execute SHUTDOWN COMPACT after an automatic upgrade from previous versions.
If your database has been created with version 1.7.2 or 1.7.3, first upgrade to version 1.8.1 and perform a SHUTDOWN COMPACT with this version. You can then upgrade the database to version 2.0.
To upgrade from older version database files (1.7.1 and older) that contain CACHED tables, use the SCRIPT procedure below. In all versions of HyperSQL, the SCRIPT 'filename' command (used as an SQL statement) allows you to save a full record of your database, including database object definitions and data, to a file of your choice. You can export a script file using the old version of the database engine and open the script as a database with 2.0.
Procedure 11.1. Upgrade Using the SCRIPT Procedure for Very Old Versions
1. Open the original database in the old version of DatabaseManager.
2. Issue the SCRIPT command, for example SCRIPT 'newversion.script', to create a script file containing a copy of the database.
3. SHUTDOWN this database.
4. Copy the original *.properties file into newversion.properties in the same directory as newversion.script.
5. Try to open the new database newversion using DatabaseManager of version 1.8.1.
6. If there is any inconsistency in the data, the script line number is reported on the console and the opening process is aborted. Edit and correct any problems in newversion.script before attempting to open again. Use the guidelines in the next section (Manual Changes to the .script File). Use a programming editor that is capable of handling very large files and does not wrap long lines of text.
In HyperSQL 2.0 the full range of ALTER TABLE commands is available to change the data structures and their names. However, if an old database cannot be opened due to data inconsistencies, or it uses index or column names that are not compatible with 2.0, manual editing of the *.script file can be performed.
Version 2.0 does not accept duplicate names for indexes that were allowed before 1.7.2.
Version 2.0 does not accept some table or column names that are SQL reserved keywords without double quoting.
Version 2.0 is more strict with check conditions and default values.
Other manual changes are also possible. Note that the *.script file must be the result of a SHUTDOWN SCRIPT and must contain the full data for the database. The following changes can be applied so long as they do not affect the integrity of existing data.
Names of tables, columns and indexes can be changed. These changes must be consistent regarding foreign key constraints.
CHECK
A check constraint can always be removed.
NOT NULL
A not-null constraint can always be removed.
PRIMARY KEY
A primary key constraint can be removed unless there is a foreign key referencing the column(s).
UNIQUE
A UNIQUE constraint can be removed if there is no foreign key referencing the column(s).
FOREIGN KEY
A FOREIGN KEY constraint can always be removed.
COLUMN TYPES
Some changes to column types are possible. For example an INTEGER column can be changed to BIGINT.
After completing the changes and saving the modified .script file, you can open the database as normal.
HyperSQL 2.0 conforms to the SQL Standard better than previous versions and supports more features. For these reasons, there may be some compatibility issues when converting old databases, or using applications that were written for version 1.8.x or earlier. Some of the potential issues are listed here.
User names and passwords are case-sensitive. Check the .script file of a database for the correct case of user name and password and use this form in the connection properties or on connection URL.
Check constraints must conform to the SQL Standard. A check constraint is rejected if it is not deterministic or retrospectively deterministic. When opening an old database, HyperSQL silently drops check constraints that no longer compile. See under check constraints for more detail about what is not allowed.
Type declarations in column definition and in cast expressions must have the necessary size parameters.
In connection with the above, an old database that did not have the enforce_strict_size property is now converted to version 2.0 with the engine supplying the missing size parameters. For example, a VARCHAR column declaration that has no size is given a 32K size. Check that these sizes are adequate for your use, and change the column definition as necessary.
Column names in a GROUP BY clause were previously resolved to the column label. They are now resolved to column name first, and if the name does not match, to the column label.
If two or more tables in a join contain columns with the same name, the columns cannot be referenced in join and where conditions. Use table names before column names to qualify the references to such columns.
Table definitions containing GENERATED BY DEFAULT AS IDENTITY but with no PRIMARY KEY do not automatically create a primary key. Database .script files made with 1.8 are fine, as the PRIMARY KEY clause is always included. But your application program may assume an automatic primary key is created.
CREATE ALIAS is now obsolete. Use the new function definition syntax. The org.hsqldb.Library class no longer exists. You should use the SQL form of the old library functions. For example, use LOG(x) rather than the direct form, "org.hsqldb.Library.log"(x).
The names of some commands for changing database and session properties have changed. See the list of statements in this chapter.
The database engine saves the files containing all the data in a file catalog when a shutdown takes place. It automatically recovers from an abnormal termination and preserves the data when the catalog is next opened. In an ideal operating environment, with no OS crashes, disk failures or bugs in code, there would be no need to back up a database regularly. In other words, the engine performs the routine shutdown procedure internally, so backing up catalogs is an insurance policy against the kinds of misadventure that are not under the control of the database engine.
The data for each catalog consists of up to 5 files in the same directory with endings such as *.properties, *.script, etc., as detailed in previous chapters.
HyperSQL 2.0 includes commands to back up the database files into a single .tar or .tar.gz file archive. The backup can be performed by a command given in a JDBC session if the target database catalog is running, or on the command line if the target catalog has been shut down.
To back up a running catalog, obtain a JDBC connection and issue a BACKUP DATABASE command in SQL. In its simplest form, the command format below will back up the database as a single .tar.gz file to the given directory.
BACKUP DATABASE TO <directory name> BLOCKING
See the next section under Statements for details about the command and its options. See the sections below about restoring a backup.
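As a sketch of issuing the simplest form of the command from Java, the helper below builds the statement text and executes it on an open connection. Two details are assumptions rather than something stated above: the directory is written as a single-quoted SQL string, and a trailing slash is added to mark it as a directory. The class name and the example path are hypothetical.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

final class BackupCommand {
    /** Builds a blocking BACKUP DATABASE statement targeting the given directory. */
    static String blockingBackupTo(String directory) {
        // Assumption: a trailing slash marks the target as a directory.
        String dir = directory.endsWith("/") ? directory : directory + "/";
        // Double any single quotes so the path is a valid SQL string literal.
        return "BACKUP DATABASE TO '" + dir.replace("'", "''") + "' BLOCKING";
    }

    /** Runs the backup on a connection to the running catalog. */
    static void run(Connection conn, String directory) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute(blockingBackupTo(directory));
        }
    }

    public static void main(String[] args) {
        // Hypothetical backup directory; prints the statement without executing it.
        System.out.println(blockingBackupTo("/var/backups/hsqldb"));
    }
}
```

Because the backup is BLOCKING, other sessions wait while it runs, so it is best scheduled for a quiet period.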
To back up an offline catalog, the catalog must be in a shut down state. Run a Java command like this:
Example 11.2. Offline Backup Example
java -cp path/to/hsqldb.jar org.hsqldb.lib.tar.DbBackup --save tar/path.tar db/base/path
Here, tar/path.tar is a file path to the *.tar or *.tar.gz file to be created, and db/base/path is the file path to the catalog file base name (in the same fashion as in server.database.* settings and JDBC URLs with catalog type file:).
You can list the contents of backup tar files with DbBackup on your operating system command line, or with any Pax-compliant tar or pax client (this includes GNU tar).
Example 11.3. Listing a Backup with DbBackup
java -cp path/to/hsqldb.jar org.hsqldb.lib.tar.DbBackup --list tar/path.tar
Patterns for matching entries are regular expressions, not shell globbing patterns: use .+script to match entries ending in "script", not *script.
You can examine the contents of the backup in their entirety by restoring the backup, as explained in the following section, to a temporary directory.
You use DbBackup on your operating system command line to restore a catalog from a backup.
Example 11.4. Restoring a Backup with DbBackup
java -cp path/to/hsqldb.jar org.hsqldb.lib.tar.DbBackup --extract tar/path.tar db/dir
Here, tar/path.tar is a file path to the *.tar or *.tar.gz file to be read, and db/dir is the target directory to extract the catalog files into. Note that db/dir specifies a directory path, without the catalog file base name. The files will be created with the names stored in the tar file (which you can see as described in the preceding section).
HyperSQL supports encrypted databases. Encryption services use the Java Cryptography Extensions (JCE) and the ciphers installed with the JRE. HyperSQL itself does not contain any cryptography code.
Three elements are involved in specifying the encryption method and key. A cipher, together with its configuration is identified by a string which includes the name of the cipher and optional parameters. A provider is the fully qualified class name of the cipher provider. A key is represented as a hexadecimal string.
First, a key must be created for the desired cipher and configuration. This is done by calling the function CRYPT_KEY(<cipher spec>, <provider>). If the default provider (the built-in JVM ciphers) is used, then NULL should be specified as the provider. The CRYPT_KEY function returns a hexadecimal key. The function call can be made in any HyperSQL database, so long as the provider class is on the classpath. This key can be used to create a new encrypted database. Calls to this function always return different keys, based on generated random values.
As an example, a call to CRYPT_KEY('Blowfish', null) returned the string, '604a6105889da65326bf35790a923932'. To create a new database, the URL below is used:
jdbc:hsqldb:file:<database path>;crypt_key=604a6105889da65326bf35790a923932;crypt_type=blowfish
The third property name is crypt_provider. This is specified only when the provider is not the default provider.
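To illustrate the form of the key and the URL, the sketch below generates a random key with the JVM's own JCE KeyGenerator, hex-encodes it and assembles a URL of the shape shown above. Whether this matches what CRYPT_KEY does internally is an assumption, and CRYPT_KEY remains the documented way to obtain a key. The class name, method names and database path are hypothetical.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.KeyGenerator;

final class CryptUrl {
    /** Generates a random key for the named JCE cipher and returns it as lower-case hex. */
    static String newHexKey(String cipher) {
        try {
            byte[] raw = KeyGenerator.getInstance(cipher).generateKey().getEncoded();
            StringBuilder hex = new StringBuilder(raw.length * 2);
            for (byte b : raw) {
                hex.append(String.format("%02x", b & 0xff)); // two hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Cipher not available in this JVM: " + cipher, e);
        }
    }

    /** Assembles a file: URL carrying the crypt_key and crypt_type properties. */
    static String encryptedUrl(String dbPath, String hexKey, String cryptType) {
        return "jdbc:hsqldb:file:" + dbPath + ";crypt_key=" + hexKey + ";crypt_type=" + cryptType;
    }

    public static void main(String[] args) {
        // Hypothetical database path; a new random key is printed on every run.
        System.out.println(encryptedUrl("/data/mydb", newHexKey("Blowfish"), "blowfish"));
    }
}
```

The key must be stored safely by the application, since the same crypt_key is needed every time the database is opened.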
HyperSQL works with any symmetric cipher that may be available from the JVM.
The files that are encrypted include the .script, .data, .backup and .log files. The .lobs file is not encrypted by default. The property crypt_lobs=true must be specified to encrypt the .lobs file.
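As a minimal sketch of how the connection properties fit together, the helper below assembles a URL from the property names given in this section (crypt_key, crypt_type, crypt_provider, crypt_lobs). The class name, method name and the database path data/mydb are hypothetical and used only for illustration.

```java
// Illustrative helper: assembles an encrypted-database URL from its parts.
final class EncryptedUrlSketch {

    // provider may be null, meaning the default provider is used and the
    // crypt_provider property is omitted.
    static String encryptedUrl(String dbPath, String hexKey, String cipher,
                               String provider, boolean encryptLobs) {
        StringBuilder url = new StringBuilder("jdbc:hsqldb:file:")
                .append(dbPath)
                .append(";crypt_key=").append(hexKey)
                .append(";crypt_type=").append(cipher);
        if (provider != null) {
            url.append(";crypt_provider=").append(provider);
        }
        if (encryptLobs) {
            url.append(";crypt_lobs=true"); // the .lobs file is not encrypted by default
        }
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(encryptedUrl("data/mydb",
                "604a6105889da65326bf35790a923932", "blowfish", null, true));
    }
}
```

The resulting string would be passed to DriverManager.getConnection together with the user name and password.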
General operations on an encrypted database are performed the same as with any database. However, some operations are significantly slower than with the equivalent cleartext database. With MEMORY tables, there is no difference to the speed of SELECT statements, but data change statements are slower. With CACHED tables, the speed of all statements is slower.
Security considerations for encrypted databases have been discussed at length in HSQLDB discussion groups. Development team members have commented that encryption is not a panacea for all security needs. The following issues should be taken into account:
Encrypted files are relatively safe in transport, but because databases contain many repeated values and words, especially known tokens such as CREATE, INSERT, etc., breaking the encryption of a database may be simpler than breaking the encryption of an unknown file.
Only the files are encrypted, not the memory image. Poking into computer memory, while the database is open, will expose the contents of the database.
HyperSQL is open source. Someone who has the key can compile and use a modified version of the program that saves a full cleartext dump of an encrypted database.
Database operations can be monitored at different levels using internal HyperSQL capabilities or add-ons.
Statement level monitoring allows you to gather statistics about executed statements. HyperSQL is supported by the monitoring tool JAMon (Java Application Monitor). JAMon is currently developed as the SourceForge project, jamonapi.
JAMon works at the JDBC level. It can monitor and gather statistics on different types of executed statements or other JDBC calls.
Early versions of JAMon were developed with HSQLDB and had to be integrated into HSQLDB at code level. The latest versions can be added on as a proxy in a much simpler fashion.
HyperSQL can log important internal events of the engine. These events occur during the operation of the engine, and are not always coupled with the exact type of statement being executed. Normal events such as opening and closing of files, or errors such as OutOfMemory conditions are examples of logged events.
HyperSQL supports two methods of logging. One method is specific to the individual database and is managed internally by HyperSQL. The other method is specific to the JVM and is managed by a logging framework.
The internally-generated, individual log for the database can be
enabled with the SET DATABASE EVENT LOG LEVEL
statement, described in the next section. This method of logging is very
useful for desktop application deployment, as it provides an ongoing
record of database operations.
HyperSQL also supports Log4J and JDK logging. The same event information that is passed to the internal log is also passed to external logging frameworks. These frameworks are configured outside HyperSQL. The log messages include the unique id of the database that generated the message, so it can be identified in a multi-database server context.
System level statements are listed in this section. Statements that begin with SET DATABASE or SET FILES are for properties that have an effect on the normal operation of HyperSQL. The effects of these statements are also discussed in different chapters.
SHUTDOWN
shutdown statement
<shutdown statement> ::= SHUTDOWN [IMMEDIATELY |
COMPACT | SCRIPT]
Shuts down the database. If the optional qualifier is not used, a normal SHUTDOWN is performed. A normal SHUTDOWN ensures all data is saved correctly and the database opens without delay on next use.
SHUTDOWN IMMEDIATELY
Saves the *.log file and closes the database files. This is the quickest form of shutdown. This command should not be used as the routine method of closing the database, because the database may take a long time to start when it is next accessed.
SHUTDOWN COMPACT
This is similar to normal SHUTDOWN, but reduces the *.data file to its minimum size. It takes longer than normal SHUTDOWN.
SHUTDOWN SCRIPT
This is similar to SHUTDOWN COMPACT, but it does not rewrite the *.data file. This command produces a full script of the database which can be edited for special purposes prior to the next startup.
Only a user with the DBA role can execute this statement.
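The four forms of the statement can be issued from an application through JDBC. The following is a minimal sketch, assuming the caller already holds a connection opened by a user with the DBA role; the class, enum and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Illustrative helper for issuing the SHUTDOWN variants over JDBC.
final class ShutdownSketch {

    enum Mode { NORMAL, IMMEDIATELY, COMPACT, SCRIPT }

    // Builds the statement text for the chosen mode.
    static String shutdownSql(Mode mode) {
        return mode == Mode.NORMAL ? "SHUTDOWN" : "SHUTDOWN " + mode.name();
    }

    // Executes the statement on a connection that belongs to a DBA user.
    static void shutdown(Connection dbaConnection, Mode mode) throws SQLException {
        try (Statement st = dbaConnection.createStatement()) {
            st.execute(shutdownSql(mode));
        }
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(shutdownSql(mode));
        }
    }
}
```

A routine application exit would normally use Mode.NORMAL, keeping COMPACT for occasional maintenance because it takes longer.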
BACKUP DATABASE
backup database statement
<backup database statement> ::= BACKUP DATABASE TO
<file path> {SCRIPT | [NOT] COMPRESSED} BLOCKING
Backs up the database to the specified <file path> for archiving purposes.
The <file path>
can be in two forms. If
the <file path>
ends with a forward slash, it
specifies a directory. In this case, an automatic name for the archive is
generated that includes the date, time and the base name of the database.
The database is backed up to this archive file in the specified directory.
If the <file path> does not end with a forward slash, it specifies a user-defined file name for the backup archive. The archive is in tar format if it is not compressed, or in gzipped tar format if it is compressed.
The SCRIPT option is not currently supported. If SCRIPT is
specified, the backup will consist of two files, a
*.properties
file and a *.script
file, which contain all the data and settings of the database. These files
are not compressed.
If COMPRESSED or NOT COMPRESSED is specified, the backup consists of the current snapshot of database files. During backup, a CHECKPOINT command is silently executed.
The qualifier, BLOCKING, means all database operations are suspended during backup.
The HyperSQL jar also contains a program that creates an archive of an offline database. It also contains a program to expand an archive into database files. These programs are documented in this chapter under Backing up Database Catalogs.
Only a user with the DBA role can execute this statement.
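The two forms of <file path> and the compression qualifier can be sketched as a small statement builder. This is an illustration only: it assumes the path is written as a single-quoted SQL string literal, and the class and method names and the example paths are hypothetical.

```java
// Illustrative builder for the BACKUP DATABASE statement text.
final class BackupSqlSketch {

    // A path ending with a forward slash names a directory; the archive
    // name is then generated automatically by the engine.
    static boolean isDirectoryTarget(String filePath) {
        return filePath.endsWith("/");
    }

    // Assumption: the path is passed as a single-quoted string literal,
    // so embedded single quotes are doubled.
    static String backupSql(String filePath, boolean compressed) {
        String quoted = "'" + filePath.replace("'", "''") + "'";
        return "BACKUP DATABASE TO " + quoted
                + (compressed ? " COMPRESSED" : " NOT COMPRESSED")
                + " BLOCKING";
    }

    public static void main(String[] args) {
        System.out.println(backupSql("backups/", true));
        System.out.println(backupSql("backups/mydb.tar", false));
    }
}
```

The generated text is executed like any other statement on a connection opened by a DBA user.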
CHECKPOINT
checkpoint statement
<checkpoint statement> ::= CHECKPOINT
[DEFRAG]
Closes the database files, rewrites the script file, deletes the log file and opens the database. If DEFRAG is specified, also shrinks the *.data file to its minimum size.
Only a user with the DBA role can execute this statement.
CRYPT_KEY
crypt_key function
<crypt_key function> ::= CRYPT_KEY ( <cipher
spec>, <provider> )
The statement, CALL CRYPT_KEY( <cipher spec>, <provider> ) returns a binary string representing a valid key for the given cipher and provider. The <provider> argument is specified as NULL for the default provider.
SCRIPT
script statement
<script statement> ::= SCRIPT [<file
name>]
Returns a script containing SQL statements that define the
database, its users, and its schema objects. If <file
name>
is not specified, the statements are returned in a
ResultSet, with each row containing an SQL statement. No data statements
are included in this form. The optional file name is a single-quoted
string. If <file name>
is specified, then the
script is written to the named file. In this case, all the data in all
tables of the database is included in the script as INSERT
statements.
Only a user with the DBA role can execute this statement.
SET DATABASE COLLATION
set database collation statement
<set database collation statement> ::= SET
DATABASE COLLATION <collation name>
Each database can have its own collation. Sets the collation from the set of collations supported by HyperSQL. Once this command has been issued, the database can be opened in any JVM and will retain its collation.
Only a user with the DBA role can execute this statement.
SET DATABASE DEFAULT TABLE TYPE
set database default table type
statement
<set database default table type> ::= SET DATABASE
DEFAULT TABLE TYPE { CACHED | MEMORY }
Sets the type of table created when the next CREATE TABLE statement is executed. The default is MEMORY.
Only a user with the DBA role can execute this statement.
SET DATABASE DEFAULT RESULT MEMORY ROWS
set database default result memory rows
statement
<set database default result memory rows> ::= SET
DATABASE DEFAULT RESULT MEMORY ROWS <unsigned integer
literal>
Sets the maximum number of rows of each result set or other
internal temporary table that are held in memory. This setting applies to
all sessions. Individual sessions can change the value with the
SET SESSION RESULT MEMORY ROWS
command. The default is
0, meaning all result sets are held in memory.
Only a user with the DBA role can execute this statement.
SET DATABASE EVENT LOG LEVEL
set database event log level
statement
<set database event log level> ::= SET DATABASE
EVENT LOG LEVEL { 0 | 1 | 2 }
Sets the amount of information logged in the internal, database-specific event log. Level 0 means no log. Level 1 means only important (error) events. Level 2 means more events, including both important and less important (normal) events. For readonly and mem: databases, if the level is set above 0, the log messages are directed to stderr.
Only a user with the DBA role can execute this statement.
SET DATABASE GC
set database gc statement
<set database gc statement> ::= SET DATABASE GC
<unsigned integer literal>
An optional property which forces calls to System.gc() after the specified number of row operations. The default value for this property is 0, which means no System.gc() calls. Typical values for this property start at 10000, depending on the system and the memory allocation. This property may be useful in some in-process deployments, especially with older JVM implementations.
Only a user with the DBA role can execute this statement.
SET DATABASE SQL SIZE
set database sql size statement
<set database sql size statement> ::= SET DATABASE
SQL SIZE { TRUE | FALSE }
Enable or disable enforcement of column sizes for CHAR and VARCHAR columns. The default is TRUE, meaning table definitions must contain VARCHAR(n) instead of VARCHAR.
Only a user with the DBA role can execute this statement.
SET DATABASE SQL NAMES
set database sql names statement
<set database sql names statement> ::= SET
DATABASE SQL NAMES { TRUE | FALSE }
Enable or disable full enforcement of the rule that prevents SQL keywords being used for database object names such as columns and tables. The default is FALSE, meaning the rule is not fully enforced.
Only a user with the DBA role can execute this statement.
SET DATABASE SQL REFERENCES
set database sql references statement
<set database sql references statement> ::= SET
DATABASE SQL REFERENCES { TRUE | FALSE }
This command can enable or disable full enforcement of the rule that prevents ambiguous column references in SQL statements (usually SELECT statements). A column reference is ambiguous when it is not qualified by a table name or table alias and can refer to more than one column in a JOIN list.
The property is FALSE by default. It is better to enable this check during development, to improve the quality and correctness of SQL statements.
Only a user with the DBA role can execute this statement.
SET DATABASE REFERENTIAL INTEGRITY
set database referential integrity statement
<set database referential integrity statement> ::=
SET DATABASE REFERENTIAL INTEGRITY { TRUE | FALSE }
This command enables or disables the enforcement of referential integrity constraints (foreign key constraints), check constraints apart from NOT NULL and triggers. By default, referential integrity constraints are checked.
The only legitimate use of this statement is before importing large amounts of external data into tables that have existing FOREIGN KEY constraints. After import, the statement must be used again to enable constraint enforcement.
If you are not sure the data conforms to the constraints, run queries to verify all rows conform to the FOREIGN KEY constraints and take appropriate actions for the rows that do not conform.
A query example to return the rows in a foreign key table that have no parent is given below:
Example 11.5. Finding foreign key rows with no parents after a bulk import
SELECT * FROM foreign_key_table LEFT OUTER JOIN primary_key_table ON foreign_key_table.fk_col = primary_key_table.pk_col WHERE primary_key_table.pk_col IS NULL
Only a user with the DBA role can execute this statement.
SET DATABASE UNIQUE NAME
set database unique name
<set database unique name statement> ::= SET
DATABASE UNIQUE NAME <identifier>
Each HyperSQL catalog (database) has an engine-generated internal name. This name is based on the time of creation of the database and is exactly 16 characters long. The name is used in log events sent to external logging frameworks. This name can be changed by an administrator. The new name must be exactly 16 characters long.
SET DATABASE TRANSACTION CONTROL
set database transaction control statement
<set database transaction control statement> ::=
SET DATABASE TRANSACTION CONTROL { LOCKS | MVLOCKS | MVCC
}
Sets the concurrency control system for the database. It can be issued only when all sessions have been committed or rolled back. This command and its modes are discussed in the Sessions and Transactions chapter.
Only a user with the DBA role can execute this statement.
SET FILES BACKUP INCREMENT
set files backup increment statement
<set files backup increment statement> ::= SET
FILES BACKUP INCREMENT { TRUE | FALSE }
Older versions of HSQLDB perform a backup of the .data file before its contents are modified and the whole .data file is saved in a compressed form when a CHECKPOINT or SHUTDOWN is performed. This takes a long time when the size of the database exceeds 100 MB or so (on an average 2010 computer, you can expect a backup speed of 20MB / s or more).
The alternative is backup in increments, just before some part of the .data file is modified. In this mode, no backup is performed at CHECKPOINT or SHUTDOWN. This mode is preferred for large databases which are opened and closed frequently.
The default mode is TRUE. If the old method of backup is preferred, the mode can be set FALSE.
Only a user with the DBA role can execute this statement.
SET FILES CACHE ROWS
set files cache rows statement
<set files cache rows statement> ::= SET FILES
CACHE ROWS <unsigned integer literal>
Sets the maximum number of rows (of CACHED tables) held in the memory cache.
Only a user with the DBA role can execute this statement.
SET FILES CACHE SIZE
set files cache size statement
<set files cache size statement> ::= SET FILES
CACHE SIZE <unsigned integer literal>
Sets maximum amount of data (of CACHED tables) in kilobytes held in the memory cache.
Only a user with the DBA role can execute this statement.
SET FILES DEFRAG
set files defrag statement
<set files defrag statement> ::= SET FILES DEFRAG
<unsigned integer literal>
Sets the threshold for performing a DEFRAG during a checkpoint. The <unsigned integer literal> is the percentage of abandoned space in the *.data file. When a CHECKPOINT is performed, either as a result of the .log file reaching the limit set by SET FILES LOG SIZE m or by the user issuing a CHECKPOINT command, the amount of space abandoned since the database was opened is checked. If it is larger than the specified percentage, a CHECKPOINT DEFRAG is performed instead of a CHECKPOINT.
The default is 0, which indicates no DEFRAG. Useful values are between 10 and 50.
Only a user with the DBA role can execute this statement.
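The threshold rule described above can be sketched as a simple decision function. This is not the engine's code; it assumes the percentage is computed as abandoned bytes relative to the size of the *.data file, and all names are hypothetical.

```java
// Illustrative decision rule for the SET FILES DEFRAG threshold.
final class DefragThresholdSketch {

    // Returns the kind of checkpoint that would be performed for the given
    // amount of abandoned space and threshold percentage.
    static String checkpointKind(long abandonedBytes, long dataFileBytes,
                                 int thresholdPercent) {
        if (thresholdPercent <= 0 || dataFileBytes <= 0) {
            return "CHECKPOINT"; // a threshold of 0 means no automatic DEFRAG
        }
        long abandonedPercent = abandonedBytes * 100 / dataFileBytes;
        return abandonedPercent > thresholdPercent
                ? "CHECKPOINT DEFRAG"
                : "CHECKPOINT";
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // 30% abandoned with a threshold of 20
        System.out.println(checkpointKind(30 * mb, 100 * mb, 20));
        // 10% abandoned with a threshold of 20
        System.out.println(checkpointKind(10 * mb, 100 * mb, 20));
    }
}
```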
SET FILES LOG
set files log statement
<set files log statement> ::= SET FILES LOG { TRUE
| FALSE }
Sets logging of database operations on or off. Turning logging off is for special usage, such as temporary cache usage.
Only a user with the DBA role can execute this statement.
SET FILES LOG SIZE
set files log size statement
<set files log size statement> ::= SET FILES LOG
SIZE <unsigned integer literal>
Sets the maximum size in MB of the *.log file to the specified value. The default maximum size is 50 MB. If the value is zero, no limit is used for the size of the file. When the size of the file reaches this value, a CHECKPOINT is performed and the *.log file is cleared to size 0.
Only a user with the DBA role can execute this statement.
SET FILES BACKUP INCREMENT
set files backup increment statement
<set files backup increment statement> ::= SET
FILES BACKUP INCREMENT { TRUE | FALSE }
This specifies the method for internal backup operation. The default is true.
During updates, the contents of the .data file are modified. When this property is true, the modified contents are backed up gradually. This causes a marginal slowdown in operations, but allows fast checkpoint and shutdown with large .data files.
When the property is false, the .data file is backed up entirely at the time of checkpoint and shutdown. Up to version 1.8.0, HSQLDB supported only full backup. Version 1.8.1 supports incremental backup.
Only a user with the DBA role can execute this statement.
SET FILES NIO
set files nio
<set files nio statement> ::= SET FILES NIO { TRUE
| FALSE }
Changes the access method of the .data file. The default is TRUE and uses the Java nio classes to access the file.
Only a user with the DBA role can execute this statement.
SET FILES WRITE DELAY
set files write delay statement
<set files write delay statement> ::= SET FILES
WRITE DELAY {{ TRUE | FALSE } | <seconds value> | <milliseconds
value> MILLIS}
Sets the WRITE DELAY property of the database. The WRITE DELAY controls the frequency of file sync for the log file. When WRITE DELAY is set to FALSE or 0, the sync takes place immediately at each COMMIT. WRITE DELAY TRUE performs the sync once every 10 seconds (which is the default). A numeric value can be specified instead.
The purpose of this command is to control the amount of data loss in case of a total system crash. A delay of 1 second means at most the data written to disk during the last second before the crash is lost. All data written prior to this has been synced and should be recoverable.
A write delay of 0 impacts performance in high load situations, as the engine has to wait for the file system to catch up.
To avoid this, you can set write delay down to 10 milliseconds.
Each time the SET FILES WRITE DELAY statement is executed with any value, a sync is immediately performed.
Only a user with the DBA role can execute this statement.
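The seconds and MILLIS forms of the statement can be chosen mechanically from a desired delay. The sketch below is an illustration with hypothetical names: it emits the seconds form when the delay is a whole number of seconds and the MILLIS form otherwise.

```java
import java.time.Duration;

// Illustrative builder for the SET FILES WRITE DELAY statement text.
final class WriteDelaySketch {

    static String writeDelaySql(Duration delay) {
        long millis = delay.toMillis();
        if (millis < 0) {
            throw new IllegalArgumentException("delay must not be negative");
        }
        if (millis % 1000 == 0) {
            // Whole seconds; 0 means a sync at each COMMIT.
            return "SET FILES WRITE DELAY " + (millis / 1000);
        }
        return "SET FILES WRITE DELAY " + millis + " MILLIS";
    }

    public static void main(String[] args) {
        System.out.println(writeDelaySql(Duration.ofSeconds(1)));
        System.out.println(writeDelaySql(Duration.ofMillis(10)));
    }
}
```

With a delay of 10 milliseconds, at most the data written during the last 10 milliseconds before a total system crash would be at risk, while the engine avoids waiting for a sync at every COMMIT.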
SET FILES SCALE
set files scale
<set files scale statement> ::= SET FILES SCALE
<scale value>
Changes the scale factor for the .data file. The default scale is 8 and allows 16GB of data storage capacity. The scale can be increased in order to increase the maximum data storage capacity. The scale values 8, 16, 32, 64 and 128 are allowed. Scale value 128 allows a maximum capacity of 256GB.
This command can be used only when there is no data in CACHED tables.
Only a user with the DBA role can execute this statement.
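The two capacities quoted above (scale 8 for 16GB, scale 128 for 256GB) are consistent with a capacity of 2GB per unit of scale. The sketch below assumes this linear relationship also holds for the intermediate scale values, which the text does not state explicitly; the class and method names are hypothetical.

```java
// Illustrative arithmetic for choosing a .data file scale factor.
final class DataScaleSketch {

    static final int[] ALLOWED_SCALES = {8, 16, 32, 64, 128};

    // Assumption: capacity grows linearly, 2GB per unit of scale.
    static long capacityGb(int scale) {
        for (int allowed : ALLOWED_SCALES) {
            if (allowed == scale) {
                return scale * 2L;
            }
        }
        throw new IllegalArgumentException("scale not allowed: " + scale);
    }

    // Returns the smallest allowed scale whose capacity covers the need.
    static int smallestScaleFor(long requiredGb) {
        for (int scale : ALLOWED_SCALES) {
            if (capacityGb(scale) >= requiredGb) {
                return scale;
            }
        }
        throw new IllegalArgumentException("exceeds maximum capacity: " + requiredGb + "GB");
    }

    public static void main(String[] args) {
        int scale = smallestScaleFor(100);
        System.out.println("SET FILES SCALE " + scale);
    }
}
```

Because the scale can be changed only while the CACHED tables are empty, it is worth estimating the eventual data size before loading any data.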
SET FILES LOB SCALE
set files lob scale
<set files lob scale statement> ::= SET FILES LOB
SCALE <scale value>
Changes the scale factor for the .lobs file. The scale is interpreted in kilobytes. The default scale is 32 and allows 64TB of lob data storage capacity. The scale can be reduced in order to improve storage efficiency. If the lobs are a lot smaller than 32 kilobytes, reducing the scale will reduce wasted space. The scale values 1, 2, 4, 8, 16, 32 are allowed. For example if the average size of lobs is 4 kilobytes, the default scale of 32 will result in 28KB wasted space for each lob. Reducing the lob scale to 2 will result in average 1KB wasted space for each lob.
This command can be used only when there is no lob in the database.
Only a user with the DBA role can execute this statement.
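The wasted-space figures above can be reproduced with a little arithmetic. The sketch assumes each lob occupies a whole number of blocks of <scale value> kilobytes, so the waste is the unused tail of the last block; the class and method names are hypothetical.

```java
// Illustrative arithmetic for wasted space per lob at a given lob scale.
final class LobScaleSketch {

    // Assumption: a lob is stored in whole blocks of scaleKb kilobytes.
    static long wastedBytes(long lobBytes, int scaleKb) {
        long blockBytes = scaleKb * 1024L;
        long remainder = lobBytes % blockBytes;
        return remainder == 0 ? 0 : blockBytes - remainder;
    }

    public static void main(String[] args) {
        long fourKb = 4 * 1024L;
        // A 4KB lob at the default scale of 32 leaves 28KB unused.
        System.out.println(wastedBytes(fourKb, 32) / 1024 + "KB");
        // The same lob at scale 2 fills its blocks exactly.
        System.out.println(wastedBytes(fourKb, 2) / 1024 + "KB");
    }
}
```

For lobs of varying sizes, the unused tail averages about half a block, which is the 1KB average quoted for a scale of 2.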
$Revision: 3601 $