Durability Configuration
Global Configuration
There are global configuration values for durability, which can be adjusted by specifying the following configuration options:
default wait for sync behavior
--database.wait-for-sync boolean
Default wait-for-sync value. Can be overwritten when creating a new
collection.
The default is false.
force syncing of collection properties to disk
--database.force-sync-properties boolean
Force syncing of collection properties to disk after creating a collection
or updating its properties.
If turned off, no fsync will happen for the collection and database
properties stored in parameter.json files in the file system. Turning off
this option will speed up workloads that create and drop a lot of
collections (e.g. test suites).
The default is true.
interval for automatic, non-requested disk syncs
--wal.sync-interval
The interval (in milliseconds) that ArangoDB will use to automatically
synchronize data in its write-ahead logs to disk. Automatic syncs will only
be performed for not-yet synchronized data, and only for operations that
have been executed without the waitForSync attribute.
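As a minimal sketch of how the three options above might be combined on a server command line, the snippet below builds the argument string from a plain object. The helper name toArangodArgs and all values are assumptions chosen for illustration only; they are not recommendations.

```javascript
// Illustrative values only (assumptions, not recommendations).
var durabilityOptions = {
  "database.wait-for-sync": false,        // default waitForSync for new collections
  "database.force-sync-properties": true, // fsync collection properties after changes
  "wal.sync-interval": 100                // milliseconds between automatic WAL syncs
};

// Hypothetical helper: turns an options object into "--name value" arguments.
function toArangodArgs(options) {
  return Object.keys(options).map(function (name) {
    return "--" + name + " " + options[name];
  }).join(" ");
}

console.log("arangod " + toArangodArgs(durabilityOptions));
```

Setting --database.wait-for-sync to true instead would make newly created collections sync every write before returning, unless overridden per collection.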
Per-collection configuration
You can also configure the durability behavior on a per-collection basis. Use the ArangoDB shell to change these properties.
gets or sets the properties of a collection
collection.properties()
Returns an object containing all collection properties.
- waitForSync: If true creating a document will only return after the data was synced to disk.
- journalSize : The size of the journal in bytes.
- isVolatile: If true then the collection data will be kept in memory only and ArangoDB will not write or sync the data to disk.
- keyOptions (optional): additional options for key generation. This is
a JSON object containing the following attributes (note: some of the
attributes are optional):
- type: the type of the key generator used for the collection.
- allowUserKeys: if set to true, then it is allowed to supply own key values in the _key attribute of a document. If set to false, then the key generator will solely be responsible for generating keys and supplying own key values in the _key attribute of documents is considered an error.
- increment: increment value for autoincrement key generator. Not used for other key generator types.
- offset: initial offset value for autoincrement key generator. Not used for other key generator types.
- indexBuckets: number of buckets into which indexes using a hash table are split. The default is 16 and this number has to be a power of 2 and less than or equal to 1024. For very large collections one should increase this to avoid long pauses when the hash table has to be initially built or resized, since buckets are resized individually and can be initially built in parallel. For example, 64 might be a sensible value for a collection with 100 000 000 documents. Currently, only the edge index respects this value, but other index types might follow in future ArangoDB versions. Changes (see below) are applied when the collection is loaded the next time.
In a cluster setup, the result will also contain the following attributes:
- numberOfShards: the number of shards of the collection.
- shardKeys: contains the names of document attributes that are used to
determine the target shard for documents.
collection.properties(properties)
Changes the collection properties. properties must be an object with one or more of the following attributes:
- waitForSync: If true, creating a document will only return after the data was synced to disk.
- journalSize: The size of the journal in bytes.
- indexBuckets: See above. Changes are only applied when the collection is loaded the next time.
Note: it is not possible to change the journal size after the journal or datafile has been created. Changing this parameter will only affect newly created journals. Also note that you cannot lower the journal size to less than the size of the largest document already stored in the collection.
Note: some other collection properties, such as type, isVolatile, or keyOptions, cannot be changed once the collection is created.
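The rules above can be checked on the client side before calling collection.properties(properties). The following is a minimal sketch, not part of the ArangoDB API: the helper names isValidIndexBuckets and buildPropertiesUpdate are hypothetical, and the server remains the authority on what it accepts.

```javascript
// Properties the text above lists as changeable after creation.
var CHANGEABLE = ["waitForSync", "journalSize", "indexBuckets"];

// Hypothetical helper: true if n is a power of 2 and at most 1024.
function isValidIndexBuckets(n) {
  return typeof n === "number" && Math.floor(n) === n &&
    n >= 1 && n <= 1024 && (n & (n - 1)) === 0;
}

// Hypothetical helper: returns a copy of the requested changes, or throws
// if a property is not changeable or indexBuckets violates the stated rule.
function buildPropertiesUpdate(requested) {
  var update = {};
  Object.keys(requested).forEach(function (name) {
    if (CHANGEABLE.indexOf(name) === -1) {
      throw new Error("not a changeable property: " + name);
    }
    update[name] = requested[name];
  });
  if (update.hasOwnProperty("indexBuckets") &&
      !isValidIndexBuckets(update.indexBuckets)) {
    throw new Error("indexBuckets must be a power of 2 and <= 1024");
  }
  return update;
}

// In arangosh, with an existing collection named "example":
//   db.example.properties(buildPropertiesUpdate({ indexBuckets: 64 }));
```

The check does not cover the journalSize restriction, because the size of the largest stored document is only known to the server.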
Examples
Read all properties
arangosh> db.example.properties();
{
"doCompact" : true,
"journalSize" : 33554432,
"isSystem" : false,
"isVolatile" : false,
"waitForSync" : false,
"keyOptions" : {
"type" : "traditional",
"allowUserKeys" : true,
"lastValue" : 0
},
"indexBuckets" : 8
}
Change a property
arangosh> db.example.properties({ waitForSync : true });
{
"doCompact" : true,
"journalSize" : 33554432,
"isSystem" : false,
"isVolatile" : false,
"waitForSync" : true,
"keyOptions" : {
"type" : "traditional",
"allowUserKeys" : true,
"lastValue" : 0
},
"indexBuckets" : 8
}
Per-operation configuration
Many data-modification operations, as well as ArangoDB's transactions, accept a waitForSync attribute. When it is set, the operation's data is guaranteed to have been synchronized to disk by the time the operation returns.
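For transactions, the attribute is part of the transaction description. The sketch below builds such a description for a single insert. The helper name durableInsertTransaction is hypothetical, and the module path used inside the action (require("internal").db) is an assumption that may differ between ArangoDB versions.

```javascript
// Hypothetical helper: describes a transaction that writes one document and
// asks the server to sync the data to disk before the call returns.
function durableInsertTransaction(collectionName, doc) {
  return {
    collections: { write: collectionName },
    waitForSync: true, // per-operation durability setting
    params: { collectionName: collectionName, doc: doc },
    action: function (params) {
      // Runs on the server, so it cannot use variables from the shell.
      // Assumption: the db object is reachable via the "internal" module.
      var db = require("internal").db;
      return db._collection(params.collectionName).save(params.doc);
    }
  };
}

// In arangosh, with an existing collection named "example":
//   db._executeTransaction(durableInsertTransaction("example", { value: 1 }));
```

A waitForSync value given for a single operation only affects that operation; it does not change the collection's waitForSync property.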
Disk-Usage Configuration
The amount of disk space used by ArangoDB is determined by a few configuration options.
Global Configuration
The total amount of disk storage required by ArangoDB is determined by the size of the write-ahead logfiles plus the sizes of the collection journals and datafiles.
There are the following options for configuring the number and sizes of the write-ahead logfiles:
maximum number of reserve logfiles
--wal.reserve-logfiles
The maximum number of reserve logfiles that ArangoDB will create in a
background process. Reserve logfiles are useful in the situation when an
operation needs to be written to a logfile but the reserve space in the
logfile is too low for storing the operation. In this case, a new logfile
needs to be created to store the operation. Creating new logfiles is
normally slow, so ArangoDB will try to pre-create logfiles in a background
process so there are always reserve logfiles when the active logfile gets
full. The number of reserve logfiles that ArangoDB keeps in the background
is configurable with this option.
maximum number of historic logfiles
--wal.historic-logfiles
The maximum number of historic logfiles that ArangoDB will keep after they
have been garbage-collected. If no replication is used, there is no need
to keep historic logfiles except for having a local changelog.
In a replication setup, the number of historic logfiles affects the amount
of data a slave can fetch from the master's logs. The more historic
logfiles, the more historic data is available for a slave, which is useful
if the connection between master and slave is unstable or slow. Not having
enough historic logfiles available might lead to logfile data being deleted
on the master before a slave has fetched it.
the size of each WAL logfile
--wal.logfile-size
Specifies the filesize (in bytes) for each write-ahead logfile. The logfile
size should be chosen so that each logfile can store a considerable number
of documents. The bigger the logfile size is chosen, the longer it will take
to fill up a single logfile, which also influences the delay until the data
in a logfile will be garbage-collected and written to collection journals
and datafiles. It also affects how long logfile recovery will take at
server start.
whether or not oversize entries are allowed
--wal.allow-oversize-entries
Whether or not it is allowed to store individual documents that are bigger
than would fit into a single logfile. Setting the option to false will make
such operations fail with an error. Setting the option to true will make
such operations succeed, but with a high potential performance impact.
The reason is that for each oversize operation, an individual oversize
logfile needs to be created which may also block other operations.
The option should be set to false if it is certain that documents will
always have a size smaller than a single logfile.
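To get a feeling for how --wal.logfile-size, --wal.reserve-logfiles and --wal.historic-logfiles interact, the following back-of-the-envelope sketch multiplies them out. The helper estimateWalBytes is hypothetical, and the formula is an assumption: it counts one active logfile plus the configured reserve and historic logfiles. Logfiles still waiting for garbage collection and oversize logfiles are not counted, so real usage can be higher. The values are illustrative only.

```javascript
// Hypothetical helper, not part of ArangoDB: rough WAL footprint in bytes.
function estimateWalBytes(logfileSize, reserveLogfiles, historicLogfiles) {
  var activeLogfiles = 1; // assumption: exactly one logfile is being written
  return (activeLogfiles + reserveLogfiles + historicLogfiles) * logfileSize;
}

var MiB = 1024 * 1024;
var bytes = estimateWalBytes(32 * MiB, 3, 10); // illustrative values
console.log(bytes / MiB + " MiB"); // prints "448 MiB"
```

The space used by collection journals and datafiles comes on top of this figure.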
When data gets copied from the write-ahead logfiles into the journals or datafiles
of collections, files will be created on the collection level. How big these files
are is determined by the following global configuration value:
--database.maximal-journal-size size
Maximal size of journal in bytes. Can be overwritten when creating a new
collection. Note that this also limits the maximal size of a single
document.
The default is 32MB.
Per-collection configuration
The journal size can also be adjusted on a per-collection level using the collection's properties method.