UPSERT
The UPSERT keyword can be used for checking whether certain documents exist, and to update/replace them in case they exist, or create them in case they do not exist. On a single server, upserts are executed transactionally in an all-or-nothing fashion. For sharded collections, the entire update operation is not transactional.
Each UPSERT operation is restricted to a single collection, and the collection name must not be dynamic. Only a single UPSERT statement per collection is allowed per AQL query, and it cannot be followed by read operations that access the same collection, by traversal operations, or AQL functions that can read documents.
The syntax for an upsert operation is:
UPSERT searchExpression INSERT insertExpression UPDATE updateExpression IN collection options
UPSERT searchExpression INSERT insertExpression REPLACE updateExpression IN collection options
When using the UPDATE variant of the upsert operation, the found document will be partially updated, meaning only the attributes specified in updateExpression will be updated or added. When using the REPLACE variant of upsert, existing documents will be replaced with the contexts of updateExpression.
Updating a document will modify the document's revision number with a server-generated value. The system attributes _id, _key and _rev cannot be updated, _from and _to can.
The searchExpression contains the document to be looked for. It must be an object literal without dynamic attribute names. In case no such document can be found in collection, a new document will be inserted into the collection as specified in the insertExpression.
In case at least one document in collection matches the searchExpression, it will be updated using the updateExpression. When more than one document in the collection matches the searchExpression, it is undefined which of the matching documents will be updated. It is therefore often sensible to make sure by other means (such as unique indexes, application logic etc.) that at most one document matches searchExpression.
The following query will look in the users collection for a document with a specific name attribute value. If the document exists, its logins attribute will be increased by one. If it does not exist, a new document will be inserted, consisting of the attributes name, logins, and dateCreated:
UPSERT { name: 'superuser' }
INSERT { name: 'superuser', logins: 1, dateCreated: DATE_NOW() }
UPDATE { logins: OLD.logins + 1 } IN users
Note that in the UPDATE case it is possible to refer to the previous version of the document using the OLD pseudo-value.
Setting query options
As in several above examples, the ignoreErrors option can be used to suppress query errors that may occur when trying to violate unique key constraints.
When updating or replacing an attribute with a null value, ArangoDB will not remove the attribute from the document but store a null value for it. To get rid of attributes in an upsert operation, set them to null and provide the keepNull option.
There is also the option mergeObjects that controls whether object contents will be merged if an object attribute is present in both the UPDATE query and in the to-be-updated document.
Note: the default value for mergeObjects is true, so there is no need to specify it explicitly.
To make sure data are durable when an update query returns, there is the waitForSync query option.
Returning documents
UPSERT
statements can optionally return data. To do so, they need to be followed
by a RETURN
statement (intermediate LET
statements are allowed, too). These statements
can optionally perform calculations and refer to the pseudo-values OLD
and NEW
.
In case the upsert performed an insert operation, OLD
will have a value of null.
In case the upsert performed an update or replace operation, OLD
will contain the
previous version of the document, before update/replace.
NEW
will always be populated. It will contain the inserted document in case the
upsert performed an insert, or the updated/replaced document in case it performed an
update/replace.
This can also be used to check whether the upsert has performed an insert or an update internally:
UPSERT { name: 'superuser' }
INSERT { name: 'superuser', logins: 1, dateCreated: DATE_NOW() }
UPDATE { logins: OLD.logins + 1 } IN users
RETURN { doc: NEW, type: OLD ? 'update' : 'insert' }