Handling errors

Most operations return an lcb_error_t status code. A return code of LCB_SUCCESS indicates success, while any other code indicates an error condition. You can find a full list of error codes in the <libcouchbase/error.h> header.

Note: New applications are advised to enable extended error codes by using the LCB_CNTL_DETAILED_ERRCODES setting (see Tuning and configuring). Extended error codes are not enabled by default to avoid sending older applications error codes that they cannot handle.

To handle errors properly, you must understand what they mean and whether they indicate a data error, a transient condition that can be retried, or a fatal error.

Data errors are errors received from the cluster as a result of a requested operation on a given item. For example, if an lcb_get() is performed on a key that does not exist, the callback will receive an LCB_KEY_ENOENT error. Other examples of conditions that can result in data errors include:
  • Adding a key that already exists (LCB_KEY_EEXISTS)
  • Replacing a key that does not already exist (LCB_KEY_ENOENT)
  • Appending, prepending, or incrementing an item that does not already exist (LCB_KEY_ENOENT, LCB_NOT_STORED)
  • Modifying an item while specifying a CAS, when the item's CAS on the server has since changed (LCB_KEY_EEXISTS)
Environmental errors related to load or network issues might be received in exceptional conditions. Although you should not receive these types of errors in normal, well-provisioned deployments, your application must be prepared to handle them and take proper action. Environmental errors are divided into the following subgroups:
  • Transient. This error type indicates an environment and/or resource limitation either on the server or on the network link between the client and the server. Examples of transient errors include timeout errors or temporary failures such as LCB_ETIMEDOUT (took too long to get a reply) and LCB_ETMPFAIL (server was too busy). Transient errors are typically best handled on the application side by backing off and retrying the operation, with the intent of reducing stress on the exhausted resource. Some examples of transient error causes:
    • Insufficient cache memory on the server
    • Overutilization of the network link between the client and server or between several servers
    • Router or switch failure
    • Failover of a node
    • Overutilization of application-side CPU.
  • Fatal. This error type indicates that the client may have entered an unrecoverable failed state, either because of invalid user input (or client configuration), or because an administrator has modified settings on the cluster (for example, a bucket has been removed). Examples of fatal errors include LCB_AUTH_ERROR (authentication failed) and LCB_BUCKET_ENOENT (bucket does not exist).

    Fatal errors typically require inspection of the client configuration and a restart of the client application or a reversal of the change performed at the cluster. Some examples of fatal error causes:

    • Bucket does not exist.
    • Bucket password is wrong.
    • None of the nodes in the cluster are reachable.
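The back-off-and-retry handling suggested for transient errors can be sketched roughly as follows. This is a minimal illustration, not part of the library: op_status, retry_with_backoff(), backoff_delay_ms(), and the demo helpers are hypothetical names, the status codes are stand-ins so that the sketch compiles without libcouchbase, and the 50 ms base and 2000 ms cap are arbitrary assumptions. A real application would classify an lcb_error_t instead.

```c
/* Stand-in outcome codes (hypothetical); a real program would derive
 * these from an lcb_error_t. */
typedef enum { OP_OK, OP_TRANSIENT, OP_FATAL } op_status;

/* The operation to attempt, and an injected wait hook so the caller
 * decides how to pause (a blocking sleep, an event-loop timer, ...). */
typedef op_status (*op_fn)(void *arg);
typedef void (*sleep_fn)(unsigned ms, void *arg);

/* Delay before retry number `attempt` (0-based): base, 2*base, 4*base,
 * and so on, capped at cap_ms. */
unsigned backoff_delay_ms(unsigned attempt, unsigned base_ms, unsigned cap_ms)
{
    unsigned delay = base_ms;
    while (attempt > 0 && delay < cap_ms) {
        delay *= 2;
        attempt--;
    }
    return delay < cap_ms ? delay : cap_ms;
}

/* Run op until it stops reporting a transient failure or max_attempts
 * is reached. Non-transient results are returned immediately and never
 * retried. Returns OP_FATAL if max_attempts is 0. */
op_status retry_with_backoff(op_fn op, void *op_arg,
                             sleep_fn wait, void *wait_arg,
                             unsigned max_attempts)
{
    op_status st = OP_FATAL;
    unsigned attempt;
    for (attempt = 0; attempt < max_attempts; attempt++) {
        st = op(op_arg);
        if (st != OP_TRANSIENT) {
            return st;
        }
        if (attempt + 1 < max_attempts) {
            /* Assumed tuning values: 50 ms base, 2000 ms cap */
            wait(backoff_delay_ms(attempt, 50, 2000), wait_arg);
        }
    }
    return st;
}

/* Demo helper: fails transiently *arg times, then succeeds. */
op_status flaky_op(void *arg)
{
    int *failures_left = arg;
    if (*failures_left > 0) {
        (*failures_left)--;
        return OP_TRANSIENT;
    }
    return OP_OK;
}

/* Demo helper: records the requested delay instead of sleeping. */
void record_sleep(unsigned ms, void *arg)
{
    *(unsigned *)arg += ms;
}
```

Growing the delay between attempts is what reduces pressure on the exhausted resource; retrying immediately in a tight loop would tend to make the underlying condition worse.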
The lcb_errflags_t enumeration defines a set of flags that are associated with each error code. These flags define the type of error. Some examples of error types:
  • LCB_ERRTYPE_INPUT, which is set when a malformed parameter is passed to the library
  • LCB_ERRTYPE_DATAOP, which is set when the server is unable to satisfy data constraints such as a missing key or a CAS mismatch.
The LCB_EIF${TYPE} macros (for example, LCB_EIFDATA and LCB_EIFTMP), each of which tests for one of the lcb_errflags_t flags, can be used to check whether an error is of a specific type. The following example shows how to check the return codes:
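#include <stdio.h>
#include <libcouchbase/couchbase.h>

/* Callback for get operations; err carries the result of the operation. */
static void get_callback(lcb_t instance,
  const void *cookie, lcb_error_t err, const lcb_get_resp_t *resp)
{
  if (err == LCB_SUCCESS) {
    printf("Successfully retrieved key!\n");
  } else if (LCB_EIFDATA(err)) {
    /* Data error: the server replied, but a data constraint was not met */
    switch (err) {
    case LCB_KEY_ENOENT:
      printf("Key not found!\n");
      break;
    default:
      printf("Received other unhandled data error\n");
      break;
    }
  } else if (LCB_EIFTMP(err)) {
    printf("Transient error received. May retry\n");
  } else {
    /* Any other error type, for example a fatal or input error */
    printf("Received other error: %s\n", lcb_strerror(instance, err));
  }
}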
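To illustrate why error types are expressed as flags rather than as exclusive categories, here is a toy model of flag-based classification. Everything in it is hypothetical and invented for illustration (the TYPE_* flags, the toy_error codes, the flag table, and choose_action()); it does not reproduce the library's actual codes or flag assignments. It only shows how one code can carry several flags and how an application might map them to an action.

```c
/* Hypothetical type flags, loosely modeled on the idea of lcb_errflags_t */
enum {
    TYPE_INPUT     = 1 << 0,  /* malformed argument from the caller */
    TYPE_DATAOP    = 1 << 1,  /* a data constraint was not satisfied */
    TYPE_TRANSIENT = 1 << 2,  /* may succeed if retried later */
    TYPE_FATAL     = 1 << 3   /* client cannot continue unaided */
};

/* Hypothetical error codes */
typedef enum {
    ERR_NONE, ERR_NOT_FOUND, ERR_TIMEOUT, ERR_BAD_ARG, ERR_AUTH, ERR_COUNT
} toy_error;

/* Flags per code; one code may carry more than one flag. */
static const unsigned toy_flags[ERR_COUNT] = {
    0,                       /* ERR_NONE */
    TYPE_DATAOP,             /* ERR_NOT_FOUND */
    TYPE_TRANSIENT,          /* ERR_TIMEOUT */
    TYPE_INPUT,              /* ERR_BAD_ARG */
    TYPE_FATAL | TYPE_INPUT  /* ERR_AUTH */
};

/* Look up the flags for a code; unknown codes are treated as fatal. */
unsigned toy_errtype(toy_error e)
{
    return (unsigned)e < ERR_COUNT ? toy_flags[e] : TYPE_FATAL;
}

typedef enum {
    ACT_CONTINUE, ACT_REPORT, ACT_RETRY, ACT_FIX_CALLER, ACT_ABORT
} action;

/* Pick an action; the most severe flag is checked first so that a
 * code flagged both fatal and input is still treated as fatal. */
action choose_action(toy_error e)
{
    unsigned flags = toy_errtype(e);
    if (e == ERR_NONE)          return ACT_CONTINUE;
    if (flags & TYPE_FATAL)     return ACT_ABORT;
    if (flags & TYPE_TRANSIENT) return ACT_RETRY;
    if (flags & TYPE_DATAOP)    return ACT_REPORT;
    if (flags & TYPE_INPUT)     return ACT_FIX_CALLER;
    return ACT_ABORT;
}
```

Checking flags instead of individual codes means the application does not need to enumerate every code the library can return in order to decide between retrying, reporting, and giving up.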

When to check for errors, and what they mean

Success and failure depend on the context. A successful return code for one of the data operation APIs (for example, lcb_store()) does not mean the operation itself succeeded and the key was successfully stored. Rather, it means the key was successfully placed inside the library’s internal queue. The actual error code is delivered as the error parameter in the operation callback itself (that is, the callback installed with lcb_set_storage_callback()).

A scheduling function (for example, lcb_get()) returns an error only if there is an issue with the cluster (for example, a node was failed over and there is no replica at that position) or if invalid input was passed to the library. Errors received in the callback are either data responses from the server or environmental errors (such as an error that took place while reading from the network).

If a scheduling API returns anything but LCB_SUCCESS, the callback for that specific request will not be delivered. Conversely, it is guaranteed that the callback will always be delivered if the return code for the scheduling function is LCB_SUCCESS.
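The two places where an error can surface can be modeled with a toy client. This is a sketch of the contract described above, not the library's implementation: toy_client, toy_schedule_get(), toy_wait(), and the toy_status codes are all hypothetical stand-ins, and the "server" is a one-line stub that knows only the key "present".

```c
#include <stddef.h>
#include <string.h>

typedef enum { TOY_SUCCESS, TOY_EINVAL, TOY_KEY_ENOENT } toy_status;
typedef void (*toy_callback)(void *cookie, toy_status result);

#define TOY_QUEUE_MAX 8

typedef struct {
    const char *keys[TOY_QUEUE_MAX];
    void *cookies[TOY_QUEUE_MAX];
    size_t count;
    toy_callback callback;
} toy_client;

/* Scheduling step: validates the input and queues the request. The
 * return value says only whether the request was accepted. */
toy_status toy_schedule_get(toy_client *c, void *cookie, const char *key)
{
    if (key == NULL || key[0] == '\0' || c->count == TOY_QUEUE_MAX) {
        return TOY_EINVAL;  /* rejected: no callback will follow */
    }
    c->keys[c->count] = key;
    c->cookies[c->count] = cookie;
    c->count++;
    return TOY_SUCCESS;
}

/* Completion step: delivers exactly one callback per accepted request,
 * carrying the actual outcome of the operation. */
void toy_wait(toy_client *c)
{
    size_t i;
    for (i = 0; i < c->count; i++) {
        toy_status result = strcmp(c->keys[i], "present") == 0
                                ? TOY_SUCCESS : TOY_KEY_ENOENT;
        c->callback(c->cookies[i], result);
    }
    c->count = 0;
}

/* Per-request bookkeeping owned by the application. */
typedef struct {
    int completed;
    toy_status result;
} request_state;

void on_get(void *cookie, toy_status result)
{
    request_state *rs = cookie;
    rs->completed = 1;
    rs->result = result;
}
```

In this model a request for a missing key is still scheduled successfully; the failure arrives later through the callback. A request with an empty key is rejected at scheduling time and its request_state is never touched.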

Program crashes and pitfalls

If your application terminates abnormally while invoking a library function, you may have either encountered a bug or passed the library an invalid pointer. Keep in mind the following points:
  • The library is not thread safe. While you may use multiple lcb_t handles in different threads, you must never access the same handle from multiple threads without using external synchronization functions (such as mutexes).
  • The response structures within the callback are valid only in the scope of the callback function itself. This means you must copy the structure (and any contained keys and values) into another location in memory if you wish to use it outside the callback.
  • Callbacks will not be invoked if the scheduling function returns a failure status. This means that the following code will result in accessing uninitialized memory:
    struct myresult {
      char *value;
      lcb_error_t err;
    };
    static void get_callback(lcb_t instance, const void *cookie,
        lcb_error_t err, const lcb_get_resp_t *resp)
    {
      struct myresult *mr = (struct myresult *)cookie;
      mr->err = err;
      if (mr->err == LCB_SUCCESS) {
        /* Copy the item's value; resp is valid only inside the callback */
        mr->value = malloc(resp->v.v0.nbytes + 1);
        memcpy(mr->value, resp->v.v0.bytes, resp->v.v0.nbytes);
        mr->value[resp->v.v0.nbytes] = '\0';
      } else {
        mr->value = NULL;
      }
    }
    
    // Some lines later
    struct myresult mr;
    lcb_get(instance, &mr, 1, &cmdlist);
    lcb_wait(instance);
    if (mr.value) {
      // If lcb_get() returned an error, this will be uninitialized access!
      // ...
    }
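One way to avoid both pitfalls is to give the result structure a defined state before scheduling and to deep-copy anything taken from the response. The helpers below are a minimal sketch under stated assumptions: struct safe_result, result_status, safe_result_init(), safe_result_store(), safe_result_release(), and dup_bytes() are hypothetical names that do not exist in the library, and the sketch deliberately avoids library types so that it compiles on its own.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical status for the application's own bookkeeping */
typedef enum { RES_PENDING = -1, RES_OK = 0, RES_FAILED = 1 } result_status;

struct safe_result {
    char *value;           /* heap copy owned by the application */
    result_status status;  /* RES_PENDING until a callback has run */
};

/* Call before scheduling, so the fields hold defined values even if
 * the callback is never invoked. */
void safe_result_init(struct safe_result *r)
{
    r->value = NULL;
    r->status = RES_PENDING;
}

/* Copy nbytes from a buffer that is valid only during the callback
 * into a NUL-terminated heap string. Returns NULL if allocation fails. */
char *dup_bytes(const void *bytes, size_t nbytes)
{
    char *copy = malloc(nbytes + 1);
    if (copy != NULL) {
        if (nbytes > 0) {
            memcpy(copy, bytes, nbytes);
        }
        copy[nbytes] = '\0';
    }
    return copy;
}

/* What a successful callback would do with the response payload. */
void safe_result_store(struct safe_result *r, const void *bytes, size_t nbytes)
{
    r->value = dup_bytes(bytes, nbytes);
    r->status = r->value != NULL ? RES_OK : RES_FAILED;
}

/* Free the copy once the application is done with it. */
void safe_result_release(struct safe_result *r)
{
    free(r->value);
    r->value = NULL;
}
```

Applied to the example above, this would mean initializing mr before calling lcb_get(), checking the return value of lcb_get() before calling lcb_wait(), and reading mr.value only after the callback has recorded a result.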

A crash can also be the result of a bug in the library. Sometimes the library calls abort() when it detects an inconsistent state. If you think you have found a bug in the library, file a report in our issue tracker, and be sure to include the library version and any relevant code samples.

Diagnosing issues

Diagnosing issues can typically be done by enabling logging (see Setting up logging).