Providing observe functions

As of Couchbase Server 2.0, the underlying binary protocol provides the ability to observe items. This means an application can determine whether a document has been persisted to disk, or exists on a replica node. This provides developers assurance that a document will survive node failure. In addition, since the new views functionality of Couchbase Server will only index a document and include it in a view once the document is persisted, an observe function provides assurance that a document will or will not be in a view.

Before you provide an observe function, you need to understand how to retrieve cluster topology for your SDK. In other words, your SDK needs to be able to determine whether a key is on a master node, on replica nodes, or both. The observe request must be sent from your SDK to each individual node where the key exists, so retrieving cluster topology is critical to implementing observe. Your SDK must also be ‘cluster-aware’: it should be able to get updated cluster topology after node failure, rebalance, or node addition. For more information about getting cluster topology from an SDK, see Getting cluster topology.

To provide an observe function, your SDK sends the following binary request:

Byte/     0       |       1       |       2       |       3       |
     /              |               |               |               |
    |0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|
    +---------------+---------------+---------------+---------------+
   0| 0x80          | 0x92          | 0x00          | 0x00          |
    +---------------+---------------+---------------+---------------+
   4| 0x00          | 0x00          | 0x00          | 0x00          |
    +---------------+---------------+---------------+---------------+
   8| 0x00          | 0x00          | 0x00          | 0x12          |
    +---------------+---------------+---------------+---------------+
  12| 0xde          | 0xad          | 0xbe          | 0xef          |
    +---------------+---------------+---------------+---------------+
  16| 0x00          | 0x00          | 0x00          | 0x00          |
    +---------------+---------------+---------------+---------------+
  20| 0x00          | 0x00          | 0x00          | 0x00          |
    +---------------+---------------+---------------+---------------+
  24| 0x00          | 0x04          | 0x00          | 0x05          |
    +---------------+---------------+---------------+---------------+
  28| 0x68 ('h')    | 0x65 ('e')    | 0x6c ('l')    | 0x6c ('l')    |
    +---------------+---------------+---------------+---------------+
  32| 0x6f ('o')    | 0x00          | 0x05          | 0x00          |
    +---------------+---------------+---------------+---------------+
  36| 0x05          | 0x77 ('w')    | 0x6f ('o')    | 0x72 ('r')    |
    +---------------+---------------+---------------+---------------+
  40| 0x6c ('l')    | 0x64 ('d')    |
    +---------------+---------------+
observe command
Field        (offset) (value)
Magic        (0)    : 0x80
Opcode       (1)    : 0x92
Key length   (2,3)  : 0x0000
Extra length (4)    : 0x00
Data type    (5)    : 0x00
Vbucket      (6,7)  : 0x0000
Total body   (8-11) : 0x00000012
Opaque       (12-15): 0xdeadbeef
CAS          (16-23): 0x0000000000000000
Key #0
    vbucket  (24-25): 0x0004
    keylen   (26-27): 0x0005
             (28-32): "hello"
Key #1
    vbucket  (33-34): 0x0005
    keylen   (35-36): 0x0005
             (37-41): "world"

In this type of binary request, everything up to and including the CAS value is header data, and everything that follows the CAS value is payload. The format is similar to any other Couchbase Server read/write request, but there are differences in the header and payload. Here the keys to observe are carried in the payload, beginning with Key #0, and each key is preceded by its vbucket and key length. In this example, we provide two keys that we want to observe, hello and world. The Opcode : 0x92 indicates to Couchbase Server that this is an observe request.
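The request layout above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name build_observe_request and its parameters are hypothetical, and the body length is computed from the encoded keys rather than hard-coded.

```python
import struct

REQUEST_MAGIC = 0x80
OBSERVE_OPCODE = 0x92


def build_observe_request(keys, opaque=0):
    """Encode one observe request (hypothetical helper name).

    keys is an iterable of (vbucket_id, key_bytes) pairs.
    """
    # Payload: for each key, a 2-byte vbucket, a 2-byte key length, then the key.
    body = b"".join(
        struct.pack(">HH", vbucket, len(key)) + key
        for vbucket, key in keys
    )
    # 24-byte header, all fields big-endian (network byte order).
    header = struct.pack(
        ">BBHBBHIIQ",
        REQUEST_MAGIC,
        OBSERVE_OPCODE,
        0,          # key length: unused, keys travel in the payload
        0,          # extra length
        0,          # data type
        0,          # vbucket: unused in the header for observe
        len(body),  # total body length
        opaque,     # echoed back by the server in the response
        0,          # CAS
    )
    return header + body


packet = build_observe_request([(4, b"hello"), (5, b"world")], opaque=0xDEADBEEF)
```

The opaque value is what lets an SDK match a response to the request that produced it, so an SDK would normally choose a fresh value per request.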

Your SDK should build the binary request packet once for all the keys that will be observed. After your SDK sends the request to all master and replica nodes containing the keys, each node sends back one response covering all the keys that exist on that node.
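As a rough sketch of the fan-out step, the following Python function collects the nodes that should receive the packet. The shape of vbucket_map (a dict from vbucket id to a list of node names, master first, then replicas) and the node names are assumptions made for illustration; a real SDK would use whatever structure it builds from the cluster topology.

```python
def nodes_for_keys(keys, vbucket_map):
    """Return every master and replica node that holds one of the keys.

    keys is an iterable of (vbucket_id, key_bytes) pairs.
    vbucket_map is a hypothetical dict: vbucket_id -> [master, replica, ...].
    """
    nodes = []
    for vbucket, _key in keys:
        for node in vbucket_map[vbucket]:
            if node not in nodes:  # keep first-seen order, no duplicates
                nodes.append(node)
    return nodes


# Hypothetical topology with made-up node names.
example_map = {4: ["node-a", "node-b"], 5: ["node-b", "node-c"]}
targets = nodes_for_keys([(4, b"hello"), (5, b"world")], example_map)
```

The same packet is then sent to each node in targets, and one response is expected from each.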

When you make this binary request, you are providing the functional equivalent of the following Couchbase Server STAT requests, which are used in the telnet protocol:

  • STAT key_is_dirty : If Couchbase Server responds with a value of 0, this means a key is persisted; if key_is_dirty has the value 1, the key is not yet persisted.

  • STAT key_cas : Couchbase Server provides the current CAS value for a key as a response. Your SDK can compare this value with the CAS it already holds to determine whether the key has been updated before you perform an observe.

You decide how often your SDK polls Couchbase Server as part of an observe request; take your expected server workload into account when you choose the interval. You also need to determine a standard timeout for the observe function you provide; as an option, you can let the developer set the timeout as a parameter. The following statistics are useful in determining how often you should poll:

  • Persist Stat : check this server statistic within your SDK to determine how many milliseconds it takes for a key to be persisted.

  • Repl Stat : check this server statistic to determine how many milliseconds it takes for a key to be placed on a replica node.
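One way to combine a timeout with these statistics is sketched below in Python. Everything here is an assumption for illustration: observe_once is a hypothetical callable that performs one observe round and returns (done, persist_ms, repl_ms), and the clamping bounds are arbitrary defaults rather than recommended values.

```python
import time


def poll_until(observe_once, timeout_s=5.0, min_interval_s=0.01,
               max_interval_s=0.5, sleep=time.sleep, clock=time.monotonic):
    """Poll until observe_once reports completion or the timeout expires.

    Returns True when the observe condition was met, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        done, persist_ms, repl_ms = observe_once()
        if done:
            return True
        # Use the slower of the two server-reported times as the wait hint,
        # clamped so the SDK neither spins nor sleeps for too long.
        hint_s = max(persist_ms, repl_ms) / 1000.0
        interval = min(max(hint_s, min_interval_s), max_interval_s)
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        sleep(min(interval, remaining))
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting; an SDK could drop them or replace the loop with its own event scheduling.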

When Couchbase Server responds to an observe request, the response has the following binary format:

Byte/     0       |       1       |       2       |       3       |
     /              |               |               |               |
    |0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|
    +---------------+---------------+---------------+---------------+
   0| 0x81          | 0x92          | 0x00          | 0x00          |
    +---------------+---------------+---------------+---------------+
   4| 0x00          | 0x00          | 0x00          | 0x00          |
    +---------------+---------------+---------------+---------------+
   8| 0x00          | 0x00          | 0x00          | 0x24          |
    +---------------+---------------+---------------+---------------+
  12| 0xde          | 0xad          | 0xbe          | 0xef          |
    +---------------+---------------+---------------+---------------+
  16| 0x00          | 0x00          | 0x03          | 0xe8          |
    +---------------+---------------+---------------+---------------+
  20| 0x00          | 0x00          | 0x00          | 0x64          |
    +---------------+---------------+---------------+---------------+
  24| 0x00          | 0x04          | 0x00          | 0x05          |
    +---------------+---------------+---------------+---------------+
  28| 0x68 ('h')    | 0x65 ('e')    | 0x6c ('l')    | 0x6c ('l')    |
    +---------------+---------------+---------------+---------------+
  32| 0x6f ('o')    | 0x01          | 0x00          | 0x00          |
    +---------------+---------------+---------------+---------------+
  36| 0x00          | 0x00          | 0x00          | 0x00          |
    +---------------+---------------+---------------+---------------+
  40| 0x00          | 0x0a          | 0x00          | 0x05          |
    +---------------+---------------+---------------+---------------+
  44| 0x00          | 0x05          | 0x77 ('w')    | 0x6f ('o')    |
    +---------------+---------------+---------------+---------------+
  48| 0x72 ('r')    | 0x6c ('l')    | 0x64 ('d')    | 0x00          |
    +---------------+---------------+---------------+---------------+
  52| 0xde          | 0xad          | 0xbe          | 0xef          |
    +---------------+---------------+---------------+---------------+
  56| 0xde          | 0xad          | 0xca          | 0xfe          |
    +---------------+---------------+---------------+---------------+
observe response
Field        (offset) (value)
Magic        (0)    : 0x81
Opcode       (1)    : 0x92
Key length   (2,3)  : 0x0000
Extra length (4)    : 0x00
Data type    (5)    : 0x00
Status       (6,7)  : 0x0000
Total body   (8-11) : 0x00000024
Opaque       (12-15): 0xdeadbeef
Persist Stat (16-19): 0x000003e8 (msec time)
Repl Stat    (20-23): 0x00000064 (msec time)
Key #0
    vbucket  (24-25): 0x0004
    keylen   (26-27): 0x0005
             (28-32): "hello"
    keystate (33)   : 0x01 (persisted)
    cas      (34-41): 000000000000000a
Key #1
    vbucket  (42-43): 0x0005
    keylen   (44-45): 0x0005
             (46-50): "world"
    keystate (51)   : 0x00 (not persisted)
    cas      (52-59): deadbeefdeadcafe

In the Couchbase Server response, keystate will indicate whether a key is persisted or not. The following are possible values for keystate :

  • 0x00 : Found, not persisted. Indicates key is in RAM, but not persisted to disk.

  • 0x01 : Found, persisted. Indicates key is found in RAM, and is persisted to disk.

  • 0x80 : Not found. Couchbase Server returns this keystate for any item that is not stored on the server; after a delete, it indicates that the deletion has been persisted. An item in this state is not available in any view/index.

  • 0x81 : Logically deleted. Indicates an item has been deleted in RAM, but is not yet deleted from disk.
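A response in the format shown above could be decoded along these lines. This is a minimal Python sketch under stated assumptions: the function name parse_observe_response, the tuple layout it returns, and the KEY_STATES labels are illustrative choices, not part of any existing SDK.

```python
import struct

KEY_STATES = {
    0x00: "found, not persisted",
    0x01: "found, persisted",
    0x80: "not found",
    0x81: "logically deleted",
}


def parse_observe_response(packet):
    """Return (opaque, persist_ms, repl_ms, entries) for one observe response.

    entries is a list of (vbucket, key, keystate, cas) tuples.
    """
    if len(packet) < 24:
        raise ValueError("packet shorter than the 24-byte header")
    # The 8 bytes that hold the CAS in a request carry Persist Stat and
    # Repl Stat (two 4-byte millisecond values) in an observe response.
    (magic, opcode, _key_len, _extras_len, _data_type, status,
     body_len, opaque, persist_ms, repl_ms) = struct.unpack_from(
        ">BBHBBHIIII", packet, 0)
    if magic != 0x81 or opcode != 0x92:
        raise ValueError("not an observe response")
    if status != 0:
        raise ValueError("observe failed with status 0x%04x" % status)

    body = packet[24:24 + body_len]
    entries = []
    offset = 0
    while offset < len(body):
        vbucket, key_len = struct.unpack_from(">HH", body, offset)
        offset += 4
        key = body[offset:offset + key_len]
        offset += key_len
        keystate, cas = struct.unpack_from(">BQ", body, offset)
        offset += 9  # 1-byte keystate + 8-byte CAS
        entries.append((vbucket, key, keystate, cas))
    return opaque, persist_ms, repl_ms, entries
```

A truncated payload makes struct.unpack_from raise struct.error; an SDK would probably want to translate that into its own error type.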

It is important that you understand the difference between ‘not found’ and ‘logically deleted’, because their meaning depends on the operation your SDK performed before the observe. After an SDK performs a document write, the first thing it needs to determine is whether or not the item has been stored on the right node. In this scenario, ‘not found’ and ‘logically deleted’ indicate the same state of a key: the item is not yet stored on the appropriate node.

If the SDK performs a delete on a key, then the observe responses ‘not found’ and ‘logically deleted’ have two different meanings. If Couchbase Server returns ‘not found’ after a delete operation, the delete has been persisted on that node. If you receive a ‘logically deleted’ response, the item has been removed from Couchbase Server RAM but is not yet deleted from disk.
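The context-dependent reading of keystate described above can be sketched as a small Python function. The function name, the "write" and "delete" labels, and the returned strings are hypothetical. The branch for a found key after a delete is an assumption, since the text above does not cover that case.

```python
NOT_PERSISTED = 0x00
PERSISTED = 0x01
NOT_FOUND = 0x80
LOGICALLY_DELETED = 0x81


def interpret_keystate(last_operation, keystate):
    """Describe a keystate given the operation the SDK performed beforehand."""
    if last_operation == "write":
        if keystate in (NOT_FOUND, LOGICALLY_DELETED):
            # Both states mean the same thing after a write.
            return "not yet stored on this node"
        if keystate == NOT_PERSISTED:
            return "stored in RAM, not yet persisted"
        if keystate == PERSISTED:
            return "persisted"
    elif last_operation == "delete":
        if keystate == NOT_FOUND:
            return "delete persisted"
        if keystate == LOGICALLY_DELETED:
            return "removed from RAM, not yet deleted from disk"
        if keystate in (NOT_PERSISTED, PERSISTED):
            # Assumption: the item is still found, so the delete has not
            # reached this node yet.
            return "delete not yet applied on this node"
    raise ValueError("unknown operation or keystate")
```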

As a final note, should you choose to provide an observe function as an asynchronous method, you need to provide an ‘observe-set’ as part of your SDK. An observe-set is a table that stores all the ongoing observe requests sent from the SDK. When Couchbase Server fulfills an observe request by providing all required status updates for a key, your SDK should remove that request from the observe-set. Your SDK should also provide a function that retrieves any asynchronous observe results that have been received from Couchbase Server and stored in SDK runtime memory.
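A minimal Python sketch of such an observe-set follows. The class and method names are hypothetical, requests are tracked by their opaque value, and "all required status updates" is simplified here to one recorded keystate per key; a real SDK would apply its own completion rule, for example a required number of nodes reporting persistence.

```python
class ObserveSet:
    """Table of ongoing observe requests, keyed by opaque (illustrative only)."""

    def __init__(self):
        self._pending = {}    # opaque -> {"waiting": set of keys, "states": dict}
        self._completed = {}  # opaque -> {key: keystate}

    def add(self, opaque, keys):
        """Register a request that was just sent to the server."""
        self._pending[opaque] = {"waiting": set(keys), "states": {}}

    def record(self, opaque, key, keystate):
        """Store one status update; returns False for unknown requests."""
        request = self._pending.get(opaque)
        if request is None:
            return False
        request["states"][key] = keystate
        request["waiting"].discard(key)
        if not request["waiting"]:
            # Every key has reported: remove the request from the observe-set
            # and keep its results until the application collects them.
            self._completed[opaque] = self._pending.pop(opaque)["states"]
        return True

    def take_completed(self):
        """Return and clear the results of all fulfilled requests."""
        done, self._completed = self._completed, {}
        return done

    def __len__(self):
        return len(self._pending)
```

A usage pattern might be to call add when the request is sent, record for each entry of each parsed response, and take_completed from the application thread that consumes results.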