Timeout errors

Timeout errors when querying a view with stale=false.

When querying a view with stale=false, you get often timeout errors for one or more nodes. These nodes are nodes that did not receive the original query request, for example you query node 1, and you get timeout errors for nodes 2, 3 and 4 as in the example below (view with reduce function _count):

> curl -s 'http://localhost:9500/default/_design/dev_test2/_view/view2?full_set=true&stale=false'
{"rows":[
  {"key":null,"value":125184}
],
"errors":[
  {"from":"http://192.168.1.80:9503/_view_merge/?stale=false","reason":"timeout"},
  {"from":"http://192.168.1.80:9501/_view_merge/?stale=false","reason":"timeout"},
  {"from":"http://192.168.1.80:9502/_view_merge/?stale=false","reason":"timeout"}
]
}

The problem here is that by default, for queries with stale=false (full consistency), the view merging node (node which receive the query request, node 1 in this example) waits up to 60000 milliseconds (1 minute) to receive partial view results from each other node in the cluster. If it waits for more than 1 minute for results from a remote node, it stops waiting for results from that node and a timeout error entry is added to the final response. A stale=false request blocks a client, or the view merger node as in this example, until the index is up to date, so these timeouts can happen frequently.

If you look at the logs from those nodes you got a timeout error, you’ll see the index build/update took more than 60 seconds, example from node 2:

[couchdb:info,2012-08-20T15:21:13.150,[email protected]:<0.6234.0>:couch_log:info:39] Set view
  `default`, main group `_design/dev_test2`, updater finished
 Indexing time: 93.734 seconds
 Blocked time:  10.040 seconds
 Inserted IDs:  124960
 Deleted IDs:   0
 Inserted KVs:  374880
 Deleted KVs:   0
 Cleaned KVs:   0

In this case, node 2 took 103.774 seconds to update the index.

In order to avoid those timeouts, you can pass a large connection_timeout in the view query URL, example:

> time curl -s
 'http://localhost:9500/default/_design/dev_test2/_view/view2?full_set=true&stale=false&connection_timeout=999999999'
{"rows":[
{"key":null,"value":2000000}
]
}
real  2m44.867s
user   0m0.007s
sys    0m0.007s

And in the logs of nodes 1, 2, 3 and 4, respectively you’ll see something like this:

node 1, view merger node

[couchdb:info,2012-08-20T16:10:02.887,[email protected]:<0.27674.0>:couch_log:info:39] Set view
  `default`, main group `_design/dev_test2`, updater
finished
 Indexing time: 155.549
seconds
 Blocked time:  0.000 seconds
 Inserted IDs:96
 Deleted IDs:   0
 Inserted KVs:  1500288
 Deleted KVs:   0
 Cleaned KVs:   0

node 2

[couchdb:info,2012-08-20T16:10:28.457,[email protected]:<0.6071.0>:couch_log:info:39] Set view
  `default`, main group `_design/dev_test2`, updater
finished
 Indexing time: 163.555
seconds
 Blocked time:  0.000 seconds
 Inserted IDs:  499968
 Deleted IDs:   0
 Inserted KVs:  1499904
 Deleted KVs:   0
 Cleaned KVs:   0

node 3

[couchdb:info,2012-08-20T16:10:29.710,[email protected]:<0.6063.0>:couch_log:info:39] Set view
  `default`, main group `_design/dev_test2`, updater
finished
 Indexing time: 164.808
seconds
 Blocked time:  0.000 seconds
 Inserted IDs:  499968
 Deleted IDs:   0
 Inserted KVs:  1499904
 Deleted KVs:   0
 Cleaned KVs:   0

node 4

[couchdb:info,2012-08-20T16:10:26.686,[email protected]:<0.6063.0>:couch_log:info:39] Set view
  `default`, main group `_design/dev_test2`, updater
finished
 Indexing time: 161.786
seconds
 Blocked time:  0.000 seconds
 Inserted IDs:  499968
 Deleted IDs:   0
 Inserted KVs:  1499904
 Deleted KVs:   0
 Cleaned KVs:   0