6.2.5. Pagination Recipe¶
This recipe explains how to paginate over view results. Pagination is a user interface (UI) pattern that allows the display of a large number of rows (the result set) without loading all the rows into the UI at once. A fixed-size subset, the page, is displayed along with next and previous links or buttons that can move the viewport over the result set to an adjacent page.
We assume you’re familiar with creating and querying documents and views as well as the multiple view query options.
Example Data¶
To have some data to work with, we’ll create a list of bands, one document per band:
{ "name":"Biffy Clyro" }
{ "name":"Foo Fighters" }
{ "name":"Tool" }
{ "name":"Nirvana" }
{ "name":"Helmet" }
{ "name":"Tenacious D" }
{ "name":"Future of the Left" }
{ "name":"A Perfect Circle" }
{ "name":"Silverchair" }
{ "name":"Queens of the Stone Age" }
{ "name":"Kerub" }
A View¶
We need a simple map function that gives us an alphabetical list of band names. This should be easy, but we’re adding extra smarts to filter out “The” and “A” in front of band names to put them into the right position:
function(doc) {
if(doc.name) {
var name = doc.name.replace(/^(A|The) /, "");
emit(name, null);
}
}
The views result is an alphabetical list of band names. Now say we want to display band names five at a time and have a link pointing to the next five names that make up one page, and a link for the previous five, if we’re not on the first page.
We learned how to use the startkey
, limit
, and skip
parameters in
earlier documents. We’ll use these again here. First, let’s have a look at
the full result set:
{"total_rows":11,"offset":0,"rows":[
{"id":"a0746072bba60a62b01209f467ca4fe2","key":"Biffy Clyro","value":null},
{"id":"b47d82284969f10cd1b6ea460ad62d00","key":"Foo Fighters","value":null},
{"id":"45ccde324611f86ad4932555dea7fce0","key":"Tenacious D","value":null},
{"id":"d7ab24bb3489a9010c7d1a2087a4a9e4","key":"Future of the Left","value":null},
{"id":"ad2f85ef87f5a9a65db5b3a75a03cd82","key":"Helmet","value":null},
{"id":"a2f31cfa68118a6ae9d35444fcb1a3cf","key":"Nirvana","value":null},
{"id":"67373171d0f626b811bdc34e92e77901","key":"Kerub","value":null},
{"id":"3e1b84630c384f6aef1a5c50a81e4a34","key":"Perfect Circle","value":null},
{"id":"84a371a7b8414237fad1b6aaf68cd16a","key":"Queens of the Stone Age","value":null},
{"id":"dcdaf08242a4be7da1a36e25f4f0b022","key":"Silverchair","value":null},
{"id":"fd590d4ad53771db47b0406054f02243","key":"Tool","value":null}
]}
Setup¶
The mechanics of paging are very simple:
- Display first page
- If there are more rows to show, show next link
- Draw subsequent page
- If this is not the first page, show a previous link
- If there are more rows to show, show next link
Or in a pseudo-JavaScript snippet:
var result = new Result();
var page = result.getPage();
page.display();
if(result.hasPrev()) {
page.display_link('prev');
}
if(result.hasNext()) {
page.display_link('next');
}
Paging¶
To get the first five rows from the view result, you use the ?limit=5
query parameter:
curl -X GET http://127.0.0.1:5984/artists/_design/artists/_view/by-name?limit=5
The result:
{"total_rows":11,"offset":0,"rows":[
{"id":"a0746072bba60a62b01209f467ca4fe2","key":"Biffy Clyro","value":null},
{"id":"b47d82284969f10cd1b6ea460ad62d00","key":"Foo Fighters","value":null},
{"id":"45ccde324611f86ad4932555dea7fce0","key":"Tenacious D","value":null},
{"id":"d7ab24bb3489a9010c7d1a2087a4a9e4","key":"Future of the Left","value":null},
{"id":"ad2f85ef87f5a9a65db5b3a75a03cd82","key":"Helmet","value":null}
]}
By comparing the total_rows
value to our limit
value,
we can determine if there are more pages to display. We also know by the
offset member that we are on the first page. We can calculate the value for
skip=
to get the results for the next page:
var rows_per_page = 5;
var page = (offset / rows_per_page) + 1; // == 1
var skip = page * rows_per_page; // == 5 for the first page, 10 for the second ...
So we query CouchDB with:
curl -X GET 'http://127.0.0.1:5984/artists/_design/artists/_view/by-name?limit=5&skip=5'
Note we have to use '
(single quotes) to escape the &
character that is
special to the shell we execute curl in.
The result:
{"total_rows":11,"offset":5,"rows":[
{"id":"a2f31cfa68118a6ae9d35444fcb1a3cf","key":"Nirvana","value":null},
{"id":"67373171d0f626b811bdc34e92e77901","key":"Kerub","value":null},
{"id":"3e1b84630c384f6aef1a5c50a81e4a34","key":"Perfect Circle","value":null},
{"id":"84a371a7b8414237fad1b6aaf68cd16a","key":"Queens of the Stone Age",
"value":null},
{"id":"dcdaf08242a4be7da1a36e25f4f0b022","key":"Silverchair","value":null}
]}
Implementing the hasPrev()
and hasNext()
method is pretty
straightforward:
function hasPrev()
{
return page > 1;
}
function hasNext()
{
var last_page = Math.floor(total_rows / rows_per_page) +
(total_rows % rows_per_page);
return page != last_page;
}
Paging (Alternate Method)¶
The method described above performed poorly with large skip values until CouchDB 1.2. Additionally, some use cases may call for the following alternate method even with newer versions of CouchDB. One such case is when duplicate results should be prevented. Using skip alone it is possible for new documents to be inserted during pagination which could change the offset of the start of the subsequent page.
A correct solution is not much harder. Instead of slicing the result set
into equally sized pages, we look at 10 rows at a time and use startkey
to
jump to the next 10 rows. We even use skip, but only with the value 1.
Here is how it works:
- Request rows_per_page + 1 rows from the view
- Display rows_per_page rows, store + 1 row as next_startkey and next_startkey_docid
- As page information, keep
startkey
and next_startkey - Use the next_* values to create the next link, and use the others to create the previous link
The trick to finding the next page is pretty simple. Instead of requesting 10
rows for a page, you request 11 rows, but display only 10 and use the values
in the 11th row as the startkey
for the next page. Populating the link to
the previous page is as simple as carrying the current startkey
over to the
next page. If there’s no previous startkey
, we are on the first page. We
stop displaying the link to the next page if we get rows_per_page or less
rows back. This is called linked list pagination, as we go from page to
page, or list item to list item, instead of jumping directly to a
pre-computed page. There is one caveat, though. Can you spot it?
CouchDB view keys do not have to be unique; you can have multiple index
entries read. What if you have more index entries for a key than rows that
should be on a page? startkey
jumps to the first row, and you’d be screwed
if CouchDB didn’t have an additional parameter for you to use. All view keys
with the same value are internally sorted by docid, that is, the ID of
the document that created that view row. You can use the startkey_docid
and endkey_docid
parameters to get subsets of these rows. For
pagination, we still don’t need endkey_docid
, but startkey_docid
is very
handy. In addition to startkey
and limit
, you also use
startkey_docid
for pagination if, and only if, the extra row you fetch to
find the next page has the same key as the current startkey
.
It is important to note that the *_docid parameters only work in addition to the *key parameters and are only useful to further narrow down the result set of a view for a single key. They do not work on their own (the one exception being the built-in _all_docs view that already sorts by document ID).
The advantage of this approach is that all the key operations can be performed on the super-fast B-tree index behind the view. Looking up a page doesn’t include scanning through hundreds and thousands of rows unnecessarily.
Jump to Page¶
One drawback of the linked list style pagination is that you can’t
pre-compute the rows for a particular page from the page number and the rows
per page. Jumping to a specific page doesn’t really work. Our gut reaction,
if that concern is raised, is, “Not even Google is doing that!” and we tend
to get away with it. Google always pretends on the first page to find 10 more
pages of results. Only if you click on the second page (something very few
people actually do) might Google display a reduced set of pages. If you page
through the results, you get links for the previous and next 10 pages,
but no more. Pre-computing the necessary startkey
and startkey_docid
for 20 pages is a feasible operation and a pragmatic optimization to know the
rows for every page in a result set that is potentially tens of thousands
of rows long, or more.
If you really do need to jump to a page over the full range of documents (we have seen applications that require that), you can still maintain an integer value index as the view index and take a hybrid approach at solving pagination.