Views let you sort things by any value of your data, even complex JSON-keys are possible as we’ve seen in earlier chapters. By-data sorting is very useful to allow users to find things quickly, a name is much easier to find in a list of names that is sorted alphabetically. Humans naturally resort to a divide-an-conquer algorithm (sounds familiar? :) and don’t consider a large part of the input set because they know the name won’t show up there. Likewise, sorting by number and date helps a great deal to let users manage their ever increasing amounts of data.
There’s another sorting type that is a little more fuzzy. Search engines show you results in order of relevance. That relevance is what the search engine thinks is most relevant to you given your search term (and potential search and surfing history). There are other systems trying to infer from earlier data what is most relevant to you, but they have the near-to-impossible task to guess what a user is interested in. Computers are notoriously bad at guessing.
The easiest way for a computer to figure out what’s most relevant for a user is letting the user prioritize things. Take a todo application for example: it allows users to reorder todo items so they know what they need to work on next. The underlying problem, keeping a user-defined sorting order can be found in a number of other places.
Let’s stick with the todo application. The naïve approach is pretty easy: with each todo item we store an integer that specifies the location in a list. We use a view to get all todo items in the right order.
Example: A Bunch of Todo Item Documents
{
"title":"Remember the Milk",
"date":"2009-07-22T09:53:37",
"sort_order":2
}
{
"title":"Call Fred",
"date":"2009-07-21T19:41:34",
"sort_order":3
}
{
"title":"Gift for Amy",
"date":"2009-07-19T17:33:29",
"sort_order":4
}
{
"title":"Laundry",
"date":"2009-07-22T14:23:11",
"sort_order":1
}
Example: A View to Sort Todo Items
function(todo) {
if(todo.sort_order && todo.title) {
emit(todo.sort_order, todo.title);
}
}
Example: The Views' Result
{
"total_rows": 4,
"offset": 0,
"rows": [
{
"key":1,
"value":"Laundry",
"id":"..."
},
{
"key":2,
"value":"Remember the Milk",
"id":"..."
},
{
"key":3,
"value":"Call Fred",
"id":"..."
},
{
"key":4,
"value":"Gift for Amy",
"id":"..."
}
]
}
That looks reasonably easy but can you spot the problem? Here’s a hint: what do you have to do if getting a gift for Amy becomes a higher priority than the milk? Conceptually, the work required is simple:
-
Assign "Gift for Amy" the sort_order of "Remember the Milk".
-
Increment the sort_order of "Remember the Milk" and all items that follow by one.
Under the hood, this is a lot of work. With CouchDB you’d have to load every document, increment the sort_order and save it back. If you have a lot of todo items (I do), then this is some significant work. Maybe there’s a better approach.
The fix is simple: Instead of using an integer to specify the sort order, we use a float:
{
"title":"Remember the Milk",
"date":"2009-07-22T09:53:37",
"sort_order":0.2
}
{
"title":"Call Fred",
"date":"2009-07-21T19:41:34",
"sort_order":0.3
}
{
"title":"Gift for Amy",
"date":"2009-07-19T17:33:29",
"sort_order":0.4
}
{
"title":"Laundry",
"date":"2009-07-22T14:23:11",
"sort_order":0.1
}
The view stays the same. Reading this is as easy as the previous approach. Reordering becomes much easier now. The application frontend can keep a copy of the sort_order values around, so when we move an item and store the move, we do not only have available the new position but also the sort_order value for the two new surrounding items.
Let’s move "Gift for Amy" above "Remember the Milk". The surrounding sort_order in the target position are 0.1 and 0.2. To store "Gift for Amy" with the correct sort_order, we simply use the median of the two surrounding values: (0.1 + 0.2) / 2 = 0.3 / 2 = 0.15.
If we query the view again now we get the desired result:
Example: The New Views Result
{
"total_rows": 4,
"offset": 0,
"rows": [
{
"key":0.1,
"value":"Laundry",
"id":"..."
},
{
"key":0.15,
"value":"Gift for Amy",
"id":"..."
},
{
"key":0.2,
"value":"Remember the Milk",
"id":"..."
},
{
"key":0.3,
"value":"Call Fred",
"id":"..."
}
]
}
The downside of this approach is that with an increasing number of reorderings, float precision can become an issue as digits “grow” infinitely. One solution is to not care and expect that a single user will not exceed any limits. Alternatively, an administrative task can reset the whole list to single-decimals when a user is not active.
The advantage of this approach is that you only have to touch a single document which is efficient for storing the new ordering of a list and updating the view that maintains the ordered index since only only the changed document has to be incorporated into the index.