CouchDB’s JSON documents are great for programmatic access in most environments. Almost all languages have HTTP and JSON libraries, and in the unlikely event that yours doesn’t, writing them is fairly simple. However, there is one important use case that JSON documents don’t cover: building plain old HTML web pages. Browsers are powerful, and it’s exciting that we can build Ajax applications using only CouchDB’s JSON and HTTP APIs, but this approach is not appropriate for most public-facing websites.

HTML is the lingua franca of the web, for good reasons. By rendering our JSON documents into HTML pages, we make them available and accessible for a wider variety of uses. With the pure-Ajax approach, visually impaired visitors to our blog stand a chance of not seeing any useful content at all, as popular screen-reading browsers have a hard time making sense of pages when the content is changed on the fly via JavaScript. Another important concern for authors is that their writing be indexed by search engines. Maintaining a high-quality blog doesn’t do much good if readers can’t find it via a web search. Most search engines do not execute JavaScript found within a page, so to them an Ajax blog looks devoid of content. We also mustn’t forget that HTML is likely a friendlier long-term archive format than the platform-specific JavaScript and JSON approach we used in the previous section. Also, by serving plain HTML we make our site snappier, as the browser can render meaningful content with fewer round-trips to the server. These are just a few of the reasons it makes sense to provide web content as HTML.

The traditional way to render HTML from database records is to use a middle-tier application server, such as Ruby on Rails or Django, which loads the appropriate records for a user request, runs a template function using them, and returns the resulting HTML to the visitor’s browser. The basics of this don’t have to change in CouchDB’s case; wrapping JSON views and documents with an application server is relatively straightforward. Rather than using browser-side JavaScript to load JSON from CouchDB and render dynamic pages, Rails or Django (or your framework of choice) could make those same HTTP requests against CouchDB, render the output to HTML, and return it to the browser. We won’t cover this approach in this book, as it’s specific to particular languages and frameworks, and surveying the different options would take more space than you want to read.

CouchDB includes functionality designed to make it possible to do most of what an application tier would do, without relying on additional software. The appeal of this approach is that CouchDB can serve the whole application without depending on a complex environment such as might be maintained on a production web server. Because CouchDB is designed to run on client computers, where the environment is outside the control of application developers, having some built-in templating capabilities greatly expands the potential uses of these applications. When your application can be served by a standard CouchDB instance, you gain deployment ease and flexibility.

The Show Function API

Show functions, as they are called, have a constrained API designed to ensure cacheability and side-effect free operation. This is in stark contrast to other application servers, which give the programmer the freedom to run any operation as the result of any request. Let’s look at a few example show functions.

The most basic show function looks something like this:

function(doc, req) {
  return '<h1>' + doc.title + '</h1>';
}

When run with a document that has a field called title with the content "Hello World", this function will send an HTTP response with the default content-type of text/html, the UTF-8 character encoding, and the body <h1>Hello World</h1>.
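Because a show function is just a function from a document and a request to a response, it can be tried outside CouchDB by calling it directly. The sketch below does that with a made-up document; the names show and sampleDoc are ours, and it only produces the body, since CouchDB itself adds the HTTP headers described above.

```javascript
// The basic show function from above, bound to a name so we can call it.
var show = function(doc, req) {
  return '<h1>' + doc.title + '</h1>';
};

// A hypothetical document; only the title field matters here.
var sampleDoc = { _id: "hello", title: "Hello World" };

// This function ignores the request, so an empty object will do.
var body = show(sampleDoc, {});
console.log(body); // <h1>Hello World</h1>
```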

The simplicity of the request/response cycle of a show function is hard to overstate. The most common question we hear about it is, "how can I load another document so that I can render its content as well?" The short answer is that you can’t. The longer answer is that for some applications you might use a list function to render a view result as HTML, which gives you the opportunity to use more than one document as the input of your function. We’ll cover list functions in the next chapter.

The basic function from a document and a request to a response, with no side effects and no alternative inputs, stays the same even as we start using more advanced features. Here’s a more complex show function illustrating the ability to set custom headers:

function(doc, req) {
  return {
    body : '<foo>' + doc.title + '</foo>',
    headers : {
      "Content-Type" : "application/xml",
      "X-My-Own-Header": "you can set your own headers"
    }
  }
}

If this function were called with the same document as we used in the previous example, the response would have a content-type of application/xml, and the body <foo>Hello World</foo>. You should be able to see from this how you could use show functions to generate any output you need from any of your documents.
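The same pattern works for other text formats. As a rough sketch, a show function that emits a one-row CSV file might look like the following. The title and author fields are assumptions about the document, not something CouchDB requires, and the quoting helper is deliberately minimal.

```javascript
// Hypothetical show function that renders one document as CSV.
var csvShow = function(doc, req) {
  // Quote a value for CSV by doubling any embedded double quotes.
  function quote(value) {
    return '"' + String(value).replace(/"/g, '""') + '"';
  }
  return {
    body : "title,author\n" + quote(doc.title) + "," + quote(doc.author) + "\n",
    headers : {
      "Content-Type" : "text/csv"
    }
  };
};

// Called directly here with a made-up document, outside CouchDB.
var csvResponse = csvShow({ title: 'Hello "World"', author: "Jan" }, {});
console.log(csvResponse.body);
```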

Popular uses of show functions include outputting HTML pages, CSV files, or XML needed for compatibility with a particular interface. The CouchDB test suite even illustrates using show functions to output a PNG image. To output binary data, there is the option to return a Base-64 encoded string, like this:

function(doc, req) {
  return {
    base64 :
      ["iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAMAAAAoLQ9TAAAAsV",
        "BMVEUAAAD////////////////////////5ur3rEBn////////////////wDBL/",
        "AADuBAe9EB3IEBz/7+//X1/qBQn2AgP/f3/ilpzsDxfpChDtDhXeCA76AQH/v7",
        "/84eLyWV/uc3bJPEf/Dw/uw8bRWmP1h4zxSlD6YGHuQ0f6g4XyQkXvCA36MDH6",
        "wMH/z8/yAwX64ODeh47BHiv/Ly/20dLQLTj98PDXWmP/Pz//39/wGyJ7Iy9JAA",
        "AADHRSTlMAbw8vf08/bz+Pv19jK/W3AAAAg0lEQVR4Xp3LRQ4DQRBD0QqTm4Y5",
        "zMxw/4OleiJlHeUtv2X6RbNO1Uqj9g0RMCuQO0vBIg4vMFeOpCWIWmDOw82fZx",
        "vaND1c8OG4vrdOqD8YwgpDYDxRgkSm5rwu0nQVBJuMg++pLXZyr5jnc1BaH4GT",
        "LvEliY253nA3pVhQqdPt0f/erJkMGMB8xucAAAAASUVORK5CYII="].join(''),
    headers : {
      "Content-Type" : "image/png"
    }
  };
}

The above function outputs a 16 x 16 pixel version of the CouchDB logo. The JavaScript code necessary to generate images from document contents would likely be quite complex, but the ability to send Base-64 encoded binary data means that query servers written in other languages like C or PHP have the ability to output any data type.

Side-Effect Free

We’ve mentioned that a key constraint of show functions is that they are side-effect free. This means that you can’t use them to update documents, kick off background processes, or trigger any other function. In the big picture, this is a good thing, as it allows CouchDB to give performance and reliability guarantees that standard web frameworks can’t. Because a show function will always return the same result given the same input, and can’t change anything about the environment in which it runs, its output can be cached and intelligently reused. In a high-availability deployment with proper caching, this means that a given show function will only be called once for any particular document, and the CouchDB server may not even be contacted for subsequent requests.

Working without side effects can be a little bit disorienting for developers who are used to the anything-goes approach offered by most application servers. It’s considered best practice to ensure that actions run in response to GET requests are side-effect free and cacheable, but rarely do we have the discipline to achieve that goal. CouchDB takes a different tack: because we’re a database, not an application server, we think it’s more important to enforce best practices (and ensure that developers don’t write functions that adversely affect the database server) than to offer absolute flexibility. Once you’re used to working within these constraints, they start to make a lot of sense. (There’s a reason they are considered best practices.)

Design Doc

Before we look at how show functions are queried, we’ll quickly review how they are stored in design documents. CouchDB looks for show functions in a top-level field called shows, which is named to parallel views, lists, and filters. Here’s an example design document that defines two show functions:

{
  "_id" : "_design/show-function-examples",
  "shows" : {
    "summary" : "function(doc, req){ ... }",
    "detail" : "function(doc, req){ ... }"
  }
}

There’s not much to note here except the fact that design documents can define multiple show functions. Now let’s see how these functions are run.
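Note that each show function is stored as a string of source code, since JSON has no function type. One way to assemble such a document by hand, sketched here with placeholder function bodies of our own, is to write ordinary functions and serialize them with toString():

```javascript
// Hypothetical show functions; the bodies are placeholders for illustration.
var summary = function(doc, req) {
  return '<h1>' + doc.title + '</h1>';
};
var detail = function(doc, req) {
  return '<h1>' + doc.title + '</h1><p>' + doc.body + '</p>';
};

// Functions are stored as source strings because JSON cannot hold functions.
var designDoc = {
  "_id" : "_design/show-function-examples",
  "shows" : {
    "summary" : summary.toString(),
    "detail" : detail.toString()
  }
};

// This JSON text is what would be saved to the database.
var designJson = JSON.stringify(designDoc, null, 2);
console.log(designJson);
```

Tools such as couchapp automate this step, so in practice you rarely build the document by hand.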

Querying Show Functions

We’ve described the show function API, but we haven’t yet seen how these functions are run. In CouchDB 0.10.0 there are two equivalent ways to query a show function; we’ll show them both here.

The first method is to add a query parameter to a regular document request. If the original document is located at /mydb/72d43a93eb74b5f2, then you can run a show function on it, and receive the formatted output instead of JSON, by querying a URL like this:

GET /mydb/72d43a93eb74b5f2?show=mydesign/myshow

Because show functions are stored in design documents, they are also available as resources within the design document path. The URLs used in this type of query are longer and less user-friendly, but they have the advantage that all resources provided by a particular design doc can be found under a common root, which makes custom application proxying simpler. This request against the design document resource is equivalent to the URL above:

GET /mydb/_design/mydesign/_show/myshow/72d43a93eb74b5f2

In either of these cases, if the document with id 72d43a93eb74b5f2 does not exist, the request will result in an HTTP 404 Not Found error response.

However, show functions can also be called without a document id at all, like this:

GET /mydb/_design/mydesign/_show/myshow

In this case, the doc argument to the function has the value null. This option is useful in cases where the show function can make sense without a document. For instance, in the example application we’ll explore in the next section, we use the same show function to provide for editing existing blog posts when a docid is given, as well as for composing new blog posts when no docid is given. The alternative would be to maintain an alternate resource (likely a static HTML attachment) with parallel functionality. As programmers we strive not to repeat ourselves, which motivated us to give show functions the ability to run without a document id.
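A minimal sketch of a show function that handles both cases could look like this. The markup and the title field are hypothetical, and HTML escaping is left out to keep the example short.

```javascript
// Hypothetical show function: edits an existing post, or starts a new one.
var edit = function(doc, req) {
  // doc is null when the URL carries no document id.
  var heading = doc ? "Edit post" : "New post";
  var title = doc ? doc.title : "";
  // No HTML escaping is done here, to keep the sketch short.
  return '<h1>' + heading + '</h1>' +
    '<input type="text" name="title" value="' + title + '">';
};

console.log(edit(null, {}));                     // the "new post" page
console.log(edit({ title: "Hello World" }, {})); // the "edit post" page
```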

Design Document Resources

In addition to the ability to run show functions, other resources are available within the design document path. You’ve now seen _view and _show; in the next chapter you’ll see _list, and in the roadmap chapter we’ll talk about _update and _rewrite. This combination of features within the design document resource means that applications can be deployed without exposing the full CouchDB API to visitors, with only a simple proxy to rewrite the paths. We won’t go into full detail here, but the gist of it is that end users would run the above query from a path like this:

GET /_show/myshow/72d43a93eb74b5f2

Under the covers, an HTTP proxy would be programmed to prepend the database and design document portion of the path, in this case /mydb/_design/mydesign, so that CouchDB sees the standard query. With such a system in place, end users can only access the application via functions defined on the design document, so developers can enforce constraints and prevent access to raw JSON document and view data. While it doesn’t provide 100% security, using custom rewrite rules is an effective way to control the access end users have to a CouchDB application. This technique has been used in production by a few sites at the time of this writing.
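The rewriting step itself is small. The sketch below shows only the path logic, written as a plain function rather than as configuration for any particular proxy; the list of allowed resources is our own assumption.

```javascript
// Prefix taken from the text above: the database and design document path.
var PREFIX = "/mydb/_design/mydesign";

// Hypothetical rewrite rule: only paths that start with an allowed design
// document resource are forwarded; everything else is refused (null).
// A real proxy would also need to normalize the path first, for example
// by rejecting ".." segments; that is omitted here.
function rewritePath(publicPath) {
  var allowed = ["/_show/", "/_list/"];
  for (var i = 0; i < allowed.length; i++) {
    if (publicPath.indexOf(allowed[i]) === 0) {
      return PREFIX + publicPath;
    }
  }
  return null;
}

console.log(rewritePath("/_show/myshow/72d43a93eb74b5f2"));
console.log(rewritePath("/_all_docs"));
```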

Query Parameters

The request object (including helpfully parsed versions of query parameters) is available to show functions as well. By way of illustration, here’s a show function which returns different data based on the URL query parameters:

function(doc, req) {
  return "<p>Aye aye, " + req.query.parrot + "!</p>";
}

Requesting this function with a query parameter will result in the query parameter being used in the output:

GET /mydb/_design/mydesign/_show/myshow?parrot=Captain

In this case we’ll see the output: <p>Aye aye, Captain!</p>

Allowing URL parameters into the function does not affect cacheability, as each unique invocation results in a distinct URL. However, making heavy use of this feature will lower your cache effectiveness. Query parameters like this are most useful for things like switching the mode or the format of the show function output. It’s recommended that you avoid using them for things like inserting custom content (such as the requesting user’s nickname) into the response, as that will mean that each user’s data must be cached separately.
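As an illustration of the recommended kind of use, here is a sketch of a show function that switches between a full and a summary rendering. It assumes the parsed query parameters are exposed as req.query; the mode parameter and the document fields are hypothetical.

```javascript
// Hypothetical show function that switches its output mode on ?mode=...
var postShow = function(doc, req) {
  // Default to the full view when no mode is given.
  var mode = (req.query && req.query.mode) || "full";
  if (mode === "summary") {
    return '<h2>' + doc.title + '</h2>';
  }
  return '<h1>' + doc.title + '</h1><p>' + doc.body + '</p>';
};

var samplePost = { title: "Hello World", body: "First post." };
console.log(postShow(samplePost, { query: { mode: "summary" } }));
console.log(postShow(samplePost, { query: {} }));
```

Only two distinct URLs exist per document here, so the cache stays effective.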

Accept Headers

Part of the HTTP spec allows for clients to give hints to the server about which content-types they are capable of accepting. At this time, the JavaScript query server shipped with CouchDB 0.10.0 contains helpers for working with Accept headers. However, web browser support for Accept headers is very poor, which has prompted frameworks such as Ruby on Rails to remove their support for them. CouchDB may or may not follow suit here, but the fact remains that you are discouraged from relying on Accept headers for applications which will be accessed via web browsers.

The same suite of helpers also lets you specify the format in a query parameter. For instance, this request:

GET /db/_design/app/_show/post
Accept: application/xml

is equivalent to this request, whose Accept header does not match but whose format query parameter does:

GET /db/_design/app/_show/post?format=xml
Accept: x-foo/whatever

The query parameter exists because browsers don’t use sensible Accept headers for feed URLs. Browsers 1, Accept headers 0. Yay browsers.

The request function allows developers to switch response content-types based on the client’s request. The next example adds the ability to return either HTML, XML, or a developer-designated content-type: "foo".

CouchDB’s main.js library includes the provides("format", render_function) function, which makes it easy for developers to handle client requests for multiple MIME types in one show function.

This function also shows off the use of registerType(name, mime_types), which adds new types to the mapping object used by respondWith. The end result is ultimate flexibility for developers, with an easy interface for handling different types of requests. main.js uses a JavaScript port of Mimeparse, an open source reference implementation, to provide this service.
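As a rough illustration of the idea, the sketch below defines tiny stand-ins for registerType and provides so that it can run outside CouchDB. The stand-ins and the runShow driver are written for this sketch only: they mimic the call shapes named above, treat the MIME types as a variable-length argument list (an assumption), and pick a renderer from the format query parameter alone. They are not CouchDB’s implementation, which also consults the Accept header.

```javascript
// Stand-ins written for this sketch only; not CouchDB's main.js.
var mimeTypes = { html: ["text/html"], xml: ["application/xml"] };
var renderers = {};

function registerType(name) {
  // The remaining arguments are the MIME types registered under this name.
  mimeTypes[name] = Array.prototype.slice.call(arguments, 1);
}

function provides(format, renderFunction) {
  renderers[format] = renderFunction;
}

// Hypothetical driver: run the show function, then call the chosen renderer.
function runShow(showFunction, doc, req) {
  renderers = {};
  showFunction(doc, req);
  var format = (req.query && req.query.format) || "html";
  if (!renderers[format]) {
    return { code: 406, body: "Not Acceptable" };
  }
  return {
    headers: { "Content-Type": mimeTypes[format][0] },
    body: renderers[format]()
  };
}

// A show function offering HTML, XML, and the custom "foo" type.
var multiFormat = function(doc, req) {
  registerType("foo", "application/foo", "application/x-foo");
  provides("html", function() { return "<h1>" + doc.title + "</h1>"; });
  provides("xml", function() { return "<title>" + doc.title + "</title>"; });
  provides("foo", function() { return "foo: " + doc.title; });
};

var fooResponse = runShow(multiFormat, { title: "Hello World" },
                          { query: { format: "foo" } });
console.log(fooResponse.headers["Content-Type"], fooResponse.body);
```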

Etags

We’ve mentioned that show function requests are side-effect free and cacheable, but we haven’t discussed the mechanism used to accomplish this. Etags are a standard HTTP mechanism for indicating whether a cached copy of an HTTP response is still current. Essentially, when the client makes its first request to a resource, the response is accompanied by an Etag, which is an opaque string token unique to the version of the resource requested. The second time the client makes a request against the same resource, it sends along the original Etag with the request. If the server determines that the Etag still matches the resource, it can avoid sending the full response, instead replying with a message that essentially says "you have the latest version already."
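In HTTP terms, the client echoes the Etag back in an If-None-Match request header, and the server answers 304 Not Modified when it still matches. The exchange can be sketched in a few lines. The resource shape and the makeEtag helper below are hypothetical; CouchDB computes its own Etags.

```javascript
// Hypothetical helper: derive the token from a revision string.
function makeEtag(resource) {
  return '"' + resource.rev + '"';
}

// Minimal sketch of the server-side check.
function respond(requestHeaders, resource) {
  var etag = makeEtag(resource);
  if (requestHeaders["If-None-Match"] === etag) {
    // The client's cached copy is current, so no body is sent.
    return { status: 304, headers: { "Etag": etag }, body: "" };
  }
  return { status: 200, headers: { "Etag": etag }, body: resource.body };
}

var resource = { rev: "1-abc", body: "<h1>Hello World</h1>" };
var first = respond({}, resource);
var second = respond({ "If-None-Match": first.headers["Etag"] }, resource);
console.log(first.status, second.status); // 200 304
```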

When implemented properly, the use of Etags can cut down significantly on server load.

CouchDB provides an Etag header, so by putting an HTTP proxy cache such as Varnish in front of it, you’ll instantly remove load from CouchDB.

Functions and Templates

CouchDB’s process runner only looks at the functions stored under shows, but we’ll want to keep the template HTML separate from the content negotiation logic. The couchapp script handles this for us, using the !code and !json macros.

Let’s follow the show function logic through the files Sofa splits it into.

Example
show function
  explain !code and !json
  tiny screenshot of full json version

# show func code …

respondWith()
template
  <%= syntax %>
helper
ajax
  comments
  (one query for html)
  we'll show how to get the post and its comments in a single query later.

Learning Shows

Before we dig into the full complex beast (yeah, I said it) that will render the Post permalink pages, let’s look at some Hello World show function examples. The first one just shows the function arguments, and the simplest possible return value.

figure/basic-form-function.jpg
Figure: Basic Form Function

A show function is a JavaScript function that converts a document, and some details about the HTTP request, into an HTTP response. Typically it will be used to construct HTML, but it is also capable of returning Atom feeds, images, or even just filtered JSON.

The document argument is just like the documents passed to Map functions.

Using Templates

The only thing missing from the show function development experience is the ability to render HTML without ruining your eyes looking at a whole lot of manual string concatenation, among other unpleasantries. Most programming environments solve this problem with templates, that is, documents which look like HTML but have portions of their content filled out dynamically.

Dynamically combining template strings and data in JavaScript is a solved problem. However, it hasn’t caught on, partly because JavaScript doesn’t have very good support for multiline "heredoc" strings. After all, once you get through escaping quotes and leaving out newlines, it’s not much fun to edit HTML templates inlined into JavaScript code. We’d much rather keep our templates in separate files, where we can avoid all the escaping work, and they can be syntax-highlighted by our editor.
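To make the idea concrete, here is a deliberately tiny renderer that fills <%= name %> placeholders from a data object. It is an illustration written for this section, not the template library used in the example blog, and it does no HTML escaping.

```javascript
// A minimal template renderer, for illustration only.
// It replaces each <%= name %> placeholder with the matching data field,
// or with an empty string when the field is missing.
function render(template, data) {
  return template.replace(/<%=\s*(\w+)\s*%>/g, function(match, name) {
    return data.hasOwnProperty(name) ? String(data[name]) : "";
  });
}

// In practice the template text would live in its own file.
var postTemplate = "<h1><%= title %></h1>\n<p><%= body %></p>";
var html = render(postTemplate, { title: "Hello World", body: "First post." });
console.log(html);
```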

The couchapp script has a couple of helpers to make working with templates and library code stored in design documents less painful. In the function below, we use them to load a Blog Post template, as well as the JavaScript function responsible for rendering it.

The !json and !code macros provided by couchapp should be self-explanatory. couchapp uses them to insert data from the Design Document itself, into the function. They can also be used to import data into Map, Reduce and Validation functions.

!code inserts the string value found in the file lib/helpers/template.js directly into the function source code. If that string is anything other than valid JavaScript, you should expect a syntax error. On the other hand, !json inserts the value of any node of the design document as a JSON literal, storing it in a JavaScript variable for access during function execution.
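The effect of the two macros can be imitated with a short preprocessing function. The following is a simplified illustration and not couchapp’s code: designDoc and files are hypothetical stand-ins for the design document and the files on disk, and the way the !json variable is named here is a simplification that may differ from what couchapp generates.

```javascript
// Simplified illustration of macro expansion. This is not couchapp's code.
function expandMacros(source, designDoc, files) {
  return source.split("\n").map(function(line) {
    var code = line.match(/^\s*\/\/\s*!code\s+(\S+)\s*$/);
    if (code) {
      // !code: paste the file's contents in place of the comment.
      return files[code[1]];
    }
    var json = line.match(/^\s*\/\/\s*!json\s+(\S+)\s*$/);
    if (json) {
      // !json: look up the dotted path and emit it as a variable.
      var path = json[1].split(".");
      var value = designDoc;
      for (var i = 0; i < path.length; i++) {
        value = value[path[i]];
      }
      return "var " + path[path.length - 1] + " = " +
        JSON.stringify(value) + ";";
    }
    return line;
  }).join("\n");
}

// A show function source with both macros, as it might be written on disk.
var source = [
  "function(doc, req) {",
  "  // !json templates.post",
  "  // !code lib/helpers/template.js",
  "  return render(post, doc);",
  "}"
].join("\n");

var expanded = expandMacros(source,
  { templates: { post: "<h1><%= title %></h1>" } },
  { "lib/helpers/template.js": "function render(t, d) { return t; }" });
console.log(expanded);
```

After expansion the function source is an ordinary string with the template and the helper inlined, which is the form CouchDB stores and runs.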

After the couchapp macros have been executed, the design document may have several copies of some JavaScript libraries and HTML templates, as they could have been inserted into more than one function using macros. Some users find this disconcerting, but as it is preprocessed code (and because couchapp clone knows how to unwind the macros) developers are never confronted with it in the course of their work. It simplifies the CouchDB implementation to require that each function definition be a simple string. The alternative, which has been considered on the mailing list, is to build a library loading system into the CouchDB JavaScript runtime. However, this would greatly raise the bar for implementers of alternate query servers, as well as complicate the Erlang computations required to know which query server instance is serving which functions.

As you can see, we take the opportunity in the function to strip JavaScript tags from the form post. That regex is not secure, and the blogging application is only meant to be written to by its owners, so we should probably drop the regex and simplify the function to avoid transforming the document, instead passing it directly to the template. Alternatively, we should port a known-good sanitization routine from another language and provide it in the templates library.

Writing Templates

Working with templates, instead of trying to cram all the presentation into one file, makes editing show functions a little more relaxing. The templates are stored in their own file, so you don’t have to worry about JavaScript or JSON encoding, and your text editor can highlight the template’s HTML syntax. CouchDB’s JavaScript query server includes the E4X extensions for JavaScript, which can be helpful for XML templates but do not work well for HTML. We’ll explore E4X templates in a later chapter, when we cover list functions, which make providing an Atom feed of view results easy and memory-efficient.

figure/blog-html-template.jpg
Figure: The Blog Post Template

Trust us when we say that looking at this HTML page is much more relaxing than trying to understand what a raw-JavaScript one is trying to do. The template library we’re using in the example blog is by John Resig, and was chosen for simplicity. It could easily be replaced by one of many other options, such as the Django template language, available in JavaScript.

This is a good time to note that CouchDB’s architecture is designed to make it simple to swap out languages for the query servers. With a query server written in Lisp or Python or Ruby (or any language that supports JSON and stdio) you could have an even wider variety of templating options. The CouchDB team recommends sticking with JavaScript as it provides the highest level of support and interoperability, but other options are available.