Scaling Basics

Scaling is an overloaded term, and finding a precise definition is tricky. Everyone and her grandmother has her own idea of what scaling means. Most definitions are valid, but they can contradict each other. To make things even worse, there are a lot of misconceptions about scaling. To find out what it really is, one needs a scalpel to cut out the important bits.

First, scaling doesn’t refer to a specific technique or technology. Scaling, or scalability, is an attribute of a specific architecture. What is being scaled varies for nearly every project.

Joe’s quote is the one that we find to be the most accurate description of scaling. It is also wishy-washy, but that is the nature of scaling. An example: a website like facebook.com, with a whole lot of users, data associated with those users, and more and more users coming in every day, might want to scale over user data that typically lives in a database. In contrast, flickr.com is at its core like Facebook, with users and data for users, but in Flickr’s case, the data that grows fastest is images uploaded by users. These images do not necessarily live in a database, so scaling image storage is Flickr’s path to growth.

It is common to think of scaling as scaling up. This is shortsighted. Scaling can also mean scaling down: being able to use fewer computers when demand declines. More on that later.

These are just two services. There are a lot more, and every one has different things it wants to scale. CouchDB is a database; we are not going to cover every aspect of scaling any system. We concentrate on the bits that are interesting to you, the CouchDB user. We identified three general properties that you can scale with CouchDB: read requests, write requests, and data.

Scaling Read Requests

A read request retrieves a piece of information from the database. It passes through the following stations within CouchDB: the HTTP server module needs to accept the request. For that, it opens a socket to send data over. The next station is the HTTP request handler module, which analyzes the request and directs it to the appropriate submodule in CouchDB. For single documents, the request then gets passed to the database module, where the data for the document is looked up on the filesystem and returned all the way up again.

All this takes processing time, and enough sockets (or file descriptors) must be available. The storage backend of the server must be able to fulfill all read requests. There are a few more things that can limit how many read requests a system can accept; the basic point here is that a single server can process only so many concurrent requests. If your application generates more requests, you need to set up a second server that your application can read from.
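As a rough illustration of reading from more than one server, the following Python sketch hands out document URLs in round-robin order. The class name and hostnames are hypothetical, and it assumes every listed server holds the same data; it only builds URLs and sends no requests.

```python
from itertools import cycle


class RoundRobinReader:
    """Spread read requests evenly over servers that hold the same data."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)

    def url_for(self, db, doc_id):
        # Take the next server in turn and build the document URL for it.
        server = next(self._servers)
        return f"{server}/{db}/{doc_id}"


reader = RoundRobinReader([
    "http://couch-a.example.com:5984",  # hypothetical host
    "http://couch-b.example.com:5984",  # hypothetical host
])
urls = [reader.url_for("recipes", f"doc{i}") for i in range(4)]
```

Each server in this sketch sees every second request, so two servers roughly halve the read load per machine.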

The nice thing about read requests is that they can be cached. Often-used items can be held in memory and returned at a much higher level than the one that is your bottleneck. Requests that can use this cache never hit your database and are thus virtually toll-free. Chapter 15 “Load Balancing” explains this scenario.
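To make the caching idea concrete, here is a minimal Python sketch of an in-memory read cache. The CachingReader class and the dictionary standing in for the database are assumptions made for illustration; a real setup would more likely use a caching HTTP proxy in front of CouchDB.

```python
class CachingReader:
    """Serve repeated reads from memory instead of asking the database."""

    def __init__(self, fetch):
        self._fetch = fetch  # function doc_id -> document; this is the slow path
        self._cache = {}     # unbounded here; a real cache would evict entries
        self.db_hits = 0     # counts how often the database was actually asked

    def get(self, doc_id):
        if doc_id in self._cache:
            return self._cache[doc_id]  # cache hit: the database is not touched
        self.db_hits += 1
        doc = self._fetch(doc_id)
        self._cache[doc_id] = doc
        return doc


fake_db = {"recipe-1": {"title": "Pancakes"}}  # stand-in for the database
reader = CachingReader(fake_db.__getitem__)
first = reader.get("recipe-1")   # goes to the database
second = reader.get("recipe-1")  # answered from memory
```

Only the first request for a document reaches the database; every later one is answered from memory.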

Scaling Write Requests

A write request is like a read request, only a little worse. It not only reads a piece of data from disk, it writes it back after modifying it. Remember the nice thing about reads being cacheable? Writes: not so much. A cache must be notified when a write changes data, or clients must be told not to use the cache. If you have multiple servers for scaling reads, a write must occur on all servers. In any case, you need to work harder with a write. Chapter 18 “Multi-Master” covers methods for scaling write requests across servers.
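The extra work a write causes can be sketched in a few lines of Python. The ReplicatedStore class below is a hypothetical toy model, not CouchDB's mechanism: plain dictionaries stand in for servers, every write is applied to all of them, and the cached copy is dropped so that later reads cannot return stale data.

```python
class ReplicatedStore:
    """Toy model: several identical servers plus one shared read cache."""

    def __init__(self, server_count):
        # Each dictionary stands in for one server holding a full copy.
        self.servers = [{} for _ in range(server_count)]
        self.cache = {}

    def read(self, doc_id):
        if doc_id not in self.cache:
            # Any server will do, because all of them hold the same data.
            self.cache[doc_id] = self.servers[0][doc_id]
        return self.cache[doc_id]

    def write(self, doc_id, doc):
        # The write must occur on all servers ...
        for server in self.servers:
            server[doc_id] = doc
        # ... and the cache must be told that its copy is outdated.
        self.cache.pop(doc_id, None)


store = ReplicatedStore(server_count=3)
store.write("recipe-1", {"title": "Pancakes"})
before = store.read("recipe-1")
store.write("recipe-1", {"title": "Waffles"})
after = store.read("recipe-1")
```

A read touches one server and possibly only the cache, while a write touches every server and the cache, which is why writes are the harder case.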

Scaling Data

The third way of scaling is scaling data. Today’s hard drives are cheap and have a lot of capacity, and they will only get better in the future, but there is only so much data a single server can make sensible use of. It must maintain one or more indexes to the data, which use disk space in turn. Creating backups takes longer, and other maintenance tasks become a pain.

The solution is to chop the data into manageable chunks and put each chunk on a separate server. All servers with a chunk now form a cluster that holds all your data. Chapter 20 “Database Partitioning” takes a look at creating and using these clusters.
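One simple way to chop data into chunks is to hash each document ID and take the result modulo the number of servers. The Python sketch below assumes that scheme; the function names are hypothetical, and this is not necessarily how the partitioning chapter assigns documents.

```python
import zlib


def shard_for(doc_id, shard_count):
    """Map a document ID to a shard number in the range 0..shard_count-1."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # crc32 is stable across runs, so a given ID always lands on the same shard.
    return zlib.crc32(doc_id.encode("utf-8")) % shard_count


def partition(doc_ids, shard_count):
    """Group document IDs by the shard that would store them."""
    shards = [[] for _ in range(shard_count)]
    for doc_id in doc_ids:
        shards[shard_for(doc_id, shard_count)].append(doc_id)
    return shards


doc_ids = [f"doc-{i}" for i in range(100)]
shards = partition(doc_ids, 4)  # four servers, each holding one chunk
```

A plain modulo scheme like this one reassigns most documents when the number of servers changes, which is one reason real clusters tend to use more careful placement.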

While we are taking separate looks at scaling reads, writes, and data, these rarely occur in isolation. Decisions to scale one will affect the others. We will describe individual as well as combined solutions in the following chapters.

Basics First

Replication is the basis for all three scaling methods. Before we go scaling, Chapter 14 “Replication” will make you familiar with CouchDB’s excellent replication feature.