Advanced Search
Apple Developer Connection
Member Login Log In | Not a Member? Contact ADC

< Previous PageNext Page >

Locking Bottlenecks

It is difficult to communicate data between multiple threads or between thread and interrupt contexts without using locking or other synchronization. This locking protects your data from getting clobbered by another thread. However, it also has the unfortunate side effect of being a potential bottleneck.

In some types of communication (particularly n-way), locking can dramatically hinder performance by allowing only one thing to happen at a time. Read-write locks, discussed in “Synchronization Primitives”, can help alleviate this problem in the most common situation where multiple clients need to be able to read information but only rarely need to modify that data.

However, there are many cases where read-write locks are not helpful. This section discusses some possible problems and ways of improving performance within those constraints.

In this section:

Working With Highly Contended Locks
Reducing Contention by Decreasing Granularity


Working With Highly Contended Locks

When many threads need to obtain a lock (or a small number of threads need to obtain a lock frequently), this lock is considered highly contended. Highly contended locks frequently represent faulty code design, but they are sometimes unavoidable. In those cases, the lock tends to become a major performance bottleneck.

Take, for example, the issue of many-to-many communication that must be synchronized through a common buffer. While some improvement can be gained by using read-write locks instead of an ordinary mutex, the issue of multiple writers means that read-write locks still perform badly.

One possible solution for this many-to-many communication problem is to break the lock up into multiple locks. Instead of sharing a single buffer for the communication itself, make a shared buffer that contains accounting information for the communication (for example, a list of buffers available for reading). Then assign each individual buffer its own lock. The readers might then need to check several locations to find the right data, but this still frequently yields better performance, since writers must only contend for a write lock while modifying the accounting information.

Another solution for many-to-many communications is to eliminate the buffer entirely and communicate using message passing, sockets, IPC, RPC, or other methods.

Yet another solution is to restructure your code in a way that the locking is unnecessary. This is often much more difficult. One method that is often helpful is to take advantage of flags and atomic increments, as outlined in the next paragraph. For simplicity, a single-writer, single-reader example is presented, but it is possible to extend this idea to more complicated designs.

Take a buffer with some number of slots. Keep a read index and a write index into that buffer. When the write index and read index are the same, there is no data in the buffer. When writing, clear the next location. Then do an atomic increment on the pointer. Write the data. End by setting a flag at that new location that says that the data is valid.

Note that this solution becomes much more difficult when dealing with multiple readers and multiple writers, and as such, is beyond the scope of this section.

Reducing Contention by Decreasing Granularity

One of the fundamental properties of locks is granularity. The granularity of a lock refers to the amount of code or data that it protects. A lock that protects a large block of code or a large amount of data is referred to as a coarse-grained lock, while a lock that protects only a small amount of code or data is referred to as a fine-grained lock. A coarse-grained lock is much more likely to be contended (needed by one thread while being held by another) than a more finely grained lock.

There are two basic ways of decreasing granularity. The first is to minimize the amount of code executed while a lock is held. For example, if you have code that calculates a value and stores it into a table, don’t take the lock before calling the function and release it after the function returns. Instead, take the lock in that piece of code right before you write the data, and release it as soon as you no longer need it.

Of course, reducing the amount of protected code is not always possible or practical if the code needs to guarantee consistency where the value it is writing depends on other values in the table, since those values could change before you obtain the lock, requiring you to go back and redo the work.

It is also possible to reduce granularity by locking the data in smaller units. In the above example, you could have a lock on each cell of the table. When updating cells in the table, you would start by determining the cells on which the destination cell depends, then lock those cells and the destination cell in some fixed order. (To avoid deadlock, you must always either lock cells in the same order or use an appropriate try function and release all locks on failure.)

Once you have locked all the cells involved, you can then perform your calculation and release the locks, confident that no other thread has corrupted your calculations. However, by locking on a smaller unit of data, you have also reduced the likelihood of two threads needing to access the same cell.

A slightly more radical version of this is to use read-write locks on a per-cell basis and always upgrade in a particular order. This is, however, rather extreme, and difficult to do correctly.



< Previous PageNext Page >


Last updated: 2006-11-07




Did this document help you?
Yes: Tell us what works for you.

It’s good, but: Report typos, inaccuracies, and so forth.

It wasn’t helpful: Tell us what would have helped.
Get information on Apple products.
Visit the Apple Store online or at retail locations.
1-800-MY-APPLE

Copyright © 2007 Apple Inc.
All rights reserved. | Terms of use | Privacy Notice