When code breaks, there are two ways to figure out what went wrong: launch a debugger and step through the program, or add logging statements to see what is going on. In its crudest form, logging means adding a few System.err.println() calls here and there. What are the benefits of logging over using a debugger?
As a personal choice, we tend not to use debuggers beyond getting a stack trace or the value of a variable or two. One reason is that it is easy to get lost in details of complicated data structures and control flow; we find stepping through a program less productive than thinking harder and adding output statements and self-checking code at critical places. Clicking over statements takes longer than scanning the output of judiciously-placed displays. It takes less time to decide where to put print statements than to single-step to the critical section of code, even assuming we know where that is. More important, debugging statements stay with the program; debugger sessions are transient. | ||
--The Practice of Programming by Brian W. Kernighan and Rob Pike |
Another important use of logging is to provide an audit trail: who did what and when. The predominant pattern in WAF is to write such auditing information to the database rather than the filesystem. The logging framework in WAF is intended to be used primarily for troubleshooting and debugging.
Using System.out.println() for logging has numerous drawbacks.
It degrades performance. There is no easy way to conditionally turn off logging statements system-wide, without resorting to error-prone and maintenance-intensive workarounds such as:
if (DEBUG) { System.err.println("foo=" + foo); } |
where DEBUG is a system-wide boolean parameter that you can easily set to true or false.
Even with the above workaround, you don't get a fine-grained degree of control. The logging is either on, or off across the board. You could, of course, introduce more than one system-wide flag, like so:
if (WARN) { System.out.println("warn: bar=" + bar); } |
You are constrained to outputting only to the standard out and error devices. What if you need to output to rotated log files? Additional custom code is required to support this requirement.
Going down this path slowly but surely, you may end up developing a general-purpose logging library. Rather than reinventing the wheel, WAF relies on the Log4J library from the Jakarta project of the Apache Foundation. If you are not familiar with it, please skim through the online documentation at http://jakarta.apache.org/log4j/docs/. The rest of this document tries to highlight rather than explain the most salient Log4J features.
The key abstractions that Log4J provides are:
Ability to log to multiple devices. See the Javadoc for the Appender[1] interface and its implementing subclasses. There are many types of appenders included in Log4J: file appender, rolling file appender, remote syslog daemon appender, asynchronous appender, etc. Appenders are fairly easy to extend for special purposes. For instance, you could write a database appender. (In fact, several have already been written.) See Section 7.5.4 Custom Appenders for more discussion on this.
Ability to easily configure the format of the message. For example, each message can automatically include the timestamp, thread name, location (file name and line number), full or abbreviated class and method name from the which the message originated, etc. See the Javadoc for Layout (at http://jakarta.apache.org/log4j/docs/api/org/apache/log4j/Layout.html) and its subclasses.
API for logging errors and exceptions. See Section 7.5.2 Always log the Throwable.
Different logging levels. The minimum set of possible levels are off, fatal, error, warn, info, debug, and all. Extending the list of levels is possible but discouraged. You can change the requested level of logging either by altering your configuration files before system startup, or at runtime. More on this in Section 7.5.3 Runtime logging level manipulation. See also the Javadoc for Level (http://jakarta.apache.org/log4j/docs/api/org/apache/log4j/Level.html).
Fine-grained hierarchy of loggers. A logger is the central class that performs all logging operations. Loggers are named with dot-delimited names like foo.bar.baz or foo.bar.quux. Loggers form a hierarchy based on their names. You can configure and manipulate loggers in bulk by referring to their parent names. For example, both foo.bar.baz or foo.bar.quux are children of foo.bar.
Each logger has an associated logging level. By setting the logger's level, you can control the amount of information logged. Log statements are enabled only if the level of statement is greater than or equal to the current level of the logger. For example, if you set the foo.bar.baz logger's level to warn, it will only log message whose level is warn, error, or fatal. By setting the level for the foo.bar logger, you are automatically setting it for foo.bar.baz and foo.bar.quux, unless overridden on a more specific level.
Miscellaneous other items such as filters[2] and flexible configuration.
The next sections examine some of these benefits in more detail.
In general, if you catch an exception and decide not to rethrow it for whatever reason, you should log it. In doing so, avoid the following anti-pattern:
try { doSomething(); } catch (FooException ex) { ex.printStackTrace(); s_log.warn("foo occurred: " + ex.getMessage()); } |
Example 7-2. Exception logging anti-pattern
The printStackTrace() method should not be used, because its output goes to the standard error device System.err rather than the devices managed by Log4J.
The getMessage() method should not be used, because it does not include the stack trace. Stack traces are invaluable for tracking down the source of error conditions.
The following pattern should be used instead.
try { doSomething(); } catch (FooException ex) { s_log.debug("foo occurred", ex); // or s_log.info("foo occurred", ex); // or s_log.warn("foo occurred", ex); // or s_log.log(Level.WARN, "foo occurred", ex); } |
Example 7-3. Log the Throwable
Note that the various logging methods provided by the Logger [3] class all take a Throwable parameter, i.e. an Error or Exception. Given a throwable, Log4J will print both the error message obtained via getMessage() and the stack trace.
However, be careful not to fall into this trap:
try { doSomething(); } catch (FooException ex) { s_log.debug(ex); } |
Example 7-4. Swallowed stack trace anti-pattern
There is no debug(Throwable throwable) method — there is only debug(Object msg). Same goes for warn, info, etc. The above statement is essentially equivalent to the following:
s_log.debug(String.valueOf(ex)); |
This is probably not the intended result.
What if you want to log the current stack trace? The easiest way to do this is to log a new Throwable:
A variation of this technique is to add a public Throwable field to a class.
public class Frob { // add the following line temporarily public final Throwable STACK = new Throwable(); } |
Example 7-6. Tracking the origin of an object
Once this has been done, you can write code such as this:
class Foobar { void doStuff(Frob frob) { if ( !checksOutOK(frob) ) { s_log.error("Got a sketchy frob", frob.STACK); } } } |
This allows you to figure the origin of the frob object at a distant point further down the execution path.
Logging levels are usually configured at system startup via the configuration file. It is possible to change the logger's level at runtime. One way to do this is via the Developer Support interfaces. (See Section 7.3 Developer Support.) From the programmatic point of view, runtime manipulation of the logging level boils down to something as simple as this:
// assume s_log is instance of Logger s_log.setLevel(Level.WARN); |
This sets the logging level for the s_log logger to warn. This means that messages output via, say, debug(Object msg, Throwable throwable) will be suppressed, because warn is stricter than debug.
The standard pattern for instantiating loggers that is used throughout WAF is as follows:
package foo.bar; import org.apache.log4j.Logger; public class Baz { private final static Logger s_log = Logger.getLogger(Baz.class); } |
Example 7-7. Standard way of instantiating loggers
This instantiates a singleton logger named foo.bar.Baz. Note, however, that assigning the logger to a private static variable is merely a matter of syntactic convenience. If you ever need to, you can obtain the same logger from anywhere else in your code.
package foo.frob; import org.apache.log4j.Logger; class Nosey { void doStuff() { Logger logger = Logger.getLogger("foo.bar.Baz"); logger.debug("got the logger used by the Baz class."); } } |
Example 7-8. Random access to loggers
There is a couple reasons why this might be useful.
This allows you to adjust the logging level of any logger at runtime. If you have a production system that starts exhibiting erratic behavior, you can turn up the logging level in selected classes without shutting down and restarting the system. This is what Developer Support allows you to do.
There are times when you want to see what goes on in the class Foo when you call a method reliant on Foo elsewhere in your code.
To elaborate on the last point: Consider the logging output generated by the com.arsdigita.db.PreparedStatement logger. If the logging level for this logger is set to info, it will log every executed query and its bind variable values. If you are interested in seeing the query your code generates, you can set this logger's level to info in your config file. Note, however, that this will result in all queries being logged system-wide. That may be many more than you require. The alternative is to adjust the logging level at runtime:
// assume all the necessary imports class Frobnivator { void frobnivate() { Logger logger = Logger.getLogger("com.arsdigita.db.PreparedStatement"); Level old = logger.getLevel(); logger.setLevel(Level.INFO); doSomething(); logger.setLevel(old); } } |
Example 7-9. Adjusting logging level temporarily
In the above example, you end up enabling the PreparedStatement logger only for the duration of the doSomething() method. This may significantly cut down the number of queries logged, thus making it easier for you to see what is going on.
The obvious caveat here is that extraneous queries may still end up getting logged in the presence of multiple threads. This is because the logger's level cannot be adjusted on a per-thread basis - it can only be changed globally.
Out of the box, WAF provides two appenders: ConsoleAppender [4], wired to your standard output device (usually a TTY), and RollingFileAppender [5] tied to a file whose location is configurable.
You can also use other appenders. One of the interesting possibilities is the SocketAppender [6]. It expects a log viewer (or some other log processing application) to listen on the specified TCP port, possibly on a different machine. There are a number of GUI log viewers that work with the socket appender. These viewers give you fine-grained control over your logging output. See, for example, Chainsaw [7] and Lumbermill [8]. Both of these log viewers allow you to filter the output by level, logger, thread name, etc.
The Java language provides two abstractions for doing output: output streams [9] for working with sequences of bytes, and writers [10] for working with sequences of characters. An output stream or a writer can be buffered or unbuffered. For more information, read about BufferedOutputStream [11] and about BufferedWriter [12].
When I/O is buffered, it means that bytes or characters that you write do not necessarily go to their intended output device immediately. They are cached in an intermediate buffer. When the buffer fills up, it is flushed — an actual write to the underlying physical device occurs.
Why should you care about this? If you log a message to a buffered output immediately before the system crashes, you may not see the message logged anywhere. The system didn't have a chance to flush the buffered stream or writer. This situation is fairly rare, but when it does happen, it may stump the unwary troubleshooter. If you want to be absolutely sure you are not losing any logging statements, you must use unbuffered output.
For example, the standard output System.out may be buffered, while the standard error device System.err is usually not. Therefore, System.err is preferable when you want to make sure no logging is lost.
What kind of I/O does log4j use: buffered or unbuffered? It depends on how the particular appender is configured. In the case of FileAppender [13], you have the option of configuring it either way. See the methods getBufferedIO() [14] and setBufferedIO(boolean) [15].
For maximum performance, you should use buffered I/O. Only switch to unbuffered I/O when you suspect logged information may be getting lost. By default, the file appender is buffered.
For maximum performance, follow these rules:
Turn off all unnecessary logging. Set the threshold to error, fatal, or off to minimize the amount of logging produced.
As has been pointed out in Section 7.5.5 Beware of Buffered Streams, make sure you are using buffered I/O for logging.
If logging statements appear in a performance-critical section of code, they should be wrapped inside an if statement that checks the current level of the logger. The "\" character has been inserted where the lines were artificially broken for printing purposes:
if ( s_log.isDebugEnbled() ) { Object expensiveMsg = "parameter foo " + foo + " expected by the \ object " + frob + " was not supplied. Oh, and the value of baz was " + \ baz); s_log.debug(expensiveMsg); } |
By making the above logging call conditional, we avoid the high cost of constructing the expensiveMsg object, if the logging level is to stricter than debug.