How to tune MMBase for production

Abstract

A how-to for fine-tuning MMBase


The first step in identifying which systems perform poorly is to divide the production environment into segments. Every segment represents a part of the response time.

Every system on a segment boundary should log the response time of the request, so that the segment with the highest delay can be found.
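
The segment approach can be sketched in a few lines of Java. This is only an illustration: the class name, the segment names and the timings are made up, and it assumes that each system logs the total time it needed for the same request, so that the time spent in a segment is the difference between two neighbouring measurements.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SegmentDelays {

    /**
     * Input: total response time in milliseconds as logged at each boundary,
     * ordered from the outermost system (browser) to the innermost (database).
     * Output: the time spent inside each segment itself, i.e. the logged time
     * minus the time logged by the next system further in.
     */
    static Map<String, Long> perSegment(Map<String, Long> loggedMillis) {
        Map<String, Long> own = new LinkedHashMap<String, Long>();
        String outerName = null;
        long outerTime = 0;
        for (Map.Entry<String, Long> entry : loggedMillis.entrySet()) {
            if (outerName != null) {
                own.put(outerName, outerTime - entry.getValue());
            }
            outerName = entry.getKey();
            outerTime = entry.getValue();
        }
        if (outerName != null) {
            own.put(outerName, outerTime); // the innermost system keeps all of its time
        }
        return own;
    }

    /** Returns the name of the segment with the highest delay, or null for empty input. */
    static String slowest(Map<String, Long> own) {
        String name = null;
        long worst = Long.MIN_VALUE;
        for (Map.Entry<String, Long> entry : own.entrySet()) {
            if (entry.getValue() > worst) {
                worst = entry.getValue();
                name = entry.getKey();
            }
        }
        return name;
    }

    public static void main(String[] args) {
        // Made-up measurements for one request, outermost system first.
        Map<String, Long> logged = new LinkedHashMap<String, Long>();
        logged.put("browser", 1200L);
        logged.put("webserver", 900L);
        logged.put("application server", 850L);
        logged.put("database", 600L);
        Map<String, Long> own = perSegment(logged);
        System.out.println(own + " slowest: " + slowest(own));
    }
}
```

In practice the measurements of many requests would be averaged before the segments are compared.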

Browser

Logging of the response time on the client can be done with several load test tools. A freely available one is JMeter (http://jakarta.apache.org/jmeter).

Webserver or application server

All frequently used servers can log response time information in the access log.

The Apache HTTP server uses the mod_log_config module, which supports custom log formats. See http://httpd.apache.org/docs/2.0/mod/mod_log_config.html for more information on LogFormat and CustomLog.

Apache Tomcat uses an Access Log Valve to create a logfile similar to the one the Apache HTTP server creates. For more information see http://tomcat.apache.org/tomcat-5.5-doc/config/valve.html
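
Once the response time is in the access log, it can be summarized with a small program. The sketch below assumes a custom LogFormat of the Apache HTTP server that ends with %D (the time taken to serve the request, in microseconds), for example "%h %l %u %t \"%r\" %>s %b %D". The class name and the sample lines are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class AccessLogTimes {

    /** Returns the last whitespace-separated field as microseconds, or -1 if it is not a number. */
    static long microsOf(String line) {
        String trimmed = line.trim();
        String last = trimmed.substring(trimmed.lastIndexOf(' ') + 1);
        try {
            return Long.parseLong(last);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /** Average response time in milliseconds over all lines that end with a number. */
    static double averageMillis(List<String> lines) {
        long totalMicros = 0;
        int counted = 0;
        for (String line : lines) {
            long micros = microsOf(line);
            if (micros >= 0) {
                totalMicros += micros;
                counted++;
            }
        }
        return counted == 0 ? 0.0 : (totalMicros / 1000.0) / counted;
    }

    public static void main(String[] args) {
        // Made-up log lines in the assumed format; the last field is %D.
        List<String> lines = Arrays.asList(
            "127.0.0.1 - - [10/Oct/2006:13:55:36 +0200] \"GET /index.jsp HTTP/1.1\" 200 2326 250000",
            "127.0.0.1 - - [10/Oct/2006:13:55:37 +0200] \"GET /news.jsp HTTP/1.1\" 200 5120 750000");
        System.out.println("average ms: " + averageMillis(lines));
    }
}
```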

Database server

Most databases can monitor the time a query takes to execute. How this is done differs per database, and the results are difficult to map to the request of a single user. MMBase can also log the query response time when the debug level is enabled on the SQL handlers. By default the log configuration is the log4j.xml in the mmbase.jar or in the /WEB-INF/config/log/ directory.

<logger name="org.mmbase.storage.implementation.database.DatabaseStorageManager"
        additivity="false">
  <level class="&mmlevel;" value="debug" />
  <appender-ref ref="sqllog" />
</logger>

If the response time in this segment is poor, the problem can lie in several places.

The first thing to check is which resources on the server are overloaded and which processes are causing it. Monitor the CPU load, virtual memory usage and I/O operation statistics. When the CPU load or memory usage is not caused by the application server process, the problem is somewhere in other applications or in the OS.
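
The tools of the operating system remain the primary source for these figures, but a rough view of the CPU load is also available from inside a Java process. The sketch below uses the standard java.lang.management API. The class name and the rule of thumb that a load above 1.0 per CPU suggests saturation are assumptions for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

class SystemLoad {

    /**
     * Load average divided by the number of CPUs. Values above 1.0 suggest that
     * the CPUs are saturated. Returns -1 when the load average is not available.
     */
    static double loadPerCpu(double loadAverage, int cpus) {
        if (loadAverage < 0 || cpus < 1) {
            return -1.0;
        }
        return loadAverage / cpus;
    }

    static String snapshot() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        int cpus = os.getAvailableProcessors();
        // getSystemLoadAverage() returns a negative value on platforms that cannot report it.
        double load = os.getSystemLoadAverage();
        return "os=" + os.getName()
            + " cpus=" + cpus
            + " loadAverage=" + load
            + " loadPerCpu=" + loadPerCpu(load, cpus);
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```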

Before going into the details of the JVM, realize that one of the advantages of Java is that it dynamically optimizes for data at runtime. A Java™ HotSpot™ virtual machine adapts and reacts to the specific machine and specific application it is running. In more and more cases Java performance meets or exceeds the performance of similar statically compiled programs. However, this adaptability of the JVM makes it hard to measure small snippets of Java functionality.

One of the reasons that it's challenging to measure Java performance is that it changes over time. At startup, the JVM typically spends some time "warming up". Depending on the JVM implementation, it may spend some time in interpreted mode while it is profiled to find the 'hot' methods. When a method gets sufficiently hot, it may be compiled and optimized into native code.
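
The effect of warming up can be taken into account by running the code a number of times before any timing is recorded. The following is a minimal sketch of that idea, not a replacement for a real benchmark tool. The class name, the workload and the number of runs are arbitrary choices for illustration.

```java
class WarmupTiming {

    /** An arbitrary CPU-bound workload to measure. */
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (i % 7) * (long) i;
        }
        return sum;
    }

    /** Times a single call of work(n) in nanoseconds. */
    static long timeNanos(int n) {
        long start = System.nanoTime();
        long result = work(n);
        long elapsed = System.nanoTime() - start;
        if (result == -1) {
            // Never true; using the result keeps the JIT from discarding the work.
            System.out.println(result);
        }
        return elapsed;
    }

    /** Runs the workload warmupRuns times unmeasured, then returns measuredRuns timings. */
    static long[] measure(int warmupRuns, int measuredRuns, int n) {
        for (int i = 0; i < warmupRuns; i++) {
            work(n); // gives the JVM a chance to compile the hot method
        }
        long[] timings = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            timings[i] = timeNanos(n);
        }
        return timings;
    }

    public static void main(String[] args) {
        System.out.println("cold run:  " + timeNanos(1000000) + " ns");
        for (long t : measure(50, 5, 1000000)) {
            System.out.println("warmed up: " + t + " ns");
        }
    }
}
```

On most JVMs the cold run takes noticeably longer than the warmed-up runs, but the exact difference depends on the JVM and the machine.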

Before you start to tune the command line arguments for Java, be aware that Sun's HotSpot™ Java Virtual Machine has incorporated technology to begin to tune itself. This smart tuning is referred to as Ergonomics. Most computers that have at least 2 CPUs and at least 2 GB of physical memory are considered server-class machines, for which the JVM selects server-oriented defaults such as the -server compiler.

Please note that 32-bit Windows systems all use the -client compiler by default and 64-bit Windows systems which meet the criteria above will be treated as server-class machines.

Even though Ergonomics significantly improves the "out of the box" experience for many applications, optimal tuning often requires more attention to the sizing of the Java memory regions.
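
To see what the JVM has chosen on a particular machine, the running process can report its own settings. The sketch below only uses standard Java APIs. The class name is made up, and the output differs per JVM and machine.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class ErgonomicsReport {

    static String report() {
        Runtime rt = Runtime.getRuntime();
        // Arguments passed to the JVM on the command line, such as -Xmx or -server.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        return "vm=" + System.getProperty("java.vm.name")
            + " cpus=" + rt.availableProcessors()
            + " maxHeapMb=" + rt.maxMemory() / (1024 * 1024)
            + " jvmArgs=" + jvmArgs;
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

The VM name usually shows whether the client or the server compiler is in use, and maxHeapMb shows the maximum heap size that applies when no -Xmx is given.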

A lot of Java developers do not know how memory management works in the JVM, because as a developer you do not have to care about memory (de)allocation. Once an application is in production, however, the way the memory management system is tuned can be very important. A badly tuned application can halt for more than 10 minutes. This has been seen on a machine which had to swap a lot and had 1G assigned as maximum heap size (-Xmx). After tuning the settings the halt was reduced to 30 seconds or less.

Garbage collection (GC) is one of the hardest things for JVMs to do efficiently. You can make some parameters explicit to the JVM, which makes it much easier for the JVM to guess when it should do something.

Most people know the parameters -Xms and -Xmx, but not what they do to memory management and GC. A lot of MMBase instances in production either do not have these settings or have them set very high. In the 1G example the setting was reduced to 700M, which is still too high. The -Xms and -Xmx settings set the initial and maximum amount of OS memory allocated to the process. This memory is called the heap. When the heap is 60% full or more, the JVM will start a GC which halts the process until it is finished (the 10 minute break). The default heap setting is 64M, which MMBase with an average web application will fill in seconds. The GC will then run very frequently to free memory, which leaves less CPU time for the application. Most MMBase applications require about 300M of heap.

Note that the -Xms and -Xmx settings do not match what the OS reports as the memory usage of the process. The memory usage also includes program code and JVM code (J2SE shares the JVM code in memory between all Java processes). The memory footprint of the process has to stay below the physical memory to prevent high GC times. Setting -Xmx to the same size as the physical memory will decrease performance, because it will likely result in paging of virtual memory to disk.
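
The difference between the heap and the rest of the memory of the process can be made visible from inside the JVM. The sketch below uses the standard MemoryMXBean to report heap and non-heap usage. The class name is made up, and the figures still exclude memory the JVM uses outside these areas, so the OS will report a larger footprint.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class HeapFootprint {

    static long toMb(long bytes) {
        return bytes / (1024 * 1024);
    }

    /** Percentage of the maximum heap that is in use, or -1 if no maximum is defined. */
    static int heapPercentUsed(MemoryUsage heap) {
        if (heap.getMax() <= 0) {
            return -1;
        }
        return (int) (100 * heap.getUsed() / heap.getMax());
    }

    static String report() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();       // limited by -Xms and -Xmx
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage(); // class data, compiled code, ...
        return "heap used=" + toMb(heap.getUsed()) + "M"
            + " committed=" + toMb(heap.getCommitted()) + "M"
            + " percentOfMax=" + heapPercentUsed(heap)
            + " | non-heap used=" + toMb(nonHeap.getUsed()) + "M";
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```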

In the next part, wherever the JVM or Java is mentioned, the Sun JVM is meant. A lot is the same for other JVMs, but not everything. The memory of the Sun JVM is divided into several spaces. It has one permanent generation, one old generation and one new generation. The new generation is divided into an eden and 2 survivor spaces.

The permanent generation stores the class objects and some other permanent data. In JSP environments, where a lot of classes are loaded, it is good practice to set the 'PermSize' value high enough, because Full GCs are needed to extend the permanent generation.

New object instances are created in eden (new generation). When eden is full, a small GC is performed to clean it. The instances still in use are copied to one of the survivor spaces and the rest is cleared. When an instance has been copied to the survivor spaces multiple times, it is promoted to the old generation. When the old generation is full, a Full GC is performed on the heap. A Full GC, in the default case, means a halt of the JVM. A new generation GC pauses the JVM only briefly, because it is much faster than an old generation GC.

The ratio between the sizes of the generations within the full heap can have a great impact on the throughput and halt time of the JVM, especially when editwizards are involved. When you have memory issues, try these settings:
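
The spaces described above can be listed for a running JVM with the standard memory pool API. This is a small sketch with a made-up class name. The pool names differ per JVM version and garbage collector, so names such as eden, survivor or old generation may appear in a slightly different form.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {

    /** Returns one line per memory pool with its type, name and current usage. */
    static List<String> describe() {
        List<String> lines = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null when the pool is no longer valid
            long usedKb = usage == null ? -1 : usage.getUsed() / 1024;
            lines.add(pool.getType() + " | " + pool.getName() + " | used " + usedKb + " KB");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe()) {
            System.out.println(line);
        }
    }
}
```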

-server -XX:NewRatio=2 -XX:SurvivorRatio=6

The JVM has to run in server mode and not in client mode. In client mode the ratio between old and new is -XX:NewRatio=8. This means that the new generation in client mode will be 1/9 of the heap and the old generation 8/9. In server mode the ratio is -XX:NewRatio=2. The new generation will then be 1/3 of the heap. The -XX:NewRatio=2 above just makes it explicit. The advantage is that more new instances can be created and die before a small GC is performed.

The default for -XX:SurvivorRatio is 25. Lowering it to 6 increases the survivor spaces, which prevents the situation where there are too many new instances for a survivor space. The instances that do not fit go to the old generation right away. This will happen when the editwizards are installed. A user keeps a stack of wizards with a lot of instances on the server. A very big wizard can occupy 4MB of memory. All instances die when the wizard is closed. You want to keep these objects in the new generation. With a default server mode JVM with 1G, the survivor spaces are 12M each. If you have multiple editors, the stacks with wizards can easily be more than 12M, and the instances will go to the old generation during a small GC and can only be cleared by a Full GC.

To emphasize, in an ideal run you want your survivor space to have objects of different ages. That means you have enough space there to not instantly promote live objects to the old generation. This means less pollution of the old generation.

Default setup with 1G
Old generation 682 MB (client mode: 910 MB)
New generation 341 MB (client mode: 113 MB)
Eden 315 MB (client mode: 104 MB)
Survivors 12 MB each (client mode: 4 MB each)

Setup with 1G and -server -XX:NewRatio=2 -XX:SurvivorRatio=6
Old generation 682 MB
New generation 341 MB
Eden 255 MB
Survivors 42 MB each
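
The figures above follow from the two ratios. NewRatio is the ratio old:new, so the new generation is 1/(NewRatio+1) of the heap. SurvivorRatio is the ratio between eden and one survivor space, so one survivor space is 1/(SurvivorRatio+2) of the new generation. The sketch below reproduces the arithmetic. The class name is made up, and every value is rounded down to whole megabytes, so the parts do not add up exactly.

```java
class GenerationSizes {
    final long oldGen;
    final long newGen;
    final long eden;
    final long survivor; // size of ONE of the two survivor spaces

    GenerationSizes(long heapMb, int newRatio, int survivorRatio) {
        // NewRatio = old / new, so new is 1/(NewRatio+1) of the heap.
        newGen = heapMb / (newRatio + 1);
        oldGen = heapMb * newRatio / (newRatio + 1);
        // SurvivorRatio = eden / one survivor space, and there are two survivor spaces.
        survivor = newGen / (survivorRatio + 2);
        eden = newGen * survivorRatio / (survivorRatio + 2);
    }

    @Override
    public String toString() {
        return "old=" + oldGen + "MB new=" + newGen + "MB eden=" + eden
            + "MB survivor=" + survivor + "MB each";
    }

    public static void main(String[] args) {
        System.out.println("server default: " + new GenerationSizes(1024, 2, 25));
        System.out.println("client default: " + new GenerationSizes(1024, 8, 25));
        System.out.println("tuned:          " + new GenerationSizes(1024, 2, 6));
    }
}
```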

When you want to monitor the heap usage on a production server, you can use -Xloggc:/log/gc.log. The GC statistics will be written to this file with negligible overhead. GCViewer (http://www.tagtraum.com/gcviewer.html) can generate a graph of the logfile.

Another tool which is very handy, and maybe even an absolute necessity, is jvmstat. Jvmstat shows graphically how your memory is divided between the permanent, old and young generations. It also shows how the eden and survivor spaces are filled.
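
Besides the log file, the GC statistics are also available inside the running JVM through the standard GarbageCollectorMXBean. The sketch below is an illustration with a made-up class name. The names of the collectors depend on the JVM and the selected garbage collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStats {

    /** Total time in milliseconds spent in all collectors since the JVM started. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** One line per collector with the number of collections and the accumulated time. */
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", totalMillis=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Create some short-lived garbage so that the collectors have something to do.
        for (int i = 0; i < 100000; i++) {
            byte[] garbage = new byte[1024];
            garbage[0] = 1;
        }
        System.out.print(report());
        System.out.println("total GC time: " + totalCollectionMillis() + " ms");
    }
}
```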

There are many articles on the Internet with more information on memory and heap size tuning on different platforms. See for example http://java.sun.com/performance/

MMBase relies heavily on its caches. Most caches are configurable in the caches.xml, which is in the root of the MMBase configuration. There are a few which have a large effect on memory usage and database round trips. The MMBase admin has a page where cache statistics are shown. This page shows how efficient a cache is.

The default implementation of all caches is a Least Recently Used Hashtable. Another implementation can be plugged in when this is not sufficient.
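
The behaviour of a Least Recently Used cache, together with the hit ratio that expresses its efficiency, can be sketched as follows. This is not the MMBase implementation. It is a minimal illustration built on java.util.LinkedHashMap, and the class and method names are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;
    private long hits;
    private long misses;

    LruCache(int maxSize) {
        super(16, 0.75f, true); // true = keep entries in access order, not insertion order
        this.maxSize = maxSize;
    }

    /** Called by LinkedHashMap after every put; drops the least recently used entry when full. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
    }

    /** Like get(), but also counts hits and misses. */
    V lookup(K key) {
        V value = get(key);
        if (value == null) {
            misses++;
        } else {
            hits++;
        }
        return value;
    }

    /** Fraction of lookups answered from the cache, between 0.0 and 1.0. */
    double hitRatio() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<String, String>(2);
        cache.put("a", "node a");
        cache.put("b", "node b");
        cache.lookup("a");          // hit, "a" becomes the most recently used entry
        cache.put("c", "node c");   // cache is full, "b" is evicted
        cache.lookup("b");          // miss
        System.out.println(cache.keySet() + " hit ratio: " + cache.hitRatio());
    }
}
```

A cache with a low hit ratio and many evictions is a candidate for a larger size, at the cost of more heap.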

Before explaining these caches it is important to understand the two different types of nodes MMBase uses. MMBase has virtual and real nodes. Real nodes contain fields which are defined in a builder. Virtual nodes are usually the result of a multilevel query. Fields in virtual nodes originate from multiple builders. The field name is always prefixed with the builder name. Real nodes represent objects like news items. Virtual nodes represent multiple parts of different objects. A virtual node can contain a news item title and the author's full name. Virtual nodes do not have a node number.

How the caches are used depends on how the application on top of MMBase is coded. An application which uses a lot of getRelatedNodes requests requires a large 'RelatedNodes' cache. An application with a lot of complicated queries involving multiple tables requires a larger 'Multilevel' cache.
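
The difference between the two node types can be illustrated with plain maps. The sketch below does not use the MMBase API. The builder names 'news' and 'people', the field names and the values are made up, and a node is simply modelled as a map from field name to value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class VirtualNodeSketch {

    /** Copies fields of a real node into a virtual node, prefixing each field with the builder name. */
    static void addPrefixed(Map<String, Object> virtualNode, String builder,
                            Map<String, Object> realNode, String... fields) {
        for (String field : fields) {
            virtualNode.put(builder + "." + field, realNode.get(field));
        }
    }

    public static void main(String[] args) {
        // Two real nodes: each has a node number and the fields of its builder.
        Map<String, Object> newsItem = new LinkedHashMap<String, Object>();
        newsItem.put("number", 101);
        newsItem.put("title", "MMBase tuned for production");
        newsItem.put("body", "...");

        Map<String, Object> author = new LinkedHashMap<String, Object>();
        author.put("number", 202);
        author.put("fullname", "Jane Example");

        // A virtual node: parts of both objects, prefixed field names, no node number.
        Map<String, Object> virtualNode = new LinkedHashMap<String, Object>();
        addPrefixed(virtualNode, "news", newsItem, "title");
        addPrefixed(virtualNode, "people", author, "fullname");

        System.out.println(virtualNode);
    }
}
```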


This is part of the MMBase documentation.

For questions and remarks about this documentation mail to: [email protected]