Monday, August 31, 2009

Notes On How To Fix Tomcat PermGen Error

The PermGen problem rears its head when we reload an application in Tomcat multiple times. After some number of reloads of a given application, Tomcat eventually throws an error saying that it is out of PermGen space and then quits. Here is the problem as described on Frank Kieviet's blog.

The problem in a nutshell

Application servers such as Tomcat allow you to write an application (.ear, .war, etc) and deploy this application with other applications on this application server. Should you feel the need to make a change to your application, you can simply make the change in your source code, compile the source, and redeploy the application without affecting the other still running applications in the application server: you don't need to restart the application server. This mechanism works fine on Tomcat and other application servers.

The way that this works is that each application is loaded using its own Classloader. Simply put, a Classloader is a special class that loads .class files from jar files. When you undeploy the application, the Classloader is discarded and it and all the classes that it loaded, should be garbage collected sooner or later.

Somehow, something may hold on to the Classloader however, and prevent it from being garbage collected. And that's what's causing the java.lang.OutOfMemoryError: PermGen space exception.
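To make the mechanism concrete, here is a minimal sketch of how a single stray reference keeps a classloader alive. It is not taken from any of the linked posts: the class name ClassLoaderPinDemo, the REGISTRY list and the deployAndUndeploy helper are all hypothetical, and an empty URLClassLoader stands in for a web application's classloader.

```java
import java.lang.ref.WeakReference;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

public class ClassLoaderPinDemo {
    // Hypothetical stand-in for a long-lived registry that outlives the
    // application (a static field in a server or JDK class, for example).
    public static final List<Object> REGISTRY = new ArrayList<>();

    // Simulates deploying an application and then undeploying it.
    // When leak is true, the long-lived registry keeps a reference
    // to the application's classloader after "undeploy".
    public static WeakReference<ClassLoader> deployAndUndeploy(boolean leak) {
        ClassLoader appLoader = new URLClassLoader(
                new URL[0], ClassLoaderPinDemo.class.getClassLoader());
        if (leak) {
            REGISTRY.add(appLoader); // the stray reference
        }
        // The local variable goes out of scope here ("undeploy").
        // A weak reference lets us observe the loader without pinning it.
        return new WeakReference<>(appLoader);
    }

    public static void main(String[] args) {
        WeakReference<ClassLoader> leaked = deployAndUndeploy(true);
        WeakReference<ClassLoader> clean = deployAndUndeploy(false);
        System.gc(); // only a hint to the VM, not a guarantee
        System.out.println("leaked loader still reachable: " + (leaked.get() != null));
        System.out.println("clean loader still reachable:  " + (clean.get() != null));
    }
}
```

The leaked loader is always still reachable, because the registry holds a strong reference to it. The clean loader is usually gone after the collection, but since System.gc() is only a hint, that part is not guaranteed on every run.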

PermGen space

What is PermGen space anyways? The memory in the Virtual Machine is divided into a number of regions. One of these regions is PermGen. It's an area of memory that is used to (among other things) load class files. The size of this memory region is fixed, i.e. it does not change when the VM is running. You can specify the size of this region with a commandline switch: -XX:MaxPermSize. The default is 64 MB on the Sun VMs.

If there's a problem with garbage collecting classes and if you keep loading new classes, the VM will run out of space in that memory region, even if there's plenty of memory available on the heap. Setting the -Xmx parameter will not help: this parameter only specifies the size of the total heap and does not affect the size of the PermGen region.

So, in essence, Tomcat is not able to discard the classloader when an application is reloaded, and eventually it just runs out of PermGen space.
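To see that PermGen really is a separate region from the heap, you can ask the running VM for its memory pools. The sketch below uses the standard java.lang.management API; the class name MemoryPoolReport and its describePools method are my own names. Pool names depend on the VM and the garbage collector: on the Sun VMs discussed here I would expect a non-heap pool with "Perm Gen" in its name, while Java 8 and later have no PermGen at all and report "Metaspace" instead.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public class MemoryPoolReport {
    // Builds one line per memory pool: name, type (heap or non-heap),
    // current usage and the configured maximum.
    public static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long max = usage.getMax(); // -1 means the maximum is undefined
            lines.add(String.format("%-30s %-16s used=%dK max=%s",
                    pool.getName(),
                    pool.getType(),
                    usage.getUsed() / 1024,
                    max < 0 ? "undefined" : (max / 1024) + "K"));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

Run inside the VM you care about (or read the same pools through jconsole), the non-heap pool that holds class data should keep growing with every reload if the classloader is leaking.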

For me, understanding the problem was most of the battle. Now that I understand how it happens, the question becomes how to fix it. As it turns out, there are quite a few tools for the job. Here is how I went about solving it.

1. Start Tomcat
2. Deploy and run your application
3. Undeploy the application (just the application, not Tomcat)
4. Get a heap dump of the Tomcat process: "jmap -dump:format=b,file=leak.hprof <pid>"
5. Run the Memory Analyzer tool
6. Use the Memory Analyzer to find the source of the leak
7. Fix the source of the memory leak
8. Put in the correct JVM settings
9. Reload the application to your heart's content
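Step 4 uses jmap from the outside. If attaching to the process is awkward, a heap dump can also be requested from inside a HotSpot VM through the com.sun.management.HotSpotDiagnosticMXBean interface. This is only a sketch of that alternative, not what I used: the HeapDumper class is a made-up name, the code dumps the VM it runs in (so it would have to be invoked inside Tomcat, for example from a servlet), and it only works on HotSpot-based VMs.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;

public class HeapDumper {
    // Writes a heap dump of the current VM to the given file.
    // The target file must not exist yet; recent JDKs also expect
    // the file name to end in .hprof.
    public static File dumpHeap(File target) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            // true = dump only live (reachable) objects, which is what
            // matters when hunting for whatever still references a classloader
            bean.dumpHeap(target.getAbsolutePath(), true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    public static void main(String[] args) {
        File target = new File(args.length > 0 ? args[0] : "leak.hprof");
        System.out.println("Wrote " + dumpHeap(target).getAbsolutePath());
    }
}
```

The resulting .hprof file can be opened in the Memory Analyzer in the same way as one produced by jmap.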

For more details I will just refer you to the links at the bottom of the post. The problem and solution have been described very well by other sites; my main motivation was to bring the various links together. As a brief summary, what we are doing is running our application and then undeploying it. At this point there should not be any references to the application's classloader, as Tomcat should have discarded it. If the classloader still shows up in the heap dump, you have a leak.

JVM Settings:

These are the JVM settings that I used to make the PermGen recover. For some reason there appears to be a bug that requires you to add the -client flag to make things work.
-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false -Xms256m -Xmx512m -XX:+UseConcMarkSweepGC -XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled -client

CleanupListener:

To solve the leaks I created a CleanupListener.
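The listener's code is not reproduced in this post, so the following is only a sketch of the kind of cleanup such a listener might do, not the original CleanupListener. It handles one commonly reported cause of classloader leaks: JDBC drivers that the application registered with java.sql.DriverManager and that are never deregistered. The class name WebappCleanup and the method deregisterDrivers are hypothetical. To keep the example free of servlet API dependencies it is a plain class; in a real web application you would call it from the contextDestroyed method of a javax.servlet.ServletContextListener.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Enumeration;

public class WebappCleanup {
    // Deregisters every JDBC driver that was loaded by the given
    // classloader and returns how many drivers were removed.
    // Drivers loaded by other classloaders (for example drivers shared
    // by the whole server) are left alone.
    public static int deregisterDrivers(ClassLoader webappLoader) {
        int removed = 0;
        Enumeration<Driver> drivers = DriverManager.getDrivers();
        while (drivers.hasMoreElements()) {
            Driver driver = drivers.nextElement();
            if (driver.getClass().getClassLoader() == webappLoader) {
                try {
                    DriverManager.deregisterDriver(driver);
                    removed++;
                } catch (SQLException e) {
                    System.err.println("Could not deregister " + driver + ": " + e);
                }
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        // In a web application this would be the webapp's own classloader,
        // typically Thread.currentThread().getContextClassLoader() when
        // called from contextDestroyed.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        System.out.println("Deregistered drivers: " + deregisterDrivers(loader));
    }
}
```

Whatever the Memory Analyzer reports as still holding on to your classloader is what the listener needs to release; JDBC drivers are just one example.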

Tools:
There are some good commercial tools out there, but I found that jconsole and the Memory Analyzer together gave me the same information. I was a little disappointed that the NetBeans profiler could not solve this problem, but I have yet to find a tool that finds bottlenecks in your code as well as NetBeans does!

Note: to run jconsole locally I did have to put in the following JVM arguments:
-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false

Links:

I cannot take any credit for figuring out this problem, but it did take me a while to scour the web to find all the documentation. This is what I found, in order of importance.

Frank Kieviet's Blog
- explains the problem perfectly

Memory Analyzer
- great tool that allows you to search the heap dump

Finding Memory Leaks in Java Apps
- another great detailed post that describes the problem and solution (uses jhat)

Causes of Java PermGen Memory Leaks
- very practical advice on what might be causing the problem

Memory leaks where the classloader cannot be garbage collected
- more advice on what might be causing the problem and some example code

A Collection of JVM Options
- Ever wonder what all the switches in your JVM do?

jhat - Java Heap Analysis Tool
- an alternative to using the Memory Analyzer

PermGen problem - class loader does not collect
- forum thread where I found the -client flag fix

How to prevent memory leaks when reloading web applications

Good Riddance, PermGen OutOfMemoryError!

Yet another day in the life of a memory leak hunter