The phrase "memory leak" can send shivers down the spine of the most seasoned developer. Having some process on your server that is glomming onto memory and failing to release it is a guaranteed all-nighter lurking somewhere in your future. Recently we were debugging a new, soon-to-be-released application and discovered what looked like a memory leak. The JVM memory used would climb steadily toward the maximum heap size. When the runtime garbage collection kicked in, it would reclaim only about a third of the increase. So, for example, memory use would climb from 300 MB to 600 MB, and then GC would reduce usage back to 500 MB, and so on. This situation would inevitably lock up the server with out-of-memory errors. What follows is a recap of our troubleshooting journey.
While CF Guru Mike Klostermeyer examined the code that was most commonly in play, I was tasked with examining the Java settings. The first thing I did was fiddle with manually firing garbage collection. One set of code I found looked like this:
I decided to try a "system" garbage collection. As I understand it, the runtime GC is a suggestion, but the system GC is a "stop the world" command. I whipped up the following code:
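This is not the ColdFusion code from our server, but the same experiment can be sketched in plain Java. The class and method names below (`GcProbe`, `requestGcAndMeasure`) are hypothetical. One caveat worth stating: the Java documentation describes `System.gc()` as effectively equivalent to `Runtime.getRuntime().gc()`, so treat either call as a request rather than a guarantee.

```java
// Hypothetical helper for illustration; not the original code from this post.
final class GcProbe {

    /** Bytes of heap currently in use, as reported by the JVM. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Requests a collection and returns {usedBefore, usedAfter} in bytes. */
    static long[] requestGcAndMeasure() {
        long before = usedHeapBytes();
        // A request only: the JVM may honor it, delay it, or ignore it
        // (for example when started with -XX:+DisableExplicitGC).
        System.gc();
        long after = usedHeapBytes();
        return new long[] { before, after };
    }

    public static void main(String[] args) {
        long[] sample = requestGcAndMeasure();
        long mb = 1024L * 1024L;
        System.out.printf("used before: %d MB, used after: %d MB%n",
                sample[0] / mb, sample[1] / mb);
    }
}
```

Comparing the two numbers over several runs gives a rough idea of how much memory a full collection can actually reclaim, which is the figure we were watching.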
Now in case you missed it in school or on Entertainment Tonight, Java divides the heap into "new" generation space and "old" (tenured) generation space. Without boring you with copious details, objects are created in the "new" generation space on the theory that most objects live a very short time. For example, when you create a variable in the local scope it survives until the request ends and then it can be safely deleted from the heap. So the vast majority of variables and objects in any code base survive only a short time and live their entire life in the herky-jerky world of the "new" generation. Objects that outlive a single request and survive a few collections (like application and session variables) get a buyout and are moved to the "old" generation space.
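To see the generations on a live JVM, the standard `java.lang.management` API exposes each heap pool by name. The sketch below is a minimal illustration; the class name `HeapPools` is hypothetical, and the pool names you get back depend on the collector in use (for example "PS Eden Space" and "PS Old Gen" with the parallel collector, or "G1 Eden Space" and "G1 Old Gen" with G1).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class HeapPools {

    /** Returns one line per heap pool: the pool name and the bytes in use. */
    static List<String> describeHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            lines.add(pool.getName() + ": " + usage.getUsed() + " bytes used");
        }
        return lines;
    }

    public static void main(String[] args) {
        describeHeapPools().forEach(System.out::println);
    }
}
```

Sampling these pools while the application is under load shows whether the growth is in the young pools (normal churn) or in the old generation, which is where a leak like ours would show up.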
A lot of the oddly named Java switches that we play with in the jvm.config file have to do with allocating memory or collecting memory on the "old" or the "new" heap space. For example, -XX:+UseParNewGC tells the JVM to use the parallel collector for cleaning up the new space. Anyway, the theory in our case was that the runaway memory allocation was occurring in "tenured" (old) memory. Mike and I began to work with the scopes that we considered candidates for "old" memory - application, server and session scoped objects and variables. After a day Mike finally found this snippet of code.
The purpose of this code is to retrieve a real-time stock quote in order to append the value to one of 12 or 13 studies and charts. Because we didn't want to get the quote 12 times in a row, we are storing the quote in the session and then accessing it from the subsequent (nearly simultaneous) requests. The "application.ChartDataObj" is a collection of methods with no properties attached. So the code above either pulls the data directly from the session, or creates it directly and references it in the session. In either case the goal of this code block is to create the "snapData" variable (an array) for use later on in the function. The variable "snapData" is correctly var-scoped at the top of the function.
When Mike removed all of this code and replaced it with just:
What can be gleaned from this exercise? Well, at least one "rule of thumb" for us is to carefully consider how we handle objects that are cached in persistent scopes. To boil it down to a single rule, it would be: "Avoid holding a reference in one persistent scope to an object returned from another; copy by value instead."
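The difference between the two approaches is easy to show outside ColdFusion. The Java sketch below uses two plain maps as hypothetical stand-ins for the application and session scopes; it is not our production code, and `ScopeCopyDemo` and its methods are made-up names. In CFML the nearest equivalent of the copy step would be the built-in `Duplicate()` function.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for persistent scopes; for illustration only.
final class ScopeCopyDemo {
    static final Map<String, Object> applicationScope = new HashMap<>();
    static final Map<String, Object> sessionScope = new HashMap<>();

    /** Stores the very same list object in the session (a shared reference). */
    static void cacheByReference(List<Double> quotes) {
        sessionScope.put("snapData", quotes);
    }

    /**
     * Stores an independent copy, so the session keeps no link to the source.
     * The copy is shallow, which is enough here because Double is immutable.
     */
    static void cacheByValue(List<Double> quotes) {
        sessionScope.put("snapData", new ArrayList<>(quotes));
    }

    public static void main(String[] args) {
        List<Double> quotes = new ArrayList<>();
        quotes.add(41.25);
        applicationScope.put("quotes", quotes);

        cacheByReference(quotes);
        System.out.println("same object: " + (sessionScope.get("snapData") == quotes));

        cacheByValue(quotes);
        System.out.println("same object: " + (sessionScope.get("snapData") == quotes));
    }
}
```

With the reference version, anything reachable from the session entry stays reachable for as long as either scope holds it; with the copy, each scope can be collected on its own schedule.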
In case you want a rundown of the final JVM arguments we arrived at through trial and error, we found the following to work well in our environment (your environment may be quite different):
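Whichever arguments you end up with, it helps to confirm that the running JVM actually picked them up. The following is a minimal Java sketch for that check, not our configuration; `JvmSettingsReport` is a hypothetical name, and it relies only on the standard `java.lang.management` API.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class JvmSettingsReport {

    /** The arguments the JVM was started with (for example -Xmx or -XX switches). */
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** Names of the garbage collectors the JVM is actually running. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Maximum heap size the JVM will attempt to use, in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("arguments:  " + inputArguments());
        System.out.println("collectors: " + collectorNames());
        System.out.println("max heap:   " + maxHeapBytes() / (1024L * 1024L) + " MB");
    }
}
```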
We are also indebted (as always) to the many fine gurus who help out cheerily on the email lists to which we subscribe. The following blog posts on Java and ColdFusion deserve honorable mention:
I'm curious though... since session scoped variables expire far sooner than application scoped variables, by referencing an application variable from a session variable does that extend the lifespan of the session variable? I would think not, but that's the only scenario I can think of that would explain the problem as you described it.
On a related note, why the need to session scope stock data anyway? If I configure my portal to view ADBE stock updates, wouldn't it be the same as every other user viewing ADBE updates? Therefore, keeping all stock updates in just the application scope makes them universally available to all clients, and perhaps the session scope could simply store a list of stock symbols to be pulled from the application scope rather than storing multiple copies of stock snapshots in the various sessions. Does that make sense?
We went round and round about this and tried several different ways of caching. The current method stores just one quote in each session (the latest), and it is only used for the subsequent requests that come in for the chart studies - a way of getting the latest data point. The data is useful for about 300 or 400 milliseconds, so additional caching would be moot - the quote ceases to be real time within seconds. The idea is to pull the latest quote and then use it 12 times inside of the next 400 milliseconds... The next request will discard it in favor of new data. It's a very short-term caching solution. I opted out of writing to the application scope so actively - following the mantra of the app scope - "write once, read many" :)
As for the session scoped variables living beyond the session - we have not seen any behavior like that to date - they definitely expire. My take is that the reference is marked for deletion on the tenured heap and only picked up by the main GC operation.
@Steven: I agree with your question about the best place to keep common data such as stock quotes. It sounds as if Mark and his colleagues decided that such data would be too stale to store, but for our application at work we store stock quotes in server scope with a 20-minute refresh, both to save processing time and to make sure that we have a backup data point available in case our quote feed goes down.