ColdFusion Muse

Application Variables Part II - Thread Safe Gotcha

Here's a tip on application variables from the inestimably knowledgeable Sean Corfield. In my previous post on Application Variables I used a pretty typical example of how they are initialized. I checked for the existence of a particular application variable. If it did not exist, I would lock and set all the variables. Sean pointed out that this approach is still subject to threading issues. To start with, here's the original example.

<cfif NOT isDefined('Application.dsn') OR isDefined('url.forceAp')>
    <cflock scope="APPLICATION" timeout="1">
        <cfset Application.dsn = 'myDatasource'>
        <cfset Application.imgPath = expandPath('./images')>
    </cflock>
</cfif>

Here's what could happen. Threads "A" and "B" hit the CFIF condition at the same time and both attempt the lock. One thread succeeds (let's say thread "A") and the other one waits for the lock to be free. As soon as it is free - meaning after thread "A" has already initialized the variables - thread "B" grabs the lock and initializes the variables again.
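The scenario above can be reproduced with a small test page. This is only a sketch under stated assumptions: it needs ColdFusion 8 or later (for cfthread and sleep()), it must run inside a named application so that the Application scope exists, and the names demoDsn, initCount and racer1/racer2 are made up for the demonstration. The pause is artificial and is there only to widen the window between the check and the lock.

```cfml
<!--- Hypothetical demo page; assumes ColdFusion 8+ and a named application --->
<cfset structDelete(application, "demoDsn")>
<cfset application.initCount = 0>

<cfloop from="1" to="2" index="i">
    <cfthread action="run" name="racer#i#">
        <!--- Both threads evaluate this before either one has set the variable --->
        <cfif NOT isDefined("application.demoDsn")>
            <!--- Artificial pause that widens the race window --->
            <cfset sleep(250)>
            <cflock scope="application" type="exclusive" timeout="5">
                <cfset application.demoDsn = "myDatasource">
                <cfset application.initCount = application.initCount + 1>
            </cflock>
        </cfif>
    </cfthread>
</cfloop>

<!--- Wait for both threads, then report how many times the block ran --->
<cfthread action="join" name="racer1,racer2" timeout="5000" />
<cfoutput>Initialized #application.initCount# time(s)</cfoutput>
```

Because both threads pass the CFIF before either one takes the lock, the page should typically report two initializations, although thread scheduling is never guaranteed.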

As Sean pointed out, this is irrelevant in my current example because I'm setting atomic (don't you love OO speak) variables that are static, so it doesn't really matter if thread "B" overwrites what thread "A" has just written. So let's use a better example. Let's say that you hold a table in memory that contains the IP addresses of the current users for security's sake. Here's the code.

<cfif NOT isDefined('Application.curIps') OR isDefined('url.forceAp')>
    <cflock scope="APPLICATION" timeout="1">
        <cfscript>
            curIps = queryNew("IP");
            queryAddRow(curIps);
            querySetCell(curIps,"IP",cgi.remote_addr);
            Application.curIps = curIps;
        </cfscript>
    </cflock>
</cfif>

Other code (presumably) checks the IP address and takes appropriate action. If thread "A" seizes the lock and writes its IP into the query, but thread "B" comes along and overwrites it, you have the potential for the user of thread "A" to run amok on your site without having his or her IP logged in the query. To avoid this, Sean suggested this:

<!--- check condition --->
<cfif NOT isDefined('Application.curIps') OR isDefined('url.forceAp')>
    <!--- Use a named lock --->
    <cflock name="ipInitialize" timeout="1" type="exclusive">
        <!--- check to make sure condition is still true --->
        <cfif NOT isDefined('Application.curIps') OR isDefined('url.forceAp')>
            <cfscript>
                curIps = queryNew("IP");
                queryAddRow(curIps);
                querySetCell(curIps,"IP",cgi.remote_addr);
                Application.curIps = curIps;
            </cfscript>
        </cfif>
    </cflock>
</cfif>

In this case thread "A" will initialize the query, but thread "B" will not overwrite it, because once thread "B" holds the lock it checks again to be sure that the variable still doesn't exist.

A couple of notes. First, the code above is theoretical. Obviously other factors could make the overwrite issue a non-factor. It's only there to illustrate how merely checking and locking does not make your code thread safe. Secondly, I'm not at all sure if this code applies to the onApplicationStart() method of an Application.cfc file. I suspect that the framework model sorts out the threads for you so that the onApplicationStart() method only runs once, for one thread. If you have any insight on this - feel free to comment.
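For comparison, here is a minimal sketch of the same initialization moved into an Application.cfc. It assumes the framework really does run onApplicationStart() for only one thread, as suspected above, which is why it contains no CFLOCK. The application name threadSafeDemo is hypothetical.

```cfml
<!--- Application.cfc: a sketch, assuming onApplicationStart() is run by a single thread --->
<cfcomponent output="false">

    <!--- Hypothetical application name --->
    <cfset this.name = "threadSafeDemo">

    <cffunction name="onApplicationStart" returntype="boolean" output="false">
        <!--- Local query so nothing leaks into a shared scope before it is complete --->
        <cfset var curIps = queryNew("IP")>

        <cfset application.dsn = "myDatasource">
        <cfset application.imgPath = expandPath("./images")>

        <cfset queryAddRow(curIps)>
        <cfset querySetCell(curIps, "IP", cgi.remote_addr)>
        <cfset application.curIps = curIps>

        <cfreturn true>
    </cffunction>

</cfcomponent>
```

A manual refresh such as the url.forceAp check runs outside this method, so it would still need the check-lock-check pattern shown earlier.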

9 Comments

  • Posted By Raymond Camden | 8/22/05 10:07 AM
    just an fyi about your very last code example - you won't be able to refresh from the query string because you don't check for url.forceAp in the second cfif.
  • Posted By Mark | 8/22/05 10:13 AM
    Ray - Thanks, I added the additional check.
  • Posted By Brian Kotek | 8/22/05 10:34 AM
    Just FYI, this is known as "double-checked locking" (in Java for example).
  • Posted By Mark | 8/22/05 10:47 AM
    I did not know that - thanks.
  • Posted By Barney | 8/22/05 11:27 AM
    This is irrelevant in onApplicationStart, because at most one thread will ever execute that method, and all other concurrent threads will have to wait for the method to return before they can move on. So CF is taking care of this whole issue for you. Of course, if you need to refresh the app manually, then you still have to deal with it (since you can't use onApplicationStart except when the application starts up).
  • Posted By mark | 8/22/05 11:34 AM
    Barney - thanks for that update. I suspected as much but I did not know for sure.
  • Posted By Chris | 10/20/05 8:35 AM
    >> "you can't use onApplicationStart except when the application starts up"

    One of my current applications uses onApplicationStart to initialize the application (load application settings from an xml file). One of the application variables it creates/populates is initializationDateTime.

    For every page request, the initialization date/time is checked against the modified time of the xml config file. If the xml config file has been modified since the application last initialized, the application reinitializes itself. I do this by calling the onApplicationStart() function directly from onRequestStart, e.g.

    <cfif bReinitApp>
       <cfset onApplicationStart() />
    </cfif>

    By the time the onRequestStart function acts, the onApplicationStart has already run (if it needed to), so I won't have to worry about concurrency issues there. But I might have to worry about page requests colliding, so I might think about locking the reinit code. But it also looks like this is another example where it doesn't matter that one thread is overwriting another, in which case it might be better to leave out the locking in the interest of application efficiency, so page execution isn't waiting on locks to resolve themselves.

    Are there any best practices out there that state which approach is preferable?

    I also wanted to point out that it is possible for onApplicationStart to act more than once. And so I was also wondering, is it a suggested best practice to not call these event driven functions explicitly?
  • Posted By Mark | 10/20/05 9:42 AM
    Chris - If I understand you correctly I would agree with you on how to approach this.

    But I wonder about the approach in general. Does it mean you have to do a check of the modified date of the XML file with every request? The only concern would be how often that xml file changes and what values within it are changing. In other words, if there are 20 attributes in the XML file and they change every 10 minutes en masse - I'd say your approach fits the bill. However, if there are 20 attributes in the XML file and only 2 or 3 of them change I would create a different function for handling those 2 or 3 (with appropriate locking) and call it to explicitly set these Ap. variables. Otherwise it seems like you are using "onApplicationStart()" to "manage" the entire scope when it is really designed to "initialize" the scope.

    That's my take :)
  • Posted By Barney | 10/20/05 11:21 AM
    Chris, you don't want to go calling onApplicationStart explicitly. Rather, make a private method that is called by both onApplicationStart and onRequestStart. That'll make it a lot clearer what's happening. Then you should use CFLOCK to single thread access to the call in onRequestStart (because of concurrency issues), but you needn't do it in onApplicationStart, because that method is inherently thread-safe. So in onRequestStart, something like this:

    <cfif appNeedsReinitialization()>
        <cflock scope="application" type="Exclusive" timeout="1">
            <cfif appNeedsReinitialization()>
                <cfset initializeApp() />
            </cfif>
        </cflock>
    </cfif>