ColdFusion Muse

Troubleshooting ColdFusion 4 and 5 servers

Finding the cause of something as generic as a CPU spike can be a frustrating task. In my experience it is best to start with "known" issues and get them out of the way. First, ensure that the OS is up to date with all patches (SP4 if you are using Windows 2000). Likewise, the RDBMS server should be up to date, both its OS and the database application itself (SQL Server 2000 SP2, for example). Next, examine the CF server and make sure that it is up to date as well. Be careful: make sure that every patch is installed and tested for compatibility on the dev server first. I never install a patch on the first day it is out, or without trying it on the dev server first. Patches get rolled back too. The current CF 5 version is 5.0, but there is an extensive list of hot fixes that may or may not be germane to your installation. Here's a link to Macromedia's Patch List.

I always start with issues that are "known bugs and workarounds" because they are easy to check or discover and have known fixes.

Known Issues

  1. The dreaded CFMAIL bug - This bug occurs (listen carefully now - this is tricky) when a file handle for a mail file is created, but the process that writes the file never completes. It is most common when you are sending out individual mail files on a busy server. An error that occurs later in processing, after the call to cfmail, interrupts the process and keeps the file from being written, but the file handle and the entry in the file directory still exist. The result is a "zero length" cfmail file. It's easy to check: go to the %cf root%\mail\spool directory and see whether a file with a zero byte size exists there. If one does, take the steps below to fix it.
    • Install a batch file for repair - one that stops the CF application server, clears the spool directory and restarts the service. It looks like this:
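      @ECHO OFF
      REM Replace the drive letter below with the correct drive
      REM for your \CFUSION installation folder.
      REM Stop the CF services before touching the spool folder.
      NET STOP "Cold Fusion Application Server"
      NET STOP "Cold Fusion Executive"
      C:
      CD \CFUSION\MAIL\SPOOL
      REM Create the holding folder only if it is not already there.
      IF NOT EXIST TEMP MKDIR TEMP
      REM Move every spooled message out of the spool folder.
      MOVE *.CFMAIL TEMP
      NET START "Cold Fusion Application Server"
      NET START "Cold Fusion Executive"
      ECHO Operation Complete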
    • Check the error log - specifically, you are looking for errors that occur after a call to CFMAIL.
    • Check disk usage - If your disk is thrashing (swapping heavily and queuing read/write requests), it's time to consider more RAM and/or a better disk subsystem. If the disk is nearly full you may need a bigger disk, or you may need to configure it differently.
  2. Verity Issues - If you are using Verity and you don't have an adequate maintenance plan for the collection(s), you may experience performance issues. Make sure you are reindexing the collections periodically. If you still have issues you can take the following escalating steps.
    • Reindex, purge, or delete and re-add the collections
    • Move to CFMX - CFMX actually has some Verity improvements.
    • Move to the Verity K2 Server
  3. Sequential file operations - If you have an application that makes extensive use of file operations, particularly reading, writing and/or appending a sequence of files in a single request, you may experience a server spike. Remember that some CF operations use the file system without being cffile: cfhttp and cfftp, for example. A request that manages several files in sequence will naturally run longer than a typical HTTP request, so you already have a thread that you can expect to last longer than normal. When you use cffile you create multiple file handles (pointers) and send requests to the disk system. Since the disk cannot fulfill every request simultaneously, it queues the requests as well. That means a single request for multiple file operations can spike the processor. If any of those threads hang, clock cycles are now concentrated on a thread that cannot be released. How do you solve this?
    • Use named CFLOCKs around file operations. This way you can restrict the process to a single thread that must execute sequentially.
    • Consider a different solution - CF is probably not the best choice for batching large groups of files for data import/export and the like.
  4. Log files (and cfam) - CF can be configured to do a lot of logging. It's important to remember that CF must append to the log file with each logging operation, and the larger the log file, the more difficult this may be. Two steps will help.
    • Purge the log files periodically.
    • Disable and/or patch the CFAM service - This is the "management repository service" that runs with the JRun VM. I have not found it to be a terribly useful service in CF 5 unless you are using Enterprise in a cluster along with the deployment and archiving service.
  5. Client Variables in registry - If the application in question makes extensive use of client variables and you have not altered the default handling of client variables, they are being stored in the registry. Check the registry size and see if it is unduly large. Here are some tips to solve this problem.
    • Raise the maximum size ceiling of the registry to accommodate the variables.
    • Possibly run the "clean" tool to remove client variables. Macromedia has one, but Edge web hosting has a much faster and cleaner removal tool. (Edge web)
    • Possibly move client variables to a database instead. Actually, if you are going to make extensive use of client variables this is the preferred solution.
  6. Locking - Yes, it's true. Failure to lock session and application variables is a very likely suspect. Each session and application write should have an exclusive lock and each read should have a read-only lock (the need for this second step is mitigated in CFMX).
    • Search the code base for session and/or application variables and ensure they are locked.
    • Set the flag for "enforce strict validation" in the CF Administrator and check the log for locking-related errors. NOTE: do this on a dev server. If you do it on a live server it may bring down your app until you fix the errors.
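
The zero-length spool check from item 1 is easy to automate. Here is a minimal sketch in Python, not part of ColdFusion itself: the function name `find_zero_length_mail` is my own, and the spool path in the usage section is only an assumed default taken from the batch file above, so adjust it to match your installation.

```python
from pathlib import Path


def find_zero_length_mail(spool_dir):
    """Return the zero-byte *.cfmail files in spool_dir, sorted by path.

    Hypothetical helper: a missing spool folder yields an empty list.
    """
    spool = Path(spool_dir)
    if not spool.is_dir():
        return []
    return sorted(
        p for p in spool.iterdir()
        if p.is_file()
        and p.suffix.lower() == ".cfmail"
        and p.stat().st_size == 0
    )


if __name__ == "__main__":
    # Assumed location; change the drive and folder for your install.
    for path in find_zero_length_mail(r"C:\CFUSION\MAIL\SPOOL"):
        print(path)
```

If this prints any file names, the repair batch file above is the next step. Running it from a scheduled task would let you spot the problem before mail backs up.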
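
The code search from item 6 can be roughed out with a script as well. The sketch below is a line-based heuristic, not a CFML parser: it counts `<cflock>` open and close tags and reports lines that touch `session.` or `application.` variables while no lock is open. The function name `find_unlocked_references` is hypothetical. Expect both false positives and misses (for example, code that follows a closing `</cflock>` on the same line is not flagged), so treat the output as a list of places to review by hand.

```python
import re

LOCK_OPEN = re.compile(r"<cflock\b", re.IGNORECASE)
LOCK_CLOSE = re.compile(r"</cflock\s*>", re.IGNORECASE)
SHARED_SCOPE = re.compile(r"\b(session|application)\.\w+", re.IGNORECASE)


def find_unlocked_references(cfml_text):
    """Return (line_number, line) pairs that reference session or
    application variables while no <cflock> block is open."""
    depth = 0
    hits = []
    for number, line in enumerate(cfml_text.splitlines(), start=1):
        opens = len(LOCK_OPEN.findall(line))
        closes = len(LOCK_CLOSE.findall(line))
        # A lock opened on this line is treated as covering the whole line.
        if depth + opens == 0 and SHARED_SCOPE.search(line):
            hits.append((number, line.strip()))
        # Never let unbalanced closing tags push the depth below zero.
        depth = max(depth + opens - closes, 0)
    return hits


if __name__ == "__main__":
    sample = (
        '<cfset session.user = "bob">\n'
        '<cflock scope="session" type="exclusive" timeout="10">\n'
        "  <cfset session.cart = 0>\n"
        "</cflock>\n"
    )
    for number, line in find_unlocked_references(sample):
        print(number, line)
```

To scan a real code base you would read each .cfm file and pass its text to the function. That loop is left out to keep the sketch short.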

Of course there are also external causes of server performance problems - a topic for another blog.


Comments

  • Posted by Douglas Spooner | 8/30/06 11:49 AM
    Where can I get hold of the tool you mention here, we have a very large hive thats causing problems with our server due to the fact the variables are stored in the registry.(the edge site is now a spam link site)

    Possibly run the "clean" tool to remove client variables. Macromedia has one, but Edge web hosting has a much faster and cleaner removal tool. (Edge web)
  • Posted by Edmond Choy | 4/30/08 6:54 PM
    I found the link to the tool at http://www.edgewebhosting.net/About_Us/Quick_Fix/