ColdFusion Muse

Query Caching Run Amuck: Know Your FIFO Buffer

Query caching is one of those underutilized features of ColdFusion that can exponentially speed up your application. It is also one of those misunderstood features that, when used incorrectly can be very disappointing. Let me say at the start that the Muse believes you should use query caching. If you don't believe I'm a fan then check out my post titled, Good Developers Practice Safe Query Caching. It's not a panacea, but it definitely has it's uses. Almost every application has some queries that can be cached - and saving round trips to the database is the holy grail of application tuning. But in this post we want to talk about naughty developers who cache irresponsibly... developers who do not understand the nature of the FIFO buffer.

FIFO Buffer?

No, it's not the mouse from "American Tail" (We have a Pwan!). It stands for "First In First Out" and it is one of the controlling features of query caching. Remember that query caching is controlled by two things. First, each query has a "time to live" sort of value controlled by either the cachedwithin or cacheafter attributes of the cfquery tag. Second, each query is subject to be removed from the cache based on the FIFO buffer and the limit set in the ColdFusion Administrator. So, for example, if you set the number of queries to cache to 500 and your application tries to cache query number 501, the buffer looks for the "oldest" query (the one that was "first added to the cache" - "first in") and removes it. The next time that first query is run it will be added to the front edge of the rolling buffer and the oldest query will be removed - and so on.

Now this fairly sensible framework where a limit is set to the total number of queries able to be cached and the oldest query is the first to go (also known as a "Logan's Run framework") is put in place to conserve the balance of resources on the web server. If an application were allowed to cache thousands and thousands of queries then pretty soon retrieving information from the cache would in itself become a bottleneck. In fact, the CF Administrator won't allow you to set the cache limit higher than 4 digits - making the effective FIFO buffer maximum 9,999 queries.

Make Caching Unusable

The problem rears its ugly head when you don't understand the nature of this buffer and you cache too aggressively. For example, let's say you have a search engine running against the library of congress. One of the most popular features is to search by author. You decide some caching might help you, so your search query looks like this.

<cfquery name="getBooks" datasource="#dsn#"
        cachedwithin="#createtimespan(0,1,0,0)#">

    SELECT *
    FROM     locBooks
    WHERE    author = '#form.author#'
</cfquery>
Now this seems quite sensible doesn't it? If I search for Hemingway (presumably a popular author) my query is cached and the next fellow looking for Hemingway will pull his information from the cache. I would guess that the locBooks table doesn't change that much so I'm pretty sure I'm in ok caching for a full hour. I could probably even increase it to several hours - maybe even a day.

Does anyone see the flaw in my plan? If my little application becomes popular and dozens of searches are run each minute my buffer is going to fill up quickly. Each unique search takes up an entry in the cache no matter how obscure or trivial. Did Lindsay Lohan write her memoirs yet? Wasn't there a mathematician named Grenalda Smortsgrammer? How do you spell Kierkegaard? Many authors are going to be searched for just one time and you certainly don't want to waste space in the cache with them. Why? Because if the FIFO buffer fills up the queries you really want to keep in the cache will begin to get pushed out.

Not Just Theory

I'm not just blowing smoke here. I have a server with a large heap that contains an application that depends on caching very large queries of product data from Amazon. It began to underperform and upon examination I found caching being used for a search query like the one above - where the parameters passed to the query were largely unique. The FIFO buffer was set to the maximum of 9999, but it was filling up and my product data queries were being forced out by trivial search queries. Putting things back in order fixed the issue and now my application is singing a happy tune again.

Conclusion? When implementing a caching strategy make sure and take the size of the FIFO buffer and the uniqueness of your intended queries into account. The "Server Monitor" has a nice feature under Statistics/Database that will show you how many queries are being cached and allow you to browse them and sort by hit count.



The higher the hit count the more effective your strategy (meaning you are reducing round trips to the DB server). How do you know if you are failing? If you have mostly "1s" in the hitcount column and your buffer is full (the number of cached queries equals the maximum size of your FIFO buffer) then you are likely losing most of the benefit you might have accrued.

Personal Note

The muse has been failing his readers this year and not producing posts at a useful rate. On my list of goals for the fall is to post at 5-8 times per month, so stay tuned. Hopefully I will not disappoint.

Related Blog Entries

Comments
George Murphy's Gravatar Does, this same principle apply for query caching if you are using a framework like ColdBox?
# Posted By George Murphy | 9/20/10 11:24 AM
Mark Kruger's Gravatar @George,

Great question. No it does not apply to ColdBox nor does it typically apply to the various frameworks out there that abstract the DB calls for you. Those applications typically use various forms of caching in persistent scopes (application mostly). Of course they suffer from their own nuances and take careful planning and tuning as well. In my view you should never leave stuff up to a framework of any kind until you thoroughly understand the implications.

This post refers specifically to using the native cache attributes of the cfquery tag that ship with ColdFusion. Thanks for the input.
# Posted By Mark Kruger | 9/20/10 11:32 AM
Alycen's Gravatar @George - just as an addendum to what the Muse said in reply, the caching that frameworks apply, while useful, can also become problematic at some point since application and server scope variables are often stored in memory and have the possibility of eating up all your RAM if they get out of control, so that's just something to keep in mind if you are seeing issues with your server performance.
# Posted By Alycen | 9/20/10 7:19 PM
Brad's Gravatar Nice post. I've been using the query cacheing for years but I've never really thought about it in this much detail. In the coldfusion monitor I'm not seeing any templates in cached queries list. The graph in the query cache status is working. Any suggestions?
# Posted By Brad | 11/15/10 4:16 AM
Charlie Arehart's Gravatar I know this is an old entry, and like so many of Mark's, one that so many can learn a lot from.

But to answer the last comment from Brad, who wondered how to get the CF Enterprise Server Monitor (CFSM) to show the details of cached queries, it's that you need to turn on the "start profiling" feature in the CFSM.

To be clear, you do NOT need to turn on any of the "start" buttons to see the query cache status (tracking the number of cached queries), which is what Mark was referring to, but to see the details of what queries are cached, you do need to turn on "start profiling".

Note also that you only see queries that are cached AFTER that (not any that were cached before that was turned on).

As long as we're discussing that, it's worth noting also that you do want to be careful turning on "start profiling". Just as some know that turning on "start memory tracking" can be REALLY expensive, even for many sites turning on "start profiling" could also add considerable overhead. It just depends on the number of queries run: the profiling feature mostly adds overhead to queries (it not only tracks the details of cached queries, but indeed ALL queries run in CF).

Some people can leave profiling on all the time and have no impact, others could find it a problem immediately. There are no really good metrics or details, to beware.

But again, the query cache status page that Mark showed is "free", as you need not turn on any of the "start" buttons. There are indeed quite a few features of the CFSM that do not require turning on those buttons. I discuss those more in an old entry at http://carehart.org/blog/client/index.cfm/2007/6/1...

Hope those bits are helpful to future readers here. Thanks again, Mark.
# Posted By Charlie Arehart | 8/3/13 10:29 PM



Blog provided and hosted by CF Webtools. Blog Sofware by Ray Camden.