ColdFusion Muse
Posted At : March 20, 2008 8:56 PM | Posted By : Mark Kruger

List Delimiters and ColdFusion Magic

Here is one of those finicky nuances of ColdFusion that might surprise you. Many languages have list functions or something similar, and in many of them some version of split lets you specify any string as a delimiter, regardless of length. This might lead you to believe that you can use a multi-character string as a delimiter in ColdFusion list functions. Not only is this not the case, but the way delimiters behave can make you believe it is working when in fact it is not. Let me explain.

Let's start with some sample code. Consider this simple little list parsing scriptlet:

<cfset itemlist = 'joe3bob3harry3mary3Ann4jo-bob'/>

<cfloop list="#itemlist#" index="x" delimiters="3">
    <cfoutput>#x#<br></cfoutput>
</cfloop>

If you run this code it will display a list that looks like this:

joe
bob
harry
mary
Ann4jo-bob

The 3 is the delimiter and the list is exactly what you think it should be. So far there is nothing surprising. But now let's attempt a multi-character delimiter. Let's use "3-4" as our delimiter in the following example.

<cfset itemlist = 'joe3-4bob3-4harry3-4mary3-4Ann4jo-bob'/>

<cfloop list="#itemlist#" index="x" delimiters="3-4">
    <cfoutput>#x#<br></cfoutput>
</cfloop>

You would expect this to show:

joe
bob
harry
mary
Ann4jo-bob

But instead you get a very peculiar result. It will look like this:

joe
bob
harry
mary
Ann
jo
bob

What in the ham sandwich is going on here? As it happens, ColdFusion treats each of the three characters as a delimiter in its own right. Any time a 3, a 4 or a dash (-) shows up in the string, it is treated as a delimiter and a list item is identified. This will fool you because your compound delimiter will often look like it is working correctly. Why? Because the first character of your delimiter always ends a list item, and the remaining characters only produce empty items, which ColdFusion list functions ignore. If your delimiter characters rarely occur in the data, the output will usually look right. But you will end up with a hard-to-find bug that only rears its head when your tested code starts handling real data.
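A quick way to see this for yourself is to count the items with different delimiter strings. This is only a sketch using the same sample string; the listLen() calls are my illustration and were not part of the original test code.

```cfm
<!--- Same sample string as in the example above. --->
<cfset itemlist = "joe3-4bob3-4harry3-4mary3-4Ann4jo-bob">

<cfoutput>
    <!--- "3-4" and "4-3" are the same set of single-character delimiters,
          so the order of the characters makes no difference. --->
    #listLen(itemlist, "3-4")#<br>
    #listLen(itemlist, "4-3")#<br>
    <!--- A one-character delimiter splits only where that character occurs. --->
    #listLen(itemlist, "3")#<br>
</cfoutput>
```

The first two calls should both report 7 items (joe, bob, harry, mary, Ann, jo, bob), while the last reports 5, since only the 3s split the string and the "-4" stays attached to the start of the following item.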

Please note, the code above was tested on ColdFusion 7. ColdFusion 8 introduces lots of additional functionality (looping over a file a line at a time, for example) that may change this behavior. Oddly, I have never actually seen this list nuance blogged or discussed anywhere, but that could be an anomaly. In any case I will try to test it on CF 8 if I have the time and post an update.

Comments
Jeff Peters' book “ColdFusion Lists, Arrays and Structures” covers examples like these plus lots of other behavior you might not expect.
# Posted By Darrell | 3/20/08 10:41 PM
hmm.. I never even considered it could mean anything else. Every coldfusion function that has a "delimiters" attribute defines it as a list of delimiters. Some of them even have an "includeEmptyElements" attribute that would show the problem pretty clearly. Maybe setting your multi-character delimiter as a variable and then using that variable would work, but I doubt it... seems like you'd end up having to use a user defined function like Split() http://www.cflib.org/udf.cfm/split
# Posted By JC | 3/21/08 6:02 AM
CF8 results are identical to CF7. I just finished upgrading my last server to it, so thanks for the excuse to play. lol.

joe
bob
harry
mary
Ann4jo-bob

joe
bob
harry
mary
Ann
jo
bob
# Posted By JC | 3/21/08 6:05 AM
Yeah, I too think that it's common understanding that list functions deal with single-character delimiters. As JC says, the "delimiters" attribute is plural, indicating that each character is considered a delimiter by itself.
# Posted By Tom Mollerus | 3/21/08 8:17 AM
<cfset equation = "1*9+4-3/2.5">
<cfset operands = listToArray(equation,"+-*/")>
<cfset operators = listToArray(equation,"0123456789.")>

that's pretty cool
# Posted By Dan Roberts | 3/21/08 8:35 AM
@Dan,

Wow... that is a really useful example that I had not thought of - a way to get all the operators out of an equation... neato.
# Posted By mark kruger | 3/21/08 8:47 AM
In the past I would have checked each character individually or looped over a regular expression. This is so much better.

I looked at Java's docs and String.split() accepts a regular expression. That is just awesome and would eliminate a lot of looping over regular expressions in parsing.
# Posted By Dan Roberts | 3/21/08 8:53 AM
The same using String.split() would be the following. Heck, this may be exactly how CF is doing it under the hood. This also allows for using multiple lengths of delimiters.

<cfset equation = "1*9+4-3/2.5">
<cfset operands = equation.split("[+\-*/]")>
<cfset operators = equation.split("[0-9.]+")>

It does produce an extra empty element at the start of the operators array, though that is probably my fault somewhere in the expression. Also it is my understanding that it returns a slightly different variable type than CF generally uses for arrays.
# Posted By Dan Roberts | 3/21/08 9:19 AM
Hi Mark, this has caused me heartache since way back in the day... the easiest way I found to work with this scenario was to replace your multi-character delimiter with, say, a pipe '|' character just before you need to run list functions on it, such as replace(list,'multidelim','|','all'). Note the 'all' scope: without it, replace() only swaps the first occurrence.
# Posted By David Sirr | 3/26/08 6:34 AM
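The replace approach from the comment above can be sketched roughly as follows. The variable names sep and safelist are illustrative, and using chr(31) (the ASCII unit separator) as the stand-in delimiter is my assumption; any single character that never appears in your data would do.

```cfm
<cfset itemlist = "joe3-4bob3-4harry3-4mary3-4Ann4jo-bob">

<!--- Swap every occurrence of the multi-character delimiter for one
      character assumed never to appear in the data. --->
<cfset sep = chr(31)>
<cfset safelist = replace(itemlist, "3-4", sep, "all")>

<!--- The loop now splits only on that single character. --->
<cfloop list="#safelist#" index="x" delimiters="#sep#">
    <cfoutput>#x#<br></cfoutput>
</cfloop>
```

With the sample string this should print joe, bob, harry, mary and Ann4jo-bob, which is what the original "3-4" delimiter was meant to produce. Empty items between two adjacent delimiters are still skipped, as with any ColdFusion list.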


