ColdFusion Muse

Matching MIME and Extension After Upload

Muse Reader Asks:
How can I check that the extension of an uploaded file matches the mime type?

If you read my previous post, I presume what you mean is "how do I make sure the file is what it says it is". Unfortunately this is not as easy as it seems. MIME, as it turns out, has little to do with the contents of the file. It's a label attached to content that tells the receiving software what to do with that content after it's been transmitted. In fact, MIME was originally designed as a way of setting up boundary containers in an email message to facilitate attachments. As you may know, SMTP was built to carry 7-bit character data - not binary data. When you send an attachment it is actually encoded as character data (typically base64). The MIME type tells the receiving client to put the file back together as a certain type of file. That's pretty much where MIME ends...
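To make the point concrete, here is a minimal sketch using Python's standard email package (Python is used only for brevity; nothing about it is specific to ColdFusion). The payload, subject and file name are made up for illustration. It attaches bytes that are plainly not an image, labels them image/png, and the label travels along unchallenged while the bytes themselves go out as base64 text.

```python
from email.message import EmailMessage

# Hypothetical payload: bytes that are plainly not a PNG image.
payload = b"This is not really a PNG."

msg = EmailMessage()
msg["Subject"] = "MIME label demo"
msg.set_content("See attached.")

# The sender picks the label; nothing checks it against the bytes.
msg.add_attachment(payload, maintype="image", subtype="png", filename="photo.png")

# On the wire the message is text: the attachment is base64 between boundaries.
wire_text = msg.as_string()

# The receiving side reads the label and reassembles the original bytes.
attachment = next(msg.iter_attachments())
declared_type = attachment.get_content_type()
transfer_encoding = str(attachment["Content-Transfer-Encoding"])
decoded = attachment.get_content()

print(declared_type, transfer_encoding, decoded)
```

The declared type comes back exactly as the sender wrote it, which is why the label alone tells you nothing reliable about what is inside the file.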

MIME types are also used in HTTP, which borrowed the same Content-Type labels from email to describe the body of a request or response. At some point MIME came to mean application mapping - the program set up on a given system to handle a particular file type. On most desktop systems the type is determined by the extension: the system will attempt to open a file of a given extension with whatever application is registered for it. Perhaps you've had someone send a Word doc from his or her Mac to your PC via email and the file arrives with a DAT extension. You can choose "open with", or you can rename the file to change the extension, and it will open. But if you just double-click it, the system will complain that there is no program set up to open this type of file. The same goes for uploads: the MIME type your server sees is whatever the browser chose to send, and browsers usually base it on the extension too.
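A quick way to see that the mapping hangs on the name and nothing else is Python's standard mimetypes module, which looks up a type from the extension without ever opening the file. This is only an illustrative sketch; the helper name type_from_name and the file names are my own.

```python
import mimetypes

def type_from_name(filename):
    """Return the MIME type implied by the file name's extension, or None."""
    guessed_type, _encoding = mimetypes.guess_type(filename)
    return guessed_type

# Same hypothetical file under two names: only the extension drives the answer.
as_pdf = type_from_name("quarterly-report.pdf")
as_png = type_from_name("quarterly-report.png")
no_ext = type_from_name("quarterly-report")

print(as_pdf, as_png, no_ext)
```

Rename the file and the "type" changes with it, which is exactly why comparing the extension to the MIME type proves very little.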

This brings us to your problem. The only way to determine whether a file really is of the type it claims to be is to examine its contents. The clues that let you in on the secret are different for every file type. A CAD file, for example, might include a header of a certain length with sections for this or that. I'm not sure you would be able to do this in a generic enough fashion for it to make sense. Having said that, if you are concerned with images you can use one of the image CFCs and a try/catch block to query the attributes of the file. If the image CFC chokes, it's likely there is something wrong with the file. You could probably come up with Java classes that work with a number of file types. If you are trying to support a limited number of types this could be a good solution.
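For a limited list of types, one common way to "examine the file" is to compare its first few bytes against known signatures (often called magic numbers). The sketch below is in Python for brevity and is only a starting point: the table covers a handful of formats, and the function names are my own, not part of any library. The same idea ports to a Java class or a CFC that reads the first bytes of the upload.

```python
import os

# Leading "magic" bytes for a few common formats, mapped to the extensions
# they usually carry. Illustrative only, not an exhaustive table.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", {".png"}),
    (b"\xff\xd8\xff", {".jpg", ".jpeg"}),
    (b"GIF87a", {".gif"}),
    (b"GIF89a", {".gif"}),
    (b"%PDF-", {".pdf"}),
    # ZIP containers: plain zip archives plus formats built on zip.
    (b"PK\x03\x04", {".zip", ".docx", ".xlsx", ".pptx", ".jar"}),
]

def extensions_for(header):
    """Return the set of extensions whose signature matches the leading bytes."""
    matches = set()
    for magic, extensions in SIGNATURES:
        if header.startswith(magic):
            matches |= extensions
    return matches

def extension_matches_content(filename, header):
    """True if the claimed extension is consistent with the leading bytes.

    Unrecognized content yields False, so unknown files are rejected.
    """
    claimed = os.path.splitext(filename)[1].lower()
    return claimed in extensions_for(header)

def read_header(path, size=16):
    """Read the first few bytes of a file on disk."""
    with open(path, "rb") as handle:
        return handle.read(size)
```

After the upload lands on disk you would call something like extension_matches_content(client_name, read_header(saved_path)). Keep in mind that a matching signature only shows the file starts the way that type should; it does not prove the rest of the file is valid, so the try/catch approach is still worth having for images.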

I certainly don't know everything about this topic. Perhaps an enterprising reader can post an elegant solution so we can glean from it.

NOTE: See this previous post on MIME Types for information on how to determine the associated MIME type for a given extension.

Comments
On Linux, you can shell out and call the 'file' program on a given file and it will examine it and can often tell you what kind it is. I don't know how many different kinds of files it can identify, but with some quick testing, I could find only one type it couldn't handle (a SmartDraw document). Here is what it came up with:

databse.mdb: Microsoft Access Database
courses2.csv: ASCII text, with CRLF line terminators
ep015.mp3: MP3 file with ID3 version 2.2.0 tag
plan.doc: Microsoft Office Document
smart draw doc.sdr: data
ScoringKey.pdf: PDF document, version 1.2
smartdraw_trial_1346.exe: MS-DOS executable (EXE), OS/2 or MS Windows
website - server guidelines.html: HTML document text
# Posted By Ryan | 8/10/06 10:55 AM



Blog provided and hosted by CF Webtools. Blog Software by Ray Camden.