I promised some information from the seminar on application security by Shlomy Gantz. This post is the first of what I hope will be 3 or 4 posts unveiling some little-thought-about security issues in application programming. Of course we all know about cross site scripting (XSS), SQL injection attack (SIA) and acronym overload seizure (AOS). If you don't, you can find examples of the first two in part IV of my series on the Security Pyramid. In this article I'd like to explore what Shlomy called "HTML Injection". Now I knew that this little gem existed, but mostly I thought of it as an XSS attack, where a user is able to place JavaScript designed to steal information from other unsuspecting users into a page (as in my cookie example). What Shlomy did was much harder to detect.
Let's say you have a comment application. It's designed to let users add comments in a text box. Users can add whatever comments they like, and they can edit their own comments. The page might look something like this:
User | Comment | |
Bob | I have Blue eyes and I like walks on the beach | edit |
Mary | Bob has brown eyes. | edit |
Mary | Bob is a Freak. | edit |
Mary | Bob wears Superman underwear. | edit |
Bob | My nickname is Studly Doright. | edit |
Obviously Bob lives above the garage drinking Red Bull and playing Halo 3. But what Mary doesn't know is that Bob is a clever and geeky fellow who knows a thing or two about web programming in spite of the autographed picture of a sheep named Bambi hanging on his closet door. Bob is able to edit his own entries so he does the following.
The HTML renders like this:
User | Comment | |
Bob | I have Blue eyes and I like walks on the beach and My nickname is Studly Doright. | edit |
Anyway, using this technique a clever (and internally wounded and suffering in spite of his new pocket protector) Bob could wreak havoc on the site and control how virtually everything displays.
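Here is a minimal sketch of one way a result like this could be produced. It is an assumption, not necessarily what Shlomy demonstrated: Bob opens an HTML comment at the end of his first entry and closes it at the start of his last one, so everything in between (Mary's rows included) is commented out. The names `render_unsafe` and `visible_markup` are hypothetical, and the regular expression only approximates what a browser would display.

```python
import re

def render_unsafe(comments):
    """Build the comment table by pasting user text straight into the markup."""
    rows = ["<tr><th>User</th><th>Comment</th><th></th></tr>"]
    for user, text in comments:
        # No escaping here: this is the hole Bob exploits.
        rows.append(f"<tr><td>{user}</td><td>{text}</td><td>edit</td></tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"

def visible_markup(html):
    """Roughly approximate what a browser shows by dropping HTML comments."""
    return re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)

# Bob's two edited entries (assumed payloads) bracket Mary's comments.
comments = [
    ("Bob", "I have Blue eyes and I like walks on the beach and <!--"),
    ("Mary", "Bob has brown eyes."),
    ("Mary", "Bob is a Freak."),
    ("Mary", "Bob wears Superman underwear."),
    ("Bob", "--> My nickname is Studly Doright."),
]

page = render_unsafe(comments)   # what the server sends
shown = visible_markup(page)     # what visitors effectively see
print(shown)
```

Mary's comments are still in the database and still in the page source, which is part of what makes this hard to spot: nothing was deleted, and no script ever ran.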
As always, the fix is to control user input on the server side and validate everything that you plan on displaying later. Each input should be checked against a set of rules to make sure that it is what you believe it is supposed to be, especially on a web site where anonymous users produce content for others to see (forums, for example). It is not enough to merely strip out any "<script>" tags. If you want to allow HTML, then think about the rules carefully and make sure you have included some scrubbing code to handle Bob.
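As a rough illustration of the simplest version of that fix, the sketch below validates each comment on the server and escapes it at output time, so markup typed by a user is displayed as text instead of being interpreted. It assumes comments are meant to be plain text. The names `validate_comment` and `render_safe` and the 500-character limit are hypothetical choices for the example, and the comment-marker payload is an assumed attack, not one taken from the seminar.

```python
from html import escape

MAX_COMMENT_LENGTH = 500  # hypothetical rule; pick limits that fit your site

def validate_comment(text):
    """Apply simple server-side rules before a comment is accepted."""
    if not text.strip():
        raise ValueError("comment is empty")
    if len(text) > MAX_COMMENT_LENGTH:
        raise ValueError("comment is too long")
    return text

def render_safe(comments):
    """Build the comment table, escaping every user-supplied value."""
    rows = ["<tr><th>User</th><th>Comment</th><th></th></tr>"]
    for user, text in comments:
        # escape() turns <, >, & and quotes into entities such as &lt;
        rows.append(
            f"<tr><td>{escape(user)}</td><td>{escape(text)}</td><td>edit</td></tr>"
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>"

# The same kind of input as before: Bob tries to open and close an HTML comment.
comments = [
    ("Bob", "I have Blue eyes and I like walks on the beach and <!--"),
    ("Mary", "Bob has brown eyes."),
    ("Mary", "Bob is a Freak."),
    ("Mary", "Bob wears Superman underwear."),
    ("Bob", "--> My nickname is Studly Doright."),
]

page = render_safe([(user, validate_comment(text)) for user, text in comments])
print(page)
```

With escaping in place, Bob's markers show up on the page as literal characters and Mary's rows stay visible. Allowing a subset of HTML is a harder problem: it calls for an allowlist of tags and attributes plus checks that every tag is closed, which this sketch does not attempt.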