- Sanitise function
- Nik Coughin
March 2, 2005, 4:17 pm
frames, iframes (have I missed anything? any other dangerous html that
should be stripped?) and also prevents SQL attacks. If I have to I'll just
do a little research and write it myself, but always nice not to have to
reinvent the wheel. Something nice and simple, like $str = sanitise(
$str ); would be ideal.
"Come to think of it, there are already a million monkeys on a million
typewriters, and the Usenet is NOTHING like Shakespeare!" - Blair Houghton
Re: Sanitise function
of different places: between <script> tags, linked in by a <link> tag,
onXXXX handlers, href and src attributes, CSS declarations, and possibly
others. You also have to worry about <object> and <embed>. The rarely used
<base> tag can totally screw with your relative links. A <style> tag can
make everything disappear (with a rule on "body", say). Even inline style is
dangerous, since it allows someone to position an element anywhere on the
page--e.g. a fake toolbar that covers up the real one.
It's also very tricky to write regexps that look for these tags. Internet
Explorer will ignore char(0), for example, so a "<script..." with a char(0)
embedded in the tag name will still be interpreted as "<script...". And then
there are second-order attacks to watch
for, where the attack code is formed after an offending tag is removed (e.g.
"<scr<script> dummie = 0; </script>ipt> ... ").
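To make the second-order case concrete, here is a rough sketch of how a single-pass filter ends up building the very tag it was meant to remove. The function names naive_strip() and strip_until_stable() and the regexp are made up for illustration; they are not from any library, and this is not a recommended sanitiser.

```php
<?php
// Hypothetical naive filter: removes <script>...</script> blocks in ONE pass.
function naive_strip(string $html): string
{
    return preg_replace('#<script\b[^>]*>.*?</script>#is', '', $html) ?? '';
}

// Hypothetical variant: repeats the same removal until nothing changes.
function strip_until_stable(string $html): string
{
    do {
        $before = $html;
        $html = naive_strip($html);
    } while ($html !== $before);
    return $html;
}

// Input modelled on the example above: removing the inner block joins
// "<scr" and "ipt>" into a fresh <script> tag.
$input = '<scr<script> dummie = 0; </script>ipt>alert(1)</script>';

echo naive_strip($input), "\n";               // prints <script>alert(1)</script>
var_dump(strip_until_stable($input) === '');  // bool(true)
```

Looping until the string stops changing only closes this one hole. It does nothing about char(0), event handlers, or any of the other vectors listed earlier, so treat it as a demonstration of the problem rather than a fix.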
There are two reasonable approaches to this problem:
A. Don't allow HTML. Pass everything through htmlspecialchars() before
B. Look for tags that you do allow, replace them with placeholders (e.g. <b>
=> [[[b]]]), strip off all other tags, and change the placeholders back to