For the majority of PHP applications, htmlspecialchars() will be your best friend. htmlspecialchars() supplied with no arguments will convert special characters to HTML entities, below shows the conversions performed:
'&' (ampersand) becomes '&' '"' (double quote) becomes '"' '<' (less than) becomes '<' '>' (greater than) becomes '>'
Eagle eyed readers may notice this does not include single quotes. For this reason we recommend that htmlspecialchars() is always used with the ‘ENT_QUOTES’ to ensure single quotes will be encoded. Below shows the singe quote entity conversion:
"'" (single quote) becomes ''' (or ')
htmlspecialchars() vs htmlentities()
Another function exists which is almost identical to htmlspecialchars(). htmlenities() performs the same functional sanitization on dangerous characters, however, encodes all character entities when one is available. This may lead to excessive encoding and cause some content to display incorrectly if character sets change.
http://www.example.com/view.php?name=te"st [...] <script> var = "te\"st "; // addslashes() displayname(var); </script>
<script> var = "test1</script><script>alert(document.cookie);</script>"; displayname(var); </script>
Where Entity Encoding Fails
We talked before about considerations for the location of data, and will go over some examples where entity encoding with htmlspecialchars() is not enough. One of the most common examples of this is when data is inserted within the actual tag or attribute of an element.
This is just one of many somewhat rare situations where extremely strict filtering is required. For an in depth look at many injection scenarios and their prevention methods, take a look at the OWASP XSS Prevention Cheat Sheet.
Third Party PHP Libraries
Virtue Security makes no recommendation or provides any warranty for third party products or software; however, we are aware that several third party PHP libraries are commonly used to assist in XSS prevention. Below are projects that may assist developers building suitable whitelists:
HTML Purifier – http://htmlpurifier.org/
PHP Anti-XSS – https://code.google.com/p/php-antixss/
htmLawed – http://www.bioinformatics.org/phplabware/internal_utilities/htmLawed/
Other Things to Remember