30 Jul, 2015

Mitigating JavaScript context Cross-Site Scripting in PHP

by John Poulin

Cross-Site Scripting (XSS) is a vulnerability I personally spend a lot of time researching and writing about, largely because XSS is EVERYWHERE!

This post will demonstrate how we can mitigate JavaScript context XSS in PHP applications.

A Common Misunderstanding

When attempting to mitigate XSS, the solution is simple: output escape. Most developers know this. The problem, however, is that output escaping needs to occur within a specific context.

Web pages generally have three main contexts: HTML, JavaScript, and Cascading Style Sheets (CSS). Each context has its own syntax and its own control characters. For example, a browser parsing HTML treats an angle bracket (<) followed by a letter ([a-zA-Z]) as the start of a tag. In CSS, however, angle brackets have no context-specific meaning. In JavaScript, a semicolon (;) marks the end of a statement, whereas in HTML it may mark the end of an HTML entity.

The point here is that there is no magic function that will completely mitigate XSS across all contexts. Let us consider an example:
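The listing for this example is missing from this copy of the post. Judging from the json_encode variant shown later, it was presumably the one-liner var a = '<?= htmlentities($_GET['p']); ?>';. The sketch below reproduces that pattern as a function so it can be run from the command line. The function name renderWithHtmlentities is hypothetical, and ENT_COMPAT is passed explicitly as an assumption, to match the default flags htmlentities used before PHP 8.1.

```php
<?php
// Hypothetical helper: renders the inline-script line the way the
// vulnerable template would, with htmlentities applied to the raw input.
function renderWithHtmlentities(string $p): string
{
    // ENT_COMPAT mirrors the pre-8.1 default: double quotes are encoded,
    // single quotes are left untouched.
    return "var a = '" . htmlentities($p, ENT_COMPAT, 'UTF-8') . "';";
}

// Angle brackets are encoded, so tag injection is neutralised:
echo renderWithHtmlentities('<img src=x>'), "\n";
// prints: var a = '&lt;img src=x&gt;';

// Quotes and semicolons pass through, so the string literal is closed early:
echo renderWithHtmlentities("';alert(1);'"), "\n";
// prints: var a = '';alert(1);'';
```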

If you aren’t familiar with htmlentities, it is a function that converts all “applicable” characters into their HTML entity equivalents. More importantly, it is the most widely recommended function in PHP for mitigating XSS vulnerabilities.

In the above example, an attacker will not be able to inject arbitrary HTML tags, because htmlentities encodes the angle brackets. We can, however, inject JavaScript control characters such as single quotes and semicolons: called without flags on PHP versions before 8.1, htmlentities leaves single quotes untouched, and semicolons are never encoded. With the following payload, we can exploit the XSS vulnerability: ';alert(1);'

What’s the Solution?

Since the htmlentities function doesn’t help us, what does? Unfortunately, the answer isn’t that simple. Let us consider only JavaScript context XSS, forgoing CSS and other potential alternatives.

One particular candidate for an encoding function is json_encode. This function is designed to return the JSON representation of a particular value. Part of the JSON encoding process involves encapsulating the user input in double quotes. Any double quotes provided in the input are escaped by prepending a backslash (\).
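As a quick illustration of that behaviour, the snippet below prints what json_encode returns, with default flags, for a string containing double quotes and for the payload used earlier. The input strings are only examples.

```php
<?php
// Double quotes in the input are backslash-escaped, and the whole value
// is wrapped in double quotes.
echo json_encode('say "hi"'), "\n";      // prints "say \"hi\""

// Single quotes and semicolons are not special in JSON, so they are
// passed through as-is.
echo json_encode("';alert(1);'"), "\n";  // prints "';alert(1);'"
```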

Replacing htmlentities with json_encode, we have:

var a = '<?= json_encode($_GET['p']); ?>';

Providing the previous payload (';alert(1);') shows that the application is still vulnerable. Inspecting the page source, we can see that the input was wrapped in double quotes, as expected. But since the output of json_encode was rendered inside a single-quoted string literal, the double quotes are meaningless. A single quote is not a control character in JSON, so it is not escaped. As a result, the double quotes become part of the string content, the single quotes in the payload close and reopen the literal, and our payload executes.

Another attempt would be wrapping the original call in double quotes instead of single quotes:

var a = "<?= json_encode($_GET['p']); ?>";

When we attempt to execute our previously used payload, we notice that it doesn’t execute. Without being too quick to jump to conclusions, we inspect the page source.

Nothing appears to be encoded. In fact, the developer console shows a JavaScript error: the double quotes added by json_encode close the surrounding string literal, and the payload’s single quotes leave invalid syntax behind. Removing the single quotes from our payload (;alert(1);) resolves the JavaScript error, and our payload executes.

One more attempt. Let us try leveraging the json_encode function without wrapping the output in any string literals.

var a = <?= json_encode($_GET['p']); ?>;

Providing the original payload (';alert(1);') produces no JavaScript errors, and the payload does not execute. The output is wrapped entirely in double quotes, so it is a string literal and will not execute. Our next thought would be to try a modified payload, replacing the single quotes with double quotes: ";alert(1);". In this case too, the input is turned into a string literal, with each double quote escaped by a backslash (\"). Nothing is executed, and no JavaScript errors are reported. It appears that using json_encode without encapsulating it in string literals works.
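To compare the three placements side by side, the sketch below renders each one for the payloads discussed above. The helper renderVariants and its labels are hypothetical; it builds the same lines the templates would emit, so the results can be inspected from the command line instead of a browser.

```php
<?php
// Hypothetical helper: renders the inline-script line for each of the
// three placements, keyed by a descriptive label.
function renderVariants(string $p): array
{
    $json = json_encode($p); // default flags, as in the post's examples
    return [
        'single-quoted' => "var a = '" . $json . "';",
        'double-quoted' => 'var a = "' . $json . '";',
        'unwrapped'     => 'var a = ' . $json . ';',
    ];
}

foreach (["';alert(1);'", ';alert(1);', '";alert(1);"'] as $payload) {
    echo "payload: $payload\n";
    foreach (renderVariants($payload) as $label => $line) {
        echo "  $label: $line\n";
    }
}
```

For the first payload, the single-quoted line comes out as var a = '"';alert(1);'"'; (the alert runs), while the unwrapped line is var a = "';alert(1);'"; (a single harmless string).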

Scanning the application with both BurpSuite and XSSValidator, we conclude that the application is no longer vulnerable to XSS.

Conclusion

If you write PHP web applications that consume user input, you are at risk of being vulnerable to XSS. The htmlentities function is not the golden function that will mitigate all XSS vulnerabilities.

In a JavaScript context, leverage the json_encode function. As we demonstrated, though, this function can be used incorrectly, leaving the application vulnerable to XSS.

The json_encode function must be used on its own, without being wrapped in string literals. The example below demonstrates the use with string prefixes and suffixes in a manner that is not vulnerable to XSS.
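The original listing is missing from this copy of the post, so what follows is only one possible sketch of the idea, not the author's code. The helper name renderWithAffixes and the sample prefix and suffix are hypothetical. Each value is encoded separately with json_encode and the pieces are joined with JavaScript's + operator, so the encoded input never sits inside a hand-written string literal.

```php
<?php
// Hypothetical helper: emits a JavaScript assignment that joins a fixed
// prefix and suffix to untrusted input. Concatenation happens in
// JavaScript, outside of any surrounding quotes.
function renderWithAffixes(string $prefix, string $p, string $suffix): string
{
    return 'var a = ' . json_encode($prefix)
        . ' + ' . json_encode($p)
        . ' + ' . json_encode($suffix) . ';';
}

echo renderWithAffixes('Hello, ', "';alert(1);'", '!'), "\n";
// prints: var a = "Hello, " + "';alert(1);'" + "!";
```

In a template, the same shape would be written with the json_encode call placed directly between the + operators, with no quotes around it.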