10.2. Preventing XSS by Stripping HTML Tags

10.2.1. Overview

Note

Server-side data containing HTML was more common in our legacy code, but it can still occur. Avoid storing HTML in data wherever possible.

There are certain cases where you have server-side HTML and don’t want to see escaped HTML, but instead want to either:

  • Strip all HTML tags from your server-side data, or
  • Strip all HTML tags except safe HTML tags (e.g. “<br />”) from your server-side data.

In both cases, we use a library called bleach.

10.2.2. Mako filters for bleaching

At the time of writing, we do not yet have Mako filters for bleaching. Such filters would be very useful, and were described and requested on this PR. If you need them, please do the following:

  1. Check whether the filters have already been added.
  2. If not, implement them and add them to the xss-lint repo.
  3. Update this documentation to explain how to use the filters.

10.2.3. Strip all HTML tags

You would typically do this when people have entered HTML tags inside a field in the past, and you no longer want to support HTML, but you also don’t want escaped HTML tags to start appearing on the page.

Here is an example using bleach to strip all tags.
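
A minimal sketch of this, assuming the bleach package is installed. The helper name strip_all_tags and the sample string are illustrative only and do not come from any existing codebase:

```python
import bleach


def strip_all_tags(text):
    """Remove every HTML tag from text, keeping the text between the tags."""
    # tags=set() means no tag is allowed.
    # strip=True removes disallowed tags instead of escaping them.
    return bleach.clean(text, tags=set(), strip=True)


# Hypothetical legacy value that was entered with HTML markup.
legacy_title = "<b>Intro</b> to <i>Python</i>"
print(strip_all_tags(legacy_title))  # prints: Intro to Python
```

Only the tags are removed; the text inside them is kept, so the field degrades to plain text rather than showing escaped markup.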

10.2.4. Strip all but safe HTML tags

You would do this if you do want to let users include certain simple HTML tags, like “<br />”, in their input. Use this sparingly; plain text fields are much simpler to deal with.

Here is an example using bleach to only allow basic/safe supported tags.
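
A minimal sketch of this, again assuming the bleach package is installed. The allow-list SAFE_TAGS and the helper name strip_unsafe_tags are hypothetical; choose the smallest set of tags your field actually needs:

```python
import bleach

# Hypothetical allow-list of simple formatting tags (an assumption, not a
# list defined by this documentation).
SAFE_TAGS = {"br", "b", "i", "em", "strong", "p"}


def strip_unsafe_tags(text):
    """Keep only the tags in SAFE_TAGS and remove all other tags."""
    # attributes={} allows no attributes, so things like onclick are dropped.
    # strip=True removes disallowed tags instead of escaping them.
    return bleach.clean(text, tags=SAFE_TAGS, attributes={}, strip=True)


# Hypothetical user input mixing an allowed tag with a disallowed one.
user_input = '<b onclick="evil()">bold</b> <a href="http://x">link</a>'
print(strip_unsafe_tags(user_input))  # prints: <b>bold</b> link
```

Passing an empty attributes mapping is a deliberately strict choice in this sketch: the allowed tags survive, but no attribute on them does.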

In addition to cleaning the HTML in your data, you will probably need to turn off HTML escaping when outputting this data inside a template; otherwise the allowed tags will appear as escaped text. The following is an example of this in Mako.

## Sample Mako template with page-level HTML-escaping on by default
<%page expression_filter="h"/>

## Expression before:
${title}

## Expression after:
## Title was cleaned with bleach and is safe.
${title | n, decode.utf8}