12 June 2012

DeXSS -- Java program for removing JavaScript from HTML

Dynamic web sites which allow users to enter text content containing HTML are at risk for so-called cross-site scripting attacks.

A common approach taken to mitigate this risk is to allow some HTML content, but block content that is potentially harmful. One problem with a straightforward approach to blocking such content is that HTML parsing in browsers differs from the ideal, and nefarious individuals can take advantage of these differences to obscure content.
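
To make the weakness of a straightforward filter concrete, here is a minimal sketch in Java. It is not DeXSS code: the class name NaiveBlacklistDemo and the regular expression are hypothetical, chosen only to illustrate how a single-pass pattern filter can be tricked by obscured input and how it ignores script carried in attributes.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of DeXSS.
class NaiveBlacklistDemo {
    // A naive blacklist: delete anything that looks like a <script> or </script> tag.
    private static final Pattern SCRIPT_TAG = Pattern.compile("(?i)</?script[^>]*>");

    // Applies the pattern once over the input, as a simple filter might.
    static String naiveStrip(String html) {
        return SCRIPT_TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        // Removing the inner tags splices the outer fragments into a working script element.
        String nested = "<scr<script>ipt>alert(1)</scr</script>ipt>";
        System.out.println(naiveStrip(nested));

        // No script tag at all, so the filter leaves the event handler untouched.
        String handler = "<img src=x onerror=alert(1)>";
        System.out.println(naiveStrip(handler));
    }
}
```

Both inputs come out of the filter still able to run script, which is why a filter that works on parsed structure rather than raw text is the safer starting point.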

DeXSS uses TagSoup, an open-source HTML parser that attempts to mimic how web browsers work. TagSoup reads wild HTML and generates SAX2 events. DeXSS invokes TagSoup and follows it with a pipeline of SAX2 filters to remove HTML tags such as script and attribute values containing such scripts.
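
The idea of a SAX2 filter stage can be sketched with the JDK alone. The following is an illustration under stated assumptions, not the DeXSS implementation: the class names SaxPipelineSketch and ScriptFilter are hypothetical, the JDK's own XML parser stands in for TagSoup (so the input must be well-formed), and the blocking rules are deliberately minimal.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch; the JDK parser is an assumption standing in for TagSoup.
class SaxPipelineSketch {

    // One filter stage: drops script elements and script-bearing attributes.
    static class ScriptFilter extends XMLFilterImpl {
        private int skipDepth = 0; // greater than 0 while inside a dropped element

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            if (skipDepth > 0 || qName.toLowerCase(Locale.ROOT).equals("script")) {
                skipDepth++;
                return;
            }
            AttributesImpl kept = new AttributesImpl();
            for (int i = 0; i < atts.getLength(); i++) {
                String name = atts.getQName(i).toLowerCase(Locale.ROOT);
                String value = atts.getValue(i).trim().toLowerCase(Locale.ROOT);
                // Illustrative rules only: event handlers and javascript: URLs.
                if (name.startsWith("on") || value.startsWith("javascript:")) {
                    continue;
                }
                kept.addAttribute(atts.getURI(i), atts.getLocalName(i), atts.getQName(i),
                        atts.getType(i), atts.getValue(i));
            }
            super.startElement(uri, local, qName, kept);
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if (skipDepth > 0) {
                skipDepth--;
                return;
            }
            super.endElement(uri, local, qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            if (skipDepth == 0) {
                super.characters(ch, start, length);
            }
        }
    }

    // Parses the markup, runs it through the filter, and serializes the surviving events.
    static String clean(String wellFormedMarkup) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader parser = factory.newSAXParser().getXMLReader();
            ScriptFilter filter = new ScriptFilter();
            filter.setParent(parser);

            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(
                    new SAXSource(filter, new InputSource(new StringReader(wellFormedMarkup))),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Filtering failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(clean(
                "<div><p onclick=\"steal()\">hello</p><script>alert(1)</script></div>"));
    }
}
```

Because each stage is an ordinary XMLFilterImpl, further stages can be chained by setting one filter as the parent of the next, which matches the pipeline arrangement described above. A real deployment would need far more rules than the two shown here.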

DeXSS 1.2 is an Alpha release. You should be aware of the following issues:

  • This release implements a blacklist approach, which has advantages over a whitelist approach, but also has inherent risks. There are still a number of known XSS attacks that DeXSS does not yet detect.

  • DeXSS is aggressive about removing style attributes that fail the CSS analyzer. There are probably other CSS attacks that DeXSS does not protect against.

  • Elements that TagSoup thinks should be in the head are discarded by the default settings; changing the BODY_ONLY flag to allow head content will greatly reduce effectiveness. Consequently, DeXSS should not be used to parse entire user-provided HTML files, only fragments that are destined for inclusion in a page.

  • The output of DeXSS is intended for browsers, not for storage. As a result, some constructs may be overly verbose.

  • Configurability and test suites are lacking.

  • DeXSS does not specially handle any HTML5 elements or attributes not present in HTML4.

DeXSS API

DeXSS includes the following classes for direct use:

  • Test, a command-line utility for testing XSS removal.

  • DeXSS, which implements a string-to-string conversion of HTML, with XSS removal.

  • DeXSSParser, which can be used directly as a SAX2 parser to produce SAX2 events from an input stream.

  • DeXSSFilterPipeline, which can be used as a SAX2 filter if you have already used TagSoup to produce SAX2 events.

Official website: dexss.org
