It’s time to take XML out back and shoot it

XML Anxiety

XML security problems are numerous, but you can take steps to limit your exposure – or you can use a different standard.

For this month’s column, I intended to write about XML security and how to avoid all the attacks and problems that can occur. I started making a list of issues, both well known and not so well known. After listing 20 items, I realized I wouldn’t have enough space to cover everything, so I moved on to plan B: Instead of focusing on the problems, I’d look at the solutions. This worked reasonably well until I ran into one small problem: Even if you use software like Python’s new defusedxml and defusedexpat, a number of problems are still difficult to deal with.

A Brief History of XML

XML came from the W3C (World Wide Web Consortium), which also brought us SOAP, several versions of HTML, you name it. XML itself is a simplified subset of SGML, which is an ISO standard rather than a W3C one. To say that XML and its related family of standards are complicated is a gross understatement: The family includes XML Schema, RELAX NG, XPath, XSLT, XML Signature, and XML Encryption, to name a few. XML has also been extended into XHTML, RSS, Atom, and KML, among other formats. About the only good news I have is that XML and most of its family of standards are NOT Turing complete (unlike, say, PostScript), but you can embed some pretty funky logic into XML files that can cause problems in the various XML parsers.
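To make the "funky logic" concrete, here is a minimal sketch of one such construct: entities declared in a document's own DTD that reference each other in a chain. It assumes Python's standard xml.etree.ElementTree parser; the helper name build_nested_entity_doc and the entity names e0, e1, and so on are made up for illustration. The document is deliberately tiny and harmless, but the same pattern with more levels is what makes entity expansion a problem for parsers.

```python
import xml.etree.ElementTree as ET


def build_nested_entity_doc(levels: int, fanout: int, seed: str = "ha") -> str:
    """Return a small XML document whose entities reference each other in a chain.

    Hypothetical helper for illustration only: e0 holds the seed text, and
    each later entity repeats the previous one `fanout` times.
    """
    decls = [f'<!ENTITY e0 "{seed}">']
    for i in range(1, levels + 1):
        refs = f"&e{i - 1};" * fanout
        decls.append(f'<!ENTITY e{i} "{refs}">')
    dtd = "\n  ".join(decls)
    return f"<!DOCTYPE demo [\n  {dtd}\n]>\n<demo>&e{levels};</demo>"


# Two levels with a fanout of three: small enough to be safe to parse.
doc = build_nested_entity_doc(levels=2, fanout=3)

# The parser expands the entity references while building the tree.
root = ET.fromstring(doc)
expanded = root.text

print(f"{len(doc)} characters in, {len(expanded)} characters of text out")
```

The expanded text grows by the fanout factor at every level (here, the seed repeated 3 × 3 = 9 times), so each extra entity declaration multiplies the output while adding only one short line to the input. Keep the levels and fanout small if you experiment with this.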