XML parsing
XML is used for a few formats within Apertis, although not as many as JSON. It is more commonly used internally by various GLib systems and tools, such as GSettings and D-Bus. Where XML is parsed by Apertis code, it typically comes from untrusted sources (untrusted web APIs or user input), so it must be validated extremely carefully to prevent exploits.
Summary
- Use a standard library to parse XML, such as libxml2. (Parsing XML)
- Write an XML schema for each XML format in use. (Schema validation)
- Use xmllint to validate XML documents. (Schema validation)
Parsing XML
XML should be parsed using a standard library, such as libxml2. That will take care of checking the XML for well-formedness and safely parsing the values it contains. The output from libxml2 is a hierarchy of parsed XML elements, and the Apertis code must extract the data it requires from this hierarchy. Navigating this hierarchy is still security critical, as the parsed XML document may be well-formed yet not conform to the expected format (the schema for that document). Strings should be checked to see if they’re empty or invalid UTF-8; integer parsing should check for failure or unparsable characters; and the parser should report an error if required elements aren’t encountered or expected attributes are missing.
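The integer check described above can be sketched in plain C. This is a minimal illustration, not part of any Apertis or libxml2 API: the helper name parse_strict_int is hypothetical, and it assumes the value has already been extracted from the parsed tree as a NUL-terminated string (for example, an attribute value returned by libxml2's xmlGetProp(), cast from xmlChar * to const char *).

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: strictly parse a decimal integer taken from an XML
 * attribute or text node. It returns false instead of guessing when the
 * input is NULL, empty, starts with anything other than '-' or a digit
 * (so leading whitespace and '+' are rejected), has trailing characters,
 * or does not fit in an int. */
bool
parse_strict_int (const char *text, int *out)
{
  char *end = NULL;
  long value;

  if (text == NULL || *text == '\0')
    return false;

  /* strtol() silently skips leading whitespace, so reject it up front. */
  if (!(*text == '-' || (*text >= '0' && *text <= '9')))
    return false;

  errno = 0;
  value = strtol (text, &end, 10);

  /* No digits were consumed, or unparsable characters follow the number. */
  if (end == text || *end != '\0')
    return false;

  /* Overflow of long, or a value outside the range of int. */
  if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
    return false;

  *out = (int) value;
  return true;
}
```

A caller would treat a false return as a parse error for the whole document rather than substituting a default value. Strings obtained from xmlGetProp() must also be released with xmlFree() once they have been checked.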
Schema validation
Ideally, all XML formats will have an accompanying XML schema which describes the expected structure of the XML files. If a schema exists for an XML document which is stored in git (such as a GtkBuilder UI definition), that document can be validated at compile time, which can help catch problems without the need for runtime testing.
Schemas can be written in XSD (XML Schema) or RELAX NG. The choice is largely a matter of personal preference, as both are expressive enough to describe typical XML formats.
One tool for this is xmllint, which validates XML documents against schemas (XSD schemas with --schema, RELAX NG schemas with --relaxng). Given a schema called schema.xsd and an XML document called example.xml, the following Makefile.am snippet will validate the document against the schema at build time, when make check is run:
# Validate the XML files against the schema as part of 'make check'.
xml_files = example.xml

check-local: check-xml
check-xml: schema.xsd $(xml_files)
	xmllint --noout --schema schema.xsd $(xml_files)
.PHONY: check-xml
Various existing autotools macros for systems which use XML, such as GSettings, already automatically validate the relevant XML files.