JSON parsing
JSON is used for various formats within Apertis, and potentially also for various web APIs. It is a well-defined format, and several mature libraries exist for parsing it. However, the JSON being parsed typically comes from untrusted sources (user input or untrusted web APIs), so it must be validated extremely carefully to prevent exploits.
Summary
- Use a standard library to parse JSON, such as json-glib. (Parsing JSON)
- Be careful to pair up JSON reader functions on all code paths. (Parsing JSON)
- Write a JSON schema for each JSON format in use. (Schema validation)
- Use Walbottle to validate JSON schemas and documents. (Schema validation)
- Use Walbottle to generate test vectors for unit testing JSON reader code. (Unit testing)
Parsing JSON
JSON should be parsed using a standard library, such as json-glib. That will take care of checking the JSON for well-formedness and safely parsing the values it contains. The output from json-glib is a hierarchy of parsed JSON nodes which may be values, arrays or objects. The Apertis code must then extract the data it requires from this hierarchy. This navigation of the hierarchy is still security critical, as the parsed JSON document may not conform to the expected format (the schema for that document). See Schema validation for more information on this.
When using json-glib, the JsonReader object is typically used to navigate a parsed JSON document and extract the required data. A common pitfall is failing to pair calls to json_reader_read_member() and json_reader_end_member(). For example:
```c
gint64
read_some_member (JsonReader *reader)
{
  gint64 retval;

  /* This code is incorrect. */
  if (!json_reader_read_member (reader, "member-name"))
    {
      return -1;
    }

  retval = json_reader_get_int_value (reader);
  json_reader_end_member (reader);

  return retval;
}
```
This code is incorrect because json_reader_end_member() is not called on the code path where the member-name member does not exist. That leaves the JsonReader in an error state, and all subsequent read operations on it will silently fail.
Instead, the following should be done:
```c
gint64
read_some_member (JsonReader *reader)
{
  gint64 retval = -1;

  if (json_reader_read_member (reader, "member-name"))
    {
      retval = json_reader_get_int_value (reader);
    }

  /* Safe to call even if read_member() failed: it clears the reader's
   * error state. */
  json_reader_end_member (reader);

  return retval;
}
```
The same is true of other paired APIs, such as json_reader_read_element() and json_reader_end_element(). Read the API documentation for json-glib functions carefully to check whether a function puts the JsonReader into an error state on failure and, if so, how to get it out of that error state.
Schema validation
Ideally, all JSON formats will have an accompanying JSON schema which describes the expected structure of the JSON files. A JSON schema is analogous to an XML schema for XML documents. If a schema exists for a JSON document which is stored in git (such as a UI definition), that document can be validated at compile time, which can help catch problems without the need for runtime testing.
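For illustration, a minimal JSON Schema (draft 4) for a hypothetical document containing a single non-negative integer count member could look like the following; the member name and constraints are invented for this example, not part of any Apertis format:

```json
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "count": {
      "type": "integer",
      "minimum": 0
    }
  },
  "required": [ "count" ]
}
```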
One tool for this is Walbottle, which allows validation of JSON documents against schemas. Given a schema called schema.json and two JSON documents called example1.json and example2.json, the following Makefile.am snippet will validate them at compile time:
```make
json_schema_files = schema.json
json_files = example1.json example2.json

check-local: check-json-schemas check-json

check-json-schemas: $(json_schema_files)
	json-schema-validate --ignore-errors $^
check-json: $(json_schema_files) $(json_files)
	json-validate --ignore-errors $(addprefix --schema=,$(json_schema_files)) $(json_files)
.PHONY: check-json-schemas check-json
```
Unit testing
Due to the susceptibility of JSON handling code to break on invalid input (as it assumes the input follows the correct schema, which it may not, as it’s untrusted), it is important to unit test such code. See the Unit testing guidelines for suggestions on writing code for testing. The ideal is for the JSON parsing code to be separated from whatever code calls it, so that it can be linked into unit tests by itself, and passed JSON snippets to check what it retrieves from them.
Devising JSON snippets which thoroughly test parsing and validation code is hard, and is impossible to do without also using code coverage metrics (see the Tooling guidelines). However, given a JSON schema for the document, it is possible to automatically and exhaustively generate unit test vectors which can easily be copied into the unit tests to give good coverage.
This can be done using Walbottle:
```sh
json-schema-generate --valid-only schema.json
json-schema-generate --invalid-only schema.json
```
Those commands generate sets of valid and invalid test vectors respectively; each vector is a JSON instance which does, or does not, conform to the given schema.