There are a few anti-patterns to consider when accessing the filesystem. This
article assumes knowledge of the standard GFile, GInputStream and
GOutputStream APIs.
Summary
- Always use asynchronous I/O for file access.
- Always use appropriate functions to construct file names and paths.
- Validate file paths to ensure expected results before using them.
- Use AppArmor profiles to enforce constraints on file access.
Asynchronous I/O
All I/O should be performed asynchronously, that is, without blocking the
GLib main context. This can be achieved by always using the *_async() and
*_finish() variants of each I/O function: for example,
g_input_stream_read_async() rather than g_input_stream_read().
Synchronous I/O blocks the main loop, which means that other events, such as user input, incoming network packets, timeouts and idle callbacks, are not handled until the blocking function returns.
Note that the alternative, running synchronous I/O in a separate thread, is highly discouraged; see the threading guidelines for more information.
File path construction
File names and paths are not normal strings: on some systems, they can use a character encoding other than UTF-8, while normal strings in GLib are guaranteed to always use UTF-8. For this reason, special functions should be used to build and handle file names and paths. (Modern Linux systems almost universally use UTF-8 for filename encoding, so this is not an issue in practice, but the file path functions should still be used.)
For example, file paths should be built using g_build_filename() rather than
g_strconcat(). Doing so makes it clearer what the code is meant to do, and
also eliminates duplicate directory separators where the elements are joined.
Note that the returned path is not necessarily absolute, and that any . or ..
components are left in place rather than resolved.
As another example, paths should be disassembled using g_path_get_basename()
and g_path_get_dirname() rather than g_strrstr() and other manual searching
functions.
Path validation and sandboxing
If a filename or path comes from external input, such as a web page or user
input, it should be validated to ensure that putting it into a file path will
not produce an arbitrary path. For example, if a filename is constructed from
the constant string ~/ plus some user input, and the user inputs
../../etc/passwd, they can (potentially) gain access to sensitive account
information, depending on which user the program is running as, and what it
does with data loaded from the constructed path.
This can be avoided by validating constructed paths before using them, using
g_file_resolve_relative_path() to convert any relative paths to absolute ones,
and then validating that the path is beneath a given root sandboxing directory
appropriate for the operation. For example, if code downloads a file, it could
validate that all paths are beneath ~/Downloads, using g_file_has_prefix() (or
g_file_has_parent() if only direct children of the directory are allowed).
As a second line of defence, all projects which access the filesystem should provide an AppArmor profile which limits the directories they can read from and write to. See the AppArmor guidelines for more information.