Design
The Linux PStore and Ramoops modules use memory to pass data from the dying breath of a crashing kernel to its successor. This data can be read from Linux after a restart or, on hardware that uses U-Boot, directly from the U-Boot command line.
PStore
PStore is a generic interface to platform dependent persistent storage.
Platforms that provide a mechanism to preserve some data across system reboots can register with this driver to provide a generic interface to show records captured in the dying moments. In the case of a panic the last part of the console log is captured, but other interesting data can also be saved.
Captured records can be found under the /sys/fs/pstore directory. Different users of this interface will result in different filename prefixes.
Once the information in a file has been read, removing the file will signal to the underlying persistent storage device that it can reclaim the space for later re-use.
$ rm /sys/fs/pstore/dmesg-ramoops-0
The expectation is that all files in /sys/fs/pstore/ will be saved elsewhere and erased from persistent store soon after boot to free up space ready for the next catastrophe.
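The save-then-erase routine described above could be scripted roughly as follows. This is a minimal sketch, not part of pstore itself: the function name archive_pstore and both directory arguments are hypothetical, and it assumes the caller has permission to read and delete the records.

```shell
#!/bin/sh
# archive_pstore SRC DEST: copy every record in SRC to DEST, then delete it
# from SRC so the backend can reclaim the space.
# The function name and both directories are illustrative, not fixed by pstore.
archive_pstore() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for record in "$src"/*; do
        # Skip the unexpanded glob when SRC holds no records.
        [ -f "$record" ] || continue
        # Remove the record only if the copy succeeded.
        cp -p "$record" "$dest/" && rm -f "$record"
    done
}
```

It might be called as root with something like archive_pstore /sys/fs/pstore /var/log/pstore-archive, where the destination path is only an example. Note that a record with the same name already present in the destination would be overwritten.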
Pstore only supports one backend at a time. If multiple backends are available, the preferred backend may be set by passing the pstore.backend= argument to the kernel at boot time.
Ramoops
Currently, only the Ramoops PStore backend is supported, as it is platform independent.
Ramoops is an oops/panic logger that writes its logs to RAM before the system crashes. It works by logging oopses and panics in a circular buffer. Ramoops needs a system with persistent RAM so that the content of that area can survive after a restart.
Ramoops uses a predefined memory area to store the dump. The start, size and type of the memory area are set using three variables:
- mem_address for the start
- mem_size for the size. The memory size will be rounded down to a power of two.
- mem_type to specify the memory type (default is pgprot_writecombine).
Typically the default value of mem_type=0 should be used, as that sets the pstore mapping to pgprot_writecombine. Setting mem_type=1 attempts to use pgprot_noncached, which only works on some platforms. This is because pstore depends on atomic operations. At least on ARM, pgprot_noncached causes the memory to be mapped strongly ordered, and atomic operations on strongly ordered memory are implementation defined and won’t work on many ARM-based systems such as OMAP.
The memory area is divided into record_size chunks (also rounded down to a power of two) and each oops/panic writes a record_size chunk of information.
Setting the dump_oops variable to 1 dumps both oopses and panics, while setting it to 0 dumps only the panics.
The module uses a counter to record multiple dumps but the counter gets reset on restart (i.e. new dumps after the restart will overwrite old ones).
Ramoops also supports software ECC protection of persistent memory regions. This might be useful when a hardware reset was used to bring the machine back to life (i.e. a watchdog triggered). In such cases, RAM may be somewhat corrupt, but usually it is restorable.
Set-up
Kernel set-up
The following entries should be added to the kernel config file (e.g. debian/config/armhf/config) to build pstore support with the ramoops backend:
CONFIG_PSTORE=y
# CONFIG_PSTORE_DEFLATE_COMPRESS is not set
CONFIG_PSTORE_CONSOLE=y
CONFIG_PSTORE_RAM=y
The ramoops backend supports the following parameters:
Name | Description | Default |
---|---|---|
record_size | size of each dump done on oops/panic | 4K |
console_size | size of kernel console log | 4K |
ftrace_size | size of ftrace log | 4K |
pmsg_size | size of user space message log | 4K |
mem_address | start of reserved RAM used to store oops/panic logs | |
mem_size | size of reserved RAM used to store oops/panic logs | |
mem_type | set to 1 to try to use unbuffered memory | 0 |
dump_oops | set to 1 to dump oopses, 0 to only dump panics | 1 |
ecc | if non-zero, the option enables ECC support and specifies ECC buffer size in bytes (1 is a special value, means 16 bytes ECC) | 0 |
Do not set the mem_address value in the range dedicated to the boot loader, in order to avoid overwriting the logs during reboot.
mem_size should be a power of 2 and larger than the sum of all record sizes. Record sizes should be a power of 2.
The dump space for oops/panic records is what remains of mem_size after the console, ftrace and pmsg areas are subtracted. This area is divided into record_size chunks and each oops/panic writes a record_size chunk of information.
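The layout arithmetic above can be checked with a few lines of shell. This is a sketch under simplifying assumptions: the helper names is_pow2 and dump_records are made up for illustration, and the calculation ignores ECC overhead and any rounding the driver applies to the sizes it is given.

```shell
#!/bin/sh
# is_pow2 N: succeed if N is a power of two (hex such as 0x2000 is accepted).
# Hypothetical helper, not part of any pstore tooling.
is_pow2() {
    n=$(($1))
    [ "$n" -gt 0 ] && [ $((n & (n - 1))) -eq 0 ]
}

# dump_records MEM RECORD CONSOLE FTRACE PMSG: print how many record_size
# chunks fit in what is left after the console, ftrace and pmsg areas.
dump_records() {
    dump_area=$(($1 - $3 - $4 - $5))
    echo $((dump_area / $2))
}

# 1 MiB region, 8K records, 8K console, 4K ftrace, 4K pmsg
dump_records 0x100000 0x2000 0x2000 0x1000 0x1000   # prints 126
# 64K region with the same record sizes
dump_records 0x10000 0x2000 0x2000 0x1000 0x1000    # prints 6
```

Running is_pow2 on each size before putting it on the kernel command line is a cheap way to catch values that would otherwise be silently rounded down.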
For example, to reserve 1 MiB (0x100000) starting at 0x30000000, with 8K oops/panic dump records and an 8K console log, plus the default 4K ftrace and pmsg logs, add the following kernel parameters, e.g. to /etc/kernel/cmdline, and then run sudo u-boot-update:
ramoops.mem_address=0x30000000 ramoops.mem_size=0x100000 ramoops.record_size=0x2000 ramoops.console_size=0x2000 memmap=0x100000$0x30000000
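Because the address and size appear twice in that line (once for ramoops and once for memmap=), it can help to generate the arguments from a single set of values. The helper below is a hypothetical sketch, not an existing tool; it only prints the string and does not modify any boot configuration.

```shell
#!/bin/sh
# ramoops_cmdline ADDR SIZE RECORD CONSOLE: print the ramoops kernel arguments
# plus a memmap= reservation covering the same region.
# Hypothetical helper for illustration only.
ramoops_cmdline() {
    # Single quotes keep the '$' in memmap=SIZE$ADDR literal.
    printf 'ramoops.mem_address=%s ramoops.mem_size=%s ramoops.record_size=%s ramoops.console_size=%s memmap=%s$%s\n' \
        "$1" "$2" "$3" "$4" "$2" "$1"
}

ramoops_cmdline 0x30000000 0x100000 0x2000 0x2000
```

Some boot loader configuration formats treat the $ character specially, so depending on the boot flow it may need escaping when the generated string is stored.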
U-Boot set-up
The following entry should be added to the U-Boot config file to build pstore command support:
CONFIG_CMD_PSTORE=y
The default configuration can be set in the U-Boot configuration file, or at run-time using the pstore set command.
Configuration parameters are:
Name | Default |
---|---|
CMD_PSTORE_ADDR | 0x0 |
CMD_PSTORE_SIZE | 0x0 |
CMD_PSTORE_RECORD_SIZE | 0x1000 |
CMD_PSTORE_CONSOLE_SIZE | 0x1000 |
CMD_PSTORE_FTRACE_SIZE | 0x1000 |
CMD_PSTORE_PMSG_SIZE | 0x1000 |
CMD_PSTORE_ECC_SIZE | 0 |
Parameters should be the same as the ones used by the kernel. Record sizes should be a power of 2.
For example, to be able to display or save dumps generated with the kernel parameters set in the previous section, the PStore module should be configured as:
CMD_PSTORE_ADDR=0x30000000
CMD_PSTORE_SIZE=0x100000
CMD_PSTORE_RECORD_SIZE=0x2000
CMD_PSTORE_CONSOLE_SIZE=0x2000
CMD_PSTORE_FTRACE_SIZE=0x1000
CMD_PSTORE_PMSG_SIZE=0x1000
CMD_PSTORE_ECC_SIZE=0
Usage
Generate kernel crash
For testing purposes, you can generate a kernel crash by setting the reboot timeout to 10 seconds and triggering a panic:
$ sudo sh -c "echo 1 > /proc/sys/kernel/sysrq"
$ sudo sh -c "echo 10 > /proc/sys/kernel/panic"
$ sudo sh -c "echo c > /proc/sysrq-trigger"
Retrieve logs in Linux
After a crash-induced reboot, logs can be found at /var/lib/systemd/pstore:
$ sudo ls -l /var/lib/systemd/pstore
total 0
-rw------- 1 root root 4084 Mar 1 08:32 console-ramoops-0
-rw------- 1 root root 4023 Mar 1 08:27 dmesg-ramoops-0
-rw-r----- 1 root root 4040 Mar 1 08:15 dmesg.txt
Retrieve logs in U-Boot
In U-Boot, logs can be displayed or saved using the pstore command.
=> help pstore
pstore - Manage Linux Persistent Storage
Usage:
pstore set <addr> <len> [record-size] [console-size] [ftrace-size] [pmsg_size] [ecc-size]
- Set pstore reserved memory info, starting at 'addr' for 'len' bytes.
Default length for records is 4K.
'record-size' is the size of one panic or oops record ('dump' type).
'console-size' is the size of the kernel logs record.
'ftrace-size' is the size of the ftrace record(s), this can be a single
record or divided in parts based on number of CPUs.
'pmsg-size' is the size of the user space logs record.
'ecc-size' enables/disables ECC support and specifies ECC buffer size in
bytes (0 disables it, 1 is a special value, means 16 bytes ECC).
pstore display [record-type] [nb]
- Display existing records in pstore reserved memory. A 'record-type' can
be given to only display records of this kind. 'record-type' can be one
of 'dump', 'console', 'ftrace' or 'user'. For 'dump' and 'ftrace' types,
a 'nb' can be given to only display one record.
pstore save <interface> <dev[:part]> <directory-path>
- Save existing records in pstore reserved memory under 'directory path'
to partition 'part' on device type 'interface' instance 'dev'.
Filenames are automatically generated, depending on record type, like
in /sys/fs/pstore under Linux.
The 'directory-path' should already exist.
First of all, unless the PStore parameters have been set in the U-Boot configuration and match the kernel ramoops parameters, they need to be set using pstore set, e.g.:
=> pstore set 0x30000000 0x100000 0x2000 0x2000
Then all available dumps can be displayed using:
=> pstore display
**** Dump
Oops#1 Part1
<6>[ 17.315503] imx-media: subdev ipu2_ic_prpvf bound
<6>[ 17.324302] ipu2_csi0: Registered ipu2_csi0 capture as /dev/video6
…
<0>[ 105.641914] Code: e5834000 f57ff04e ebed244d e3a03000 (e5c34000)
<4>[ 105.651526] ---[ end trace 2ebf2b6dc03e53a0 ]---
**** Dump
Panic#2 Part1
<4>[ 105.332293] CPU: 3 PID: 460 Comm: ash Tainted: G C 4.19.0-6-armmp #1 Debian 4.19.67-2co5
<4>[ 105.341796] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
…
<4>[ 106.205233] [<c03845f8>] (cpu_startup_entry) from [<c031296c>] (secondary_start_kernel+0x160/0x188)
<4>[ 106.217624] [<c031296c>] (secondary_start_kernel) from [<10302c6c>] (0x10302c6c)
**** Console
er m25p80 snd spi_nor soundcore
[ 105.284255] pwm_imx imx_media_common(C) v4l2_fwnode imx_ldb dw_hdmi_imx etnaviv dw_hdmi imxdrm gpu_sched drm_kms_helper imx_ipu_v3 panel_simple cec drm fb_sys_fops evdev imx6q_cpufreq pwm_bl ip_tables x_tables autofs4 btrfs xor zstd_decompress zstd_compress xxhash zlib_deflate raid6_pq libcrc32c crc32c_generic cls_cgroup ahci_imx libahci_platform libahci ci_hdrc_imx ci_hdrc ulpi libata ehci_hcd udc_core scsi_mod sdhci_esdhc_imx sdhci_pltfm sdhci usbcore i2c_imx usbmisc_imx phy_mxs_usb anatop_regulator spi_imx dwc3_haps clk_pwm micrel
[ 105.332293] CPU: 3 PID: 460 Comm: ash Tainted: G C 4.19.0-6-armmp #1 Debian 4.19.67-2co5
…
[ 106.217624] [<c031296c>] (secondary_start_kernel) from [<10302c6c>] (0x10302c6c)
[ 106.230722] Rebooting in 10 seconds..
Alternatively, dumps can be saved to an existing directory on an Ext2 or Ext4 partition using pstore save, e.g. to the root directory of the 1st partition of the 2nd MMC:
=> pstore save mmc 1:1 /
File System is consistent
CACHE: Misaligned operation at range [4f867098, 4f869098]
update journal finished
8136 bytes written in 749 ms (9.8 KiB/s)
File System is consistent
CACHE: Misaligned operation at range [4f867098, 4f869098]
update journal finished
7856 bytes written in 724 ms (9.8 KiB/s)
File System is consistent
update journal finished
8180 bytes written in 719 ms (10.7 KiB/s)
Once Linux has booted, the log files can be accessed under the mount point of the 1st partition of the 2nd MMC.
Use the lsblk command to find the mount point, then view the log files there.