After a kernel upgrade, my SSD started going read-only. Here’s what seems to have solved the problem (all fine now, after more than a day of intensive use, whereas previously it failed within a few minutes):
Adding this kernel boot option:
Explanation: Apparently some combinations of kernels and NVMe SSDs cause problems unless the above power-saving feature is disabled.
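After rebooting, it can be worth confirming that a boot option really reached the kernel, since the running kernel's command line is exposed in /proc/cmdline. Below is a minimal Python sketch of such a check. The function names and the sample boot line (including `example.option`) are made up for illustration; they are not the option from this post or my real boot line.

```python
from pathlib import Path


def parse_cmdline(text):
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' (such as 'quiet') map to None. Quoted values
    containing spaces are not handled in this sketch.
    """
    options = {}
    for token in text.split():
        name, sep, value = token.partition("=")
        options[name] = value if sep else None
    return options


def option_is_set(name, cmdline_path="/proc/cmdline"):
    """Return True if `name` appears on the running kernel's command line."""
    return name in parse_cmdline(Path(cmdline_path).read_text())


# Hypothetical sample line, not a real boot line from this machine.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/mapper/example ro quiet example.option=0"
parsed = parse_cmdline(sample)
print(parsed["example.option"])  # prints 0
```

On a real system you would call `option_is_set(...)` with the name of the option you added; if it returns False, the bootloader configuration probably was not regenerated or the wrong boot entry was used.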
Since the SSD went read-only, no log messages related to the freeze got saved; they were all lost after reboot. Using sudo or switching to root to run dmesg after the freeze didn’t work either, because of read-only disk errors.
So, to figure out what was happening, I ran journalctl -f and dmesg -Hw to start tailing the system logs before the error happened, and then started some apps that accessed the SSD intensively. When the disk froze, I could switch to the console running dmesg and take a photo of the logs. Here are the logs, in case anyone wants to compare:
In text (OCR-transcribed, so there could be typos):
    nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
    nvme 0000:04:00.0: enabling device (0000 -> 0002)
    xen: registering gsi 16 triggering 0 polarity 1
    Already setup the GSI :16
    nvme nvme0: Removing after probe failure status: -19
    nvme0n1: detected capacity change from ... to 0
    EXT4-fs warning (device dm-4): ... I/O error 10 writing to inode ... starting block
    Buffer I/O error on device dm-4, logical block
    blk_update_request: I/O error, dev nme0n1, sector ...
    ...
    device-mapper: thin: process_cell: dm_thin_find_block() failed: error= -5
    ...
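If you want to compare your own logs against the ones above without reading them line by line, a small script can look for the most telling messages. This is only a rough Python sketch: the three patterns are copied from my log excerpt, the labels and the function name are my own invention, and other kernel versions may word these messages differently.

```python
import re

# Patterns based on the log excerpt in this post. Exact wording may
# differ between kernel versions, so treat these as a starting point.
SIGNATURES = {
    "controller_down": re.compile(
        r"nvme\s+nvme\d+: controller is down; will reset"
    ),
    "probe_failure": re.compile(
        r"nvme\s+nvme\d+: Removing after probe failure status: -?\d+"
    ),
    "capacity_to_zero": re.compile(
        r"nvme\d+n\d+: detected capacity change from .* to 0\b"
    ),
}


def find_signatures(log_text):
    """Return the labels of all signatures found, in order of first appearance."""
    found = []
    for line in log_text.splitlines():
        for label, pattern in SIGNATURES.items():
            if label not in found and pattern.search(line):
                found.append(label)
    return found


# Two lines from the excerpt above, used as sample input.
sample = (
    "nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10\n"
    "nvme0n1: detected capacity change from ... to 0\n"
)
print(find_signatures(sample))  # prints ['controller_down', 'capacity_to_zero']
```

To use it on real output, you could pass in text captured from dmesg or journalctl while the disk is still writable, or typed in from a photo as I had to do.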
If you do a web search for
nvme nvme0 "controller is down" "will reset" "detected capacity change" "to 0"
you’ll find lots of things to read. I found this blog post helpful:
https://tekbyte.net/fixing-nvme-ssd-problems-on-linux/ (the same solution worked for him)
In someone else’s case, the PSU (power supply unit) was the problem:
Samsung SSD 980 NVMe controller shuts down : linuxhardware
A kernel patch with lots of links to this problem reported elsewhere:
And a nice overview of
p3 and what they are:
What the other boot options mean: dracut.cmdline(7) - Linux manual page — boot options you’ll see on the same boot line as