Non-destructively testing repairs to damaged filesystems

Let's say you have a damaged, unmountable btrfs filesystem. You run btrfs check, which tells you about some problems, but the documentation strongly advises you not to run btrfs check --repair.

WARNING: the repair mode is considered dangerous and should not be used without prior analysis of problems found on the filesystem.

While btrfs check tells you the problems it finds, it does not tell you what changes it would make. It would be great if "repair" had a mode that would tell you what changes it would make, without committing yourself to those changes. Sadly there is no such mode - so what options do we have?

Brute force?

The brute force approach is to back up the devices comprising the damaged btrfs filesystem to another location, so that they can be restored if btrfs check --repair doesn't work out. This could be a lot of data to back up and restore, especially if the filesystem spans multiple volumes holding multiple terabytes of data.

What about...

Instead of brute force we could use virtualised discs and a hypervisor. Qemu's qcow2 disc format lets us set up virtual discs with a backing volume set to the damaged drive/partition. Reads from the virtual disc come from the backing volume unless that block has already been modified, while writes go to the qcow2 file and never touch the backing volume. As a result we don't need huge quantities of storage to back up the data, and if we don't like the changes made by "repair" or other tools we can easily reset by deleting the virtual discs or by using qemu's snapshot feature.

This copy-on-write technique can be used not just with btrfs and "btrfs check --repair" but with other filesystems and the tools that repair them. For example, you could use it to test edits made to the filesystem structures by a programme you'd written yourself, as I talk about in another post.

A complication with btrfs filesystems is that a device id is stored inside each volume of a filesystem. When mounting or checking, btrfs will scan all discs and partitions to find devices, and mount them if the device ids match, so you do not want both the original disc and the virtual disc to be visible at the same time. Inside a virtual machine you avoid this simply by ensuring that only the virtual discs are mapped into the virtual machine. That is, don't map in the original filesystems as raw discs!

Setting up a kvm virtual machine

You'll need a virtual machine in which to mount the filesystems on these virtual discs. If you don't already have one set up, you'll need to create one. As we're using qemu virtual discs, this means you need to use the kvm/qemu/libvirtd hypervisor stack. There are several ways to set up a virtual machine and many good guides on the web about how to do this, so I won't repeat them here. I'll just say that on my Ubuntu system I used uvtool to quickly set up the VM to test my filesystem changes.

Setting up copy-on-write virtual discs

The qemu-img command creates these virtual discs. If a btrfs filesystem comprises multiple discs or partitions, we need to create a separate virtual disc for each physical disc/partition.

The following command creates a virtual disc that uses /dev/sdd2 as its backing file. Any changes to this virtual disc are written to the file sdd2_cow.qcow2.

qemu-img create -f qcow2 -o backing_file=/dev/sdd2 -o backing_fmt=raw sdd2_cow.qcow2

I ran similar commands once for each underlying disc/partition that was part of the btrfs filesystem.

To attach one of these discs to an existing VM the basic command is:
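With several devices, a small loop saves retyping. The sketch below only prints the qemu-img commands so they can be reviewed before anything is run. The helper name cow_overlay_cmds and the device list are illustrative assumptions, not part of my original setup.

```shell
#!/bin/sh
# Print one qemu-img command per backing device, without running anything.
# cow_overlay_cmds is a hypothetical helper name; the overlay naming
# (<device>_cow.qcow2) follows the single-device example.
cow_overlay_cmds() {
    for dev in "$@"; do
        name=$(basename "$dev")
        printf 'qemu-img create -f qcow2 -o backing_file=%s -o backing_fmt=raw %s_cow.qcow2\n' \
            "$dev" "$name"
    done
}

# Example device list (an assumption): substitute the members of your filesystem.
cow_overlay_cmds /dev/sdd2 /dev/sde2
```

Once the printed commands look right, piping the output to sh would create the overlays in the current directory.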

virsh attach-disk <VM NAME> /home/user/sdd2_cow.qcow2 vde --config --subdriver qcow2
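If there are several overlays to attach, the same command can be generated for each. This is a rough sketch that prints the virsh commands rather than running them. The helper name attach_cmds, the VM name testvm and the assumption that targets vde to vdh are free in the guest are all mine, so adjust them to suit.

```shell
#!/bin/sh
# Print one "virsh attach-disk" command per overlay image.
# Targets are handed out in order from a fixed list, which is assumed
# to be unused in the guest; at most four images are supported.
attach_cmds() {
    vm=$1
    shift
    [ "$#" -le 4 ] || { echo "attach_cmds: at most 4 images" >&2; return 1; }
    targets="vde vdf vdg vdh"
    for img in "$@"; do
        target=${targets%% *}   # first target still on the list
        targets=${targets#* }   # remove it from the list
        printf 'virsh attach-disk %s %s %s --config --subdriver qcow2\n' \
            "$vm" "$img" "$target"
    done
}

# Hypothetical VM name and overlay paths.
attach_cmds testvm /home/user/sdd2_cow.qcow2 /home/user/sde2_cow.qcow2
```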

However, on Ubuntu I ran into some (reasonable) restrictions imposed by apparmor. Attempting to attach the virtual discs failed, and an error from apparmor showed up in the logs:

[56880.068334] audit: type=1400 audit(1673655958.288:119): apparmor="DENIED" operation="open" profile="libvirt-c712b749-0f68-413f-9bd7-76ea061808eb" name="/dev/sde2" pid=58733 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=64055 ouid=64055

The problem is that the hypervisor is by default prevented from accessing the host's raw discs. When the hypervisor attempts to read the backing file (or later, to lock it) apparmor prevents this. We solve this by adding lines to /etc/apparmor.d/local/abstractions/libvirt-qemu for each raw disc or partition that backs one of our virtual discs.

/dev/sde2 rk,

This line allows libvirt/qemu to open the raw disc for read access (r), and allows the raw disc to be locked (k). We do not grant write access to libvirt/qemu so this provides an extra line of defence against unintended changes to our original filesystems.
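Since one rule is needed per backing device, the lines can be generated rather than typed. A minimal sketch, where apparmor_rules is a hypothetical helper name and the device list is only an example:

```shell
#!/bin/sh
# Print one read+lock (rk) apparmor rule per backing device.
# Write access is deliberately not granted.
apparmor_rules() {
    for dev in "$@"; do
        printf '%s rk,\n' "$dev"
    done
}

# Example devices (an assumption): use the ones backing your overlays.
apparmor_rules /dev/sdd2 /dev/sde2
```

The output can then be reviewed and appended to /etc/apparmor.d/local/abstractions/libvirt-qemu, for example with sudo tee -a.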

Once apparmor was configured, I could attach these discs to my virtual machine, and access the virtual devices at locations such as /dev/vde.
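Inside the guest, a test cycle might look something like the sketch below. It defaults to a dry run that only prints each command, and setting DRY_RUN=0 would execute them as root. The run and test_repair helpers, the /dev/vde device and the /mnt mount point are assumptions for illustration.

```shell
#!/bin/sh
# run: print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# One check/repair/check cycle against a virtual device.
test_repair() {
    dev=$1
    run btrfs device scan              # let btrfs find all member devices
    run btrfs check "$dev"             # read-only check: note the errors reported
    run btrfs check --repair "$dev"    # writes land in the qcow2 overlay only
    run btrfs check "$dev"             # check again to see what the repair changed
    run mount -o ro "$dev" /mnt        # try a read-only mount of the result
}

test_repair /dev/vde
```

If the result is not what you hoped for, shut down the VM, delete and recreate the overlays, and try a different approach.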

Final thoughts

With this or similar techniques you can safely test fixes to damaged filesystems without risking the underlying data. Once you're happy that your fix will have the desired effect, you can apply it to the real filesystem.


Building package from source: Cannot open for writing

I recently ran into an error when building a .deb package from source (on Ubuntu). I found a few people asking for help with the error message over the years, but I didn't find anyone offering an answer.

~/src/collectd-5.12.0$ debuild -us -uc -i -I
dpkg-buildpackage -us -uc -ui -i -I
dpkg-buildpackage: info: source package collectd
dpkg-buildpackage: info: source version 5.12.0-6
dpkg-buildpackage: info: source distribution unstable
dpkg-buildpackage: info: source changed by Bernd Zeimetz
dpkg-source -i -I --before-build .
dpkg-buildpackage: info: host architecture amd64
fakeroot debian/rules clean
rm -f build-stamp
[ ! -f Makefile ] || /usr/bin/make distclean
rm -f debian/README.Debian.plugins
rm -f src/.1 src/.5
rm -rf debian/pkgconfig
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
print() on closed filehandle OUT at /usr/share/intltool-debian/intltool-extract line 942.
Cannot open for writing at /usr/share/intltool-debian/intltool-update line 615.
make: *** [debian/rules:271: clean] Error 1
dpkg-buildpackage: error: fakeroot debian/rules clean subprocess returned exit status 2
debuild: fatal error at line 1182:
dpkg-buildpackage -us -uc -ui -i -I failed

In my case, the error occurred because I had downloaded the source package as root, so the files were owned by root and could not be written to when I built the package as a normal user.

$ sudo apt source collectd

If your source is owned by root as mine was, the solution is to change the ownership of the files to your user.

$ sudo chown -R <YOUR USERNAME>:<YOUR USERNAME> ~/src/collectd-5.12.0
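To confirm that ownership is the culprit before changing anything, you can list the files in the source tree that don't belong to you. A small sketch, where not_mine is a hypothetical helper name and SRC_DIR is an optional variable I made up for the example:

```shell
#!/bin/sh
# List every file under a directory that is not owned by the current user.
# An empty result means ownership is not the problem.
not_mine() {
    find "$1" ! -user "$(id -u)" -print
}

# Defaults to the current directory; set SRC_DIR to point at the source tree.
not_mine "${SRC_DIR:-.}"
```

For example, SRC_DIR=~/src/collectd-5.12.0 would check the tree from this post.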

Strip audio from a video in Ubuntu Linux

Let's say you have a video you recorded on your mobile phone, and you want to send it to someone, but it has a noisy audio track which is irrelevant to the subject. Stripping out the audio track is fairly simple using the avconv tool from the libav-tools package.

sudo apt-get install libav-tools

avconv -i INPUT.3gp -an -c:v copy OUTPUT.3gp

"-i INPUT.3gp" specifies the input file

"-an" specifies that there will be no audio track in the output file (similarly, -vn would mean no video track). Note that if "-an" appeared before "-i INPUT.3gp" it would apply to the input instead: we would read no audio track from the input file, but there could still be an audio track in the output file, albeit possibly an empty one.

"-c:v copy" specifies the codec we use to encode the video track in the output file. Here we specify copy, which means we don't transcode it. Rather than decoding the input video track to a buffer and then re-encoding it, we just copy it directly from the input file to the output file, so we don't lose any quality (whereas re-encoding with a lossy codec would give a lower quality output file). Note that if we specified "-c:v <CODEC>" before the "-i INPUT.3gp" we would be specifying the codec used to decode the input file, rather than encode the output file.

"OUTPUT.3gp" specifies the output file. Obviously.
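For more than one clip, the command can be generated per file. The sketch below prints the commands rather than running them. The helper name strip_audio_cmd, the _noaudio suffix, the TOOL variable and the example file names are my own assumptions, and it assumes file names without spaces. On releases that no longer ship libav-tools, setting TOOL=ffmpeg should work, since ffmpeg accepts the same options.

```shell
#!/bin/sh
# Print the command that strips the audio track from one input file.
# The output file is named <name>_noaudio.<ext> next to the input.
strip_audio_cmd() {
    in=$1
    base=${in%.*}     # file name without its extension
    ext=${in##*.}     # extension, e.g. 3gp
    printf '%s -i %s -an -c:v copy %s_noaudio.%s\n' \
        "${TOOL:-avconv}" "$in" "$base" "$ext"
}

# Hypothetical input files.
for f in holiday.3gp party.3gp; do
    strip_audio_cmd "$f"
done
```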


Ubuntu 10.10 upgrade: nvidia driver stops working

I recently upgraded a machine from Ubuntu 10.04 to Ubuntu 10.10. After the reboot the nvidia driver wouldn't load, and /var/log/Xorg.0.log contained:

[    25.820] (EE) NVIDIA: Failed to load the NVIDIA kernel module. Please check your
[    25.820] (EE) NVIDIA:     system's kernel log for additional error messages.
[    25.820] (II) UnloadModule: "nvidia"
[    25.820] (II) Unloading /usr/lib/xorg/extra-modules/
[    25.820] (EE) Failed to load module "nvidia" (module-specific error, 0)
[    25.820] (EE) No drivers available.

The solution was to run

$ sudo dpkg-reconfigure nvidia-current

which caused the nvidia modules to be rebuilt. This showed me I was missing source for the running kernel version, so I then installed nvidia-185-kernel-source, which was enough to get it back on its feet.
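A quick way to see whether a module was actually built for the running kernel is to look for it under /lib/modules. This is only a sketch: module_built_for is a hypothetical helper, and matching on a file name prefix (nvidia*.ko*) is my assumption about how the packaged module is named.

```shell
#!/bin/sh
# Succeed if a module file starting with <name> exists for a kernel release.
# Arguments: name, release (default: running kernel), root (default: /lib/modules).
module_built_for() {
    name=$1
    release=${2:-$(uname -r)}
    root=${3:-/lib/modules}
    [ -n "$(find "$root/$release" -name "$name*.ko*" 2>/dev/null)" ]
}

if module_built_for nvidia; then
    echo "nvidia module found for $(uname -r)"
else
    echo "no nvidia module found for $(uname -r); it may need rebuilding"
fi
```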