[SOLVED] Move the system to another disk over the network and boot it without errors

I’m not sure this thread belongs in this category: it describes a problem, but my server itself is working fine.

Description
My current disk is too small, so I need to transfer the entire system to another, larger disk. That is my only goal.

  • Source disk A: my current server. A hard disk with a btrfs partition, an EFI partition, and a swap partition.
  • Target disk B: partitions similar to the previous ones (with a larger btrfs partition), plus an ext4 partition with valuable data that cannot be moved.

I couldn’t find any documentation specific to my case. I know the development team is discussing the best way to make something like this easier for users, but I suspect the btrfs replace option would delete the existing data on the destination disk when moving the system. In any case, my FB computer does not handle USB-connected disks correctly (perhaps due to their power requirements), so I must do it over the network. I tried the btrfs send/receive option.

Attempted procedure

  1. btrfs subvolume snapshot -r / /path/to/source/snapshot # create read-only snapshot
  2. btrfs send -v /path/to/source/snapshot -f /srv/rsync/snapshot.btrfs # create file from snapshot
  3. rsync -vh /srv/rsync/snapshot.btrfs user@local.ip:/mnt/tmppart/ # send file over the network using rsync to temporary partition on target disk
  4. btrfs receive /mnt/btrfspart < /mnt/tmppart/snapshot.btrfs # receive snapshot on target disk destination btrfs partition (executed at destination)
  5. btrfs subvolume set-default nnn /mnt/btrfspart # convert the received snapshot into the default subvolume of the target btrfs partition
  6. btrfs prop set -f /mnt/btrfspart/snapshot ro false # this command is not in my shell history, but apparently I made the read-only snapshot writable instead of creating another snapshot
  7. Then I adjusted the UUIDs in /etc/fstab and installed GRUB
  8. Then I rebooted; the system was able to start, although many services failed, and I ran update-grub

I’m stuck at this last point. Too many things are going wrong. Some people have recommended dd, but I don’t know what the right method is. I think my procedure goes wrong at some point that can be pinpointed.
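For reference, the same transfer can also be done by piping the send stream straight over SSH, which avoids both the intermediate file and the rsync step. This is only a sketch: the snapshot name, mount point, and hostname below are placeholders, not values from this thread.

```shell
# Sketch: stream a btrfs snapshot over SSH instead of staging a file.
# Run on the source machine; all paths/hosts are placeholders.
btrfs subvolume snapshot -r / /snapshot-ro        # send requires a read-only snapshot
btrfs send /snapshot-ro | ssh user@target.ip \
    "btrfs receive /mnt/btrfspart"                # write it on the target's btrfs partition

# Then, on the target:
#   btrfs property set -ts /mnt/btrfspart/snapshot-ro ro false   # make it writable
#   btrfs subvolume set-default /mnt/btrfspart/snapshot-ro       # boot from it by default
```

Either approach ends with the same snapshot on the target; streaming just removes the need for a temporary partition large enough to hold the send-stream file.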

Expected Results
Transfer the FreedomBox system from disk A to disk B via the local network, without deleting the ext4 partition on the latter, using any method.

I might consider transferring the system to a USB flash drive, if it turns out to be a better option and btrfs compression allows me to do so.

Actual results
I’m stuck because I couldn’t find (or didn’t know how to find) accurate information on the subject.

Information

  • Software: 25.14~bpo13+1
  • Hardware: x86-64 (netbook for now)
  • How did you install FB?: apt install freedombox

The process looks correct to me; this is how I would have attempted btrfs send/receive. I am surprised that some services aren’t working. Could you post details about those? Either we can fix them individually or find the root cause.

@Sunil I repeated the process because the snapshot file had become outdated. I still need to correct the EFI partition and GRUB. When starting up (without EFI yet), I noticed that two services failed: zram and firewalld. It has not yet been tested on the server hardware.

Make sure you create a GPT partition table and set the EFI partition type on the EFI partition. Simply copying the files from the old EFI partition to the new one is enough if your original installation is from a FreedomBox disk image. Otherwise, run grub-install.
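The grub-install route can be sketched roughly as follows, assuming the new btrfs root is mounted at /mnt/btrfspart and the target's EFI partition is /dev/sdb1 (both are placeholders; adjust to your layout):

```shell
# Sketch: reinstall GRUB for UEFI on the target disk from a chroot.
# Device names and mount points are placeholders.
mkdir -p /mnt/btrfspart/boot/efi
mount /dev/sdb1 /mnt/btrfspart/boot/efi            # the target's EFI partition
for d in /dev /proc /sys; do mount --bind "$d" "/mnt/btrfspart$d"; done
chroot /mnt/btrfspart grub-install --target=x86_64-efi \
    --efi-directory=/boot/efi --bootloader-id=debian
chroot /mnt/btrfspart update-grub                  # regenerate grub.cfg with the new UUIDs
```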

Both zram and firewalld are unrelated to disk setup. We can see their logs and fix them.

  • I copied the EFI partition with cp -a, but a multitude of errors were corrected by running grub-install and update-grub from the server.
  • btrfs scrub repaired logical errors that were not present on the source disk and discovered physical errors that btrfs check can perhaps repair.
  • It seems that systemd is looking for a UUID that, according to lsblk, does not exist.
  • The partition table for this disk was originally MBR, and then I converted it to GPT, but the system warns me that it is in the wrong place on the disk.
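If the warning is about GPT data being in the wrong place, the usual cause after an MBR-to-GPT conversion is that the backup GPT header is not at the end of the disk; sgdisk can relocate it. A sketch (the device name is a placeholder, and you should have a backup before touching the partition table):

```shell
# Sketch: relocate the backup GPT structures after an MBR-to-GPT conversion.
sgdisk -e /dev/sda        # move the backup GPT header/table to the end of the disk
sgdisk -v /dev/sda        # verify the partition table afterwards
```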

journalctl -b -p 3 returns:

nov 13 04:09:38 hostname kernel: integrity: Problem loading X.509 certificate -22
nov 13 04:10:06 hostname wpa_supplicant[889]: bgscan simple: Failed to enable signal strength monitoring
nov 13 04:27:15 hostname rspamd[3133]: <9oqwi6>; map; rspamd_map_dns_callback: cannot resolve sa-update.surbl.org: no records with this name
nov 13 04:28:02 hostname systemd[1]: Timed out waiting for device dev-disk-by\x2duuid-97749938\x2rs4cf\x2d45f2\x2d89d4\x2d5cceccf4b33b.device - /dev/dis>
nov 13 04:28:55 hostname kernel: BTRFS error (device sda4): fixed up error at logical 112197632 on dev /dev/sda4 physical 1194328064
nov 13 04:29:33 hostname systemd[1]: Failed to start snapper-boot.service - Take snapper snapshot of root on boot.
nov 13 04:30:18 hostname kernel: BTRFS error (device sda4): unable to fixup (regular) error at logical 14985986048 on dev /dev/sda4 physical 10699407360

I cannot access /.snapshots. If I understand correctly, this directory is a subvolume that was not sent via btrfs send/receive.

journalctl -u snapper-boot returns:

oct 13 00:47:08 hostname systemd[1]: snapper-boot.service - Take snapper snapshot of root on boot was skipped because of an unmet condition check (Condi>
-- Boot f4119179d35a4e672r1bdfcd40317d07 --
nov 13 04:09:45 hostname systemd[1]: Starting snapper-boot.service - Take snapper snapshot of root on boot...
nov 13 04:29:33 hostname snapper[837]: I/O error (open failed path://.snapshots errno:19 (No such device)).
nov 13 04:29:33 hostname systemd[1]: snapper-boot.service: Main process exited, code=exited, status=1/FAILURE
nov 13 04:29:33 hostname systemd[1]: snapper-boot.service: Failed with result 'exit-code'.
nov 13 04:29:33 hostname systemd[1]: Failed to start snapper-boot.service - Take snapper snapshot of root on boot.

The Plinth diagnosis is OK.

Try to remove the existing snapper configuration with ‘snapper list-configs’ and then running ‘snapper delete-config’ on the config you find. Then re-run the setup for Storage app. It should recreate the /.snapshots snapshot and then setup snapper properly.
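The suggested cleanup can be sketched like this (the config name "root" is the usual default, but confirm it with list-configs first):

```shell
# Sketch: list snapper configurations, then delete the one found.
snapper list-configs                 # show existing snapper configurations
snapper -c root delete-config       # remove the "root" config (name from list-configs)
```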

Do you have multiple devices on your btrfs file system? Run btrfs device usage / see the output. Use btrfs device remove to remove the spurious device.

@Sunil snapper delete-config cannot delete the configuration; I tried with both sudo and as root. I also deleted /etc/snapper/configs/root, restarted services, reconfigured both storage and snapshots from Plinth (the latter was inaccessible while the configuration file was deleted), and finally restored the root config file. Perhaps the /.snapshots that came with the snapshot needs to be deleted. The service that does not work at all is snapper-boot (snapper-timeline and snapper-cleanup can sometimes be activated).


On the other hand, btrfs device usage / returns

/dev/sda4, ID: 1
   Device size:           463.90GiB
   Device slack:              0.00B
   Data,single:            37.00GiB
   Metadata,DUP:            4.00GiB
   System,DUP:             64.00MiB
   Unallocated:           422.83GiB

That’s my btrfs partition. The ext4 data partition has almost the same capacity (which I did not add to /etc/fstab and from which I must migrate data later).

The correct command to delete the configuration appears to be snapper -c <config> delete-config. Even so, it barely responds to commands. I have disabled services and snapshot creation, and I have tried to delete everything in /.snapshots (with the btrfs tools), but it resists any modification. I can’t even remove it without dragging the entire system down with it. It seems that Snapper is the one managing the system.

I think I’ve done it. My first snapshot in the migrated system and all services appear to be working.

I was never able to create or delete the default configuration for snapper, which is tied to the freedombox metapackage. So I tried to destroy /.snapshots using the btrfs and snapper tools.

It took me a while to realize that the freedombox scripts and the btrfs tools were failing because /.snapshots was still present as a normal directory rather than as a subvolume.
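This distinction is easy to check: on btrfs, the root of a subvolume always has inode number 256, while an ordinary directory does not. A quick sketch:

```shell
# Sketch: tell a plain directory apart from a btrfs subvolume.
stat -c '%i' /.snapshots             # prints 256 only if it is a subvolume root
btrfs subvolume show /.snapshots     # errors out if it is just a plain directory
```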

This isn’t the cleanest procedure, but it seems to have solved my problem:

  1. rm -r /.snapshots # crudely delete the plain directory
  2. btrfs subvolume create /.snapshots # create the real subvolume to enable snapshots
  3. re-run the snapshot configuration from Plinth

The errors detected by btrfs scrub required deleting every file that the command couldn’t fix (found by searching with journalctl --dmesg --grep "checksum error") and restoring them from borg. The email issue resolved itself.
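For anyone repeating this: when a checksum error hits a regular file's data, the kernel log usually includes the affected path, so the list of files to restore can be extracted directly. A sketch (the log format can vary between kernel versions):

```shell
# Sketch: list files reported with checksum errors by btrfs scrub.
# Data-block errors typically include "(path: /some/file)" in the kernel log.
journalctl --dmesg --grep 'checksum error' \
    | grep -o 'path: [^)]*' | sort -u
# Delete the listed files, then restore them from backup (borg here).
```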

Thanks, Sunil.


Awesome! Good to know everything is working again. I guess the file errors were responsible for all the problems.


Perhaps the fact that the filesystem was not immediately resized (btrfs filesystem resize max /) also had something to do with it.