Freedombox longevity ideas?

Last year my Freedombox crashed due to a corrupted SD-Card and I was not able to recover my backups due to the missing LDAP backup. This made me think about how to improve the durability and reliability of my setup. It brought me to three main catagories I thereafter changed:

  • data storage: moved the root partition from SD-Card to a 256 GB SSD (SD-Card free setup)
  • improve cooling: installed a solid alluminum passive heat-sink which keeps my Raspberry around 35°C
  • minimize write access to SSD: installed log2ram and configured the log syncing only on reboot/shutdown

I’m curious in what you are doing to ensure a reliable and durable Freedobox setup. Are there some measures I could add to those mentioned? My overall aim is to get my box up and running and basically forget about it afterwards (resp. to minimize the admin effort as much as possible).

This post Root filesystem completely in RAM (only data mounts) - #16 by Tido
and the following 4 maybe also of interest for you.

I run freebox as a VM on Proxmox Virtual Environment (PVE).

Proxmox is running on an array of ZFS disks in Raidz1. VM is backed up to Proxmox Backup Server (PBS) on network once a week (just had to run a restore yesterday after borking matrix really good).

Benefits-
VMs are easier to backup and restore vs other options. And it is also complete. You are not saving some folders or a config file. You are saving the entire system as it existed at a moment in time. This is akin to pulling an SD card out of an SBC and imaging it to a img file. Only faster, 100% more reliable, and 99.9999% less down time. (literally, the pause on a VM for running bitmap capture is less time than it would take me to pull out the SD card from an rpi)
-Want to try a risky change? Run a back up real quick (either local snapshot or full on backup) then let it rip. Snapshots and backups can be run while the VM is running. A quick pause for dirty bitmap gathering and its back up to serve.
-Did an automatic upgrade bork the system? Restore from the last backup or snapshot prior to the update.
-Accidentally deleted an important file? Restore individual files if needed.
-Out-grow the disk? Provisioning more space on the VM is fairly easy.
-Need to assign more RAM or CPU to accommodate a growing load? Piece of cake.

PVE community version is stable and installs on just about anything desktop size or better. Old gaming computer? Say hello to your new VM host. And since PVE supports ZFS out of the box, no need to buy and setup RAID hardware. Setup automated backups to happen how ever often you like and keep as many local snapshots as you please.

PBS is the same way. Simple to setup, connect it to the PVE server on network, and you have a physically separate backup server (with all the same ZFS benefits, as it supports ZFS as well). Setup verification, pruning, garbage collection to trim down the number of backups, preserve how ever much history you like.

OR you can forgo the PBS and simply copy the local snapshots of the VM to an external harddrive for safe keeping (ie in the event of a power surge, fire etc). Or copy them up to an AWS S3 instance, if that’s your fancy. The entire PVE server can bite the dust from a lightning strike and the saved snapshot can be restored on a new PVE server as if nothing happened. PBS setup can always be added later.

ZFS provides disk redundancy without requiring RAID hardware. Should PVE server OS disk fail (if not installed on array at setup), recovering the ZFS pool after replacement and install of new disk is fairly easy (no worries about failed hardware config resulting in data loss).
ZFS also provides protection against bitrot. A 4x1TB disks (very cheap) in a raidz1 config gives you about 2.6TB of storage with single disk failure. Resilvering a new disk can be done live and is also fairly straightforward.
Want to buy the cheapest of disks and take the risk off-brand disk having higher failures? Set it to Raidz2 (two disk failure) for a usable storage of 1.7TB. Now you are safe during a resilver from a second disk failure.
I’ve even used resilver to swap from 500GB disks to 1TB disks over the course of a day. Just pull and resilver one disk at a time until they are all replaced. Then make sure autoexpand is set on the pool.
A host of other advantages with ZFS, too many to list here.

Pick up cheap SSDs or HDDs or go with really nice stuff or even SAS drives. Your pick. Doesn’t matter. I’ve even got one server running a 3TB, 6TB, and two 8TB disks all in the same pool (experiment machine). ZFS rolls with it. Have old disks laying around? Repurpose them and still have insurance against one failing on you. Its free real estate then.

This setup also means you can spool up new VMs of other services that might interest you; the infrastructure is already there and VMs can run side-by-side. Or it frees up your SBCs for other jobs like Pihole or chat bot host. Experimentation is easy in a VM. Also, this may repurpose hardware that is otherwise worthless for its original use but overkill just to run a Freedombox setup. You can shop for free, ready-to-roll VMs at turnkeylinux.org.

You can temp shutdown the current live service and restore a backup to verify its working correctly, validating your backups are good to go. Down time is a minute on the front and backends for startup and shutdown, as the restoration can run concurrently with the live VM. Then just shutdown the production VM and start the restored back up. Once testing is complete, shutdown the restored VM and start the production one again. Delete the restored VM. Back up now verified good or back.

Drawbacks-
A PVE server is an obviously larger, more power hungry, and somewhat noisier setup than a little SBC running on 5V USB with passive cooling. (Not that my servers are loud by any means but there is a set of quiet fans running and that’s more than the two rpi4s with passive aluminum radiators a few feet away)
It requires space for the machine and either hardware laying around or free cash to spend on buying budget hardware to assemble. (I run a vertical mount on the wall holding 6U of surplus-ed servers)
If you go with a PBS as well, then double that.

PVE, VMs, et al may be a learning curve, depending on prior experience.
There is lots of helpful information out there, a supportive community, and loads of resources.
But these resources only shorten the time and effort to get going, not eliminate them.
It is also obviously more work to setup initially and verify the system is all arranged how you want it.

And there is some effort once a month (or whatever interval you wish to pick to verify) backups. But this is true for any arrangement with backups. A backup system with no functional verification is a lottery game on whether it works or not. Data/hash verification hasn’t failed me yet; however, “Trust but verify” is a motto for a reason. I’d like to not learn something isn’t right the hard way.

EDIT- Reasons I went this route.
1- I had hardware on hand and desire to learn more about VM operations so as host a number of services on network.

2- Raspberry pi storage space limited when using SD cards. Doing SSD hip-ons for all them starts to negate the small form factor and result in a physical space issue. Add enough Rpis and the resulting equipment cost and physical space consumption was as much as the repurposed hardware. Data storage required more drives and or larger/more costly setups. No way to share storage across each discreet pi without resorting to some central network storage. If a central network storage was employed, no point in having pis for running the services when the size/cost penalty of turning the NAS into a proper hypervisor running VMs was virtually nil.

3- Rpi operations simply were not reliable. Yes, Rpi4 and current gen raspbian have come a LONG way in reducing the random filesystem corruptions that used to occur. But they are ultimately bound by the SD card, if you are keeping the form-factor small.
And trying to burn a back up image of a 32GB card is painful. I had to do it three times, then hash the output files, and compare. Sometimes I had three different values which then required a 4th image write to see which one it matched. A top this, there was a hefty storage penalty for keeping more than one image per device, as I did not benefit from deduplication or other storage tricks stashing them on my computer or external harddrive.
This doesn’t even count writing that back up image back to a card. Not a perfect science either.

This isn’t a knock on raspberry pi. I love experimenting with them and one 4B has been running a pihole on network without issue for a year or more at this point (pihole has an easy backup tool to store the config in case you have to rebuild). Perfect hosts for services with no user/shared data.
But for hosting services that store user or shared data, they have their limitations. If that data only exists on the devices and backups you manage, it becomes a hassle vs stepping up to a serious hardware solution.