System unstable after upgrade to Bullseye

Hi, Dave! My system was unstable after the upgrade to Bullseye, and this went on for a long time without being fixed, so I decided to contact Olimex about it. They replied that they were not aware of the problem and would check what they could do. Their reaction was lightning fast: on the same day they uploaded images with a properly updated Bullseye and wrote to me that they would contact the FreedomBox team. Without my having to reinstall, my Pioneer started working stably after the last update, so I guess the problem is solved. Do you have any problems with yours?

Hi Johnny, after the upgrade to Bullseye, the system would crash and become unreachable from the outside world (web, SSH, etc.) every 1 to 3 days. Thinking that it might be hardware related, I purchased a new Olimex A20 Lime 2 board (Rev. L, I believe) in December, installed the weekly FB stable image, and it too crashed after a few days. I can’t say whether the latest update fixed the problem, because I just wanted to get it running again after reading your post that Olimex had issued a new Bullseye version. I only had Samba, OpenVPN, and Apache running, so it certainly was not overloaded, and the SD card was not filled with snapshots. I just installed the latest version today and am waiting to see if it is now stable. Thanks for your fast reply.

Hello Oliver,
If you don’t already know, about three days ago Olimex updated their Bullseye FreedomBox image to fix the random crashes that I, and apparently you, have been suffering since the upgrade to Bullseye. I have heard that the normal updates have fixed the problem, but I can’t verify that, since I downloaded the latest weekly image and installed it today. See: Information from Johnny about update from Olimex
Just wanted to let you know.
Dave Oliver

Hello,
I tried the 2022-01-07 img, and it lasted 1 day.
The server crashed while using Transmission.

I tried the 2022-01-07 img and it crashed three times over the last week. Still only running Cockpit, OpenVPN, and Samba. I did find 330 nearly identical log entries that occurred in rapid succession after the last two crashes:

"[1641941461.3557] dhcp4 (eth0): selecting lease failed: -131 NetworkManager"

I am wondering if this could be some sort of external attack that renders the server unreachable. The router does indicate DoS attacks from various IP addresses. I have hardened my password and selected “Disable password authentication” under Secure Shell (SSH) Server in the System settings.
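One way to check whether the dhcp4 entries quoted above really arrive as a burst is to group them after stripping the timestamp. The sketch below is only an illustration: it assumes the bracketed number is a Unix epoch timestamp, and every sample line except the first is hypothetical rather than taken from the actual logs.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Bracketed "seconds.fraction" prefix, e.g. "[1641941461.3557]".
# Assumption: the integer part is a Unix epoch timestamp.
STAMP = re.compile(r"\[(\d+)\.(\d+)\]\s*")

def group_entries(lines):
    """Count messages that are identical once the timestamp is removed."""
    counts = Counter()
    stamps = {}
    for line in lines:
        match = STAMP.search(line)
        if not match:
            continue  # not in the bracketed-timestamp format
        message = STAMP.sub("", line, count=1).strip()
        counts[message] += 1
        stamps.setdefault(message, []).append(int(match.group(1)))
    return counts, stamps

def to_utc(epoch_seconds):
    """Convert epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

lines = [
    # First line is the one quoted in the post; the rest are hypothetical.
    "[1641941461.3557] dhcp4 (eth0): selecting lease failed: -131 NetworkManager",
    "[1641941461.4012] dhcp4 (eth0): selecting lease failed: -131 NetworkManager",
    "[1641941462.0190] dhcp4 (eth0): selecting lease failed: -131 NetworkManager",
]

counts, stamps = group_entries(lines)
for message, n in counts.most_common():
    seen = stamps[message]
    print(n, "x", message)
    print("  first:", to_utc(min(seen)), "span:", max(seen) - min(seen), "s")
```

If hundreds of entries fall within a few seconds, that looks more like a local retry loop than separate events, but this grouping alone cannot say whether the cause is an attack or a DHCP problem on the local network.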

The first crash that occurred had something to do with either the automatic updates or the automatic backups which occur at night. I unscheduled automatic backups to see if that makes the FB more stable. I can report that the FB lasted through the night. I’ll report back if this solves the problem. The next thing to try is re-flash a new image and not select the recommended automatic updates.

Hello Johnny,
I believe my 10+ year old router/modem was letting DoS and DDoS attacks through and crashing my FBX within a day or two of rebooting or reinstalling the newest FBX image. I replaced my modem with a new model and so far, so good; it’s been going for 3 whole days. The router/modem logs and FBX logs show an increase in attacks coming from around the world, but I think most are being deflected. Only time will tell. If people are having crashes with old routers, it might be worthwhile to replace the insecure equipment. I will report back in a week or two. To see my experience with day one of the new router/modem, go here: https://discuss.freedombox.org/t/attacks-on-freedombox-from-around-the-world/1915

Hi all,
Apologies for my late response. Unfortunately, the crashes persist so far: every 5 or 6 days my box stops responding. The last time it went silent was today at 7:00. Here are some lines from journalctl:

Feb 16 07:00:06 freedombox systemd[1]: Started Timeline of Snapper Snapshots.
Feb 16 07:00:06 freedombox systemd[1]: Started WordPress Scheduled Events Trigger (Cron).
Feb 16 07:00:06 freedombox dbus-daemon[366]: [system] Activating via systemd: service name='org.ope>
Feb 16 07:00:06 freedombox systemd[1]: Starting DBus interface for snapper...
Feb 16 07:00:06 freedombox dbus-daemon[366]: [system] Successfully activated service 'org.opensuse.>
Feb 16 07:00:06 freedombox systemd[1]: Started DBus interface for snapper.
Feb 16 07:00:06 freedombox systemd-helper[26412]: running timeline for 'root'.
-- Boot e7561208bb914a52b985bb8a66b9e17c --
Feb 16 08:10:50 freedombox kernel: Booting Linux on physical CPU 0x0
Feb 16 08:10:50 freedombox kernel: Linux version 5.10.0-11-armmp-lpae (debian-kernel@lists.debian.o>
Feb 16 08:10:50 freedombox kernel: CPU: ARMv7 Processor [410fc074] revision 4 (ARMv7), cr=30c5387d
Feb 16 08:10:50 freedombox kernel: CPU: div instructions available: patching division code
Feb 16 08:10:50 freedombox kernel: CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instructi>
Feb 16 08:10:50 freedombox kernel: OF: fdt: Machine model: Olimex A20-OLinuXino-LIME2
Feb 16 08:10:50 freedombox kernel: Memory policy: Data cache writealloc
Feb 16 08:10:50 freedombox kernel: efi: UEFI not found.
Feb 16 08:10:50 freedombox kernel: Reserved memory: created CMA memory pool at 0x000000004a000000, >
Feb 16 08:10:50 freedombox kernel: OF: reserved mem: initialized node default-pool, compatible id s>
Feb 16 08:10:50 freedombox kernel: Zone ranges:
Feb 16 08:10:50 freedombox kernel:   DMA      [mem 0x0000000040000000-0x000000006fffffff]
Feb 16 08:10:50 freedombox kernel:   Normal   empty
Feb 16 08:10:50 freedombox kernel:   HighMem  [mem 0x0000000070000000-0x000000007fffffff]
Feb 16 08:10:50 freedombox kernel: Movable zone start for each node
Feb 16 08:10:50 freedombox kernel: Early memory node ranges
Feb 16 08:10:50 freedombox kernel:   node   0: [mem 0x0000000040000000-0x000000007fffffff]
Feb 16 08:10:50 freedombox kernel: Initmem setup node 0 [mem 0x0000000040000000-0x000000007fffffff]
Feb 16 08:10:50 freedombox kernel: On node 0 totalpages: 262144
Feb 16 08:10:50 freedombox kernel:   DMA zone: 1728 pages used for memmap
Feb 16 08:10:50 freedombox kernel:   DMA zone: 0 pages reserved
Feb 16 08:10:50 freedombox kernel:   DMA zone: 196608 pages, LIFO batch:63
Feb 16 08:10:50 freedombox kernel:   HighMem zone: 65536 pages, LIFO batch:15
Feb 16 08:10:50 freedombox kernel: psci: probing for conduit method from DT.
Feb 16 08:10:50 freedombox kernel: psci: Using PSCI v0.1 Function IDs from DT
Feb 16 08:10:50 freedombox kernel: percpu: Embedded 21 pages/cpu s54668 r8192 d23156 u86016
Feb 16 08:10:50 freedombox kernel: pcpu-alloc: s54668 r8192 d23156 u86016 alloc=21*4096
Feb 16 08:10:50 freedombox kernel: pcpu-alloc: [0] 0 [0] 1 
Feb 16 08:10:50 freedombox kernel: Built 1 zonelists, mobility grouping on.  Total pages: 260416
Feb 16 08:10:50 freedombox kernel: Kernel command line: console=ttyS0,115200 quiet
Feb 16 08:10:50 freedombox kernel: Dentry cache hash table entries: 131072 (order: 7, 524288 bytes,>
Feb 16 08:10:50 freedombox kernel: Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, l>
Feb 16 08:10:50 freedombox kernel: mem auto-init: stack:off, heap alloc:on, heap free:off
Feb 16 08:10:50 freedombox kernel: Memory: 896504K/1048576K available (12288K kernel code, 1680K rw>

At 8:10 I unplugged the power plug and reconnected it a couple of seconds later. So the last line logged before it went silent reads:
freedombox systemd-helper[26412]: running timeline for 'root'
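For anyone comparing several crashes, the "last line before the box went silent" can be pulled out of saved journalctl output automatically. This is a minimal sketch, assuming the output was saved as text and that boots are separated by "-- Boot <32 hex digits> --" markers as in the excerpt above. The function name is made up for illustration.

```python
import re

# Boot separator as it appears in the journalctl excerpt above.
BOOT_MARKER = re.compile(r"^-- Boot [0-9a-f]{32} --$")

def last_lines_before_boot(journal_text, context=1):
    """For each boot marker, return the `context` lines logged just before it."""
    lines = journal_text.splitlines()
    results = []
    for i, line in enumerate(lines):
        if BOOT_MARKER.match(line.strip()):
            start = max(0, i - context)
            results.append(lines[start:i])
    return results

sample = """\
Feb 16 07:00:06 freedombox systemd[1]: Started DBus interface for snapper.
Feb 16 07:00:06 freedombox systemd-helper[26412]: running timeline for 'root'.
-- Boot e7561208bb914a52b985bb8a66b9e17c --
Feb 16 08:10:50 freedombox kernel: Booting Linux on physical CPU 0x0
"""

for tail in last_lines_before_boot(sample):
    print("\n".join(tail))
```

If the journal is kept on persistent storage, `journalctl -b -1 -e` should show the end of the previous boot directly. The last entry only shows what ran most recently before the hang, which is not necessarily its cause.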

The next time my system goes down I’ll contact Olimex myself.

Regards Oliver

System keeps crashing. Anyone else still having this issue?

Jun 06 03:39:01 freedombox systemd[1]: Starting Clean php session files...
Jun 06 03:39:01 freedombox CRON[20022]: pam_unix(cron:session): session closed for user root
Jun 06 03:39:03 freedombox systemd[1]: phpsessionclean.service: Succeeded.
Jun 06 03:39:03 freedombox systemd[1]: Finished Clean php session files.
Jun 06 03:39:03 freedombox systemd[1]: phpsessionclean.service: Consumed 2.021s CPU time.
Jun 06 03:40:05 freedombox systemd[1]: Started WordPress Scheduled Events Trigger (Cron).
Jun 06 03:40:08 freedombox systemd[1]: wordpress-freedombox.service: Succeeded.
Jun 06 03:40:08 freedombox systemd[1]: wordpress-freedombox.service: Consumed 2.374s CPU time.
Jun 06 03:41:54 freedombox sshd[20101]: Received disconnect from 61.177.173.49 port 53246:11:  [preauth]
Jun 06 03:41:54 freedombox sshd[20101]: Disconnected from authenticating user root 61.177.173.49 port 53246 [preauth]
Jun 06 03:44:40 freedombox sshd[20107]: Unable to negotiate with 61.177.173.54 port 36886: no matching key exchange method fo>
Jun 06 03:50:05 freedombox systemd[1]: Started WordPress Scheduled Events Trigger (Cron).
Jun 06 03:50:08 freedombox systemd[1]: wordpress-freedombox.service: Succeeded.
Jun 06 03:50:08 freedombox systemd[1]: wordpress-freedombox.service: Consumed 2.254s CPU time.
Jun 06 03:52:58 freedombox /usr/bin/plinth[514]: # storage usage-info
Jun 06 03:52:58 freedombox sudo[20133]:   plinth : PWD=/ ; USER=root ; COMMAND=/usr/share/plinth/actions/storage usage-info
Jun 06 03:52:58 freedombox sudo[20133]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=110)
Jun 06 03:52:59 freedombox sudo[20133]: pam_unix(sudo:session): session closed for user root
-- Boot 78d0fad7344344b9bf5359ff0b68d2eb --
Jun 06 07:51:35 freedombox kernel: Booting Linux on physical CPU 0x0
Jun 06 07:51:35 freedombox kernel: Linux version 5.10.0-14-armmp-lpae (debian-kernel@lists.debian.org) (gcc-10 (Debian 10.2.1>
Jun 06 07:51:35 freedombox kernel: CPU: ARMv7 Processor [410fc074] revision 4 (ARMv7), cr=30c5387d
Jun 06 07:51:35 freedombox kernel: CPU: div instructions available: patching division code
Jun 06 07:51:35 freedombox kernel: CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
Jun 06 07:51:35 freedombox kernel: OF: fdt: Machine model: Olimex A20-OLinuXino-LIME2
Jun 06 07:51:35 freedombox kernel: Memory policy: Data cache writealloc

As usual I am not able to see any hint as to why this happens in the log.
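Even when the log shows no cause, it does bound how long the box was silent: the gap between the last entry and the first kernel line of the next boot. A small sketch of that arithmetic follows. The year is an assumption, since short journal timestamps do not carry one, and the parsing assumes English month abbreviations.

```python
from datetime import datetime

ASSUMED_YEAR = 2022  # assumption: the short journal format omits the year

def parse_stamp(line, year=ASSUMED_YEAR):
    """Parse the leading 'Mon DD HH:MM:SS' of a journal line."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")

def silent_gap(last_line, first_boot_line):
    """Time between the last entry before the hang and the next boot."""
    return parse_stamp(first_boot_line) - parse_stamp(last_line)

last = "Jun 06 03:52:59 freedombox sudo[20133]: pam_unix(sudo:session): session closed for user root"
boot = "Jun 06 07:51:35 freedombox kernel: Booting Linux on physical CPU 0x0"

print(silent_gap(last, boot))  # 3:58:36
```

So the hang happened somewhere in a window of just under four hours. Routine jobs such as the WordPress cron trigger had been logging every ten minutes before that, so their absence after 03:52:59 suggests the system stopped soon after the last entry, though the log cannot prove it.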

Did you try booting on an SD card with a generic Debian and then check the root file system on your SSD? (you could also use that opportunity to check file systems of your other disks).

Perhaps SSDs are more robust than SD cards, but powering off by removing power probably increases the risk of file system issues.

I solved the problem by replacing my 10+ year old router/modem, which was letting DoS and DDoS attacks through and crashing my FBX within a day or two of rebooting or reinstalling the newest FBX image. My system has been up and running ever since I installed my Netgear CAX30. I subscribe to their Armor A.I. service, which learns about attacks and blocks them. Good luck. (Attacks on FreedomBox from Around the World?)

No I didn’t (yet) but I’m going to give it a try.
Removing power: I certainly don’t want to do that but at the moment it’s my only chance to ‘wake up’ my box, unfortunately.

I also see a lot of connection attempts in the FreedomBox journal. My router is a Fritzbox 7590, 2 or 3 years old. But I realised just now that I had not updated its OS for a while, so I updated my Fritzbox a couple of minutes ago.

I have had the same problem for the last few months: after a few days, no response.

This is still ongoing for me and the crashes have been happening for quite some time, probably well over a year. The Pioneer FreedomBox crashed again this morning after being up for less than a day as I recall. I found this thread as I’ve been watching journalctl -f over an SSH login and I saw this message a while ago:

Feb 07 16:29:19 freedombox pcp-pmie[2076]: Severe demand for real memory 3.66pgsout/s@freedombox

This was after the system had been up about 2 hours and 50 minutes. The output of free shows:

$ free
               total        used        free      shared  buff/cache   available
Mem:         1014056      318904       67972       41444      627180      572492
Swap:         507024      129536      377488

Given that about half the RAM is marked as available, I’m not sure where the reported lack of real memory occurred.
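To put numbers on "about half", the `free` output above can be turned into ratios. This is only a sketch of the arithmetic, using the values as posted (in KiB). The parsing helper is hypothetical and assumes the column layout shown above.

```python
def parse_free(text):
    """Parse `free` output into {row: {column: value_in_KiB}}."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    table = {}
    for line in lines[1:]:
        name, *values = line.split()
        # The Swap row has fewer columns; zip stops at the shorter sequence.
        table[name.rstrip(":")] = dict(zip(header, map(int, values)))
    return table

sample = """\
               total        used        free      shared  buff/cache   available
Mem:         1014056      318904       67972       41444      627180      572492
Swap:         507024      129536      377488
"""

table = parse_free(sample)
available_ratio = table["Mem"]["available"] / table["Mem"]["total"]
swap_used_ratio = table["Swap"]["used"] / table["Swap"]["total"]

print(f"available RAM: {available_ratio:.1%}")
print(f"swap in use:   {swap_used_ratio:.1%}")
```

With these values, roughly 56% of RAM is available and about a quarter of swap is in use. The pmie message itself quotes a page-out rate (3.66 pgsout/s), so it may be reacting to swap activity rather than to a shortage of available RAM. That reading is an assumption, not something the logs confirm.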

I do have several services running but none are accessible to unauthorized users. At the time of that message in the logs the system was essentially idle as I was not accessing it and I’m the only authorized user.

The root file system is on a SATA HDD, so I have re-enabled syslog logging hoping to capture something there, but so far I see that nothing has been written to the logs at the time of the crash.

This is an upgraded Bullseye system from the Buster release it originally came with.

I’m not sure what my plan is going forward. Clearly this system is not stable enough for continued use as it crashed again early this morning after being up for 1 day, 17 hours, and 40 minutes.

Examining what there is of the system logs (there are none except the boot logs) and looking at what was captured in my terminal from SSH with journalctl -f shows nothing unusual. Some sort of cron-type job exited and then the terminal was closed by the host (a message printed by the terminal on this computer).

I am running Bullseye on several other computers and have not experienced anything like this.

ETA: I have now reconfigured the journal to use persistent storage since I am using a SATA HDD for the root file system and have extended the retention time of its logs. Hopefully I can catch something after the next crash.
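For anyone wanting to verify a similar change, here is a small sketch that reads journald settings from a config text. `Storage=`, `SystemMaxUse=` and `MaxRetentionSec=` are real journald.conf options, but the values in the sample are illustrative assumptions, not the poster's actual settings. The helper only looks at one file, so drop-ins under /etc/systemd/journald.conf.d/ could still override it.

```python
import configparser

def journal_settings(conf_text):
    """Return the [Journal] section of a journald.conf-style text as a dict."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive, as systemd does
    parser.read_string(conf_text)
    if not parser.has_section("Journal"):
        return {}
    return dict(parser["Journal"])

def is_persistent(conf_text):
    """True only when Storage=persistent is set explicitly in this text."""
    # Note: the default, Storage=auto, also persists if /var/log/journal exists.
    return journal_settings(conf_text).get("Storage", "auto") == "persistent"

# Illustrative values only (assumptions).
sample = """\
[Journal]
#Storage=auto
Storage=persistent
SystemMaxUse=200M
MaxRetentionSec=1month
"""

print(journal_settings(sample))
print("persistent:", is_persistent(sample))
```

After the next crash, `journalctl --list-boots` should then list more than the current boot if persistence is working.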

There is an update job that runs every day on a systemd timer. After it finishes, if anything it updated needs to be restarted, it will automatically do a reboot. It is likely that this is what is happening, not a system crash or any other form of instability.

If the update/reboot frequency is problematic for you, you can change the systemd timer. I changed mine to run only once a week, because every time it reboots it kicks everyone off of the Tor bridge. The longer uptimes should make the bridge a more reliable resource for whoever is using it.

I don’t know what your hardware is, but you mentioned a Pioneer with an HDD. My OLinuXino Lime 2 with an HDD (the unit sold pre-assembled by Olimex) kept changing the /dev/sdX letters for the HDD and microSD whenever I tried to write a lot to the HDD, until I replaced the 2A power supply provided by Olimex with a 5A power supply (it perhaps doesn’t need that much; this is what was easily available). I had tried the “add a second power supply via micro USB” idea, but that did not help.

I can’t say how that would show up on FreedomBox; at the time I was trying to use it as a desktop computer and noticed write failures when running from the microSD and trying to install a system to the HDD. That makes me think this might not be your problem, because you managed to put your root fs there, but I prefer to mention it just in case.

I understand that, but there was nothing in the journalctl -f output on the terminal screen that indicated a reboot had been initiated. There were none of the usual shutdown messages I would expect from systemd shutting down the services. It just stopped abruptly with no messages whatsoever. This leads to the post by @Avron below.

I’ve given the power supply some thought as well so it is still on the list of potential culprits. I probably did not mention that the crashes started well before I moved the root FS to the SATA HDD a few weeks ago when the system was running from a 128 GB micro-SD card. I have not seen the crashes when I’ve been actively using the Box. It seems to happen at the oddest of times when the Box is otherwise idle.

Up thread @doliver10 mentioned a fix by Olimex for the random crashes he was experiencing which mine mirror exactly and have for quite some time. I’m not sure if the fix was supposed to be in the mainline kernel or patched in by the Debian/FreedomBox maintainers. If it is supposed to be downloaded from Olimex then my Box never got the memo as all the lines in my sources.list have debian.org as the base of the URL. I know there have been several kernel updates. Perhaps I should try a 6.x kernel from backports.

Logs are not preserved beyond boot because they are set to volatile to save SD cards from too many writes. You can change this behavior in System → Configuration. After that you should see logs for previous boots as well.

Do you have links to the fixes that Olimex did?