question

ektus asked

Cerbo GX random reboots

System setup: Cerbo GX MK1 with small touchscreen, wired ethernet connection
Power supply from 48V bus
Pylontech battery, 5x MPPT 150/35, 3~ Multiplus 2 5000, Lynx Shunt,
2x Shelly 3 EM (grid and AC Solar)

Operating system: Current Venus OS Large, ESS active (region: Germany), VRM active

Node-RED active with some flows to evaluate current production and consumption and alter the grid setpoint in order to optimize daily production.
The old 9.5 kWp AC-coupled solar generator may feed in and gets paid for it; the new 5.7 kWp DC-coupled system is not allowed to feed in.

Symptoms: The system randomly reboots, sometimes leaving the grid setpoint at a large negative value.

dmesg -H only shows entries since reboot.

Reboot on connection loss in the VRM settings is off.

Sometimes the reboot doesn't complete: the screen shows a small white square in the middle of a black background and remains unresponsive. The Node-RED flow and the VRM connection are active, but neither the local nor the remote console works. Power cycling the Cerbo resolves the problem, but sometimes it takes two attempts.

Is there a log file somewhere that could give a hint at what might be going on?

Any ideas?

Tags: cerbo gx, Venus OS, Node-RED

20 Answers
Alexandra answered ·

@ektus

Update the firmware (for the small white square problem). One of the recent versions was doing this exact thing on a few odd systems.

The grid setpoint not being where expected sounds like a flow restart problem, so not a Victron OS issue.

2 comments

ektus commented ·

I'm already running version 3.30 build 20240319094732; all of the reported problems are with this version. Checking for updates was the first thing I did. Most components in the system (Cerbo GX, MPPT, Lynx shunt) are up to date; the three Multiplus 2 5000 are on 506.

Alexandra ♦ ektus commented ·
@ektus

I would factory reset and start again.

kevgermany answered ·

Could be an intermittent interruption of the DC power to the Cerbo. This will cause an unclean shutdown and unpredictable restart. And that white square means you have to do another unclean shutdown.

3 comments

ektus commented ·

The power supply uses the cable that came with the Cerbo and comes directly from the Lynx bus. That one should be stable enough with 8x Pylontech US5000 batteries and at the time of the problem 5x MPPT devices providing power.

kevgermany ♦♦ ektus commented ·

Yes, it should be. But there could still be a loose connection, etc.

Also please check this link: https://community.victronenergy.com/storage/attachments/42304-2022-08-cerbo-gx-power-supply-issue-in-48v-systems.pdf



ektus commented ·

My Cerbo GX's serial number starts with HQ2248, so the power supply should be okay. I can add the extra capacitor and see if this improves stability. The voltage at the USB hub is only 4.6V without its own power supply attached. Might this indicate a problem with the Cerbo's 5V rail? I'll measure again directly at the Cerbo's port; it might also be a diode in the hub lowering the voltage.

Alex Pescaru answered ·

What does the "top" command say? Memory and CPU load?...

Could be a watchdog triggering?...

2 comments

ektus commented ·

top reports anywhere between 40% and 75% CPU, memory currently sits at 504M free (about half). node-red at 14..18% and gui at 13%, python3 6% and a lot of smaller processes. This is after 1h uptime, I'll watch further.
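
If it helps to compare snapshots over time, the per-process CPU figures can be pulled out of BusyBox top's batch output (`top -b -n 1`) with a few lines of Python. This is only a sketch: the function name `cpu_by_process` is made up for illustration, and rows that don't match the usual eight-column layout are simply skipped.

```python
import re

# One process row of BusyBox top: PID PPID USER STAT VSZ %VSZ %CPU COMMAND
ROW_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)%\s+(\d+)%\s+(.*\S)\s*$"
)

def cpu_by_process(lines):
    """Return (cpu_percent, command) pairs, busiest first."""
    rows = []
    for line in lines:
        m = ROW_RE.match(line)
        if m:  # header and summary lines do not match and are ignored
            rows.append((int(m.group(7)), m.group(8)))
    return sorted(rows, key=lambda r: r[0], reverse=True)

# Two rows in the layout that top prints on the Cerbo
sample = """\
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
  809   807 messageb S     4468   0%   6% dbus-daemon --system --nofork
 1960  1100 nodered  S     228m  23%  15% node-red
""".splitlines()

for cpu, cmd in cpu_by_process(sample):
    print(f"{cpu:3d}%  {cmd}")
```

Running this from cron or a loop and appending the result to a file would give some history to look at after the next reboot.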

If it were a watchdog triggering, would there be log entries somewhere? Debugging random reboots (once or twice a day, if that) without data isn't fun :-(

The 5V measures 5.03V directly at the Cerbo's USB connector, so all good there.

Alex Pescaru ektus commented ·

Pretty high usage of the CPU... Especially from the gui.

Mine doesn't go higher than 2%. The same with python.

All the logs are in the /data/log folder.

nickdb answered ·

Try reflashing the unit. There is a reset-to-factory procedure that you can find via Google or local search.

Don't bother with the cap, it won't help.

The cause can only be a software issue, temperature, a missed wiring issue or a hardware fault. Eliminate the software by flashing it from scratch.

ektus answered ·

Okay, we're onto something here. How can I reduce the load and/or increase the threshold?

From the file /data/log/messages:

Apr 14 10:27:53 einstein daemon.info connmand[890]: ntp: time slew +0.042245 s
Apr 14 11:27:52 einstein daemon.err watchdog[670]: loadavg 11 7 4 is higher than the given threshold 0 6 6!
Apr 14 11:27:53 einstein daemon.err watchdog[670]: repair binary /usr/sbin/store_watchdog_error.sh returned 253 = 'load average too high'
Apr 14 11:27:53 einstein daemon.alert watchdog[670]: shutting down the system because of error 253 = 'load average too high'
Apr 14 11:27:53 einstein daemon.err watchdog[670]: /usr/sbin/sendmail does not exist or is not executable (errno = 2)
Apr 14 11:28:03 einstein syslog.info syslogd exiting
Apr 14 11:30:18 einstein syslog.info syslogd started: BusyBox v1.31.1
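
To find every such trip in the messages file instead of scrolling for them, something like the following could work. It is a rough sketch that assumes the watchdog always logs the line in exactly the format shown above; `find_load_trips` is just an illustrative name.

```python
import re

# Matches the "loadavg ... is higher than the given threshold ..." line above
LOAD_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s.*watchdog\[\d+\]: "
    r"loadavg (?P<l1>\d+) (?P<l5>\d+) (?P<l15>\d+) is higher than "
    r"the given threshold (?P<t1>\d+) (?P<t5>\d+) (?P<t15>\d+)!"
)

def find_load_trips(lines):
    """Return (timestamp, loads, thresholds) for each watchdog load trip."""
    trips = []
    for line in lines:
        m = LOAD_RE.search(line)
        if m:
            loads = tuple(int(m[k]) for k in ("l1", "l5", "l15"))
            limits = tuple(int(m[k]) for k in ("t1", "t5", "t15"))
            trips.append((m["stamp"], loads, limits))
    return trips

sample = [
    "Apr 14 10:27:53 einstein daemon.info connmand[890]: ntp: time slew +0.042245 s",
    "Apr 14 11:27:52 einstein daemon.err watchdog[670]: loadavg 11 7 4 is "
    "higher than the given threshold 0 6 6!",
]
print(find_load_trips(sample))
```

Here the 1-minute limit is 0 and the 5-minute load of 7 exceeds its limit of 6, which suggests (an assumption on my part) that a limit of 0 means "not checked".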

The last screen of top running in the SSH session (might be from an earlier reboot):

Mem: 542428K used, 487656K free, 2972K shrd, 35304K buff, 135828K cached
CPU:  72% usr  10% sys   0% nic  14% idle   0% io   1% irq   1% sirq
Load average: 10.20 7.20 5.36 2/307 7090
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
 1960  1100 nodered  S     228m  23%  15% node-red
 1975  1071 root     S     137m  14%  14% /opt/victronenergy/gui/gui -nomouse -display Multi: LinuxFb: VNC:size=800x480:depth=32:passwordFile=/data/conf/vncpassword.txt:0
 1139  1116 root     S    22004   2%   7% {dbus_systemcalc} /usr/bin/python3 -u /opt/victronenergy/dbus-systemcalc-py/dbus_systemcalc.py
  809   807 messageb S     4468   0%   6% dbus-daemon --system --nofork
 1088  1061 root     S    38364   4%   5% {vrmlogger.py} /usr/bin/python3 -u /opt/victronenergy/vrmlogger/vrmlogger.py
 2049  2038 root     S    32272   3%   5% python /data/dbus-shelly-3em-inverter/dbus-shelly-3em-inverter.py
 2041  2039 root     S    32208   3%   5% python /data/dbus-shelly-3em-smartmeter/dbus-shelly-3em-smartmeter.py
 1129  1106 root     S    26032   3%   4% {localsettings.p} /usr/bin/python3 -u /opt/victronenergy/localsettings/localsettings.py --path=/data/conf
 1161  1147 root     S    21200   2%   4% {dbus_generator.} /usr/bin/python3 -u /opt/victronenergy/dbus-generator-starter/dbus_generator.py
 2034  2028 root     S    55828   5%   3% /usr/bin/flashmq
 1120  1063 root     S    19756   2%   2% /opt/victronenergy/venus-platform/venus-platform
 1131  1110 root     S     8748   1%   2% /opt/victronenergy/hub4control/hub4control
 2033  2024 root     S    21124   2%   2% {vesmart_server.} /usr/bin/python3 -u /opt/victronenergy/vesmart-server/vesmart_server.py -i hci0
 1163  1149 root     S    11272   1%   2% /opt/victronenergy/dbus-fronius/dbus-fronius
 1694  1658 root     S     3668   0%   1% /opt/victronenergy/mk2-dbus/mk2-dbus --log-before 25 --log-after 25 --banner -w -s /dev/ttyS4 -i -t mk3 --settings /data/var/lib/mk2-dbus/m
 1277  1124 www-data S     6908   1%   1% nginx: worker process
 2535  2533 root     S     3480   0%   1% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 3 --banner -s /dev/ttyUSB1
 1674  1672 root     S     3376   0%   1% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 0 --banner -s /dev/ttyS5
 1683  1681 root     S     3424   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 0 --banner -s /dev/ttyS7
 1664  1659 root     S     3376   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 0 --banner -s /dev/ttyS6
 1747  1737 root     S     3648   0%   0% /opt/victronenergy/vecan-dbus/vecan-dbus -c socketcan:can0 --banner --log-before 25 --log-after 25 -vv
 1839  1827 root     S     3408   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 3 --banner -s /dev/ttyUSB0
13524 13496 root     R     2780   0%   0% top


1 comment

nickdb ♦♦ commented ·
What modifications have you made?

If it is a vanilla install, reflash it like already suggested.

If you have customised it, remove/disable changes and see what that does to load.

The Cerbo should be fine with the devices you have.

ektus answered ·

It has been modified, but only with things I need for operation. Notably the interface for my two shelly 3em smart meters and by activating node-red for tuning the grid setpoint. Removing those would lose functionality, especially as only part of the solar generators are allowed to feed to the grid. One of the shellies is needed for the grid metering in ESS, and with the standard ESS settings the battery would be full well before noon, reducing the production of my DC solar to zero during the rest of the day.

The second shelly monitors the AC solar (SMA inverter without usable data connection). This value is needed to dynamically set the grid setpoint.

As is now, I use the DC system for house and battery and sell everything from the AC system unless needed for charging the cars or days with low production. This is switched manually.

Currently, it's once again sitting at the small white square. In this state, everything but the GUI works and the system load looks okay. GUI seems to have used 40MB of RAM, as the free value now sits at 546MB (was 504MB earlier):

root@VictronCerboEinstein:/data/log# top
Mem: 483404K used, 546680K free, 1120K shrd, 24956K buff, 131004K cached
CPU:  32% usr  16% sys   0% nic  44% idle   0% io   4% irq   4% sirq
Load average: 0.87 2.24 2.53 1/305 19722
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
  806   804 messageb S    31716   3%   7% dbus-daemon --system --nofork
 2037  2028 root     S    31696   3%   7% python /data/dbus-shelly-3em-smartmeter/dbus-shelly-3em-smartmeter.py
19722  7157 root     R     2688   0%   7% top
 1948  1091 nodered  S     224m  22%   4% node-red
 2038  2031 root     S    31760   3%   4% python /data/dbus-shelly-3em-inverter/dbus-shelly-3em-inverter.py
 1146  1107 root     S    21972   2%   4% {dbus_systemcalc} /usr/bin/python3 -u /opt/victronenergy/dbus-systemcalc-py/dbus_systemcalc.py
 1144  1134 root     S    21068   2%   4% {dbus_generator.} /usr/bin/python3 -u /opt/victronenergy/dbus-generator-starter/dbus_generator.py
 2026  2018 root     S    21044   2%   4% {vesmart_server.} /usr/bin/python3 -u /opt/victronenergy/vesmart-server/vesmart_server.py -i hci0
 1111  1056 root     S    19808   2%   4% /opt/victronenergy/venus-platform/venus-platform
 1153  1136 root     S    11272   1%   4% /opt/victronenergy/dbus-fronius/dbus-fronius
 1123  1101 root     S     8632   1%   4% /opt/victronenergy/hub4control/hub4control
 1680  1652 root     S     3668   0%   4% /opt/victronenergy/mk2-dbus/mk2-dbus --log-before 25 --log-after 25 --banner -w -s /dev/ttyS4 -i -t mk3 --settings /data/var/lib/mk2-dbus
 1697  1688 root     S     3376   0%   4% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 0 --banner -s /dev/ttyS7
 1164  1145 root     S    64136   6%   0% python /data/SetupHelper/PackageManager.py
 1112  1064 root     S    63704   6%   0% /opt/victronenergy/gui/gui -nomouse -display Multi: LinuxFb: VNC:size=800x480:depth=32:passwordFile=/data/conf/vncpassword.txt:0
 2032  2020 root     S    55312   5%   0% /usr/bin/flashmq
 1149  1132 root     S    45300   4%   0% {dbus-modbus-cli} /usr/bin/python3 -u /opt/victronenergy/dbus-modbus-client/dbus-modbus-client.py
 2024  2016 root     S    40592   4%   0% {mqtt-rpc.py} /usr/bin/python3 -u /opt/victronenergy/mqtt-rpc/mqtt-rpc.py
 1080  1054 root     S    38060   4%   0% {vrmlogger.py} /usr/bin/python3 -u /opt/victronenergy/vrmlogger/vrmlogger.py
 1075  1060 root     S    34192   3%   0% {venus-button-ha} /usr/bin/python3 -u /opt/victronenergy/venus-button-handler/venus-button-handler -D
 1148  1109 root     S    27708   3%   0% {dbus_shelly.py} /usr/bin/python3 /opt/victronenergy/dbus-shelly/dbus_shelly.py
 1120  1097 root     S    26032   3%   0% {localsettings.p} /usr/bin/python3 -u /opt/victronenergy/localsettings/localsettings.py --path=/data/conf
 1118  1095 root     S    23264   2%   0% {netmon} /usr/bin/python3 -u /opt/victronenergy/netmon/netmon
  880     1 root     S    22740   2%   0% php-fpm: master process (/etc/php-fpm.conf)
  881   880 www-data S    22740   2%   0% php-fpm: pool www
  882   880 www-data S    22740   2%   0% php-fpm: pool www
 1157  1138 root     S    21660   2%   0% {dbus_digitalinp} /usr/bin/python3 -u /opt/victronenergy/dbus-digitalinputs/dbus_digitalinputs.py --poll=poll /dev/gpio/digital_input_1 /
 1129  1103 root     S    19532   2%   0% {dbus_vebus_to_p} /usr/bin/python3 -u /opt/victronenergy/dbus-vebus-to-pvinverter/dbus_vebus_to_pvinverter.py
 1074  1066 simple-u S    12592   1%   0% /bin/simple-upnpd --xml /var/run/simple-upnpd.xml -d
  118     1 root     S    12068   1%   0% /sbin/udevd -d
 1072  1062 root     S    10220   1%   0% /opt/victronenergy/venus-access/venus-access
  937     1 root     S     9072   1%   0% /usr/sbin/wpa_supplicant -u -O /var/run/wpa_supplicant -s
 2897  2867 root     S     8308   1%   0% -sh
 7157  7105 root     S     8308   1%   0% -sh
 1938     1 root     S     8288   1%   0% -sh
  809     1 root     S     7592   1%   0% /usr/sbin/haveged -w 1024 -v 1
 1269  1115 www-data S     6908   1%   0% nginx: worker process
 1115  1093 root     S     6640   1%   0% nginx: master process /usr/sbin/nginx


3 comments

nickdb ♦♦ commented ·
There is not much more we can assist you with; your choices are limited.

Either try to make your customisations more efficient, try to offload Node-RED to its own hardware off the Cerbo, or upgrade to more powerful hardware, like the Ekrano.

New Venus OS adds features and overhead and Victron are unlikely to consider the effects of mods on overall performance, the cerbo just isn't designed with that in mind.

I have a fair number of devices and run Node-red, the Cerbo runs fine, so it can't just be Node-red.

pau1phi11ips nickdb ♦♦ commented ·

@ektus I agree with Nick. It sounds like the Shelly EM driver could do with some optimising. Check it's not writing lots to a log file or polling too often.

I use a Shelly PM driver that's on a small grid tie inverter. Even that uses a fair amount of CPU.

ektus pau1phi11ips commented ·
As can be seen in top, it does use some cpu, but only 3% each (or some such). Increasing the threshold for the watchdog seems to have stabilised things. uptime currently sits at 2 days, 4:20 and load average 2.77, 2.35, 2.27.


Load is not stable. It's creeping up and down anywhere between 1 and 10.
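
To see when and how often the load climbs, a small logger reading `/proc/loadavg` might help. This is a sketch under assumptions: the default limits `(0, 6, 6)` are taken from the watchdog message quoted earlier in this thread, a limit of 0 is treated as "not checked", and the function names are invented for illustration.

```python
import time

def parse_loadavg(text):
    """Parse the first three fields of /proc/loadavg into floats."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)

def over_limit(loads, limits):
    """True if any load exceeds its limit; a limit of 0 is skipped (assumption)."""
    return any(limit > 0 and load > limit for load, limit in zip(loads, limits))

def watch(limits=(0, 6, 6), interval_s=10, path="/proc/loadavg"):
    """Print a timestamped line whenever the load is above the limits."""
    while True:
        with open(path) as f:
            loads = parse_loadavg(f.read())
        if over_limit(loads, limits):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(stamp, "load", loads, "limits", limits, flush=True)
        time.sleep(interval_s)

# watch() loops forever; call it from an SSH session and redirect the
# output to a file under /data so it survives a reboot.
```

With the values from my last `uptime`, `over_limit((2.77, 2.35, 2.27), (0, 6, 6))` is false, while the loads from the watchdog message, `(11, 7, 4)`, trip it.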


If I had known earlier the Cerbo GX is such a weak device, I might have gone for something better.
Kevin Windrem answered ·

The white square in the middle of the local screen indicates a problem with the gui task. Check /data/log/start-gui/current. Might provide a clue.



3 comments

Kevin Windrem commented ·

There should be more entries for a normal boot.


If you use tail -50 /data/log/start-gui/current | tai64nlocal, the times will be converted to a readable format and you can more easily tell if things have changed since the last look at the logs. See for example if the gui is restarting over and over again.
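
If the log has been copied off the Cerbo and tai64nlocal is not at hand, the timestamps can also be decoded by hand. The sketch below assumes the usual TAI64N layout (16 hex digits of seconds offset by 2^62, then 8 hex digits of nanoseconds) and a fixed 10-second lead over Unix time. That 10-second figure is an assumption, but it agrees with the `time=1713094232.833523` value logged next to label `@40000000661bbe6231b34634` in the start-gui log posted in this thread.

```python
from datetime import datetime, timezone

TAI64_BASE = 2**62  # TAI64 labels are offset by 2^62
TAI_LEAD_S = 10     # assumed fixed lead of the label over Unix time

def decode_tai64n(label):
    """Split '@' + 24 hex digits into (unix_seconds, nanoseconds)."""
    hexpart = label.lstrip("@")
    seconds = int(hexpart[:16], 16) - TAI64_BASE - TAI_LEAD_S
    nanos = int(hexpart[16:24], 16)
    return seconds, nanos

def to_utc(label):
    """Convert a TAI64N label to a timezone-aware UTC datetime."""
    seconds, nanos = decode_tai64n(label)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.replace(microsecond=nanos // 1000)

print(to_utc("@40000000661bbe6231b34634"))
```

On the device itself tai64nlocal is the simpler route; this is only for reading logs elsewhere.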

Another thing to look at is dbus-spy and see if there's a com.victronenergy.gui service listed. It may be coming and going if the gui is restarting.

If you have not done so already, it may be time to reinstall the firmware. Go to the Firmware menu, select the backup firmware and after it's rebooted, go to online updates and download the latest firmware again. If there are any mods you've made to the system they may need to be reinstalled, but wait to see if the system comes up before installing them. Might also be "interesting" to turn off node red to see if that is somehow the source of your problems.

ektus Kevin Windrem commented ·
Alexandra ♦ ektus commented ·
Yeah it's a funny one.

I am running 3.3 with no issues.

I had to do a factory reset though on one of the recent versions. But for now, stay with the version that worked best for you.

ektus answered ·
root@VictronCerboEinstein:/data/log/start-gui# more current
@40000000661b9fdc1ee9b434 [Service] installing "/service/websockify-c"
@40000000661b9ffd14b5eb04 file:///opt/victronenergy/gui/qml/OverviewTiles.qml:148: TypeError: Result of expression 'stateTile.activeNotifications' [undefined] is not an object.
@40000000661b9ffd14b90014 file:///opt/victronenergy/gui/qml/OverviewTiles.qml:144: TypeError: Result of expression 'stateTile.activeNotifications' [undefined] is not an object.
@40000000661b9ffd14b91f54 file:///opt/victronenergy/gui/qml/OverviewTiles.qml:94: TypeError: Result of expression 'activeNotifications' [undefined] is not an object.
@40000000661b9ffd14b9e68c file:///opt/victronenergy/gui/qml/OverviewTiles.qml:90: ReferenceError: Can't find variable: NotificationCenter
@40000000661ba2230f78b75c ScreenSaver::timer
@40000000661ba2230f7dfeec ScreenSaver::screenOff
@40000000661bac862865b29c VePlatformVenus::wakeupEvent
@40000000661bac8628685a4c ScreenSaver::enable
@40000000661bac86286869ec ScreenSaver::screenOn
@40000000661bac8628734efc VeQuickView::mousePressEvent: wakeupEvent
@40000000661baede287f8bcc ScreenSaver::timer
@40000000661baede287fd604 ScreenSaver::screenOff
@40000000661bbe550e63096c *** CCGX booted (30253) ***
@40000000661bbe552cedbb5c *** starting start-gui ***
@40000000661bbe56017ada54 Error org.freedesktop.DBus.Error.ServiceUnknown: The name com.victronenergy.settings was not provided by any .service files
@40000000661bbe5627b8636c Error org.freedesktop.DBus.Error.ServiceUnknown: The name com.victronenergy.settings was not provided by any .service files
@40000000661bbe572fc2dc2c Error org.freedesktop.DBus.Error.ServiceUnknown: The name com.victronenergy.settings was not provided by any .service files
@40000000661bbe5835579614 Error org.freedesktop.DBus.Error.ServiceUnknown: The name com.victronenergy.settings was not provided by any .service files
@40000000661bbe5939e36f64 Error org.freedesktop.DBus.Error.ServiceUnknown: The name com.victronenergy.settings was not provided by any .service files
@40000000661bbe5b043ec8cc Error org.freedesktop.DBus.Error.ServiceUnknown: The name com.victronenergy.settings was not provided by any .service files
@40000000661bbe5c0bd4b59c Error org.freedesktop.DBus.Error.ServiceUnknown: The name com.victronenergy.settings was not provided by any .service files
@40000000661bbe5d1234d5d4 Error org.freedesktop.DBus.Error.ServiceUnknown: The name com.victronenergy.settings was not provided by any .service files
@40000000661bbe5e1e0251d4 Error org.freedesktop.DBus.Error.ServiceUnknown: The name com.victronenergy.settings was not provided by any .service files
@40000000661bbe5f247fd57c Error org.freedesktop.DBus.Error.ServiceUnknown: The name com.victronenergy.settings was not provided by any .service files
@40000000661bbe602a73aef4 Error org.freedesktop.DBus.Error.ServiceUnknown: The name com.victronenergy.settings was not provided by any .service files
@40000000661bbe6231b34634 method return time=1713094232.833523 sender=:1.43 -> destination=:1.46 serial=9 reply_serial=2
@40000000661bbe6231b36d44    int32 0
@40000000661bbe6312bd0cec *** headless device=0
@40000000661bbe6312bd2c2c *** Waiting for localsettings...
@40000000661bbe632317dd24 *** Localsettings is up, continuing...
@40000000661bbe642d55201c *** Starting gui, with VNC enabled (VncLocal=1 VncInternet=1)
@40000000661bbe65199ba80c QVNCServer created on port 5900
@40000000661bbe651d71970c Reloading input devices: "LinuxInput:/dev/input/event0 LinuxInput:/dev/input/event1 LinuxInput:/dev/input/event2"
@40000000661bbe651f4bd934 "using /etc/venus for runtime features"
@40000000661bbe651fa77e24 "running on einstein"
@40000000661bbe6523d01d1c Connecting to deprecated signal QDBusConnectionInterface::serviceOwnerChanged(QString,QString,QString)
@40000000661bbe6536ef609c Creating settings
@40000000661bbe661d81db1c [VeQItemExportedDbusServices] Registered service "debug.victronenergy.gui"
root@VictronCerboEinstein:/data/log/start-gui#

That's the current file with GUI not running.
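
To check whether the gui is restarting over and over, the marker lines in this file can be counted. A minimal sketch, assuming the marker strings always look as they do in the log above; `summarize_gui_log` is only an illustrative name.

```python
from collections import Counter

def summarize_gui_log(lines):
    """Count boots, gui starts and error lines in a start-gui log."""
    counts = Counter()
    for line in lines:
        if "*** starting start-gui ***" in line:
            counts["gui_starts"] += 1
        elif "*** CCGX booted" in line:
            counts["boots"] += 1
        elif "org.freedesktop.DBus.Error.ServiceUnknown" in line:
            counts["service_unknown"] += 1
        elif "TypeError" in line or "ReferenceError" in line:
            counts["qml_errors"] += 1
    return dict(counts)

sample = [
    "@40000000661bbe550e63096c *** CCGX booted (30253) ***",
    "@40000000661bbe552cedbb5c *** starting start-gui ***",
    "@40000000661bbe56017ada54 Error org.freedesktop.DBus.Error.ServiceUnknown: "
    "The name com.victronenergy.settings was not provided by any .service files",
    "@40000000661bbe6536ef609c Creating settings",
]
print(summarize_gui_log(sample))
```

More gui starts than boots would point to the gui being restarted without a full reboot.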

ektus answered ·

Reverted to 3.22, now top looks much nicer:

Mem: 514880K used, 515204K free, 3160K shrd, 20440K buff, 137632K cached
CPU:  46% usr   6% sys   0% nic  44% idle   0% io   1% irq   0% sirq
Load average: 1.02 2.21 1.24 1/307 3657
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
 1969  1115 nodered  S     216m  21%  12% node-red
 1152  1131 root     S    21744   2%   5% {dbus_systemcalc} /usr/bin/python3 -u /opt/victronenergy/dbus-systemcalc-py/dbus_systemcalc.py
 2055  2052 root     S    31696   3%   4% python /data/dbus-shelly-3em-smartmeter/dbus-shelly-3em-smartmeter.py
  826   824 messageb S     4148   0%   4% dbus-daemon --system --nofork
 2056  2054 root     S    31760   3%   4% python /data/dbus-shelly-3em-inverter/dbus-shelly-3em-inverter.py
 1097  1076 root     S    38124   4%   3% {vrmlogger.py} /usr/bin/python3 -u /opt/victronenergy/vrmlogger/vrmlogger.py
 1135  1087 root     S     144m  14%   3% /opt/victronenergy/gui/gui -nomouse -display Multi: LinuxFb: VNC:size=800x480:depth=32:passwordFile=/data/conf/vncpassword.txt:0
 1178  1160 root     S    21068   2%   2% {dbus_generator.} /usr/bin/python3 -u /opt/victronenergy/dbus-generator-starter/dbus_generator.py
 1148  1121 root     S    26032   3%   2% {localsettings.p} /usr/bin/python3 -u /opt/victronenergy/localsettings/localsettings.py --path=/data/conf
 1146  1125 root     S     8632   1%   1% /opt/victronenergy/hub4control/hub4control
 2047  2039 root     S    55308   5%   1% /usr/bin/flashmq
 2053  2041 root     S    21104   2%   1% {vesmart_server.} /usr/bin/python3 -u /opt/victronenergy/vesmart-server/vesmart_server.py -i hci0
 1180  1162 root     S    11272   1%   1% /opt/victronenergy/dbus-fronius/dbus-fronius
 1711  1672 root     S     3664   0%   1% /opt/victronenergy/mk2-dbus/mk2-dbus --log-before 25 --log-after 25 --banner -w -s /dev/ttyS4 -i -t mk3 --settings /data/var/lib/mk2-dbus/mkxport-
 1099  1090 simple-u S    12592   1%   1% /bin/simple-upnpd --xml /var/run/simple-upnpd.xml -d
 2540  2538 root     S     3480   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 3 --banner -s /dev/ttyUSB1
 1729  1718 root     S     3376   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 0 --banner -s /dev/ttyS7
 1877  1866 root     S     3376   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 3 --banner -s /dev/ttyUSB0
 1690  1688 root     S     3376   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 0 --banner -s /dev/ttyS5
 1682  1678 root     S     3376   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 0 --banner -s /dev/ttyS6
 1834  1826 root     S     3644   0%   0% /opt/victronenergy/vecan-dbus/vecan-dbus -c socketcan:can0 --banner --log-before 25 --log-after 25 -vv
 3509  3238 root     R     2780   0%   0% top
 1182  1158 root     S    45304   4%   0% {dbus-modbus-cli} /usr/bin/python3 -u /opt/victronenergy/dbus-modbus-client/dbus-modbus-client.py
 1154  1127 root     S    19532   2%   0% {dbus_vebus_to_p} /usr/bin/python3 -u /opt/victronenergy/dbus-vebus-to-pvinverter/dbus_vebus_to_pvinverter.py
 1828  1820 root     S     3176   0%   0% /opt/victronenergy/can-bus-bms/can-bus-bms --log-before 25 --log-after 25 -vv -c socketcan:can1 --banner
 1194  1170 root     S    64136   6%   0% python /data/SetupHelper/PackageManager.py
 1105  1078 root     S    13620   1%   0% /opt/victronenergy/venus-platform/venus-platform
 3217  1830 root     S     5524   1%   0% sshd: root@pts/0
 1192  1168 root     S     3484   0%   0% /opt/victronenergy/dbus-adc/dbus-adc --banner
 1138  1096 root     S     3044   0%   0% {serial-starter.} /bin/bash /opt/victronenergy/serial-starter/serial-starter.sh
   48     2 root     IW       0   0%   0% [kworker/u4:1-ev]
   59     2 root     IW       0   0%   0% [kworker/u4:3-ev]
 2051  2043 root     S    40324   4%   0% {mqtt-rpc.py} /usr/bin/python3 -u /opt/victronenergy/mqtt-rpc/mqtt-rpc.py
 1108  1083 root     S    34200   3%   0% {venus-button-ha} /usr/bin/python3 -u /opt/victronenergy/venus-button-handler/venus-button-handler -D
 1179  1133 root     S    27688   3%   0% {dbus_shelly.py} /usr/bin/python3 /opt/victronenergy/dbus-shelly/dbus_shelly.py
2 comments

Guy Stewart (Victron Community Manager) ♦♦ commented ·

Hi @ektus,

The R&D team is aware of your report, thanks for the details.

I think your conclusion here to revert to v3.22 is the best thing to do at the moment. I also recommend the same for anyone else experiencing this issue until there is more info or another release.

ojack Guy Stewart (Victron Community Manager) ♦♦ commented ·
I have a reboot issue with beta v3.40~2. After reverting to v3.40~1, everything is OK again.

All versions before were ok, too.


semlohnhoj answered ·

This is interesting. I have the new Victron Energy Meter VM-3P75CT on ve.can and since then I've had regular random reboots. I put this down to the alleged "professional quality" ethernet cable I used and replaced it with a krone ethernet cable which did seem to reduce the occurrences but it is still happening maybe every other day now instead of several times a day. I'm using Node Red as well monitoring the meter. I didn't realise there were logs I could investigate so I'll look at that. I'll check them later when I get a moment.

This may well be unrelated but I thought I'd mention it.

2 comments

semlohnhoj commented ·
Also, in my case I've noticed it operates in Schrödinger's cat mode where if you never open the NodeRed dashboard it doesn't reboot. I just opened it now for the first time in a few days and within 30 seconds it has rebooted. I have noticed this before but just thought I was imagining it. This could just be load related pushing the CPU over the edge I suppose?
ektus semlohnhoj commented ·
Sounds plausible. For me with 3.22, CPU usage jumps by around 5% if I open the Node-RED dashboard: 63% to 68% without it, and more like 68% to 75% with the dashboard open. This extra load might be too much in some cases.

It would be nice if the watchdog settings could be altered to trigger at a higher load. Or the efficiency of the system be improved to avoid the high loads altogether.
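
For what it's worth, the "threshold 0 6 6" in the watchdog message looks like the `max-load-1`, `max-load-5` and `max-load-15` settings of the standard Linux watchdog daemon. Whether Venus OS keeps them in `/etc/watchdog.conf`, and whether an edit survives a firmware update, is an assumption here and not something confirmed in this thread. A small sketch to read the current values from such a file; the sample text is made up to match the log message.

```python
LOAD_KEYS = ("max-load-1", "max-load-5", "max-load-15")

def read_max_load(conf_text):
    """Return the max-load-* settings found in watchdog.conf style text."""
    limits = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in LOAD_KEYS:
            limits[key] = int(value)
    return limits

# Hypothetical file content, chosen to match "threshold 0 6 6"
sample_conf = """\
# load limits
max-load-1 = 0
max-load-5 = 6
max-load-15 = 6
"""
print(read_max_load(sample_conf))
```

Raising these limits only stops the watchdog from reacting; it does nothing about whatever is driving the load up.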


ektus answered ·

For reference: With version 3.22 and nothing changed in the config, after 2:40 hours top looks as follows:

Mem: 532984K used, 497100K free, 2940K shrd, 32216K buff, 136200K cached
CPU:  60% usr   6% sys   0% nic  30% idle   0% io   1% irq   0% sirq
Load average: 2.99 3.85 4.08 1/308 5973
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
 1077  1063 root     R     137m  14%  14% /opt/victronenergy/gui/gui -nomouse -display Multi: LinuxFb: VNC:size=800x480:depth=32:passwordFile=/data/conf/vncpassword
 1954  1089 nodered  S     226m  22%  10% node-red
 1149  1125 root     S    21696   2%   7% {dbus_systemcalc} /usr/bin/python3 -u /opt/victronenergy/dbus-systemcalc-py/dbus_systemcalc.py
  803   801 messageb S     4912   0%   5% dbus-daemon --system --nofork
 1098  1053 root     S    38204   4%   5% {vrmlogger.py} /usr/bin/python3 -u /opt/victronenergy/vrmlogger/vrmlogger.py
 2043  2040 root     S    31760   3%   4% python /data/dbus-shelly-3em-inverter/dbus-shelly-3em-inverter.py
 2041  2037 root     S    31696   3%   4% python /data/dbus-shelly-3em-smartmeter/dbus-shelly-3em-smartmeter.py
 1158  1133 root     S    21196   2%   3% {dbus_generator.} /usr/bin/python3 -u /opt/victronenergy/dbus-generator-starter/dbus_generator.py
 1115  1095 root     S    26032   3%   2% {localsettings.p} /usr/bin/python3 -u /opt/victronenergy/localsettings/localsettings.py --path=/data/conf
 1161  1135 root     S    11272   1%   2% /opt/victronenergy/dbus-fronius/dbus-fronius
 1152  1113 root     S     8764   1%   2% /opt/victronenergy/hub4control/hub4control
 2034  2024 root     S    55824   5%   1% /usr/bin/flashmq
 2033  2028 root     S    21092   2%   1% {vesmart_server.} /usr/bin/python3 -u /opt/victronenergy/vesmart-server/vesmart_server.py -i hci0
 1685  1652 root     S     3664   0%   1% /opt/victronenergy/mk2-dbus/mk2-dbus --log-before 25 --log-after 25 --banner -w -s /dev/ttyS4 -i -t mk3 --settings /data/v
 1842  1834 root     S     3404   0%   1% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 3 --banner -s /dev/ttyUSB0
 2511  2509 root     S     3404   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 3 --banner -s /dev/ttyUSB1
 1655  1651 root     S     3376   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 0 --banner -s /dev/ttyS6
 1668  1666 root     S     3376   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 0 --banner -s /dev/ttyS5
 1701  1694 root     S     3376   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 0 --banner -s /dev/ttyS7
 1776  1771 root     S     3644   0%   0% /opt/victronenergy/vecan-dbus/vecan-dbus -c socketcan:can0 --banner --log-before 25 --log-after 25 -vv
 5748  5733 root     R     2780   0%   0% top
 1159  1131 root     S    45304   4%   0% {dbus-modbus-cli} /usr/bin/python3 -u /opt/victronenergy/dbus-modbus-client/dbus-modbus-client.py
 1145  1117 root     S    19532   2%   0% {dbus_vebus_to_p} /usr/bin/python3 -u /opt/victronenergy/dbus-vebus-to-pvinverter/dbus_vebus_to_pvinverter.py
 1168  1141 root     S     3484   0%   0% /opt/victronenergy/dbus-adc/dbus-adc --banner
 1169  1143 root     S    64136   6%   0% python /data/SetupHelper/PackageManager.py
 1808  1804 root     S     3176   0%   0% /opt/victronenergy/can-bus-bms/can-bus-bms --log-before 25 --log-after 25 -vv -c socketcan:can1 --banner
 1107  1093 root     S    23264   2%   0% {netmon} /usr/bin/python3 -u /opt/victronenergy/netmon/netmon
 5688  1810 root     S     5524   1%   0% sshd: root@pts/0
 1097  1072 root     S     3044   0%   0% {serial-starter.} /bin/bash /opt/victronenergy/serial-starter/serial-starter.sh
    7     2 root     IW       0   0%   0% [kworker/u4:0-ev]
  788     2 root     SW       0   0%   0% [RTW_CMD_THREAD]
 2039  2026 root     S    40324   4%   0% {mqtt-rpc.py} /usr/bin/python3 -u /opt/victronenergy/mqtt-rpc/mqtt-rpc.py
 1075  1059 root     S    34200   3%   0% {venus-button-ha} /usr/bin/python3 -u /opt/victronenergy/venus-button-handler/venus-button-handler -D
 1150  1127 root     S    27688   3%   0% {dbus_shelly.py} /usr/bin/python3 /opt/victronenergy/dbus-shelly/dbus_shelly.py
  877     1 root     S    22740   2%   0% php-fpm: master process (/etc/php-fpm.conf)
  878   877 www-data S    22740   2%   0% php-fpm: pool www
  879   877 www-data S    22740   2%   0% php-fpm: pool www
 1162  1137 root     S    21660   2%   0% {dbus_digitalinp} /usr/bin/python3 -u /opt/victronenergy/dbus-digitalinputs/dbus_digitalinputs.py --poll=poll /dev/gpio/di
 1100  1055 root     S    13620   1%   0% /opt/victronenergy/venus-platform/venus-platform

nunorgp answered ·

I have the exact same issue with that firmware:
1713198753492.png


But lately the GUI has been very, very slow, even before the update to the latest firmware.



ektus answered ·

I'll probably go and test tomorrow, but in the meantime, I've had another unexpected reboot, this time with the older 3.22. Looks like it was once again due to some overload. Last refresh of top:

Mem: 632912K used, 397172K free, 2984K shrd, 83980K buff, 147084K cached
CPU:  62% usr   6% sys   0% nic  28% idle   0% io   1% irq   0% sirq
Load average: 10.94 7.00 4.83 2/308 19822
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
 1077  1063 root     R     146m  15%  15% /opt/victronenergy/gui/gui -nomouse -display Multi: LinuxFb: VNC:size=800x480:depth=32:passwordFile=/data/conf/vncpassword
 1954  1089 nodered  S     246m  24%  11% node-red
 1149  1125 root     S    21696   2%   6% {dbus_systemcalc} /usr/bin/python3 -u /opt/victronenergy/dbus-systemcalc-py/dbus_systemcalc.py
  803   801 messageb R     4912   0%   5% dbus-daemon --system --nofork
 1098  1053 root     S    38204   4%   5% {vrmlogger.py} /usr/bin/python3 -u /opt/victronenergy/vrmlogger/vrmlogger.py
 2041  2037 root     S    32340   3%   5% python /data/dbus-shelly-3em-smartmeter/dbus-shelly-3em-smartmeter.py
 2043  2040 root     S    32272   3%   5% python /data/dbus-shelly-3em-inverter/dbus-shelly-3em-inverter.py
 1158  1133 root     S    21196   2%   3% {dbus_generator.} /usr/bin/python3 -u /opt/victronenergy/dbus-generator-starter/dbus_generator.py
 1161  1135 root     S    11272   1%   2% /opt/victronenergy/dbus-fronius/dbus-fronius
 1152  1113 root     S     8748   1%   2% /opt/victronenergy/hub4control/hub4control
 2033  2028 root     S    21092   2%   2% {vesmart_server.} /usr/bin/python3 -u /opt/victronenergy/vesmart-server/vesmart_server.py -i hci0
 1115  1095 root     S    26032   3%   2% {localsettings.p} /usr/bin/python3 -u /opt/victronenergy/localsettings/localsettings.py --path=/data/conf
 2034  2024 root     S    55824   5%   2% /usr/bin/flashmq
 1685  1652 root     S     3664   0%   1% /opt/victronenergy/mk2-dbus/mk2-dbus --log-before 25 --log-after 25 --banner -w -s /dev/ttyS4 -i -t mk3 --settings /data/v
 1701  1694 root     S     3424   0%   1% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 0 --banner -s /dev/ttyS7
 1842  1834 root     S     3404   0%   1% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 3 --banner -s /dev/ttyUSB0
 1655  1651 root     S     3424   0%   1% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 0 --banner -s /dev/ttyS6
 1668  1666 root     S     3424   0%   1% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 0 --banner -s /dev/ttyS5
 2511  2509 root     S     3404   0%   1% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 3 --banner -s /dev/ttyUSB1
 1808  1804 root     S     3176   0%   0% /opt/victronenergy/can-bus-bms/can-bus-bms --log-before 25 --log-after 25 -vv -c socketcan:can1 --banner
26343 26323 root     R     2780   0%   0% top
 1145  1117 root     S    19532   2%   0% {dbus_vebus_to_p} /usr/bin/python3 -u /opt/victronenergy/dbus-vebus-to-pvinverter/dbus_vebus_to_pvinverter.py
 1776  1771 root     S     3752   0%   0% /opt/victronenergy/vecan-dbus/vecan-dbus -c socketcan:can0 --banner --log-before 25 --log-after 25 -vv
 1169  1143 root     S    64136   6%   0% python /data/SetupHelper/PackageManager.py
 1159  1131 root     S    47608   5%   0% {dbus-modbus-cli} /usr/bin/python3 -u /opt/victronenergy/dbus-modbus-client/dbus-modbus-client.py
 1097  1072 root     S     3044   0%   0% {serial-starter.} /bin/bash /opt/victronenergy/serial-starter/serial-starter.sh
 1100  1055 root     S    13620   1%   0% /opt/victronenergy/venus-platform/venus-platform
 1168  1141 root     S     3484   0%   0% /opt/victronenergy/dbus-adc/dbus-adc --banner
14317     2 root     IW       0   0%   0% [kworker/u4:2-ev]
19363     2 root     IW       0   0%   0% [kworker/u4:1-ev]
 2039  2026 root     S    40324   4%   0% {mqtt-rpc.py} /usr/bin/python3 -u /opt/victronenergy/mqtt-rpc/mqtt-rpc.py
 1075  1059 root     S    34200   3%   0% {venus-button-ha} /usr/bin/python3 -u /opt/victronenergy/venus-button-handler/venus-button-handler -D
 1150  1127 root     S    27688   3%   0% {dbus_shelly.py} /usr/bin/python3 /opt/victronenergy/dbus-shelly/dbus_shelly.py
 1107  1093 root     S    23264   2%   0% {netmon} /usr/bin/python3 -u /opt/victronenergy/netmon/netmon
  877     1 root     S    22740   2%   0% php-fpm: master process (/etc/php-fpm.conf)
  878   877 www-data S    22740   2%   0% php-fpm: pool www
  879   877 www-data S    22740   2%   0% php-fpm: pool www
 1162  1137 root     S    21660   2%   0% {dbus_digitalinp} /usr/bin/python3 -u /opt/victronenergy/dbus-digitalinputs/dbus_digitalinputs.py --poll=poll /dev/gpio/di
 1082  1066 simple-u S    12856   1%   0% /bin/simple-upnpd --xml /var/run/simple-upnpd.xml -d
  115     1 root     S    11944   1%   0% /sbin/udevd -d
Connection to 192.168.0.80 closed by remote host.ctronenergy/venus-access/venus-access
Connection to 192.168.0.80 closed.

mvader (Victron Energy) answered ·

Hi @ektus ,

I see nothing unusual in your dumps of top output. Also, the GUI being at 14% is quite normal; having it open on the overview page, with the “moving ants”, does that.

Also, I don’t expect that the problem that we fixed in v3.31 has anything to do with what you’re seeing. And that you’re still experiencing reboots with v3.31~beta confirms that.

Looks to me that you simply have a quite busy Cerbo GX. There is nothing that we (Victron) are going to do about that any time soon I’m afraid.

There are no trivial improvements to CPU load; if they existed, we’d already have implemented them.

To resolve this, get a faster GX device: Ekrano GX.

Or check how many dbus messages per second your Shelly integration sends, since that is a mod you’re running.
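One way to get a rough number is to capture bus traffic for a short while and tally messages per sender. The sketch below assumes the capture comes from `dbus-monitor --system` in its default output format, where each message header line carries `time=` and `sender=` fields. Whether that tool is available on a given Venus OS image is an assumption, and the function name `message_rates` is hypothetical.

```python
import re
from collections import Counter

# Header lines of dbus-monitor's default output look like:
#   signal time=1713198753.492013 sender=:1.45 -> destination=... serial=...
# (assumed format; body lines such as "   array [" are ignored)
_HEADER = re.compile(
    r"^(?:signal|method call|method return|error) time=(\d+\.\d+) sender=(\S+)"
)


def message_rates(lines):
    """Return messages per second for each sender seen in the captured lines."""
    counts = Counter()
    first = last = None
    for line in lines:
        match = _HEADER.match(line)
        if match is None:
            continue
        timestamp = float(match.group(1))
        if first is None:
            first = timestamp
        last = timestamp
        counts[match.group(2)] += 1
    if first is None or last <= first:
        return {}  # nothing captured, or capture too short to compute a rate
    span = last - first
    return {sender: n / span for sender, n in counts.items()}
```

The senders are unique bus names such as `:1.45`, so they still have to be matched to the service that owns them before the result says anything about the Shelly scripts.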

Hope this helps, if only by making clear that it’s best not to count on this being a bug that will soon be fixed.


Matthijs


ektus answered ·

The Shelly 3EMs are queried once every 750 ms. Their CPU load is in the low single digits. I've been watching top for quite some time, and the load average slowly creeps up and down. While watching, I've seen the 5min average go as high as 5.7 and as low as 1.8, and the realtime value anywhere between 0.9 and 10.

One trivial improvement would be to not run any Fronius-related tasks if there is nothing from this vendor present in the system.

Now I've got the 5min limit set to 8 while the 15min limit remains at 6. Let's see what happens.

For reference, at 3 hours uptime in the middle of the night:

Mem: 546632K used, 483452K free, 2980K shrd, 36936K buff, 136696K cached
Mem: 546312K used, 483772K free, 2980K shrd, 37248K buff, 136688K cached
CPU:  57% usr   7% sys   0% nic  33% idle   0% io   1% irq   0% sirq
Load average: 0.96 1.81 2.57 2/309 17346
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
 1113  1082 root     S     138m  14%  12% /opt/victronenergy/gui/gui -nomouse -display Multi: LinuxFb: VNC:size=800x480:depth=32:passwordFile=/data/conf/vncpassword.txt:0
 1946  1094 nodered  S     228m  23%   9% node-red
 1107  1072 root     S    38284   4%   8% {vrmlogger.py} /usr/bin/python3 -u /opt/victronenergy/vrmlogger/vrmlogger.py
 1152  1145 root     S    21980   2%   5% {dbus_systemcalc} /usr/bin/python3 -u /opt/victronenergy/dbus-systemcalc-py/dbus_systemcalc.py
 2046  2041 root     S    32340   3%   4% python /data/dbus-shelly-3em-smartmeter/dbus-shelly-3em-smartmeter.py
 2045  2042 root     S    32272   3%   4% python /data/dbus-shelly-3em-inverter/dbus-shelly-3em-inverter.py
  822   820 messageb S     4548   0%   4% dbus-daemon --system --nofork
 1186  1159 root     S    21068   2%   2% {dbus_generator.} /usr/bin/python3 -u /opt/victronenergy/dbus-generator-starter/dbus_generator.py
 2038  2027 root     S    55852   5%   2% /usr/bin/flashmq
 1127  1120 root     S    26036   3%   2% {localsettings.p} /usr/bin/python3 -u /opt/victronenergy/localsettings/localsettings.py --path=/data/conf
 2036  2029 root     S    21104   2%   1% {vesmart_server.} /usr/bin/python3 -u /opt/victronenergy/vesmart-server/vesmart_server.py -i hci0
 1104  1074 root     S    19792   2%   1% /opt/victronenergy/venus-platform/venus-platform
 1126  1124 root     S     8736   1%   1% /opt/victronenergy/hub4control/hub4control
 1172  1161 root     S    11272   1%   1% /opt/victronenergy/dbus-fronius/dbus-fronius
 1707  1674 root     S     3668   0%   1% /opt/victronenergy/mk2-dbus/mk2-dbus --log-before 25 --log-after 25 --banner -w -s /dev/ttyS4 -i -t mk3 --settings /data/var/lib/mk2-dbu
 1805  1793 root     S     3648   0%   0% /opt/victronenergy/vecan-dbus/vecan-dbus -c socketcan:can0 --banner --log-before 25 --log-after 25 -vv
 1686  1684 root     S     3376   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 0 --banner -s /dev/ttyS5
 1716  1712 root     S     3376   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 0 --banner -s /dev/ttyS7
 1873  1869 root     S     3376   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 3 --banner -s /dev/ttyUSB0
 1187  1180 root     S    64136   6%   0% python /data/SetupHelper/PackageManager.py
 2802  2798 root     S     3408   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 3 --banner -s /dev/ttyUSB1
 1676  1669 root     S     3376   0%   0% /opt/victronenergy/vedirect-interface/vedirect-dbus -v --log-before 25 --log-after 25 -t 0 --banner -s /dev/ttyS6
 1168  1157 root     S    45300   4%   0% {dbus-modbus-cli} /usr/bin/python3 -u /opt/victronenergy/dbus-modbus-client/dbus-modbus-client.py
 1191  1178 root     S     3484   0%   0% /opt/victronenergy/dbus-adc/dbus-adc --banner
 1142  1136 root     S    19532   2%   0% {dbus_vebus_to_p} /usr/bin/python3 -u /opt/victronenergy/dbus-vebus-to-pvinverter/dbus_vebus_to_pvinverter.py
 1861  1856 root     S     3176   0%   0% /opt/victronenergy/can-bus-bms/can-bus-bms --log-before 25 --log-after 25 -vv -c socketcan:can1 --banner
 1109  1090 root     S     3044   0%   0% {serial-starter.} /bin/bash /opt/victronenergy/serial-starter/serial-starter.sh
17324  9639 root     R     2780   0%   0% top
   48     2 root     IW       0   0%   0% [kworker/u4:1-ev]
  807     2 root     SW       0   0%   0% [RTW_CMD_THREAD]
13993     2 root     IW       0   0%   0% [kworker/u4:0-ev]
13699     2 root     IW       0   0%   0% [kworker/0:2-eve]
 2037  2025 root     S    40592   4%   0% {mqtt-rpc.py} /usr/bin/python3 -u /opt/victronenergy/mqtt-rpc/mqtt-rpc.py
 1096  1078 root     S    34192   3%   0% {venus-button-ha} /usr/bin/python3 -u /opt/victronenergy/venus-button-handler/venus-button-handler -D
 1155  1147 root     S    27708   3%   0% {dbus_shelly.py} /usr/bin/python3 /opt/victronenergy/dbus-shelly/dbus_shelly.py
 1130  1118 root     S    23264   2%   0% {netmon} /usr/bin/python3 -u /opt/victronenergy/netmon/netmon
  896     1 root     S    22740   2%   0% php-fpm: master process (/etc/php-fpm.conf)
  897   896 www-data S    22740   2%   0% php-fpm: pool www
  898   896 www-data S    22740   2%   0% php-fpm: pool www
 1182  1174 root     S    21660   2%   0% {dbus_digitalinp} /usr/bin/python3 -u /opt/victronenergy/dbus-digitalinputs/dbus_digitalinputs.py --poll=poll /dev/gpio/digital_input_1 
 1101  1084 simple-u S    12592   1%   0% /bin/simple-upnpd --xml /var/run/simple-upnpd.xml -d
  116     1 root     S    11680   1%   0% /sbin/udevd -d
 1112  1080 root     S    10220   1%   0% /opt/victronenergy/venus-access/venus-access
  947     1 root     S     9072   1%   0% /usr/sbin/wpa_supplicant -u -O /var/run/wpa_supplicant -s
 2020     1 root     S     8288   1%   0% -sh
 9639  9603 root     S     8172   1%   0% -sh
  825     1 root     S     7592   1%   0% /usr/sbin/haveged -w 1024 -v 1



Alex Pescaru answered ·

Hi @ektus

You can always try to systematically kill the processes you don't need, one PID at a time, and see whether the watchdog-based reboot still happens in that session.

Then, once you find the culprit, disable its start at boot time.

Note, however, that killing one process and thereby reducing CPU load could mask a problem with another process.

Alex


ektus answered ·

Looks like raising the 5min threshold to 8 has helped avoid at least one reboot today: I saw the 5min average reach 7.3 or so half an hour ago, and there has been no reboot yet. Now it's back down to 3.77. No processes stand out, so it's probably just a lot of tasks randomly starting at the same time. I'll try and see if I can spread out the "once per second" tasks in Node-RED a bit.
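One way to spread such tasks out is to give each one a fixed phase offset within the period, so they no longer all fire at the same instant. This is a minimal sketch of the arithmetic in Python; the Node-RED flows themselves would need an equivalent delay per inject node, and the function and task names here are hypothetical.

```python
def staggered_offsets(task_names, period_s=1.0):
    """Spread the given tasks evenly across one period.

    Returns a dict mapping each task name to its start offset in seconds.
    """
    if not task_names:
        return {}
    step = period_s / len(task_names)
    return {name: round(index * step, 3) for index, name in enumerate(task_names)}
```

With four once-per-second tasks this gives offsets of 0, 0.25, 0.5 and 0.75 s, so at most one of them starts at any given moment.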


ektus answered ·

Still running. Uptime is now 34hrs and counting. Nothing changed besides the watchdog parameters. I've even seen averages above 8, but those have come down relatively quickly again.


Dierk Grossfeld answered ·

I also had those unforeseen reboots quite a while ago.
That load average watchdog reboots the Cerbo hard, a headshot if you want to call it that.


I had that mostly due to the custom component for my go-e Charger wallbox.

The Python script fetches the data from the charger via curl and is based on the Shelly script you run.
Since wireless is sometimes a bit unreliable, due to key exchanges and the like, some connections cannot be established.

Since the script fails to define a timeout for the connection, it stalls and waits forever for a response.

That in turn leads to a daemonize state for that process, which causes the load to rise (the 1/5/15 min sysload more or less expresses the number of waiting processes).
I can't remember with 100% certainty whether the Cerbo then runs the process multiple times, but at the very least it causes the sysload-based watchdog to reboot the device.

The goecharger script, at least, also causes a lot of load by logging heavily to a log file. File writes on NAND also cause high load due to the relatively low speed of that storage, so they are expensive from a sysload perspective as well.


I haven't checked the sources of the Shelly script, but check for things like requests.get (curl):


request_data = requests.get(url = URL)

and try adding a timeout:

request_data = requests.get(url = URL, timeout=3)
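A slightly fuller sketch of the same idea, so that a poll which times out is skipped instead of blocking the script. It uses the standard library's urllib rather than requests, purely to stay self-contained. The function name `fetch_status` and the `opener` parameter are hypothetical, and the device is assumed to answer with JSON.

```python
import json
import socket
import urllib.error
import urllib.request


def fetch_status(url, timeout_s=3.0, opener=urllib.request.urlopen):
    """Return the decoded JSON body, or None if the device did not answer in time."""
    try:
        # The timeout bounds how long the call may block on connect and read
        with opener(url, timeout=timeout_s) as response:
            return json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, socket.timeout, TimeoutError, ValueError):
        # Unreachable host, timeout, or malformed JSON: give up on this poll cycle
        return None
```

The caller can then treat `None` as "no fresh reading" and simply try again on the next cycle.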

ektus answered ·

There is only one instance of requests.get, and that one has timeout=5 as a parameter. See line 134 in

https://github.com/fabian-lauer/dbus-shelly-3em-smartmeter/blob/main/dbus-shelly-3em-smartmeter.py

There are lengthy logfiles, but logging is set to ERROR. The bigger one is 12MB, the smaller one 3MB but that one hasn't been installed as long. I've just now renamed both of these to have a fresh start.

The Shelly scripts might contribute to the problem, but I haven't seen excess CPU usage from these. The system is still running, now at 2 days and 16 hours. That's with the watchdog thresholds raised to 0 8 6.
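To illustrate how I read those three numbers, here is a rough Python sketch. It assumes the watchdog compares the 1, 5 and 15 minute load averages with one threshold each and treats 0 as "check disabled"; that is my understanding of the setting, not something taken from the Venus OS sources. The function name `exceeded_limits` is made up, and the input is a line in the format of /proc/loadavg.

```python
def exceeded_limits(loadavg_text, limits=(0.0, 8.0, 6.0)):
    """Return the labels of the load averages that are above their limit.

    loadavg_text: a line in /proc/loadavg format, e.g. "0.96 1.81 2.57 2/309 17346"
    limits: thresholds for the 1, 5 and 15 minute averages; 0 disables a check
            (assumed semantics)
    """
    loads = [float(value) for value in loadavg_text.split()[:3]]
    labels = ("1min", "5min", "15min")
    return [
        label
        for label, load, limit in zip(labels, loads, limits)
        if limit > 0 and load > limit
    ]
```

Under these assumptions, the load averages from my last top dump before the reboot (10.94 7.00 4.83) would not exceed any of the limits 0 8 6.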


Dierk Grossfeld commented ·

So the timeout is 5 seconds.

That's good and should at least avoid high load.
Nevertheless, there has to be a reason for such high load.

I have that go-e Charger script plus a little bit of Node-RED running to show a dashboard and control the go-e for excess PV charging.


Normal load on mine is around 1 to 1.5:

root@einstein:~# uptime
 14:01:20 up 33 days,  7:02,  load average: 1.87, 1.41, 1.50