tylera asked

CCGX Load too high

I live on a boat, and my CCGX crashes when underway because the load is too high.


When I'm underway, I have the following devices connected to the CCGX:

- Via CAN bus:

- Oceanvolt Motor, Starboard

- Oceanvolt Motor, Port

- Valence BMS (48V battery BMS)

- Victron BMV (24V battery)

- Victron MPPT (24V)

- Victron MPPT (48V)

- Victron Quattro

- DMC (for Victron Quattro)

- Ethernet connection (connected to router, but no wifi)

- GPS Antenna


I'm trying to figure out why the CCGX crashes repeatedly when I'm underway. Logging in, I see the following in /var/log/messages.1:


Apr 27 12:03:41 ccgx daemon.err watchdog[488]: repair binary /usr/sbin/store_watchdog_error.sh returned 253 = 'load average too high'
Apr 27 12:03:41 ccgx daemon.alert watchdog[488]: shutting down the system because of error 253 = 'load average too high'
Apr 27 12:03:41 ccgx daemon.err watchdog[488]: /usr/sbin/sendmail does not exist or is not executable (errno = 2)
Apr 27 12:03:51 ccgx syslog.info syslogd exiting
Apr 27 12:04:16 ccgx syslog.info syslogd started: BusyBox v1.24.1


[I'm amused it attempts to send an email with its dying breath.]


I've been trying to rule out what might cause this.


- When not underway (motors powered off, not reporting to CCGX), there is no crashing


I can identify the crash because when the CCGX starts up after the crash, the menu bar (Pages...Menu) is visible on the bottom of the screen when displaying the "Overview" page that comes on at start-up.


root@ccgx:/var/log# cat /proc/loadavg
4.22 3.77 3.75 6/250 26842
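For anyone reading that output: the five fields are the 1, 5 and 15 minute load averages, the number of currently runnable scheduling entities over the total, and the most recently created PID. A minimal sketch of splitting them up, where `LoadAvg` and `parse_loadavg` are names made up for illustration:

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    one_min: float      # load average over the last minute
    five_min: float     # load average over the last 5 minutes
    fifteen_min: float  # load average over the last 15 minutes
    runnable: int       # currently runnable scheduling entities
    total: int          # total scheduling entities on the system
    last_pid: int       # PID of the most recently created process


def parse_loadavg(text: str) -> LoadAvg:
    """Parse one line in the /proc/loadavg format."""
    one, five, fifteen, entities, last_pid = text.split()
    runnable, total = entities.split("/")
    return LoadAvg(float(one), float(five), float(fifteen),
                   int(runnable), int(total), int(last_pid))


# The sample line from the post above.
sample = parse_loadavg("4.22 3.77 3.75 6/250 26842")
print(sample.five_min, sample.fifteen_min)
```

The 5 and 15 minute values are the ones to watch over time, since those are what the watchdog limits discussed further down refer to.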


I've seen both repeated crashing (I'm defining this as crashing before the GPS picks up the satellite, thus it's clearly visible on my GPS tracks) and sporadic crashing (I'm defining this as one or more crashes during a 4 hour period).


Update: I ran the above command once every 5 seconds for a period of 20 minutes, once with the motors on and once with the motors off. The two graphs here are distribution plots of the load averages. Granted, Oceanvolt wrote their own module for monitoring their motors on the CCGX. [Caveat: The data sampling period might include other things such as cronjobs, though I didn't run over an hourly boundary.]
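Roughly how that sampling could be scripted, as a sketch rather than the exact script used: it reads the first (1 minute) field of /proc/loadavg, which is an assumption about which figure was plotted, and buckets the samples into a text histogram. The function names and the 0.5 bin width are arbitrary choices.

```python
import time
from collections import Counter


def read_one_minute_load(path="/proc/loadavg"):
    """Return the 1-minute load average (first field of /proc/loadavg)."""
    with open(path) as f:
        return float(f.read().split()[0])


def collect(samples, interval_s, read=read_one_minute_load, sleep=time.sleep):
    """Take `samples` readings, pausing `interval_s` seconds between them."""
    values = []
    for i in range(samples):
        values.append(read())
        if i < samples - 1:
            sleep(interval_s)
    return values


def histogram(values, bin_width=0.5):
    """Map the lower edge of each bin to the number of samples in it."""
    counts = Counter(int(v // bin_width) for v in values)
    return {round(b * bin_width, 2): counts[b] for b in sorted(counts)}


def main():
    # 240 samples at 5 s intervals covers 20 minutes.
    values = collect(samples=240, interval_s=5)
    for lower, n in histogram(values).items():
        print(f"{lower:5.2f}+ {'#' * n}")
```

Calling `main()` once with the motors on and once with them off gives two distributions to compare. Nothing runs until `main()` is called, so the helpers can be tried on a laptop first.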


I'm a little concerned about the load being this high in either configuration. I'm not sure what is normal for a CCGX, but I expect there is only a single CPU (though sadly I see all the real-time monitoring is written in Python).

screen-shot-2021-04-28-at-42447-pm.png


CCGX Color Control
4 comments

jeroen ♦ commented ·

It is not crashing; it is the watchdog kicking in because the average load has been above 6 for quite some time (see /etc/watchdog.conf). Load in the Linux world is a bit complicated: it also includes processes waiting for I/O, for example. So even on a single core, you can run fine with a load above 1. Apparently your setup pushes it well over the limit. I don't know why that is.

-------------------------
watchdog-device = /dev/watchdog

log-dir = /var/volatile/log/watchdog
min-memory = 2500
max-load-5 = 6
max-load-15 = 6

repair-binary = /usr/sbin/store_watchdog_error.sh
test-prescaler = 60
retry-timeout = 0
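To see how close the system is to those limits at any moment, the `max-load-*` keys can be compared against /proc/loadavg. This is only a sketch with made-up helper names: it compares the numbers and does not reproduce how the watchdog daemon itself makes the decision.

```python
def parse_watchdog_conf(text):
    """Collect `key = value` pairs, ignoring comments and other lines."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        settings[key] = value
    return settings


def exceeded_limits(settings, loadavg_text):
    """Return the max-load-* keys whose limit the current load exceeds."""
    one, five, fifteen = (float(x) for x in loadavg_text.split()[:3])
    current = {"max-load-1": one, "max-load-5": five, "max-load-15": fifteen}
    return [key for key, load in current.items()
            if key in settings and load > float(settings[key])]


conf = """
max-load-5 = 6
max-load-15 = 6
"""
settings = parse_watchdog_conf(conf)
print(exceeded_limits(settings, "4.22 3.77 3.75 6/250 26842"))
```

On the device, the two strings would come from reading /etc/watchdog.conf and /proc/loadavg. With the sample line from the question nothing is over the limit, which fits a system that only trips the watchdog when the motors are reporting.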

tylera (replying to jeroen ♦) commented ·

The 'why' you don't know is what I'm trying to figure out :). I bumped the 6/6 numbers to 10/8 for now, although I think I'll also write a script to dump /proc/loadavg periodically to see what 'normal' is over the course of a day (versus when I'm sailing).


Regarding "it's not crashing": I'd still call CFIT a crash. :)

jeroen ♦ (replying to tylera) commented ·

You can bump those numbers to whatever you want, but they are there for a reason. If the device cannot keep up, it will get busier and busier and the load will continue to increase.

tylera commented ·

TIL: The CCGX's "gui" process uses a considerable amount of CPU on the overview page, **even if the display is off**. However, changing the page to something less intensive reduces that. I'm not sure what effect this will have on load, but since I care more about the data reporting than what I can read on the display at any given moment, I'll try keeping it on the "Notifications" page (drops CPU from ~13% to ~5%).
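For reference, per-process CPU figures like these can be measured without `top` by reading /proc/<pid>/stat twice and comparing the utime and stime tick counters (fields 14 and 15). A sketch under that assumption; the helper names are made up, and the pid of the "gui" process has to be looked up separately.

```python
import os
import time


def cpu_ticks(stat_text):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name may contain spaces or parentheses,
    # so split after the LAST closing parenthesis.
    _, _, rest = stat_text.rpartition(")")
    fields = rest.split()
    # fields[0] is field 3 (state), so utime (14) and stime (15)
    # sit at indexes 11 and 12.
    return int(fields[11]) + int(fields[12])


def cpu_percent(ticks_before, ticks_after, elapsed_s, ticks_per_s):
    """CPU share of one core used between two readings, in percent."""
    return 100.0 * (ticks_after - ticks_before) / (elapsed_s * ticks_per_s)


def measure(pid, interval_s=5.0):
    """Sample a live process; not called here, run it on the device."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    path = f"/proc/{pid}/stat"
    with open(path) as f:
        before = cpu_ticks(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = cpu_ticks(f.read())
    return cpu_percent(before, after, interval_s, ticks_per_s)
```

Running `measure(pid)` once on the overview page and once on the "Notifications" page would give the same kind of before/after comparison.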

1 Answer
mvader (Victron Energy) answered ·

Hi @TylerA, the only real solution will be to get a Cerbo, I’m afraid. That has much more CPU power.

Wrt your issue on the Venus repo: yes, the GUI uses CPU even when the display is off. But it’s needed for Remote Console as well, and uses the same (or even more, I’m not sure) CPU for that. What you could do there is make sure to never leave your CCGX on the system overview page.


In general: there are various ways we could reduce the load a bit here and there. But often it either adds complexity, or requires a significant amount of work or testing, or only reduces the load by a little bit. And I have only limited R&D resources to spend, and choose to spend them on support of new products as well as other new features instead. Sorry.

3 comments

tylera commented ·

Does the Cerbo work with a CCGX? i.e. can I connect everything to the Cerbo then keep the CCGX I already bought for reporting, or did I waste my money on the CCGX?

Kevin Windrem (replying to tylera) commented ·

Cerbo is a replacement for the CCGX. Really, to replace the CCGX you need a Cerbo plus a Touch 50 display. You can run the Cerbo faceless and use a tablet/computer for the GUI using a web browser. There are also interfaces to several of the common marine display systems.

An alternative is to run Venus OS on a Raspberry Pi 3B+. You'd also need a display and USB dongles for all your connections, and the cost for these should be considered. It's also a DIY project to get the Pi powered from your boat's electrical system, get it mounted and everything connected. Once it's in, though, it works really well. That's what I'm doing.

tylera (replying to Kevin Windrem) commented ·

@Kevin Windrem Okay, that's a little disappointing, but thanks for the answer.
