Cerbo GX corrupted, reinstalled, worked for a couple of days and now reboots

Hi all,

I installed a Cerbo GX in my camper van back in 2022. It worked fine for three years. Then, last week I touched the screen (GX Touch 50) and nothing happened.

Looking at the Cerbo, I saw flashing amber lights in a repeating three flash sequence - fast, fast, slow.

I tried power-cycling, no change.

I tried wiping the configuration (USB stick with blank config), no change.

I tried re-installing (SD card with install image written to it), and it sprang into life and found all the attached devices with the exception of my Ruuvi Tags. After some searching, I found the issue with on-board Bluetooth (serial numbers before “HQ2207”), ordered a TP-Link dongle and when it arrived, the Ruuvi Tags appeared.

The Cerbo then stayed working for four days. Today it started constantly rebooting. Sometimes the boot process would complete and you’d get, maybe, 5 seconds of uptime before it rebooted and then other times the boot process just hung and the unit did nothing until (presumably) a watchdog kicked in and rebooted it again.

After reading some of the other similar posts here, I can confirm that:
a) Unplugging the GX Touch USB cable does not make it stop rebooting.
b) It is not configured to reboot when VRM is unavailable; however, when it does reboot, it connects to VRM anyway.
c) It is definitely rebooting (pings fail, screen shows reboot animation etc.).

Given the need to completely re-flash it, I’m assuming that the on-board storage is dead or dying and as I’m just inside the five year warranty, I’m going to have to send it back. I’m posting here, however, just in case anybody has any ideas about what it could be. I don’t hold out much hope though - it’s been re-flashed and reset, so I don’t imagine there’s much else I can try, but…

Thanks in advance for your answers!

Cheers,

N.


@NiccyB , you might want to check for water damage before you install the new one. I had a similar problem caused by water damage, and wished I had found and fixed that issue before installing the new unit!

Thank you for the suggestion but it’s in a place where if there was water, I’d have a whole load of other problems!

I think it may be heat related. It only got up to mid-20’s today, but now the temperature’s dropped off a bit, it’s stopped rebooting and seems to be working properly again. I wonder if this is related to the Bluetooth over-temperature problem. I did have a good feel around it when it was rebooting in case it was obviously overheating, but it was just slightly warm to the touch - certainly nothing I’d worry about.

I’ve ordered a new Mk2, so we’ll see how that goes and when this one’s repaired/replaced at least I’ll have a spare.

N.

A lot of mine are running in temperatures over 40, so if it’s heat related it’s still something Victron should replace. Hopefully you’ll be able to figure it out! Let us know what you find.


Hate to be the bearer of bad news.
Electrical gear is all based around 25C optimal ambient temps.
The warranty specifically excludes high temps, so it is your responsibility to ensure proper airflow. From the warranty exclusion list:

Use in an inappropriate environment (dust, corrosive vapour, humidity, high temperature, biological infestation, etc.).

Installation or operation issues are not covered in any form, and too many people expect the manufacturer to cover them when a problem has been self-created.

That said, over temp results in constant rebooting of a GX, which would be a log entry on the device itself.

I have the same issue since I updated to the latest firmware today, except that if I unplug my touch 70 device it is stable. I do not have an SD card in the device. The device is brand new.

Hi,

Thank you for your comment, but I’m not sure you understood my post.

The ambient temperature was around 25C (probably a little lower, but let’s say 25). The temperature was not ‘high’. There was plenty of airflow in the area, but even if this wasn’t the case (for emphasis, in this case there was plenty of airflow), the Cerbo is a plastic box with no vents and no heat-sink surfaces, so it likely wouldn’t benefit from airflow anyway (unless it was really, really, really hot).

If you’re suggesting that the Cerbo isn’t appropriate for ambient temperatures higher than 25C, then given there’s no form of refrigerated cooling available for it, I’d question why it’s for sale in most of the world.

However, all that aside, it started rebooting again overnight when the ambient temperature was below 20C, so I can rule out temperature as a cause. Looking at it, it seems to go through phases of working for a period and then continually rebooting and then working for a period and then…

Anyway, the new Cerbo has arrived, I’ll get that plumbed in and put the old one on the workbench to try to work out exactly what causes it and if there’s anything in the logs.

N.

The comment wasn’t directed at your issue; it was aimed at another comment implying that Victron should replace a GX if something heat-related happens to it, which is incorrect.

For your issue, there are many reasons a GX can restart; a common one is CPU load. If you ssh to the GX and review the entries in /var/log, it will report the cause when it has decided to restart itself.

It can also happen with a reboot on comms loss setting - some users forget this is set.
Where CPU is a trigger, it is typically due to modifications - additional drivers like Shelly etc.
If your system is modified, on a current GX version and the new UI, you can check on settings → general → support/modifications.

Hi,

Thank you for your reply.

CPU usage seems to hover around 10%. Load average is 0/0/0.
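
For anyone wanting to log the load average over time rather than eyeball it, here is a minimal sketch. It assumes a Linux system that exposes /proc/loadavg and has Python 3 available; the function names are made up for illustration and are not part of Venus OS.

```python
from pathlib import Path
from typing import Tuple


def parse_loadavg(text: str) -> Tuple[float, float, float]:
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    fields = text.split()
    return float(fields[0]), float(fields[1]), float(fields[2])


def read_loadavg(path: str = "/proc/loadavg") -> Tuple[float, float, float]:
    """Read and parse the load averages from the given file (Linux only)."""
    return parse_loadavg(Path(path).read_text())
```

Calling `read_loadavg()` from a loop or a cron job and appending the result to a file would show whether the load climbs just before a reboot, which a single reading of 0/0/0 cannot tell you.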

I’ve gone through /var/log and while I can see the messages in /var/log/messages* that relate to the device rebooting, there doesn’t seem to be anything which indicates the cause for the reboot.

/var/log/crash-logger/current just shows a whole load of lines like:

...
@40000000688fa685031279e4 *** CCGX booted (0) ***
@40000000688fa8ca2650e044 *** CCGX booted (0) ***
@40000000688faa013a532124 *** CCGX booted (0) ***
@40000000688fac150103e534 *** CCGX booted (0) ***
@40000000688fac4406267c5c *** CCGX booted (0) ***
@40000000688fac6913a3c36c *** CCGX booted (0) ***
...
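
The hex prefixes on those lines look like TAI64N labels (the format written by daemontools-style loggers), so they can be turned into timestamps to see how far apart the boots are. The sketch below assumes that format and uses the same fixed 10 second TAI/UTC offset that tai64nlocal applies; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# TAI64N label: "@" + 16 hex digits (seconds) + 8 hex digits (nanoseconds).
TAI64_EPOCH_OFFSET = 2**62  # TAI64 places the start of 1970 at 2^62
TAI_UTC_OFFSET = 10         # assumption: fixed 10 s offset, as in tai64nlocal


def decode_tai64n(label: str) -> datetime:
    """Convert a label such as '@40000000688fa685031279e4' to a UTC datetime."""
    hex_part = label.lstrip("@")
    if len(hex_part) != 24:
        raise ValueError(f"not a TAI64N label: {label!r}")
    seconds = int(hex_part[:16], 16) - TAI64_EPOCH_OFFSET - TAI_UTC_OFFSET
    nanos = int(hex_part[16:], 16)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base + timedelta(microseconds=nanos // 1000)


def boot_intervals(lines):
    """Return the whole seconds between consecutive '@...' log lines."""
    stamps = [decode_tai64n(line.split()[0]) for line in lines if line.startswith("@")]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(stamps, stamps[1:])]
```

Fed the six lines quoted above, `boot_intervals` gives gaps of 581, 311, 531, 47 and 37 seconds, so the last few boots are well under a minute apart.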

The only things that look out of the ordinary are the kern.err lines in /var/log/messages. For example:

root@einstein:/data/log# grep kern.err messages*

messages:Aug  3 19:09:20 einstein kern.err kernel: [    6.843320] sun4i-emac 1c0b000.ethernet (unnamed net_device) (uninitialized): failed to request dma channel. dma is disabled
messages:Aug  3 19:09:20 einstein kern.err kernel: [    9.594112] Bluetooth: hci1: command 0xfc18 tx timeout
messages:Aug  3 19:09:20 einstein kern.err kernel: [    9.599359] Bluetooth: hci1: BCM: failed to write update baudrate (-110)
messages:Aug  3 19:09:20 einstein kern.err kernel: [    9.606112] Bluetooth: hci1: Failed to set baudrate
messages:Aug  3 19:09:20 einstein kern.err kernel: [   11.674094] Bluetooth: hci1: command 0xfc18 tx timeout
messages:Aug  3 19:09:20 einstein kern.err kernel: [   11.679333] Bluetooth: hci1: BCM: Reset failed (-110)
messages.0:Aug  3 18:52:11 einstein kern.err kernel: [    7.028815] sun4i-emac 1c0b000.ethernet (unnamed net_device) (uninitialized): failed to request dma channel. dma is disabled
messages.0:Aug  3 18:52:11 einstein kern.err kernel: [   13.114180] Bluetooth: hci1: command 0xfc18 tx timeout
messages.0:Aug  3 18:52:11 einstein kern.err kernel: [   13.119440] Bluetooth: hci1: BCM: failed to write update baudrate (-110)
messages.0:Aug  3 18:52:11 einstein kern.err kernel: [   15.194161] Bluetooth: hci1: command 0xfc18 tx timeout
messages.0:Aug  3 18:52:11 einstein kern.err kernel: [   15.199410] Bluetooth: hci1: BCM: Reset failed (-110)
messages.0:Aug  3 18:58:42 einstein kern.err kernel: [  412.437938] Bluetooth: hci1: command 0xfc18 tx timeout
messages.0:Aug  3 18:58:42 einstein kern.err kernel: [  412.443178] Bluetooth: hci1: BCM: failed to write update baudrate (-110)
messages.0:Aug  3 18:58:42 einstein kern.err kernel: [  412.450133] Bluetooth: hci1: Failed to set baudrate
messages.0:Aug  3 18:58:44 einstein kern.err kernel: [  414.517951] Bluetooth: hci1: command 0xfc18 tx timeout
messages.0:Aug  3 18:58:44 einstein kern.err kernel: [  414.523263] Bluetooth: hci1: BCM: Reset failed (-110)
messages.0:Aug  3 18:58:46 einstein kern.err kernel: [  416.997960] Bluetooth: hci1: command tx timeout
messages.0:Aug  3 18:58:55 einstein kern.err kernel: [  425.477958] Bluetooth: hci1: BCM: failed to write update baudrate (-110)
messages.0:Aug  3 18:58:55 einstein kern.err kernel: [  425.484737] Bluetooth: hci1: Failed to set baudrate
messages.0:Aug  3 18:59:05 einstein kern.err kernel: [  435.717945] Bluetooth: hci1: BCM: Reset failed (-110)
messages.0:Aug  3 18:59:53 einstein kern.err kernel: [  484.117994] Bluetooth: hci2: command 0xfc18 tx timeout
messages.0:Aug  3 18:59:53 einstein kern.err kernel: [  484.123272] Bluetooth: hci2: BCM: failed to write update baudrate (-110)
messages.0:Aug  3 18:59:53 einstein kern.err kernel: [  484.130123] Bluetooth: hci2: Failed to set baudrate
messages.0:Aug  3 18:59:55 einstein kern.err kernel: [  486.197953] Bluetooth: hci2: command 0xfc18 tx timeout
messages.0:Aug  3 18:59:55 einstein kern.err kernel: [  486.203237] Bluetooth: hci2: BCM: Reset failed (-110)
messages.0:Aug  3 18:59:58 einstein kern.err kernel: [  488.677956] Bluetooth: hci2: command tx timeout
messages.0:Aug  3 19:00:06 einstein kern.err kernel: [  497.157961] Bluetooth: hci2: BCM: failed to write update baudrate (-110)
messages.0:Aug  3 19:00:06 einstein kern.err kernel: [  497.164720] Bluetooth: hci2: Failed to set baudrate
messages.0:Aug  3 19:00:17 einstein kern.err kernel: [  507.397966] Bluetooth: hci2: BCM: Reset failed (-110)
messages.0:Aug  3 19:03:18 einstein kern.err kernel: [  688.917944] Bluetooth: hci1: command 0xfc18 tx timeout
messages.0:Aug  3 19:03:18 einstein kern.err kernel: [  688.923200] Bluetooth: hci1: BCM: failed to write update baudrate (-110)
messages.0:Aug  3 19:03:18 einstein kern.err kernel: [  688.929991] Bluetooth: hci1: Failed to set baudrate
messages.0:Aug  3 19:03:20 einstein kern.err kernel: [  690.998004] Bluetooth: hci1: command 0xfc18 tx timeout
messages.0:Aug  3 19:03:20 einstein kern.err kernel: [  691.003364] Bluetooth: hci1: BCM: Reset failed (-110)
messages.0:Aug  3 19:03:23 einstein kern.err kernel: [  693.637933] Bluetooth: hci1: command tx timeout
messages.0:Aug  3 19:03:31 einstein kern.err kernel: [  701.957966] Bluetooth: hci1: BCM: failed to write update baudrate (-110)
messages.0:Aug  3 19:03:31 einstein kern.err kernel: [  701.964723] Bluetooth: hci1: Failed to set baudrate
messages.0:Aug  3 19:03:33 einstein kern.err kernel: [  704.037965] Bluetooth: hci1: command tx timeout
messages.0:Aug  3 19:03:41 einstein kern.err kernel: [  712.197948] Bluetooth: hci1: BCM: Reset failed (-110)
messages.0:Aug  3 19:03:59 einstein kern.err kernel: [  729.878005] Bluetooth: hci2: command 0xfc18 tx timeout
messages.0:Aug  3 19:03:59 einstein kern.err kernel: [  729.883290] Bluetooth: hci2: BCM: failed to write update baudrate (-110)
messages.0:Aug  3 19:03:59 einstein kern.err kernel: [  729.890125] Bluetooth: hci2: Failed to set baudrate
messages.0:Aug  3 19:04:01 einstein kern.err kernel: [  731.957960] Bluetooth: hci2: command 0xfc18 tx timeout
messages.0:Aug  3 19:04:01 einstein kern.err kernel: [  731.963210] Bluetooth: hci2: BCM: Reset failed (-110)
messages.0:Aug  3 19:04:04 einstein kern.err kernel: [  734.357948] Bluetooth: hci2: command tx timeout
messages.0:Aug  3 19:04:12 einstein kern.err kernel: [  742.917939] Bluetooth: hci2: BCM: failed to write update baudrate (-110)
messages.0:Aug  3 19:04:12 einstein kern.err kernel: [  742.924708] Bluetooth: hci2: Failed to set baudrate
messages.0:Aug  3 19:04:14 einstein kern.err kernel: [  744.998021] Bluetooth: hci2: command tx timeout
messages.0:Aug  3 19:04:22 einstein kern.err kernel: [  753.157944] Bluetooth: hci2: BCM: Reset failed (-110)
messages.0:Aug  3 19:07:15 einstein kern.err kernel: [    0.002494] /cpus/cpu@0 missing clock-frequency property
messages.0:Aug  3 19:07:15 einstein kern.err kernel: [    0.002543] /cpus/cpu@1 missing clock-frequency property
messages.0:Aug  3 19:07:15 einstein kern.err kernel: [    6.295744] sun4i-emac 1c0b000.ethernet (unnamed net_device) (uninitialized): failed to request dma channel. dma is disabled
messages.0:Aug  3 19:07:15 einstein kern.err kernel: [    8.794231] Bluetooth: hci1: command 0xfc18 tx timeout
messages.0:Aug  3 19:07:15 einstein kern.err kernel: [    8.799476] Bluetooth: hci1: BCM: failed to write update baudrate (-110)
messages.0:Aug  3 19:07:15 einstein kern.err kernel: [    8.806278] Bluetooth: hci1: Failed to set baudrate
messages.0:Aug  3 19:07:15 einstein kern.err kernel: [   10.874119] Bluetooth: hci1: command 0xfc18 tx timeout
messages.0:Aug  3 19:07:15 einstein kern.err kernel: [   10.879368] Bluetooth: hci1: BCM: Reset failed (-110)
messages.0:Aug  3 19:09:20 einstein kern.err kernel: [    0.002505] /cpus/cpu@0 missing clock-frequency property
messages.0:Aug  3 19:09:20 einstein kern.err kernel: [    0.002554] /cpus/cpu@1 missing clock-frequency property

Apart from that, there seems to be no indication of a problem in the logs.
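
A wall of grep output like that is easier to judge once the repeats are collapsed. This is a rough sketch that counts distinct kern.err messages, ignoring the timestamps and treating hci1/hci2 as the same adapter. The regular expression is based only on the line format shown above, and the function name is made up.

```python
import re
from collections import Counter

# Matches e.g. "... kern.err kernel: [  412.437938] Bluetooth: hci1: ..."
KERN_ERR = re.compile(r"kern\.err kernel: \[\s*(\d+\.\d+)\] (.*)$")


def summarise_kern_err(lines):
    """Count kern.err messages, ignoring timestamps and the hci index."""
    counts = Counter()
    for line in lines:
        match = KERN_ERR.search(line)
        if not match:
            continue  # not a kern.err kernel line
        message = re.sub(r"\bhci\d+\b", "hciN", match.group(2).strip())
        counts[message] += 1
    return counts
```

For example, `summarise_kern_err(open("/var/log/messages"))` followed by `.most_common()` lists the messages by frequency, which makes it easier to spot a one-off error hiding among the repeated Bluetooth timeouts.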

There is no comms loss (it reboots when VRM is showing realtime data). The switch it’s connected to doesn’t show any port faults or state changes. “Reboot device when no contact” is turned off.

It is not modified (other than running the large image and NodeRed being enabled). Interestingly, Settings–>General–>Modification checks shows:

[screenshot]

and

[screenshot]

All I have done is enabled NodeRed, enabled SSH on LAN and logged in.

Clicking this:

[screenshot]

runs swupdate, reboots the device and then shows:

[screenshot]

Enabling NodeRed doesn’t change this.

Access remains at ‘superuser’, so this wasn’t the cause either.

Ah. OK. Changing the root password is the cause of the ‘firmware integrity’ message. So we can rule that out.

So… in summary, the system is unmodified, the CPU usage looks fine, it’s not a connectivity issue and there are no relevant messages in the log files.

Hmm…

Current versions will mark the system as modified if you set an ssh password.
Is there anything in /var/log/venus-platform/current ?

Nothing unusual in the current one (just my tinkering from an hour or so ago), but the old ones contain:

...
@40000000688faf4a29ed3cfc *** CCGX booted (0) ***
@40000000688faf692e20c4ec *** CCGX booted (0) ***
@40000000688fafe618c7587c *** CCGX booted (0) ***
...

Sadly nothing which indicates the cause for any reboots.

I replaced the faulty Cerbo with a brand new Mk2 yesterday. The new unit is working flawlessly. I’ve opened a support/RMA request with the original dealer using the Victron support process and we’ll see what happens…

This morning I received a brand new Cerbo Mk2 from the dealer, so that’s a positive. No other communication though and it would have been nice to have been kept informed, but I have the result I want and so I’m happy.

They should replace it as long as you are respecting the temperature range listed in the datasheet, which is -20 to 50 C. I’m glad @NiccyB got a new unit, would be nice to know what happened to the old one though.
