Cerbo: VECAN1 error messages in /var/log/messages

Hi, two-month-old Victron install on my 55’ motor yacht. I have been experiencing slowdowns/sluggishness and sometimes crashing of my Cerbo (V3.55), typically once every 7 to 10 days. My Victron sales engineer thinks my Cerbo may be overwhelmed by our NMEA2000 traffic coming in via the VE.Can2 port. I disconnected this but am still seeing VECAN1 error messages in /var/log/messages. Are the VECAN1 error messages for the physical VE.Can1 port on the Cerbo, generic VE.Can error messages, or something else?

Diagram of the Victron gear; not shown are four 300Ah Victron NGX batteries, two for each BMS:

Before the last crash I was seeing alerts like this; no BMS is usually the death knell, requiring a reboot to recover:

Right before our last crash @ 13:12:55 I was seeing these VECAN1 error messages over and over in /var/log/messages

At my Victron sales engineer's recommendation I have unplugged my N2K network connection from the VE.Can2 port, but I am still seeing VECAN1 errors. Where can I find details about these?

Bus errors seem pegged at 4/min over and over:

I'm up and running four days since the last crash and just waiting for it to happen again before reconnecting my NMEA2000 network connection. It's painful waiting, and hard to trust this new Victron gear given the repeated loss of comms to the BMSs and the low battery alert messages. I have looked at some of the notes about creating maps to change N2K instance numbers, and I think there may be filter files I can create to reduce the amount of N2K traffic that reaches the Cerbo? I really only want tank levels displayed on the Cerbo and don't see the need for lat/lon, which is the noisiest traffic on my N2K network.

These crashes have been happening at the dock in my backyard without any navigation equipment running, i.e. chart plotter, radar, AIS, nav PC, scanning depth sounder. Our N2K network has sensors for weather, water depth and tank levels, plus command and control of lights, but the bus traffic at the dock is low, so I'm doubting this theory that the Cerbo is being overloaded. Here is a high-level view of the current devices/traffic on my N2K network; bandwidth utilization averages 30%:

I have read a lot of posts about Cerbos crashing due to power supply issues. My BMSs are providing the power to the Cerbo, and battery voltage is pretty constant at the dock at 26-27V, with a pair of MPIIs in sustain mode charging the batteries and 1000W of solar on the roof of the boat. I don't have reboot on loss of VRM comms enabled, so it's not this. The Cerbo is below decks in a warm but not hot isolated battery compartment below the waterline. I have run top to see if the Cerbo is overloaded, and loads look low to me for the short period of time I watch. Looking for suggestions on where to look, as it appears to me to be a memory leak/shortage of resources that results in a crash/reboot or me manually rebooting to get services back.

Top shows 40% CPU utilization

root@einstein:/opt/victronenergy# uptime
12:48:33 up 3 days, 23:35, load average: 2.01, 2.56, 2.77
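A note on reading the uptime line above: on Linux the load average counts tasks that are runnable or in uninterruptible sleep, so it is usually judged against the number of CPU cores rather than against the CPU percentage that top shows. The sketch below parses the line and divides by a core count. The value of 2 cores is an assumption for illustration, not something stated in this thread; check the real number on the device (for example in /proc/cpuinfo).

```python
import re

def parse_load_average(uptime_line):
    """Return the 1, 5 and 15 minute load averages from an uptime line."""
    match = re.search(
        r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load average found in line")
    return tuple(float(value) for value in match.groups())

def load_per_core(uptime_line, cores):
    """Divide each load average by the core count."""
    return tuple(load / cores for load in parse_load_average(uptime_line))

# Line copied from the post above.
line = "12:48:33 up 3 days, 23:35, load average: 2.01, 2.56, 2.77"

ASSUMED_CORES = 2  # assumption, verify on the actual device

for label, value in zip(("1 min", "5 min", "15 min"), load_per_core(line, ASSUMED_CORES)):
    # A sustained value above 1.0 per core means tasks are queuing.
    print(f"{label}: {value:.2f} per core")
```

If the per-core figure stays above 1.0 while CPU utilization looks moderate, that can point to processes waiting on something (I/O, or a service restarting in a loop) rather than plain CPU exhaustion.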

Scott in Sunny FL

Try the current 3.60 beta, it should be a production release very soon. There are a number of fixes, particularly affecting the mk2 Cerbo and CAN.

Nick, thanks. I’m a little timid about jumping to beta releases without knowing more. Any way to get a preliminary list of bugfixes?

To test my sales engineer’s theory I just plugged my N2K network connection back into the VE.Can2 port on my Cerbo. The same errors appear in /var/log/messages for VECAN1 even though I have the cable plugged into the VE.Can2 port. I did some more community reading and learned about the vup and vreg commands, and see the VE.Can ports are referenced as vecan0 and vecan1, suggesting they map to the physical ports VE.Can1 and VE.Can2 respectively:

root@einstein:/opt/victronenergy# vup
None or more than one physical gateway found.
Use -c to specify which gateway should be used.
The following gateways were found:

socketcan:vecan1 vecan1
socketcan:vecan0 vecan0

I see a ton of devices on my N2K network with the Cerbo connected, but no specific errors/problems are flagged.

The log messages suggest excessive CRC errors, with the vecan1 interface going bus-off and being scheduled for a restart 100 ms later, over and over:

Jun 7 14:39:45 einstein user.err kernel: [350814.299079] mcp251xfd spi0.0 vecan1: IRQ handler mcp251xfd_handle_rxif() returned -74.
Jun 7 14:39:45 einstein user.info kernel: [350814.308010] retrying later: vecan1: excessive CRC errors in interrupt
Jun 7 14:39:45 einstein user.info kernel: [350814.314797] mcp251xfd spi0.0 vecan1: bus-off, scheduling restart in 100 ms
Jun 7 14:39:45 einstein daemon.info connmand[668]: vecan1 {newlink} index 3 address 00:00:00:00:00:00 mtu 16
Jun 7 14:39:45 einstein daemon.info connmand[668]: vecan1 {newlink} index 3 operstate 2
Jun 7 14:39:45 einstein daemon.info connmand[668]: vecan1 {newlink} index 3 address 00:00:00:00:00:00 mtu 16
Jun 7 14:39:45 einstein daemon.info connmand[668]: vecan1 {newlink} index 3 operstate 6
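To put a number on how often this happens, the bus-off lines can be counted per interface and per minute. The following is a minimal sketch that assumes the syslog layout shown in the excerpt above; the function name and the regular expression are illustrative, not part of any Victron tool.

```python
import re
from collections import Counter

# Matches kernel lines such as:
# "Jun 7 14:39:45 einstein user.info kernel: [...] mcp251xfd spi0.0 vecan1: bus-off, ..."
# Group 1 is the timestamp truncated to the minute, group 2 the interface name.
BUS_OFF = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2} \S+ \S+ kernel: .*?(vecan\d+): bus-off"
)

def bus_off_per_minute(lines):
    """Count bus-off events keyed by (interface, minute)."""
    counts = Counter()
    for line in lines:
        match = BUS_OFF.match(line)
        if match:
            minute, interface = match.groups()
            counts[(interface, minute)] += 1
    return counts

# Two lines copied from the excerpt above; only the second is a bus-off event.
sample = [
    "Jun 7 14:39:45 einstein user.err kernel: [350814.299079] mcp251xfd spi0.0 "
    "vecan1: IRQ handler mcp251xfd_handle_rxif() returned -74.",
    "Jun 7 14:39:45 einstein user.info kernel: [350814.314797] mcp251xfd spi0.0 "
    "vecan1: bus-off, scheduling restart in 100 ms",
]

for (interface, minute), count in sorted(bus_off_per_minute(sample).items()):
    print(interface, minute, count)
```

On the device, the same function could be fed with `open("/var/log/messages")` instead of the sample list, which makes it easy to compare the rate with and without the N2K cable plugged in.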

When I plug my N2K Bus analysis tool into the same port the Cerbo gets plugged into it doesn’t report any errors.

Smile face no bus errors reported:

Average bus utilization seen on this port the Cerbo connects to:

The beta category and Venus OS subcategory have the full change log. We’re at iteration 86; it is quite stable and release is expected imminently.

Have a look here:

  • VE.Can / NMEA2000 port: fix issue causing GX to queue up messages while the CAN-bus is down to thereafter send them all out in case connection is up/restored. This is mostly an issue in combination with for example an NMEA2000 network.
  • Includes Cerbo GX MK2 related canbus stability fixes.
  • Fix Cerbo GX MK2 VE.Can 2 port issues

Amongst several others that contribute to BMS lost. Worth a shot, and the rollback is easy via the backup copy kept on the GX.

This confuses me: should there be a Victron terminator in my second VE.Can port (VE.Can2), where I plug in the N2K connection?

We have terminators in the main bus, but this is a branch drop to the Cerbo, so do I need a terminator in the VE.Can2 port? The text says "do not install a terminator in any of the VE.Can ports", yet for the connections to the BMSs the VE.Can1 port requires the terminator.

For anyone curious this is our NMEA 2K Network design, the Cerbo is plugged into the Branch drop in the Forepeak

K thanks Nick I like what I see! Will download/install another day, today has other chores in my future…

NMEA is beyond my scope, but from what you have shown and the manual, it appears the first requirement applies, ie no terminator on the GX, but I would rather one of the experienced marine guys give you guidance.

Hey all, to give a hand:

  • indeed, no terminator. A can-bus is supposed to have two terminators. That is sufficient. The connection to the GX is then called a “drop cable”, and drop cables should not get their own additional termination.

  • the port you are using on the Cerbo GX MK2 is VE.Can 2, for which it is known that it can have issues in v3.55 on the Cerbo GX MK2 model. Already fewer issues than a software version a bit prior to v3.55 had, but still issues. And the logs you show indicate exactly the issue that is now (finally! :tada:) solved.
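On the termination point above, the reason a third terminator is unwanted can be shown with simple parallel-resistance arithmetic. CAN and NMEA 2000 backbones use a 120 Ω terminator at each end, so a correctly terminated bus presents about 60 Ω between CAN-H and CAN-L, and adding a terminator on a drop pulls that down to about 40 Ω. The sketch below is only that arithmetic; the function name is illustrative.

```python
def parallel_resistance(resistors):
    """Equivalent resistance of resistors connected in parallel, in ohms."""
    return 1.0 / sum(1.0 / value for value in resistors)

TERMINATOR_OHMS = 120.0  # standard CAN / NMEA 2000 terminator value

# Two terminators, one at each end of the backbone.
correct = parallel_resistance([TERMINATOR_OHMS] * 2)

# A third terminator added on the drop cable to the GX.
over_terminated = parallel_resistance([TERMINATOR_OHMS] * 3)

print(f"two terminators:   {round(correct, 1)} ohm")
print(f"three terminators: {round(over_terminated, 1)} ohm")
```

This also gives a quick field check: with the network powered down, a multimeter across CAN-H and CAN-L should read roughly 60 Ω on a correctly terminated backbone.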

Before spending more time on this, I recommend to try v3.60 latest beta or wait for it to become official. That is a matter of weeks, or better yet, days. But impossible to promise.

For more details on termination and such, see here: Marine Integration Guide [Victron Energy]

All the best, Matthijs / Victron

Matthijs,

Thanks for chiming in here. I took your advice, joined the beta program and upgraded to V3.60-86. I read the sticky about reporting issues, and IDK if I have an issue or if this is expected behavior. I want to give feedback, but I don’t think you want that added to the beta release announcement.

After upgrading, my Cerbo rebooted and came up fine. Like when I discovered the V2 GUI, I was in awe at some of the new features. I logged in, looked at /var/log/messages and saw a clean boot with no errors. I then went and plugged my VE.Can to N2K cable back in. I returned to my Putty session and it seemed sluggish. Top showed CPU at 50% and /var/log/messages had, by my count, 32 instances of a “vecan-dbus: vecan-dbus: potentially unexpected fatal signal 6” process restart. Over the next several minutes I lost my Putty connection and the V2 GUI in my browser reconnected several times. Then things quieted down and seem good.

This is the last process restart, and things have now been quiet/stable for the past 30 mins, but IMO the load seems quite high:

I’ll let it run for a while while discovering all of the other new features you rolled in…looks great IMO!

Matthijs, et-al

No love on Mariah here after upgrading to the 86 beta release. All systems appeared good until today, when my Cerbo stopped reporting to VRM, the V2 GUI stopped responding and I couldn’t SSH in via Putty to get to the root cause. This is what I have been seeing every 7-10 days since converting my boat to a Victron LiFePO4 solution. A reboot resolves the problems, but having to reboot weekly to have a working solution is not how I want to run my boat.

I’m physically on the boat connected to the Internet and this shows Cerbo stopped talking to VRM about 4 hours ago:

V2 GUI is not responding:

Putty Times out:

I was getting 7-10 days before upgrading, now after upgrading to the Beta release I got 2 days.

I forced a reboot via the GX Touch and you can see the reboot was at Jun 11 00:16 UTC, but about 6 hours prior to that there were the same “vecan-dbus: vecan-dbus: potentially unexpected fatal signal 6.” messages I saw prior to the upgrade.

I’ll be traveling over the next couple of days but would welcome a remote Engineering support session when I return, if you want to login and check things out in a real world/end user scenario.

Hi @MotoringMariah ,

Thanks for the detailed report!

That doesn’t look good. I’ll get someone to look into it.

Is it ok if we login remotely to check logs and such ourselves?

I see you have Remote support enabled.

We’ll contact you in case we want to do something that risks shutting inverters down.

Matthijs

(vp #541)

Matthijs,

YES, I’m OK with remote support; however, I will be away from the boat for the next 5 days, so please be careful not to brick anything while I’m gone. We are connected to shore power with Solar Priority enabled. I rebooted the Cerbo last night and removed my NMEA2000 connection from VE.Can2.

This morning I still can’t get the V2 GUI to render on my desktop, and VRM Remote console shows everything off yet it’s not i.e. the GX Touch display doesn’t match the VRM Remote console display.

VRM direct to Cerbo IP never loads/opens

VRM Good

VRM Remote Console NOT GOOD

Top average 51%

This morning I went ahead and installed the V3.60 official release over the -86 beta and will leave the NMEA2000 connection disconnected until I return next week. After upgrading to V3.60 official and rebooting, the V2 GUI now loads from the local IP, as does the VRM Remote Console. I’m afraid you’re now troubleshooting something that happened in the past: the firmware has changed AND I have removed the NMEA2K connection. For examples of the vecan-dbus errors I was seeing, grep messages*

Thanks! We’ll log in, and we’ll be careful.

Hey again @MotoringMariah , issue found and will be solved soon.

The GX is having issues with one of your devices having a so-called UTF8 character in its NMEA2000 label or installation free field.

(By the way, as a work around you can also find what N2K device that is and change that label).

Good weekend!

For completeness, it seems like there is an extended ASCII character in the product info of NAD 0x14 / a Tides Marine product. That is rather rare, but fine by itself though. It needs to be converted to UTF-8 before sending it over the dbus.
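The conversion described above can be illustrated in a few lines. D-Bus strings must be valid UTF-8, and a single-byte extended ASCII character (anything above 0x7F) is not valid UTF-8 on its own. The sketch below assumes the device string is ISO-8859-1 (Latin-1) encoded and uses a made-up label; neither the real string from the Tides Marine device nor the actual fix in Venus OS is shown in this thread.

```python
def to_utf8(raw, fallback="latin-1"):
    """Return raw bytes re-encoded as valid UTF-8.

    Bytes that already decode as UTF-8 pass through unchanged; anything else
    is interpreted using the fallback single-byte encoding (an assumption).
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        text = raw.decode(fallback)
    return text.encode("utf-8")

# Hypothetical product-info string: 0xB0 is the degree sign in Latin-1,
# but a lone 0xB0 byte is not valid UTF-8.
label = b"Seal temp 80\xb0C"

converted = to_utf8(label)
print(converted)                   # the degree sign is now a two-byte UTF-8 sequence
print(converted.decode("utf-8"))   # safe to pass on as a text string
```

Plain ASCII labels are unaffected by this, which fits the observation that the problem only appeared with one unusual device on the bus.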

Hi @MotoringMariah a fix is available as beta version v3.70~2 now.

Thank you for the report again - this issue has always been in the code and never found / escalated. Good to have made a better product!

Matthijs/Jeroen,

Just returned from a trip and I’m impressed you found the issue so quickly AND that’s with the NMEA2K connection removed. So the Tides Marine device is a seal monitoring system that measures the temperature of our propeller shaft seals and puts that data on the N2K bus. Up in the pilot house, where we steer the boat from, we have a Tides Marine alarm that alerts on over temperature AND we have a Maretron display logging/showing those seal temps over the last hour. We had a situation on our previous boat where an air pocket/bubble (IDK what exactly) caused no cooling water to reach one shaft seal while we were motoring, and the seal melted to the point we had smoke in the engine room. Had the seal completely failed, we would have had sea water pouring in and could have sunk the boat. Point is we don’t want to turn that system off! I can try using N2KView to change the name, BUT I suspect that device is a proprietary system that will not accept configuration changes from N2KView/Actisense. I’ll try it and report back.

I was super impressed how the Cerbo/GX Touch display picked up our blackwater, freshwater and fuel tank info and started displaying it. There is a lot of data/PGNs on our bus you probably don’t need to see/read/receive, like the shaft seal temperature. I recall reading in the Cerbo user manual (14.8 NMEA2000-out technical details) that you support maps for mapping instance IDs (dbus -out) and wonder, do you offer filtering (dbus -in)? My thinking here is to put a filter in to stop the Tides Marine PGNs from being processed when seen on the bus OR, the reverse, put a filter in that only allows tank data through. I worry that when we are under way our John Deere engines are going to be putting J1939 engine data on the N2K bus and you’re going to have similar problems with engine oil pressure, transmission position, who knows what. Right now I’m having problems at the dock; I’d hate to see it get worse when we are under way. We have weather data, GPS data, autopilot/radar, and lighting command and control data, all PGN noise to you IMO. I just saw the post that you have a fix for me to try! I’ll get on that later today/tomorrow at the latest. THANKS
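To make the filtering idea above concrete: an allowlist filter decodes the PGN from each frame's 29-bit CAN identifier and drops everything that is not on the list. The sketch below shows only that concept. Whether Venus OS offers any such inbound filter is not established in this thread, and the function names and the allowlist are hypothetical. The PGN numbers used are the standard NMEA 2000 ones for Fluid Level (127505) and Position, Rapid Update (129025).

```python
def pgn_from_can_id(can_id):
    """Extract the PGN from a 29-bit NMEA 2000 / J1939 CAN identifier."""
    pdu_format = (can_id >> 16) & 0xFF        # PF, bits 16-23
    page_and_format = (can_id >> 16) & 0x3FF  # reserved + data page + PF
    pdu_specific = (can_id >> 8) & 0xFF       # PS, bits 8-15
    if pdu_format < 240:
        # PDU1: PS holds the destination address and is not part of the PGN.
        return page_and_format << 8
    # PDU2: PS is a group extension and belongs to the PGN.
    return (page_and_format << 8) | pdu_specific

def make_can_id(priority, pgn, source, destination=0xFF):
    """Build a 29-bit identifier, used here only to create test frames."""
    can_id = (priority << 26) | (pgn << 8) | source
    if ((pgn >> 8) & 0xFF) < 240:
        can_id |= destination << 8
    return can_id

FLUID_LEVEL = 127505            # tank levels
POSITION_RAPID_UPDATE = 129025  # lat/lon

ALLOWED_PGNS = {FLUID_LEVEL}    # hypothetical "tank data only" allowlist

def is_allowed(can_id):
    return pgn_from_can_id(can_id) in ALLOWED_PGNS

tank_frame = make_can_id(priority=6, pgn=FLUID_LEVEL, source=0x23)
position_frame = make_can_id(priority=2, pgn=POSITION_RAPID_UPDATE, source=0x10)

print(is_allowed(tank_frame), is_allowed(position_frame))
```

An allowlist like this is the more conservative of the two options mentioned in the post, since new or unexpected devices on the bus are ignored by default instead of having to be blocked one by one.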

Matthijs, et-al

The V3.70-2 beta release seems to have solved the NMEA2K UTF8 issue. I applied the update 12 hours ago and reconnected my NMEA2K network to VE.Can2. /var/log/messages is clean, showing only the reboot messages from yesterday. Top is reporting 30% average CPU utilization, which is better than before, AND the system seems snappier and more responsive. I tried N2K Analyzer and Actisense NMEA Reader to see if I could change the Tides Marine fields, and the only option is to change the instance numbers or installation details.

I will power on all of our navigation equipment and John Deere Engines next to see if there are any other N2K devices/PGN’s that might cause you problems.

I have more questions but will start new threads for those…

THANKS!