VRM outages and slowness - a central thread

Dear all, especially VRM users,

As you have noticed, VRM is not performing as it usually does. In this central thread, I would like to shed some light on the situation and give you updates. Hopefully I can soon update this post with a post-mortem of what happened and how we solved it.

Influx database issues

About 2 weeks ago, we started noticing that on Monday mornings, the Influx database was slower than usual. To try to improve this, we upgraded to a new Influx release. Unfortunately, that made things worse. With the amount of data flowing into VRM, a small queue of database queries quickly escalates into overall slow performance and sometimes even downtime. For some days in between, we thought the database had recovered, only to see more queries slow down and eventually time out. We are now in a situation where our devs, as well as Influx engineers, need to tend to the database manually for it to run smoothly.

Temporary damage control

Some things that we did (and still do) as soon as we notice the database having issues are:

  • Disabling forecasts
  • Putting ‘State advanced widgets’ behind a button instead of auto-loading them

All of this is meant to keep you from noticing the issues in your day-to-day use.

Obviously, with the database in its regressed state over the last couple of days, you did notice it everywhere in VRM.

Fallbacks / Long term solutions

I have already seen some suggestions for how we can make sure in the future that you have some data in VRM, even when our database has issues like this. I truly appreciate your suggestions, but I also hope you understand that at this moment, all of our efforts are going into fixing the database. Once it is stable again, we will definitely look into long-term solutions and a plan B (and C and D) in case this happens again. We have the blessing and the curse of working on a platform that doesn’t have many equals in the wild, so it is not easy to borrow ideas from how other platforms deal with this kind of thing.

Thank you for your patience

It remains for me to thank you for your patience. We know that many of you use VRM every day, and we want it to be as fast as it can be for you. We are really proud of the VRM platform and how it has kept up with the growth in data so far, so I am sure we will get back to that state very soon.

On behalf of the VRM team,

Barbara

52 Likes

You truly have something to be proud of — it’s a brilliant, versatile platform that would make your competitors green with envy! Everyone runs into problems now and then, but there’s no such thing as an unsolvable issue.

9 Likes

Thank you very much for the insights and the way forward. Best of luck!

4 Likes

Good to see a direct link to this post in VRM. If at all possible, making it stand out in BOLD RED would be even better.

3 Likes

When I was a MySQL database administrator on a site I ran, we had constant crashes once we hit about 1 million users, whenever we ran routines at night to clean up, settle credit card batches and whatnot. Our solution was to load balance the database and run the routines on one of the slave databases. I don’t know if your database allows slaves to update tables. Also, adding more RAM and indexing almost everything helped.

For what it is worth:
VRM, VRM-API, Node-RED_DESS-API all sprung back to life just now!
That is in and of itself an accomplishment worth mentioning. Even though more outages may occur in the coming hours, this is good.

A quick update:

We’ve refreshed the forecasts, so DESS should now be working as expected.

The database is running faster, and we’ve cleared all writing delays: luckily no data was lost, and all processing queues are now empty.

However, reading information from the database is still slow.

The team is working on:

  • Adding a faster memory layer to speed things up

  • Combining old and new data for smoother access

  • Creating a minimalistic homepage for people with over 10 installations, to reduce load on the database

10 Likes

memcached and mem tables are your friends :rofl::ok_hand:

We have now added a caching layer to the database to help reduce delays caused by the main database.

This saves the results of certain requests, so they can be shown faster next time. This is now live and we see the positive impact of it on page loads.

It only helps for graphs and dashboards that have been viewed before.
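For readers wondering what such a caching layer looks like in principle, here is a minimal sketch of a read-through cache with an expiry time. It is not VRM's actual implementation: the class name `ReadThroughCache`, the loader `slow_graph_query`, the key layout and the 30-second expiry are all assumptions made up for illustration. It shows why only previously viewed graphs benefit: the first request for a key still has to hit the slow database.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class ReadThroughCache:
    """Keeps results of earlier requests so repeat requests skip the database.

    Hypothetical illustration only; names and defaults are assumptions.
    """

    def __init__(
        self,
        load: Callable[[Hashable], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load          # the slow lookup, e.g. a database query
        self._ttl = ttl_seconds    # how long a stored result stays valid
        self._clock = clock        # injectable so expiry can be tested
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Seen before and not expired: answer from memory.
            self.hits += 1
            return entry[1]
        # First view (or expired): go to the slow source and remember the result.
        self.misses += 1
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value


def slow_graph_query(key: Hashable) -> Any:
    """Hypothetical stand-in for an expensive time-series query."""
    time.sleep(0.05)  # pretend the database takes a while
    installation_id, graph = key
    return {"installation": installation_id, "graph": graph, "points": [1, 2, 3]}


if __name__ == "__main__":
    cache = ReadThroughCache(slow_graph_query, ttl_seconds=30.0)
    key = (1234, "solar_yield")  # made-up installation id and graph name
    cache.get(key)  # first view: slow, goes to the "database"
    cache.get(key)  # second view: served from the cache
    print(f"hits={cache.hits} misses={cache.misses}")
```

The trade-off in a design like this is freshness: a longer expiry takes more load off the database but shows older data, which is presumably why such a cache is applied only to certain requests.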

4 Likes

Just to confirm from my setup: I use Node-RED on OS Large, and my flow failed last night because my consumption and PV forecast nodes were showing a request timeout error.

It is now back up and running. :blush: Thanks

I have no data at all on my App. This can not happen at all, I don’t care what’s wrong. I need to know the status of my system ASAP and at any time I need/want to know. No data means no alarms. Luckily it is not winter but today it is raining and I need to control my off-grid Victron system.

Did you try restarting the app or the phone, or logging in to VRM via a normal browser?

Try clearing the app’s cache, and the data should load. The system is recovering, and many features are already working. Please don’t be a grumpy and impatient customer—show a bit more patience with the people who ensure your comfort.
Thank you.

4 Likes

Just to echo the others. You should have access. Even from this neck of the woods it is working.

1 Like

Dear all,

VRM is back to a usable state. For now we have removed the banner informing users of performance issues. Not everything performs to our own standards yet, but it is definitely a lot better than in the past two weeks.

Some known issues:

  • Consumption forecast is disabled, impacting DESS
  • Some solar forecasts are duplicated in far-future date ranges

We are looking into these.

The caching solution holds up well, and the new minimal homepage for people with over 5 installations will go live tomorrow. With those counter-measures, we hope to give ourselves a bit of room to find a solution for the root cause of these problems.

The team will continue to keep a close eye on the systems and database, and I will keep this topic pinned for 2 more days in case performance degrades again.

Thank you again for your patience!

12 Likes

Thank you, you’re appreciated

1 Like

Well done @Barbara and Victron team.

Please don’t take my (and others’) criticism as being all negative or dissatisfied. Au contraire: speaking for myself, I love the platform and community, and all my comments are well meant, with future improvements for all of us in mind.

Cheers, Jan

5 Likes

Dear Victron team,

I’m quite new to Victron. I’ve had a DESS system since February 2025.
I continue to be amazed by what Victron can do and does. Time and time again, I’m pleasantly surprised!!
Truly fantastic!!

Of course, it wasn’t pleasant to find that VRM wasn’t working properly yesterday and even before.
But from the very first moment, the blue notification bars showed that you were aware of it and were working on it. That reassured me.

And then today, noticing that VRM is working again. And reading this post with explanations and information.

Of course, every system has its problems now and then…
But the way the Victron team is handling this…
It’s truly fantastic!!
This post with text and explanation is a very positive move!!
Many large companies with systems could learn a thing or two from this!!

I completely understand that the team, in a state of high alert, has been sweating…
And working incredibly hard to get everything working again!!

And you’ve succeeded!!
You really deserve a big compliment!!
And thank you so much!!

2 Likes