Investigate CSIP-AUS limit "dips" #75

Open
longzheng opened this issue Jan 18, 2025 · 11 comments

Comments

@longzheng
Owner

longzheng commented Jan 18, 2025

I've noticed at a production site that the export limit is not always at 10 kW but sometimes "dips" down to ~9 kW.

I suspect this has something to do with the software-based ramping logic interacting with a timing/schedule issue on the CSIP-AUS server.

Image
longzheng changed the title from Investigate CSIP-AUS limit "spikes" to Investigate CSIP-AUS limit "dips" on Jan 18, 2025
@PaulSchulz

Interesting. Can you say which region/state this is in? I am wondering if it would be possible to correlate this with a reduction in network demand.

Most of those operating envelope changes seem to coincide with early afternoon (solar peak).

@CpuID
Contributor

CpuID commented Jan 19, 2025

I agree. If this is Energex, prices have been low lately and there has been lots of sun.

Middle of the day progressive export limiting is plausible.

@longzheng
Owner Author

longzheng commented Jan 19, 2025

> Interesting. Can you say which region/state this is in? I am wondering if it would be possible to correlate this with a reduction in network demand.
>
> Most of those operating envelope changes seem to coincide with early afternoon (solar peak).

This is Gold Coast, QLD.

I'm 99% sure these were not actually intentional limit changes, but some sort of bug/quirk/edge case with how the software handles non-consecutive schedules.

I haven't had time to debug it further, but I suspect the server may be sending something like:

  • schedule 1 9:00:00am - 9:30:00am
  • schedule 2 9:30:04am - 10:00:00am

And as a result my software thinks it's going into a fallback/non-active limit which requires ramping down. Then a few seconds later it sees the new active schedule and begins ramping back up.
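To make the suspected scenario concrete, here is a minimal sketch of how a gap between consecutive schedules could be detected, and how a client that falls back whenever no schedule is active would behave during that gap. The `ScheduledControl` type and both function names are hypothetical and are not taken from the project's code; the times are the ones from the example above.

```typescript
// Hypothetical shape for a scheduled control; not the project's actual type.
type ScheduledControl = {
    name: string;
    start: Date;
    end: Date; // treated as exclusive in this sketch
};

type ScheduleGap = { after: string; before: string; seconds: number };

// Returns every gap between one schedule's end and the next schedule's start.
function findScheduleGaps(controls: ScheduledControl[]): ScheduleGap[] {
    const sorted = [...controls].sort(
        (a, b) => a.start.getTime() - b.start.getTime(),
    );
    const gaps: ScheduleGap[] = [];
    let previous: ScheduledControl | undefined;
    for (const next of sorted) {
        if (previous !== undefined) {
            const seconds =
                (next.start.getTime() - previous.end.getTime()) / 1000;
            if (seconds > 0) {
                gaps.push({ after: previous.name, before: next.name, seconds });
            }
        }
        previous = next;
    }
    return gaps;
}

// Returns the schedule covering `now`, or undefined if none is active,
// which is the case where a client would fall back to the default control.
function getActiveControl(
    controls: ScheduledControl[],
    now: Date,
): ScheduledControl | undefined {
    return controls.find(
        (c) =>
            c.start.getTime() <= now.getTime() &&
            now.getTime() < c.end.getTime(),
    );
}

const exampleSchedules: ScheduledControl[] = [
    {
        name: "schedule 1",
        start: new Date("2025-01-19T09:00:00+10:00"),
        end: new Date("2025-01-19T09:30:00+10:00"),
    },
    {
        name: "schedule 2",
        start: new Date("2025-01-19T09:30:04+10:00"),
        end: new Date("2025-01-19T10:00:00+10:00"),
    },
];

// Reports one 4-second gap between schedule 1 and schedule 2.
console.log(findScheduleGaps(exampleSchedules));
```

Running a check like this over the schedules as they are received would also be one way to gather evidence on whether the server is really sending misaligned schedules.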

@PaulSchulz

PaulSchulz commented Jan 19, 2025 via email

@longzheng
Owner Author

> Hmm ok.. so need to implement something like 'continue with previous envelope until told otherwise' or does the system need to fall back if no message received?

The spec says it has to fall back if there is no active schedule.

I just don't know whether I have a bug in the software or the schedule is actually misaligned (a bug in the CSIP server).

@CpuID
Contributor

CpuID commented Jan 19, 2025

Is there sufficient debug data (logs) to be able to prove if the DNSP (Energex) is providing gaps in schedules? Eg a few seconds at a time?

If they are the cause, might be worth escalating to them to resolve at the source?

Might be better than tiptoeing around it on the receiving end...

@longzheng
Owner Author

There are probably sufficient logs, but they're a bit verbose and hard to trace back.

I've been busy, but I'm going to add some custom logging specifically for when it does this to make it easier.

@longzheng
Owner Author

I think I figured out at least one root cause of the problem.

The Energex CSIP-AUS server has been quite flaky recently: it randomly returns a lot of 502 errors. Normally this isn't a problem, because client requests have retries and most requests work off a "polling" mechanism, which means a failed request usually just leaves the client with either no data or stale data.

However, there are two specific calls (get DefaultDerControl and get DerControlList) nested inside the response of a pollable resource (get DerProgramList) which are not polled. If either of these two requests fails (after retries), it crashes the entire application because there's no try/catch.

The application crashing is not a problem in itself because Docker will automatically restart it, but after the restart the application has to ramp up from the default control export limit (1500W) to the new active control export limit (10000W), which causes the dip in the reporting.

I think I can change this behaviour by simply ignoring the failure if either the DefaultDerControl or DerControlList request fails to load, and then using whatever stale data (or no data) was already loaded previously. Worst case scenario, it'll try again at the next polling cycle.
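As a rough illustration of why a restart shows up as a visible dip, here is a sketch of a rate-limited ramp that moves the limit towards a target by a bounded step per control cycle. The function names and the 500 W step size are assumptions made up for this example; the project's actual ramp rate and logic are not described in this thread.

```typescript
// Moves the current limit one step towards the target, changing it by at
// most `maxStepWatts` per call. The step size is a hypothetical parameter.
function rampTowards(
    currentWatts: number,
    targetWatts: number,
    maxStepWatts: number,
): number {
    const delta = targetWatts - currentWatts;
    if (Math.abs(delta) <= maxStepWatts) {
        return targetWatts;
    }
    return currentWatts + Math.sign(delta) * maxStepWatts;
}

// Counts how many control cycles it takes to get from `startWatts` to `targetWatts`.
function cyclesToReach(
    startWatts: number,
    targetWatts: number,
    maxStepWatts: number,
): number {
    if (maxStepWatts <= 0) {
        throw new Error("maxStepWatts must be positive");
    }
    let current = startWatts;
    let cycles = 0;
    while (current !== targetWatts) {
        current = rampTowards(current, targetWatts, maxStepWatts);
        cycles++;
    }
    return cycles;
}

// With an assumed 500 W step, going from the 1500 W default to the
// 10000 W active limit takes 17 cycles.
console.log(cyclesToReach(1500, 10000, 500));
```

Any averaged reporting window that overlaps those cycles would show a value below 10 kW even though the active control itself never changed.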
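A minimal sketch of the "keep stale data on failure" idea described above. The helper name `loadOrKeepStale` and its parameters are hypothetical; the project's real request functions and data types are not shown in this thread, so a generic loader is injected instead.

```typescript
// Runs `load`; if it rejects (for example after retries are exhausted on a
// 502), reports the error and returns the previously loaded value, which
// may be undefined, instead of letting the rejection crash the process.
async function loadOrKeepStale<T>(
    load: () => Promise<T>,
    stale: T | undefined,
    onError: (error: unknown) => void,
): Promise<T | undefined> {
    try {
        return await load();
    } catch (error) {
        onError(error);
        return stale;
    }
}

// Usage sketch: `lastControls` stands in for whatever was loaded on the
// previous polling cycle.
async function exampleUsage(): Promise<string[] | undefined> {
    const lastControls = ["previous control"];
    return loadOrKeepStale<string[]>(
        async () => {
            throw new Error("502 Bad Gateway");
        },
        lastControls,
        (error) => console.warn("nested request failed, keeping stale data", error),
    );
}
```

The trade-off is that a persistent failure keeps stale controls in use until a later polling cycle succeeds, so logging the error is important.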

longzheng added a commit that referenced this issue Jan 23, 2025
@longzheng
Owner Author

longzheng commented Jan 24, 2025

I've found another edge case that's causing dips related to sending DER control event responses to the utility server.

As above, for polling requests, a server error is usually fine. However when switching from one control schedule to another, my client expects to successfully send a DER control event response (due to the await) when a new event starts, and if unable to do so, it falls back to the default control.

In practice, if the sending of the control event response fails, it would effectively ramp down to default control and then ramp back up to active control in a few seconds (assuming it's successful in sending the DER control event response).

I'm thinking of changing the behaviour of the client to one of the following:

  • Try to send the DER control event response with retries. Upon failure, just ignore it and never retry again.
  • Try to send the DER control event response with infinite retries.
  • Keep the existing behaviour and fall back to the default control.

I've asked the utility what their preference is.
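For reference, the first option (bounded retries, then give up) could look roughly like the sketch below. `sendWithRetry`, its parameters, and the attempt count and delay are all hypothetical; which option is appropriate depends on the utility's answer.

```typescript
// Attempts `send` up to `maxAttempts` times, waiting `delayMs` between
// attempts. Resolves to true on the first success and false if every
// attempt fails. It never throws, so the caller can decide whether a
// failed response should affect the active control.
async function sendWithRetry(
    send: () => Promise<void>,
    maxAttempts: number,
    delayMs: number,
): Promise<boolean> {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            await send();
            return true;
        } catch {
            if (attempt < maxAttempts) {
                await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
            }
        }
    }
    return false;
}
```

With this shape, the client could apply the new control first and treat a `false` result as something to log rather than a reason to fall back. The infinite-retry option would be the same loop without the attempt limit, ideally run in the background so it does not block the control switch.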

@JimboHamez

Hi Long, hopefully this will provide some value.

I have a Fronius 8 kW Gen24 inverter that uses the HA-Sunspec add-on to Home Assistant, using Modbus TCP to collect statistics for the energy page. Since moving to this add-on and changing the polling interval (currently 1 second), I am seeing a similar issue where some register values are returned as zeros.

Note that when I run your server to poll the inverter, the issue of zeros in some registers returned from the inverter is a whole lot worse.

Based on code inspection of both your project and HA-Sunspec, including tracing with Wireshark, both projects scan and poll the inverter for all of the available SunSpec models and associated registers at each polling interval (though I note that your project caches some responses).

In essence, and as noted in this project's code, I am of the opinion that the inverter cannot handle the amount of Modbus traffic it is being asked to handle.

Based on this, it might be better to target the registers your project requires instead of collecting all of the registers in each Sunspec model (cached or not).

What might inform the direction of this project (while noting the resulting inverter-specific restrictions) is that Victron and Fronius have worked together on an AC-coupled Energy Storage System that controls grid feed-in. That project specifically notes that on Gen24 inverters the user must activate the Solar API interface and, for zero feed-in, raise "Controlling via Modbus" to the number 1 position. So in this case the Victron inverter uses the Solar API interface to collect the statistics required to determine the inverter's power settings and then sets them via Modbus.

https://www.victronenergy.com/live/ac_coupling:fronius#setup_local_area_network_lan_configuration

Cheers,
Jim

@longzheng
Owner Author

@JimboHamez thanks for the comment.

> I have a Fronius 8 kW Gen24 inverter that uses the HA-Sunspec add-on to Home Assistant, using Modbus TCP to collect statistics for the energy page. Since moving to this add-on and changing the polling interval (currently 1 second), I am seeing a similar issue where some register values are returned as zeros.

Yep I've observed this with my Fronius inverter as well.

I've raised this with a rep from Fronius Australia and they mentioned that their Modbus implementation probably can't handle the (combined) polling frequency.

> both projects scan and poll the inverter for all of the available Sunspec Models and associated registers (while I note that your project caches some responses) at each polling interval.

As you've noted I've already tried to reduce the frequency of data polled by caching some models.

> Based on this, it might be better to target the registers your project requires instead of collecting all of the registers in each Sunspec model (cached or not).

My impression is that the issue is not the number of registers being polled (it's still more efficient to poll one bigger range than a bunch of smaller ranges), but rather the number of Modbus requests. I did some benchmarking a while ago and it was more efficient to poll, for example, registers 1-100 than registers 1-30 and 50-70.
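To illustrate the point about request count, here is a sketch that merges nearby register ranges so they can be read in a single request. `coalesceRanges` and its parameters are hypothetical and not part of the project. The default cap of 125 registers reflects the per-request limit for Modbus "Read Holding Registers"; the acceptable gap size is an assumption that would need benchmarking per device.

```typescript
// Inclusive register range, e.g. { start: 1, end: 30 }.
type RegisterRange = { start: number; end: number };

// Merges ranges separated by at most `maxGap` unused registers, as long as
// the merged range stays within `maxRegisters` registers per request.
function coalesceRanges(
    ranges: RegisterRange[],
    maxGap: number,
    maxRegisters: number = 125,
): RegisterRange[] {
    const sorted = [...ranges].sort((a, b) => a.start - b.start);
    const merged: RegisterRange[] = [];
    for (const range of sorted) {
        const last = merged[merged.length - 1];
        if (last !== undefined) {
            const gap = range.start - last.end - 1;
            const mergedEnd = Math.max(last.end, range.end);
            const mergedLength = mergedEnd - last.start + 1;
            if (gap <= maxGap && mergedLength <= maxRegisters) {
                last.end = mergedEnd;
                continue;
            }
        }
        merged.push({ ...range });
    }
    return merged;
}

// Registers 1-30 and 50-70 are 19 registers apart, so with a gap
// allowance of 20 they become one request for registers 1-70.
console.log(
    coalesceRanges(
        [
            { start: 1, end: 30 },
            { start: 50, end: 70 },
        ],
        20,
    ),
);
```

This trades a few wasted registers per read for fewer round trips, which matches the benchmarking result described above.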

> What might inform the direction of this project (while noting the resulting inverter-specific restrictions) is that Victron and Fronius have worked together on an AC-coupled Energy Storage System that controls grid feed-in. That project specifically notes that on Gen24 inverters the user must activate the Solar API interface and, for zero feed-in, raise "Controlling via Modbus" to the number 1 position. So in this case the Victron inverter uses the Solar API interface to collect the statistics required to determine the inverter's power settings and then sets them via Modbus.

I also had the same idea. I was even thinking of taking it a step further and using the internal web portal API to set the export limit dynamically rather than using Modbus. However, in the end I opted against it because at least the SunSpec/Modbus implementation is standardised, whereas the Solar API is very specific to Fronius.
