Investigate CSIP-AUS limit "dips" #75

longzheng · 2025-01-18T10:42:45Z

I've noticed in a production site that the export limit is not always at 10kW but sometimes "dips" down to ~9kW.

I suspect this has something to do with the software-based ramping logic interacting with maybe some timing/schedule issue with the CSIP-AUS server.

PaulSchulz · 2025-01-19T01:45:09Z

Interesting. Can you say which region/state this is in? I am wondering if it would be possible to correlate this with a reduction in network demand.

Most of those operating envelope changes seem to coincide with early afternoon (solar peak).

CpuID · 2025-01-19T03:54:55Z

I agree, if this is Energex the prices have been low lately and lots of sun.

Middle of the day progressive export limiting is plausible.

longzheng · 2025-01-19T11:24:15Z

> Interesting. Can you say which region/state this is in? I am wondering if it would be possible to correlate this with a reduction in network demand.
>
> Most of those operating envelope changes seem to coincide with early afternoon (solar peak).

This is Gold Coast, QLD.

I'm 99% sure these were not actually intentional limit changes, but some sort of bug/quirk/edge case with how the software handles non-consecutive schedules.

I haven't had time to debug it further but I suspect the server is maybe sending something like

- schedule 1 9:00:00am - 9:30:00am
- schedule 2 9:30:04am - 10:00:00am

And as a result my software thinks it's going into a fallback/non-active limit which requires ramping down. Then a few seconds later it sees the new active schedule and begins ramping back up.
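
A minimal TypeScript sketch of how a gap like this would produce a brief fallback. The `ScheduledControl` shape, the `resolveExportLimit` helper and the 1500 W placeholder default are assumptions for illustration, not the project's actual code or the CSIP-AUS schema.

```typescript
// Hypothetical shape of a scheduled control (illustrative field names only).
interface ScheduledControl {
    start: Date;
    end: Date; // treated as exclusive
    exportLimitWatts: number;
}

// Return the limit that applies at `now`: the active control's limit if one
// covers `now`, otherwise the default (fallback) limit.
function resolveExportLimit(
    controls: ScheduledControl[],
    defaultLimitWatts: number,
    now: Date,
): { source: 'active' | 'default'; limitWatts: number } {
    const t = now.getTime();
    const active = controls.find(
        (c) => c.start.getTime() <= t && t < c.end.getTime(),
    );
    return active
        ? { source: 'active', limitWatts: active.exportLimitWatts }
        : { source: 'default', limitWatts: defaultLimitWatts };
}

// Two schedules separated by a 4-second gap, as in the example above.
const controls: ScheduledControl[] = [
    {
        start: new Date('2025-01-19T09:00:00+10:00'),
        end: new Date('2025-01-19T09:30:00+10:00'),
        exportLimitWatts: 10000,
    },
    {
        start: new Date('2025-01-19T09:30:04+10:00'),
        end: new Date('2025-01-19T10:00:00+10:00'),
        exportLimitWatts: 10000,
    },
];

// Inside the gap no control is active, so the default limit applies.
console.log(resolveExportLimit(controls, 1500, new Date('2025-01-19T09:30:02+10:00')));
// A few seconds later the second control is active again.
console.log(resolveExportLimit(controls, 1500, new Date('2025-01-19T09:30:05+10:00')));
```

If the real schedules do contain a gap like this, the ramp down and back up would be the expected result of following the fallback rule rather than a client bug.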

PaulSchulz · 2025-01-19T11:54:50Z

Hmm ok.. so need to implement something like 'continue with previous envelope until told otherwise' or does the system need to fall back if no message received?


longzheng · 2025-01-19T21:07:54Z

> Hmm ok.. so need to implement something like 'continue with previous envelope until told otherwise' or does the system need to fall back if no message received?

The spec says it has to fall back if there is no active schedule.

I just don't know if I have a bug in the software or the schedule is actually misaligned (a bug with the CSIP server).

CpuID · 2025-01-19T22:16:28Z

Is there sufficient debug data (logs) to be able to prove if the DNSP (Energex) is providing gaps in schedules? Eg a few seconds at a time?

If they are the cause, might be worth escalating to them to resolve at the source?

Might be better than tiptoeing around it on the receiving end...
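
One way to check this from logged schedules is to sort the received windows and report any time between one window ending and the next starting. The sketch below assumes the windows have already been extracted from the logs into a hypothetical `ScheduleWindow` shape; all names are illustrative only.

```typescript
// Hypothetical shape for a schedule window pulled out of the logs.
interface ScheduleWindow {
    start: Date;
    end: Date;
}

interface ScheduleGap {
    previousEnd: Date;
    nextStart: Date;
    gapSeconds: number;
}

// Report every interval that is not covered by any window.
function findScheduleGaps(windows: ScheduleWindow[]): ScheduleGap[] {
    const sorted = [...windows].sort(
        (a, b) => a.start.getTime() - b.start.getTime(),
    );
    const gaps: ScheduleGap[] = [];
    // Track the latest end seen so far so overlapping windows are handled.
    let latestEnd: Date | undefined;
    for (const w of sorted) {
        if (latestEnd !== undefined && w.start.getTime() > latestEnd.getTime()) {
            gaps.push({
                previousEnd: latestEnd,
                nextStart: w.start,
                gapSeconds: (w.start.getTime() - latestEnd.getTime()) / 1000,
            });
        }
        if (latestEnd === undefined || w.end.getTime() > latestEnd.getTime()) {
            latestEnd = w.end;
        }
    }
    return gaps;
}

const loggedWindows: ScheduleWindow[] = [
    { start: new Date('2025-01-19T09:00:00+10:00'), end: new Date('2025-01-19T09:30:00+10:00') },
    { start: new Date('2025-01-19T09:30:04+10:00'), end: new Date('2025-01-19T10:00:00+10:00') },
];

console.log(findScheduleGaps(loggedWindows));
```

An empty result over a day that showed dips would point back at the client rather than the server.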

longzheng · 2025-01-20T02:09:50Z

There are probably sufficient logs but it's a bit verbose and hard to trace back.

I've been busy but I'm going to add some custom logging specifically for when it does this to make it easier.

longzheng · 2025-01-23T09:19:19Z

I think I figured out at least one root cause of the problem.

The Energex CSIP-AUS server has been quite flaky recently and often returns 502 errors at random. Normally this isn't a problem, because client requests have retries and most requests work off a "polling" mechanism, which means a failure usually just leaves either no data or stale data.

However, there are two specific calls (get DefaultDerControl and get DerControlList) nested inside the response of a pollable resource (get DerProgramList) that are not polled themselves. If either of these two requests fails (after retries), it crashes the entire application because there's no try/catch.

The application crashing is not a problem in itself because Docker will automatically restart it, but after a restart the application has to ramp up from the default control export limit (1500W) to the active control export limit (10000W), which causes the dip in the reporting.
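
To illustrate why a restart shows up as a dip, here is a rough sketch of a step-limited ramp. The per-cycle step of 500 W is a made-up number, and `rampToward` and `cyclesToReach` are hypothetical helpers, not the project's ramping code.

```typescript
// Move the current limit toward the target by at most `maxStepWatts` per cycle.
function rampToward(
    currentWatts: number,
    targetWatts: number,
    maxStepWatts: number,
): number {
    if (maxStepWatts <= 0) {
        throw new Error('maxStepWatts must be positive');
    }
    const delta = targetWatts - currentWatts;
    if (Math.abs(delta) <= maxStepWatts) {
        return targetWatts;
    }
    return currentWatts + Math.sign(delta) * maxStepWatts;
}

// Count how many control cycles pass before the target limit is reached.
function cyclesToReach(
    startWatts: number,
    targetWatts: number,
    maxStepWatts: number,
): number {
    let current = startWatts;
    let cycles = 0;
    while (current !== targetWatts) {
        current = rampToward(current, targetWatts, maxStepWatts);
        cycles++;
    }
    return cycles;
}

// From the 1500 W default to the 10000 W active limit with the assumed step.
console.log(cyclesToReach(1500, 10000, 500)); // 17 cycles with a 500 W step
```

Whatever the real ramp rate is, every restart pays this ramp-up time again, so the reported limit sits below 10 kW until the ramp completes.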

I think I can change this behaviour by simply ignoring the failure if either DefaultDerControl or DerControlList fails to load, and then just using whatever stale data (or no data) we had already loaded previously. Worst case scenario, it'll try again at the next polling cycle.
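
A minimal sketch of that idea, assuming a generic cache wrapper. `Cached`, `refreshOrKeepStale` and the callback parameters are hypothetical names for illustration and do not reflect how the project actually structures its CSIP-AUS client.

```typescript
// Hypothetical cache entry for a nested resource such as DefaultDerControl.
interface Cached<T> {
    value: T | undefined; // undefined means nothing has loaded yet
    lastUpdated: Date | undefined;
}

// Try to fetch fresh data; on failure, log and keep whatever we already had
// instead of letting the error crash the application.
async function refreshOrKeepStale<T>(
    cache: Cached<T>,
    fetchFresh: () => Promise<T>,
    onError: (error: unknown) => void,
): Promise<Cached<T>> {
    try {
        const value = await fetchFresh();
        return { value, lastUpdated: new Date() };
    } catch (error) {
        onError(error);
        return cache; // stale data (or no data) until the next polling cycle
    }
}

// Usage with a stand-in fetch that always fails like a 502 would.
async function demo(): Promise<void> {
    const stale: Cached<string> = { value: 'previous', lastUpdated: new Date(0) };
    const result = await refreshOrKeepStale(
        stale,
        async () => {
            throw new Error('502 Bad Gateway');
        },
        (error) => console.warn('keeping stale data:', error),
    );
    console.log(result.value); // still the previously loaded value
}

void demo();
```

The trade-off is that a control change can be picked up one polling cycle late, which seems preferable to a restart and a full ramp from the default limit.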


longzheng · 2025-01-24T22:21:41Z

I've found another edge case that's causing dips related to sending DER control event responses to the utility server.

As above, for polling requests, a server error is usually fine. However when switching from one control schedule to another, my client expects to successfully send a DER control event response (due to the await) when a new event starts, and if unable to do so, it falls back to the default control.

In practice, if the sending of the control event response fails, it would effectively ramp down to default control and then ramp back up to active control in a few seconds (assuming it's successful in sending the DER control event response).

I'm thinking of changing the behaviour of the client to be either

- Try to send the DER control event response with retry. Upon failure, just ignore it and never retry again.
- Try to send the DER control event response with infinite retry.
- Or keep the existing behaviour and fall back to the default control.

I've asked the utility what their preference is.
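
For reference, the first option could look roughly like the sketch below. `sendWithRetry` and its parameters are hypothetical and the attempt count and delay are placeholders; the point is only that a failed response is reported to the caller instead of forcing a fallback.

```typescript
// Try to send a response a bounded number of times. Returns true on success
// and false once all attempts have failed; it never throws.
async function sendWithRetry(
    send: () => Promise<void>,
    maxAttempts: number,
    delayMs: number,
): Promise<boolean> {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            await send();
            return true;
        } catch {
            // Wait before the next attempt, but not after the last one.
            if (attempt < maxAttempts) {
                await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
            }
        }
    }
    return false;
}

// Usage: the caller switches to the new control regardless of the outcome.
async function onEventStart(sendResponse: () => Promise<void>): Promise<void> {
    const delivered = await sendWithRetry(sendResponse, 3, 1000);
    if (!delivered) {
        console.warn('DER control event response not delivered, continuing anyway');
    }
}
```

The infinite-retry option would be the same loop without the attempt cap, ideally run in the background so it does not block the switch to the new control.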

JimboHamez · 2025-01-29T23:34:23Z

Hi Long, hopefully this will provide some value.

I have a Fronius 8 kW Gen24 inverter that uses the HA-Sunspec add-on for Home Assistant, using Modbus TCP to collect statistics for the energy page. Since moving to this add-on and changing the polling interval (currently 1 second), I am seeing a similar issue where some register values are returned as zeros.

Just to note, when I run your server to poll the inverter, the issue of zeros in some registers returned from the inverter is a whole lot worse.

Based on code inspection of both your project and HA-Sunspec, including tracing with Wireshark, both projects scan and poll the inverter for all of the available SunSpec models and associated registers at each polling interval (though I note that your project caches some responses).

In essence, and as noted in this project's code, I am of the opinion that the inverter cannot handle the amount of Modbus traffic it is being asked to handle.

Based on this, it might be better to target the registers your project requires instead of collecting all of the registers in each Sunspec model (cached or not).

What might inform the direction of this project (while noting the specific inverter restrictions as a result) is that Victron and Fronius have worked together on an AC-coupled Energy Storage System that controls grid feed-in. That project specifically notes that on Gen24 inverters the user must activate the Solar API interface and, for zero feed-in, raise "Controlling via Modbus" to the number 1 position. So in this case the Victron inverter uses the Solar API interface to collect the statistics required to determine the inverter's power settings and then sets them via Modbus.

https://www.victronenergy.com/live/ac_coupling:fronius#setup_local_area_network_lan_configuration

Cheers,
Jim

longzheng · 2025-01-30T09:32:05Z

@JimboHamez thanks for the comment.

> I have a Fronius 8 kW Gen24 inverter that uses the HA-Sunspec add-on for Home Assistant, using Modbus TCP to collect statistics for the energy page. Since moving to this add-on and changing the polling interval (currently 1 second), I am seeing a similar issue where some register values are returned as zeros.

Yep I've observed this with my Fronius inverter as well.

I've raised this with a rep from Fronius Australia and they mentioned their Modbus probably can't handle the (combined) polling frequency.

> both projects scan and poll the inverter for all of the available Sunspec Models and associated registers (while I note that your project caches some responses) at each polling interval.

As you've noted I've already tried to reduce the frequency of data polled by caching some models.

> Based on this, it might be better to target the registers your project requires instead of collecting all of the registers in each Sunspec model (cached or not).

It is my impression that the issue is not the number of registers being polled (it's still more efficient to poll one bigger range than a bunch of smaller ranges), but rather the actual number of Modbus requests. I did some benchmarking a while ago, and it was more efficient to poll, for example, registers 1-100 than registers 1-30 and 50-70.
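
A small sketch of that trade-off: merge nearby register ranges into one read when the gap between them is small, so fewer Modbus requests are sent. `mergeRanges`, the `maxGap` threshold and the example numbers are assumptions for illustration; the cap of 125 registers per request is the standard Modbus limit for a single holding-register read.

```typescript
// Inclusive range of register addresses.
interface RegisterRange {
    start: number;
    end: number;
}

// Merge ranges whose gap is at most `maxGap` unused registers, as long as the
// merged range still fits in a single request.
function mergeRanges(
    ranges: RegisterRange[],
    maxGap: number,
    maxRegistersPerRequest: number,
): RegisterRange[] {
    const sorted = [...ranges].sort((a, b) => a.start - b.start);
    const merged: RegisterRange[] = [];
    for (const range of sorted) {
        const last = merged[merged.length - 1];
        if (last !== undefined) {
            const gap = range.start - last.end - 1;
            const mergedEnd = Math.max(last.end, range.end);
            const fits = mergedEnd - last.start + 1 <= maxRegistersPerRequest;
            if (gap <= maxGap && fits) {
                last.end = mergedEnd;
                continue;
            }
        }
        merged.push({ ...range });
    }
    return merged;
}

// Registers 1-30 and 50-70 are separated by 19 unused registers, so with a
// tolerance of 20 they collapse into a single 1-70 read.
console.log(mergeRanges([{ start: 1, end: 30 }, { start: 50, end: 70 }], 20, 125));
```

Reading a few unused registers costs less than the overhead of a second request, which matches the benchmarking result described above.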

> What might inform the direction of this project (while noting the specific inverter restrictions as a result) is that Victron and Fronius have worked together on an AC-coupled Energy Storage System that controls grid feed-in. That project specifically notes that on Gen24 inverters the user must activate the Solar API interface and, for zero feed-in, raise "Controlling via Modbus" to the number 1 position. So in this case the Victron inverter uses the Solar API interface to collect the statistics required to determine the inverter's power settings and then sets them via Modbus.

I also had the same idea. I was even thinking of taking it a step further and using the internal web portal API to set the export limit dynamically rather than using Modbus. However, in the end I opted against it because at least the SunSpec/Modbus implementation is standardised, whereas the Solar API is very specific to Fronius.

longzheng changed the title ~~Investigate CSIP-AUS limit "spikes"~~ Investigate CSIP-AUS limit "dips" Jan 18, 2025

longzheng added a commit that referenced this issue Jan 23, 2025

Fix crashing when CSIP server errors for DefaultDerControl and DerControlList (1742a2d)

Related to #75
