replaced repeated progress() calculation calls with a variable #4256

Merged
merged 4 commits into from
Dec 15, 2024
7 changes: 5 additions & 2 deletions wled00/FX.h
@@ -363,6 +363,7 @@ typedef struct Segment {
};
uint8_t startY; // start Y coodrinate 2D (top); there should be no more than 255 rows
uint8_t stopY; // stop Y coordinate 2D (bottom); there should be no more than 255 rows
+ uint16_t transitionprogress; // current transition progress 0 - 0xFFFF
Collaborator

This could be a static variable as it will be modified in each handleTransition() call at the start of service() loop.

Collaborator Author

What is the current behaviour if, for example, the palette of one segment is changed with a 10s transition, and after 5s the palette of a second segment is changed? Will that stop the transition of the first segment? That is, are transition start/stop per segment or global?

Collaborator

The transition time is bound to the strip ATM, but each segment may start its transition at its own point in time. So each segment should transition on its own, independent of the others (that was my goal when implementing the current transitions, compared to the previous, limited transitions).

Collaborator @softhack007 (Nov 11, 2024)

> This could be a static variable as it will be modified in each handleTransition() call at the start of service() loop.

Rather than make it static, I'd say put it into strip.

Static member attributes of a class are always a good source of confusion when trying to read code written by someone else, because a 'static' member is technically not even part of the object you work with. It survives deleting the segment, and changing it in one object instance also changes the value in all other instances. Copying one segment to another will not create a copy of static member attributes; there will still be only one value.
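
To illustrate the point about static members, here is a minimal sketch. `DemoSegment`, `Observed`, and `writeThroughOneReadThroughOther` are hypothetical names made up for this example and are not part of WLED.

```cpp
#include <cstdint>

// Hypothetical stand-in for a segment; this is NOT the real WLED Segment.
struct DemoSegment {
  uint16_t ownProgress = 0xFFFF;   // instance member: one value per object
  static uint16_t sharedProgress;  // static member: one value for the whole class
};

// The single storage location behind every DemoSegment::sharedProgress.
uint16_t DemoSegment::sharedProgress = 0xFFFF;

// What a second, unrelated object sees after the first one was written to.
struct Observed {
  uint16_t own;
  uint16_t shared;
};

Observed writeThroughOneReadThroughOther() {
  DemoSegment a, b;
  a.ownProgress = 100;     // affects only a
  a.sharedProgress = 200;  // affects a, b, and every other instance
  return { b.ownProgress, b.sharedProgress };
}
```

In this sketch `b.ownProgress` keeps its initial value while `b.sharedProgress` reflects the write made through `a`, which is the behaviour described above.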

Collaborator

> Rather than make it static, I'd say put it into strip.

That would be a worse choice IMO. It can be considered in the same way as _vLength etc. in the speed improvements branch by @DedeHai. It is used as a pre-calculated value for speedup.
If you think this will cause confusion, keep it as an instance member rather than strip member.
But that is just my opinion, no need to take it into account.

Collaborator

progress() determines how far the transition has advanced. It should not be "frame" based; it depends only on the time since the transition (for that segment) started. BTW, it is possible to have multiple segments in transition with differing progress() values.

Collaborator Author

What is the reason progress() should depend on the order of segment processing, or when does this give an advantage over calculating it once per frame? If, for example, there are lots of segments with heavy FX calculation (>1ms), later segments would already be further into the transition within the same frame, resulting in a different palette or brightness, for example. That does not appear to be the correct approach IMHO. It would only be right if startTransition() is called with the same time delay, which I don't think it is. Using the 'per segment' update is more consistent code though, as you said. I think your suggestion of following the Segment::beginDraw() is a good compromise.

Collaborator @blazoncek (Nov 15, 2024)

> what is the reason progress() should depend on the order of segment processing

It should not be dependent on the order of segment processing. Each "segment" (if you want it) has its own "next time" for update in the form of the effect function's return value. So some effects may run at an independent frame rate. Look at Android and a few others (e.g. Game of life).

What matters is the time difference between the start of transition and current time (in regards to the transition duration). This is not frame or segment based. It is true that segment holds both start time and transition duration and these two are unique to each segment. This was the prime reason to have progress() as a function that calculates how far transition has progressed. If you impose a limit that progress is calculated at the start of frame (for optimisation for speed) then it needs to be pre-calculated for each segment separately. Hence the suggestion for beginDraw().

Why do you think that having a different progress value for different segments is wrong? Each segment operates independently and displays its effect and/or palette independently of other segments, even if overlaid. It is totally acceptable for two segments to be at different progress levels.

And it is quite easy to achieve that.
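
As a rough sketch of the time-based calculation described above, the following computes progress for a segment from its own start time and duration. `DemoTransition` and `progressAt` are hypothetical names for this example. The formula mirrors the `diff * 0xFFFFU / _dur` expression in the diff, and a 16-bit duration and 32-bit `unsigned` are assumptions.

```cpp
#include <cstdint>

// Hypothetical, simplified per-segment transition state (illustrative names).
struct DemoTransition {
  uint32_t startMs;  // when this segment's transition started
  uint16_t durMs;    // transition duration in milliseconds (assumed 16-bit)
};

// Progress in 0..0xFFFF derived only from elapsed time; 0xFFFF means finished.
uint16_t progressAt(const DemoTransition &t, uint32_t nowMs) {
  uint32_t diff = nowMs - t.startMs;  // unsigned subtraction tolerates timer wrap
  if (t.durMs > 0 && diff < t.durMs) {
    return static_cast<uint16_t>(diff * 0xFFFFU / t.durMs);
  }
  return 0xFFFFU;
}
```

With two 10 s transitions started 5 s apart, the same timestamp yields two different progress values, one per segment, which matches the per-segment behaviour argued for in this comment.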

Collaborator Author

There are different scenarios; your reasoning seems sound to me. My scenario was this: have 10 segments, all displaying the same FX, and change the palette on all of them at the same time, so a transition starts on all segments. The palette change may now not be simultaneous on all segments; the last ones may already have progressed further. Is my thinking here correct for the current code? I am not arguing against your suggestion to keep it 'per segment', which I think is the way to go, just trying to understand where a 'once per frame' would be better and where it would be worse.

Collaborator

It is correct and in most cases this will be the case (same start for all segments). But not in all cases (and not all segments may have same effect and/or palette).

So, if you have an exception, treat it as a regular behaviour (not all segments may start transition at the same time).
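
The 10-segment scenario discussed in this thread can be approximated with a small simulation. This is only a sketch: `elapsedProgress` and `sampleFrame` are hypothetical helpers, and the 2 ms render time per segment is an assumed figure, not a measurement.

```cpp
#include <cstdint>
#include <vector>

// Time-based progress in 0..0xFFFF (same formula as in the diff; names are illustrative).
uint16_t elapsedProgress(uint32_t startMs, uint16_t durMs, uint32_t nowMs) {
  uint32_t diff = nowMs - startMs;
  if (durMs > 0 && diff < durMs) return static_cast<uint16_t>(diff * 0xFFFFU / durMs);
  return 0xFFFFU;
}

// Samples progress for `count` segments that all started their transition at startMs.
// renderMs is the assumed time each segment's effect takes to draw, so each later
// segment reads a later clock. renderMs = 0 models a single per-frame timestamp.
std::vector<uint16_t> sampleFrame(unsigned count, uint32_t startMs, uint16_t durMs,
                                  uint32_t frameStartMs, uint32_t renderMs) {
  std::vector<uint16_t> out;
  uint32_t now = frameStartMs;
  for (unsigned i = 0; i < count; i++) {
    out.push_back(elapsedProgress(startMs, durMs, now));
    now += renderMs;  // the next segment is serviced after this one finished drawing
  }
  return out;
}
```

Under these assumptions (10 segments, 10 s transition, sampled 5 s in), the first and last segments differ by 118 steps out of 65535, roughly 0.18 %, while a per-frame timestamp gives identical values for all of them.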

char *name;

// runtime data
@@ -468,6 +469,7 @@ typedef struct Segment {
check3(false),
startY(0),
stopY(1),
+ transitionprogress(0xFFFF),
name(nullptr),
next_time(0),
step(0),
@@ -559,12 +561,13 @@ typedef struct Segment {
// transition functions
void startTransition(uint16_t dur); // transition has to start before actual segment values change
void stopTransition(); // ends transition mode by destroying transition structure (does nothing if not in transition)
- inline void handleTransition() { if (progress() == 0xFFFFU) stopTransition(); }
+ inline void handleTransition() { updateTransitionProgress(); if (progress() == 0xFFFFU) stopTransition(); }
#ifndef WLED_DISABLE_MODE_BLEND
void swapSegenv(tmpsegd_t &tmpSegD); // copies segment data into specifed buffer, if buffer is not a transition buffer, segment data is overwritten from transition buffer
void restoreSegenv(tmpsegd_t &tmpSegD); // restores segment data from buffer, if buffer is not transition buffer, changed values are copied to transition buffer
#endif
- [[gnu::hot]] uint16_t progress() const; // transition progression between 0-65535
+ [[gnu::hot]] void updateTransitionProgress(); // set current progression of transition
+ inline uint16_t progress() const { return transitionprogress; }; // transition progression between 0-65535
[[gnu::hot]] uint8_t currentBri(bool useCct = false) const; // current segment brightness/CCT (blended while in transition)
uint8_t currentMode() const; // currently active effect/mode (while in transition)
[[gnu::hot]] uint32_t currentColor(uint8_t slot) const; // currently active segment color (blended while in transition)
6 changes: 3 additions & 3 deletions wled00/FX_fcn.cpp
@@ -317,12 +317,12 @@ void Segment::stopTransition() {
}

// transition progression between 0-65535
- uint16_t IRAM_ATTR Segment::progress() const {
+ void IRAM_ATTR Segment::updateTransitionProgress() {
Collaborator

I think you could use IRAM_ATTR_YN here - it means that esp32 puts the function into IRAM, while 8266 doesn't. We'll save some IRAM space especially on the "_compat" builds.

As the function is only called once per frame in WS2812FX::service() - via seg.handleTransition() - it might even be better to remove the IRAM_ATTR as this call is not performance critical any more.
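
For readers unfamiliar with the idea, a conditional attribute macro of this kind might look roughly like the sketch below. The exact definition in WLED is not shown in this thread, so tying `IRAM_ATTR_YN` to `WLED_SAVE_IRAM` here is an assumption, and `scale16` is a hypothetical function used only to show where the attribute goes.

```cpp
#include <cstdint>

// Off-target fallback so this sketch compiles on a desktop compiler;
// on ESP targets the SDK headers normally provide IRAM_ATTR.
#ifndef IRAM_ATTR
  #define IRAM_ATTR
#endif

// Assumed shape of the conditional macro: expands to nothing when IRAM should
// be saved, and to IRAM_ATTR otherwise.
#ifndef IRAM_ATTR_YN
  #ifdef WLED_SAVE_IRAM
    #define IRAM_ATTR_YN
  #else
    #define IRAM_ATTR_YN IRAM_ATTR
  #endif
#endif

// Hypothetical frequently called helper: scales value by progress / 0xFFFF.
uint16_t IRAM_ATTR_YN scale16(uint16_t value, uint16_t progress) {
  return static_cast<uint16_t>((static_cast<uint32_t>(value) * progress) / 0xFFFFU);
}
```

The function body is identical either way; only its placement in memory changes, which is why the choice can be made per build target.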

Collaborator Author

I thought about removing the attribute but left it as is, since I have no way to check the difference. As a test, I once removed all IRAM_ATTR, and on my setup there was zero performance change.
I think removing it here is safe; as you say, this is only called once per frame.

Collaborator

esp8266 will thank you ;-)

Collaborator Author @DedeHai (Nov 10, 2024)

Maybe WLED_SAVE_IRAM should also be defined on the ESP32 C3: it is not as performant as the ESP32 anyway, and putting stuff in IRAM uses a lot more flash for some reason. If I enable WLED_SAVE_IRAM on the C3, that saves 1.6k of flash. On the ESP32 it only saves 68 bytes of flash.
Any suggestions for performance tests that would show if this is a valid option?

Collaborator

Any function marked with IRAM_ATTR will always be kept in fast SRAM and will never be fetched from flash. The basic idea for IRAM_ATTR is to be used in ISRs or functions that may access (write to) flash directly.
The benefit of using it elsewhere is to speed up access to such a function, as it will never go through the cache hit/miss logic.

Contrary to what @softhack007 is saying, I still think adding IRAM_ATTR to functions that are called very often is beneficial. I say this from experience with over 50 installed ESP8266s with various options and usermods installed.

Collaborator

> Contrary to what @softhack007 is saying I still think adding IRAM_ATTR to functions that are called very often is beneficial.

It might be beneficial, however we talked about the new progress() that's only called a few hundred times per second (max) now. The function is not time critical any more with this PR, so why use IRAM_ATTR for it?

Collaborator @softhack007 (Nov 11, 2024)

(side-topic)

> I once as a test removed all IRAM_ATTR and on my setup there was zero performance change.

This aligns with my own experiments on -S3 and esp32 with 80 MHz flash: no noticeable performance impact, however sometimes IRAM_ATTR increases program size. This can be explained because the compiler cannot inline such a function, even when there would be a benefit for program size.

It may also depend a lot on the CPU caches. In fact, a function that's called really often has a good chance of being cached by the CPU already. Also, a board with fast flash (qio 80 MHz) is about 4x faster on flash reading compared to slow flash (dout 40 MHz).

Many cheap 8266 boards still have 40 MHz dout flash, plus smaller caches, so it makes sense that there is still some benefit of IRAM_ATTR on boards with slow flash.

Collaborator

I have no doubt that any ESP32 performs adequately without IRAM_ATTR.
However, ESP8266 is another thing, and while it may be old and lacking, it is still used by many users (including me) who keep attaching plenty of peripherals to it while running WLED. Hence I strongly urge keeping IRAM_ATTR in as many places as possible.

Collaborator Author

Can you test if this PR with the latest commit (which removes IRAM_ATTR from updateTransitionProgress()) has any impact on ESP8266?

The other question was whether to add WLED_SAVE_IRAM to C3 builds, as it would save on flash size but may have negative performance impacts in certain situations / setups. If there is a way to test that on the C3, I could check. After all, the C3 is more of an upgraded ESP8266 compared to other ESP32 variants.

+ transitionprogress = 0xFFFFU;
if (isInTransition()) {
unsigned diff = millis() - _t->_start;
- if (_t->_dur > 0 && diff < _t->_dur) return diff * 0xFFFFU / _t->_dur;
+ if (_t->_dur > 0 && diff < _t->_dur) transitionprogress = diff * 0xFFFFU / _t->_dur;
softhack007 marked this conversation as resolved.
}
- return 0xFFFFU;
}
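
A side note on the arithmetic in `diff * 0xFFFFU / _t->_dur`: assuming `_dur` is a 16-bit value (as `startTransition(uint16_t dur)` suggests) and `unsigned` is 32 bits wide, the intermediate product cannot overflow, because the guard `diff < _t->_dur` keeps `diff` at or below 0xFFFE. The constant names in this sketch are illustrative only.

```cpp
#include <cstdint>

// Largest diff that passes the guard `diff < dur` when dur is at most 0xFFFF.
constexpr uint64_t kMaxDiff = 0xFFFEULL;

// Largest intermediate product diff * 0xFFFF, computed in 64 bits for the check.
constexpr uint64_t kMaxProduct = kMaxDiff * 0xFFFFULL;

// The product still fits in a 32-bit unsigned value, so no wrap-around occurs.
static_assert(kMaxProduct <= 0xFFFFFFFFULL, "diff * 0xFFFF fits in 32 bits");
```

If the duration were ever widened beyond 16 bits, this bound would no longer hold and the multiplication would need a wider intermediate type.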

#ifndef WLED_DISABLE_MODE_BLEND