Speed improvements for discussion #4138

Merged: 44 commits, Dec 20, 2024
Changes from 15 commits
Commits
c3f472f
some improvements to consider
DedeHai Sep 11, 2024
9341768
more improvements to color_scale() now even faster.
DedeHai Sep 12, 2024
feac45f
improvement in color_add
DedeHai Sep 12, 2024
992d11b
Improvements in get/set PixelColor()
DedeHai Sep 12, 2024
b07658b
improved Segment::setPixelColorXY a tiny bit
DedeHai Sep 12, 2024
09428dc
inlined getMappedPixelIndex, improved color_add, bugfix in colorFromP…
DedeHai Sep 12, 2024
ec938f2
removed old code
DedeHai Sep 12, 2024
d45b4ad
fixes and consistency
DedeHai Sep 13, 2024
2afff05
minor tweak (break instead of continue in setPixelColorXY)
DedeHai Sep 14, 2024
6a37f25
memory improvement: dropped static gamma table
DedeHai Sep 14, 2024
0e5bd4e
remove test printout
DedeHai Sep 14, 2024
f3137eb
updated Segment::color_from_palette
DedeHai Sep 14, 2024
686866c
Merge remote-tracking branch 'upstream/0_15' into 0_15__speed_improve…
DedeHai Sep 18, 2024
6962905
cleanup and improved color_add()
DedeHai Sep 18, 2024
a88436c
revert removal of adding with saturation, renamed 'fast' to 'saturate'
DedeHai Sep 19, 2024
17d59d3
adding initialization to vStrip, added comment on padding bytes
DedeHai Sep 22, 2024
0a54002
removed IRAM_ATTR from inlined function
DedeHai Sep 22, 2024
33cf82a
Indentations and a few optimisations
blazoncek Sep 23, 2024
906f8fc
Fix C3 compiler issue.
blazoncek Sep 25, 2024
bef1ac2
Added HSV2RGB and RGB2HSV functions for higher accuracy conversions
DedeHai Sep 25, 2024
c44b9f8
Merge remote-tracking branch 'upstream/0_15' into 0_15__speed_improve…
DedeHai Sep 26, 2024
b404458
fixed one forgotten replacement of rgb2hsv_approximate
DedeHai Sep 26, 2024
a76a895
bugfix
DedeHai Sep 27, 2024
7c0fe12
updated setPixelColor() and getPixelColor() functions
DedeHai Sep 28, 2024
202901b
bugfix, ESP32 compiler requires the color order to be identical
DedeHai Sep 28, 2024
c842994
Pre-calculate virtual
blazoncek Sep 28, 2024
9114867
Fix compiler error
blazoncek Sep 28, 2024
ffbc8c5
Reverting addition of `bool unScale`, added new improvements and fixes
DedeHai Sep 29, 2024
336da25
Private global _colorScaled
blazoncek Sep 29, 2024
8e78fb4
Merge branch '0_15' into 0_15__speed_improvements
blazoncek Sep 29, 2024
0ae7329
Update comment
blazoncek Sep 29, 2024
ee380c5
Replace uint16_t with unsigned for segment data
blazoncek Sep 30, 2024
ba3a61f
Reduced code size by:
blazoncek Oct 2, 2024
a15c391
Improvement to `setPixelColorXY` and some flash optimisations
DedeHai Oct 3, 2024
ca06214
removed todo.
DedeHai Oct 3, 2024
eb5ad23
Minor tweaks and whitespace
blazoncek Oct 5, 2024
be64930
Indentation and shadowed variable.
blazoncek Oct 7, 2024
210191b
Fix for realtime drawing on main segment
blazoncek Oct 7, 2024
ef1e24c
Bugfix & code reduction
blazoncek Nov 9, 2024
5c2bac4
Merge branch '0_15' into 0_15__speed_improvements
blazoncek Nov 9, 2024
0a05611
more improvements to setPixelColor
DedeHai Nov 26, 2024
cae9845
Merge remote-tracking branch 'upstream/main' into 0_15__speed_improve…
DedeHai Dec 20, 2024
7b9b3f1
merge fix
DedeHai Dec 20, 2024
3323d2e
another merge fix
DedeHai Dec 20, 2024
24 changes: 12 additions & 12 deletions wled00/FX.h
@@ -595,21 +595,21 @@ typedef struct Segment {
void fadeToBlackBy(uint8_t fadeBy);
inline void blendPixelColor(int n, uint32_t color, uint8_t blend) { setPixelColor(n, color_blend(getPixelColor(n), color, blend)); }
inline void blendPixelColor(int n, CRGB c, uint8_t blend) { blendPixelColor(n, RGBW32(c.r,c.g,c.b,0), blend); }
inline void addPixelColor(int n, uint32_t color, bool fast = false) { setPixelColor(n, color_add(getPixelColor(n), color, fast)); }
inline void addPixelColor(int n, byte r, byte g, byte b, byte w = 0, bool fast = false) { addPixelColor(n, RGBW32(r,g,b,w), fast); }
inline void addPixelColor(int n, CRGB c, bool fast = false) { addPixelColor(n, RGBW32(c.r,c.g,c.b,0), fast); }
inline void addPixelColor(int n, uint32_t color, bool saturate = false) { setPixelColor(n, color_add(getPixelColor(n), color, saturate)); }
inline void addPixelColor(int n, byte r, byte g, byte b, byte w = 0, bool saturate = false) { addPixelColor(n, RGBW32(r,g,b,w), saturate); }
inline void addPixelColor(int n, CRGB c, bool saturate = false) { addPixelColor(n, RGBW32(c.r,c.g,c.b,0), saturate); }
inline void fadePixelColor(uint16_t n, uint8_t fade) { setPixelColor(n, color_fade(getPixelColor(n), fade, true)); }
[[gnu::hot]] uint32_t color_from_palette(uint16_t, bool mapping, bool wrap, uint8_t mcol, uint8_t pbri = 255) const;
[[gnu::hot]] uint32_t color_wheel(uint8_t pos) const;

// 2D Blur: shortcuts for bluring columns or rows only (50% faster than full 2D blur)
inline void blurCols(fract8 blur_amount, bool smear = false) { // blur all columns
const unsigned cols = virtualWidth();
for (unsigned k = 0; k < cols; k++) blurCol(k, blur_amount, smear);
for (unsigned k = 0; k < cols; k++) blurCol(k, blur_amount, smear);
}
inline void blurRows(fract8 blur_amount, bool smear = false) { // blur all rows
const unsigned rows = virtualHeight();
for ( unsigned i = 0; i < rows; i++) blurRow(i, blur_amount, smear);
for ( unsigned i = 0; i < rows; i++) blurRow(i, blur_amount, smear);
}

// 2D matrix
@@ -632,10 +632,10 @@ typedef struct Segment {
// 2D support functions
inline void blendPixelColorXY(uint16_t x, uint16_t y, uint32_t color, uint8_t blend) { setPixelColorXY(x, y, color_blend(getPixelColorXY(x,y), color, blend)); }
inline void blendPixelColorXY(uint16_t x, uint16_t y, CRGB c, uint8_t blend) { blendPixelColorXY(x, y, RGBW32(c.r,c.g,c.b,0), blend); }
inline void addPixelColorXY(int x, int y, uint32_t color, bool fast = false) { setPixelColorXY(x, y, color_add(getPixelColorXY(x,y), color, fast)); }
inline void addPixelColorXY(int x, int y, byte r, byte g, byte b, byte w = 0, bool fast = false) { addPixelColorXY(x, y, RGBW32(r,g,b,w), fast); }
inline void addPixelColorXY(int x, int y, CRGB c, bool fast = false) { addPixelColorXY(x, y, RGBW32(c.r,c.g,c.b,0), fast); }
inline void fadePixelColorXY(uint16_t x, uint16_t y, uint8_t fade) { setPixelColorXY(x, y, color_fade(getPixelColorXY(x,y), fade, true)); }
inline void addPixelColorXY(int x, int y, uint32_t color, bool saturate = false) { setPixelColorXY(x, y, color_add(getPixelColorXY(x,y), color, saturate)); }
inline void addPixelColorXY(int x, int y, byte r, byte g, byte b, byte w = 0, bool saturate = false) { addPixelColorXY(x, y, RGBW32(r,g,b,w), saturate); }
inline void addPixelColorXY(int x, int y, CRGB c, bool saturate = false) { addPixelColorXY(x, y, RGBW32(c.r,c.g,c.b,0), saturate); }
inline void fadePixelColorXY(uint16_t x, uint16_t y, uint8_t fade) { setPixelColorXY(x, y, color_fade(getPixelColorXY(x,y), fade, true)); }
void box_blur(unsigned r = 1U, bool smear = false); // 2D box blur
void blur2D(uint8_t blur_amount, bool smear = false);
void blurRow(uint32_t row, fract8 blur_amount, bool smear = false);
@@ -670,9 +670,9 @@ typedef struct Segment {
inline uint32_t getPixelColorXY(int x, int y) { return getPixelColor(x); }
inline void blendPixelColorXY(uint16_t x, uint16_t y, uint32_t c, uint8_t blend) { blendPixelColor(x, c, blend); }
inline void blendPixelColorXY(uint16_t x, uint16_t y, CRGB c, uint8_t blend) { blendPixelColor(x, RGBW32(c.r,c.g,c.b,0), blend); }
inline void addPixelColorXY(int x, int y, uint32_t color, bool fast = false) { addPixelColor(x, color, fast); }
inline void addPixelColorXY(int x, int y, byte r, byte g, byte b, byte w = 0, bool fast = false) { addPixelColor(x, RGBW32(r,g,b,w), fast); }
inline void addPixelColorXY(int x, int y, CRGB c, bool fast = false) { addPixelColor(x, RGBW32(c.r,c.g,c.b,0), fast); }
inline void addPixelColorXY(int x, int y, uint32_t color, bool saturate = false) { addPixelColor(x, color, saturate); }
inline void addPixelColorXY(int x, int y, byte r, byte g, byte b, byte w = 0, bool saturate = false) { addPixelColor(x, RGBW32(r,g,b,w), saturate); }
inline void addPixelColorXY(int x, int y, CRGB c, bool saturate = false) { addPixelColor(x, RGBW32(c.r,c.g,c.b,0), saturate); }
inline void fadePixelColorXY(uint16_t x, uint16_t y, uint8_t fade) { fadePixelColor(x, fade); }
inline void box_blur(unsigned i, bool vertical, fract8 blur_amount) {}
inline void blur2D(uint8_t blur_amount, bool smear = false) {}
49 changes: 23 additions & 26 deletions wled00/FX_2Dfcn.cpp
@@ -181,39 +181,36 @@ void IRAM_ATTR_YN Segment::setPixelColorXY(int x, int y, uint32_t col)
if (reverse ) x = virtualWidth() - x - 1;
if (reverse_y) y = virtualHeight() - y - 1;
if (transpose) { std::swap(x,y); } // swap X & Y if segment transposed

x *= groupLength(); // expand to physical pixels
y *= groupLength(); // expand to physical pixels

int W = width();
int H = height();
if (x >= W || y >= H) return; // if pixel would fall out of segment just exit

uint32_t tmpCol = col;
int yY = y;
for (int j = 0; j < grouping; j++) { // groupping vertically
if (yY >= H) break;
int xX = x;
for (int g = 0; g < grouping; g++) { // groupping horizontally
int xX = (x+g), yY = (y+j);
if (xX >= W || yY >= H) continue; // we have reached one dimension's end

if (xX >= W) continue; // we have reached one dimension's end
#ifndef WLED_DISABLE_MODE_BLEND
// if blending modes, blend with underlying pixel
if (_modeBlend) tmpCol = color_blend(strip.getPixelColorXY(start + xX, startY + yY), col, 0xFFFFU - progress(), true);
if (_modeBlend) col = color_blend(strip.getPixelColorXY(start + xX, startY + yY), col, 0xFFFFU - progress(), true);
#endif

strip.setPixelColorXY(start + xX, startY + yY, tmpCol);

strip.setPixelColorXY(start + xX, startY + yY, col);
if (mirror) { //set the corresponding horizontally mirrored pixel
if (transpose) strip.setPixelColorXY(start + xX, startY + height() - yY - 1, tmpCol);
else strip.setPixelColorXY(start + width() - xX - 1, startY + yY, tmpCol);
if (transpose) strip.setPixelColorXY(start + xX, startY + height() - yY - 1, col);
else strip.setPixelColorXY(start + width() - xX - 1, startY + yY, col);
}
if (mirror_y) { //set the corresponding vertically mirrored pixel
if (transpose) strip.setPixelColorXY(start + width() - xX - 1, startY + yY, tmpCol);
else strip.setPixelColorXY(start + xX, startY + height() - yY - 1, tmpCol);
if (transpose) strip.setPixelColorXY(start + width() - xX - 1, startY + yY, col);
else strip.setPixelColorXY(start + xX, startY + height() - yY - 1, col);
}
if (mirror_y && mirror) { //set the corresponding vertically AND horizontally mirrored pixel
strip.setPixelColorXY(start + width() - xX - 1, startY + height() - yY - 1, tmpCol);
strip.setPixelColorXY(start + width() - xX - 1, startY + height() - yY - 1, col);
}
xX++;
}
yY++;
}
}
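
For reference, the restructured grouping loop in this hunk can be looked at in isolation. The following is a minimal sketch, not the WLED code: `groupPixels` and all of its parameters are hypothetical names, blending and mirroring are left out, and the inner bounds check uses `break` where the hunk above uses `continue` (an assumption made here for simplicity; once `xX` reaches `W` no later pixel in that row can be in range either).

```cpp
#include <utility>
#include <vector>

// Hypothetical helper: list the physical pixels that one logical pixel (x, y)
// expands to, using running counters xX / yY as in the hunk above.
// W and H are the physical width and height of the segment.
std::vector<std::pair<int, int>> groupPixels(int x, int y, int grouping,
                                             int groupLength, int W, int H) {
  std::vector<std::pair<int, int>> out;
  x *= groupLength;  // expand to physical pixels
  y *= groupLength;
  if (x < 0 || y < 0 || x >= W || y >= H) return out;  // outside the segment

  int yY = y;
  for (int j = 0; j < grouping; j++) {    // grouping vertically
    if (yY >= H) break;                   // ran off the bottom edge
    int xX = x;
    for (int g = 0; g < grouping; g++) {  // grouping horizontally
      if (xX >= W) break;                 // ran off the right edge
      out.emplace_back(xX, yY);           // the real code sets the pixel here
      xX++;
    }
    yY++;
  }
  return out;
}
```

The point of the running counters is that the bounds test for the row is done once per row instead of once per pixel.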

@@ -296,8 +293,8 @@ void Segment::blurRow(uint32_t row, fract8 blur_amount, bool smear){
curnew = color_fade(cur, keep);
if (x > 0) {
if (carryover)
curnew = color_add(curnew, carryover, true);
uint32_t prev = color_add(lastnew, part, true);
curnew = color_add(curnew, carryover);
uint32_t prev = color_add(lastnew, part);
if (last != prev) // optimization: only set pixel if color has changed
setPixelColorXY(x - 1, row, prev);
} else // first pixel
@@ -329,15 +326,15 @@ void Segment::blurCol(uint32_t col, fract8 blur_amount, bool smear) {
curnew = color_fade(cur, keep);
if (y > 0) {
if (carryover)
curnew = color_add(curnew, carryover, true);
uint32_t prev = color_add(lastnew, part, true);
curnew = color_add(curnew, carryover);
uint32_t prev = color_add(lastnew, part);
if (last != prev) // optimization: only set pixel if color has changed
setPixelColorXY(col, y - 1, prev);
} else // first pixel
setPixelColorXY(col, y, curnew);
lastnew = curnew;
last = cur; //save original value for comparison on next iteration
carryover = part;
carryover = part;
}
setPixelColorXY(col, rows - 1, curnew);
}
@@ -359,8 +356,8 @@ void Segment::blur2D(uint8_t blur_amount, bool smear) {
uint32_t part = color_fade(cur, seep);
curnew = color_fade(cur, keep);
if (x > 0) {
if (carryover) curnew = color_add(curnew, carryover, true);
uint32_t prev = color_add(lastnew, part, true);
if (carryover) curnew = color_add(curnew, carryover);
uint32_t prev = color_add(lastnew, part);
// optimization: only set pixel if color has changed
if (last != prev) setPixelColorXY(x - 1, row, prev);
} else setPixelColorXY(x, row, curnew); // first pixel
@@ -378,14 +375,14 @@ void Segment::blur2D(uint8_t blur_amount, bool smear) {
uint32_t part = color_fade(cur, seep);
curnew = color_fade(cur, keep);
if (y > 0) {
if (carryover) curnew = color_add(curnew, carryover, true);
uint32_t prev = color_add(lastnew, part, true);
if (carryover) curnew = color_add(curnew, carryover);
uint32_t prev = color_add(lastnew, part);
Collaborator

Maybe I overlooked something in the previous discussions - I guess that setting the last parameter to true was done on purpose ... are you sure it does the same thing?

Collaborator Author

@DedeHai DedeHai Sep 19, 2024

This change still needs approval. It does not do 100% the same thing, but it is faster. I tested it with fire2012 at full blur, and the 'overflow' happens only once (i.e. for one pixel) every few frames. There may be situations where the change is actually visible, but I could not find one, except for the effects I updated with the 'smear' functionality in the other PR. Smearing looks better without saturation, but it is a new feature, so no backwards compatibility is required.
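
The question in this thread is what happens when a channel sum exceeds 255. As a rough illustration, here are two common ways to handle that: clamping each channel on its own, and rescaling all channels so their ratio is kept. This is a sketch under assumptions, not WLED's `color_add()`: the `Rgbw` struct and both function names are hypothetical, and nothing here says which behaviour the `saturate` flag selects.

```cpp
#include <algorithm>

// Hypothetical unpacked colour; channels are kept wide so sums can exceed 255.
struct Rgbw { unsigned r, g, b, w; };

// Variant 1: clamp every channel separately at 255.
// Cheap, but the hue shifts once one channel clips.
Rgbw addSaturating(Rgbw a, Rgbw c) {
  return { std::min(a.r + c.r, 255u), std::min(a.g + c.g, 255u),
           std::min(a.b + c.b, 255u), std::min(a.w + c.w, 255u) };
}

// Variant 2: add, then scale all channels down by the largest one
// if it overflowed, so the colour ratio is preserved.
Rgbw addRatioPreserving(Rgbw a, Rgbw c) {
  unsigned r = a.r + c.r, g = a.g + c.g, b = a.b + c.b, w = a.w + c.w;
  unsigned mx = std::max({r, g, b, w});
  if (mx < 256) return {r, g, b, w};  // no overflow, nothing to do
  return {r * 255 / mx, g * 255 / mx, b * 255 / mx, w * 255 / mx};
}
```

Adding (200, 100, 0) and (100, 100, 0) gives (255, 200, 0) with the first variant and (255, 170, 0) with the second. The two only differ when at least one channel overflows, which fits the observation above that the difference shows up for a single pixel every few frames.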

// optimization: only set pixel if color has changed
if (last != prev) setPixelColorXY(col, y - 1, prev);
} else setPixelColorXY(col, y, curnew); // first pixel
lastnew = curnew;
last = cur; //save original value for comparison on next iteration
carryover = part;
carryover = part;
}
setPixelColorXY(col, rows - 1, curnew);
}
39 changes: 23 additions & 16 deletions wled00/FX_fcn.cpp
@@ -712,11 +712,16 @@ void IRAM_ATTR_YN Segment::setPixelColor(int i, uint32_t col)
{
if (!isActive()) return; // not active
#ifndef WLED_DISABLE_2D
int vStrip = i>>16; // hack to allow running on virtual strips (2D segment columns/rows)
int vStrip;
Collaborator

I'm not a big fan of leaving locals uninitialized... are you sure that vStrip is never used without first assigning a value?

Collaborator Author

good call.
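
To make the concern concrete: in the new code `vStrip` is only assigned inside the out-of-range branch, yet it is read later in `if (vStrip>0)`. A minimal sketch of the same check order with the local initialised (which is what the later commit "adding initialization to vStrip" describes) could look like this. `decodeIndex` and `Decoded` are hypothetical names, and the encoding of the strip number in the upper 16 bits is inferred from `i>>16` and `vStrip - 1` in the diff.

```cpp
// Hypothetical result type for the sketch.
struct Decoded {
  int vStrip;  // 0 = plain index, n > 0 = virtual strip n - 1
  int index;   // index within the (virtual) strip
  bool valid;  // false if the pixel falls outside the segment
};

// Same order of checks as the new setPixelColor(): the common case
// (a plain, in-range index) skips the shift and mask entirely.
Decoded decodeIndex(int i, int virtualLength) {
  int vStrip = 0;  // initialised, so a later `vStrip > 0` test is well defined
  if (i >= virtualLength || i < 0) {
    vStrip = i >> 16;  // upper bits carry the virtual strip number
    i &= 0xFFFF;       // truncate to the index part
    if (i >= virtualLength || i < 0) return {vStrip, i, false};
  }
  return {vStrip, i, true};
}
```

Without the `= 0`, the plain in-range path would leave `vStrip` indeterminate and the `vStrip>0` branch would depend on whatever was on the stack.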

#endif
i &= 0xFFFF;

if (i >= virtualLength() || i<0) return; // if pixel would fall out of segment just exit
if (i >= virtualLength() || i<0) // pixel would fall out of segment, check if this is a virtual strip NOTE: this is almost always false if not virtual strip, saves the calculation on 'standard' call
{
#ifndef WLED_DISABLE_2D
vStrip = i>>16; // hack to allow running on virtual strips (2D segment columns/rows)
#endif
i &= 0xFFFF; //truncate vstrip index
if (i >= virtualLength() || i<0) return; // if pixel would still fall out of segment just exit
}

#ifndef WLED_DISABLE_2D
if (is2D()) {
@@ -730,7 +735,7 @@ void IRAM_ATTR_YN Segment::setPixelColor(int i, uint32_t col)
case M12_pBar:
// expand 1D effect vertically or have it play on virtual strips
if (vStrip>0) setPixelColorXY(vStrip - 1, vH - i - 1, col);
else for (int x = 0; x < vW; x++) setPixelColorXY(x, vH - i - 1, col);
else for (int x = 0; x < vW; x++) setPixelColorXY(x, vH - i - 1, col);
break;
case M12_pArc:
// expand in circular fashion from center
@@ -791,7 +796,7 @@ void IRAM_ATTR_YN Segment::setPixelColor(int i, uint32_t col)
// Odd rays start further from center if prevRay started at center.
static int prevRay = INT_MIN; // previous ray number
if ((i % 2 == 1) && (i - 1 == prevRay || i + 1 == prevRay)) {
int jump = min(vW/3, vH/3); // can add 2 if using medium pinwheel
int jump = min(vW/3, vH/3); // can add 2 if using medium pinwheel
posx += inc_x * jump;
posy += inc_y * jump;
}
@@ -907,8 +912,7 @@ uint32_t IRAM_ATTR_YN Segment::getPixelColor(int i) const
if (!isActive()) return 0; // not active
#ifndef WLED_DISABLE_2D
int vStrip = i>>16;
#endif
i &= 0xFFFF;
#endif

#ifndef WLED_DISABLE_2D
if (is2D()) {
@@ -919,7 +923,7 @@ uint32_t IRAM_ATTR_YN Segment::getPixelColor(int i) const
return getPixelColorXY(i % vW, i / vW);
break;
case M12_pBar:
if (vStrip>0) return getPixelColorXY(vStrip - 1, vH - i -1);
if (vStrip>0) { i &= 0xFFFF; return getPixelColorXY(vStrip - 1, vH - i -1); }
else return getPixelColorXY(0, vH - i -1);
break;
case M12_pArc:
@@ -1141,8 +1145,8 @@ void Segment::blur(uint8_t blur_amount, bool smear) {
uint32_t part = color_fade(cur, seep);
curnew = color_fade(cur, keep);
if (i > 0) {
if (carryover) curnew = color_add(curnew, carryover, true);
uint32_t prev = color_add(lastnew, part, true);
if (carryover) curnew = color_add(curnew, carryover);
uint32_t prev = color_add(lastnew, part);
// optimization: only set pixel if color has changed
if (last != prev) setPixelColor(i - 1, prev);
} else // first pixel
@@ -1165,7 +1169,7 @@ uint32_t Segment::color_wheel(uint8_t pos) const {
pos = 255 - pos;
if (pos < 85) {
return RGBW32((255 - pos * 3), 0, (pos * 3), w);
} else if(pos < 170) {
} else if (pos < 170) {
pos -= 85;
return RGBW32(0, (pos * 3), (255 - pos * 3), w);
} else {
@@ -1184,18 +1188,21 @@ uint32_t Segment::color_wheel(uint8_t pos) const {
* @returns Single color from palette
*/
uint32_t Segment::color_from_palette(uint16_t i, bool mapping, bool wrap, uint8_t mcol, uint8_t pbri) const {
uint32_t color = gamma32(currentColor(mcol));

uint32_t color = currentColor(mcol);
// default palette or no RGB support on segment
if ((palette == 0 && mcol < NUM_COLORS) || !_isRGB) return (pbri == 255) ? color : color_fade(color, pbri, true);
if ((palette == 0 && mcol < NUM_COLORS) || !_isRGB) {
color = gamma32(color);
return (pbri == 255) ? color : color_fade(color, pbri, true);
}

unsigned paletteIndex = i;
if (mapping && virtualLength() > 1) paletteIndex = (i*255)/(virtualLength() -1);
// paletteBlend: 0 - wrap when moving, 1 - always wrap, 2 - never wrap, 3 - none (undefined)
if (!wrap && strip.paletteBlend != 3) paletteIndex = scale8(paletteIndex, 240); //cut off blend at palette "end"
CRGB fastled_col = ColorFromPalette(_currentPalette, paletteIndex, pbri, (strip.paletteBlend == 3)? NOBLEND:LINEARBLEND); // NOTE: paletteBlend should be global

return RGBW32(fastled_col.r, fastled_col.g, fastled_col.b, W(color));
return RGBW32(fastled_col.r, fastled_col.g, fastled_col.b, gamma8(W(color)));
}
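
The index arithmetic in `color_from_palette()` is easy to misread, so here is a small stand-alone sketch of just that part. It is an illustration under assumptions: `paletteIndexFor` is a hypothetical name, the `strip.paletteBlend != 3` condition is folded into the `wrap` argument, and `scale8Sketch` is a simplified stand-in for FastLED's `scale8()` that computes `(i * scale) >> 8`, so results may differ by one from the real function depending on how FastLED is configured.

```cpp
#include <cstdint>

// Simplified stand-in for FastLED's scale8(): scales i by scale/256.
constexpr uint8_t scale8Sketch(uint8_t i, uint8_t scale) {
  return static_cast<uint8_t>((static_cast<uint16_t>(i) * scale) >> 8);
}

// Hypothetical helper mirroring the index computation in the hunk above.
constexpr unsigned paletteIndexFor(unsigned i, unsigned length,
                                   bool mapping, bool wrap) {
  unsigned idx = i;
  // stretch pixel position 0..length-1 onto palette position 0..255
  if (mapping && length > 1) idx = (i * 255) / (length - 1);
  // without wrapping, stop at 240/256 of the palette so the last entry
  // does not blend back towards the first one
  if (!wrap) idx = scale8Sketch(static_cast<uint8_t>(idx), 240);
  return idx;
}
```

For a 30-pixel segment the last pixel maps to index 255 with wrapping and to 239 without it.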


@@ -1877,7 +1884,7 @@ bool WS2812FX::deserializeMap(uint8_t n) {
return (customMappingSize > 0);
}

uint16_t IRAM_ATTR WS2812FX::getMappedPixelIndex(uint16_t index) const {
__attribute__ ((always_inline)) inline uint16_t IRAM_ATTR WS2812FX::getMappedPixelIndex(uint16_t index) const {
Collaborator

Not obligatory: I would prefer __attribute__ at the end, but [[...]] in front.

Collaborator Author

I honestly have no idea what the precompiler instructions mean in detail; I copied this from what FastLED uses.
The idea is to get rid of the 'function entry' instructions that are added when a function is called. When I added the inline, flash size increased by a few bytes, which tells me that it is actually inlined. Since this short function is called from only two places and is called A LOT, this may be faster. I have no way to check (I would need a proper debugger that steps through the assembly instructions as they are executed).

Collaborator

As I've recently learned these are compiler attributes.

Collaborator

@softhack007 softhack007 Sep 19, 2024

🤔 not sure if always_inline plays well with IRAM_ATTR .... the first tells the compiler to always inline the function, the latter says "put the function into IRAM" which means that a real function is needed.

Collaborator Author

That is what I was wondering too. This is just a suggestion, i.e. to inline this for speed, but I am not sure how the compiler will inline it into the functions that are in RAM.

Collaborator

@softhack007 softhack007 Sep 19, 2024

To my understanding:

  • inline is a hint/suggestion to the compiler. So it might get inlined, or not.
  • __attribute__((always_inline)) is a directive. So the compiler must inline this function, no matter if it's efficient or not.

If you want to optimize function calls, it's sometimes useful to add __attribute__((pure)) or __attribute__((const)) to the function declaration. But only do this after double-checking that the code is actually "pure" (no side effects) or "const" (depends solely on its arguments). I did this in the MoonModules fork, but honestly it does not give you more than 1 or 2 fps, even if you apply it to lots of functions.

See MoonModules@7f9da30
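
For readers who have not used these attributes, here is a short sketch of the four variants mentioned in this thread, in both the `__attribute__` and the `[[gnu::...]]` spelling. It assumes GCC or Clang; the function names and the small table are hypothetical and unrelated to WLED's code.

```cpp
#include <cstdint>

// `inline` alone is only a hint: the compiler may or may not inline it.
inline uint16_t flipHint(uint16_t i) { return i ^ 1u; }

// always_inline is a directive: GCC/Clang must inline every call,
// whether or not that is a win for speed or code size.
__attribute__((always_inline)) inline uint16_t flipForced(uint16_t i) {
  return i ^ 1u;
}

// const: the result depends solely on the arguments and no memory is read,
// so repeated calls with the same arguments can be merged.
[[gnu::const]] uint8_t scaleConst(uint8_t v, uint8_t s) {
  return static_cast<uint8_t>((static_cast<uint16_t>(v) * s) >> 8);
}

// pure: no side effects, but the function may read global state,
// so calls can be merged only while that state is unchanged.
static uint16_t mapTable[4] = {3, 2, 1, 0};
[[gnu::pure]] uint16_t lookupPure(uint16_t i) { return mapTable[i & 3u]; }
```

Marking a function `const` or `pure` when it is not (for example, when it writes a global or depends on a hardware register) lets the compiler drop calls it should keep, which is why the double-check mentioned above matters.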

// convert logical address to physical
if (index < customMappingSize
&& (realtimeMode == REALTIME_MODE_INACTIVE || realtimeRespectLedMaps)) index = customMappingTable[index];
11 changes: 5 additions & 6 deletions wled00/cfg.cpp
@@ -436,13 +436,12 @@ bool deserializeConfig(JsonObject doc, bool fromFS) {
else gammaCorrectBri = false;
if (light_gc_col > 1.0f) gammaCorrectCol = true;
else gammaCorrectCol = false;
if (gammaCorrectVal > 1.0f && gammaCorrectVal <= 3) {
if (gammaCorrectVal != 2.8f) NeoGammaWLEDMethod::calcGammaTable(gammaCorrectVal);
} else {
gammaCorrectVal = 1.0f; // no gamma correction
gammaCorrectBri = false;
gammaCorrectCol = false;
if (gammaCorrectVal <= 1.0f || gammaCorrectVal > 3) {
gammaCorrectVal = 1.0f; // no gamma correction
gammaCorrectBri = false;
gammaCorrectCol = false;
}
NeoGammaWLEDMethod::calcGammaTable(gammaCorrectVal); // fill look-up table

JsonObject light_tr = light["tr"];
CJSON(fadeTransition, light_tr["mode"]);
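
With the static table dropped (commit 6a37f25), the look-up table now has to be filled on every config load, including the "no gamma correction" case of 1.0. A minimal sketch of what such a fill could look like follows. It is an assumption-laden illustration: `makeGammaTable` is a hypothetical free function, and the rounding formula is a common one rather than a confirmed copy of `NeoGammaWLEDMethod::calcGammaTable()`.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

// Hypothetical helper: build a 256-entry gamma look-up table.
// gamma == 1.0 yields the identity mapping, so the table is always usable
// and callers do not need a special case for "gamma correction off".
std::array<uint8_t, 256> makeGammaTable(float gamma) {
  std::array<uint8_t, 256> table{};
  for (int i = 0; i < 256; i++) {
    // normalise to 0..1, apply the power curve, scale back and round
    table[i] = static_cast<uint8_t>(std::pow(i / 255.0f, gamma) * 255.0f + 0.5f);
  }
  return table;
}
```

The cost is 256 `pow` evaluations at load time in exchange for not keeping a pre-computed 2.8 table in flash.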