Optimization: color_blend() variable range now determined by overloading #4245
base: 0_15
…ding Removing the bool saves on code size and makes the function a tiny bit faster. Also this is a cleaner solution IMHO. Tested the affected FX, did not notice any change.
- previous version did not return C2 when blend=255
- inlining 0 and max check may be faster in some cases and uses just 40 extra bytes.
- 8bit and 16bit versions both call the 16bit base function
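A minimal sketch of the overloading idea from the commit notes above, not the code from this PR. The names `blend_colors` and `blend_base16`, the plain weighted-average math in the base function, and the `blend * 257` scaling from 8bit to 16bit are all assumptions made for illustration. It shows how the type of the blend argument can select the range, and how the inlined 0 and max checks guarantee that the end points return C1 and C2 exactly, whatever rounding the base function does.

```cpp
#include <cstdint>

// Hypothetical shared base: blends four 8-bit channels with a 16-bit weight.
// Assumption: simple weighted average, not the project's actual math.
static uint32_t blend_base16(uint32_t c1, uint32_t c2, uint16_t blend) {
  uint32_t out = 0;
  for (int shift = 0; shift < 32; shift += 8) {
    uint32_t a = (c1 >> shift) & 0xFF;
    uint32_t b = (c2 >> shift) & 0xFF;
    uint32_t mixed = (a * (0xFFFFu - blend) + b * blend) >> 16;  // always <= 255
    out |= mixed << shift;
  }
  return out;
}

// 16-bit overload: blend range 0..65535, end points handled inline.
inline uint32_t blend_colors(uint32_t c1, uint32_t c2, uint16_t blend) {
  if (blend == 0) return c1;
  if (blend == 0xFFFF) return c2;
  return blend_base16(c1, c2, blend);
}

// 8-bit overload: blend range 0..255, scaled to 16 bit (assumed: * 257).
inline uint32_t blend_colors(uint32_t c1, uint32_t c2, uint8_t blend) {
  if (blend == 0) return c1;
  if (blend == 0xFF) return c2;
  return blend_base16(c1, c2, uint16_t(blend * 257u));
}
```

Callers need a typed blend argument, for example `blend_colors(a, b, uint8_t(128))`, because a plain `int` literal would be ambiguous between the two overloads.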
@DedeHai I've played a bit with the color_blend function, and I found some improvements converting 8bit blend to 16bit:
To use the full range, we could use

improved blend accuracy

The blend logic seems to be based on some very old FastLED code. FastLED has changed their

For the 8bit case, we can use the same logic. I've tested it, and it is indeed more accurate than the code in 0_15:
uint8_t r3 = (((r1 << 8) | r2) + (r2 * blend) - (r1 * blend)) >> 8;

For 16bit blend, it might be trickier, because
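The 8bit formula quoted in this comment can be checked in isolation. The sketch below wraps it in a hypothetical helper, `blend_channel8` (the name and the explicit 32-bit intermediate are assumptions, not code from this PR). Algebraically the expression equals `(c1 * (256 - blend) + c2 * (1 + blend)) >> 8`, so blend=0 yields exactly `c1` and blend=255 yields exactly `c2` without a separate range check.

```cpp
#include <cstdint>

// Hypothetical single-channel helper built from the formula in the comment.
inline uint8_t blend_channel8(uint8_t c1, uint8_t c2, uint8_t blend) {
  uint32_t partial = (uint32_t(c1) << 8) | c2;  // c1 * 256 + c2
  partial += uint32_t(c2) * blend;              // weight toward c2
  partial -= uint32_t(c1) * blend;              // never underflows: c1 * 256 >= c1 * blend
  return uint8_t(partial >> 8);                 // partial <= 65535, so the result fits in 8 bits
}
```

For example, `blend_channel8(10, 200, 0)` gives 10 and `blend_channel8(10, 200, 255)` gives 200.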
This was fixed in the latest commit by adding a check in the inline function. I need to take a closer look at the optimization; maybe this new version can be done the way I did with
- efficient color blend calculation in few operations possible
- omitting min / max checks makes it faster on average
- using 8bit for the "blend" variable does not significantly influence the resulting color; the transition points are slightly shifted, but the results are very good (and better than the original 16bit version, which used the old FastLED math with improper rounding)
- updated drawCircle and drawLine to use 8bit directly instead of 16bit with a shift
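One way a blend "in few operations" can work on a packed 32-bit color is to apply the FastLED-style 8bit formula to two channels at once, with each channel sitting in its own 16-bit lane. This is a sketch under that assumption, not necessarily the code in this PR; `blend_packed8` and the mask name are hypothetical. No min / max check is needed because the formula already returns the exact end colors at blend=0 and blend=255.

```cpp
#include <cstdint>

// Hypothetical two-channels-at-a-time blend of packed 4x8-bit colors.
inline uint32_t blend_packed8(uint32_t color1, uint32_t color2, uint8_t blend) {
  const uint32_t TWO_CHANNEL_MASK = 0x00FF00FFu;
  const uint32_t b = blend;  // widen once so all math is unsigned 32 bit

  // Split each color into two words that each hold two channels in 16-bit lanes.
  uint32_t lo1 = color1 & TWO_CHANNEL_MASK;         // bytes 0 and 2
  uint32_t hi1 = (color1 >> 8) & TWO_CHANNEL_MASK;  // bytes 1 and 3
  uint32_t lo2 = color2 & TWO_CHANNEL_MASK;
  uint32_t hi2 = (color2 >> 8) & TWO_CHANNEL_MASK;

  // Per lane: (c1 * 256 + c2) + c2 * blend - c1 * blend, a value in 0..65535.
  // Intermediate carries between lanes cancel out in modular arithmetic.
  uint32_t lo3 = (((lo1 << 8) | lo2) + lo2 * b - lo1 * b) >> 8;
  uint32_t hi3 = ((hi1 << 8) | hi2) + hi2 * b - hi1 * b;

  // Keep the high byte of every lane and put it back in its original position.
  return (lo3 & TWO_CHANNEL_MASK) | (hi3 & ~TWO_CHANNEL_MASK);
}
```

The whole blend costs four multiplications for four channels, and the end points come out exact: `blend_packed8(a, b, 0)` returns `a` and `blend_packed8(a, b, 255)` returns `b`.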
I made some further optimizations based on the new FastLED version you proposed. In my tests, your suggested variant was a tiny bit faster when testing it on the ripple FX; my updated version uses less code, so I am guessing the compiler did some inlining to speed up
Are we good to merge this once the conflict is resolved?
There are some pending changes. Edit: sorry, I think I confused this with #4256
I'm not sure I need to do anything regarding this PR.
Removing the bool saves on code size and makes the function a tiny bit faster. Also this is a cleaner solution IMHO.
Tested the affected FX, did not notice any change.