Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved FPS calculation #4250

Merged
merged 3 commits into from
Nov 5, 2024
Merged

Conversation

DedeHai
Copy link
Collaborator

@DedeHai DedeHai commented Nov 4, 2024

This improvement is for dev purposes mainly but it also gives a bit smoother FPS return values in normal builds.

Changes:

  • fixed point calculation for improved accuracy, 9.7 bit, so max FPS can be 511
    - dithering (in debug builds only)
  • averaging and optional multiplier can be set as compile flags

Example for speed testing with long averaging and a 10x multiplier:

-D FPS_CALC_AVG=200
-D FPS_MULTIPLIER=10

Since the calculation resolution is limited (9.7bit fixed point) FPS_CALC_AVG values larger than 200 can hit resolution limit and get stuck before reaching the final value.

If WLED_DEBUG is defined, dithering is added to the returned value so sub-frame accuracy is possible in post-processing without enabling the multiplier.

Fixed point calculation for improved accuracy, dithering in debug builds only.
Averaging and optional multiplier can be set as compile flags, example for speed testing with long averaging and a 10x multiplier:

-D FPS_CALC_AVG=200
-D FPS_MULTIPLIER=10

The calculation resolution is limited (9.7bit fixed point) so values larger than 200 can hit resolution limit and get stuck before reaching the final value.

If WLED_DEBUG is defined, dithering is added to the returned value so sub-frame accuracy is possible in post-processingwithout enabling the multiplier.
@softhack007
Copy link
Collaborator

softhack007 commented Nov 4, 2024

@DedeHai looks very good for me, thanks. Two questions

  • the max possible value of 200FPS is due to uint16_t _cumulativeFps, right? so in case i wanted more, changing _cumulativeFps to uint32_t would be enough?
  • to use a higher resolution time (like ESP.getCycleCount()) is just a matter of scaling diff and fpsCurr appropriately? or would it be better to also increase fixed-point resolution in that case?

@DedeHai
Copy link
Collaborator Author

DedeHai commented Nov 4, 2024

  • the problem is resolution, I don't remember exactly which part causes the problem, but it is a standard problem for fixed point low pass filters. You do not want to crank up the number past 200 anyway, the time constant is roughly 6 times larger, so at 200 with 100FPS it takes over 10 seconds for the value to reach the targetvalue. 400 may still work, did not test. I slightly changed the code I used before for this PR, for me, I used 8.8 math but I thought 255FPS maximum may be exceeded in the future
  • if you want more smoothing then yes, you need to increase it to 32bit AND (more importantly) increase the bit shift (and adjust what "0.5" is with the new shift). I really don't think its necessary though, if your FPS are so unstable that an averaging over like 10 seconds does not give you accurate numbers, you may have a different issue there...
  • the time resolution is irrelevant, FPS are always in seconds ;) but yes, just need to adapt some values to match. The filter I implemented here (or I must say extended, it was the same before) is universal: value in = value out, as long as there are no overflows.

edit:
I realize now that you wrote 200FPS max. the limit is 511FPS. If you need more, you can go 32bit or reduce the bit shift to 6. The 200 limit mentioned is for FPS_CALC_AVG and its not a hard limit, nothing breaks really, just the filter will not reach the final value.

@softhack007
Copy link
Collaborator

edit:
I realize now that you wrote 200FPS max. the limit is 511FPS.

Thanks for your explanations. I think 511 FPS is future-safe for a long time ;-)

bitshift was still set from testing, forgot to update
@DedeHai
Copy link
Collaborator Author

DedeHai commented Nov 4, 2024

just for reference: I use this to check FPS in my tests https://pypi.org/project/wled2graph/

#ifdef WLED_DEBUG
return (FPS_MULTIPLIER * (_cumulativeFps + (random16() & 63))) >> 7; // + random("0.5") for dithering
#else
return (FPS_MULTIPLIER * _cumulativeFps) >> 7; // _cumulativeFps is stored in fixed point 9.7 bit
Copy link
Collaborator

@softhack007 softhack007 Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe replace all places with "7" by a constant (#define) ?

Its better for maintainability to leave a hint what the magic number "7" stands for - in this case the fixed point fraction bits.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

dithering is not really needed, the FPS_MULTIPLIER is a much better option.
Copy link
Collaborator

@softhack007 softhack007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍 I approve

Actually the old FPS code was always "broken" in a sense that it stops converging to real FPS when the delta is below 2. So we could see this as a bugfix that would fit into 0.15.

@DedeHai DedeHai merged commit 271a07a into Aircoookie:0_15 Nov 5, 2024
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants