-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Opacity peephole optimization #29775
Opacity peephole optimization #29775
Conversation
It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat. If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix? Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing. |
Things TBD. I think moving the overridden methods to just bool fields on Layer/ContainerLayer would be far simpler and eliminate the need for the new interposing subclass. The DL needs to check more conditions than "op_count == 1". (Actually <2 would have been a looser naive test, but it was not meant to be a final implementation.) More layer types need to be tagged as appropriate. |
If we don't already have one that will show the benefits, could you check a benchmark or benchmarks into the framework repo first? Then we can track improvements and catch regressions more easily. Hopefully if we don't already have one, the example code above isn't too hard to turn into a framework benchmark. |
flow/layers/display_list_layer.h
Outdated
@@ -36,6 +36,10 @@ class DisplayListLayer : public Layer { | |||
|
|||
void Paint(PaintContext& context) const override; | |||
|
|||
bool layer_can_accept_opacity() override { | |||
return display_list()->op_count() == 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same TODO here?
I think we can probably do even better than this now though without an RTree if we distinguished between ops that actually draw anything or not (e.g. having a translate and a drawTextBlob should be ok).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, if we just count the number of SkPaint
objects in the display list I think that would work, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(and maybe verify the SkPaint doesn't have something weird on it like an image filter)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are no SkPaint objects in the display list and I'm not sure how counting them would matter.
I mentioned that we need more conditions to be tested. I already know what conditions to look for, I just wanted a proof of concept to demonstrate how the performance would be impacted on a simple example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new tests are more sophisticated, but mainly for rejecting cases that shouldn't be optimized. It can, for example, optimize a DL that is one giant saveLayer around a bunch of other rendering - since the saveLayer can intercept the alpha. But still not doing the non-overlapping test (even the simple one used in PrerollChildren) until the ops start recording their own bounds on a per-op level.
2bb61f7
to
7f046a5
Compare
This PR should address flutter/flutter#75697 |
Removing WIP label and prefix as the next commit should be ready for review and approval. |
49d9a87
to
8118fa2
Compare
It looks like this is ready for final reviews. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with some nits about commenting on numeric tolerances.
So good. |
I missed this one before, but there was also a good improvement to layer cache memory in the new gallery: |
To be clear, the commit highlighted by that link is a regression, but you can see the improvement from this PR if you look back at the history (the big drop towards the left of the graph). |
Yeah, that's the one I meant. SkiaPerf is a bit tricky to use correctly... |
@@ -1067,13 +1079,17 @@ sk_sp<DisplayList> DisplayListBuilder::Build() { | |||
used_ = allocated_ = op_count_ = 0; | |||
nested_bytes_ = nested_op_count_ = 0; | |||
storage_.realloc(bytes); | |||
bool compatible = layer_stack_.back().is_group_opacity_compatible(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@flar Hey, Could I ask you a question? How do i understand the compatible
? and another method add_compatible_op()
. Thank you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These methods (and variable names) are specific to group opacity compatibility. It's the only peephole optimization we make currently, but more should come.
These optimizations are cases where we can elide a saveLayer because the modification that would have been applied to the rendering inside the layer can instead be applied to each operation individually without any changes to the result. It is common to want to fade an entire UI component by surrounding it with a "saveLayer with alpha". But if the component is a single rendering operation or a few non-overlapping operations, then we can instead apply the alpha to each rendering operation. The code that deals with the group opacity compatibility is looking for such "single or non-overlapping rendering operations that can take an alpha to reduce their opacity" cases.
Fixes flutter/flutter#75697
This is a prototype to test the feasibility of doing peephole optimizations within the engine layers to eliminate unnecessary saveLayers or related caching.
The example below paints a simple child with an animated opacity. The existing code will attempt to cache the child on every frame and apply the opacity as a drawImage-with-alpha operation on it. While this saves a lot of time over using a saveLayer, it actually takes slightly longer than having this particular child just apply the alpha itself. Even more importantly, the caching-to-avoid-a-saveLayer mechanism uses 8mb of layer cache memory to achieve this result.
With the proof of concept changes in this PR the cache memory used falls to 0 and the rendering of the child is actually slightly faster (6.5ms vs 7ms) on an average FPS basis as measured with the Performance Overlay.
Click to expand!