Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Opacity peephole optimization #29775

Merged
merged 13 commits into from
Dec 15, 2021

Conversation

flar
Copy link
Contributor

@flar flar commented Nov 17, 2021

Fixes flutter/flutter#75697

This is a prototype to test the feasibility of doing peephole optimizations within the engine layers to eliminate unnecessary saveLayers or related caching.

The example below paints a simple child with an animated opacity. The existing code will attempt to cache the child on every frame and apply the opacity as a drawImage-with-alpha operation on it. While this saves a lot of time over using a saveLayer, it actually takes slightly longer than having this particular child just apply the alpha itself. Even more importantly, the caching-to-avoid-a-saveLayer mechanism uses 8mb of layer cache memory to achieve this result.

With the proof of concept changes in this PR the cache memory used falls to 0 and the rendering of the child is actually slightly faster (6.5ms vs 7ms) on an average FPS basis as measured with the Performance Overlay.

Click to expand!
import 'package:flutter/material.dart';

void main() {
  runApp(const MyApp());
}

class MyApp extends StatelessWidget {
  const MyApp({Key? key}) : super(key: key);

  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'Flutter Demo',
      theme: ThemeData(
        primarySwatch: Colors.blue,
      ),
      home: const MyHomePage(title: 'Flutter Demo Home Page'),
    );
  }
}

class MyHomePage extends StatefulWidget {
  const MyHomePage({Key? key, required this.title}) : super(key: key);

  final String title;

  @override
  State<MyHomePage> createState() => _MyHomePageState();
}

class _MyHomePageState extends State<MyHomePage> with SingleTickerProviderStateMixin {
  late AnimationController _controller;

  double get _opacity => _controller.value * 0.5 + 0.25;
  int get _r => (_controller.value * 255).round();
  int get _g => 255 - _r;
  int get _b => 0;

  @override
  void initState() {
    super.initState();

    _controller = AnimationController(vsync: this, duration: const Duration(seconds: 4));
    _controller.repeat(reverse: true);
  }

  @override
  void dispose() {
    _controller.dispose();

    super.dispose();
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        title: Text(widget.title),
      ),
      body: Center(
        child: Column(
          mainAxisAlignment: MainAxisAlignment.center,
          children: <Widget>[
            AnimatedBuilder(
              animation: _controller,
              builder: (context, child) => Opacity(
                opacity: _opacity,
                child: Container(width: 300, height: 400, color: Color.fromARGB(255, _r, _g, _b)),
              ),
            )
          ],
        ),
      ),
    );
  }
}

@flutter-dashboard
Copy link

It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat.

If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix?

Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing.

@flar
Copy link
Contributor Author

flar commented Nov 17, 2021

@flar
Copy link
Contributor Author

flar commented Nov 17, 2021

Things TBD.

I think moving the overridden methods to just bool fields on Layer/ContainerLayer would be far simpler and eliminate the need for the new interposing subclass.

The DL needs to check more conditions than "op_count == 1". (Actually <2 would have been a looser naive test, but it was not meant to be a final implementation.)

More layer types need to be tagged as appropriate.

@zanderso
Copy link
Member

If we don't already have one that will show the benefits, could you check a benchmark or benchmarks into the framework repo first? Then we can track improvements and catch regressions more easily. Hopefully if we don't already have one, the example code above isn't too hard to turn into a framework benchmark.

@@ -36,6 +36,10 @@ class DisplayListLayer : public Layer {

void Paint(PaintContext& context) const override;

bool layer_can_accept_opacity() override {
return display_list()->op_count() == 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same TODO here?

I think we can probably do even better than this now though without an RTree if we distinguished between ops that actually draw anything or not (e.g. having a translate and a drawTextBlob should be ok).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, if we just count the number of SkPaint objects in the display list I think that would work, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and maybe verify the SkPaint doesn't have something weird on it like an image filter)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no SkPaint objects in the display list and I'm not sure how counting them would matter.

I mentioned that we need more conditions to be tested. I already know what conditions to look for, I just wanted a proof of concept to demonstrate how the performance would be impacted on a simple example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new tests are more sophisticated, but mainly for rejecting cases that shouldn't be optimized. It can, for example, optimize a DL that is one giant saveLayer around a bunch of other rendering - since the saveLayer can intercept the alpha. But still not doing the non-overlapping test (even the simple one used in PrerollChildren) until the ops start recording their own bounds on a per-op level.

@flar flar added the Work in progress (WIP) Not ready (yet) for review! label Nov 22, 2021
@flar flar changed the title Opacity peephole optimization WIP: Opacity peephole optimization Nov 22, 2021
@flutter-dashboard flutter-dashboard bot added embedder Related to the embedder API platform-android labels Nov 22, 2021
@flar
Copy link
Contributor Author

flar commented Nov 22, 2021

This PR should address flutter/flutter#75697

@gw280 gw280 self-requested a review December 8, 2021 23:28
@flar flar removed the Work in progress (WIP) Not ready (yet) for review! label Dec 14, 2021
@flar flar changed the title WIP: Opacity peephole optimization Opacity peephole optimization Dec 14, 2021
@flar
Copy link
Contributor Author

flar commented Dec 14, 2021

Removing WIP label and prefix as the next commit should be ready for review and approval.

@flar flar force-pushed the opacity-peephole-optimization branch from 49d9a87 to 8118fa2 Compare December 14, 2021 00:29
@flar flar requested a review from dnfield December 14, 2021 21:48
@flar
Copy link
Contributor Author

flar commented Dec 14, 2021

It looks like this is ready for final reviews.

Copy link
Contributor

@dnfield dnfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some nits about commenting on numeric tolerances.

flow/display_list.cc Show resolved Hide resolved
flow/display_list_canvas_unittests.cc Show resolved Hide resolved
flow/display_list_canvas_unittests.cc Show resolved Hide resolved
@flar flar added the waiting for tree to go green This PR is approved and tested, but waiting for the tree to be green to land. label Dec 15, 2021
@fluttergithubbot fluttergithubbot merged commit e444009 into flutter:main Dec 15, 2021
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Dec 15, 2021
@dnfield
Copy link
Contributor

dnfield commented Dec 16, 2021

So good.

@zanderso
Copy link
Member

I missed this one before, but there was also a good improvement to layer cache memory in the new gallery:

https://flutter-flutter-perf.skia.org/e/?begin=1638397144&end=1644939237&keys=X41029fd37de71baf2de789a3414c35f0&requestType=0&xbaroffset=27478

@flar
Copy link
Contributor Author

flar commented Feb 15, 2022

I missed this one before, but there was also a good improvement to layer cache memory in the new gallery:

https://flutter-flutter-perf.skia.org/e/?begin=1638397144&end=1644939237&keys=X41029fd37de71baf2de789a3414c35f0&requestType=0&xbaroffset=27478

To be clear, the commit highlighted by that link is a regression, but you can see the improvement from this PR if you look back at the history (the big drop towards the left of the graph).

@zanderso
Copy link
Member

To be clear, the commit highlighted by that link is a regression, but you can see the improvement from this PR if you look back at the history (the big drop towards the left of the graph).

Yeah, that's the one I meant. SkiaPerf is a bit tricky to use correctly...

@@ -1067,13 +1079,17 @@ sk_sp<DisplayList> DisplayListBuilder::Build() {
used_ = allocated_ = op_count_ = 0;
nested_bytes_ = nested_op_count_ = 0;
storage_.realloc(bytes);
bool compatible = layer_stack_.back().is_group_opacity_compatible();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@flar Hey, Could I ask you a question? How do i understand the compatible? and another method add_compatible_op(). Thank you

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These methods (and variable names) are specific to group opacity compatibility. It's the only peephole optimization we make currently, but more should come.

These optimizations are cases where we can elide a saveLayer because the modification that would have been applied to the rendering inside the layer can instead be applied to each operation individually without any changes to the result. It is common to want to fade an entire UI component by surrounding it with a "saveLayer with alpha". But if the component is a single rendering operation or a few non-overlapping operations, then we can instead apply the alpha to each rendering operation. The code that deals with the group opacity compatibility is looking for such "single or non-overlapping rendering operations that can take an alpha to reduce their opacity" cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes embedder Related to the embedder API needs tests platform-android platform-ios waiting for tree to go green This PR is approved and tested, but waiting for the tree to be green to land.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OpacityLayer causes sharp increases in memory usage in trivial apps
5 participants