Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optimizations for dead code removal #1444

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

bbannier
Copy link
Contributor

@bbannier bbannier commented May 9, 2023

This PR adds optimizations related to dead code removal. For that we add detection of whether a function is side effect-free and elimination of dead stores. Marking a store as dead allows us to completely remove the function call then.

For void pure function we also add a pass which simplifies their bodies.

The dead code optimizations around pure functions shouldn't be observable at runtime, but users might be able to see them by e.g., looking at the profiling output.

@bbannier bbannier self-assigned this May 9, 2023
@bbannier bbannier force-pushed the topic/bbannier/dead_stores branch 3 times, most recently from b93a7b9 to 4f33757 Compare May 10, 2023 13:09
@bbannier bbannier changed the title Remove dead stores Add optimizations for dead code removal May 11, 2023
@bbannier bbannier force-pushed the topic/bbannier/dead_stores branch 2 times, most recently from 9b0a754 to 1ab2f25 Compare May 23, 2023 14:09
@bbannier bbannier marked this pull request as ready for review May 23, 2023 15:13
@bbannier bbannier requested a review from rsmmr May 23, 2023 15:13
Copy link
Member

@rsmmr rsmmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good overall. I can't quite judge all the cases we might run into, but I guess that's what the test suite is for: ensuring we don't remove too much. It seems the logic is a bit more pessimistic at some places than it'd need to be, but that can be extended later. In particular, we could extend the pure notion to more expressions: e.g., an access to a global can be safe if it's read-only (but not sure how much that would buy us in the end).

Have you done any performance comparisons on an actual analyzer yet (and/or an analysis on what/how much actually gets removed?)

hilti/toolchain/src/compiler/optimizer.cc Outdated Show resolved Hide resolved
hilti/toolchain/src/compiler/optimizer.cc Outdated Show resolved Hide resolved
hilti/toolchain/src/compiler/optimizer.cc Show resolved Hide resolved
hilti/toolchain/src/compiler/optimizer.cc Show resolved Hide resolved
hilti/toolchain/src/compiler/optimizer.cc Outdated Show resolved Hide resolved
hilti/toolchain/src/compiler/optimizer.cc Show resolved Hide resolved
hilti/toolchain/src/compiler/optimizer.cc Outdated Show resolved Hide resolved
hilti/toolchain/src/compiler/optimizer.cc Show resolved Hide resolved
Copy link
Member

@rsmmr rsmmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good overall. I can't quite judge all the cases we might run into, but I guess that's what the test suite is for: ensuring we don't remove too much. It seems the logic is a bit more pessimistic at some places than it'd need to be, but that can be extended later. In particular, we could extend the pure notion to more expressions: e.g., an access to a global can be safe if it's read-only (but not sure how much that would buy us in the end).

Have you done any performance comparisons on an actual analyzer yet (and/or an analysis on what/how much actually gets removed?)

@bbannier bbannier force-pushed the topic/bbannier/dead_stores branch 2 times, most recently from a4be6ee to e9ee59e Compare May 25, 2023 11:43
@bbannier bbannier linked an issue Jun 12, 2023 that may be closed by this pull request
@bbannier
Copy link
Contributor Author

Have you done any performance comparisons on an actual analyzer yet (and/or an analysis on what/how much actually gets removed?)

I benchmarked this with a big internal parser and the optimizations here make no measurable difference. I suspect that this is due to that parser not containing a lot local, dead code. Doing more global dead code removal based on data and control flow analysis might be able to remove more code in that code base. That being said, the changes here should still be able to remove unneeded code if it is actually present.

With that maybe the question whether to merge this or not depends whether the added complexity is too much for us. WDYT?

@bbannier bbannier requested a review from rsmmr January 23, 2024 22:25
@rsmmr
Copy link
Member

rsmmr commented Jan 24, 2024

I benchmarked this with a big internal parser and the optimizations here make no measurable difference.

Can you tell how much code is actually removed (independent of performance)?

With that maybe the question whether to merge this or not depends whether the added complexity is too much for us. WDYT?

I'm leaning towards not merging at this point if there's no immediate benefit. The optimizer is getting pretty complex and I think it'd be a good time to revisit our approach to doing these things.

@bbannier
Copy link
Contributor Author

Can you tell how much code is actually removed (independent of performance)?

From looking at the generated C++ code it looks like pass removed almost no actual user code in this case. There are a lot (thousands) of removals of code like

__location__("foo.spicy:131:2");
(void());

Such (void()) code comes from the optimizer removing other code (e.g., feature-dependent code which we optimize aggressively). Even though the code got removed we still emit location updates (which thankfully do not seem to have a measurable performance impact).

I also saw a few instances of temporaries we emit being removed

::hilti::rt::stream::SafeConstIterator __parse_lahe;
...
::hilti::rt::stream::SafeConstIterator __parse_lahe_2;

Overall it looks like for this code base this pass has no practical relevance.

I'm leaning towards not merging at this point if there's no immediate benefit.

I'll keep this branch around since should we at some point use control and data flow information for optimizations we will probably still need annotations for pure functions (at least if their bodies are invisible to our optimizer in C++ library code).

@bbannier bbannier marked this pull request as draft January 24, 2024 13:25
@rsmmr
Copy link
Member

rsmmr commented Jan 24, 2024

__location__("foo.spicy:131:2");
(void());

We could add a narrow, pattern-based optimization to the C++ code generator to skip these directly. Same for default<void>();; or, even better, the whole chunk of these: ```

(*self).__error = __error;
default<void>();
__error = (*self).__error;

Won't have an impact on performance, but makes the code a bit prettier.

@bbannier
Copy link
Contributor Author

We could add a narrow, pattern-based optimization to the C++ code generator to skip these directly.

I was hoping to build something on top of your AST rewrite to implement true AST node removal instead of just emptying them out like now.

I am not really a fan of doing one-off fixes in the code generator since this seems fragile and also pretty hard (e.g., these roundtripping assignments between self.__error and __error).

@rsmmr rsmmr removed their request for review January 29, 2024 09:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add optimization removing unused temporaries
2 participants