inference: make throw block deoptimization concrete-eval friendly #49235

Merged: aviatesk merged 2 commits into master from avi/throw-block-effects on Sep 26, 2023

Member

@aviatesk aviatesk commented Apr 3, 2023

The `throw` block deoptimization can sometimes destroy the effects analysis and disable [semi-]concrete evaluation that would otherwise be possible. This is because the deoptimization was designed with type-domain profitability in mind (#35982) and has not adequately considered the effects domain.

This commit makes the deoptimization more aware of the effects domain and enables the `throw` block deoptimization only when the effects are already known to be ineligible for concrete evaluation.

In our current effect system, `ALWAYS_FALSE`/`false` means that the effect cannot be refined to `ALWAYS_TRUE`/`true` anymore (unless a user annotation is given later). Therefore we can enable the `throw` block deoptimization without hindering the chance of concrete evaluation when any of the following conditions is met:

  • `effects.consistent === ALWAYS_FALSE`
  • `effects.effect_free === ALWAYS_FALSE`
  • `effects.terminates === false`
  • `effects.nonoverlayed === false`
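
The four conditions above amount to a single predicate over the effect bits. Below is a minimal sketch of that rule, written as a Python model rather than as the Julia compiler's code. The `Effects` dataclass, the `Tristate` enum (including its `CONDITIONAL` member), and the function name `can_unoptimize_throw_blocks` are all hypothetical names introduced for illustration. Only the four field names and the `ALWAYS_TRUE`/`ALWAYS_FALSE` values come from the description.

```python
from dataclasses import dataclass
from enum import Enum


class Tristate(Enum):
    """Simplified stand-in for an effect bit that is not a plain boolean."""
    ALWAYS_TRUE = "always_true"
    ALWAYS_FALSE = "always_false"
    CONDITIONAL = "conditional"  # hypothetical: may still be refined later


@dataclass(frozen=True)
class Effects:
    """Hypothetical model of the four effect bits named in the list above."""
    consistent: Tristate
    effect_free: Tristate
    terminates: bool
    nonoverlayed: bool


def can_unoptimize_throw_blocks(effects: Effects) -> bool:
    """True when at least one bit is already pinned to its 'false' value,
    so deoptimizing cannot cost a chance of concrete evaluation."""
    return (
        effects.consistent is Tristate.ALWAYS_FALSE
        or effects.effect_free is Tristate.ALWAYS_FALSE
        or effects.terminates is False
        or effects.nonoverlayed is False
    )


foldable = Effects(Tristate.ALWAYS_TRUE, Tristate.ALWAYS_TRUE, True, True)
tainted = Effects(Tristate.ALWAYS_TRUE, Tristate.ALWAYS_FALSE, True, True)
undecided = Effects(Tristate.CONDITIONAL, Tristate.ALWAYS_TRUE, True, True)

print(can_unoptimize_throw_blocks(foldable))   # False
print(can_unoptimize_throw_blocks(tainted))    # True
print(can_unoptimize_throw_blocks(undecided))  # False
```

In this model the `undecided` case keeps the deoptimization off, because a bit that is not yet `ALWAYS_FALSE` might still be refined and the call might still become eligible for concrete evaluation.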

Here are some numbers:

| Metric | master | this commit | #35982 reverted (set `unoptimize_throw_blocks=false`) |
|--------|--------|-------------|--------------------------------------------|
| Base (seconds) | 15.579300 | 15.206645 | 15.296319 |
| Stdlibs (seconds) | 17.919013 | 17.667094 | 17.738128 |
| Total (seconds) | 33.499279 | 32.874737 | 33.035448 |
| Precompilation (seconds) | 49.967516 | 49.421121 | 49.999998 |
| First time `plot(rand(10,3))` [^1] | `2.476678 seconds (11.74 M allocations)` | `2.430355 seconds (11.77 M allocations)` | `2.514874 seconds (11.64 M allocations)` |
| First time `solve(prob, QNDF())(5.0)` [^2] | `4.469492 seconds (15.32 M allocations)` | `4.499217 seconds (15.41 M allocations)` | `4.470772 seconds (15.38 M allocations)` |

These numbers made me question whether we are getting any actual benefit from
the `throw` block deoptimization anymore. Since it is sometimes harmful
for the effects analysis, we probably want to either merge this commit
or remove the `throw` block deoptimization completely.

[^1]: With precompilation of Plots.jl disabled.
[^2]: With precompilation of OrdinaryDiffEq disabled.
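
To put the build-time rows in proportion, the relative differences against master can be computed directly from the reported totals. The sketch below only restates the "Total (seconds)" row from the table above. The helper name `relative_change` is introduced here for illustration and is not part of any tooling used in this PR.

```python
# Total build times in seconds, copied from the "Total (seconds)" row above.
totals = {
    "master": 33.499279,
    "this commit": 32.874737,
    "#35982 reverted": 33.035448,
}


def relative_change(value: float, baseline: float) -> float:
    """Signed change of `value` relative to `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


baseline = totals["master"]
changes = {
    name: relative_change(seconds, baseline)
    for name, seconds in totals.items()
    if name != "master"
}

for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}% vs master")
```

With these inputs both alternatives come out under 2% faster than master (about 1.9% for this commit and about 1.4% with #35982 reverted).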

@aviatesk aviatesk requested a review from Keno April 3, 2023 15:10
@aviatesk aviatesk force-pushed the avi/throw-block-effects branch 2 times, most recently from a78f003 to bc32263 Compare April 5, 2023 10:36
@aviatesk
Member Author

aviatesk commented Apr 5, 2023

@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

aviatesk added a commit that referenced this pull request Apr 5, 2023
After experimenting with #49235, I started to question if we are getting
any actual benefit from the `throw` block deoptimization anymore.

This commit removes the deoptimization from the system entirely.

Based on the numbers below, it appears that the deoptimization is not
very profitable in our current Julia-level compilation pipeline,
with the effects analysis playing a significant role in reducing latency.

Here are the updated benchmarks:
| Metric                  | master    | #49235      | this commit |
|-------------------------|-----------|-------------|--------------------------------------------|
| Base (seconds)          | 15.579300 | 15.206645   | 15.42059                                  |
| Stdlibs (seconds)       | 17.919013 | 17.667094   | 17.404586                                  |
| Total (seconds)         | 33.499279 | 32.874737   | 32.826162                                  |
| Precompilation (seconds) | 53.488528 | 53.152028  | 53.152028                                  |
| First time `plot(rand(10,3))` [^1] | `3.432983 seconds (16.55 M allocations)` | `3.477767 seconds (16.45 M allocations)` | `3.539117 seconds (16.43 M allocations)` |
| First time `solve(prob, QNDF())(5.0)` [^2] | `4.628278 seconds (15.74 M allocations)` | `4.609222 seconds (15.32 M allocations)` | `4.547323 seconds (15.19 M allocations: 823.510 MiB)` |

[^1]: `using Plots; plot(rand(10,3))`
[^2]: `using DifferentialEquations; solve(prob, QNDF())(5.0)`
aviatesk added a commit that referenced this pull request Apr 5, 2023
@aviatesk aviatesk force-pushed the avi/throw-block-effects branch from bc32263 to ee45c04 Compare April 9, 2023 03:17
@aviatesk
Member Author

aviatesk commented Apr 9, 2023

@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

aviatesk added a commit that referenced this pull request Sep 20, 2023
@aviatesk aviatesk force-pushed the avi/throw-block-effects branch from ee45c04 to 81287f2 Compare September 20, 2023 09:12
aviatesk added a commit that referenced this pull request Sep 20, 2023
@aviatesk
Member Author

@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier
Collaborator

Your job failed.

@aviatesk
Member Author

@nanosoldier runbenchmarks("inference", vs=":master")

aviatesk added a commit that referenced this pull request Sep 20, 2023
@nanosoldier
Collaborator

Your benchmark job has completed - no performance regressions were detected. A full report can be found here.

@aviatesk aviatesk force-pushed the avi/throw-block-effects branch from 81287f2 to 254b268 Compare September 20, 2023 11:38
@aviatesk
Member Author

@nanosoldier runbenchmarks("inference", vs=":master")

@aviatesk
Member Author

Given that this appears to be a clear improvement, I'll go ahead and merge it once the CI checks come back clean.

@nanosoldier
Collaborator

Your benchmark job has completed - no performance regressions were detected. A full report can be found here.

@aviatesk aviatesk force-pushed the avi/throw-block-effects branch 2 times, most recently from e9060cf to e60401b Compare September 21, 2023 06:47
@aviatesk
Member Author

@nanosoldier runbenchmarks("inference", vs=":master")

@aviatesk aviatesk force-pushed the avi/throw-block-effects branch from e60401b to e41b902 Compare September 21, 2023 07:18
@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk
Member Author

@nanosoldier runbenchmarks("inference", vs=":master")

@aviatesk aviatesk force-pushed the avi/throw-block-effects branch from e41b902 to c8a5046 Compare September 21, 2023 08:17
@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

The `throw` block deoptimization can sometimes destroy the effects
analysis and disable [semi-]concrete evaluation that would otherwise be
possible. This is because the deoptimization was designed with
type-domain profitability in mind (#35982) and has not taken the effects
domain into account well.

This commit makes the deoptimization more aware of the effects domain
and enables the `throw` block deoptimization only when the effects
are already known to be ineligible for concrete evaluation.

In our current effect system, `ALWAYS_FALSE`/`false` means that the
effect cannot be refined to `ALWAYS_TRUE`/`true` anymore (unless a
user annotation is given later). Therefore we can enable the `throw` block
deoptimization without hindering the chance of concrete evaluation when
any of the following conditions is met:
- `effects.consistent === ALWAYS_FALSE`
- `effects.effect_free === ALWAYS_FALSE`
- `effects.terminates === false`
- `effects.nonoverlayed === false`

Here are some numbers:

| Metric                  | master    | this commit | #35982 reverted (set `unoptimize_throw_blocks=false`) |
|-------------------------|-----------|-------------|--------------------------------------------|
| Base (seconds)          | 15.579300 | 15.206645   | 15.296319                                  |
| Stdlibs (seconds)       | 17.919013 | 17.667094   | 17.738128                                  |
| Total (seconds)         | 33.499279 | 32.874737   | 33.035448                                  |
| Precompilation (seconds) | 49.967516 | 49.421121  | 49.999998                                  |
| First time `plot(rand(10,3))` [^1] | `2.476678 seconds (11.74 M allocations)` | `2.430355 seconds (11.77 M allocations)` | `2.514874 seconds (11.64 M allocations)` |

[^1]: I got these numbers with all the `@precompile_all_calls` statements in Plots.jl disabled.

These numbers made me question if we are getting any actual benefit from
the `throw` block deoptimization anymore. Since it is sometimes harmful
for the effects analysis, we probably want to either merge this commit
or remove the `throw` block deoptimization completely.
@aviatesk aviatesk force-pushed the avi/throw-block-effects branch from c8a5046 to d77836e Compare September 26, 2023 04:54
@aviatesk
Member Author

@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk aviatesk merged commit 28d9f73 into master Sep 26, 2023
@aviatesk aviatesk deleted the avi/throw-block-effects branch September 26, 2023 06:44
aviatesk added a commit that referenced this pull request Sep 26, 2023
aviatesk added a commit that referenced this pull request Dec 7, 2023
aviatesk added a commit that referenced this pull request Dec 7, 2023
aviatesk added a commit that referenced this pull request Jan 18, 2024
aviatesk added a commit that referenced this pull request Apr 11, 2024
aviatesk added a commit that referenced this pull request Apr 11, 2024
aviatesk added a commit that referenced this pull request Apr 11, 2024
aviatesk added a commit that referenced this pull request Apr 11, 2024
aviatesk added a commit that referenced this pull request Apr 12, 2024
aviatesk added a commit that referenced this pull request Apr 30, 2024
aviatesk added a commit that referenced this pull request May 9, 2024
After experimenting with #49235, I started to question if we are getting
any actual benefit from the `throw` block deoptimization anymore.

This commit removes the deoptimization from the system entirely.

Based on the numbers below, it appears that the deoptimization is not
very profitable in our current Julia-level compilation pipeline,
with the effects analysis playing a significant role in reducing latency.

Here are the updated benchmark:
| Metric                  | master    | #49235      | this commit |
|-------------------------|-----------|-------------|--------------------------------------------|
| Base (seconds)          | 15.579300 | 15.206645   | 15.42059                                  |
| Stdlibs (seconds)       | 17.919013 | 17.667094   | 17.404586                                  |
| Total (seconds)         | 33.499279 | 32.874737   | 32.826162                                  |
| Precompilation (seconds) | 53.488528 | 53.152028  | 53.152028                                  |
| First time `plot(rand(10,3))` [^1] | `3.432983 seconds (16.55 M allocations)` | `3.477767 seconds (16.45 M allocations)` | `3.539117 seconds (16.43 M allocations)` |
| First time `solve(prob, QNDF())(5.0)` [^2] | `4.628278 seconds (15.74 M allocations)` | `4.609222 seconds (15.32 M allocations)` | `4.547323 seconds (15.19 M allocations: 823.510 MiB)` |

[^1]: With precompilation of Plots.jl disabled.
[^2]: With precompilation of OrdinaryDiffEq disabled.
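
The effects that the message above refers to can be inspected from the REPL. The following is a minimal sketch and not part of the commit: `checked_sqrt` is a hypothetical function made up for illustration, and `Base.infer_effects` is an unexported reflection helper, so its printed output may differ between Julia versions.

```julia
# Hypothetical example function (not from the PR): the `throw` call below
# sits in the kind of error-path block that the deoptimization targeted.
function checked_sqrt(x::Float64)
    if x < 0
        throw(DomainError(x, "expected a non-negative input"))
    end
    return sqrt(x)
end

# `Base.infer_effects` is an unexported, internal reflection utility;
# it reports the effects inferred for this call signature.
effects = Base.infer_effects(checked_sqrt, (Float64,))

println("consistent:  ", effects.consistent)
println("effect_free: ", effects.effect_free)
println("nothrow:     ", effects.nothrow)
println("terminates:  ", effects.terminates)
```

Since the function can throw for negative inputs, `nothrow` is expected to be reported as `false`; the remaining fields depend on the compiler version in use.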
aviatesk added a commit that referenced this pull request Jul 4, 2024
aviatesk added a commit that referenced this pull request Jul 5, 2024
topolarity pushed a commit that referenced this pull request Aug 2, 2024