Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: does erasing types of components improve compile times? #2905

Merged
merged 17 commits into from
Oct 8, 2024

Conversation

gbj
Copy link
Collaborator

@gbj gbj commented Sep 1, 2024

Related to #2902. There is a genuine trade-off between compile time and WASM binary size: there seems to be some exponential compile time growth (and occasional recursion limits!) as we build a larger and larger statically-typed view tree, but it does allow the compiler to eliminate unused code much more effectively.

hackernews
cargo build --timings --features=ssr: 19.67s (main) => 6.65s (branch)
WASM size: 544kb (main) => 583kb (branch)

todo_app_sqlite_axum
cargo build --timings --features=ssr: 5.74s (main) => 2.62s (branch)
WASM size: 535kb (main) => 566kb (branch)

@gbj gbj force-pushed the component-erasure branch from 3ed6823 to 5bcab29 Compare September 12, 2024 01:43
@gbj
Copy link
Collaborator Author

gbj commented Sep 14, 2024

It's worth asking whether this should be opt-in or opt-out. (If you look at the commit history here, you can see I waffled.)

My thinking here is this: Ideally, the framework should be able to operate in "no erased types" mode all the time. It produces the best end result for your users, in both binary size (load times, bandwidth use) and runtime performance. It is purely a limitation of compiler performance that this becomes infeasible for some larger apps, with larger UI trees and therefore combinatorically-increasing type trees.

I would rather the default be the best outcome, and be able to tell people "oh, and it might help your dev-mode compile times to opt into the erase-types feature, but just FYI here are the limitations of that" rather than having people say "Wait why is my binary huge?" and have to say "oh did you add the dont_erase_types flag?"

@benwis
Copy link
Contributor

benwis commented Sep 16, 2024

I suppose it’s the question of where’s the pain point. I expect developers to notice long compile times when they first try Leptos, and not binary size/runtime perf. Then when they go to release or deploy you say oh you need the release flag and can optimize size by adding this. I think people expect some hoops at that point.

Making this opt in, i.e slow to compile by default, reinforces a preconception I've heard a bunch that rust on the web has slow iteration times due to long compile times. I imagine more people would simply have their preconception validated and quit if they’re trying it out than go through that friction of asking the question.

I understand the fear of packages escaping with poor performance/bundle size into the wild, but it would be quite easy to add a Warning/Check to cargo leptos build --release saying "Hey, you should enable the don't_erase_types for release to get all these benefits."

@gbj gbj force-pushed the component-erasure branch 2 times, most recently from ffc40bd to 926ae84 Compare September 21, 2024 22:03
@zakstucke
Copy link
Contributor

zakstucke commented Sep 26, 2024

Chiming in here.

TLDR:
This branch still does wonders for compile time compared to current main. I'm confident >=90% of people would choose erasure in debug, no erasure in release and so the default should be variable between release and debug.

It's hard to argue a 50%ish improvement in compile times will ever not be wanted in dev.
Dev compile times are just so important from an actual productivity standpoint, but also the rust stigma @benwis mentioned.

It's also easy to see the opposite is true in release, an e.g. 3->6 minute compile time increase won't matter much to most people infrequently building, at the gain of bundle size and performance, this is where the magic @gbj has brought with the static tree shines.

How could this be done?
I don't think you can change the default feature set between debug and release, see rust-lang/cargo#1197.
So this could be done with 2 features, release_erased and debug_no_erased, so erasure is cfg() flagged based on release/debug unless the corresponding opt out feature is enabled. There might be other ways, but I think this would work off the top of my head.

I've done some concrete comparisons to remove any doubt between main 2bf1f46 and this branch rebased on main for a real project.

Notes:

  • cargo clean is used between each type and attempt, except incremental+clippy tests were run after an initial build, and had a small code edit between changes.
  • These are all in debug mode, where the compile time pressure comes.
  • The argument is particularly strong for ssr incremental, the bottleneck in a hot dev loop, with an over 50% improvement
Variant/Attempt no 1 2 3
Main: ssr no-incremental 2m22s 2m22s 2m19s
Erased: ssr no-incremental 1m24s 1m22s 1m22s
0.6: ssr no-incremental 1m14s 1m14s 1m13s
Main: hydrate/wasm no-incremental 1m18s 1m17s 1m18s
Erased: hydrate/wasm no-incremental 59.15s 1m00s 59.37s
0.6: hydrate/wasm no-incremental 49.01s 49.14s 48.53s
Main: incremental ssr 27.12s 25.34s 24.01s
Erased: incremental ssr 12.85s 12.17s 12.08s
0.6: incremental ssr 10.68s 9.04s 8.96s
Main: incremental wasm 11.66s 11.55s 11.41s
Erased: incremental wasm 7.14s 6.80s 6.83s
0.6: incremental wasm 5.49s 5.35s 5.70s
Main: cargo clippy --all-features 5.56s 5.61s 5.66s
Erased: cargo clippy --all-features 5.33s 5.36s 5.29s
0.6: cargo clippy --all-features 5.67s 5.56s 5.86s

I hope this helps, and thanks again for your work gbj, truly impressive.

@zakstucke
Copy link
Contributor

zakstucke commented Sep 26, 2024

I've just added 0.6 timings for the codebase under test pre-port to 0.7.

Still slightly better, but it's important to show it's a minimal increase from 6 to 7 for dev compiles once erasure added too.

@gbj gbj force-pushed the component-erasure branch from 926ae84 to d345416 Compare September 28, 2024 13:23
@gbj
Copy link
Collaborator Author

gbj commented Sep 28, 2024

Thanks for the numbers @zakstucke

Ben and I were discussing yesterday and I have a proposal: using a custom --cfg flag rather than a feature. cargo-leptos can be adapted to use this cfg by default when building for debug, and not to use it for release. (Users can control it via RUSTFLAGS in any case.)

Benefits of using --cfg:

  • It does not require the user to opt out for release
  • It does not need to be un-set by all ecosystem libraries (if we use a default feature it will need to be deactivated everywhere, lest it be included via Cargo feature unification by being included somewhere)

I've rebased the PR and updated accordingly.

@zakstucke
Copy link
Contributor

zakstucke commented Sep 28, 2024

@gbj sounds great!

I think the crux is cargo leptos/leptos depending on how it's implemented should be optimised for performance when "releasing" and optimised for compile times when "developing", importantly by default to prevent you having to explain to every next person haha.

Your flag solution and having it built into cargo leptos sounds like it covers more edge cases than my features suggestion.

@gbj gbj force-pushed the component-erasure branch from 48e1782 to 130c07d Compare October 2, 2024 23:22
@zakstucke
Copy link
Contributor

Hey @gbj, unfortunately this branch now brings back the dreaded infinite compile -> RAM overflow, this has happened post my benchmarks.

I've tested the current commit history of this PR, the breaking commit is:
support AddAnyAttr on AnyView

@zakstucke
Copy link
Contributor

Interestingly, when I reverse the changes in that commit, I assumed it would be caused by the todo!() replacement on line 317, but it's not, still happened when moved back to todo!().

What actually causes it is the add_any_attr being added to the AnyView struct, when I remove that, it compiles fine again.

Exact changes to get compiling here:
zakstucke@1179cab

Hopefully that makes some sense to you...

@zakstucke
Copy link
Contributor

zakstucke commented Oct 6, 2024

It seems just replacing the contents of the add_any_attr closure with a todo!() works, so it seems to be some compiler infinite recursion with the into_any() call inside that closure.

Something like the compiler gets stuck producing code for:
into_any<T1>() -> add_any_attr<T1>() -> into_any<T2>() -> add_any_attr<T2>() -> into_any<T3>() ...

@gbj
Copy link
Collaborator Author

gbj commented Oct 6, 2024

@zakstucke Thanks! Looks like this particular change must have been the issue, then, more than the others.

That's quite tricky as this PR becomes virtually unusable with todo!() there, because it will break spreading props onto any component unless I'm mistaken.

@zakstucke
Copy link
Contributor

zakstucke commented Oct 6, 2024

Unusable yes, but only because of this problem with AnyView, AFAIK this isn't breaking because of large types, but genuine recursion in the compiler due to the current implementation of attribute addition. (issue still exists when erased types turned off)

@zakstucke
Copy link
Contributor

But technically a site does work by just removing the closure entirely, and just returning self from the AddAnyAttr implementation, just missing functionality.

@gbj
Copy link
Collaborator Author

gbj commented Oct 6, 2024

@zakstucke Is it possible that the whole issue was that I foolishly named them the same thing? I don't think that's right because I think I'm correctly accessing it as the struct field function and not the method, but I guess maybe it is. Try d799ede I guess. I don't think it will fix it but who knows.

@zakstucke
Copy link
Contributor

zakstucke commented Oct 6, 2024

No, I'm quite confident there's a logic issue here, take a look at the add_any_attr closure contents, and the fact that it calls into_any() on the generic output of the AddAnyAttr trait, which means into_any() recurses and produces a new add_any_attr closure etc etc

@zakstucke
Copy link
Contributor

zakstucke commented Oct 6, 2024

For example, when I was testing I tried reworking things to reuse the same AnyView (replacing the .value) instead of calling the into_any() trait to produce another one in add_any_attr. This compiles fine, but then fails at runtime downcasting because the T does actually always change during the add_any_attr closure.

@zakstucke
Copy link
Contributor

zakstucke commented Oct 6, 2024

@gbj To put my thinking more formally:

Where T = ()

  • into_any() called, or more accurately the compiler is setting it up/analysing it or something.
  • add_any_attr closure is created for T = (), T is concretely needed for the downcast and the generic AddAnyAttr::Output.
  • T::add_any_attr(..) -> T2 = (attr,)
  • T2::into_any() repeats above infinitely at compile time, leading to T3 = (attr, attr), T4 = (attr, attr, attr), ... etc.

I'm not too familiar with compiler stuff, so I may be way off, but from my testing and how the logic looks, this or similar seems to me what's happening.

@gbj
Copy link
Collaborator Author

gbj commented Oct 7, 2024

@zakstucke Thanks — This is a little tricky because I don't have an example in front of me that actually causes compiler issues. Do you have an example, or does it only occur in a sufficiently large/complex app that it's not really reproducible in small?

While investigating, I did find a possible source of the looping — Namely, that calling .into_any() on an AnyView creates an additional level of nesting, rather than simply returning that AnyView. It should really never be necessary for this to happen, so in 79ade93 I just check if the T that is being converted into AnyView is already AnyView and if so, simply return it rather than adding an additional layer of boxing. (This will probably also help with binary sizes I guess, in cases where we inadvertently end up with .into_any().into_any() chains).

I have no idea whether this will help with the particular issue, but it seems better to me in any case.

@zakstucke
Copy link
Contributor

Unfortunately your change doesn't fix it.

I've been trying to repro: after deleting half the project the app starts compiling in 5m... still using >10GB of RAM whilst compiling though. (I wasn't deleting components, just the entire body of components, aka all the view!{} trees)

So seems like it might be a size issue, shame, my recursive idea would've been much nicer 😢

@zakstucke
Copy link
Contributor

Does AnyView need to be still relying on T? Could it be reworked back into an enum?

@zakstucke
Copy link
Contributor

I've also realised I can actually get to the end of compilation with the untouched project too, it uses close to 50GB of RAM to get there (I must've been running too much other stuff last time it caused my PC to crash), compiles then get to this:

    Finished `dev` profile [unoptimized + debuginfo] target(s) in 9m 42s
     Running `target/debug/app`
dyld[90873]: dyld cache '(null)' not loaded: syscall to map cache into shared region failed
dyld[90873]: Library not loaded: /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
  Referenced from: <5B562BFE-8FC3-3113-AF60-B514DFEAB547> /Users/zak/z/code/app/rust/target/debug/app
  Reason: tried: '/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation' (no such file), '/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation' (no such file, no dyld cache)
[1]    90873 abort      cargo run --features ssr

@zakstucke
Copy link
Contributor

zakstucke commented Oct 7, 2024

image

Note the size of the files, is it that T becomes a runtime cost/string for every single AnyView for the downcasting from Any, whereas normally they're just a compile time construct maybe.

@zakstucke
Copy link
Contributor

@gbj see discord dm

@gbj gbj force-pushed the component-erasure branch from 79ade93 to a84bad0 Compare October 8, 2024 01:04
@gbj gbj force-pushed the component-erasure branch from a84bad0 to f9cc3d3 Compare October 8, 2024 01:06
@gbj
Copy link
Collaborator Author

gbj commented Oct 8, 2024

@zakstucke I've just reverted the add_any_attr change to do nothing but return the same AnyView (self) rather than panicking, so hopefully that will return this to helpful territory. I'm glad you caught it.

I've realized that -- when someone runs into the case in which they need prop spreading but now it isn't supported because their component has been erased -- they can just mark that component #[component(transparent)] anyway, and it will not be erased even with component erasure turned on.

If you could confirm for me that the latest commit here is back to normal, I think it's in fine shape to be merged.

@gbj gbj merged commit 57c07e9 into main Oct 8, 2024
73 checks passed
@gbj gbj deleted the component-erasure branch October 8, 2024 21:03
chrisp60 pushed a commit to chrisp60/leptos that referenced this pull request Oct 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants