Load rustdoc's JS search index on-demand. #82310
Some changes occurred in HTML/CSS/JS.
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @GuillaumeGomez (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information.
I have two problems with this change:
Pushed a fix. Thanks for spotting. I've also updated the demo pages at https://jacob.hoffman-andrews.com/rust/search-on-demand/std/primitive.slice.html.
In my tests, the browser is spending significant time parsing the JS (and the JSON inside). That's time that blocks other JS work, like building the toggles, which in turn blocks the overall page load from completing. This technique also reduces the memory footprint of doc pages in the common case where someone hasn't used search.
When I pick a crate to filter the search results, the picked crate won't show up when I reload the page until I load the search index.
Another thing which bothers me is the fact that the crate list arrives quite a while after clicking on the menu. It's really bad from a UX perspective...
Good catch, fixed (and updated the demo pages).
Agreed, this also bothers me. Perhaps it's possible to generate the crate list statically rather than in JS? I'll take a look. By the way, this noticeable delay is a good demonstration that loading the search JSON takes non-negligible time even when the JS is cached.
That's actually a great argument in favor of this change!
It'd require another JS file to store the crate list, but it's possible. In the last version on your website, it seems like the search-index is now loaded every time (even with an empty local storage).
Is the crate list not known at generation time for any particular page? Also, are there situations where the crate list is quite large, or is the situation on
Yep. The issue here is that if you pick a crate, then go back to "All crates", there's still a storage entry. Updated the code to ignore "All crates" (and pushed demo).
It's actually very common to have more than 5 crates. To be exact, it depends on how many dependencies you have (so I'll let you do the maths :p).
I specified that my local storage was empty, which meant that I removed the local storage "rustdoc-saved-filter-crate" entry completely through the web dev tools. ;)
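One way to express the "ignore All crates" rule discussed above is a small helper that treats both a missing entry and the "All crates" value as "no saved filter". This is only a sketch: the storage key comes from the thread, while `getSavedCrateFilter`, `makeMemoryStorage`, and the storage-like parameter are hypothetical names, not rustdoc's actual code.

```javascript
// Hypothetical helper: returns the saved crate filter, or null when there is
// nothing to restore. "All crates" is treated the same as no entry at all.
var ALL_CRATES_LABEL = "All crates";

function getSavedCrateFilter(storage) {
    var saved = storage.getItem("rustdoc-saved-filter-crate");
    if (saved === null || saved === undefined || saved === ALL_CRATES_LABEL) {
        return null;
    }
    return saved;
}

// Tiny in-memory stand-in for window.localStorage so the sketch also runs
// outside a browser. Only getItem is implemented.
function makeMemoryStorage(initial) {
    var data = {};
    Object.keys(initial || {}).forEach(function(key) {
        data[key] = String(initial[key]);
    });
    return {
        getItem: function(key) {
            return Object.prototype.hasOwnProperty.call(data, key) ? data[key] : null;
        }
    };
}
```

In a browser, `window.localStorage` could be passed in place of the in-memory stand-in.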
Sorry, forgot to answer one of your questions:
No, and the reason is actually a bit more complex than it seems: you might add more crates to your local documentation, but the search index has to work in any case and not simply be overwritten. So, if the search index already has this crate, it overwrites this entry specifically; otherwise it appends it.
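The update rule described here (replace the entry for the crate being documented, keep everything else) can be sketched as follows. The real logic lives in rustdoc's Rust code; this JavaScript version only illustrates the idea, and it assumes each index line starts with the quoted crate name. The function name `mergeCrateEntry` and the sample line contents are hypothetical.

```javascript
// Merges one crate's entry into a list of per-crate index lines.
// Assumption: every line starts with the quoted crate name, e.g. '"std":{...}'.
function mergeCrateEntry(existingLines, crateName, newLine) {
    var prefix = '"' + crateName + '"';
    var lines = [];
    var crates = [];
    existingLines.forEach(function(line) {
        if (line.indexOf(prefix) === 0) {
            // Stale entry for the crate being documented: drop it, the fresh
            // entry is pushed below.
            return;
        }
        lines.push(line);
        // The crate name is the text between the first pair of quotes.
        crates.push(line.slice(1, line.indexOf('"', 1)));
    });
    lines.push(newLine);
    crates.push(crateName);
    crates.sort();
    return { lines: lines, crates: crates };
}
```

Because the stale entry is skipped before the current crate is pushed, the returned crate list contains each name once, whether the crate was already indexed or not.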
I pushed a new version (and updated demo) that generates a separate, smaller, crates.js listing the crates. This is used for building the dropdown and also for adding sidebar items depending on the page. There's still some issue: my crates.js has
Found the issue. Looks like a difference between Firefox and Chrome.
src/librustdoc/html/render/mod.rs
Outdated
let crate_list_dst = cx.dst.join(&format!("crates{}.js", cx.shared.resource_suffix));
let crate_list =
    format!("const crates = [{}];", krates.iter().map(|k| format!("\"{}\"", k)).join(","));
`const` isn't part of ES5, so please use `window.CRATES = [{}]` instead. Also: this update is problematic because you remove crates that aren't listed, which is why you don't have `test` listed in the crates. Please update the list instead of rewriting it completely.
EDIT: I wrote `CRATES` capitalized, but it's simply to avoid name conflicts, because "crates" seems way too common. If you want to go for another name (a longer one at least), don't hesitate to do so.
In case you wonder: that's why we update the search index like that and have each crate defined on its own line.
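The two requests in this review comment (an ES5-compatible global instead of `const`, and updating the crate list rather than rewriting it) could be combined roughly like this. The real generator is Rust code in `render/mod.rs`; JavaScript is used here only to show the shape of the emitted file. The global name `ALL_CRATES` and the function `renderCrateList` are hypothetical.

```javascript
// Builds the text of a crates.js file from the crates already listed plus the
// crate currently being documented. The output assigns to a property on
// `window`, which is valid ES5 (no `const`).
function renderCrateList(existingCrates, newCrate) {
    var crates = existingCrates.slice();
    if (crates.indexOf(newCrate) === -1) {
        crates.push(newCrate); // update the list instead of replacing it
    }
    crates.sort(); // sorted once at generation time, not on every page load
    return "window.ALL_CRATES = " + JSON.stringify(crates) + ";";
}
```

`JSON.stringify` handles the quoting of crate names, so the generated line is a plain array literal that any ES5 engine can evaluate.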
src/librustdoc/html/static/main.js
Outdated
}

// Assumes `crates{version}.js` has already been loaded, setting the global const `crates`.
addSearchOptions(crates);
I don't much like "assuming" things are there in an async world. Instead, please call the function that initializes the crate lists directly from the `crates{version}.js` file.
OK, never mind, `crates{version}.js` will always be loaded before. It's fine as is. However, please use `window.crates` or equivalent. I don't like the use of globals with such common names; it hurts readability a lot.
src/librustdoc/html/static/main.js
Outdated
// Assumes `crates{version}.js` has already been loaded, setting the global const `crates`.
addSearchOptions(crates);
addSidebarCrates(crates);
var crateSearchDropDown = document.getElementById("crate-search");
Can you put all this in a function too so we can avoid creating variables in the "global" scope of this function please?
I think we're getting close to something good! Not having to load the search-index when unneeded is a good idea. I'm still worried about the big delay we have on the first search though... I'll think about that.
A couple of thoughts on reducing the feeling of delay on the first search:
Right now, there's no reaction on the rest of the page when you highlight the search box. It would be nice to insert some text below the search box like "Getting ready to search." Alternatively: right now the search box says "Click or press 'S' to search, '?' for more options." And it continues to say that even after you focus the search box. We could make it so focusing the search box changes this to say "Type in your search here." That would indicate to the user that their input was received.
Similarly, when the user starts typing, we should be responsive to that typing even if the index is not yet loaded. For instance, we could add text under the search box: "Searching for 'yourquery'."
One problem: the index JS load is one big event, not interruptible by other events. We could probably work around that by breaking up the JSON.parse into multiple calls (one for each crate), and adding setTimeout between them. Similarly, I think there's a fair amount of work happening to turn the parsed JSON into in-memory structures; we could add setTimeout during that work too, to make it possible to update the page in between.
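The idea of splitting the parse into one `JSON.parse` call per crate, with a `setTimeout` between calls, might look roughly like the sketch below. It is not code from this PR: `parseIndexInChunks`, its parameters, and the injectable `schedule` function are assumptions made for illustration.

```javascript
// Parses each crate's JSON text separately and yields to the event loop
// between crates, so pending input events get a chance to run.
// `rawByCrate` maps crate name -> JSON text. `schedule` is optional and
// defaults to a zero-delay setTimeout; it is injectable so the stepping can
// be driven by hand in a test.
function parseIndexInChunks(rawByCrate, onDone, schedule) {
    schedule = schedule || function(fn) { setTimeout(fn, 0); };
    var names = Object.keys(rawByCrate);
    var parsed = {};
    var i = 0;

    function step() {
        if (i >= names.length) {
            onDone(parsed); // every crate has been parsed
            return;
        }
        var name = names[i];
        parsed[name] = JSON.parse(rawByCrate[name]);
        i += 1;
        schedule(step); // give other events a turn before the next crate
    }

    schedule(step);
}
```

Each step still blocks for as long as one crate takes to parse, so a single very large crate would remain one uninterruptible chunk.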
By the way, I was looking at the delay when searching and realized we have an artificial delay of 500ms before any search gets executed: rust/src/librustdoc/html/static/main.js Line 1935 in 301ad8a
Yep, but here we apparently outrun this timeout. ;)
I did a little test: I added some console.log lines for various events, cleared cache, loaded a page, then hit 's' and typed "string" as fast as I could. I just took a short typing test, and I type 475 characters in 60 seconds, or 1 character every 126ms. The JS code triggers on both keydown and keyup, so I'd expect to see 2 events every 126ms.
This shows that no events fire during load of the search-index. Some time after the index is fully loaded, we see a series of events come in impossibly fast. Then 500ms later, the search is actually executed. Due to the single-threaded nature of JS, no events can fire while the search-index is loading. Note that you can bypass the 500ms timeout and search immediately by hitting enter. This might be a good way to test whether things feel slow independent of that 500ms timeout. If my theory above is correct, adding setTimeouts during the search-index load should allow events to fire and generally make the page feel more responsive.
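The behaviour described above (each key event restarts a 500ms timer, while Enter searches immediately) is a standard debounce. A minimal sketch follows, assuming a hypothetical `makeSearchDebouncer` helper with injectable timers; it is not the code at the referenced line of main.js.

```javascript
// Debounces searches: key events restart the timer, Enter bypasses it.
// `timers` is optional and injectable so the behaviour can be tested without
// waiting for real timeouts.
function makeSearchDebouncer(runSearch, delayMs, timers) {
    timers = timers || {
        set: function(fn, ms) { return setTimeout(fn, ms); },
        clear: function(id) { clearTimeout(id); }
    };
    var pending = null;
    var latestQuery = "";

    function fire() {
        pending = null;
        runSearch(latestQuery);
    }

    return {
        // Called on keydown/keyup: cancel any waiting search and start over.
        onKey: function(query) {
            latestQuery = query;
            if (pending !== null) {
                timers.clear(pending);
            }
            pending = timers.set(fire, delayMs);
        },
        // Called on Enter: cancel the timer and search right away.
        onEnter: function(query) {
            latestQuery = query;
            if (pending !== null) {
                timers.clear(pending);
                pending = null;
            }
            runSearch(latestQuery);
        }
    };
}
```

With `delayMs` set to 500, a burst of queued key events delivered right after the index finishes loading would each reset the timer, which matches the observation that the search runs about 500ms after the last event.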
I tested the I did find that Overall, I think this is the best we're likely to do. If we wanted to get really fancy, we could use a Web Worker that always has a copy of the index loaded. That would make searches blazing fast, but it wouldn't work for local files so we'd need two code paths. That's probably too much complexity.
src/librustdoc/html/static/main.js
Outdated
}
}
crates_text.sort(function(a, b) {
crates.sort(function(a, b) {
I just thought: now that the crate list is separated from the search index, it might be nice to sort the crates when generating the crate list in the Rust code instead of doing it client side every time. What do you think? If you think it's too much, please add a FIXME comment so it can be taken care of later on.
Yep, I'd be happy to do this!
☔ The latest upstream changes (presumably #82511) made this pull request unmergeable. Please resolve the merge conflicts.
a4f2d2d to 3d5ac81
src/librustdoc/html/render/mod.rs
Outdated
let dst = cx.dst.join(&format!("search-index{}.js", cx.shared.resource_suffix));
let (mut all_indexes, mut krates) = try_err!(collect_json(&dst, &krate.name.as_str()), &dst);
all_indexes.push(search_index);
krates.push(krate.name.to_string());
krates.sort();
krates.dedup();
Just wondering: is that case even possible? Normally you can only have each crate once. Did you find a case where there was a duplicate?
I was moving this up from here: https://github.com/rust-lang/rust/pull/82310/files#diff-40a0eb025da61717b3b765ceb7fab21d91af3012360e90b9f46e15a4047946faL1081. I do think it's possible that `krates.push(krate.name.to_string())` adds a crate name that is a duplicate of a crate already in the list.
Look at the code of `collect_json`, this part more precisely:
// ...
let prefix = format!("\"{}\"", krate);
// ...
if line.starts_with(&prefix) {
continue;
}
// ...
If the crate is already in the search index, we don't add it to the crates list returned. So the `dedup` is unnecessary here.
I went through the changes once again and I think we're getting close to approval. Just two comments (one nit and one question about the
3d5ac81 to 9416946
Instead of being loaded on every page, the JS search index is now loaded when either (a) there is a `?search=` param, or (b) the search input is focused.
This saves both CPU and bandwidth. As of Feb 2021, https://doc.rust-lang.org/search-index1.50.0.js is 273,838 bytes gzipped or 2,544,939 bytes uncompressed. Evaluating it takes 445 ms of CPU time in Chrome 88 on an i7-10710U CPU (out of a total ~2,100 ms page reload).
Generate separate JS file with crate names. This is much smaller than the full search index, and is used in the "hot path" to draw the page. In particular it's used to create the dropdown for the search bar, and to append a list of crates to the sidebar (on some pages).
Skip early search that can bypass 500ms timeout. This was occurring when someone had typed some text during the load of search-index.js. Their query was usually not ready to execute, and the search itself is fairly expensive, delaying the overall load, which delayed the input / keyup events, which delayed eventually executing the query.
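The two load triggers named in this commit message can be sketched as a load-once guard plus a query-string check. This is an illustration under assumptions, not the PR's code: `makeIndexLoader`, `hasSearchParam`, and the `loadScript` callback are hypothetical names.

```javascript
// Returns a function that triggers the index load at most once, no matter
// how many times it is called (e.g. on every focus of the search input).
function makeIndexLoader(loadScript) {
    var requested = false;
    return function ensureLoaded() {
        if (requested) {
            return false; // already requested, nothing to do
        }
        requested = true;
        loadScript();
        return true;
    };
}

// Reports whether a query string such as "?search=fn" has a `search` param.
function hasSearchParam(queryString) {
    var pairs = queryString.replace(/^\?/, "").split("&");
    for (var i = 0; i < pairs.length; i += 1) {
        if (pairs[i].split("=")[0] === "search") {
            return true;
        }
    }
    return false;
}

// Browser wiring, shown as comments because it needs a DOM:
//   var ensureLoaded = makeIndexLoader(function() {
//       /* append a <script> tag pointing at the search index file */
//   });
//   if (hasSearchParam(window.location.search)) { ensureLoaded(); }
//   searchInput.addEventListener("focus", ensureLoaded);
```

Keeping the guard separate from the script injection means the same `ensureLoaded` function can be attached to both triggers without loading the index twice.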
9416946 to 768d5e9
Updated, thanks!
Thanks! @bors: r+
📌 Commit 768d5e9 has been approved by
…laumeGomez Load rustdoc's JS search index on-demand. Instead of being loaded on every page, the JS search index is now loaded when either (a) there is a `?search=` param, or (b) the search input is focused. This saves both CPU and bandwidth. As of Feb 2021, https://doc.rust-lang.org/search-index1.50.0.js is 273,838 bytes gzipped or 2,544,939 bytes uncompressed. Evaluating it takes 445 ms of CPU time in Chrome 88 on a i7-10710U CPU (out of a total ~2,100 ms page reload). Tested on Firefox and Chrome. New: https://jacob.hoffman-andrews.com/rust/search-on-demand/std/primitive.slice.html https://jacob.hoffman-andrews.com/rust/search-on-demand/std/primitive.slice.html?search=fn Old: https://jacob.hoffman-andrews.com/rust/search-on-load/std/primitive.slice.html https://jacob.hoffman-andrews.com/rust/search-on-load/std/primitive.slice.html?search=fn
Rollup of 8 pull requests
Successful merges:
- rust-lang#80527 (Make rustdoc lints a tool lint instead of built-in)
- rust-lang#82310 (Load rustdoc's JS search index on-demand.)
- rust-lang#82315 (Improve page load performance in rustdoc)
- rust-lang#82564 (Revert `Vec::spare_capacity_mut` impl to prevent pointers invalidation)
- rust-lang#82697 (Fix stabilization version of move_ref_pattern)
- rust-lang#82717 (Account for macros when suggesting adding lifetime)
- rust-lang#82740 (Fix commit detected when using `download-rustc`)
- rust-lang#82744 (Pass `CrateNum` by value instead of by reference)
Failed merges:
r? `@ghost` `@rustbot` modify labels: rollup
rustdoc merged a new PR [#82310](rust-lang/rust#82310) a few days ago, which causes breaking changes to this extension. This commit aims to fix this incompatibility.
Instead of being loaded on every page, the JS search index is now loaded when either (a) there is a `?search=` param, or (b) the search input is focused.
This saves both CPU and bandwidth. As of Feb 2021, https://doc.rust-lang.org/search-index1.50.0.js is 273,838 bytes gzipped or 2,544,939 bytes uncompressed. Evaluating it takes 445 ms of CPU time in Chrome 88 on an i7-10710U CPU (out of a total ~2,100 ms page reload).
Tested on Firefox and Chrome.
New:
https://jacob.hoffman-andrews.com/rust/search-on-demand/std/primitive.slice.html
https://jacob.hoffman-andrews.com/rust/search-on-demand/std/primitive.slice.html?search=fn
Old:
https://jacob.hoffman-andrews.com/rust/search-on-load/std/primitive.slice.html
https://jacob.hoffman-andrews.com/rust/search-on-load/std/primitive.slice.html?search=fn