Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustdoc: compute maximum Levenshtein distance based on the query #107141

Merged
merged 4 commits into from
Feb 6, 2023

Conversation

notriddle
Copy link
Contributor

@notriddle notriddle commented Jan 21, 2023

Preview: https://notriddle.com/notriddle-rustdoc-demos/search-lev-distance-2023/std/index.html?search=regex

The heuristic is pretty close to the name resolver, maxLevDistance = Math.floor(queryLen / 3).

Fixes #103357
Fixes #82131

Similar to #103710, but following the suggestion in #103710 (comment) to use floor instead of ceil, and unblocked now that #105796 made it so that setting the max lev distance to 0 doesn't cause substring matches to be removed.

@rustbot
Copy link
Collaborator

rustbot commented Jan 21, 2023

r? @jsha

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Jan 21, 2023
@rustbot
Copy link
Collaborator

rustbot commented Jan 21, 2023

Some changes occurred in HTML/CSS/JS.

cc @GuillaumeGomez, @Folyd, @jsha

@rust-log-analyzer

This comment has been minimized.

@notriddle notriddle force-pushed the notriddle/max-lev-distance-2023 branch from 049164d to 39fd4bb Compare January 21, 2023 07:11
lev = checkIfInGenerics(row, elem);
// Now whatever happens, the returned distance is "less good" so we should mark
// it as such, and so we add 0.5 to the distance to make it "less good".
return lev + 0.5;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hack has to go. It's intended to impact search result ordering, but there's a more principled way to accomplish that, and we don't have a test case for it.

@notriddle
Copy link
Contributor Author

Fixed a bug related to path matching (which caused it to discard path_lev in cases where it's needed for a good result).

Copy link
Member

@GuillaumeGomez GuillaumeGomez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looks good to me. Can you add a test focusing on this change please? For now it's mostly removal and no adds.

@GuillaumeGomez
Copy link
Member

Looks good to me, thanks! Considering it's a change in the search engine, let's start a FCP to confirm it's ok with the team.

@rfcbot fcp merge

@rfcbot
Copy link

rfcbot commented Jan 24, 2023

Team member @GuillaumeGomez has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Jan 24, 2023
@camelid
Copy link
Member

camelid commented Jan 25, 2023

Do we know of any downsides to this change, or is it just a matter of making search follow the text of the query more strictly?

@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Jan 25, 2023
@rfcbot
Copy link

rfcbot commented Jan 25, 2023

🔔 This is now entering its final comment period, as per the review above. 🔔

@notriddle
Copy link
Contributor Author

@camelid

Do we know of any downsides to this change, or is it just a matter of making search follow the text of the query more strictly?

It's not quite that simple. The old MAX_LEV_DISTANCE was hardcoded to 3. Math.floor(queryLen / 3) will return a smaller number if queryLen is less than 9, but queries with more than 12 characters will allow lev to be 4 or more.

@camelid
Copy link
Member

camelid commented Jan 26, 2023

@notriddle Ah, thanks for the explanation! Scaling linearly with the input length seems like a good approach 👍

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Feb 4, 2023
@rfcbot
Copy link

rfcbot commented Feb 4, 2023

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

@rfcbot rfcbot added the to-announce Announce this issue on triage meeting label Feb 4, 2023
@GuillaumeGomez
Copy link
Member

Thanks!

@bors r+ rollup

@bors
Copy link
Contributor

bors commented Feb 5, 2023

📌 Commit 616a0db has been approved by GuillaumeGomez

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 5, 2023
@bors
Copy link
Contributor

bors commented Feb 6, 2023

⌛ Testing commit 616a0db with merge 7c3f0d6...

@bors
Copy link
Contributor

bors commented Feb 6, 2023

☀️ Test successful - checks-actions
Approved by: GuillaumeGomez
Pushing 7c3f0d6 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Feb 6, 2023
@bors bors merged commit 7c3f0d6 into rust-lang:master Feb 6, 2023
@rustbot rustbot added this to the 1.69.0 milestone Feb 6, 2023
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (7c3f0d6): comparison URL.

Overall result: ❌ regressions - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
1.1% [0.5%, 1.6%] 2
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
9.0% [9.0%, 9.0%] 1
Improvements ✅
(primary)
-3.5% [-4.1%, -2.9%] 2
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -3.5% [-4.1%, -2.9%] 2

Cycles

This benchmark run did not return any relevant results for this metric.

@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Feb 16, 2023
wip-sync pushed a commit to NetBSD/pkgsrc-wip that referenced this pull request Apr 24, 2023
Pkgsrc changes:
 * Adjust patches and cargo checksums to new versions.
 * Sadly, the patch to reduce the cargo verbosity no longer applies,
   so I've asked upstream about the proper way to get the old result.
   (so the build log becomes Quite Bloated for now).

Upstream changes:

Version 1.69.0 (2023-04-20)
==========================

Language
--------

- [Deriving built-in traits on packed structs works with `Copy` fields.]
  (rust-lang/rust#104429)
- [Stabilize the `cmpxchg16b` target feature on x86 and x86_64.]
  (rust-lang/rust#106774)
- [Improve analysis of trait bounds for associated types.]
  (rust-lang/rust#103695)
- [Allow associated types to be used as union fields.]
  (rust-lang/rust#106938)
- [Allow `Self: Autotrait` bounds on dyn-safe trait methods.]
  (rust-lang/rust#107082)
- [Treat `str` as containing `[u8]` for auto trait purposes.]
  (rust-lang/rust#107941)

Compiler
--------

- [Upgrade `*-pc-windows-gnu` on CI to mingw-w64 v10 and GCC 12.2.]
  (rust-lang/rust#100178)
- [Rework min_choice algorithm of member constraints.]
  (rust-lang/rust#105300)
- [Support `true` and `false` as boolean flags in compiler arguments.]
  (rust-lang/rust#107043)
- [Default `repr(C)` enums to `c_int` size.]
  (rust-lang/rust#107592)

Libraries
---------

- [Implement the unstable `DispatchFromDyn` for cell types, allowing
  downstream experimentation with custom method receivers.]
  (rust-lang/rust#97373)
- [Document that `fmt::Arguments::as_str()` may return `Some(_)`
  in more cases after optimization, subject to change.]
  (rust-lang/rust#106823)
- [Implement `AsFd` and `AsRawFd` for `Rc`.]
  (rust-lang/rust#107317)

Stabilized APIs
---------------

- [`CStr::from_bytes_until_nul`]
  (https://doc.rust-lang.org/stable/core/ffi/struct.CStr.html#method.from_bytes_until_nul)
- [`core::ffi::FromBytesUntilNulError`]
  (https://doc.rust-lang.org/stable/core/ffi/struct.FromBytesUntilNulError.html)

These APIs are now stable in const contexts:

- [`SocketAddr::new`]
  (https://doc.rust-lang.org/stable/std/net/enum.SocketAddr.html#method.new)
- [`SocketAddr::ip`]
  (https://doc.rust-lang.org/stable/std/net/enum.SocketAddr.html#method.ip)
- [`SocketAddr::port`]
  (https://doc.rust-lang.org/stable/std/net/enum.SocketAddr.html#method.port)
- [`SocketAddr::is_ipv4`]
  (https://doc.rust-lang.org/stable/std/net/enum.SocketAddr.html#method.is_ipv4)
- [`SocketAddr::is_ipv6`]
  (https://doc.rust-lang.org/stable/std/net/enum.SocketAddr.html#method.is_ipv6)
- [`SocketAddrV4::new`]
  (https://doc.rust-lang.org/stable/std/net/struct.SocketAddrV4.html#method.new)
- [`SocketAddrV4::ip`]
  (https://doc.rust-lang.org/stable/std/net/struct.SocketAddrV4.html#method.ip)
- [`SocketAddrV4::port`]
  (https://doc.rust-lang.org/stable/std/net/struct.SocketAddrV4.html#method.port)
- [`SocketAddrV6::new`]
  (https://doc.rust-lang.org/stable/std/net/struct.SocketAddrV6.html#method.new)
- [`SocketAddrV6::ip`]
  (https://doc.rust-lang.org/stable/std/net/struct.SocketAddrV6.html#method.ip)
- [`SocketAddrV6::port`]
  (https://doc.rust-lang.org/stable/std/net/struct.SocketAddrV6.html#method.port)
- [`SocketAddrV6::flowinfo`]
  (https://doc.rust-lang.org/stable/std/net/struct.SocketAddrV6.html#method.flowinfo)
- [`SocketAddrV6::scope_id`]
  (https://doc.rust-lang.org/stable/std/net/struct.SocketAddrV6.html#method.scope_id)

Cargo
-----

- [Cargo now suggests `cargo fix` or `cargo clippy --fix` when compilation warnings are auto-fixable.]
  (rust-lang/cargo#11558)
- [Cargo now suggests `cargo add` if you try to install a library crate.]
  (rust-lang/cargo#11410)
- [Cargo now sets the `CARGO_BIN_NAME` environment variable also for binary examples.]
  (rust-lang/cargo#11705)

Rustdoc
-----

- [Vertically compact trait bound formatting.]
  (rust-lang/rust#102842)
- [Only include stable lints in `rustdoc::all` group.]
  (rust-lang/rust#106316)
- [Compute maximum Levenshtein distance based on the query.]
  (rust-lang/rust#107141)
- [Remove inconsistently-present sidebar tooltips.]
  (rust-lang/rust#107490)
- [Search by macro when query ends with `!`.]
  (rust-lang/rust#108143)

Compatibility Notes
-------------------

- [The `rust-analysis` component from `rustup` now only contains
  a warning placeholder.]
  (rust-lang/rust#101841) This was primarily
  intended for RLS, and the corresponding `-Zsave-analysis` flag
  has been removed from the compiler as well.
- [Unaligned references to packed fields are now a hard error.]
  (rust-lang/rust#102513) This has been
  a warning since 1.53, and denied by default with a future-compatibility
  warning since 1.62.
- [Update the minimum external LLVM to 14.]
  (rust-lang/rust#107573)
- [Cargo now emits errors on invalid characters in a registry token.]
  (rust-lang/cargo#11600)
- [When `default-features` is set to false of a workspace dependency,
  and an inherited dependency of a member has `default-features =
  true`, Cargo will enable default features of that dependency.]
  (rust-lang/cargo#11409)
- [Cargo denies `CARGO_HOME` in the `[env]` configuration table.
  Cargo itself doesn't pick up this value, but recursive calls to
  cargo would, which was not intended.]
  (rust-lang/cargo#11644)
- [Debuginfo for build dependencies is now off if not explicitly
  set. This is expected to improve the overall build time.]

  (rust-lang/cargo#11252)

Internal Changes
----------------

These changes do not affect any public interfaces of Rust, but they represent
significant improvements to the performance or internals of rustc and related
tools.

- [Move `format_args!()` into AST (and expand it during AST lowering)]
  (rust-lang/rust#106745)
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request May 4, 2023
Pkgsrc changes:
 * Adjust patches and cargo checksums to new versions.

Upstream changes:

Version 1.69.0 (2023-04-20)
==========================

Language
--------

- [Deriving built-in traits on packed structs works with `Copy` fields.]
  (rust-lang/rust#104429)
- [Stabilize the `cmpxchg16b` target feature on x86 and x86_64.]
  (rust-lang/rust#106774)
- [Improve analysis of trait bounds for associated types.]
  (rust-lang/rust#103695)
- [Allow associated types to be used as union fields.]
  (rust-lang/rust#106938)
- [Allow `Self: Autotrait` bounds on dyn-safe trait methods.]
  (rust-lang/rust#107082)
- [Treat `str` as containing `[u8]` for auto trait purposes.]
  (rust-lang/rust#107941)

Compiler
--------

- [Upgrade `*-pc-windows-gnu` on CI to mingw-w64 v10 and GCC 12.2.]
  (rust-lang/rust#100178)
- [Rework min_choice algorithm of member constraints.]
  (rust-lang/rust#105300)
- [Support `true` and `false` as boolean flags in compiler arguments.]
  (rust-lang/rust#107043)
- [Default `repr(C)` enums to `c_int` size.]
  (rust-lang/rust#107592)

Libraries
---------

- [Implement the unstable `DispatchFromDyn` for cell types, allowing
  downstream experimentation with custom method receivers.]
  (rust-lang/rust#97373)
- [Document that `fmt::Arguments::as_str()` may return `Some(_)`
  in more cases after optimization, subject to change.]
  (rust-lang/rust#106823)
- [Implement `AsFd` and `AsRawFd` for `Rc`.]
  (rust-lang/rust#107317)

Stabilized APIs
---------------

- [`CStr::from_bytes_until_nul`]
  (https://doc.rust-lang.org/stable/core/ffi/struct.CStr.html#method.from_bytes_until_nul)
- [`core::ffi::FromBytesUntilNulError`]
  (https://doc.rust-lang.org/stable/core/ffi/struct.FromBytesUntilNulError.html)

These APIs are now stable in const contexts:

- [`SocketAddr::new`]
  (https://doc.rust-lang.org/stable/std/net/enum.SocketAddr.html#method.new)
- [`SocketAddr::ip`]
  (https://doc.rust-lang.org/stable/std/net/enum.SocketAddr.html#method.ip)
- [`SocketAddr::port`]
  (https://doc.rust-lang.org/stable/std/net/enum.SocketAddr.html#method.port)
- [`SocketAddr::is_ipv4`]
  (https://doc.rust-lang.org/stable/std/net/enum.SocketAddr.html#method.is_ipv4)
- [`SocketAddr::is_ipv6`]
  (https://doc.rust-lang.org/stable/std/net/enum.SocketAddr.html#method.is_ipv6)
- [`SocketAddrV4::new`]
  (https://doc.rust-lang.org/stable/std/net/struct.SocketAddrV4.html#method.new)
- [`SocketAddrV4::ip`]
  (https://doc.rust-lang.org/stable/std/net/struct.SocketAddrV4.html#method.ip)
- [`SocketAddrV4::port`]
  (https://doc.rust-lang.org/stable/std/net/struct.SocketAddrV4.html#method.port)
- [`SocketAddrV6::new`]
  (https://doc.rust-lang.org/stable/std/net/struct.SocketAddrV6.html#method.new)
- [`SocketAddrV6::ip`]
  (https://doc.rust-lang.org/stable/std/net/struct.SocketAddrV6.html#method.ip)
- [`SocketAddrV6::port`]
  (https://doc.rust-lang.org/stable/std/net/struct.SocketAddrV6.html#method.port)
- [`SocketAddrV6::flowinfo`]
  (https://doc.rust-lang.org/stable/std/net/struct.SocketAddrV6.html#method.flowinfo)
- [`SocketAddrV6::scope_id`]
  (https://doc.rust-lang.org/stable/std/net/struct.SocketAddrV6.html#method.scope_id)

Cargo
-----

- [Cargo now suggests `cargo fix` or `cargo clippy --fix` when compilation warnings are auto-fixable.]
  (rust-lang/cargo#11558)
- [Cargo now suggests `cargo add` if you try to install a library crate.]
  (rust-lang/cargo#11410)
- [Cargo now sets the `CARGO_BIN_NAME` environment variable also for binary examples.]
  (rust-lang/cargo#11705)

Rustdoc
-----

- [Vertically compact trait bound formatting.]
  (rust-lang/rust#102842)
- [Only include stable lints in `rustdoc::all` group.]
  (rust-lang/rust#106316)
- [Compute maximum Levenshtein distance based on the query.]
  (rust-lang/rust#107141)
- [Remove inconsistently-present sidebar tooltips.]
  (rust-lang/rust#107490)
- [Search by macro when query ends with `!`.]
  (rust-lang/rust#108143)

Compatibility Notes
-------------------

- [The `rust-analysis` component from `rustup` now only contains
  a warning placeholder.]
  (rust-lang/rust#101841) This was primarily
  intended for RLS, and the corresponding `-Zsave-analysis` flag
  has been removed from the compiler as well.
- [Unaligned references to packed fields are now a hard error.]
  (rust-lang/rust#102513) This has been
  a warning since 1.53, and denied by default with a future-compatibility
  warning since 1.62.
- [Update the minimum external LLVM to 14.]
  (rust-lang/rust#107573)
- [Cargo now emits errors on invalid characters in a registry token.]
  (rust-lang/cargo#11600)
- [When `default-features` is set to false of a workspace dependency,
  and an inherited dependency of a member has `default-features =
  true`, Cargo will enable default features of that dependency.]
  (rust-lang/cargo#11409)
- [Cargo denies `CARGO_HOME` in the `[env]` configuration table.
  Cargo itself doesn't pick up this value, but recursive calls to
  cargo would, which was not intended.]
  (rust-lang/cargo#11644)
- [Debuginfo for build dependencies is now off if not explicitly
  set. This is expected to improve the overall build time.]

  (rust-lang/cargo#11252)

Internal Changes
----------------

These changes do not affect any public interfaces of Rust, but they represent
significant improvements to the performance or internals of rustc and related
tools.

- [Move `format_args!()` into AST (and expand it during AST lowering)]
  (rust-lang/rust#106745)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
10 participants