phone-search analyzer: don't emit sip/tel prefix, int'l prefix, extension & unformatted input #16993

@rursprung rursprung commented Jan 10, 2025

Description

see the individual commit messages for further details.

Related Issues

n/a

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable. (no API change => nothing done)
  • Public documentation issue/PR created, if applicable. (not docs relevant => nothing done)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Contributor

❌ Gradle check result for 6b36f2e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@rursprung rursprung force-pushed the analysis-phonenumber-no-intl-prefix-in-search branch from 6b36f2e to 9e120b7 January 10, 2025 10:29
@rursprung
Contributor Author

rursprung commented Jan 10, 2025

please add the backport 2.x label (and if there's a chance for a 2.18.1 release, please also add the backport label for that 🙂)

Contributor

❕ Gradle check result for 9e120b7: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.indices.replication.SegmentReplicationIT.testNodeDropWithOngoingReplication
      1 org.opensearch.action.admin.cluster.node.tasks.ResourceAwareTasksTests.testTaskResourceTrackingDuringTaskCancellation

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.


codecov bot commented Jan 10, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.19%. Comparing base (f6dc4a6) to head (ed0014f).
Report is 2 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #16993      +/-   ##
============================================
+ Coverage     72.17%   72.19%   +0.02%     
+ Complexity    65251    65199      -52     
============================================
  Files          5301     5301              
  Lines        303662   303664       +2     
  Branches      43989    43991       +2     
============================================
+ Hits         219181   219245      +64     
+ Misses        66552    66430     -122     
- Partials      17929    17989      +60     


@rursprung
Contributor Author

> ❕ Gradle check result for 9e120b7: UNSTABLE
>
> * **TEST FAILURES:**
>       1 org.opensearch.indices.replication.SegmentReplicationIT.testNodeDropWithOngoingReplication
>       1 org.opensearch.action.admin.cluster.node.tasks.ResourceAwareTasksTests.testTaskResourceTrackingDuringTaskCancellation
>
> Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

these are both flaky tests:

@reta
Collaborator

reta commented Jan 10, 2025

@rursprung could you please resolve the conflicts? thank you

@reta reta added the bug, v3.0.0, v2.19.0 and backport 2.x labels Jan 10, 2025
this was an oversight in the initial implementation: if the tokenizer
emits the international calling prefix in the search analyzer then all
documents with the same international calling prefix will match.

e.g. when searching for `+1-555-123-4567` not only documents with this
number would match but also any other document with a `1` token (i.e.
any other number with this prefix).

thus the search functionality is currently broken for this analyzer,
making it useless.

the test coverage has now been extended to cover these and other
use-cases.

Signed-off-by: Ralph Ursprung <[email protected]>
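
The failure mode described in this commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: the token sets and the helper names (`index_tokens`, `search_tokens`, `matches`) are hypothetical and do not reflect the plugin's actual tokenizer output; a query is assumed to hit whenever it shares at least one token with a document.

```python
def index_tokens(country_code, national_number):
    """Toy index-time tokens: full number, calling prefix, national number (hypothetical)."""
    return {f"+{country_code}{national_number}", country_code, national_number}


def search_tokens(country_code, national_number, emit_prefix):
    """Toy search-time tokens; `emit_prefix` toggles the behaviour described as broken."""
    tokens = {f"+{country_code}{national_number}"}
    if emit_prefix:
        tokens.add(country_code)  # the international calling prefix as its own token
    return tokens


def matches(doc_tokens, query_tokens):
    # Assumption: the query matches if any single token is shared with the document.
    return bool(doc_tokens & query_tokens)


same_doc = index_tokens("1", "5551234567")   # the number being searched for
other_doc = index_tokens("1", "5559990000")  # a different number with the same prefix

buggy = search_tokens("1", "5551234567", emit_prefix=True)
fixed = search_tokens("1", "5551234567", emit_prefix=False)

print(matches(other_doc, buggy))  # the shared `1` token causes a false hit
print(matches(other_doc, fixed))  # no shared token once the prefix is not emitted
print(matches(same_doc, fixed))   # the searched number itself still matches
```

In this model the only token shared between the query and the unrelated document is the `1` prefix, which is why dropping it from the search side removes the false hit.
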
if these tokens are emitted, phone numbers with other international
dialling prefixes still match.

e.g. searching for `+1 1234` would also match a number stored as
`+2 1234`, which is wrong.

the tokens still need to be emitted for the `phone` analyzer: e.g. when
the user only enters the extension / local number it should still match.
the same applies to the other ngrams: these are needed for
search-as-you-type style queries where the user input needs to match
against partial phone numbers.

Signed-off-by: Ralph Ursprung <[email protected]>
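
The split between index-time and search-time tokens described in this commit message can be sketched as follows. This is a simplified illustration, not the plugin's code: the prefix-ngram scheme, the `min_len` value and all function names are assumptions made for the example.

```python
def prefix_ngrams(digits, min_len=3):
    """All leading substrings of `digits` with at least `min_len` characters."""
    return {digits[:n] for n in range(min_len, len(digits) + 1)}


def index_tokens(country_code, national_number):
    # Index side (toy stand-in for the `phone` analyzer): ngrams of the full
    # number and of the local number, so partial and local input can match.
    return prefix_ngrams(country_code + national_number) | prefix_ngrams(national_number)


def search_tokens(user_input):
    # Search side (toy stand-in for `phone-search`): one token, the digits as typed.
    return {"".join(ch for ch in user_input if ch.isdigit())}


def matches(doc_tokens, query_tokens):
    # Assumption: any shared token counts as a hit.
    return bool(doc_tokens & query_tokens)


stored = index_tokens("2", "1234")  # number stored as `+2 1234`

print(matches(stored, search_tokens("+1 1234")))  # different prefix: no hit
print(matches(stored, search_tokens("1234")))     # local number only: hit
print(matches(stored, search_tokens("+2 12")))    # partial input: hit
```

Because the search side emits a single token, a hit depends entirely on what the index side chose to emit, which keeps local-number and search-as-you-type input working without matching numbers that differ only in their prefix.
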
@rursprung rursprung force-pushed the analysis-phonenumber-no-intl-prefix-in-search branch from 9e120b7 to ff3c8da January 10, 2025 13:53
@rursprung rursprung changed the title from "phone-search analyzer: don't emit int'l prefix" to "phone-search analyzer: don't emit int'l prefix, extension & unformatted input" Jan 10, 2025
@rursprung
Contributor Author

> @rursprung could you please resolve the conflicts? thank you

done.

i've also added a 2nd commit and am now emitting even fewer tokens. please see the commit messages for further details. with the new test coverage i'm now quite confident that this is working as intended

CHANGELOG.md: outdated review thread (resolved)
Contributor

❌ Gradle check result for ff3c8da: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@rursprung
Contributor Author

> ❌ Gradle check result for ff3c8da: FAILURE
>
> Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

flaky test: #16658

Contributor

❌ Gradle check result for ff3c8da: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@rursprung
Contributor Author

> ❌ Gradle check result for ff3c8da: FAILURE
>
> Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

flaky test: #15826

in line with the previous two commits, the sip/tel prefix is something
else the search analyzer shouldn't emit, since otherwise searching for
any number with such a prefix will match _any_ document with the same
prefix.

Signed-off-by: Ralph Ursprung <[email protected]>
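
A minimal sketch of the sip/tel case, assuming the scheme is split off the input and would otherwise be emitted as a token of its own. The helpers `split_scheme` and `tokens` are hypothetical and only model the effect described in the commit message; they are not the plugin's implementation.

```python
def split_scheme(value):
    """Split an optional `sip:` or `tel:` scheme from the rest of the input."""
    for scheme in ("sip", "tel"):
        if value.lower().startswith(scheme + ":"):
            return scheme, value[len(scheme) + 1:]
    return None, value


def tokens(value, emit_scheme):
    """Toy tokenizer: the digits of the number, plus the scheme if `emit_scheme` is set."""
    scheme, rest = split_scheme(value)
    result = {"".join(ch for ch in rest if ch.isdigit())}
    if emit_scheme and scheme:
        result.add(scheme)
    return result


# Index side keeps the scheme token in this toy model.
doc = tokens("tel:+1 555 999 0000", emit_scheme=True)

buggy_query = tokens("tel:+1 555 123 4567", emit_scheme=True)
fixed_query = tokens("tel:+1 555 123 4567", emit_scheme=False)

print(bool(doc & buggy_query))  # the shared `tel` token causes a false hit
print(bool(doc & fixed_query))  # no shared token without the scheme
```
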
@rursprung rursprung changed the title from "phone-search analyzer: don't emit int'l prefix, extension & unformatted input" to "phone-search analyzer: don't emit sip/tel prefix, int'l prefix, extension & unformatted input" Jan 10, 2025
@rursprung
Contributor Author

> with the new test coverage i'm now quite confident that this is working as intended

aaaand that confidence was misplaced, i spotted another corner case and fixed that as well (incl. increased test coverage). now it should be fine 😅

Contributor

✅ Gradle check result for ed0014f: SUCCESS

@reta reta merged commit 4d94399 into opensearch-project:main Jan 10, 2025
38 of 39 checks passed
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-16993-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 4d943993ac93e1a140c1b58c11e812a58578f27d
# Push it to GitHub
git push --set-upstream origin backport/backport-16993-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-16993-to-2.x.

@reta
Collaborator

reta commented Jan 10, 2025

@rursprung could you please backport to 2.x manually? thank you

Labels
backport 2.x, backport-failed, bug, v2.19.0, v3.0.0