Skip to content

[BUG FIX] Add space type default and ef search parameter in warmup#276

Merged
jmazanec15 merged 5 commits intoopensearch-project:mainfrom
jmazanec15:shard-spacetype-fix
Feb 10, 2022
Merged

[BUG FIX] Add space type default and ef search parameter in warmup#276
jmazanec15 merged 5 commits intoopensearch-project:mainfrom
jmazanec15:shard-spacetype-fix

Conversation

@jmazanec15
Copy link
Copy Markdown
Member

Description

Warmup is used to load native library files into memory so that there are no initial latency penalties during search. The load process for search and warmup are very similar.

Currently, warmup api has a similar bug to #267. It does not default to a space type. For customers upgrading to 1.2 from ES 7.1 or 7.4 this will cause warmup failure. This only applies to 1.2. 1.1 and 1.0 do not have this bug in warmup.

Additionally, for nmslib, we allow the ef_search parameter to be set after indexing. To do this, we pass the parameter in during loading. In 1.2, we did not pass this parameter in. This will cause default ef_search to be used during warmup. To clean this up a bit, I abstracted generating load parameters into a utility function and used the same one for both search as well as warmup.

Issues Resolved

#255

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Sets the default value of space type to L2 in KNNIndexShard.
KNNIndexShard is used during warmup to load segments into memory. For
indices created in ES 7.1 and 7.4, they will not have this value set
because the only space we supported was l2. So, we need to hardcode the
defaults here.

Signed-off-by: John Mazanec <jmazane@amazon.com>
For nmslib, the ef_search parameter is configurable at load time. So, it
needs to be passed as a parameter in both the search load phase as well
as warmup. This commit adds it to the warmup phase and abstracts load
parameters to a helper function so that it can be consistent for both
search and warmup.

Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
@jmazanec15 jmazanec15 requested a review from a team February 9, 2022 23:13
* @param indexName Name of OpenSearch index that the segment files belong to
* @return load parameters that will be passed to the JNI.
*/
public static Map<String, Object> getLoadParameters(SpaceType spaceType, KNNEngine knnEngine, String indexName) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be: getParametersAtLoading ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Will update.

for (FieldInfo fieldInfo : reader.getFieldInfos()) {
if (fieldInfo.attributes().containsKey(KNNVectorFieldMapper.KNN_FIELD)) {
SpaceType spaceType = SpaceType.getSpace(fieldInfo.attributes().get(SPACE_TYPE));
String spaceTypeName = fieldInfo.attributes().getOrDefault(SPACE_TYPE, SpaceType.L2.getValue());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about to add a default value here to make more readable?

Suggested change
String spaceTypeName = fieldInfo.attributes().getOrDefault(SPACE_TYPE, SpaceType.L2.getValue());
String spaceTypeDefaultVaule = SpaceType.L2.getValue();
String spaceTypeName = fieldInfo.attributes().getOrDefault(SPACE_TYPE, spaceTypeDefaultVaule);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make default value a constant

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, I will update.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way is fine, but i am leaning towards what you have :) since method itself gives the hint that it is get or Default :)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I think I will just keep what I have. So, we do have SpaceType.DEFAULT, but in this case I do not want to use that because I know for sure that this should be L2. I will update with a comment.

* @return load parameters that will be passed to the JNI.
*/
public static Map<String, Object> getLoadParameters(SpaceType spaceType, KNNEngine knnEngine, String indexName) {
Map<String, Object> loadParameters = Maps.newHashMap(ImmutableMap.of(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we going to mutate this map later? Seems that at the end we return mutable object from this method.

Suggested change
Map<String, Object> loadParameters = Maps.newHashMap(ImmutableMap.of(
final Map<String, Object> loadParameters;
if (KNNEngine.NMSLIB.equals(knnEngine)) {
loadParameters = ImmutableMap.of(
SPACE_TYPE, spaceType.getValue(),
HNSW_ALGO_EF_SEARCH, KNNSettings.getEfSearchParam(indexName)
);
} else {
loadParameters = ImmutableMap.of(SPACE_TYPE, spaceType.getValue());
}
return loadParameters;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or we can just return UnModifiableMap.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer the unmodifiable map approach. Will go with that.

for (FieldInfo fieldInfo : reader.getFieldInfos()) {
if (fieldInfo.attributes().containsKey(KNNVectorFieldMapper.KNN_FIELD)) {
SpaceType spaceType = SpaceType.getSpace(fieldInfo.attributes().get(SPACE_TYPE));
String spaceTypeName = fieldInfo.attributes().getOrDefault(SPACE_TYPE, SpaceType.L2.getValue());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make default value a constant

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Feb 10, 2022

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.38%. Comparing base (18e6e35) to head (fd8bd5e).
⚠️ Report is 802 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##               main     #276   +/-   ##
=========================================
  Coverage     83.38%   83.38%           
- Complexity      884      885    +1     
=========================================
  Files           127      127           
  Lines          3833     3834    +1     
  Branches        361      361           
=========================================
+ Hits           3196     3197    +1     
  Misses          475      475           
  Partials        162      162           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Member

@VijayanB VijayanB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@jmazanec15 jmazanec15 added backport 1.2 label to add to PRs to auto backport backport 1.x labels Feb 10, 2022
@jmazanec15 jmazanec15 merged commit 2fb2ad1 into opensearch-project:main Feb 10, 2022
github-actions bot pushed a commit that referenced this pull request Feb 10, 2022
)

Sets the default value of space type to L2 in KNNIndexShard.
KNNIndexShard is used during warmup to load segments into memory. For
indices created in ES 7.1 and 7.4, they will not have this value set
because the only space we supported was l2. So, we need to hardcode the
defaults here.

For nmslib, the ef_search parameter is configurable at load time. So, it
needs to be passed as a parameter in both the search load phase as well
as warmup. This commit adds it to the warmup phase and abstracts load
parameters to a helper function so that it can be consistent for both
search and warmup.

Signed-off-by: John Mazanec <jmazane@amazon.com>
(cherry picked from commit 2fb2ad1)
@github-actions
Copy link
Copy Markdown

The backport to 1.2 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.2 1.2
# Navigate to the new working tree
cd .worktrees/backport-1.2
# Create a new branch
git switch --create backport/backport-276-to-1.2
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 2fb2ad116bc11bde3eab5695aed65392943c08ae
# Push it to GitHub
git push --set-upstream origin backport/backport-276-to-1.2
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.2

Then, create a pull request where the base branch is 1.2 and the compare/head branch is backport/backport-276-to-1.2.

jmazanec15 added a commit to jmazanec15/k-NN-1 that referenced this pull request Feb 10, 2022
…pensearch-project#276)

Sets the default value of space type to L2 in KNNIndexShard.
KNNIndexShard is used during warmup to load segments into memory. For
indices created in ES 7.1 and 7.4, they will not have this value set
because the only space we supported was l2. So, we need to hardcode the
defaults here.

For nmslib, the ef_search parameter is configurable at load time. So, it
needs to be passed as a parameter in both the search load phase as well
as warmup. This commit adds it to the warmup phase and abstracts load
parameters to a helper function so that it can be consistent for both
search and warmup.

Signed-off-by: John Mazanec <jmazane@amazon.com>
(cherry picked from commit 2fb2ad1)
jmazanec15 added a commit that referenced this pull request Feb 10, 2022
) (#285)

Sets the default value of space type to L2 in KNNIndexShard.
KNNIndexShard is used during warmup to load segments into memory. For
indices created in ES 7.1 and 7.4, they will not have this value set
because the only space we supported was l2. So, we need to hardcode the
defaults here.

For nmslib, the ef_search parameter is configurable at load time. So, it
needs to be passed as a parameter in both the search load phase as well
as warmup. This commit adds it to the warmup phase and abstracts load
parameters to a helper function so that it can be consistent for both
search and warmup.

Signed-off-by: John Mazanec <jmazane@amazon.com>
(cherry picked from commit 2fb2ad1)
martin-gaievski pushed a commit to martin-gaievski/k-NN that referenced this pull request Mar 30, 2022
…pensearch-project#276)

Sets the default value of space type to L2 in KNNIndexShard.
KNNIndexShard is used during warmup to load segments into memory. For
indices created in ES 7.1 and 7.4, they will not have this value set
because the only space we supported was l2. So, we need to hardcode the
defaults here.

For nmslib, the ef_search parameter is configurable at load time. So, it
needs to be passed as a parameter in both the search load phase as well
as warmup. This commit adds it to the warmup phase and abstracts load
parameters to a helper function so that it can be consistent for both
search and warmup.

Signed-off-by: John Mazanec <jmazane@amazon.com>
jingqimao77-spec pushed a commit to jingqimao77-spec/k-NN that referenced this pull request Mar 15, 2026
…pensearch-project#276)

Sets the default value of space type to L2 in KNNIndexShard.
KNNIndexShard is used during warmup to load segments into memory. For
indices created in ES 7.1 and 7.4, they will not have this value set
because the only space we supported was l2. So, we need to hardcode the
defaults here.

For nmslib, the ef_search parameter is configurable at load time. So, it
needs to be passed as a parameter in both the search load phase as well
as warmup. This commit adds it to the warmup phase and abstracts load
parameters to a helper function so that it can be consistent for both
search and warmup.

Signed-off-by: John Mazanec <jmazane@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport 1.x backport 1.2 label to add to PRs to auto backport

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants