[BUG FIX] Add space type default and ef search parameter in warmup#276
Conversation
Sets the default value of space type to L2 in KNNIndexShard. KNNIndexShard is used during warmup to load segments into memory. For indices created in ES 7.1 and 7.4, they will not have this value set because the only space we supported was l2. So, we need to hardcode the defaults here. Signed-off-by: John Mazanec <jmazane@amazon.com>
For nmslib, the ef_search parameter is configurable at load time. So, it needs to be passed as a parameter in both the search load phase as well as warmup. This commit adds it to the warmup phase and abstracts load parameters to a helper function so that it can be consistent for both search and warmup. Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
| * @param indexName Name of OpenSearch index that the segment files belong to | ||
| * @return load parameters that will be passed to the JNI. | ||
| */ | ||
| public static Map<String, Object> getLoadParameters(SpaceType spaceType, KNNEngine knnEngine, String indexName) { |
There was a problem hiding this comment.
may be: getParametersAtLoading ?
There was a problem hiding this comment.
Makes sense. Will update.
| for (FieldInfo fieldInfo : reader.getFieldInfos()) { | ||
| if (fieldInfo.attributes().containsKey(KNNVectorFieldMapper.KNN_FIELD)) { | ||
| SpaceType spaceType = SpaceType.getSpace(fieldInfo.attributes().get(SPACE_TYPE)); | ||
| String spaceTypeName = fieldInfo.attributes().getOrDefault(SPACE_TYPE, SpaceType.L2.getValue()); |
There was a problem hiding this comment.
How about to add a default value here to make more readable?
| String spaceTypeName = fieldInfo.attributes().getOrDefault(SPACE_TYPE, SpaceType.L2.getValue()); | |
| String spaceTypeDefaultVaule = SpaceType.L2.getValue(); | |
| String spaceTypeName = fieldInfo.attributes().getOrDefault(SPACE_TYPE, spaceTypeDefaultVaule); |
There was a problem hiding this comment.
I would make default value a constant
There was a problem hiding this comment.
That makes sense, I will update.
There was a problem hiding this comment.
Either way is fine, but i am leaning towards what you have :) since method itself gives the hint that it is get or Default :)
There was a problem hiding this comment.
Right, I think I will just keep what I have. So, we do have SpaceType.DEFAULT, but in this case I do not want to use that because I know for sure that this should be L2. I will update with a comment.
| * @return load parameters that will be passed to the JNI. | ||
| */ | ||
| public static Map<String, Object> getLoadParameters(SpaceType spaceType, KNNEngine knnEngine, String indexName) { | ||
| Map<String, Object> loadParameters = Maps.newHashMap(ImmutableMap.of( |
There was a problem hiding this comment.
Are we going to mutate this map later? Seems that at the end we return mutable object from this method.
| Map<String, Object> loadParameters = Maps.newHashMap(ImmutableMap.of( | |
| final Map<String, Object> loadParameters; | |
| if (KNNEngine.NMSLIB.equals(knnEngine)) { | |
| loadParameters = ImmutableMap.of( | |
| SPACE_TYPE, spaceType.getValue(), | |
| HNSW_ALGO_EF_SEARCH, KNNSettings.getEfSearchParam(indexName) | |
| ); | |
| } else { | |
| loadParameters = ImmutableMap.of(SPACE_TYPE, spaceType.getValue()); | |
| } | |
| return loadParameters; |
There was a problem hiding this comment.
or we can just return UnModifiableMap.
There was a problem hiding this comment.
Prefer the unmodifiable map approach. Will go with that.
| for (FieldInfo fieldInfo : reader.getFieldInfos()) { | ||
| if (fieldInfo.attributes().containsKey(KNNVectorFieldMapper.KNN_FIELD)) { | ||
| SpaceType spaceType = SpaceType.getSpace(fieldInfo.attributes().get(SPACE_TYPE)); | ||
| String spaceTypeName = fieldInfo.attributes().getOrDefault(SPACE_TYPE, SpaceType.L2.getValue()); |
There was a problem hiding this comment.
I would make default value a constant
Signed-off-by: John Mazanec <jmazane@amazon.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #276 +/- ##
=========================================
Coverage 83.38% 83.38%
- Complexity 884 885 +1
=========================================
Files 127 127
Lines 3833 3834 +1
Branches 361 361
=========================================
+ Hits 3196 3197 +1
Misses 475 475
Partials 162 162 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
) Sets the default value of space type to L2 in KNNIndexShard. KNNIndexShard is used during warmup to load segments into memory. For indices created in ES 7.1 and 7.4, they will not have this value set because the only space we supported was l2. So, we need to hardcode the defaults here. For nmslib, the ef_search parameter is configurable at load time. So, it needs to be passed as a parameter in both the search load phase as well as warmup. This commit adds it to the warmup phase and abstracts load parameters to a helper function so that it can be consistent for both search and warmup. Signed-off-by: John Mazanec <jmazane@amazon.com> (cherry picked from commit 2fb2ad1)
|
The backport to To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.2 1.2
# Navigate to the new working tree
cd .worktrees/backport-1.2
# Create a new branch
git switch --create backport/backport-276-to-1.2
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 2fb2ad116bc11bde3eab5695aed65392943c08ae
# Push it to GitHub
git push --set-upstream origin backport/backport-276-to-1.2
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.2Then, create a pull request where the |
…pensearch-project#276) Sets the default value of space type to L2 in KNNIndexShard. KNNIndexShard is used during warmup to load segments into memory. For indices created in ES 7.1 and 7.4, they will not have this value set because the only space we supported was l2. So, we need to hardcode the defaults here. For nmslib, the ef_search parameter is configurable at load time. So, it needs to be passed as a parameter in both the search load phase as well as warmup. This commit adds it to the warmup phase and abstracts load parameters to a helper function so that it can be consistent for both search and warmup. Signed-off-by: John Mazanec <jmazane@amazon.com> (cherry picked from commit 2fb2ad1)
) (#285) Sets the default value of space type to L2 in KNNIndexShard. KNNIndexShard is used during warmup to load segments into memory. For indices created in ES 7.1 and 7.4, they will not have this value set because the only space we supported was l2. So, we need to hardcode the defaults here. For nmslib, the ef_search parameter is configurable at load time. So, it needs to be passed as a parameter in both the search load phase as well as warmup. This commit adds it to the warmup phase and abstracts load parameters to a helper function so that it can be consistent for both search and warmup. Signed-off-by: John Mazanec <jmazane@amazon.com> (cherry picked from commit 2fb2ad1)
…pensearch-project#276) Sets the default value of space type to L2 in KNNIndexShard. KNNIndexShard is used during warmup to load segments into memory. For indices created in ES 7.1 and 7.4, they will not have this value set because the only space we supported was l2. So, we need to hardcode the defaults here. For nmslib, the ef_search parameter is configurable at load time. So, it needs to be passed as a parameter in both the search load phase as well as warmup. This commit adds it to the warmup phase and abstracts load parameters to a helper function so that it can be consistent for both search and warmup. Signed-off-by: John Mazanec <jmazane@amazon.com>
…pensearch-project#276) Sets the default value of space type to L2 in KNNIndexShard. KNNIndexShard is used during warmup to load segments into memory. For indices created in ES 7.1 and 7.4, they will not have this value set because the only space we supported was l2. So, we need to hardcode the defaults here. For nmslib, the ef_search parameter is configurable at load time. So, it needs to be passed as a parameter in both the search load phase as well as warmup. This commit adds it to the warmup phase and abstracts load parameters to a helper function so that it can be consistent for both search and warmup. Signed-off-by: John Mazanec <jmazane@amazon.com>
Description
Warmup is used to load native library files into memory so that there are no initial latency penalties during search. The load process for search and warmup are very similar.
Currently, warmup api has a similar bug to #267. It does not default to a space type. For customers upgrading to 1.2 from ES 7.1 or 7.4 this will cause warmup failure. This only applies to 1.2. 1.1 and 1.0 do not have this bug in warmup.
Additionally, for nmslib, we allow the ef_search parameter to be set after indexing. To do this, we pass the parameter in during loading. In 1.2, we did not pass this parameter in. This will cause default ef_search to be used during warmup. To clean this up a bit, I abstracted generating load parameters into a utility function and used the same one for both search as well as warmup.
Issues Resolved
#255
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.