Introduce compression and mode mapping parms by jmazanec15 · Pull Request #2019 · opensearch-project/k-NN

jmazanec15 · 2024-09-03T18:23:05Z

Description

Introduces two new parameters for mapping and training: compression_level and mode. These are high-level parameters that hint at how the user wants to configure their system without exposing algorithmic details. This change only adds the parameters to the plugin as no-ops. A future change will add the functionality for parameter resolution.

Along with this, I added a class to more easily manage the original parameters that a user passes. This will help ensure our mapper maintains good compatibility.

Adding tests now, but wanted to raise the PR to get early feedback.

Related Issues

#1949

Check List

Commits are signed per the DCO using --signoff.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.


heemin32 · 2024-09-03T19:19:20Z

src/main/java/org/opensearch/knn/index/mapper/KNNVectorFieldMapper.java

            }
        });

+        protected final Parameter<String> mode = Parameter.restrictedStringParam(


Can we define these variables as static so that we don't need to create them for every instance?

I don't think they can be static. These parameters are parsed, so for every mapper instance the values depend on the provided input.

I thought @heemin32 meant just the last stream

heemin32 · 2024-09-03T19:28:04Z

src/main/java/org/opensearch/knn/index/mapper/KNNVectorFieldMapper.java

            }

-            if (resolvedKNNMethodContext == null) {
+            if (originalParameters.getResolvedKnnMethodContext() == null) {


Just see if this makes sense; otherwise, please ignore.

I think originalParameters is customer provided. If so, I would wrap it in another class.

    InternalParameter internalParameters = new InternalParameter(originalParameter);
    internalParameters.knnMethodContext();

Then you can have some separation between user-provided parameters and internal parameters, which might also provide better flexibility.

I like this, but I might take it up in another review.
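The separation suggested above could look roughly like the following. This is a minimal sketch, not code from the PR: `OriginalParameters`, `InternalParameters`, their fields, and the `"hnsw"` fallback are hypothetical stand-ins, and the method context is modeled as a plain string.

```java
import java.util.Objects;

// Hypothetical stand-in for the values exactly as the user supplied them.
final class OriginalParameters {
    private final String knnMethodContext; // null when the user did not set a method
    private final String mode;             // null when the user did not set a mode

    OriginalParameters(String knnMethodContext, String mode) {
        this.knnMethodContext = knnMethodContext;
        this.mode = mode;
    }

    String getKnnMethodContext() { return knnMethodContext; }
    String getMode() { return mode; }
}

// Hypothetical wrapper exposing values the plugin resolves internally.
final class InternalParameters {
    private final OriginalParameters original;

    InternalParameters(OriginalParameters original) {
        this.original = Objects.requireNonNull(original);
    }

    // Resolved view: falls back to an assumed default when the user gave nothing.
    String knnMethodContext() {
        String userValue = original.getKnnMethodContext();
        return userValue != null ? userValue : "hnsw"; // assumed default, for illustration only
    }

    // The untouched user input stays reachable, e.g. for writing the mapping back out.
    OriginalParameters original() { return original; }
}

final class InternalParametersDemo {
    public static void main(String[] args) {
        OriginalParameters userInput = new OriginalParameters(null, "in_memory");
        InternalParameters internal = new InternalParameters(userInput);
        System.out.println(internal.knnMethodContext());
        System.out.println(internal.original().getKnnMethodContext());
    }
}
```

Keeping the user input immutable and resolving defaults in a separate type means the mapping can be serialized back exactly as it was supplied, while internal code only ever reads resolved values.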

navneet1v

Overall, looks good to me. I am a little skeptical about having NotConfigured along with null values for the new enums. I would like to know what others think about this.

cc: @heemin32 , @shatejas

navneet1v · 2024-09-03T21:21:35Z

src/main/java/org/opensearch/knn/index/mapper/KNNVectorFieldMapper.java

            }
        });

+        protected final Parameter<String> mode = Parameter.restrictedStringParam(


I don't think they can be static. These parameters are parsed, so for every mapper instance the values depend on the provided input.

navneet1v · 2024-09-03T21:24:30Z

src/main/java/org/opensearch/knn/indices/ModelDao.java

+                        put(KNNConstants.MODE_PARAMETER, modelMetadata.getMode().toString());
+                    }
+                    if (CompressionLevel.isConfigured(modelMetadata.getCompressionLevel())) {
+                        put(KNNConstants.COMPRESSION_LEVEL_PARAMETER, modelMetadata.getCompressionLevel().toString());


@jmazanec15 if you are putting the enums in the objects, I would recommend using the value of the enum rather than toString. That way, even if the toString representation changes, the enum value stays consistent, which ensures BWC.

I was a little concerned here because we are ingesting the value into the system index, and I was wary of having an object indexed. Do you think that's fine, though?

navneet1v · 2024-09-03T21:25:31Z

src/main/java/org/opensearch/knn/indices/ModelMetadata.java

+                builder.field(KNNConstants.MODE_PARAMETER, mode.toString());
+            }
+            if (CompressionLevel.isConfigured(compressionLevel)) {
+                builder.field(KNNConstants.COMPRESSION_LEVEL_PARAMETER, compressionLevel.toString());


Why not use the enum value here rather than toString?

The toXContent output is potentially exposed to the user, so I was thinking it would be better to return what they would put in.

In that case, we should have another field in the enum that holds the user-provided value. Using toString here is not what I would recommend.
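One way to read this suggestion is an enum that carries the user-facing string as an explicit field, so that serialization never depends on toString. The sketch below is illustrative only: `WorkloadMode`, `getUserValue`, and `fromUserValue` are hypothetical names, and the empty string for the unset case is an assumption.

```java
import java.util.Locale;

// Hypothetical enum: the user-facing string is an explicit field, not toString().
enum WorkloadMode {
    NOT_CONFIGURED(""),
    IN_MEMORY("in_memory");

    private final String userValue;

    WorkloadMode(String userValue) {
        this.userValue = userValue;
    }

    // Value to write into mappings, responses, or the system index.
    String getUserValue() {
        return userValue;
    }

    // Parses what the user typed; null or empty input means "not set".
    static WorkloadMode fromUserValue(String value) {
        if (value == null || value.isEmpty()) {
            return NOT_CONFIGURED;
        }
        String normalized = value.toLowerCase(Locale.ROOT);
        for (WorkloadMode candidate : values()) {
            if (candidate.userValue.equals(normalized)) {
                return candidate;
            }
        }
        throw new IllegalArgumentException("Unsupported mode: " + value);
    }

    // toString() can change for logging without affecting stored data.
    @Override
    public String toString() {
        return "WorkloadMode[" + userValue + "]";
    }
}

final class WorkloadModeDemo {
    public static void main(String[] args) {
        WorkloadMode parsed = WorkloadMode.fromUserValue("in_memory");
        System.out.println(parsed.getUserValue()); // the stable, user-facing form
        System.out.println(parsed);                // display form only
    }
}
```

With this shape, both the toXContent path and the system-index path can write `getUserValue()`, and a later change to toString does not alter what is persisted.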

heemin32 · 2024-09-03T21:33:23Z

src/main/java/org/opensearch/knn/index/mapper/CompressionLevel.java

+     *
+     * @return number of bits to represent a float at this compression level
+     */
+    public int numBitsForFloat() {


?

Suggested change

-    public int numBitsForFloat() {
+    public int numBitsForFloat32() {

shatejas · 2024-09-03T18:55:49Z

src/main/java/org/opensearch/knn/index/mapper/CompressionLevel.java

+ */
+@AllArgsConstructor
+public enum CompressionLevel {
+    NOT_CONFIGURED(-1),


nit: be careful with -1 here; as long as it's just syntactic sugar and not used in any computation, we should be good.

Right. The value isn't exposed outside of this class and is handled below.
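To illustrate the point about keeping the -1 sentinel out of any arithmetic, here is a minimal sketch. It is not the PR's enum: `SketchCompressionLevel`, the constants other than NOT_CONFIGURED(-1), and the choice to return 32 bits for the unset case are all assumptions made for the example.

```java
// Hypothetical enum modeled on the discussion; only NOT_CONFIGURED(-1) comes from the diff.
enum SketchCompressionLevel {
    NOT_CONFIGURED(-1),
    X1(1),
    X2(2),
    X4(4);

    private static final int FP32_BITS = 32;

    private final int factor;

    SketchCompressionLevel(int factor) {
        this.factor = factor;
    }

    // The sentinel is checked first, so -1 never takes part in the division.
    int numBitsForFloat32() {
        if (this == NOT_CONFIGURED) {
            return FP32_BITS; // assumed: nothing requested means full precision
        }
        return FP32_BITS / factor;
    }

    // Null-safe helper: null and NOT_CONFIGURED both count as "not set".
    static boolean isConfigured(SketchCompressionLevel level) {
        return level != null && level != NOT_CONFIGURED;
    }
}

final class SketchCompressionLevelDemo {
    public static void main(String[] args) {
        for (SketchCompressionLevel level : SketchCompressionLevel.values()) {
            System.out.println(level + " -> " + level.numBitsForFloat32() + " bits, configured="
                + SketchCompressionLevel.isConfigured(level));
        }
    }
}
```

Because the raw factor is private and every computation goes through `numBitsForFloat32()`, callers cannot accidentally divide by the placeholder value.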

shatejas · 2024-09-03T21:32:15Z

src/main/java/org/opensearch/knn/indices/ModelMetadata.java

+            this.mode = Mode.NOT_CONFIGURED;
+            this.compressionLevel = CompressionLevel.NOT_CONFIGURED;


nit: could have defaulted to in_memory and 1x here. They are still valid values for versions before 2.17.

I prefer NOT_CONFIGURED because I want to differentiate the case where someone actually supplies "1x" or "in_memory" as config and handle that case a bit differently.

handle that case a bit differently

Would appreciate more context on this part. I was of the opinion that it will execute the same code path as the current one.

shatejas · 2024-09-03T21:33:32Z

src/main/java/org/opensearch/knn/indices/ModelMetadata.java

                    + "\"<KNNEngine>,<SpaceType>,<Dimension>,<ModelState>,<Timestamp>,<Description>,<Error>,<NodeAssignment>,<MethodContext>\" or "
-                    + "\"<KNNEngine>,<SpaceType>,<Dimension>,<ModelState>,<Timestamp>,<Description>,<Error>,<NodeAssignment>,<MethodContext>,<VectorDataType>\"."
+                    + "\"<KNNEngine>,<SpaceType>,<Dimension>,<ModelState>,<Timestamp>,<Description>,<Error>,<NodeAssignment>,<MethodContext>,<VectorDataType>\" or "
+                    + "\"<KNNEngine>,<SpaceType>,<Dimension>,<ModelState>,<Timestamp>,<Description>,<Error>,<NodeAssignment>,<MethodContext>,<VectorDataType>,<WorkloadConfig>,<CompressionConfig>\"."


nit: use <Mode>,<CompressionLevel> instead of <WorkloadConfig>,<CompressionConfig>

heemin32 · 2024-09-03T21:39:16Z

Overall, looks good to me. I am a little skeptical about having NotConfigured along with null values for the new enums. I would like to know what others think about this.

cc: @heemin32 , @shatejas

Yes. For NotConfigured to be useful, we should be able to assume that the enum is never null.

shatejas · 2024-09-03T21:41:54Z

Overall, looks good to me. I am a little skeptical about having NotConfigured along with null values for the new enums. I would like to know what others think about this.

cc: @heemin32 , @shatejas

@navneet1v jmazanec15#1 (comment)

Ideally we shouldn't have placeholders, but I think we can get away with it unless I am missing something. We can revisit removing it as long as it's not a one-way door (make sure we are not passing NOT_CONFIGURED in streams used for BWC).

shatejas

Overall no concerns with functionality

Please make sure NOT_CONFIGURED is not passed in streams for bwc in any case
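The stream concern can be sketched with plain java.io streams. This is a hedged illustration, not the plugin's serialization code: `ModeStreamSketch`, `VERSION_WITH_MODE`, and the integer peer version are hypothetical, and writing an empty string for the unset case is an assumption. The idea is that the placeholder constant's name never goes on the wire and older peers receive no extra bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ModeStreamSketch {
    // Hypothetical wire version at which the new field was introduced.
    static final int VERSION_WITH_MODE = 2;

    // Writes the user-facing mode value; peers older than VERSION_WITH_MODE get nothing.
    static byte[] write(int peerVersion, String modeUserValue) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            if (peerVersion >= VERSION_WITH_MODE) {
                // Unset is sent as an empty string, never as the placeholder's name.
                out.writeUTF(modeUserValue == null ? "" : modeUserValue);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the mode back; for older peers the field is absent and treated as unset.
    static String read(int peerVersion, byte[] bytes) {
        if (peerVersion < VERSION_WITH_MODE) {
            return "";
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] newWire = write(VERSION_WITH_MODE, "in_memory");
        System.out.println(read(VERSION_WITH_MODE, newWire));
        System.out.println(write(1, "in_memory").length);
    }
}
```

Gating on the peer version keeps the old wire format byte-for-byte unchanged, which is what makes removing the placeholder later a reversible decision.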

jmazanec15 added backport 2.x skip-changelog labels Sep 3, 2024

jmazanec15 requested review from VijayanB, heemin32, junqiu-lei, luyuncheng, martin-gaievski, naveentatikonda, navneet1v, ryanbogan and vamshin as code owners September 3, 2024 18:23

jmazanec15 force-pushed the mode-compress-params branch from d0ae053 to 302006e Compare September 3, 2024 18:25

jmazanec15 force-pushed the mode-compress-params branch from 302006e to 722b7f1 Compare September 3, 2024 18:33

navneet1v reviewed Sep 3, 2024

jmazanec15 added 2 commits September 3, 2024 11:46

Fix original parameter setting

c366e95

Signed-off-by: John Mazanec <jmazane@amazon.com>

Fix compression bits

ea0f732

Signed-off-by: John Mazanec <jmazane@amazon.com>

jmazanec15 requested a review from navneet1v September 3, 2024 19:06

jmazanec15 added 2 commits September 3, 2024 12:53

Minor changes

a8e91c3

Signed-off-by: John Mazanec <jmazane@amazon.com>

fix bugs

27f9228

Signed-off-by: John Mazanec <jmazane@amazon.com>

heemin32 reviewed Sep 3, 2024

jmazanec15 requested a review from heemin32 September 3, 2024 21:00

navneet1v reviewed Sep 3, 2024

heemin32 reviewed Sep 3, 2024

shatejas reviewed Sep 3, 2024

shatejas approved these changes Sep 3, 2024

jmazanec15 force-pushed the mode-compress-params branch 2 times, most recently from ee6d75c to c0e84bd Compare September 3, 2024 22:37

Address comments

f0896f2

Signed-off-by: John Mazanec <jmazane@amazon.com>

jmazanec15 force-pushed the mode-compress-params branch from c0e84bd to f0896f2 Compare September 3, 2024 22:41

jmazanec15 requested review from heemin32 and navneet1v September 3, 2024 22:42

Change string from null to empty

485ee91

Signed-off-by: John Mazanec <jmazane@amazon.com>

heemin32 previously approved these changes Sep 3, 2024

jmazanec15 dismissed heemin32’s stale review via d31ad23 September 3, 2024 23:45

Minor fixes

f85826d

Signed-off-by: John Mazanec <jmazane@amazon.com>

jmazanec15 force-pushed the mode-compress-params branch from d31ad23 to f85826d Compare September 3, 2024 23:46

Fix mapping

a7c09f8

Signed-off-by: John Mazanec <jmazane@amazon.com>

jmazanec15 force-pushed the mode-compress-params branch from 9912899 to a7c09f8 Compare September 4, 2024 00:29

navneet1v approved these changes Sep 4, 2024

jmazanec15 requested a review from heemin32 September 4, 2024 01:26

jmazanec15 added the backport 2.17 label Sep 4, 2024

naveentatikonda approved these changes Sep 4, 2024

jmazanec15 merged commit 920c819 into opensearch-project:main Sep 4, 2024

opensearch-trigger-bot bot mentioned this pull request Sep 4, 2024

[Backport 2.x] Introduce compression and mode mapping parms #2028

Merged

opensearch-trigger-bot bot mentioned this pull request Sep 4, 2024

[Backport 2.17] Introduce compression and mode mapping parms #2029

Merged

5 participants