Introduce compression and mode mapping parms#2019

Merged
jmazanec15 merged 9 commits into opensearch-project:main from jmazanec15:mode-compress-params
Sep 4, 2024

@jmazanec15
Member

Description

Introduces new params for mapping and training, called compression_level and mode. These are high-level parameters that give the plugin a hint about how the user wants their system configured, without exposing algorithmic details. This change only adds the parameters to the plugin as no-ops. In a future change, we will add the functionality for parameter resolution.

Along with this, I added a class to more easily manage the original parameters that a user passes. This will help ensure our mapper maintains good compatibility.

I am adding tests now, but wanted to raise the PR to get early feedback.

Related Issues

#1949

Check List

  • Commits are signed per the DCO using --signoff.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
@jmazanec15 jmazanec15 requested a review from navneet1v September 3, 2024 19:06
Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
}
});

protected final Parameter<String> mode = Parameter.restrictedStringParam(
Collaborator

Can we define these variables as static so that we don't need to create them for every instance?

Member Author

yes

Collaborator

I don't think it can be static. These parameters are parsed, so for every mapper instance the values depend on the provided input.

Member Author

I thought @heemin32 meant just the last stream

}

if (resolvedKNNMethodContext == null) {
if (originalParameters.getResolvedKnnMethodContext() == null) {
Collaborator

See if this makes sense; otherwise please ignore.

I think originalParameters is customer provided, so I would wrap it in another class:

InternalParameter internalParameters = new InternalParameter(originalParameters);
internalParameters.knnMethodContext();

That gives you some separation between user-provided parameters and internal parameters, which might also provide better flexibility.
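The wrapping idea in the comment above could look roughly like the following. This is only a sketch under stated assumptions: `OriginalParameters` and `InternalParameters` here are hypothetical stand-ins written for illustration, not the classes in this PR, and the fallback values `in_memory` and `1x` are assumed defaults.

```java
import java.util.Optional;

final class ParameterWrapperSketch {

    /** Hypothetical holder for exactly what the user wrote in the mapping. */
    static final class OriginalParameters {
        private final String mode;             // null when the user did not set it
        private final String compressionLevel; // null when the user did not set it

        OriginalParameters(String mode, String compressionLevel) {
            this.mode = mode;
            this.compressionLevel = compressionLevel;
        }

        Optional<String> mode() { return Optional.ofNullable(mode); }

        Optional<String> compressionLevel() { return Optional.ofNullable(compressionLevel); }
    }

    /** Hypothetical internal view that resolves whatever the user left unset. */
    static final class InternalParameters {
        private final OriginalParameters original;

        InternalParameters(OriginalParameters original) {
            this.original = original;
        }

        // Assumed defaults, for illustration only.
        String resolvedMode() { return original.mode().orElse("in_memory"); }

        String resolvedCompressionLevel() { return original.compressionLevel().orElse("1x"); }

        // The untouched user input stays reachable, e.g. for writing the mapping back out.
        OriginalParameters original() { return original; }
    }

    public static void main(String[] args) {
        OriginalParameters user = new OriginalParameters("in_memory", null);
        InternalParameters internal = new InternalParameters(user);
        System.out.println(internal.resolvedCompressionLevel());
        System.out.println(user.compressionLevel().isPresent());
    }
}
```

The point of the split is that resolution logic only ever reads from the wrapper, while serialization can keep using the original input unchanged.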

Member Author

I like this, but I might take it up in another review.

@jmazanec15 jmazanec15 requested a review from heemin32 September 3, 2024 21:00
Collaborator

@navneet1v navneet1v left a comment

Overall looks good to me. I am a little skeptical about having NotConfigured along with null values for the new enums. I would like to know what others think about this.

cc: @heemin32 , @shatejas

Comment on lines +299 to +302
put(KNNConstants.MODE_PARAMETER, modelMetadata.getMode().toString());
}
if (CompressionLevel.isConfigured(modelMetadata.getCompressionLevel())) {
put(KNNConstants.COMPRESSION_LEVEL_PARAMETER, modelMetadata.getCompressionLevel().toString());
Collaborator

@jmazanec15 if you are putting the enums in the objects, I would recommend going with the value of the enum rather than toString. That way, even if the toString representation changes, the enum value stays consistent, which ensures BWC.

Member Author

I was a little concerned here because we are ingesting the value into the system index, and I was worried about having an object indexed. Do you think that's fine though?

Comment on lines +508 to +511
builder.field(KNNConstants.MODE_PARAMETER, mode.toString());
}
if (CompressionLevel.isConfigured(compressionLevel)) {
builder.field(KNNConstants.COMPRESSION_LEVEL_PARAMETER, compressionLevel.toString());
Collaborator

Why not use ENUM.value here rather than toString?

Member Author

For the toXContent method, this will potentially be exposed to the user. So I was thinking it would be better to return what they would put in.

Collaborator

In that case, we should have another field in the enum that holds the user-provided value. Using toString here is not what I would recommend.
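A minimal sketch of the suggestion in this thread: give the enum an explicit field for the user-facing value and serialize that, so a later change to toString cannot alter what is stored or returned. The field name `userValue`, the accessor, and `fromUserValue` are hypothetical names chosen for illustration; only `isConfigured`, `NOT_CONFIGURED`, and the `in_memory` value come from the discussion.

```java
import java.util.Arrays;

final class EnumNameSketch {

    enum Mode {
        NOT_CONFIGURED(""),
        IN_MEMORY("in_memory");

        // Stable, user-facing value; this is what gets serialized.
        private final String userValue;

        Mode(String userValue) { this.userValue = userValue; }

        String getUserValue() { return userValue; }

        static boolean isConfigured(Mode mode) {
            return mode != null && mode != NOT_CONFIGURED;
        }

        /** Parses the user-facing value; null or empty input maps to NOT_CONFIGURED. */
        static Mode fromUserValue(String value) {
            if (value == null || value.isEmpty()) {
                return NOT_CONFIGURED;
            }
            return Arrays.stream(values())
                .filter(m -> m.userValue.equals(value))
                .findFirst()
                .orElseThrow(() -> new IllegalArgumentException("Unknown mode: " + value));
        }

        // Free to change for logging or debugging without affecting stored data.
        @Override
        public String toString() { return "Mode(" + userValue + ")"; }
    }

    public static void main(String[] args) {
        Mode parsed = Mode.fromUserValue("in_memory");
        System.out.println(parsed.getUserValue());
        System.out.println(Mode.isConfigured(Mode.fromUserValue(null)));
    }
}
```

With this shape, both the system-index write and the toXContent output can call the accessor, which addresses the BWC concern and still returns what the user would have typed.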

*
* @return number of bits to represent a float at this compression level
*/
public int numBitsForFloat() {
Collaborator

?

Suggested change
public int numBitsForFloat() {
public int numBitsForFloat32() {

*/
@AllArgsConstructor
public enum CompressionLevel {
NOT_CONFIGURED(-1),
Collaborator

nit: be careful with -1 here. As long as it's just syntactic sugar and not used in any computation, we should be good.

Member Author

Right. The value isn't exposed outside of this class and is handled below.
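To illustrate the concern about the -1 sentinel, here is a hedged sketch of keeping the raw value private and guarding it before any arithmetic. The constants `X1` and `X8`, and the choice to treat NOT_CONFIGURED as uncompressed (32 bits), are assumptions for illustration; the PR's actual handling may differ.

```java
final class CompressionLevelSketch {

    enum CompressionLevel {
        NOT_CONFIGURED(-1), // sentinel only; never used in arithmetic
        X1(1),
        X8(8);              // hypothetical level, for illustration

        private static final int FLOAT32_BITS = 32;

        // Kept private so the -1 sentinel cannot leak into callers' computations.
        private final int factor;

        CompressionLevel(int factor) { this.factor = factor; }

        /** Number of bits used to represent a 32-bit float at this level. */
        int numBitsForFloat32() {
            if (this == NOT_CONFIGURED) {
                // Assumed behavior: an unset level means no compression.
                return FLOAT32_BITS;
            }
            return FLOAT32_BITS / factor;
        }
    }

    public static void main(String[] args) {
        for (CompressionLevel level : CompressionLevel.values()) {
            System.out.println(level + " -> " + level.numBitsForFloat32());
        }
    }
}
```

Without the guard, `32 / -1` would silently yield -32, which is the kind of accidental computation the nit warns about.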

Comment on lines +104 to +105
this.mode = Mode.NOT_CONFIGURED;
this.compressionLevel = CompressionLevel.NOT_CONFIGURED;
Collaborator

nit: could have defaulted to in_memory and 1x here. They are still valid values for versions before 2.17.

Member Author

I prefer NOT_CONFIGURED just because I want to differentiate the case where someone actually supplies "1x" or "in_memory" as config, and handle that case a bit differently.

Collaborator

> handle that case a bit differently

Would appreciate more context on this part. I was of the opinion that it would execute the same code path as the current one.
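One place where the distinction is visible in this PR's diff is serialization: the fields are written only when `isConfigured` returns true. The sketch below mirrors that guard with a plain map instead of the real builder. The key strings `mode` and `compression_level` are assumed values for the constants, and the enums are simplified stand-ins.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ConfiguredOnlySketch {

    // Assumed values for the parameter-name constants.
    static final String MODE_PARAMETER = "mode";
    static final String COMPRESSION_LEVEL_PARAMETER = "compression_level";

    enum Mode {
        NOT_CONFIGURED(""),
        IN_MEMORY("in_memory");

        final String userValue;

        Mode(String userValue) { this.userValue = userValue; }
    }

    enum CompressionLevel {
        NOT_CONFIGURED(""),
        X1("1x");

        final String userValue;

        CompressionLevel(String userValue) { this.userValue = userValue; }
    }

    /** Emits only the fields the user actually set, so output matches input. */
    static Map<String, String> toFields(Mode mode, CompressionLevel level) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (mode != Mode.NOT_CONFIGURED) {
            fields.put(MODE_PARAMETER, mode.userValue);
        }
        if (level != CompressionLevel.NOT_CONFIGURED) {
            fields.put(COMPRESSION_LEVEL_PARAMETER, level.userValue);
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(toFields(Mode.NOT_CONFIGURED, CompressionLevel.NOT_CONFIGURED));
        System.out.println(toFields(Mode.IN_MEMORY, CompressionLevel.X1));
    }
}
```

If the fields defaulted to in_memory and 1x instead, a mapping created without them would be indistinguishable from one that set them explicitly, and both would be echoed back with those values.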

+ "\"<KNNEngine>,<SpaceType>,<Dimension>,<ModelState>,<Timestamp>,<Description>,<Error>,<NodeAssignment>,<MethodContext>\" or "
+ "\"<KNNEngine>,<SpaceType>,<Dimension>,<ModelState>,<Timestamp>,<Description>,<Error>,<NodeAssignment>,<MethodContext>,<VectorDataType>\"."
+ "\"<KNNEngine>,<SpaceType>,<Dimension>,<ModelState>,<Timestamp>,<Description>,<Error>,<NodeAssignment>,<MethodContext>,<VectorDataType>\". or "
+ "\"<KNNEngine>,<SpaceType>,<Dimension>,<ModelState>,<Timestamp>,<Description>,<Error>,<NodeAssignment>,<MethodContext>,<VectorDataType>,<WorkloadConfig>,<CompressionConfig>\"."
Collaborator

nit: <Mode>,<CompressionLevel>

@heemin32
Collaborator

heemin32 commented Sep 3, 2024

> Overall looks good to me. I am a little skeptical about having NotConfigured along with null values for the new enums. I would like to know what others think about this.
>
> cc: @heemin32 , @shatejas

Yes. For NotConfigured to be useful, we should be able to assume that the enum won't be null.

@shatejas
Collaborator

shatejas commented Sep 3, 2024

> Overall looks good to me. I am a little skeptical about having NotConfigured along with null values for the new enums. I would like to know what others think about this.
>
> cc: @heemin32 , @shatejas

@navneet1v jmazanec15#1 (comment)

Ideally we shouldn't have placeholders, but I think we can get away with it unless I am missing something. We can revisit removing them as long as it's not a one-way door (make sure we are not passing NOT_CONFIGURED in streams used for bwc).

Collaborator

@shatejas shatejas left a comment

Overall no concerns with functionality

  • Please make sure NOT_CONFIGURED is not passed in streams for bwc in any case
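One way to keep the placeholder off the wire is to gate the field on the peer's version and write a presence flag rather than the placeholder itself. The sketch below uses plain `java.io` streams, not OpenSearch's stream classes, and `VERSION_WITH_MODE` is a hypothetical version number invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class BwcStreamSketch {

    // Hypothetical wire version that first understands the new field.
    static final int VERSION_WITH_MODE = 2;

    enum Mode {
        NOT_CONFIGURED(""),
        IN_MEMORY("in_memory");

        final String userValue;

        Mode(String userValue) { this.userValue = userValue; }

        static Mode fromUserValue(String value) {
            for (Mode m : values()) {
                if (m != NOT_CONFIGURED && m.userValue.equals(value)) {
                    return m;
                }
            }
            throw new IllegalArgumentException("Unknown mode: " + value);
        }
    }

    /** Encodes the mode for a peer on the given version; the placeholder is never written. */
    static byte[] encode(int peerVersion, Mode mode) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (peerVersion >= VERSION_WITH_MODE) {
                boolean configured = mode != Mode.NOT_CONFIGURED;
                out.writeBoolean(configured);     // presence flag
                if (configured) {
                    out.writeUTF(mode.userValue); // only real values reach the wire
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Decodes what a peer on the given version sent; older peers never sent the field. */
    static Mode decode(int peerVersion, byte[] data) {
        if (peerVersion < VERSION_WITH_MODE) {
            return Mode.NOT_CONFIGURED;
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readBoolean() ? Mode.fromUserValue(in.readUTF()) : Mode.NOT_CONFIGURED;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(2, encode(2, Mode.IN_MEMORY)));
        System.out.println(encode(1, Mode.IN_MEMORY).length);
    }
}
```

Because only a boolean and real values are ever serialized, removing the NOT_CONFIGURED constant later would not change the wire format, which keeps the decision reversible.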

@jmazanec15 jmazanec15 force-pushed the mode-compress-params branch 2 times, most recently from ee6d75c to c0e84bd Compare September 3, 2024 22:37
Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
heemin32 previously approved these changes Sep 3, 2024
Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
@jmazanec15 jmazanec15 merged commit 920c819 into opensearch-project:main Sep 4, 2024
opensearch-trigger-bot bot pushed a commit that referenced this pull request Sep 4, 2024
Signed-off-by: John Mazanec <jmazane@amazon.com>
(cherry picked from commit 920c819)
opensearch-trigger-bot bot pushed a commit that referenced this pull request Sep 4, 2024
Signed-off-by: John Mazanec <jmazane@amazon.com>
(cherry picked from commit 920c819)
jmazanec15 pushed a commit that referenced this pull request Sep 4, 2024
Signed-off-by: John Mazanec <jmazane@amazon.com>
(cherry picked from commit 920c819)
akashsha1 pushed a commit to akashsha1/k-NN that referenced this pull request Sep 16, 2024
Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: Akash Shankaran <akash.shankaran@intel.com>
jingqimao77-spec pushed a commit to jingqimao77-spec/k-NN that referenced this pull request Mar 15, 2026
Signed-off-by: John Mazanec <jmazane@amazon.com>
5 participants