[Inference API] Parse endpoint metadata from persisted endpoints#143081
Conversation
This PR continues from elastic#141393. It wires in parsing the endpoint metadata from the persisted endpoints. This means endpoint metadata will now be available from the GET inference APIs. As we had to make `UnparsedModel` the way to parse persisted configs for EIS, I took the opportunity to refactor the `InferenceService.parsePersistedConfig` methods so that there is now only one, the one expecting an `UnparsedModel`. This is a nice clean up in that area. Even after this PR there is quite some duplication across services for their parse request/persistent config code. I'll follow up to clean further in separate PRs in the future.
|
Pinging @elastic/search-inference-team (Team:Search - Inference) |
|
Hi @dimitris-athanasiou, I've created a changelog YAML for you. |
|
I wonder if it would be possible to move the implementation of EDIT: In fact, I think this change would work very well with the changes that Jan has been making related to the update operation, with the addition of the Changing the constructor of I tested this with |
|
This is brilliant @DonalEvans ! I followed through and the result is so beautiful it almost made me cry! :-) |
|
One thing I noticed: |
jonathan-buttner
left a comment
There was a problem hiding this comment.
Looks good, just left a few questions.
| "secret_settings": { | ||
| "api_key": "12345" | ||
| } | ||
| "service_settings": {} |
There was a problem hiding this comment.
Just curious why we're removing the secret_settings here? Is it because they're not necessary for the test to to find model?
There was a problem hiding this comment.
The test's parsePersistedConfig takes a json string so it requires some handling to convert to an UnparsedModel which expects separate maps for service/secret settings. However, I have now modified it to work properly and reinstated those 2 tests.
|
|
||
| public void testParseStoredConfig_DoesNotThrowWhenAnExtraKeyExistsInServiceSettings() throws IOException { | ||
| try (var service = createServiceWithMockSender()) { | ||
| { |
There was a problem hiding this comment.
Should we keep the individual cases in the test?
There was a problem hiding this comment.
Not sure what you mean. The test was previously trying the same thing 3 times because there were 3 different parse methods. The actual config used was always the same.
| this.settings = settings == null ? null : new HashMap<>(settings); | ||
| // Additionally, an empty secrets map is treated as null in order to skip potential validations for missing keys | ||
| // which should not be necessary when parsing a persisted model. | ||
| this.secrets = secrets == null || secrets.isEmpty() ? null : new HashMap<>(secrets); |
There was a problem hiding this comment.
Can you talk a little more about the advantage of using null here? It was possible to get null before right? We're just changing empty to also be null?
There was a problem hiding this comment.
If secrets here ends being an empty map, validations will fail complaining for missing fields. In my understanding, we should never have validations failing when parsing a persisted config. They should only apply when we parse from the request. Thus, here, I'm taking care of this potential issue in a single place.
Previously this was not an issue because there was the variant for parsing without secrets which resulted in null. As now we have a single parse method, it is good defense I think to handle this here. Happy to revise though if you think otherwise.
| private final Sender sender; | ||
| private final ServiceComponents serviceComponents; | ||
| private final ClusterService clusterService; | ||
| private final Map<TaskType, ModelCreator<? extends M>> modelCreators; |
There was a problem hiding this comment.
Lifting this up is great because it reduces the code duplication. My only concern is when the child classes don't adhere to the Map<TaskType, ModelCreator<? extends M>> like EIS. Any thoughts on how we can handle that in the future?
I guess that's the same situation we were in before, if we need to add a new field to the ModelCreator interface then we have to add it everywhere.
I'm not really sure how to solve that, maybe by using a new object that gets passed instead (similar to what we did for UnparsedModel) 🤷♂️
There was a problem hiding this comment.
I think ideally we would have those EIS ModelCreators share a common base class/interface with the rest. Then we could deal with this. I thought about taking a look, but I wanted to draw the line somewhere on this PR :-) But it should be possible I'd think.
DonalEvans
left a comment
There was a problem hiding this comment.
I think we talked about this already, but it would be good if we could reduce the duplication of tests for the parsePersistedConfig() method, especially now that a lot of tests that were testing the similar parsePersistedConfigWithSecrets() method have been converted into testing parsePersistedConfig(). I haven't checked all of the test files, but certainly in HuggingFaceServiceTests (and I suspect many others) we now have multiple tests that are testing the same behaviour.
Not something that absolutely needs to be addressed in this PR, but if we don't already have an issue for it, it would be good to create one.
.../main/java/org/elasticsearch/xpack/inference/action/TransportUpdateInferenceModelAction.java
Outdated
Show resolved
Hide resolved
|
@DonalEvans Agreed on removing test duplication. I raised https://github.com/elastic/search-team/issues/13133 |
…cations * upstream/main: (60 commits) Use batches for other bulk vector benchmarks (elastic#143167) Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:lookup-join.MvJoinKeyOnTheLookupIndexAfterStats} elastic#143388 Mute org.elasticsearch.snapshots.ConcurrentSnapshotsIT testBackToBackQueuedDeletes elastic#143387 [Inference API] Parse endpoint metadata from persisted endpoints (elastic#143081) Add cluster formation doc to DistributedArchitectureGuide (elastic#143318) Fix flattened root block loader null expectation (elastic#143238) Unmute ValueSourceReaderTypeConversionTests testLoadAll (elastic#143189) ESQL: Add split coalescing for many small files (elastic#143335) Unmute mixed-cluster spatial parse warning test (elastic#143186) Fix zero-size estimate in BytesRefBlock null test (elastic#143258) Make DataType and DataFormat top-level enums (elastic#143312) Add support for steps to change the target index name for later steps (elastic#142955) Set mayContainDuplicates flag to test deduplication (elastic#143375) ESQL: Fix Driver search load millis as nanos bug (elastic#143267) Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:lookup-join.LookupJoinWithMixPushableAndUnpushableFilters} elastic#143378 ESQL: Forbid MV_EXPAND before full text functions (elastic#143249) ESQL: Fix unresolved name pattern (elastic#143210) Implement boxplot queryDSL aggregation for exponential_histograms (elastic#143026) Add prefetching to x64 bulk vector implementations (elastic#142387) Make large segment vector tests resilient to memory constraints (elastic#143366) ...
…stic#143081) This PR continues from elastic#141393. It wires in parsing the endpoint metadata from the persisted endpoints. This means endpoint metadata will now be available from the GET inference APIs. As we had to make `UnparsedModel` the way to parse persisted configs for EIS, I took the opportunity to refactor the `InferenceService.parsePersistedConfig` methods so that there is now only one, the one expecting an `UnparsedModel`. Also, I'm removing the duplication by pulling this method up in `SenderService`.
This PR continues from #141393. It wires in parsing the endpoint metadata from the persisted endpoints. This means endpoint metadata will now be available from the GET inference APIs.
As we had to make
UnparsedModelthe way to parse persisted configs for EIS, I took the opportunity to refactor theInferenceService.parsePersistedConfigmethods so that there is now only one, the one expecting anUnparsedModel.This is a nice clean up in that area. Even after this PR there is quite some duplication across services for their parse request/persistent config code. I'll follow up to clean further in separate PRs in the future.