[Feature Request]: Allow specification of a custom model inference method for a RunInference ModelHandler #22572
Comments
This also applies to scikit-learn. For example, RandomForestClassifier has a `predict_proba` method in addition to `predict`.
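(For concreteness, a small self-contained example of the two methods; `predict` is what the handler calls today, while a user might want `predict_proba` instead:)

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=100, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X, y)

labels = clf.predict(X)       # the fixed method the sklearn handler invokes
probs = clf.predict_proba(X)  # an alternative a user might want to call
```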
Parent issue: #22117
I think this would be difficult to do in a general (cross-ModelHandler) way, as each ModelHandler is responsible for invoking its model, and they currently have different ways of doing so. sklearn calls a predict method:
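(A simplified sketch of that call path, assuming the NumPy-based handler; the actual Beam source differs in details:)

```python
import numpy
from apache_beam.ml.inference.base import PredictionResult

# Simplified sketch (not the exact Beam source): the method name `predict`
# is hard-coded into the handler's inference step.
def run_inference(batch, model):
    vectorized_batch = numpy.stack(batch, axis=0)
    predictions = model.predict(vectorized_batch)
    return [PredictionResult(x, y) for x, y in zip(batch, predictions)]
```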
pytorch calls the model like a callable (which then uses the forward method IIUC?):
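(Again a simplified sketch rather than the exact source:)

```python
import torch

# Simplified sketch (not the exact Beam source): the model instance is
# invoked directly, which routes through torch.nn.Module.__call__ -> forward().
def run_inference(batch, model):
    batched_tensor = torch.stack(batch)
    with torch.no_grad():
        predictions = model(batched_tensor)
    return predictions
```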
I think the best we could do to solve the problem generally is to establish some kind of convention. It's also worth noting that the
Is a separate generation ModelHandler a better solution?
I could imagine three options:
Personally, I'd prefer option three. Something like this:
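(The original snippet isn't preserved here; a minimal sketch of what option three might look like, where `inference_fn` is the hypothetical new parameter and the T5 names and path are illustrative placeholders:)

```python
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor
from transformers import T5Config, T5ForConditionalGeneration

# Hypothetical API sketch: `inference_fn` is the proposed parameter, not an
# existing one; the state-dict path is an illustrative placeholder.
model_handler = PytorchModelHandlerTensor(
    state_dict_path="gs://my-bucket/t5_state_dict.pth",
    model_class=T5ForConditionalGeneration,
    model_params={"config": T5Config.from_pretrained("t5-small")},
    inference_fn=lambda model, batch: model.generate(batch))
```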
WDYT?
Correct. And thanks @agvdndor for the detailed suggestions!
There are some other workarounds that users could apply in the meantime; one in that spirit is sketched below. Would these be sufficient solutions to this?
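(The specific workarounds aren't preserved above; this sketch assumes subclassing a handler is acceptable, and uses CLIP's real `encode_image` method as the alternate call:)

```python
import torch
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor

# Sketch of one possible workaround (not necessarily the ones referenced
# above): subclass the handler and override its inference step.
class ClipImageEncoderHandler(PytorchModelHandlerTensor):
    def run_inference(self, batch, model, inference_args=None):
        with torch.no_grad():
            # Call the alternate method instead of the model's forward pass.
            return model.encode_image(torch.stack(batch))
```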
Generally, my take is that we should do option 3 here and allow users to pass in a custom function. Basically:
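(A sketch of the shape this could take inside a handler; the names are illustrative, not a final API:)

```python
import numpy

# Illustrative sketch: the handler stores a default inference function and
# lets users substitute their own at construction time.
def _default_sklearn_inference(model, batch):
    return model.predict(numpy.stack(batch, axis=0))

class CustomizableModelHandler:
    def __init__(self, model_uri, inference_fn=_default_sklearn_inference):
        self._model_uri = model_uri
        self._inference_fn = inference_fn

    def run_inference(self, batch, model, inference_args=None):
        # Delegate to whichever function was configured.
        return self._inference_fn(model, batch)
```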
@jrmccluskey could you pick this one up when you have the bandwidth?
Looking into this a little bit, it's doable for each handler type, but the end result is somewhat restrictive for the user. The provided function is going to have to take the same arguments in the same positions as the current inference methods. For the examples discussed this isn't a huge issue (unless HuggingFace users really want to use the 30+ optional arguments that generate() accepts).

It also looks like providing the alternate inference function will need to be done at run_inference call-time, not handler init-time, since the scikit-learn and PyTorch approaches use methods bound to specific instances of their respective models. Can't specify the function until you have the model, unless I'm missing something.
I'm not 100% sure this is true. For example, I could imagine an approach where we let users pass in some sort of function like:
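(A sketch of such a function; because it receives the loaded model as an argument, it can be configured at handler-construction time, before any model instance exists. The signature is illustrative:)

```python
import torch

# Illustrative signature: the handler would pass in the loaded model, so the
# user never needs a model instance when configuring the handler.
def generate_inference_fn(model, batch, inference_args=None):
    inputs = torch.stack(batch)
    return model.generate(inputs, **(inference_args or {}))
```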
You could probably do something with …

Thoughts?
I've put together a brief doc discussing my perspective and preferred solution for this here: https://docs.google.com/document/d/1YYGsF20kminz7j9ifFdCD5WQwVl8aTeCo0cgPjbdFNU/edit?usp=sharing PTAL
@jrmccluskey could you please file a follow-up issue to update our notebooks to use this feature once it's released?
Filed as #24334
Thanks!
What would you like to happen?
The current implementation of RunInference provides model handlers for PyTorch and Sklearn models. These handlers assume that the method to call for inference is fixed:

- PyTorch: the `__call__` method -> `output = torch_model(input)`
- Sklearn: the `predict` method -> `output = sklearn_model.predict(input)`

However, in some cases we want to provide a custom method for RunInference to call. Two examples:
- A number of pretrained models loaded with the Huggingface transformers library recommend using the `generate()` method (see the Huggingface docs on the T5 model).
- Using OpenAI's CLIP model, which is implemented as a torch model, we might not want to execute the normal forward pass to encode both images and text (`image_embedding, text_embedding = clip_model(image, text)`) but instead only compute the image embeddings (`image_embedding = clip_model.encode_image(image)`).
Solution: Allowing the user to specify the `inference_fn` when creating a ModelHandler would enable this usage.
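(A sketch of how that might look for the CLIP example, assuming the hypothetical `inference_fn` parameter; the model class, path, and input collection are placeholders:)

```python
import torch
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor

# Hypothetical usage: `inference_fn` does not exist yet, and ClipModel,
# the state-dict path, and image_tensors are illustrative placeholders.
handler = PytorchModelHandlerTensor(
    state_dict_path="gs://my-bucket/clip_state_dict.pth",
    model_class=ClipModel,
    inference_fn=lambda model, batch: model.encode_image(torch.stack(batch)))

with beam.Pipeline() as pipeline:
    _ = (pipeline
         | beam.Create(image_tensors)  # placeholder input PCollection
         | RunInference(handler))
```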
Issue Priority
Priority: 2
Issue Component
Component: sdk-py-core