Add Int4 and UInt4 support for Cast #24973
Add Int4 and UInt4 support for `Cast`.

A few QNN pipelines were [failing](https://aiinfra.visualstudio.com/PublicPackages/_build/results?buildId=841810&view=logs&j=9d976e38-31ec-50dd-b1f8-279fbf889fca&t=85ed5ad3-b72a-52c3-abe0-a87b66004fd0&l=1773) for this PR. They are fixed by the onnx PR [Update input and output tensors in pb files to match the model](onnx/onnx#7074). The problem is that onnxruntime, which uses onnx as a submodule in `cmake/external/onnx`, points to the latest onnx release (1.18.0), and running the pipelines with my onnx fix would require pointing to a newer version.

This PR therefore skips the tests that are currently failing in the QNN pipelines. The alternatives are not practical: we can't update the onnx submodule to point to a non-release commit, waiting for a new onnx release might take a long time, and creating a patch under [onnxruntime/cmake/patches/onnx](https://github.com/microsoft/onnxruntime/tree/main/cmake/patches/onnx) with the changes from my onnx PR is tricky because that fix changes some binary files.
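For readers unfamiliar with 4-bit tensors, here is a minimal Python sketch of what a cast to Int4 or UInt4 has to deal with: narrowing each value to 4 bits and storing two elements per byte, with the first element in the low nibble (the layout ONNX documents for its 4-bit types). This is not the code added by this PR. The helper names are hypothetical, and the round-then-saturate behavior for out-of-range inputs is an assumption made for illustration.

```python
def saturate_to_4bit(x: float, signed: bool = True) -> int:
    """Round to the nearest integer, then clamp to the 4-bit range.

    Assumption: out-of-range inputs saturate; this sketch does not claim
    to match what the operator actually does for such inputs.
    """
    lo, hi = (-8, 7) if signed else (0, 15)
    return max(lo, min(hi, round(x)))


def pack_4bit(values: list) -> bytes:
    """Pack 4-bit integers two per byte, first element in the low nibble."""
    out = bytearray((len(values) + 1) // 2)
    for i, v in enumerate(values):
        nibble = v & 0x0F  # two's-complement low 4 bits
        out[i // 2] |= nibble << (4 * (i % 2))
    return bytes(out)


def unpack_4bit(data: bytes, count: int, signed: bool = True) -> list:
    """Inverse of pack_4bit; sign-extends each nibble when signed is True."""
    result = []
    for i in range(count):
        nibble = (data[i // 2] >> (4 * (i % 2))) & 0x0F
        if signed and nibble >= 8:
            nibble -= 16
        result.append(nibble)
    return result


floats = [-20.0, 9.6, -1.0]
as_int4 = [saturate_to_4bit(x) for x in floats]
packed = pack_4bit(as_int4)
roundtrip = unpack_4bit(packed, len(as_int4))
```

An odd element count leaves the high nibble of the last byte as zero padding, which is why `unpack_4bit` needs the element count and cannot infer it from the byte length.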