Fix simd::gatherBits for Mac M1 or when AVX2 is disabled#8415
icejoywoo wants to merge 3 commits into facebookincubator:main
mbasmanova
left a comment
@Yuhta Jimmy, would you help review this PR?
velox/common/base/SimdUtil.cpp
Outdated
Is this a copy-paste from DecoderUtil::nonNullRowsFromSparse? Would it be possible to refactor to avoid that?
Yes, I copied the code structure from DecoderUtil::nonNullRowsFromSparse.
Maybe we can refactor simd::gather8Bits so that it does what its name says, but I'm not sure whether that is acceptable. For now, I want to change as little as possible to fix this and minimize the impact.
I don't think gather8Bits can be changed, because the indices come from a single register. To remove the duplication I created #8416. @icejoywoo, you can rebase on my commit and use the new function. @mbasmanova, can you review #8416?
@Yuhta OK, I will rebase the code to use the new function bits::storeBitsToByte.
Summary: We will need this to fix `simd::gatherBits` for non-AVX cases. See facebookincubator#8415 Differential Revision: D52837098
@Yuhta I have already rebased and now use the new function.
@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Conbench analyzed the 1 benchmark run on the commit. There were no benchmark performance regressions. 🎉 The full Conbench report has more details.
Fixes #8377
When AVX2 is disabled, or when the code runs on a Mac M1 (arm64), simd::gatherBits behaves incorrectly.
The fix is adapted from DecoderUtil::nonNullRowsFromSparse.
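For readers who want to check the expected behavior without SIMD, here is a minimal scalar sketch of gathering bits by index. It is not the code in this PR and not the Velox API: the name `gatherBitsScalar` and its signature are hypothetical, and it assumes that the bit at position `indices[i]` of `bits` should end up at bit `i` of `result`, with bit 0 being the least significant bit of each 64-bit word.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical portable reference, not the Velox implementation.
// Copies the bit at position indices[i] in `bits` to bit i of `result`.
// Bits of `result` at positions >= numIndices are left unchanged.
inline void gatherBitsScalar(
    const uint64_t* bits,
    const int32_t* indices,
    size_t numIndices,
    uint64_t* result) {
  for (size_t i = 0; i < numIndices; ++i) {
    // Assumption: indices are non-negative bit offsets into `bits`.
    assert(indices[i] >= 0);
    const auto index = static_cast<uint32_t>(indices[i]);

    // Read one source bit from the word that contains it.
    const uint64_t bit = (bits[index / 64] >> (index % 64)) & 1ULL;

    // Set or clear the destination bit so stale data does not leak through.
    uint64_t& word = result[i / 64];
    const uint64_t mask = 1ULL << (i % 64);
    word = bit ? (word | mask) : (word & ~mask);
  }
}
```

A loop like this is slow compared to a vectorized gather, but it does not depend on AVX2 and could serve as a reference when comparing results across x86 and arm64 builds.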