AVX2 popcount implementation #69

Closed
Charlyo opened this issue Jan 24, 2024 · 5 comments
@Charlyo

Charlyo commented Jan 24, 2024

Right now, for x86, it seems there are only serial and AVX-512 popcount-based Hamming implementations.

Could you also implement an AVX2-based one? An example can be found here.

Thank you very much.

@ashvardanian
Owner

Yes, that can be done. I am a bit overloaded next week. Any chance you can open a PR for this?

Looking at the replies in the thread, Wojciech's variant looks great. We just need to adjust the style, remove loop unrolling, and add references to the original.
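For readers unfamiliar with the technique, here is a minimal sketch of an AVX2 Hamming distance built on the nibble-lookup (`vpshufb`) popcount. It is an assumption that this matches the variant discussed in the linked thread, and it is not the code that ended up in the library. All function names and the `SKETCH_HAS_AVX2` macro are hypothetical, and the runtime dispatch relies on GCC/Clang extensions (`__attribute__((target))`, `__builtin_cpu_supports`).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guard: only build the AVX2 path on x86 with GCC or Clang. */
#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define SKETCH_HAS_AVX2 1
#else
#define SKETCH_HAS_AVX2 0
#endif

/* Portable baseline: XOR each byte pair, then clear set bits one at a time. */
static size_t hamming_serial_b8(const uint8_t *a, const uint8_t *b, size_t n_bytes) {
    size_t differences = 0;
    for (size_t i = 0; i < n_bytes; ++i) {
        uint8_t x = (uint8_t)(a[i] ^ b[i]);
        while (x) {
            x &= (uint8_t)(x - 1); /* drops the lowest set bit */
            ++differences;
        }
    }
    return differences;
}

#if SKETCH_HAS_AVX2
/* AVX2 path: per-nibble table lookup, no loop unrolling. */
__attribute__((target("avx2")))
static size_t hamming_avx2_b8(const uint8_t *a, const uint8_t *b, size_t n_bytes) {
    /* Popcount of every 4-bit value, repeated for both 128-bit lanes. */
    const __m256i lookup = _mm256_setr_epi8(
        0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
        0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4);
    const __m256i low_mask = _mm256_set1_epi8(0x0F);
    __m256i totals = _mm256_setzero_si256(); /* four 64-bit partial sums */
    size_t i = 0;
    for (; i + 32 <= n_bytes; i += 32) {
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        __m256i x = _mm256_xor_si256(va, vb);
        __m256i lo = _mm256_and_si256(x, low_mask);
        __m256i hi = _mm256_and_si256(_mm256_srli_epi16(x, 4), low_mask);
        /* Each byte now holds a count in 0..8. */
        __m256i counts = _mm256_add_epi8(_mm256_shuffle_epi8(lookup, lo),
                                         _mm256_shuffle_epi8(lookup, hi));
        /* Sum groups of 8 bytes into 64-bit lanes to avoid overflow. */
        totals = _mm256_add_epi64(totals, _mm256_sad_epu8(counts, _mm256_setzero_si256()));
    }
    uint64_t lanes[4];
    _mm256_storeu_si256((__m256i *)lanes, totals);
    size_t differences = (size_t)(lanes[0] + lanes[1] + lanes[2] + lanes[3]);
    /* The tail shorter than 32 bytes goes through the serial routine. */
    return differences + hamming_serial_b8(a + i, b + i, n_bytes - i);
}
#endif

/* Picks the AVX2 path at runtime when the CPU reports support for it. */
static size_t hamming_b8(const uint8_t *a, const uint8_t *b, size_t n_bytes) {
#if SKETCH_HAS_AVX2
    if (__builtin_cpu_supports("avx2")) return hamming_avx2_b8(a, b, n_bytes);
#endif
    return hamming_serial_b8(a, b, n_bytes);
}
```

Calling `hamming_b8(a, b, n_bytes)` on two equally sized bit-packed buffers returns the number of differing bits. On CPUs or compilers without AVX2 support it falls back to the serial loop.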

@Charlyo
Author

Charlyo commented Jan 24, 2024

I'm not proficient in C or C++. I would rather let someone more experienced do the job (if that's OK).

@jianshu93

Hello both,

I believe libpopcnt.h has all AVX implementations of popcount: https://github.com/kimwalisch/libpopcnt

There is no need to implement an additional one. However, I think including it in this library could be useful.

Jianshu

ashvardanian self-assigned this Jan 24, 2024
ashvardanian added the "good first issue" (Good for newcomers) label Jan 29, 2024
@ashvardanian
Owner

It's a good idea to add popcount, and libpopcnt looks nice, but we only need one routine for the AVX2 Harley-Seal transform. It would be easier to add those few lines of C code than to add the first dependency and update all of CI. Coincidentally, ClickHouse and other users have expressed interest in bit-level operations, so I'm definitely open to PRs 🤗
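As a rough illustration of the Harley-Seal idea, here is a scalar sketch on 64-bit words. It is not the library's code, and an AVX2 version would apply the same carry-save steps to 256-bit registers instead. Carry-save adders compress eight XOR-ed words into `ones`, `twos`, `fours`, and `eights` bit-planes, so only one popcount is needed per eight words. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable SWAR popcount of one 64-bit word. */
static uint64_t popcount_u64(uint64_t x) {
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return (x * 0x0101010101010101ULL) >> 56;
}

/* Carry-save adder: per bit position, a + b + c == 2 * high + low. */
static void carry_save_add(uint64_t *high, uint64_t *low, uint64_t a, uint64_t b, uint64_t c) {
    uint64_t u = a ^ b;
    *high = (a & b) | (u & c);
    *low = u ^ c;
}

/* Hamming distance between two bitsets stored as 64-bit words. */
static uint64_t hamming_harley_seal_u64(const uint64_t *a, const uint64_t *b, size_t n_words) {
    uint64_t ones = 0, twos = 0, fours = 0;
    uint64_t twos_a, twos_b, fours_a, fours_b, eights;
    uint64_t total = 0;
    size_t i = 0;
    for (; i + 8 <= n_words; i += 8) {
        uint64_t x[8];
        for (size_t k = 0; k < 8; ++k) x[k] = a[i + k] ^ b[i + k];
        carry_save_add(&twos_a, &ones, ones, x[0], x[1]);
        carry_save_add(&twos_b, &ones, ones, x[2], x[3]);
        carry_save_add(&fours_a, &twos, twos, twos_a, twos_b);
        carry_save_add(&twos_a, &ones, ones, x[4], x[5]);
        carry_save_add(&twos_b, &ones, ones, x[6], x[7]);
        carry_save_add(&fours_b, &twos, twos, twos_a, twos_b);
        carry_save_add(&eights, &fours, fours, fours_a, fours_b);
        total += popcount_u64(eights); /* each bit here stands for 8 set bits */
    }
    /* Weight the remaining bit-planes by the number of bits they represent. */
    total = 8 * total + 4 * popcount_u64(fours) + 2 * popcount_u64(twos) + popcount_u64(ones);
    /* Leftover words that did not fill a block of eight. */
    for (; i < n_words; ++i) total += popcount_u64(a[i] ^ b[i]);
    return total;
}
```

The result should match a plain per-word popcount loop. The benefit is fewer popcount operations, which matters on hardware without a vector popcount instruction.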

ashvardanian added a commit that referenced this issue Mar 3, 2024
ashvardanian pushed a commit that referenced this issue Mar 4, 2024
# [3.9.0](v3.8.1...v3.9.0) (2024-03-04)

### Add

* Complex numbers support ([0a0665a](0a0665a))
* Hamming & Jaccard for pre-AVX512 CPUs ([4f1eba1](4f1eba1)), closes [#69](#69)
* Rust binary distances ([960af05](960af05)), closes [#84](#84)

### Fix

* `datatype` variable repeated ([8558c4a](8558c4a))
* VNNI casting on AVX-512 ([c4398d1](c4398d1)), closes [#91](#91)

### Improve

* Python type inference ([227de70](227de70))

### Make

* Bump ip from 2.0.0 to 2.0.1 (#92) ([559a16d](559a16d)), closes [#92](#92)
@ashvardanian
Owner

🎉 This issue has been resolved in version 3.9.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

3 participants