cuVS Cagra FP16 support #4384

Closed
jinsolp wants to merge 17 commits into facebookresearch:main from jinsolp:cuvs-cagra-fp16-pub

@jinsolp
Contributor

@jinsolp jinsolp commented Jun 10, 2025

Supporting fp16 for cuVS cagra, and introducing new extended APIs for this.
Discussions related to this issue: #4324

Added tests in faiss/gpu/test/TestGpuIndexCagra.cu and faiss/gpu/test/test_cagra.py for example usage.
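For readers unfamiliar with the trade-off behind fp16 storage, here is a minimal sketch using only the Python standard library. It does not call Faiss or cuVS; the dataset shape and the helper names `roundtrip_fp16` and `dataset_bytes` are hypothetical and chosen for illustration. It shows that half precision halves the raw vector storage and that values are rounded when stored.

```python
import struct


def roundtrip_fp16(value: float) -> float:
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def dataset_bytes(n: int, d: int, bytes_per_component: int) -> int:
    """Raw size of an n x d dataset, ignoring any index overhead."""
    return n * d * bytes_per_component


# Hypothetical dataset shape, not taken from the PR.
n, d = 1_000_000, 128

fp32_bytes = dataset_bytes(n, d, struct.calcsize("<f"))  # 4 bytes per component
fp16_bytes = dataset_bytes(n, d, struct.calcsize("<e"))  # 2 bytes per component
print(f"fp32: {fp32_bytes} bytes, fp16: {fp16_bytes} bytes")

# 0.1 is not exactly representable in half precision, so it comes back rounded.
print(f"0.1 stored as fp16 reads back as {roundtrip_fp16(0.1)!r}")
```

Whether the rounding matters for recall depends on the data, which is presumably why the PR adds tests rather than assuming equivalence.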

@facebook-github-bot
Contributor

Hi @jinsolp!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@jinsolp jinsolp changed the title from "Cuvs Cagra FP 16 support" to "cuVS Cagra FP16 support" Jun 10, 2025

@mnorris11

Hi @jinsolp can you complete the CLA? Then I can import it and run internal tests.

@jinsolp
Contributor Author

jinsolp commented Jun 11, 2025

@mnorris11 Sure! : ) Who should I be writing as the "Point of Contact"? What about "Schedule A" (list of designated employees)? Should I be writing myself in those sections?

@mnorris11

> @mnorris11 Sure! : ) Who should I be writing as the "Point of Contact"? What about "Schedule A" (list of designated employees)? Should I be writing myself in those sections?

Hmm, @cjnolet @tarang-jain do you remember, did you fill out the Individual or Company one for NVIDIA? If Company, did you email cla@meta.com to update it with additional folks, or do you usually just direct folks to the Individual option? I think there is no preference on our side.

If there is no NVIDIA "Company" CLA yet, feel free to start it @jinsolp and add yourself as Point of Contact and under Schedule A list of employees (along with Corey, Tarang, Tamas, and any others you deem should be added)

@tarang-jain
Contributor

@mnorris11 we were told to sign the individual one.

@jinsolp
Contributor Author

jinsolp commented Jun 12, 2025

@mnorris11 Signed!

@facebook-github-bot
Contributor

@mnorris11 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@mnorris11

Seems like the ROCM build fails, are the logs visible to you @jinsolp ?

@jinsolp
Contributor Author

jinsolp commented Jun 12, 2025

@mnorris11 yes I can see the logs, but I can't tell why it failed from the logs. Do you know how I can reproduce the results?

@mnorris11

Weird; seems like rocm hipification is having trouble with half syntax. But we do have other files with half being used in Faiss and those hipify fine. @jinsolp can you try including headers that other files use? You can search the codebase for half references in faiss/gpu code, then run the hipify script.

The cmake command to repro looks like this:

cmake -B build \
      -DBUILD_TESTING=ON \
      -DBUILD_SHARED_LIBS=ON \
      -DFAISS_ENABLE_GPU=ON \
      -DFAISS_ENABLE_CUVS=OFF \
      -DFAISS_ENABLE_ROCM=ON \
      -DFAISS_OPT_LEVEL=generic \
      -DFAISS_ENABLE_C_API=ON \
      -DPYTHON_EXECUTABLE=$CONDA/bin/python \
      -DCMAKE_BUILD_TYPE=Release \
      -DBLA_VENDOR=Intel10_64_dyn \
      -DCMAKE_CUDA_FLAGS="-gencode arch=compute_75,code=sm_75"

Meanwhile @ItsPitt do you have ideas on the AMD side of what to include for half to hipify? (Sorry, I would tag Johannes too but I am not finding his Github username...)

Error logs:

/__w/faiss/faiss/faiss/gpu-rocm/GpuIndex.hip:195:18: error: use of undeclared identifier 'half'
  195 |         dispatch(half{});
      |                  ^
/__w/faiss/faiss/faiss/gpu-rocm/GpuIndex.hip:252:18: error: use of undeclared identifier 'half'
  252 |         dispatch(half{});
      |                  ^
/__w/faiss/faiss/faiss/gpu-rocm/GpuIndex.hip:406:39: error: use of undeclared identifier 'half'
  406 |         auto vecs = toDeviceTemporary<half, 2>(
      |                                       ^
/__w/faiss/faiss/faiss/gpu-rocm/GpuIndex.hip:409:28: error: unknown type name 'half'
  409 |                 const_cast<half*>(static_cast<const half*>(x)),
      |                            ^
/__w/faiss/faiss/faiss/gpu-rocm/GpuIndex.hip:409:53: error: unknown type name 'half'
  409 |                 const_cast<half*>(static_cast<const half*>(x)),
      |                                                     ^
/__w/faiss/faiss/faiss/gpu-rocm/GpuIndex.hip:485:51: error: unknown type name 'half'
  485 |                                 static_cast<const half*>(x) + cur * this->d),
      |                                                   ^
/__w/faiss/faiss/faiss/gpu-rocm/GpuIndex.hip:485:61: error: arithmetic on a pointer to void
  485 |                                 static_cast<const half*>(x) + cur * this->d),
      |                                                          ~  ^
/__w/faiss/faiss/faiss/gpu-rocm/GpuIndex.hip:646:18: error: use of undeclared identifier 'half'
  646 |         dispatch(half{});

@jinsolp
Contributor Author

jinsolp commented Jun 12, 2025

I've added #include <faiss/gpu/utils/Float16.cuh> which seems to be including <hip/hip_fp16.h>. : )

@facebook-github-bot
Contributor

@mnorris11 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@jinsolp
Contributor Author

jinsolp commented Jun 12, 2025

@mnorris11 Looks like build&tests failed with a bunch of warnings, but I don't think I have access to the log 👀
Also, how can I run the facebook internal linter to pass the linter check?

@mnorris11

> @mnorris11 Looks like build&tests failed with a bunch of warnings, but I don't think I have access to the log 👀 Also, how can I run the facebook internal linter to pass the linter check?

It looks like just warnings on the internal end, so no worries, it is now just in review.

@facebook-github-bot
Contributor

@mnorris11 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@mnorris11 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@mnorris11 merged this pull request in 752b687.

Contributor

@mdouze mdouze left a comment

Sorry I did not follow this PR. See my comments inline.

idx_t n,
const void* x,
NumericType numeric_type,
const idx_t* xids) override;
Contributor

Maybe we should have taken the opportunity to also make the id sizes parameterizable. There are many use cases where int32 is more appropriate than int64
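To make the id-size point concrete, here is a minimal sketch of the arithmetic, using only the Python standard library. The helper names `id_array_bytes` and `fits_in_int32` are hypothetical and are not part of Faiss; the sketch only assumes that ids are stored as a flat array of signed integers.

```python
import struct

INT32_MAX = 2**31 - 1


def id_array_bytes(n: int, id_format: str) -> int:
    """Bytes needed to store n ids in the given struct format ('<i' or '<q')."""
    return n * struct.calcsize(id_format)


def fits_in_int32(n: int) -> bool:
    """Sequential ids 0..n-1 fit in a signed 32-bit integer if n-1 <= INT32_MAX."""
    return n - 1 <= INT32_MAX


n = 100_000_000  # hypothetical number of vectors
print("int64 ids:", id_array_bytes(n, "<q"), "bytes")
print("int32 ids:", id_array_bytes(n, "<i"), "bytes")
print("ids fit in int32:", fits_in_int32(n))
```

For any collection below roughly 2.1 billion vectors, 32-bit ids are wide enough and take half the space of 64-bit ids.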

Contributor

We can change that in a subsequent diff, but I would like to avoid having 3 different add_with_ids implementations.

n, d = x.shape
assert d == self.d
x = np.ascontiguousarray(x, dtype='float32')
if numeric_type == faiss.Float32:
Contributor

It is a bit clumsy that the Python interface is not able to directly accept np.float16 arguments because there is no way to tell if this will raise an error.

Contributor Author

@mdouze is x expected to be a numpy array? The docs here say "array-like", but since it calls x.shape, can I assume that this is a numpy array and access the dtype instead of getting numeric_type as an argument?

Contributor

It should be a numpy array, see https://github.com/facebookresearch/faiss/blob/main/faiss/python/swigfaiss.swig#L1163

Support for torch arrays is via another mechanism.
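The idea raised in this thread, deriving the numeric type from the array's dtype instead of passing it as a separate argument, could be sketched roughly as follows. This is not the Faiss implementation: the `NumericType` enum and `numeric_type_from_dtype` are hypothetical stand-ins, and the function takes the dtype name as a string so that the sketch runs without NumPy.

```python
from enum import Enum


class NumericType(Enum):
    """Hypothetical stand-in for the numeric-type tag discussed in this thread."""

    FLOAT32 = "float32"
    FLOAT16 = "float16"


def numeric_type_from_dtype(dtype_name: str) -> NumericType:
    """Map a dtype name to a numeric-type tag, rejecting anything unsupported.

    Raising here, before any native call is made, gives the caller an
    explicit error instead of a silent conversion to float32.
    """
    try:
        return NumericType(dtype_name)
    except ValueError:
        raise TypeError(
            f"unsupported dtype {dtype_name!r}; expected 'float32' or 'float16'"
        ) from None


print(numeric_type_from_dtype("float16"))
print(numeric_type_from_dtype("float32"))
```

With a NumPy array `x`, the call would be `numeric_type_from_dtype(x.dtype.name)`, since `x.dtype.name` is `'float16'` for an `np.float16` array.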

    x = np.ascontiguousarray(x, dtype='float32')
else:
    x = np.ascontiguousarray(x, dtype='float16')
self.add_c(n, swig_ptr(x))
Contributor

this will not work in the fp16 case because it will go through the regular add method, not the one with numeric_type
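A minimal sketch of why this matters, using only the Python standard library and no Faiss calls: if a buffer of half-precision values reaches a code path that assumes 32-bit floats, the same bytes are read as half as many values, and those values are wrong. The variable names are illustrative.

```python
import struct

values = [1.0, 2.0, 3.0, 4.0]

# Four half-precision values occupy 8 bytes.
fp16_buffer = struct.pack("<4e", *values)

# A reader that assumes float32 sees only two 4-byte values in those 8 bytes,
# and neither matches the original data.
misread = struct.unpack("<2f", fp16_buffer)

# A reader that knows the buffer is float16 recovers the original values.
reread = struct.unpack("<4e", fp16_buffer)

print("bytes in buffer:", len(fp16_buffer))
print("read as float32:", misread)
print("read as float16:", reread)
```

The numeric type therefore has to travel with the pointer all the way to the native call.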

@jinsolp
Contributor Author

jinsolp commented Jun 24, 2025

Thanks for the feedback @mdouze ! It looks like the PR is merged already. I'll open up a new PR with follow-ups.

@jinsolp
Contributor Author

jinsolp commented Jun 25, 2025

@mdouze Changes and fixes for python API are reflected in this PR #4411

@jinsolp jinsolp deleted the cuvs-cagra-fp16-pub branch July 11, 2025 17:28
facebook-github-bot pushed a commit that referenced this pull request Jul 22, 2025
Summary:
This PR does 2 things
- Enable support for `IndexIDMap` with Cagra fp16 (original support introduced in #4188)
    - Added tests in `test_cagra.py`
- Reflecting feedback about python API from #4384 (comment)

Pull Request resolved: #4411

Reviewed By: junjieqi

Differential Revision: D78695771

Pulled By: mnorris11

fbshipit-source-id: 4b3a0869bed5d33165354f415c748812b0d4b253
dian-lun-lin pushed a commit to ahuber21/faiss that referenced this pull request Jul 30, 2025
samanthawaters8882michaeldonovan added a commit to samanthawaters8882michaeldonovan/faiss that referenced this pull request Oct 12, 2025
Summary:
Supporting fp16 for cuVS cagra, and introducing new extended APIs for this.
Discussions related to this issue: facebookresearch/faiss#4324

Added tests in `faiss/gpu/test/TestGpuIndexCagra.cu` and `faiss/gpu/test/test_cagra.py` for example usage.

Pull Request resolved: facebookresearch/faiss#4384

Reviewed By: junjieqi

Differential Revision: D76480612

Pulled By: mnorris11

fbshipit-source-id: 863d8671eab461733110f74550ffc56650f77407
samanthawaters8882michaeldonovan added a commit to samanthawaters8882michaeldonovan/faiss that referenced this pull request Oct 12, 2025