Issues:

- Incomplete testing in `testLargeTensor`
  - Location: `tests/L0/run_optimizers/test_adam.py`
  - Description: The test is meant to check the correctness of `FusedAdam` by calling `step()` on two large tensors with the same gradient, one updated by `FusedAdam` and the other by `torch.optim.Adam`. However, the test only invoked `step()` on the first optimizer.
- Type overflow in `TensorListMetadata`
  - Location: `csrc/multi_tensor_apply.cuh`
  - Description: The `sizes[]` and `block_to_chunk[]` arrays within `TensorListMetadata` were declared as `int`. This led to overflow when handling tensors whose length exceeds `INT_MAX`.
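
To make the overflow condition concrete, here is a minimal host-side sketch of the range check involved. The helper names `fits_in_int32` and `checked_narrow` are hypothetical and do not appear in the PR; they only illustrate why a length above `INT_MAX` cannot be stored in an `int` field.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of apex): true when an element count can be
// stored in a 32-bit signed integer without overflow.
constexpr bool fits_in_int32(std::int64_t numel) {
    return numel >= 0 && numel <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper: narrows a length to 32 bits, asserting that it fits.
// An int-typed sizes[] entry performs this narrowing implicitly, with no check.
inline std::int32_t checked_narrow(std::int64_t numel) {
    assert(fits_in_int32(numel) && "tensor length exceeds INT_MAX");
    return static_cast<std::int32_t>(numel);
}

// INT_MAX itself still fits; one element more does not.
static_assert(fits_in_int32(2147483647LL), "INT_MAX fits in int32");
static_assert(!fits_in_int32(2147483648LL), "INT_MAX + 1 does not fit in int32");
```

A tensor with 2^31 elements is therefore already past the limit of the original `int` fields.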
Solution:

- Added an optimizer step for `optimizer2` within `testLargeTensor` of `tests/L0/run_optimizers/test_adam.py`, so that both optimizers are stepped and large tensor operations are actually compared.
- Refactored `TensorListMetadata` in `csrc/multi_tensor_apply.cuh`:
  - Modified the template to accept either `int32_t` or `int64_t` sizes, while staying backward compatible with the existing declaration style (`TensorListMetadata`).
  - Added `multi_tensor_apply64` to handle `int64_t` size indexing for Adam. The new function mirrors `multi_tensor_apply` but checks against `depth_to_max_tensors64` to support large tensors.
- Modified `csrc/multi_tensor_adam.cu` to invoke `multi_tensor_apply64` and use the metadata structure that matches `index_t`.
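
The template change can be sketched roughly as follows. This is a simplified host-side illustration, not the code from `csrc/multi_tensor_apply.cuh`: the struct name `TensorListMetadataSketch`, the capacity constants `kMaxTensors` and `kMaxBlocks`, the `index_type` alias, and the `set_size` helper are all assumptions made for the example. The point it shows is that a defaulted `index_t` parameter lets existing one-argument declarations keep compiling while new callers opt into 64-bit sizes.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical capacities; the real limits are defined in the apex sources.
constexpr int kMaxTensors = 36;
constexpr int kMaxBlocks = 320;

// index_t defaults to int, so TensorListMetadataSketch<n> keeps its old meaning.
template <int n, typename index_t = int>
struct TensorListMetadataSketch {
    using index_type = index_t;
    void* addresses[n][kMaxTensors];
    index_t sizes[kMaxTensors];                 // per-tensor element counts
    unsigned char block_to_tensor[kMaxBlocks];
    index_t block_to_chunk[kMaxBlocks];         // chunk index handled by each block
    int start_tensor_this_launch;
};

using Meta32 = TensorListMetadataSketch<4>;                 // legacy spelling
using Meta64 = TensorListMetadataSketch<4, std::int64_t>;   // large-tensor variant

static_assert(std::is_same<Meta32::index_type, int>::value,
              "default index type stays int");
static_assert(std::is_same<Meta64::index_type, std::int64_t>::value,
              "64-bit variant stores int64_t sizes");

// Hypothetical helper: records a tensor length in the given slot.
template <typename Meta>
void set_size(Meta& meta, int slot, std::int64_t numel) {
    assert(slot >= 0 && slot < kMaxTensors);
    meta.sizes[slot] = static_cast<typename Meta::index_type>(numel);
}
```

With `Meta64`, a length such as 2^32 is stored exactly, whereas the `Meta32` form would have to narrow it to `int`.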