
Conversation

Contributor

@younesbelkada younesbelkada commented Jun 17, 2022

What does this PR do?

Thanks to @justheuristic 's contribution, the alibi tensor is now created and communicated more efficiently during the forward pass. The tests pass, but this still remains an experimental feature.

cc @justheuristic

This will probably break with accelerate offloading, because we now initialise the alibi tensor only once, at the beginning of the forward pass, on the device of the first hidden states. In the previous version we dynamically changed alibi's device, which worked fine with accelerate offloading.

- deals better with padded batched input
- avoids useless CPU/GPU communication when creating alibi
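The two points above can be sketched together: build the per-head slopes and the padding-aware position indices directly on the attention mask's device, so no CPU tensor needs to be transferred and padded positions get a zero bias. This is a minimal illustration, assuming `num_heads` is a power of two (the real model also handles the non-power-of-two case):

```python
import math
import torch

def build_alibi_tensor(attention_mask: torch.Tensor, num_heads: int,
                       dtype: torch.dtype) -> torch.Tensor:
    """Build an ALiBi bias on the same device as the attention mask.

    attention_mask: (batch_size, seq_length), 1 for real tokens, 0 for padding.
    Returns a tensor of shape (batch_size * num_heads, 1, seq_length).
    """
    batch_size, seq_length = attention_mask.shape
    # Per-head slopes form a geometric sequence; this simplified sketch
    # assumes num_heads is a power of two.
    base = 2 ** (-(2 ** -(math.log2(num_heads) - 3)))
    slopes = torch.pow(
        torch.tensor(base, device=attention_mask.device),
        torch.arange(1, 1 + num_heads, device=attention_mask.device),
    )
    # Position indices that skip padding: cumulative sum of the mask,
    # shifted so the first real token sits at 0, and zeroed on pad tokens.
    positions = ((attention_mask.cumsum(dim=-1) - 1) * attention_mask)[:, None, :]
    alibi = slopes[..., None] * positions  # broadcast to (batch, heads, seq)
    return alibi.reshape(batch_size * num_heads, 1, seq_length).to(dtype)
```

Because everything is derived from `attention_mask`, the tensor is created once per forward pass on the right device, and left-padded tokens contribute zero bias.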

Co-authored-by: justheuristic <justheuristic@gmail.com>
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jun 17, 2022

The documentation is not available anymore as the PR was closed or merged.

Collaborator

@sgugger sgugger left a comment


Could you elaborate on what's breaking? I see the alibi tensor is created at each forward pass and is now put on the same device as the hidden_states, which actually seems clean!

It doesn't break model parallelism/offload: when you call each submodule with the alibi tensor as an input, the submodule places it on the right device before executing its forward. I checked the corresponding tests and they all pass on this branch.
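The device-placement argument above can be sketched as follows. This is a hypothetical illustration (not the accelerate or Transformers implementation): under naive model parallelism each layer may live on a different device, so inputs created once at the start of the forward pass, such as the alibi bias, are simply moved onto each layer's device right before that layer runs:

```python
import torch
import torch.nn as nn

def forward_through_layers(hidden_states, alibi, layers):
    # Hypothetical sketch: each layer moves the shared inputs onto its
    # own device before computing, so a single alibi tensor created once
    # still works even when layers are spread across devices.
    for layer in layers:
        device = next(layer.parameters()).device
        hidden_states = hidden_states.to(device)
        alibi = alibi.to(device)
        hidden_states = layer(hidden_states + alibi)
    return hidden_states
```

With every layer on the same device the `.to(device)` calls are no-ops, which is why creating alibi once on the first hidden states' device costs nothing in the single-device case.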

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
@younesbelkada
Contributor Author

Great, thanks! I think you are right :) I will merge it as soon as the lights are all green 🟢

@younesbelkada younesbelkada deleted the bloom-enhance-alibi-creation branch June 24, 2022 14:34
@younesbelkada
Contributor Author

It looks like a bad rebase happened; I moved the PR to #17866.
