Added parallel device usage for GPT-J #22713

Merged: sgugger merged 1 commit into huggingface:main from jprivera44:jp-model-parallel on Apr 12, 2023
@jprivera44

Contributor

What does this PR do?

This PR is part of issue #22561 and is related to issue #22535, which concerns model parallelism. It fixes a device mismatch in GPT-J, where tensors could end up on different devices and raise an error. The fix ensures that all tensors are on the same device.

Test case:

```python
import torch
from transformers import GPT2Tokenizer, GPTJForSequenceClassification

# Set up the tokenizer and model
tokenizer = GPT2Tokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = GPTJForSequenceClassification.from_pretrained("EleutherAI/gpt-j-6B")

# Move the model to the GPU
model.to("cuda:0")

# Set up the text
text = "this is an example of text for device mismatch for GPT-J"

inputs = tokenizer(text, return_tensors="pt")

# The model is already on cuda:0, so move the inputs there as well
for k, v in inputs.items():
    inputs[k] = v.to("cuda:0")

# Keep the labels on the CPU so they sit on a different device than the model
labels = torch.tensor([1]).to("cpu")

# Forward pass
outputs = model(**inputs, labels=labels)
```

Running this code without the fix reproduces the issue with the following error: "RuntimeError: Expected all tensors to be on the same device, ...". With the fix applied, the error no longer occurs and the model keeps all tensors on the same device, as expected.
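For readers who want to see the general pattern, here is a minimal sketch of how a loss computation can align devices. It is not the actual diff in this PR: it assumes the mismatch is resolved by moving `labels` to the device of the logits before the loss is computed, and `loss_on_logits_device` is a hypothetical helper introduced only for illustration.

```python
import torch
from torch import nn


def loss_on_logits_device(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: compute cross-entropy after aligning devices."""
    # Move the labels to wherever the logits ended up. Under model parallelism
    # the final layer may live on a different device than the caller's tensors.
    labels = labels.to(logits.device)
    loss_fct = nn.CrossEntropyLoss()
    return loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))


# Use a GPU for the logits when one is available, otherwise fall back to the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
logits = torch.randn(1, 2, device=device)  # shape: (batch, num_labels)
labels = torch.tensor([1])                 # created on the CPU
loss = loss_on_logits_device(logits, labels)
```

Moving the labels to the logits' device, rather than to a fixed device such as `cuda:0`, keeps the sketch independent of where the last layer happens to be placed.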

Fixes # 22561

Motivation and Context

This PR is part of the effort to make all Transformers models compatible with model parallelism; it covers GPT-J specifically.

Who can review?

@sgugger

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Apr 11, 2023

The documentation is not available anymore as the PR was closed or merged.

@sgugger sgugger left a comment

Collaborator

Perfect, thanks a lot!

@sgugger sgugger merged commit 17503b0 into huggingface:main Apr 12, 2023
novice03 pushed a commit to novice03/transformers that referenced this pull request Jun 23, 2023