
Fix for Falcon-40b inference with deepspeed #502

Merged
regisss merged 2 commits into main from dev/schoi/falcon_ds on Nov 10, 2023

@schoi-habana
Collaborator

This PR fixes a dimension mismatch error when running Falcon-40B inference with DeepSpeed. Splitting the heads with self.num_kv_heads is not correct. This change reads the number of heads from the config instead of from self.num_heads, which DeepSpeed modifies.
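
To illustrate the kind of mismatch described above, here is a minimal sketch. It is not the code from this PR. The config values (128 attention heads, 8 KV heads, hidden size 8192), the class `FalconLikeConfig`, both helper functions, and the assumption that tensor parallelism halves the per-module head count while leaving the KV head count unchanged are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class FalconLikeConfig:
    # Hypothetical stand-in for the model config; values are assumed for illustration.
    num_attention_heads: int = 128
    num_kv_heads: int = 8
    hidden_size: int = 8192


def sharded_fused_qkv_width(config: FalconLikeConfig, world_size: int) -> int:
    """Width of the fused QKV output held by one rank after an even split."""
    head_dim = config.hidden_size // config.num_attention_heads
    full_width = (config.num_attention_heads + 2 * config.num_kv_heads) * head_dim
    return full_width // world_size


def count_kv_groups(fused_width: int, num_heads: int, num_kv_heads: int, head_dim: int) -> int:
    """Number of (queries..., key, value) groups that fit exactly in fused_width."""
    group_width = (num_heads // num_kv_heads + 2) * head_dim
    if fused_width % group_width != 0:
        raise ValueError(
            f"fused width {fused_width} is not a multiple of group width {group_width}"
        )
    return fused_width // group_width


config = FalconLikeConfig()
world_size = 2
head_dim = config.hidden_size // config.num_attention_heads
fused_width = sharded_fused_qkv_width(config, world_size)

# Assumed behaviour: the module attribute was divided by world_size,
# so the query-to-KV ratio derived from it is no longer the model's real ratio.
patched_num_heads = config.num_attention_heads // world_size

try:
    count_kv_groups(fused_width, patched_num_heads, config.num_kv_heads, head_dim)
    patched_value_works = True
except ValueError:
    patched_value_works = False

# Reading the head count from the config keeps the ratio intact.
groups_per_rank = count_kv_groups(
    fused_width, config.num_attention_heads, config.num_kv_heads, head_dim
)
```

Under these assumptions the ratio computed from the patched attribute no longer divides the sharded width evenly, while the ratio computed from the config does.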

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@schoi-habana schoi-habana requested a review from regisss November 1, 2023 06:01
@schoi-habana schoi-habana requested a review from a user November 1, 2023 06:01
@regisss
Collaborator

regisss commented Nov 1, 2023

I got an error at graph compilation when running:

python ../gaudi_spawn.py --use_deepspeed --world_size 2 run_generation.py --model_name_or_path tiiuae/falcon-40b --batch_size 2 --use_hpu_graphs --use_kv_cache --max_new_tokens 100

Do we need to merge #475 first for this to work?

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Nov 10, 2023

The documentation is not available anymore as the PR was closed or merged.

Collaborator

@regisss regisss left a comment

LGTM! It works now after rebasing on main.

@regisss regisss merged commit 0081a73 into main Nov 10, 2023
@regisss regisss deleted the dev/schoi/falcon_ds branch November 10, 2023 20:26