Skip to content

Add support for safetensors when reading checkpoints#19

Closed
tjs-intel wants to merge 1 commit into
HabanaAI:habana-mainfrom
tjs-intel:support-safetensors
Closed

Add support for safetensors when reading checkpoints#19
tjs-intel wants to merge 1 commit into
HabanaAI:habana-mainfrom
tjs-intel:support-safetensors

Conversation

@tjs-intel
Copy link
Copy Markdown

@tjs-intel tjs-intel commented Feb 1, 2024

What does this PR do?

The TinyLlama model only has checkpoints in the form of model.safetensors. This checkpoint needs to be included in the list of checkpoints that is passed to DeepSpeed in order for the model to function properly when initialized with DeepSpeed.

This PR adds the safetensor checkpoints to the list of checkpoints passed to DeepSpeed.

Note: This change requires an upstream commit in microsoft/DeepSpeed to be merged downstream to HabanaAI/DeepSpeed in order for DeepSpeed to support the provided safetensor format.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@tjs-intel
Copy link
Copy Markdown
Author

PR applying patch to support safetensors to HabanaAI/DeepSpeed: HabanaAI/DeepSpeed#2

@tjs-intel
Copy link
Copy Markdown
Author

@dvarshney-habana please assign reviewers as appropriate

@ghost ghost requested a review from vivekgoe February 2, 2024 03:34
@vivekgoe
Copy link
Copy Markdown

vivekgoe commented Feb 2, 2024

@tjs-intel @dvarshney-habana this looks same as #15 . Please confirm, if its same then other one is already in queue to merge.

# Creates a list of paths from all downloaded files in cache dir
file_list = [str(entry) for entry in Path(cached_repo_dir).rglob("*.[bp][it][n]") if entry.is_file()]
path = Path(cached_repo_dir)
globs = ["*.bin", "*.pt", "*.safetensors"]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need some kind of if else check ; cached repo dir can have both .safetensors and .pt files.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code that is already there loads .bin and .pt files, can the cached repo have both of those as well? I might be misunderstanding something

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it can have both.

@tjs-intel
Copy link
Copy Markdown
Author

@vivekgoe I will confirm on Monday that #15 resolves my issue

@tjs-intel
Copy link
Copy Markdown
Author

Resolved with #15 which is already good to merge

@tjs-intel tjs-intel closed this Feb 2, 2024
@tjs-intel tjs-intel deleted the support-safetensors branch February 2, 2024 17:25
@tjs-intel tjs-intel restored the support-safetensors branch February 5, 2024 19:32
@tjs-intel tjs-intel mentioned this pull request Feb 5, 2024
3 tasks
astachowiczhabana pushed a commit that referenced this pull request Nov 29, 2024
User provided min,max value can be set to a dynamic tensor.
xinyu-intel pushed a commit that referenced this pull request Mar 4, 2025
User provided min,max value can be set to a dynamic tensor.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants