Enable Falcon FP8 inference #777

Merged: regisss merged 16 commits into synapse_1.15 from schoi/falcon_180_quant_OH on Mar 27, 2024
Conversation

@schoi-habana (Collaborator) commented Mar 8, 2024

Enables quantization for falcon-7b, falcon-40b, and falcon-180b when running with --reuse_cache.

Depends on #773, #831, and deepspeed v1.15.0.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@schoi-habana force-pushed the schoi/falcon_180_quant_OH branch 2 times, most recently from 0a5e16a to bf2576f, on March 8, 2024 01:20
@schoi-habana force-pushed the schoi/falcon_180_quant_OH branch 2 times, most recently from e4baf9f to c0b6494, on March 11, 2024 23:43
@mandy-li added the run-test (Run CI for PRs from external contributors) label on Mar 13, 2024
@schoi-habana force-pushed the schoi/falcon_180_quant_OH branch from 4416635 to d90366a on March 14, 2024 04:13
@schoi-habana force-pushed the schoi/falcon_180_quant_OH branch from d0fa61e to 68f6359 on March 18, 2024 22:17
@schoi-habana marked this pull request as ready for review on March 18, 2024 22:18
@schoi-habana requested a review from a user on March 18, 2024 22:18
@schoi-habana requested a review from regisss as a code owner on March 18, 2024 22:18
Comment thread optimum/habana/transformers/models/falcon/modeling_falcon.py Outdated
@schoi-habana force-pushed the schoi/falcon_180_quant_OH branch from 540698f to fe92094 on March 21, 2024 00:43
Comment thread tests/test_text_generation_example.py
PR#15 reads the set of checkpoint file names from the index JSON file.
When OH downloads files from the hub instead of loading them from a cache dir, get_repo_root()
skips downloading the index JSON file, so PR#15 fails to load the file names.
This PR instead scans the path and returns the list of file names that match the pattern.
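
For illustration only, the directory-scan fallback described above could look roughly like the sketch below. This is not the code from this PR: the function name `find_checkpoint_files` and the default pattern `*.safetensors` are hypothetical, chosen only to show the idea of listing checkpoint files by pattern when no index JSON file is available.

```python
import fnmatch
import os
from typing import List


def find_checkpoint_files(model_dir: str, pattern: str = "*.safetensors") -> List[str]:
    """Return sorted paths of files under model_dir whose names match pattern.

    Hypothetical helper: the name and the default pattern are assumptions,
    not identifiers taken from this PR.
    """
    matches = []
    # Walk the whole tree, since downloaded files may sit in nested snapshot dirs.
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(root, name))
    # Sort so that shard order is deterministic across runs.
    return sorted(matches)
```

Scanning the path trades the explicit shard list in the index file for a naming convention, so the pattern has to be specific enough to exclude unrelated files in the same directory.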
Comment thread optimum/habana/transformers/models/falcon/modeling_falcon.py Outdated
Comment thread optimum/habana/transformers/models/falcon/modeling_falcon.py Outdated
@regisss changed the base branch from main to synapse_1.15 on March 27, 2024 12:53
@regisss merged commit 6b107fa into synapse_1.15 on Mar 27, 2024
@regisss deleted the schoi/falcon_180_quant_OH branch on March 27, 2024 14:31
Labels

run-test (Run CI for PRs from external contributors), synapse 1.15

6 participants