Add mark_step for llama inference#875

Merged: regisss merged 13 commits into main from llama_markstep on Apr 9, 2024

Conversation

@libinta
Collaborator

@libinta libinta commented Apr 8, 2024

What does this PR do?

Add an extra mark_step call during Llama inference to reduce memory usage.
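
To illustrate the idea, here is a minimal sketch and not the code from this PR. In Habana lazy mode, operations are accumulated into a graph, and htcore.mark_step() triggers execution of what has been accumulated so far, so extra calls split the work into smaller graphs. The function name run_decoder_layers, the step_fn parameter, and the choice of one call per decoder layer are assumptions made for illustration; the PR description does not say where the call is placed. A no-op fallback is included so the sketch runs without the Habana stack.

```python
from typing import Any, Callable, Iterable

try:
    # Real entry point on Gaudi/HPU; only importable where the Habana stack is installed.
    import habana_frameworks.torch.core as htcore

    mark_step: Callable[[], None] = htcore.mark_step
except ImportError:
    # Fallback (assumption for off-device runs): a no-op stand-in.
    def mark_step() -> None:
        pass


def run_decoder_layers(
    hidden: Any,
    layers: Iterable[Callable[[Any], Any]],
    step_fn: Callable[[], None] = mark_step,
) -> Any:
    """Hypothetical helper: apply each layer in order, calling step_fn after each one.

    In lazy mode, each call marks a graph boundary, so every layer is
    executed as its own smaller graph instead of one large graph.
    """
    for layer in layers:
        hidden = layer(hidden)
        step_fn()  # graph boundary after this layer
    return hidden


if __name__ == "__main__":
    # Plain Python callables stand in for decoder layers here.
    boundaries = []
    result = run_decoder_layers(
        1,
        [lambda x: x + 1, lambda x: x * 2],
        step_fn=lambda: boundaries.append("step"),
    )
    print(result, len(boundaries))
```

Passing step_fn as a parameter keeps the sketch testable off-device; on real hardware the default would resolve to htcore.mark_step.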

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

regisss and others added 12 commits March 29, 2024 23:08
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
Signed-off-by: Puneesh Khanna <pkhanna@habana.ai>
Co-authored-by: Libin Tang <litang@habana.ai>
@libinta libinta requested a review from a user April 8, 2024 21:43
@libinta libinta requested a review from regisss as a code owner April 8, 2024 21:43
@libinta libinta changed the base branch from main to v1.11-release April 8, 2024 21:44
@regisss regisss changed the base branch from v1.11-release to main April 9, 2024 06:58
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@regisss regisss merged commit 0b14d8e into main Apr 9, 2024
@regisss regisss deleted the llama_markstep branch April 9, 2024 07:31
regisss added a commit that referenced this pull request Apr 9, 2024
Signed-off-by: Puneesh Khanna <pkhanna@habana.ai>
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
Co-authored-by: Sayantan Sarkar <supersarkar@gmail.com>
Co-authored-by: Puneesh Khanna <pkhanna@habana.ai>
Co-authored-by: Witold Szczurek <152967125+wszczurekhabana@users.noreply.github.com>
@ZhaiFeiyue
Contributor

@libinta How much memory can be saved with this PR? Is there any perf data that I can refer to?

skaulintel added a commit to HabanaAI/optimum-habana-fork that referenced this pull request Apr 11, 2024
skaulintel added a commit to HabanaAI/optimum-habana-fork that referenced this pull request Apr 12, 2024
* port llama related changes/optimizations to mistral if applicable.

* add mark step as in huggingface#875
skaulintel added a commit to HabanaAI/optimum-habana-fork that referenced this pull request Apr 12, 2024
* port llama related changes/optimizations to mistral if applicable.

* add mark step as in huggingface#875

* add fusedrope optimization for mistral

* add fused rope condition back in
dsmertin pushed a commit to dsmertin/optimum-habana that referenced this pull request Apr 17, 2024
Signed-off-by: Puneesh Khanna <pkhanna@habana.ai>
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
Co-authored-by: Sayantan Sarkar <supersarkar@gmail.com>
Co-authored-by: Puneesh Khanna <pkhanna@habana.ai>
Co-authored-by: Witold Szczurek <152967125+wszczurekhabana@users.noreply.github.com>
@skavulya skavulya mentioned this pull request Feb 13, 2025
3 tasks
7 participants