Conversation

@fabianlim
Contributor

achew010 and others added 8 commits May 28, 2024 15:18
* linting and formatting changes

* removed AutoGPTQ dep in linting

* added additional comments in tox
* workaround low-mem patch

* resolve conflicts and define patch function

* resolve conflicts and define patch function

* Apply suggestions from code review

Co-authored-by: Yu Chin Fabian Lim <fabianlim@users.noreply.github.com>

* revert hack to avoid low memory bug in HF memory metrics calculation

* reversed formatting

* reverse more formatting

---------

Co-authored-by: Yu Chin Fabian Lim <fabianlim@users.noreply.github.com>
* group memory field names with  prefix and minor fixes

* change to drop index on index reset
* initial commit

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add fast quantized plugin

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add mistral and fix plugin

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add licensing notices and instructions for adding new plugin.

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* handle linting, formatting

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* 2nd round of linting

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* activate workflow and some more lint fixes

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add sample config

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* updates to benchmark, scenarios

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fix tests

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

---------

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* refactor

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fixes

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* refactor mistral

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add mixtral

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* some refactoring after introducing mlp

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* remove extraneous files

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add bnb

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* lint + fmt and improvements to readme

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* bench fixes

* need to handle lora adapters device due to #26

* allow replay of failed benches, addressing comment in #14

* update benches (remove l40)

---------

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
#31)

* properly ignore lora adapters

* handle qlora quant state

* improve fix

* further simplification of fix

* updated benchmark reference (#34)

---------

Co-authored-by: achew010 <165894159+achew010@users.noreply.github.com>
* shift gpu mem computation to gather_report

* addressed comments
@fabianlim merged commit 40aad46 into main Jun 7, 2024
@fabianlim added the "main" label (Merged dev to main) Jun 7, 2024
@fabianlim changed the title from "Fused Ops and Kernels, FSDP and Memory Fixes" to "Upstream Main: Fused Ops and Kernels, FSDP and Memory Fixes" Jun 7, 2024
Labels

main Merged dev to main

3 participants