2024 Release#96
Merged
Merged
Conversation
danielhanchen
added a commit
to shimmyshimmer/unsloth-staging-4
that referenced
this pull request
Feb 6, 2024
commit 35f2ab4a8b4deecbbbe9fbd95f4efde8694233db
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Feb 4 17:35:56 2024 +1100
2x faster inference (#151)
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
* Update llama.py
* Update llama.py
* Fix SDPA
* Update llama.py
* padding
* Inference
* Update llama.py
* Revert
* Update mistral.py
* faster inference
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* inference
* Update llama.py
* Update utils.py
* faster inference
* Update llama.py
* revert
* lm_head
* Update llama.py
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* faster inference
* Update llama.py
* fast inference
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* torch compile
* past_key_values
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update llama.py
* fast inference + saving config.json
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* fast inference again
* more temp matrices
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update mistral.py
* Update llama.py
* SDPA
* attention_mask
* New version
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
commit 051a73b0e63d3ae3acd7c4d962349280f69bbdb0
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 31 04:03:37 2024 +1100
Hotfix - fix inference (#146)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
commit 05624642802c7f90dcc7aeea0e1c8d447cde006e
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 17:49:54 2024 +1100
Fix inference attention mask (#142)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
commit 206a9b65f090bd71ccaad7dd88b67ba2bfde0b58
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 03:45:07 2024 +1100
Nightly (#140)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
commit 8faf469f028a05852b2dc29ec8df1f36998fab33
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 02:52:39 2024 +1100
Fix saving issues (#139)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
commit 1ecc0185a5759c7a0c95dfc96aceea5023cebdfc
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:30:29 2024 +1100
1 more bug (#138)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
commit cd32ba76b71adf3317ede9de7d1cf6f30ad3bf0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:20:06 2024 +1100
Fix bugs + more accurate Swiglu (#137)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
commit 89daa0efcc38c7690abbb8170b5d9f3d364796ce
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:50:22 2024 +1100
Inference bug fix (#134)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
commit 87a7ef1049f6fca409a0673f51f4758e0aff248d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 3d67790901696e953171f64b4bf9d980780051a0
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit a833f403462e9cfc1f96b3b84d9da15d7d8db5ee
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit b370c9c8aacc31a7845404566dd95dfa8c0e3bac
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 57a5b5a49da588b1db8e9a988cc985dc20393d34
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 5145a61e69ab9b3035465f649e1c1e5aae749f8f
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit a7bd8d119c16433de4f8b6a36903ef7131f225e5
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit be4b97e7d89074b6dd1d2e984fa429051d328192
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit abb462be71e8cf01ad989dca0efaa17441113651
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 31e2d71720e64b854145d7779833b7d2d3d4177e
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 8846337e5c8c2f206a4ac8fe6d239f3d1221f7ac
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit d378df87e5f3945474915a098c9aa58313465064
Merge: c1e7480 920e3c2
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit c1e7480ac2ad0e5efa05e84fe0997619ccdd86a4
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 920e3c2ea07a044addeb7c3fa8be6f0189cb7f84
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit fc25ab0df032f8ee5ea750f27c68d63f49d2d9a9
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit b8b1eafda35d124046e11766aeeb6343957e0daf
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 4112eb4a3df4c0911e36211b47381086c963b4e0
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 59d74753362ff59e664cb6d650b564511e6e20f3
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit c1ac4d2707574868767345e76ebe49c8353f9057
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit d3887c7fd93d9b910bf6ee3ab3c7fd485fc55e46
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit b5d94d9a0ad9532494e1b3c7badbb94fa92c50eb
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 01d7f58e11373ab07b9282a42bc14f542dbdabf0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit 9faaf5b388e025f8ffc302450a12ffb84e7e1750
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit 82e6fece0b78011707090639823d2d7acf5a3864
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit b52278199b7ae2764f242622275bb8a85ba7b721
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
danielhanchen
added a commit
that referenced
this pull request
Feb 6, 2024
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* finetune button
* Delete start free finetune button.png
* free finetune button
* Add files via upload
* Update README.md
* Update README.md
* Add files via upload
* Add files via upload
* Update README.md
* Add files via upload
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Squashed commit of the following:
commit 35f2ab4a8b4deecbbbe9fbd95f4efde8694233db
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Feb 4 17:35:56 2024 +1100
2x faster inference (#151)
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
* Update llama.py
* Update llama.py
* Fix SDPA
* Update llama.py
* padding
* Inference
* Update llama.py
* Revert
* Update mistral.py
* faster inference
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* inference
* Update llama.py
* Update utils.py
* faster inference
* Update llama.py
* revert
* lm_head
* Update llama.py
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* faster inference
* Update llama.py
* fast inference
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* torch compile
* past_key_values
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update llama.py
* fast inference + saving config.json
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* fast inference again
* more temp matrices
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update mistral.py
* Update llama.py
* SDPA
* attention_mask
* New version
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
commit 051a73b0e63d3ae3acd7c4d962349280f69bbdb0
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 31 04:03:37 2024 +1100
Hotfix - fix inference (#146)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
commit 05624642802c7f90dcc7aeea0e1c8d447cde006e
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 17:49:54 2024 +1100
Fix inference attention mask (#142)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
commit 206a9b65f090bd71ccaad7dd88b67ba2bfde0b58
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 03:45:07 2024 +1100
Nightly (#140)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
commit 8faf469f028a05852b2dc29ec8df1f36998fab33
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 02:52:39 2024 +1100
Fix saving issues (#139)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
commit 1ecc0185a5759c7a0c95dfc96aceea5023cebdfc
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:30:29 2024 +1100
1 more bug (#138)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
commit cd32ba76b71adf3317ede9de7d1cf6f30ad3bf0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:20:06 2024 +1100
Fix bugs + more accurate Swiglu (#137)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
commit 89daa0efcc38c7690abbb8170b5d9f3d364796ce
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:50:22 2024 +1100
Inference bug fix (#134)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
commit 87a7ef1049f6fca409a0673f51f4758e0aff248d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 3d67790901696e953171f64b4bf9d980780051a0
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit a833f403462e9cfc1f96b3b84d9da15d7d8db5ee
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit b370c9c8aacc31a7845404566dd95dfa8c0e3bac
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 57a5b5a49da588b1db8e9a988cc985dc20393d34
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 5145a61e69ab9b3035465f649e1c1e5aae749f8f
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit a7bd8d119c16433de4f8b6a36903ef7131f225e5
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit be4b97e7d89074b6dd1d2e984fa429051d328192
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit abb462be71e8cf01ad989dca0efaa17441113651
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 31e2d71720e64b854145d7779833b7d2d3d4177e
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 8846337e5c8c2f206a4ac8fe6d239f3d1221f7ac
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit d378df87e5f3945474915a098c9aa58313465064
Merge: c1e7480 920e3c2
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit c1e7480ac2ad0e5efa05e84fe0997619ccdd86a4
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 920e3c2ea07a044addeb7c3fa8be6f0189cb7f84
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit fc25ab0df032f8ee5ea750f27c68d63f49d2d9a9
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit b8b1eafda35d124046e11766aeeb6343957e0daf
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 4112eb4a3df4c0911e36211b47381086c963b4e0
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 59d74753362ff59e664cb6d650b564511e6e20f3
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit c1ac4d2707574868767345e76ebe49c8353f9057
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit d3887c7fd93d9b910bf6ee3ab3c7fd485fc55e46
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit b5d94d9a0ad9532494e1b3c7badbb94fa92c50eb
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 01d7f58e11373ab07b9282a42bc14f542dbdabf0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit 9faaf5b388e025f8ffc302450a12ffb84e7e1750
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit 82e6fece0b78011707090639823d2d7acf5a3864
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit b52278199b7ae2764f242622275bb8a85ba7b721
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
cm2435
pushed a commit
to cm2435/unsloth
that referenced
this pull request
Feb 26, 2024
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* finetune button
* Delete start free finetune button.png
* free finetune button
* Add files via upload
* Update README.md
* Update README.md
* Add files via upload
* Add files via upload
* Update README.md
* Add files via upload
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Squashed commit of the following:
commit 35f2ab4a8b4deecbbbe9fbd95f4efde8694233db
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Feb 4 17:35:56 2024 +1100
2x faster inference (#151)
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
* Update llama.py
* Update llama.py
* Fix SDPA
* Update llama.py
* padding
* Inference
* Update llama.py
* Revert
* Update mistral.py
* faster inference
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* inference
* Update llama.py
* Update utils.py
* faster inference
* Update llama.py
* revert
* lm_head
* Update llama.py
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* faster inference
* Update llama.py
* fast inference
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* torch compile
* past_key_values
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update llama.py
* fast inference + saving config.json
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* fast inference again
* more temp matrices
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update mistral.py
* Update llama.py
* SDPA
* attention_mask
* New version
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
commit 051a73b0e63d3ae3acd7c4d962349280f69bbdb0
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 31 04:03:37 2024 +1100
Hotfix - fix inference (#146)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
commit 05624642802c7f90dcc7aeea0e1c8d447cde006e
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 17:49:54 2024 +1100
Fix inference attention mask (#142)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
commit 206a9b65f090bd71ccaad7dd88b67ba2bfde0b58
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 03:45:07 2024 +1100
Nightly (#140)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
commit 8faf469f028a05852b2dc29ec8df1f36998fab33
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 02:52:39 2024 +1100
Fix saving issues (#139)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
commit 1ecc0185a5759c7a0c95dfc96aceea5023cebdfc
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:30:29 2024 +1100
1 more bug (#138)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
commit cd32ba76b71adf3317ede9de7d1cf6f30ad3bf0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:20:06 2024 +1100
Fix bugs + more accurate Swiglu (#137)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
commit 89daa0efcc38c7690abbb8170b5d9f3d364796ce
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:50:22 2024 +1100
Inference bug fix (#134)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
commit 87a7ef1049f6fca409a0673f51f4758e0aff248d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 3d67790901696e953171f64b4bf9d980780051a0
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit a833f403462e9cfc1f96b3b84d9da15d7d8db5ee
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit b370c9c8aacc31a7845404566dd95dfa8c0e3bac
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 57a5b5a49da588b1db8e9a988cc985dc20393d34
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 5145a61e69ab9b3035465f649e1c1e5aae749f8f
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit a7bd8d119c16433de4f8b6a36903ef7131f225e5
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit be4b97e7d89074b6dd1d2e984fa429051d328192
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit abb462be71e8cf01ad989dca0efaa17441113651
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 31e2d71720e64b854145d7779833b7d2d3d4177e
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 8846337e5c8c2f206a4ac8fe6d239f3d1221f7ac
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit d378df87e5f3945474915a098c9aa58313465064
Merge: c1e7480 920e3c2
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit c1e7480ac2ad0e5efa05e84fe0997619ccdd86a4
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 920e3c2ea07a044addeb7c3fa8be6f0189cb7f84
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit fc25ab0df032f8ee5ea750f27c68d63f49d2d9a9
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit b8b1eafda35d124046e11766aeeb6343957e0daf
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 4112eb4a3df4c0911e36211b47381086c963b4e0
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 59d74753362ff59e664cb6d650b564511e6e20f3
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit c1ac4d2707574868767345e76ebe49c8353f9057
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit d3887c7fd93d9b910bf6ee3ab3c7fd485fc55e46
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit b5d94d9a0ad9532494e1b3c7badbb94fa92c50eb
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 01d7f58e11373ab07b9282a42bc14f542dbdabf0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit 9faaf5b388e025f8ffc302450a12ffb84e7e1750
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit 82e6fece0b78011707090639823d2d7acf5a3864
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit b52278199b7ae2764f242622275bb8a85ba7b721
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jun 8, 2025
* gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jun 8, 2025
* Update dataset_utils.py * Update dataset_utils.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update loss_utils.py * Update loss_utils.py * gpu_memory_utilization * Update temporary_patches.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * train on completions VLMs * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * VLM train only on completions * Update loss_utils.py * Update dataset_utils.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update saving_utils.py * Update llama_cpp.py * Update llama_cpp.py * Update saving_utils.py * Update saving_utils.py * Update __init__.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update loss_utils.py * Update loss_utils.py * Update llama_cpp.py * Update loss_utils.py * Update compiler.py * Update llama_cpp.py * Update compiler.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update training_utils.py * Update dataset_utils.py * Update dataset_utils.py * Revert "Update dataset_utils.py" This reverts commit 3b690ad. * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Remove prints * Update compiler.py * Update saving_utils.py * Update temporary_patches.py * Update __init__.py * Update pyproject.toml * Update vllm_utils.py * bug fix unslothai#2008 unsloth issue - load_in_4bit = True + fast_inference = True (unslothai#79) * bug fix unslothai#2008 unsloth * non-quant dtype fix * Update vllm_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update dataset_utils.py * Update compiler.py * Update temporary_patches.py * Gemma 3 fixes * Update temporary_patches.py * Update compiler.py * Update compiler.py * Gemma 3 fixes * Update patching_utils.py * Update compiler.py * Update compiler.py * Update patching_utils.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * compiler * Update gradient_checkpointing.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset --------- Co-authored-by: Mukkesh Ganesh <mukmckenzie@gmail.com> Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jun 8, 2025
* Update vision_utils.py * Update vision_utils.py * train on completions VLMs * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * VLM train only on completions * Update loss_utils.py * Update dataset_utils.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update saving_utils.py * Update llama_cpp.py * Update llama_cpp.py * Update saving_utils.py * Update saving_utils.py * Update __init__.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update loss_utils.py * Update loss_utils.py * Update llama_cpp.py * Update loss_utils.py * Update compiler.py * Update llama_cpp.py * Update compiler.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update training_utils.py * Update dataset_utils.py * Update dataset_utils.py * Revert "Update dataset_utils.py" This reverts commit 3b690ad. * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Remove prints * Update compiler.py * Update saving_utils.py * Update temporary_patches.py * Update __init__.py * Update pyproject.toml * Update vllm_utils.py * bug fix unslothai#2008 unsloth issue - load_in_4bit = True + fast_inference = True (unslothai#79) * bug fix unslothai#2008 unsloth * non-quant dtype fix * Update vllm_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update dataset_utils.py * Update compiler.py * Update temporary_patches.py * Gemma 3 fixes * Update temporary_patches.py * Update compiler.py * Update compiler.py * Gemma 3 fixes * Update patching_utils.py * Update compiler.py * Update compiler.py * Update patching_utils.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * compiler * Update gradient_checkpointing.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Update vllm_utils.py * Create vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * vLLM for Qwen 3 * Update vllm_utils.py * Update vllm_utils.py --------- Co-authored-by: Mukkesh Ganesh <mukmckenzie@gmail.com> Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jun 8, 2025
* Update compiler.py * Update compiler.py * Update compiler.py * Update saving_utils.py * Update llama_cpp.py * Update llama_cpp.py * Update saving_utils.py * Update saving_utils.py * Update __init__.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update loss_utils.py * Update loss_utils.py * Update llama_cpp.py * Update loss_utils.py * Update compiler.py * Update llama_cpp.py * Update compiler.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update training_utils.py * Update dataset_utils.py * Update dataset_utils.py * Revert "Update dataset_utils.py" This reverts commit 3b690ad. * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Remove prints * Update compiler.py * Update saving_utils.py * Update temporary_patches.py * Update __init__.py * Update pyproject.toml * Update vllm_utils.py * bug fix unslothai#2008 unsloth issue - load_in_4bit = True + fast_inference = True (unslothai#79) * bug fix unslothai#2008 unsloth * non-quant dtype fix * Update vllm_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update dataset_utils.py * Update compiler.py * Update temporary_patches.py * Gemma 3 fixes * Update temporary_patches.py * Update compiler.py * Update compiler.py * Gemma 3 fixes * Update patching_utils.py * Update compiler.py * Update compiler.py * Update patching_utils.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * compiler * Update gradient_checkpointing.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Update vllm_utils.py * Create vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * vLLM for Qwen 3 * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py --------- Co-authored-by: Mukkesh Ganesh <mukmckenzie@gmail.com> Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jun 8, 2025
* bug fix unslothai#2008 unsloth issue - load_in_4bit = True + fast_inference = True (unslothai#79) * bug fix unslothai#2008 unsloth * non-quant dtype fix * Update vllm_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update dataset_utils.py * Update compiler.py * Update temporary_patches.py * Gemma 3 fixes * Update temporary_patches.py * Update compiler.py * Update compiler.py * Gemma 3 fixes * Update patching_utils.py * Update compiler.py * Update compiler.py * Update patching_utils.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * compiler * Update gradient_checkpointing.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Update vllm_utils.py * Create vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * vLLM for Qwen 3 * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update compiler.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Swap space reduce * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update __init__.py --------- Co-authored-by: Mukkesh Ganesh <mukmckenzie@gmail.com> Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jun 8, 2025
* Update compiler.py * Update patching_utils.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * compiler * Update gradient_checkpointing.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Update vllm_utils.py * Create vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * vLLM for Qwen 3 * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update compiler.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Swap space reduce * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update __init__.py * Update rl_replacements.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update rl_replacements.py * Revert "Update rl_replacements.py" This reverts commit c0a4022. * Update __init__.py --------- Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jun 8, 2025
* Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Update vllm_utils.py * Create vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * vLLM for Qwen 3 * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update compiler.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Swap space reduce * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update __init__.py * Update rl_replacements.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update rl_replacements.py * Revert "Update rl_replacements.py" This reverts commit c0a4022. * Update __init__.py * Update patching_utils.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Fixes * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * revert * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update __init__.py * Update compiler.py * Update temporary_patches.py * Update compiler.py * Update temporary_patches.py --------- Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jun 25, 2025
* gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jun 25, 2025
* Update dataset_utils.py * Update dataset_utils.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update loss_utils.py * Update loss_utils.py * gpu_memory_utilization * Update temporary_patches.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * train on completions VLMs * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * VLM train only on completions * Update loss_utils.py * Update dataset_utils.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update saving_utils.py * Update llama_cpp.py * Update llama_cpp.py * Update saving_utils.py * Update saving_utils.py * Update __init__.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update loss_utils.py * Update loss_utils.py * Update llama_cpp.py * Update loss_utils.py * Update compiler.py * Update llama_cpp.py * Update compiler.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update training_utils.py * Update dataset_utils.py * Update dataset_utils.py * Revert "Update dataset_utils.py" This reverts commit 3b690ad. * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Remove prints * Update compiler.py * Update saving_utils.py * Update temporary_patches.py * Update __init__.py * Update pyproject.toml * Update vllm_utils.py * bug fix unslothai#2008 unsloth issue - load_in_4bit = True + fast_inference = True (unslothai#79) * bug fix unslothai#2008 unsloth * non-quant dtype fix * Update vllm_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update dataset_utils.py * Update compiler.py * Update temporary_patches.py * Gemma 3 fixes * Update temporary_patches.py * Update compiler.py * Update compiler.py * Gemma 3 fixes * Update patching_utils.py * Update compiler.py * Update compiler.py * Update patching_utils.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * compiler * Update gradient_checkpointing.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset --------- Co-authored-by: Mukkesh Ganesh <mukmckenzie@gmail.com> Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jun 25, 2025
* Update vision_utils.py * Update vision_utils.py * train on completions VLMs * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * VLM train only on completions * Update loss_utils.py * Update dataset_utils.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update saving_utils.py * Update llama_cpp.py * Update llama_cpp.py * Update saving_utils.py * Update saving_utils.py * Update __init__.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update loss_utils.py * Update loss_utils.py * Update llama_cpp.py * Update loss_utils.py * Update compiler.py * Update llama_cpp.py * Update compiler.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update training_utils.py * Update dataset_utils.py * Update dataset_utils.py * Revert "Update dataset_utils.py" This reverts commit 3b690ad. * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Remove prints * Update compiler.py * Update saving_utils.py * Update temporary_patches.py * Update __init__.py * Update pyproject.toml * Update vllm_utils.py * bug fix unslothai#2008 unsloth issue - load_in_4bit = True + fast_inference = True (unslothai#79) * bug fix unslothai#2008 unsloth * non-quant dtype fix * Update vllm_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update dataset_utils.py * Update compiler.py * Update temporary_patches.py * Gemma 3 fixes * Update temporary_patches.py * Update compiler.py * Update compiler.py * Gemma 3 fixes * Update patching_utils.py * Update compiler.py * Update compiler.py * Update patching_utils.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * compiler * Update gradient_checkpointing.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Update vllm_utils.py * Create vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * vLLM for Qwen 3 * Update vllm_utils.py * Update vllm_utils.py --------- Co-authored-by: Mukkesh Ganesh <mukmckenzie@gmail.com> Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jun 25, 2025
* Update compiler.py * Update compiler.py * Update compiler.py * Update saving_utils.py * Update llama_cpp.py * Update llama_cpp.py * Update saving_utils.py * Update saving_utils.py * Update __init__.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update loss_utils.py * Update loss_utils.py * Update llama_cpp.py * Update loss_utils.py * Update compiler.py * Update llama_cpp.py * Update compiler.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update training_utils.py * Update dataset_utils.py * Update dataset_utils.py * Revert "Update dataset_utils.py" This reverts commit 3b690ad. * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Remove prints * Update compiler.py * Update saving_utils.py * Update temporary_patches.py * Update __init__.py * Update pyproject.toml * Update vllm_utils.py * bug fix unslothai#2008 unsloth issue - load_in_4bit = True + fast_inference = True (unslothai#79) * bug fix unslothai#2008 unsloth * non-quant dtype fix * Update vllm_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update dataset_utils.py * Update compiler.py * Update temporary_patches.py * Gemma 3 fixes * Update temporary_patches.py * Update compiler.py * Update compiler.py * Gemma 3 fixes * Update patching_utils.py * Update compiler.py * Update compiler.py * Update patching_utils.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * compiler * Update gradient_checkpointing.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Update vllm_utils.py * Create vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * vLLM for Qwen 3 * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py --------- Co-authored-by: Mukkesh Ganesh <mukmckenzie@gmail.com> Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jun 25, 2025
* bug fix unslothai#2008 unsloth issue - load_in_4bit = True + fast_inference = True (unslothai#79) * bug fix unslothai#2008 unsloth * non-quant dtype fix * Update vllm_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update dataset_utils.py * Update compiler.py * Update temporary_patches.py * Gemma 3 fixes * Update temporary_patches.py * Update compiler.py * Update compiler.py * Gemma 3 fixes * Update patching_utils.py * Update compiler.py * Update compiler.py * Update patching_utils.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * compiler * Update gradient_checkpointing.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Update vllm_utils.py * Create vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * vLLM for Qwen 3 * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update compiler.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Swap space reduce * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update __init__.py --------- Co-authored-by: Mukkesh Ganesh <mukmckenzie@gmail.com> Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jun 25, 2025
* Update compiler.py * Update patching_utils.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * compiler * Update gradient_checkpointing.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Update vllm_utils.py * Create vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * vLLM for Qwen 3 * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update compiler.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Swap space reduce * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update __init__.py * Update rl_replacements.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update rl_replacements.py * Revert "Update rl_replacements.py" This reverts commit c0a4022. * Update __init__.py --------- Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jun 25, 2025
* Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Update vllm_utils.py * Create vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * vLLM for Qwen 3 * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update compiler.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Swap space reduce * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update __init__.py * Update rl_replacements.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update rl_replacements.py * Revert "Update rl_replacements.py" This reverts commit c0a4022. * Update __init__.py * Update patching_utils.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Fixes * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * revert * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update __init__.py * Update compiler.py * Update temporary_patches.py * Update compiler.py * Update temporary_patches.py --------- Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jul 7, 2025
* gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jul 7, 2025
* Update dataset_utils.py * Update dataset_utils.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update loss_utils.py * Update loss_utils.py * gpu_memory_utilization * Update temporary_patches.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * train on completions VLMs * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * VLM train only on completions * Update loss_utils.py * Update dataset_utils.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update saving_utils.py * Update llama_cpp.py * Update llama_cpp.py * Update saving_utils.py * Update saving_utils.py * Update __init__.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update loss_utils.py * Update loss_utils.py * Update llama_cpp.py * Update loss_utils.py * Update compiler.py * Update llama_cpp.py * Update compiler.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update training_utils.py * Update dataset_utils.py * Update dataset_utils.py * Revert "Update dataset_utils.py" This reverts commit 7da25fe. * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Remove prints * Update compiler.py * Update saving_utils.py * Update temporary_patches.py * Update __init__.py * Update pyproject.toml * Update vllm_utils.py * bug fix unslothai#2008 unsloth issue - load_in_4bit = True + fast_inference = True (unslothai#79) * bug fix unslothai#2008 unsloth * non-quant dtype fix * Update vllm_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update dataset_utils.py * Update compiler.py * Update temporary_patches.py * Gemma 3 fixes * Update temporary_patches.py * Update compiler.py * Update compiler.py * Gemma 3 fixes * Update patching_utils.py * Update compiler.py * Update compiler.py * Update patching_utils.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * compiler * Update gradient_checkpointing.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset --------- Co-authored-by: Mukkesh Ganesh <mukmckenzie@gmail.com> Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jul 7, 2025
* Update vision_utils.py * Update vision_utils.py * train on completions VLMs * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * VLM train only on completions * Update loss_utils.py * Update dataset_utils.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update saving_utils.py * Update llama_cpp.py * Update llama_cpp.py * Update saving_utils.py * Update saving_utils.py * Update __init__.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update loss_utils.py * Update loss_utils.py * Update llama_cpp.py * Update loss_utils.py * Update compiler.py * Update llama_cpp.py * Update compiler.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update training_utils.py * Update dataset_utils.py * Update dataset_utils.py * Revert "Update dataset_utils.py" This reverts commit 7da25fe. * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Remove prints * Update compiler.py * Update saving_utils.py * Update temporary_patches.py * Update __init__.py * Update pyproject.toml * Update vllm_utils.py * bug fix unslothai#2008 unsloth issue - load_in_4bit = True + fast_inference = True (unslothai#79) * bug fix unslothai#2008 unsloth * non-quant dtype fix * Update vllm_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update dataset_utils.py * Update compiler.py * Update temporary_patches.py * Gemma 3 fixes * Update temporary_patches.py * Update compiler.py * Update compiler.py * Gemma 3 fixes * Update patching_utils.py * Update compiler.py * Update compiler.py * Update patching_utils.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * compiler * Update gradient_checkpointing.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Update vllm_utils.py * Create vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * vLLM for Qwen 3 * Update vllm_utils.py * Update vllm_utils.py --------- Co-authored-by: Mukkesh Ganesh <mukmckenzie@gmail.com> Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jul 7, 2025
* Update compiler.py * Update compiler.py * Update compiler.py * Update saving_utils.py * Update llama_cpp.py * Update llama_cpp.py * Update saving_utils.py * Update saving_utils.py * Update __init__.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update loss_utils.py * Update loss_utils.py * Update llama_cpp.py * Update loss_utils.py * Update compiler.py * Update llama_cpp.py * Update compiler.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update training_utils.py * Update dataset_utils.py * Update dataset_utils.py * Revert "Update dataset_utils.py" This reverts commit 7da25fe. * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Remove prints * Update compiler.py * Update saving_utils.py * Update temporary_patches.py * Update __init__.py * Update pyproject.toml * Update vllm_utils.py * bug fix unslothai#2008 unsloth issue - load_in_4bit = True + fast_inference = True (unslothai#79) * bug fix unslothai#2008 unsloth * non-quant dtype fix * Update vllm_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update dataset_utils.py * Update compiler.py * Update temporary_patches.py * Gemma 3 fixes * Update temporary_patches.py * Update compiler.py * Update compiler.py * Gemma 3 fixes * Update patching_utils.py * Update compiler.py * Update compiler.py * Update patching_utils.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * compiler * Update gradient_checkpointing.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Update vllm_utils.py * Create vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * vLLM for Qwen 3 * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py --------- Co-authored-by: Mukkesh Ganesh <mukmckenzie@gmail.com> Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jul 7, 2025
* bug fix unslothai#2008 unsloth issue - load_in_4bit = True + fast_inference = True (unslothai#79) * bug fix unslothai#2008 unsloth * non-quant dtype fix * Update vllm_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update dataset_utils.py * Update compiler.py * Update temporary_patches.py * Gemma 3 fixes * Update temporary_patches.py * Update compiler.py * Update compiler.py * Gemma 3 fixes * Update patching_utils.py * Update compiler.py * Update compiler.py * Update patching_utils.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * compiler * Update gradient_checkpointing.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Update vllm_utils.py * Create vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * vLLM for Qwen 3 * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update compiler.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Swap space reduce * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update __init__.py --------- Co-authored-by: Mukkesh Ganesh <mukmckenzie@gmail.com> Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jul 7, 2025
* Update compiler.py * Update patching_utils.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * compiler * Update gradient_checkpointing.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Update vllm_utils.py * Create vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * vLLM for Qwen 3 * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update compiler.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Swap space reduce * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update __init__.py * Update rl_replacements.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update rl_replacements.py * Revert "Update rl_replacements.py" This reverts commit c0a4022. * Update __init__.py --------- Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
mmathew23
added a commit
to mmathew23/unsloth
that referenced
this pull request
Jul 7, 2025
* Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * causal mask dtype * Fix checkpoint and save from local file (unslothai#74) * Enhance gradient checkpointing and add original model ID retrieval in saving utilities * In case adapter_config.json as well * Update patching_utils.py * Update patching_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update loss_utils.py * Update compiler.py * Update vllm_utils.py * Update compiler.py * Update peft_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update vllm_lora_worker_manager.py * Update utils.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update dataset_utils.py * bidirectional attention * Update vllm_utils.py * Update __init__.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update vllm_lora_worker_manager.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update loss_utils.py * Update __init__.py * fix: AsyncLLMEngine bugs (unslothai#82) * fixed a typo in L119, removing unnecessary len() (unslothai#84) Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> * Fix gradient checkpointing warning filter implementation * Input grads fix for gemma3 (unslothai#96) * gemma require gradients fix * Update peft_utils.py --------- Co-authored-by: Daniel Han <danielhanchen@gmail.com> * Update vision_utils.py * Vision requires grad * Check SDPA for Mistral / Pixtral * Update compiler.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update __init__.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vision_utils.py * Update vllm_utils.py (unslothai#99) Fix bugs in generate_batches.py.Original output = [] will result in duplication of results. * Update vision_utils.py * Fixes to support IterableDataset (unslothai#98) * Support Iterable Datasets * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Update dataset_utils.py * Preserve batch size from iterable dataset * Preserve batch size from iterable dataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Support train_on_response_only with IterableDataset * Update vllm_utils.py * Create vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * Update vllm_rlhf_utils.py * vLLM for Qwen 3 * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update compiler.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Swap space reduce * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update __init__.py * Update rl_replacements.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update vllm_utils.py * Update rl_replacements.py * Update vllm_utils.py * Update rl_replacements.py * Revert "Update rl_replacements.py" This reverts commit c0a4022. * Update __init__.py * Update patching_utils.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Fixes * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update compiler.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update compiler.py * revert * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update temporary_patches.py * Update __init__.py * Update compiler.py * Update temporary_patches.py * Update compiler.py * Update temporary_patches.py --------- Co-authored-by: Edd <68678137+Erland366@users.noreply.github.com> Co-authored-by: Brad Hilton <brad.hilton.nw@gmail.com> Co-authored-by: SpaceHunter <30568250+SpaceHunterInf@users.noreply.github.com> Co-authored-by: Xiaochen Zhu <xz479@cl.cam.ac.uk> Co-authored-by: Roland Tannous <rolandtannous@gonovel.co> Co-authored-by: DoubleMathew <mmathew23@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> Co-authored-by: Qian Wu <121997440+5k5000@users.noreply.github.com> Co-authored-by: marcandrelarochelle <marcandrelarochelle1820@gmail.com>
rolandtannous
added a commit
to rolandtannous/unsloth
that referenced
this pull request
Mar 11, 2026
…-ordered Added the inference defaults for models
rolandtannous
added a commit
that referenced
this pull request
Mar 12, 2026
Added the inference defaults for models
Stanley00
pushed a commit
to stanley-fork/unsloth
that referenced
this pull request
Mar 12, 2026
…-ordered Added the inference defaults for models
abiswas-realadvice
pushed a commit
to abiswas-realadvice/unsloth
that referenced
this pull request
May 14, 2026
* Fix tokenizer, dropout, bias for LoRA * Update loader.py * Fix LoRA downcasting * Update _utils.py * Saving to GGUF * fix * colab_quantize_to_gguf * move save modules * save module * Update __init__.py * Update save.py * Temp downgrade due to TRL issue * Fix up bugs * Faster saving + other changes * Update llama.py * Saving modules * spelling * Update llama.py * Update save.py * Update save.py * Update loader.py * Update llama.py * patch saving * Update save.py * Update save.py * Update save.py * patch saving * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * original_model * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * Update save.py * saving to RAM leakage? * Update save.py * new_save_directory * Update save.py * Update save.py * Update save.py * Update save.py * Update pyproject.toml * Update pyproject.toml * Update pyproject.toml
abiswas-realadvice
pushed a commit
to abiswas-realadvice/unsloth
that referenced
this pull request
May 14, 2026
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* finetune button
* Delete start free finetune button.png
* free finetune button
* Add files via upload
* Update README.md
* Update README.md
* Add files via upload
* Add files via upload
* Update README.md
* Add files via upload
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Squashed commit of the following:
commit efa0d2332ebc6d8f215aec07d5cc9907f4e84f34
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Feb 4 17:35:56 2024 +1100
2x faster inference (#151)
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
* Update llama.py
* Update llama.py
* Fix SDPA
* Update llama.py
* padding
* Inference
* Update llama.py
* Revert
* Update mistral.py
* faster inference
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* inference
* Update llama.py
* Update utils.py
* faster inference
* Update llama.py
* revert
* lm_head
* Update llama.py
* inference
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* faster inference
* Update llama.py
* fast inference
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* torch compile
* past_key_values
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update llama.py
* fast inference + saving config.json
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* fast inference again
* more temp matrices
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update mistral.py
* Update llama.py
* SDPA
* attention_mask
* New version
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update utils.py
* Update utils.py
commit 2f55935f941eb61816b145575389f91dde4e00f7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 31 04:03:37 2024 +1100
Hotfix - fix inference (#146)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
* Fast inference repatch
* Update llama.py
* Update utils.py
* Update utils.py
* Update utils.py
* Update mistral.py
* Update __init__.py
* Fix inference
* Update mistral.py
* fast lm_head
* Remove fast path
* Update rope_embedding.py
* Update loader.py
* LlamaAttention_fast_forward_inference
* if past_key_value is not None and q_len == 1:
* revert inference
* Update loader.py
* past_key_value
commit a3a2ad93821cede32723843dfb3dfbfe0387d25e
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 17:49:54 2024 +1100
Fix inference attention mask (#142)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
* Update llama.py
* Update llama.py
commit 90309ca8dcb06f0611c1bde4a61eb08fb7317993
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 03:45:07 2024 +1100
Nightly (#140)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
* Mistral patch
* Update mistral.py
* Update save.py
* saving
commit a16bc73e8077fd3c6a034741ae782bcfeb9fa278
Author: Daniel Han <danielhanchen@gmail.com>
Date: Mon Jan 29 02:52:39 2024 +1100
Fix saving issues (#139)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
* Update mistral.py
* attention mask
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update dpo.py
* Patch saving
* Update save.py
* Update save.py
* patch_saving_functions
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* print
commit af332245543b1f9ac129b67e5c350047c967846d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:30:29 2024 +1100
1 more bug (#138)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
* Update save.py
* Update save.py
commit e2bbd3819e0899e09787a985cd11c08961f09c09
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 28 04:20:06 2024 +1100
Fix bugs + more accurate Swiglu (#137)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
* Works?
* Update pyproject.toml
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Swiglu
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* attention_mask
* Update llama.py
* Update llama.py
* labels
* Update mistral.py
* Update llama.py
* attention mask
commit a81aff286f1e67c82b2a5105679c85866f624629
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:50:22 2024 +1100
Inference bug fix (#134)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Revert "Update llama.py"
This reverts commit a208ec46e012cf470ecefe6268a66358215df7b6.
* Update llama.py
commit 7da0c50f757b6b2d9cbe660ee68d23700f2e2b0d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 27 04:47:54 2024 +1100
More bug fixes (#133)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update llama.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update swiglu.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update fast_lora.py
* Update save.py
* Update fast_lora.py
* Update utils.py
* Update llama.py
* Update fast_lora.py
* Update swiglu.py
* Update save.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
commit 62fae3aa740869db2fe1522ea38b334ef090d5e7
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 26 04:19:17 2024 +1100
Fix bugs (#129)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
* Update llama.py
* hidden_states
* q_len == 1
* q_len issue
* Update mistral.py
* Update mistral.py
* incorrect inference
* Update to transformers 4.37
* Graceful FA2 error + torch 2.1.1
* Update mapper.py
* Update pyproject.toml
* Fix saving and bnb-4bit
* Update fast_lora.py
* Update fast_lora.py
* remove patching
* Update llama.py
* Update llama.py
* Update swiglu.py
* Repatch
* Update fast_lora.py
commit 04f8771821a57fda5109d60b0fe49bb31d0df15b
Author: Daniel Han <danielhanchen@gmail.com>
Date: Tue Jan 23 03:55:24 2024 +1100
2-4x faster native HF inference (#119)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* fast inference
* Update llama.py
* Update save.py
* Update llama.py
* Mistral correct RoPE scaling
* Max sequence lengths
* Apache 2
* fast_linear_forward
* Update utils.py
* Update utils.py
* No print
* Update utils.py
* Update utils.py
* inference
* Update llama.py
* Fast inference RoPE
* Update llama.py
* Update llama.py
* RoPE
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* LoRA
* Fast LoRA saving
commit 3a9b2dee98fd0547789da9b68e765f054484abc4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 22:20:22 2024 +1100
Hotfix (#118)
* faster saving & inference
* Update llama.py
* Update save.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update mistral.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
* Update llama.py
commit a6f4fb007510aeb2a86500d874f2117e81853d7e
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 05:00:37 2024 +1100
Update save.py
commit 705cac03576fe2fff3923841c102a8bd6b72a65b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:21:54 2024 +1100
Update save.py
commit 16edcb3be2c328f3377aff6555e6435b28980a52
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sun Jan 21 04:13:03 2024 +1100
Update save.py
commit 3d05a74b12edd39638aacf3b44eca65818c6708a
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sun Jan 21 03:43:49 2024 +1100
Fixed saving! (#113)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
* upload_to_huggingface
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
commit bb05d6b6e2af2c8807ae4842dcbc2805c9356599
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 23:23:00 2024 +1100
Hotfix for Jan 2024 Release (#110)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
* Fix quantization_method
* Fix quantization_config
* patch model
* Update llama.py
* Update llama.py
* Update llama.py
* Update save.py
* Update save.py
* tokenizer_save_settings
* Update save.py
* quantization and loftq
* Update save.py
* Update llama.py
* Update save.py
commit 12e75c93d040f99d5a0cc4c4ee162d804c9fbbf4
Author: Daniel Han <danielhanchen@gmail.com>
Date: Sat Jan 20 04:25:06 2024 +1100
Quick fixes (#106)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
* RSLoRA and LoftQ direct support
* Update llama.py
* Update llama.py
* Update llama.py
* Fix DPO + GGUF
commit 52b5ef31e0cdd96d5b980a1581d3c26c5b89c86c
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Sat Jan 20 02:30:31 2024 +1100
Update _utils.py
commit 1a19c38675a35e6121fa4a95438525f306bca26b
Merge: 0a52390 0d6e52b
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:38 2024 +1100
Merge branch 'main' of https://github.com/unslothai/unsloth
commit 0a52390ac29a78399b033349070fe1d1280bd296
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 23:15:20 2024 +1100
Revert quantization methods
commit 0d6e52b5c7723ed5c78b54c9a6eb67a1997f6038
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:57:22 2024 +1100
getattr issues (#103)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
* getattr
commit b3fcea642127ee381a3cf19d33fb8910d066643c
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 22:52:30 2024 +1100
Quick fixes (#101)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
* Quick fixes
* Update llama.py
* Update llama.py
* Update dpo.py
* Update dpo.py
* Update llama.py
* Update save.py
commit d691516ab9d64ea61b0af277f3955336a434694d
Author: Daniel Han <danielhanchen@gmail.com>
Date: Fri Jan 19 04:51:19 2024 +1100
2024 Release (#96)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
* Faster saving + other changes
* Update llama.py
* Saving modules
* spelling
* Update llama.py
* Update save.py
* Update save.py
* Update loader.py
* Update llama.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* patch saving
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* original_model
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* saving to RAM leakage?
* Update save.py
* new_save_directory
* Update save.py
* Update save.py
* Update save.py
* Update save.py
* Update pyproject.toml
* Update pyproject.toml
* Update pyproject.toml
commit 9e2dec16fb29ee97572b4431e892e3f7ca867422
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:41:00 2024 +1100
Update pyproject.toml
commit 396c7245dda2c913e6b97729fd34e7551dc8e9fa
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Fri Jan 19 03:35:17 2024 +1100
Update pyproject.toml
commit 738e91591f3fb39ce03238134fd0d82a84f4b2e3
Author: Daniel Han <danielhanchen@gmail.com>
Date: Thu Jan 11 04:08:03 2024 +1100
Fix some bugs (#83)
* Fix tokenizer, dropout, bias for LoRA
* Update loader.py
* Fix LoRA downcasting
* Update _utils.py
* Saving to GGUF
* fix
* colab_quantize_to_gguf
* move save modules
* save module
* Update __init__.py
* Update save.py
* Temp downgrade due to TRL issue
* Fix up bugs
commit a1da50b5ce53f8e57a1b01db607b32f4d0d862e5
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 23:10:48 2024 +1100
Update README.md (#81)
commit 606e8a928440f396601c1d57a003c0401ba26ec0
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:10:23 2024 +1100
Discord button redo (#80)
commit 0169294ffb19fdb877170529381f25bd0f83fc3c
Author: shimmy <107991372+shimmyshimmer@users.noreply.github.com>
Date: Wed Jan 10 23:02:20 2024 +1100
Update logos (#79)
* HF Perf Button
* Update README.md
Adding new buttons cleanup
* Update README.md
* Delete images/Discord.png
* Delete images/try live demo green.png
* new transparent logos
* Revamping page
* Revamp mainpage
* Update README.md
* Update README.md
commit b2a8c33430e4a31cf7baafe184d448bb50595bb1
Author: Daniel Han <danielhanchen@gmail.com>
Date: Wed Jan 10 20:03:01 2024 +1100
Create FUNDING.yml (#78)
commit c9c1abf29045b3831f62099ff03c5b54b99522a6
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Wed Jan 10 01:02:44 2024 +1100
fix_tokenizer
commit 6efffb46e42543986c637690a045092226af5d61
Author: Daniel Han-Chen <danielhanchen@gmail.com>
Date: Tue Jan 9 23:40:43 2024 +1100
check_tokenizer
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
danielhanchen
added a commit
that referenced
this pull request
May 19, 2026
Three feedback items rolled in:
1. gemini-code-assist (medium): the __IMAGES__ sentinel stripper in
safetensors_agentic.py used `rsplit("\n__IMAGES__:", 1)`, which
leaves the marker visible to the model when the sentinel appears at
the very start of the result (no leading newline) and when multiple
sentinels follow each other back-to-back. Switched to a
`split("__IMAGES__:", 1)[0].rstrip()` cut so the first occurrence
truncates the entire image block. Two new tests pin both edge
cases: leading sentinel and consecutive sentinels.
2. gemini-code-assist (medium): `_detect_safetensors_features` had a
bare `except Exception: pass` around the gpt-oss override probe.
Replaced with `logger.debug(..., exc_info=True)` so unexpected
classifier failures are at least visible in the structured log.
3. CodeQL py/stack-trace-exposure (alerts #95 and #96, CWE-209): the
safetensors tool stream and non-streaming tool completion paths
passed `_friendly_error(e)` into the SSE/JSON error response. The
helper itself never leaks a raw traceback, but with `tb` and `e` in
the same scope CodeQL flags the taint sink. Tightened both
handlers to log the exception server-side (logger.exception) and
emit a constant "An internal error occurred." string over the
wire. The GGUF tool stream handler is left as-is because it talks
to a managed llama-server with a known error surface that
`_friendly_error` already classifies safely.
Tests: 43 tool-loop + 11 capability-advertise + 190 adjacent
regression tests all green locally.
Also merged origin/main (47 commits) so the branch ships against the
current main rather than its base SHA from a week ago.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.