Merged
362 commits
c7b2220
Update cross_entropy_loss.py
danielhanchen Nov 4, 2024
2d0ab26
Update cross_entropy_loss.py
danielhanchen Nov 4, 2024
428f662
Update cross_entropy_loss.py
danielhanchen Nov 4, 2024
5023ce9
Update cross_entropy_loss.py
danielhanchen Nov 4, 2024
5ca3d4a
Update cross_entropy_loss.py
danielhanchen Nov 4, 2024
3b32d81
int64
danielhanchen Nov 4, 2024
9bae6e2
Update _utils.py
danielhanchen Nov 4, 2024
5123623
Update cross_entropy_loss.py
danielhanchen Nov 4, 2024
4b1d9e2
constexpr
danielhanchen Nov 4, 2024
7d5111a
constexpr
danielhanchen Nov 4, 2024
dff5a52
Update cross_entropy_loss.py
danielhanchen Nov 4, 2024
969d1bd
Update cross_entropy_loss.py
danielhanchen Nov 4, 2024
4b5847f
Update _utils.py
danielhanchen Nov 4, 2024
766bf1e
Update _utils.py
danielhanchen Nov 4, 2024
646f1b7
Update _utils.py
danielhanchen Nov 5, 2024
97f37ac
CE
danielhanchen Nov 5, 2024
cc563fa
Update cross_entropy_loss.py
danielhanchen Nov 5, 2024
f643148
Update _utils.py
danielhanchen Nov 5, 2024
f28d7f6
Update llama.py
danielhanchen Nov 5, 2024
d8103e1
Update _utils.py
danielhanchen Nov 5, 2024
b9e1a49
Update rms_layernorm.py
danielhanchen Nov 5, 2024
56af302
Update rms_layernorm.py
danielhanchen Nov 5, 2024
a3c84a3
Update rms_layernorm.py
danielhanchen Nov 5, 2024
f7d5c56
Update rms_layernorm.py
danielhanchen Nov 5, 2024
8496ff6
Update rms_layernorm.py
danielhanchen Nov 5, 2024
2909eaf
Update rms_layernorm.py
danielhanchen Nov 5, 2024
afc8af6
Update utils.py
danielhanchen Nov 5, 2024
2d8d1e1
Update rms_layernorm.py
danielhanchen Nov 5, 2024
ecc1ad2
Update rms_layernorm.py
danielhanchen Nov 5, 2024
ae7cb78
Update rms_layernorm.py
danielhanchen Nov 5, 2024
22da266
Update rms_layernorm.py
danielhanchen Nov 5, 2024
beb6854
Update rms_layernorm.py
danielhanchen Nov 5, 2024
14c3d2f
Update rms_layernorm.py
danielhanchen Nov 5, 2024
ef4b079
Update rms_layernorm.py
danielhanchen Nov 5, 2024
ef684f8
Update rms_layernorm.py
danielhanchen Nov 5, 2024
3e4c42f
Update rms_layernorm.py
danielhanchen Nov 5, 2024
8f825eb
Update rms_layernorm.py
danielhanchen Nov 5, 2024
bd4ac7b
Update rms_layernorm.py
danielhanchen Nov 5, 2024
6f38731
Update rms_layernorm.py
danielhanchen Nov 5, 2024
2df35d4
typing
danielhanchen Nov 5, 2024
74d89d1
Update rope_embedding.py
danielhanchen Nov 5, 2024
98927ee
types
danielhanchen Nov 5, 2024
f3e2bd6
Disable compiling
danielhanchen Nov 5, 2024
c30bd2a
Update _utils.py
danielhanchen Nov 5, 2024
813cbdd
Update _utils.py
danielhanchen Nov 5, 2024
34ce5d1
Forward hook
danielhanchen Nov 5, 2024
f84cf4b
Update _utils.py
danielhanchen Nov 5, 2024
745814c
Update llama.py
danielhanchen Nov 5, 2024
ab9f8e1
Update _utils.py
danielhanchen Nov 5, 2024
daa7909
Update llama.py
danielhanchen Nov 5, 2024
536a1a6
Update llama.py
danielhanchen Nov 5, 2024
648ca59
Update _utils.py
danielhanchen Nov 5, 2024
486d0d6
Update pyproject.toml
danielhanchen Nov 5, 2024
eb4da9d
Update _utils.py
danielhanchen Nov 5, 2024
da397f4
Update llama.py
danielhanchen Nov 5, 2024
70b65cf
CE Loss
danielhanchen Nov 5, 2024
aeec57e
Update cross_entropy_loss.py
danielhanchen Nov 5, 2024
fb393fc
Update _utils.py
danielhanchen Nov 5, 2024
cab1e72
Update cross_entropy_loss.py
danielhanchen Nov 6, 2024
51fea97
Update cross_entropy_loss.py
danielhanchen Nov 6, 2024
58e541b
Update cross_entropy_loss.py
danielhanchen Nov 6, 2024
0ed0532
Merge branch 'main' into nightly
danielhanchen Nov 6, 2024
ef2c56f
Update llama.py
danielhanchen Nov 6, 2024
24ab0d2
Merge branch 'main' into nightly
danielhanchen Nov 6, 2024
13d7412
Update _utils.py
danielhanchen Nov 6, 2024
5a7eaf8
Update _utils.py
danielhanchen Nov 6, 2024
d2186ed
Update _utils.py
danielhanchen Nov 6, 2024
6434447
Update _utils.py
danielhanchen Nov 6, 2024
67611e6
Update _utils.py
danielhanchen Nov 6, 2024
36c5836
Merge branch 'main' into nightly
danielhanchen Nov 6, 2024
f24aef5
Fix: cast logits to float32 in cross_entropy_forward to prevent error…
Erland366 Nov 6, 2024
3d906e6
Throw error when inferencing longer than max_popsition_embeddings (#1…
Datta0 Nov 6, 2024
de1049b
CLI now handles user input strings for dtype correctly (#1235)
Rabbidon Nov 6, 2024
be72975
Update flex_attention.py
danielhanchen Nov 6, 2024
05170cd
Update _utils.py
danielhanchen Nov 6, 2024
7e0877d
Update _utils.py
danielhanchen Nov 6, 2024
6b5c599
Update flex_attention.py
danielhanchen Nov 6, 2024
1ba9f2e
Update flex_attention.py
danielhanchen Nov 6, 2024
da61c4d
Update loader.py
danielhanchen Nov 6, 2024
3316ee2
Update loader.py
danielhanchen Nov 6, 2024
501ca84
Update flex_attention.py
danielhanchen Nov 6, 2024
ce621b7
Update flex_attention.py
danielhanchen Nov 6, 2024
4b01ff1
Update flex_attention.py
danielhanchen Nov 6, 2024
ef5052a
Update flex_attention.py
danielhanchen Nov 7, 2024
52bca32
Update _utils.py
danielhanchen Nov 7, 2024
68b8d62
Merge branch 'main' into nightly
danielhanchen Nov 7, 2024
15da065
Merge branch 'main' into nightly
danielhanchen Nov 7, 2024
8b3e9c2
Update cross_entropy_loss.py
danielhanchen Nov 7, 2024
3a1e7ef
Update _utils.py
danielhanchen Nov 7, 2024
f1ec165
Update tokenizer_utils.py
danielhanchen Nov 10, 2024
a4e9705
Update tokenizer_utils.py
danielhanchen Nov 10, 2024
92c6a27
Update tokenizer_utils.py
danielhanchen Nov 10, 2024
673f541
Update tokenizer_utils.py
danielhanchen Nov 10, 2024
8fe9109
Update tokenizer_utils.py
danielhanchen Nov 11, 2024
ad41479
triton_cast
danielhanchen Nov 11, 2024
fcf2009
Update utils.py
danielhanchen Nov 11, 2024
af9ba07
Qwen 2.5 Coder
danielhanchen Nov 12, 2024
e99acdd
Merge branch 'main' into nightly
danielhanchen Nov 13, 2024
3fec577
Fix/export mistral (#1281)
Erland366 Nov 13, 2024
03c6243
DOC Update - Update README.md with os.environ in example (#1269)
udaygirish Nov 13, 2024
10565ef
fix/get_chat_template (#1246)
Erland366 Nov 13, 2024
dc0232c
fix/sft-trainer (#1276)
Erland366 Nov 14, 2024
84d6d36
Update __init__.py
danielhanchen Nov 14, 2024
a31027c
Update trainer.py
danielhanchen Nov 14, 2024
035bcce
Update trainer.py
danielhanchen Nov 14, 2024
597169c
Update trainer.py
danielhanchen Nov 14, 2024
11b350f
Update tokenizer_utils.py
danielhanchen Nov 14, 2024
e4d1754
Merge branch 'main' into nightly
danielhanchen Nov 14, 2024
3b11ae7
Update llama.py
danielhanchen Nov 14, 2024
5eb971f
Fix #853
danielhanchen Nov 14, 2024
a146521
fix/sfttrainer-compatibility (#1293)
Erland366 Nov 15, 2024
74382de
Update rms_layernorm.py
danielhanchen Nov 16, 2024
a6b8dda
Update rms_layernorm.py
danielhanchen Nov 16, 2024
82e4466
Gemma
danielhanchen Nov 16, 2024
50b0aba
Update rms_layernorm.py
danielhanchen Nov 16, 2024
9773fee
Update gemma2.py
danielhanchen Nov 16, 2024
1a3d2d5
Cut Cross Entropy
danielhanchen Nov 17, 2024
4f51d87
Update llama.py
danielhanchen Nov 17, 2024
b18edb9
Cut Cross Entropy
danielhanchen Nov 17, 2024
0a5c519
Update llama.py
danielhanchen Nov 17, 2024
59caca9
Update llama.py
danielhanchen Nov 17, 2024
49df51f
Update llama.py
danielhanchen Nov 18, 2024
cc314c8
Update __init__.py
danielhanchen Nov 18, 2024
42a76f1
Update __init__.py
danielhanchen Nov 18, 2024
4ed6ae8
Update _utils.py
danielhanchen Nov 18, 2024
2fade27
Update _utils.py
danielhanchen Nov 18, 2024
07ee0da
Update _utils.py
danielhanchen Nov 18, 2024
8eae7f9
Update _utils.py
danielhanchen Nov 18, 2024
6ab1d3a
Update _utils.py
danielhanchen Nov 18, 2024
d5c1c17
Update _utils.py
danielhanchen Nov 18, 2024
4abf3de
Update _utils.py
danielhanchen Nov 18, 2024
b144ff4
Update _utils.py
danielhanchen Nov 18, 2024
b9b7a5b
Update mapper.py
danielhanchen Nov 18, 2024
9f93c49
Update _utils.py
danielhanchen Nov 19, 2024
d00dc52
Update _utils.py
danielhanchen Nov 19, 2024
caf4cd4
Update _utils.py
danielhanchen Nov 19, 2024
4cd14bb
Update _utils.py
danielhanchen Nov 19, 2024
a0e709b
Update _utils.py
danielhanchen Nov 19, 2024
f92c16d
Update _utils.py
danielhanchen Nov 19, 2024
c7c984f
Update _utils.py
danielhanchen Nov 19, 2024
81538c3
Update _utils.py
danielhanchen Nov 19, 2024
029f5d5
Update _utils.py
danielhanchen Nov 20, 2024
bd1a175
patch_fast_lora
danielhanchen Nov 20, 2024
cabf21f
vision
danielhanchen Nov 20, 2024
7d5c9ed
Update fast_lora.py
danielhanchen Nov 21, 2024
4ddd1bb
Update _utils.py
danielhanchen Nov 21, 2024
1c94f04
Update _utils.py
danielhanchen Nov 21, 2024
d6ccbfb
Vision
danielhanchen Nov 21, 2024
8a44b6c
Update trainer.py
danielhanchen Nov 21, 2024
f077680
Merge branch 'main' into nightly
danielhanchen Nov 21, 2024
d5b8408
Update save.py
danielhanchen Nov 21, 2024
a5d4084
FastBaseVisionModel
danielhanchen Nov 21, 2024
7f5a9a7
Update loader_utils.py
danielhanchen Nov 21, 2024
d160618
Update vision.py
danielhanchen Nov 21, 2024
2420736
Update loader.py
danielhanchen Nov 21, 2024
0747078
Update vision.py
danielhanchen Nov 21, 2024
1f32b23
Update loader.py
danielhanchen Nov 21, 2024
a45e564
Update vision.py
danielhanchen Nov 21, 2024
767a31f
Update _utils.py
danielhanchen Nov 21, 2024
1ad1b46
tokenizer_name
danielhanchen Nov 21, 2024
26f2337
Update loader.py
danielhanchen Nov 21, 2024
5ab4b60
Update vision.py
danielhanchen Nov 21, 2024
fc7d747
Update save.py
danielhanchen Nov 21, 2024
e0b14fa
Update save.py
danielhanchen Nov 21, 2024
677cf9f
Update vision.py
danielhanchen Nov 21, 2024
8ab5dcb
Update vision.py
danielhanchen Nov 21, 2024
adaf6ee
Update vision.py
danielhanchen Nov 21, 2024
1a548f3
Update vision.py
danielhanchen Nov 21, 2024
5886ecb
Update vision.py
danielhanchen Nov 21, 2024
a98fc9c
Update vision.py
danielhanchen Nov 21, 2024
535e899
Update _utils.py
danielhanchen Nov 21, 2024
482b1fc
Merge branch 'main' into nightly
danielhanchen Nov 22, 2024
3782a59
Update loader.py
danielhanchen Nov 23, 2024
80adcd6
kwargs
danielhanchen Nov 26, 2024
2bb0660
logits
danielhanchen Nov 26, 2024
e17dca4
Update llama.py
danielhanchen Nov 26, 2024
b93e9cd
Update llama.py
danielhanchen Nov 26, 2024
f7edb15
Update llama.py
danielhanchen Nov 26, 2024
f7278b2
Update _utils.py
danielhanchen Nov 26, 2024
815576e
Update _utils.py
danielhanchen Nov 26, 2024
608e31b
Update _utils.py
danielhanchen Nov 26, 2024
6595603
error
danielhanchen Nov 26, 2024
bfc1c3e
Update _utils.py
danielhanchen Nov 26, 2024
7f3adad
Update _utils.py
danielhanchen Nov 26, 2024
5bd41e3
Update _utils.py
danielhanchen Nov 26, 2024
7949ddf
Update _utils.py
danielhanchen Nov 26, 2024
2febe57
Update _utils.py
danielhanchen Nov 26, 2024
23716bf
Update _utils.py
danielhanchen Nov 26, 2024
071b29b
Update _utils.py
danielhanchen Nov 26, 2024
b509681
Update _utils.py
danielhanchen Nov 26, 2024
a6ec1e6
Update _utils.py
danielhanchen Nov 26, 2024
2c38d3d
Update _utils.py
danielhanchen Nov 26, 2024
7fef1a9
Update _utils.py
danielhanchen Nov 26, 2024
6990917
Update _utils.py
danielhanchen Nov 26, 2024
8f4f2fe
Update _utils.py
danielhanchen Nov 26, 2024
20e182b
Update _utils.py
danielhanchen Nov 26, 2024
0162d22
Update _utils.py
danielhanchen Nov 26, 2024
5d69df6
Update loader.py
danielhanchen Nov 26, 2024
833f64d
Update llama.py
danielhanchen Nov 26, 2024
4e24572
Update vision.py
danielhanchen Nov 26, 2024
cfb769a
Update loader.py
danielhanchen Nov 26, 2024
7321fe9
Old torch versions
danielhanchen Nov 26, 2024
49caeb2
Update loader.py
danielhanchen Nov 26, 2024
ed1c7a9
Update loader.py
danielhanchen Nov 26, 2024
587223b
prints
danielhanchen Nov 26, 2024
6099293
recheck
danielhanchen Nov 26, 2024
207d047
Update loader.py
danielhanchen Nov 26, 2024
90f79d2
Update loader.py
danielhanchen Nov 26, 2024
d3b147b
Update _utils.py
danielhanchen Nov 26, 2024
4e55168
Update _utils.py
danielhanchen Nov 26, 2024
8bf0404
Update mapper.py
danielhanchen Nov 26, 2024
8a6da33
Feat/kto (#1316)
Erland366 Nov 26, 2024
98a78dd
Fix orpo/dpo trainer (#1286)
dame-cell Nov 26, 2024
5994660
Merge branch 'main' into nightly
danielhanchen Nov 27, 2024
d4c06c0
skip modules
danielhanchen Nov 28, 2024
2e53938
Update vision.py
danielhanchen Nov 28, 2024
aad5b1f
Update llama.py
danielhanchen Dec 1, 2024
2515d19
Update llama.py
danielhanchen Dec 1, 2024
6d46853
Update llama.py
danielhanchen Dec 1, 2024
a2b7a5e
Update llama.py
danielhanchen Dec 1, 2024
370e460
Update llama.py
danielhanchen Dec 1, 2024
a1b9d74
Update llama.py
danielhanchen Dec 1, 2024
9dd59ae
Update llama.py
danielhanchen Dec 1, 2024
e6aa302
Update llama.py
danielhanchen Dec 1, 2024
39c01fa
Update llama.py
danielhanchen Dec 1, 2024
f2f6e1d
Update llama.py
danielhanchen Dec 1, 2024
78160ab
Update llama.py
danielhanchen Dec 1, 2024
ae7afa2
Fix llama.cpp
danielhanchen Dec 4, 2024
56fa57f
Update save.py
danielhanchen Dec 4, 2024
41a045b
Update save.py
danielhanchen Dec 4, 2024
1642ded
Update vision.py
danielhanchen Dec 4, 2024
cf993d7
Update save.py
danielhanchen Dec 4, 2024
5041f9f
Update save.py
danielhanchen Dec 4, 2024
70893fc
Update save.py
danielhanchen Dec 4, 2024
4361fde
Update save.py
danielhanchen Dec 4, 2024
4c90b56
Update save.py
danielhanchen Dec 4, 2024
c0c8264
Update save.py
danielhanchen Dec 4, 2024
236604f
Update save.py
danielhanchen Dec 4, 2024
2fbc62b
Update _utils.py
danielhanchen Dec 4, 2024
410cf59
Update save.py
danielhanchen Dec 4, 2024
6237e2b
Update save.py
danielhanchen Dec 4, 2024
8e8efab
Merge branch 'main' into nightly
danielhanchen Dec 4, 2024
4f9bbac
Merge branch 'main' into nightly
danielhanchen Dec 4, 2024
8d67597
Update mapper.py
danielhanchen Dec 4, 2024
1cf7965
modules
danielhanchen Dec 4, 2024
c91d183
Merge branch 'main' into nightly
danielhanchen Dec 5, 2024
9bc2609
Fix vision model tokenizer padding side. (#1384)
ZewenShen Dec 5, 2024
a8d8c97
Add citation section to README.md (#1377)
Erland366 Dec 5, 2024
15d7fbb
Granite support (#1218)
Datta0 Dec 5, 2024
2b5b771
Llama 3.3
danielhanchen Dec 6, 2024
12 changes: 12 additions & 0 deletions README.md
@@ -469,6 +469,18 @@ Two Tesla T4s on Kaggle
![](https://i.ibb.co/sJ7RhGG/image-41.png)
<br>

### Citing

You can cite the Unsloth repo as follows:
```bibtex
@software{unsloth,
author = {Daniel Han, Michael Han and Unsloth team},
title = {Unsloth},
url = {https://github.com/unslothai/unsloth},
year = {2023}
}
```

### Thank You to
- [HuyNguyen-hust](https://github.com/HuyNguyen-hust) for making [RoPE Embeddings 28% faster](https://github.com/unslothai/unsloth/pull/238)
- [RandomInternetPreson](https://github.com/RandomInternetPreson) for confirming WSL support
2 changes: 2 additions & 0 deletions unsloth/models/__init__.py
@@ -12,6 +12,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.


+from .granite import FastGraniteModel
from .loader import FastLanguageModel, FastVisionModel
from .llama import FastLlamaModel
from .mistral import FastMistralModel
4 changes: 2 additions & 2 deletions unsloth/models/_utils.py
@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

-__version__ = "2024.12.2"
+__version__ = "2024.12.3"

__all__ = [
"prepare_model_for_kbit_training",
@@ -188,7 +188,7 @@ def patch_mistral_nemo_config(config):

from transformers import __version__ as transformers_version
from transformers import PretrainedConfig
-model_architectures = ["llama", "mistral", "gemma", "gemma2", "qwen2",]
+model_architectures = ["llama", "mistral", "gemma", "gemma2", "qwen2", "granite"]

for model_name in model_architectures:
    config_filepath = f"transformers.models.{model_name}.configuration_{model_name}"
2 changes: 1 addition & 1 deletion unsloth/models/gemma2.py
@@ -193,7 +193,7 @@ def Gemma2DecoderLayer_fast_forward(
     output_attentions=output_attentions,
     use_cache=use_cache,
     padding_mask=padding_mask,
-    _flag_for_generation=True,
+    _flag_for_generation=self._flag_for_generation,
 )
 hidden_states = fast_rms_layernorm_inference_gemma(self.post_attention_layernorm, hidden_states, out_weight)
 hidden_states += residual