
How can I merge the LoRA weights into the base model? #74

Open
pantDevesh opened this issue Jun 16, 2024 · 7 comments

Comments

@pantDevesh

Is there a script for this?

@mkserge

mkserge commented Jun 17, 2024

You can do something like this:

import safetensors.torch
from mistral_inference.model import Transformer

# Load the base model, apply the LoRA adapter on top, then save the merged weights.
model = Transformer.from_folder(args.model_path, device="cuda:0")
model.load_lora("/path/to/lora.safetensors", device="cuda:0")
safetensors.torch.save_model(model, "/path/to/merged.safetensors")
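
To sanity-check the merged file, safetensors can restore it into a freshly built model (a minimal sketch; load_model is the counterpart of save_model):

# Rebuild the architecture from the base folder, then restore the merged weights.
model = Transformer.from_folder(args.model_path, device="cuda:0")
safetensors.torch.load_model(model, "/path/to/merged.safetensors")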

@forest520

How can I perform inference with a LoRA model from Python code when save_adapters = True?
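
For reference, inference with a saved adapter looks roughly like this (a sketch following the mistral_inference README pattern; all paths are placeholders, and on mistral_inference >= 1.2 the Transformer import moves to mistral_inference.transformer, as noted further down):

from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_inference.model import Transformer  # >= 1.2: mistral_inference.transformer
from mistral_inference.generate import generate

# Load the base model, then apply the saved adapter on top of it.
tokenizer = MistralTokenizer.from_file("/path/to/tokenizer.model.v3")
model = Transformer.from_folder("/path/to/base_model", device="cuda:0")
model.load_lora("/path/to/lora.safetensors")

# Encode a chat prompt, generate, and decode the completion.
request = ChatCompletionRequest(messages=[UserMessage(content="Hello!")])
tokens = tokenizer.encode_chat_completion(request).tokens
out_tokens, _ = generate(
    [tokens], model, max_tokens=64, temperature=0.0,
    eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id,
)
print(tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0]))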

@kehuitt

kehuitt commented Jul 17, 2024

(Quoting @mkserge's merge snippet above.)

When I run this, I get ImportError: cannot import name 'Transformer' from 'mistral_inference.model'. I'm on mistral_inference==1.2.0. How can I fix this? Thanks!

@pandora-s-git (Collaborator)

Try from mistral_inference.transformer import Transformer, as the package was very recently updated with the Codestral Mamba release!
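
For reference, the merge snippet from earlier with the updated import:

import safetensors.torch
from mistral_inference.transformer import Transformer  # new location in mistral_inference >= 1.2

model = Transformer.from_folder(args.model_path, device="cuda:0")
model.load_lora("/path/to/lora.safetensors", device="cuda:0")
safetensors.torch.save_model(model, "/path/to/merged.safetensors")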

@kehuitt

kehuitt commented Jul 20, 2024

A single GPU doesn't seem to be able to load the entire Mixtral-8x7B-Instruct-v0.1 model. How should I merge the model using multiple cards? Thanks!

@leloss

leloss commented Aug 20, 2024

(Quoting @kehuitt's multi-GPU question above.)

Apparently, the only merging method available today relies on loading everything onto a single device, which forces us to rent a 40 GB-GPU instance such as the p4d.24xlarge even for the 7B model. Someone please correct me if I'm wrong.
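
One workaround that avoids GPU memory limits altogether is to merge on CPU at the tensor level, since mathematically the merge is just W_merged = W + scaling * (B @ A). A minimal sketch, assuming the adapter stores "<prefix>.lora_A.weight" / "<prefix>.lora_B.weight" pairs that correspond to a base tensor "<prefix>.weight", and that scaling matches the value used in training (both are assumptions; check your checkpoint's actual key names before relying on this):

import torch
import safetensors.torch

# Load both checkpoints on CPU; no GPU is needed for the merge itself.
base = safetensors.torch.load_file("/path/to/consolidated.safetensors", device="cpu")
lora = safetensors.torch.load_file("/path/to/lora.safetensors", device="cpu")
scaling = 2.0  # assumption: lora_alpha / rank from your training config

for key, a in lora.items():
    if not key.endswith("lora_A.weight"):
        continue
    prefix = key[: -len("lora_A.weight")]
    b = lora[prefix + "lora_B.weight"]
    target = prefix + "weight"
    # Accumulate in float32 for accuracy, then cast back to the base dtype.
    merged = base[target].to(torch.float32) + scaling * (b.to(torch.float32) @ a.to(torch.float32))
    base[target] = merged.to(base[target].dtype)

safetensors.torch.save_file(base, "/path/to/merged.safetensors")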

@abhishekdhankar95

abhishekdhankar95 commented Sep 10, 2024

mistral-finetune requires torch==2.2, whereas mistral-inference requires torch==2.3.0 for all but its first release.
Is there any way to have the two of them in the same conda environment without conflicting requirements?
