
Support for fine-tuning Llama 3.1 Omni. #2106

Merged
merged 7 commits into from
Sep 24, 2024

Conversation

Jintao-Huang (Collaborator)

@Jintao-Huang (Collaborator Author)

ictnlp/LLaMA-Omni#1

@Jintao-Huang (Collaborator Author)

inference

CUDA_VISIBLE_DEVICES=0 swift infer --model_type llama3_1-8b-omni

sft

NPROC_PER_NODE=4 CUDA_VISIBLE_DEVICES=0,1,2,3 swift sft --model_type llama3_1-8b-omni --dataset aishell1-zh-mini#20000 --deepspeed default-zero2 --sft_type lora --output_dir output
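After the LoRA fine-tune above finishes, the adapter checkpoint typically has to be pointed at explicitly for inference. A minimal sketch, assuming ms-swift's usual output layout and its `--ckpt_dir` / `--merge_lora` options (the `vx-xxx/checkpoint-xxx` path is a placeholder — substitute the actual directory your run produced, and verify the flags against your installed swift version):

```shell
# Run inference with the LoRA checkpoint produced by the sft command above.
# --merge_lora true merges the adapter into the base weights before inference.
CUDA_VISIBLE_DEVICES=0 swift infer \
    --model_type llama3_1-8b-omni \
    --ckpt_dir output/llama3_1-8b-omni/vx-xxx/checkpoint-xxx \
    --merge_lora true
```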

@Jintao-Huang Jintao-Huang merged commit 085748c into modelscope:main Sep 24, 2024
1 of 2 checks passed
tastelikefeet added a commit to tastelikefeet/swift that referenced this pull request Sep 26, 2024
* commit '57b3b9e46aa01bdc5c29b5e3d1e2da0582c9b282': (23 commits)
  fix not impl bug (modelscope#2134)
  Support fine-tuning MLLama. (modelscope#2132)
  Support for fine-tuning and deployment of the Llama 3.2 series models. (modelscope#2130)
  support got-ocr2 (modelscope#2123)
  [TorchAcc] fix: fix find_labels and can_return_loss (modelscope#2120)
  fix qwen2-audio (modelscope#2116)
  Fix qwen2-vl zero2/3 (modelscope#2114)
  support vllm & qwen2-vl video (modelscope#2110)
  Support for fine-tuning Llama 3.1 Omni. (modelscope#2106)
  fix infer device_map (modelscope#2105)
  fix cpu infer device_map (modelscope#2103)
  fix dataset preprocess (modelscope#2102)
  fix deploy openai compat (modelscope#2101)
  Fix the issue with media_offset in owl3 when batch_size > 1. (modelscope#2100)
  fix vllm tokenizer (modelscope#2099)
  Support for fine-tuning Pixtral-12B. (modelscope#2090)
  fix multiprocess remove_columns (modelscope#2088)
  fix qwen2.5 template (modelscope#2081)
  dynamic vit gradient_checkpointing (modelscope#2071)
  Support Mistral-small-inst-2409 (modelscope#2077)
  ...
@satheeshkola-532

hey @Jintao-Huang, I have fine-tuned LLaMA-Omni on a custom instruction dataset in another language using the CLI command given above. It generates promising text responses in the desired language, but the generated audio does not support that language (i.e., the vocoder isn't producing audio in the fine-tuned language). Could you please look into this issue so the entire pipeline can be fully fine-tuned?

@fromlimbo commented Oct 16, 2024

@Jintao-Huang Hi, on an A800 I can run sft normally with CUDA_VISIBLE_DEVICES=0, but switching to multi-GPU with CUDA_VISIBLE_DEVICES=0,1 raises NotImplementedError: Cannot copy out of meta tensor; no data!
Is multi-GPU not supported in the default mode?

CUDA_VISIBLE_DEVICES=0 swift sft \
    --model_type llama3_1-8b-omni \
    --model_id_or_path ICTNLP/Llama-3.1-8B-Omni \
    --sft_type lora \
    --dataset aishell1-zh-mini#5000

@SteinsHead

Hello, have you solved the problem yet? We are also trying other languages and are facing the same issue. @satheeshkola-532 @Jintao-Huang

@ParikshitGehlaut

Is anyone else facing an issue with the Hugging Face accelerate version? The latest version is 1.0.1, but I'm getting an error that data_seed requires accelerate >= 1.1.0.
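A hedged workaround, assuming the error comes from the trainer's data_seed version check and your environment allows upgrading (a newer accelerate release may exist even if 1.0.1 was the latest when the comment was written):

```shell
# Upgrade accelerate to satisfy the >= 1.1.0 requirement named in the error.
pip install -U "accelerate>=1.1.0"

# Confirm the installed version afterwards.
python -c "import accelerate; print(accelerate.__version__)"
```

If no release >= 1.1.0 is available on PyPI, dropping the data_seed argument from the training command is the likely alternative.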

@kadirnar

@Jintao-Huang ,

I have never used the Swift library; this will be my first time trying it. I want to train the Llama-Omni model from scratch, using Llama 3.3 70B and Whisper large v3. Is this possible?

7 participants