
Support for fine-tuning Pixtral-12B. #2090

Merged — 9 commits merged into modelscope:main on Sep 23, 2024

Conversation

@Jintao-Huang (Collaborator) commented on Sep 21, 2024

PR type

  • More Models or Datasets Support

transformers pixtral: huggingface/transformers#33449
issue: #2053

# infer
CUDA_VISIBLE_DEVICES=0 swift infer --model_type pixtral-12b --dtype fp16

# sft
CUDA_VISIBLE_DEVICES=0 swift sft --model_type pixtral-12b --sft_type lora --dataset coco-en-mini

@Jintao-Huang Jintao-Huang changed the title support pixtral support pixtral-12b Sep 23, 2024
@Jintao-Huang Jintao-Huang changed the title support pixtral-12b Support for fine-tuning Pixtral-12B. Sep 23, 2024
@Jintao-Huang Jintao-Huang merged commit b654118 into modelscope:main Sep 23, 2024
2 checks passed
tastelikefeet added a commit to tastelikefeet/swift that referenced this pull request Sep 26, 2024
* commit '57b3b9e46aa01bdc5c29b5e3d1e2da0582c9b282': (23 commits)
  fix not impl bug (modelscope#2134)
  Support fine-tuning MLLama. (modelscope#2132)
  Support for fine-tuning and deployment of the Llama 3.2 series models. (modelscope#2130)
  support got-ocr2 (modelscope#2123)
  [TorchAcc] fix: fix find_labels and can_return_loss (modelscope#2120)
  fix qwen2-audio (modelscope#2116)
  Fix qwen2-vl zero2/3 (modelscope#2114)
  support vllm & qwen2-vl video (modelscope#2110)
  Support for fine-tuning Llama 3.1 Omni. (modelscope#2106)
  fix infer device_map (modelscope#2105)
  fix cpu infer device_map (modelscope#2103)
  fix dataset preprocess (modelscope#2102)
  fix deploy openai compat (modelscope#2101)
  Fix the issue with media_offset in owl3 when batch_size > 1. (modelscope#2100)
  fix vllm tokenizer (modelscope#2099)
  Support for fine-tuning Pixtral-12B. (modelscope#2090)
  fix multiprocess remove_columns (modelscope#2088)
  fix qwen2.5 template (modelscope#2081)
  dynamic vit gradient_checkpointing (modelscope#2071)
  Support Mistral-small-inst-2409 (modelscope#2077)
  ...