[Fix] Fix Mixtral LoRA setting #312

LZHgrla · 2024-01-11T15:45:56Z

Set `target_modules` explicitly to avoid adding LoRA to the gate layers.
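A minimal sketch of the idea, not the exact change in this PR: LoRA adapters are attached to modules whose names match an explicit list, so leaving the router out of that list keeps it untouched. The module names below are assumptions modeled on the Hugging Face Mixtral layout (`gate` as the MoE router, `w1`/`w2`/`w3` as expert projections), the target list is illustrative, and `select_lora_targets` is a hypothetical helper that only imitates matching on the last name component.

```python
# Hypothetical module names for one decoder layer of a Mixtral-style MoE model.
# They are assumptions for illustration, not read from a real checkpoint.
MODULE_NAMES = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.self_attn.o_proj",
    "model.layers.0.block_sparse_moe.gate",  # MoE router: should NOT get LoRA
    "model.layers.0.block_sparse_moe.experts.0.w1",
    "model.layers.0.block_sparse_moe.experts.0.w2",
    "model.layers.0.block_sparse_moe.experts.0.w3",
]

# Illustrative explicit target list: attention and expert projections only.
TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj", "w1", "w2", "w3"]


def select_lora_targets(module_names, target_modules):
    """Return the module names whose last dotted component is in target_modules.

    This is a simplified stand-in for name-based target matching; it is not
    the matching code of any particular library.
    """
    targets = set(target_modules)
    return [name for name in module_names if name.rsplit(".", 1)[-1] in targets]


if __name__ == "__main__":
    for name in select_lora_targets(MODULE_NAMES, TARGET_MODULES):
        print("LoRA on:", name)
```

In an actual config, a list like `TARGET_MODULES` would be passed as the `target_modules` argument of the LoRA configuration; the right names depend on the model implementation in use.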

* [Improve] Redesign the `prompt_template` (#294) * update * update cfgs * update * fix bugs * upload docs * rename * update * Revert "update cfgs" This reverts commit 93966aa. * update cfgs * update * rename * rename * fix bc * fix stop_word * fix * fix * Update prompt_template.md * [Fix] Fix errors about `stop_words` (#313) * fix bugs * Update mmbench.py * [Fix] Fix Mixtral LoRA setting (#312) set target_modules * [Feature] Support DeepSeek-MoE (#311) * support deepseek moe * update docs * update * update * [Fix] Set `torch.optim.AdamW` as the default optimizer (#318) fix * [FIx] Fix `pth_to_hf` for LLaVA model (#316) Update pth_to_hf.py * [Improve] Add `demo_data` examples (#278) * update examples * add examples * add json template config * rename * update * update * update * [Feature] Support InternLM2 (#321) * add cfgs * add internlm2 template * add dispatch * add docs * update readme * update * [Fix] Fix the resume of seed (#309) * fix * Update utils.py * [Feature] Accelerate `xtuner xxx` (#307) * accelerate cli * Update entry_point.py * Update entry_point.py --------- Co-authored-by: Zhihao Lin <[email protected]> * [Fix] Fix InternLM2 url (#325) * fix * update * Update README.md * Update README_zh-CN.md * [Fix] Limit the version of python, `>=3.8, <3.11` (#327) update * [Fix] Add `trust_remote_code=True` for AutoModel (#328) update * [Docs] Improve README (#326) * update * Update README.md * Update README.md * Update README.md * Update README_zh-CN.md * update * update * fix pre-commit * update * bump verion to v0.1.12 (#323) bump v0.1.12 * set dev version (#329) Update version.py * [Docs] Add LLaVA-InternLM2 results (#332) * update results * update * Update internlm2_chat template (#339) Update internlm2 template * [Fix] Fix examples demo_data configs (#334) fix * bump version to v0.1.13 (#340) update * set dev version (#341) update * [Feature] More flexible `TrainLoop` (#348) * add new loop * rename * fix pre-commit * add max_keep_ckpts * fix * update 
cfgs * update examples * fix * update * update llava * update * update * update * update * [Feature]Support CEPH (#266) * support petrelfs * fix deepspeed save/load/resume * add ENV to toggle petrelfs * support hf save_pretrained * patch deepspeed engine * [Improve] Add `--repetition-penalty` for `xtuner chat` (#351) fix * [Feature] Support MMBench DDP Evaluate (#300) * support ddp mmbench evaluate * Update xtuner/tools/mmbench.py Co-authored-by: Zhihao Lin <[email protected]> * Update xtuner/tools/mmbench.py Co-authored-by: Zhihao Lin <[email protected]> * update minimum version of mmengine * Update runtime.txt --------- Co-authored-by: Zhihao Lin <[email protected]> * [Fix] `KeyError` of `encode_fn` (#361) fix * [Fix] Fix `batch_size` of full fine-tuing LLaVA-InternLM2 (#360) fix * [Fix] Remove `system` for `alpaca_map_fn` (#363) update * [Fix] Use `DEFAULT_IMAGE_TOKEN` instead of `'<image>'` (#353) Update utils.py * [Feature] Efficient SFT (#302) * add local_attn_args_to_messagehub_hook * add internlm repo sampler * add internlm repo dataset and collate_fn * dispatch internlm1 and internlm2 local attn * add internlm2 config * add internlm1 and intenrlm2 config * add internlm2 template * fix replace_internlm1_rote bugs * add internlm1 and internlm2 config templates * change priority of EvaluateChatHook * fix docs * fix config * fix bug * set rotary_base according the latest internlm2 config * add llama local attn * add llama local attn * update intern_repo_dataset docs when using aliyun * support using both hf load_dataset and intern_repo packed_dataset * add configs * add opencompass doc * update opencompass doc * use T data order * use T data order * add config * add a tool to get data order * support offline processing untokenized dataset * add docs * add doc about only saving model weights * add doc about only saving model weights * dispatch mistral * add mistral template * add mistral template * fix torch_dtype * reset pre-commit-config * fix config * fix 
internlm_7b_full_intern_repo_dataset_template * update local_attn to varlen_attn * rename local_attn * fix InternlmRepoSampler and train.py to support resume * modify Packer to support varlen attn * support varlen attn in default pipeline * update mmengine version requirement to 0.10.3 * Update ceph.md * delete intern_repo_collate_fn * delete intern_repo_collate_fn * delete useless files * assert pack_to_max_length=True if use_varlen_attn=True * add varlen attn doc * add varlen attn to configs * delete useless codes * update * update * update configs * fix priority of ThroughputHook and flake8 ignore W504 * using map_fn to set length attr to dataset * support split=None in process_hf_dataset * add dataset_format_mapping * support preprocess ftdp and normal dataset * refactor process_hf_dataset * support pack dataset in process_untokenized_datasets * add xtuner_dataset_timeout * using gloo backend for monitored barrier * set gloo timeout * fix bugs * fix configs * refactor intern repo dataset docs * fix doc * fix lint --------- Co-authored-by: pppppM <[email protected]> Co-authored-by: pppppM <[email protected]> * [Fix] Add `attention_mask` for `default_collate_fn` (#371) fix * [Fix] Update requirements (#369) Update runtime.txt * [Fix] Fix rotary_base, add `colors_map_fn` to `DATASET_FORMAT_MAPPING` and rename 'internlm_repo' to 'intern_repo' (#372) * fix * rename internlm_repo to intern_repo * add InternlmRepoSampler for preventing bc break * add how to install flash_attn to doc * update (#377) * Delete useless codes and refactor process_untokenized_datasets (#379) * delete useless codes * refactor process_untokenized_datasets: add ftdp to dataset-format * fix lint * [Feature] support flash attn 2 in internlm1, internlm2 and llama (#381) support flash attn 2 in internlm1, internlm2 and llama * [Fix] Fix installation docs of mmengine in `intern_repo_dataset.md` (#384) update * [Fix] Update InternLM2 `apply_rotary_pos_emb` (#383) update * [Feature] support saving 
eval output before save checkpoint (#385) * support saving eval output before save checkpoint * refactor * [Fix] lr scheduler setting (#394) * fix lr scheduler setting * fix more --------- Co-authored-by: zilong.guo <[email protected]> Co-authored-by: LZHgrla <[email protected]> * [Fix] Remove pre-defined `system` of `alpaca_zh_map_fn` (#395) fix * [Feature] Support `Qwen1.5` (#407) * rename * update docs * update template * update * add cfgs * update * update * [Fix] Fix no space in chat output using InternLM2. (#357) (#404) * [Fix] Fix no space in chat output using InternLM2. (#357) * Update chat.py * Update utils.py * Update utils.py * fix pre-commit --------- Co-authored-by: Zhihao Lin <[email protected]> Co-authored-by: LZHgrla <[email protected]> * [Fix] typo: `--system-prompt` to `--system-template` (#406) fix * [Improve] Add `output_with_loss` for dataset process (#408) update * [Fix] Fix dispatch to support transformers>=4.36 & Add USE_TRITON_KERNEL environment variable (#411) * dispatch support transformers>=4.36 * add USE_TRITON_KERNEL environment variable * raise RuntimeError use triton kernels on cpu * fix lint * [Feature]Add InternLM2-1_8b configs (#396) * [Feature]Add InternLM2-Chat-1_8b full config * [Feature]Add InternLM2-Chat-1_8b full config * update --------- Co-authored-by: LZHgrla <[email protected]> Co-authored-by: Zhihao Lin <[email protected]> * [Fix] Fix `extract_json_objects` (#419) * [Fix] Fix pth_to_hf error (#426) fix * [Feature] Support `Gemma` (#429) * added gemma config and template * check config and make sure the consistancy * Update xtuner/configs/gemma/gemma_2b_base/gemma_2b_base_qlora_alpaca_e3.py Co-authored-by: Zhihao Lin <[email protected]> * Update xtuner/configs/gemma/gemma_2b_base/gemma_2b_base_full_alpaca_e3.py Co-authored-by: Zhihao Lin <[email protected]> * Update xtuner/configs/gemma/gemma_7b_base/gemma_7b_base_full_alpaca_e3.py Co-authored-by: Zhihao Lin <[email protected]> * Update 
xtuner/configs/gemma/gemma_7b_base/gemma_7b_base_qlora_alpaca_e3.py Co-authored-by: Zhihao Lin <[email protected]> * Update xtuner/utils/templates.py Co-authored-by: Zhihao Lin <[email protected]> * update * added required version * update * update --------- Co-authored-by: Zhihao Lin <[email protected]> Co-authored-by: LZHgrla <[email protected]> * add refcoco to llava (#425) * add base dataset * update dataset generation * update refcoco * add convert refcooc * add eval_refcoco * add config * update dataset * fix bug * fix bug * update data prepare * fix error * refactor eval_refcoco * fix bug * fix error * update readme * add entry_point * update config * update config * update entry point * update * update doc * update --------- Co-authored-by: jacky <[email protected]> * [Fix] Inconsistent BatchSize of `LengthGroupedSampler` (#436) update * bump version to v0.1.14 (#431) update * set dev version (#437) * Update version.py * Update version.py * [Bugs] Fix bugs when using EpochBasedRunner (#439) fix bugs when using epochbasedrunner * [Feature] Support processing ftdp dataset and custom dataset offline (#410) * support smart_tokenizer_and_embedding_resize * replace ast with json.loads * support list_dataset_format cli * add doc about ftdp and custom dataset * add custom dataset template * add args name to process_hf_dataset * use new process_untokenized_datasets * support tokenize_ftdp_datasets * add mistral_7b_w_tokenized_dataset config * update doc * update doc * add comments * fix data save path * smart_tokenizer_and_embedding_resize support zero3 * fix lint * add data format to internlm2_7b_full_finetune_custom_dataset_e1.py * add a data format example to configs associated with finetuning custom dataset * add a data format example to configs associated with finetuning custom dataset * fix lint * Update prompt_template.md (#441) Fixed a typo * [Doc] Split finetune_custom_dataset.md to 6 parts (#445) * split finetune_custom_dataset.md to 6 parts * refactor
custom_dataset and ftdp_dataset related docs * fix comments * fix pre-commit --------- Co-authored-by: pppppM <[email protected]> Co-authored-by: RangiLyu <[email protected]> Co-authored-by: whcao <[email protected]> Co-authored-by: pppppM <[email protected]> Co-authored-by: gzlong96 <[email protected]> Co-authored-by: zilong.guo <[email protected]> Co-authored-by: Ko Sung <[email protected]> Co-authored-by: 不要葱姜蒜 <[email protected]> Co-authored-by: fanqiNO1 <[email protected]> Co-authored-by: PommesPeter <[email protected]> Co-authored-by: LKJacky <[email protected]> Co-authored-by: jacky <[email protected]> Co-authored-by: xzw <[email protected]>

* [Docs] Readthedocs (#304) * init readthedocs * add en docs * add zh docs * fix lint * [Fix] Support ZH Readthedocs (#305) * add zh yaml * test zh cn * test yaml path * pass * update conf.py * [Docs] Document optimization (#362) Document optimization * [Docs] Update Docs docs/en/get_started/installation.md (#364) * Update the Chinese installation.md: complete the Chinese "Installation: Installation Process: Best Practices" and "Installation: Verify Installation" sections * Update installation.md en * Update installation.md zh typo * [Docs] Refine Quick Start (#378) * [Docs] Add zh_cn quickstart * [Fix] Fix color rendering logic for github * [Fix] Fix comments * [Fix] Add hyperlinks * [Docs] Add en quickstart * [Fix] Fix comments * Update overview.md (#412) * Update overview.md * Update overview.md Revised as requested, please review * Update overview.md Further corrections * Update overview.md Improvements as requested * Merge branch 'main' into 'docs' (#463) * [Docs] Add `docs/zh_cn/preparation/pretrained_model.md` (#462) * fix pre-commit * update * Update pretrained_model.md * Update pretrained_model.md * fix pre-commit * Update pretrained_model.md * update * update * update * update * Update pretrained_model.md * [Docs] Add `docs/zh_cn/training/multi_modal_dataset.md` (#503) * update * update * [Docs] Improve readthedocs style (#545) * update style * update style * fix requirements * fix * fix * add logo * update * update * update * [Docs] `.md` to `.rst` (#544) * update rst * update rst * update rst * [Docs] Add `docs/zh_cn/training/custom_pretrain_dataset.rst` (#535) * update * update * update rst * [Docs] Add docs about training on large scale dataset (#517) * add train_on_large_scale_dataset doc * refine doc * add llava offline doc * refine doc * replace md with rst * refine rst * refine rst * [Docs] Add internevo migration related documents (#506) * add internevo related * fix comments * refine doc * rename internlm2_7b_w_tokenized_dataset.py to internlm2_7b_w_internevo_dataset.py * refine doc * replace md with rst * refine rst * refine rst * [Docs] Add `docs/zh_cn/training/modify_settings.rst` (#490) * update * update * update * update * update * update * Update modify_settings.md * Update modify_settings.md * update * Update docs/zh_cn/training/modify_settings.md Co-authored-by: Haian Huang(深度眸) <[email protected]> * update deepspeed * update rst * update rst --------- Co-authored-by: Haian Huang(深度眸) <[email protected]> * [Docs] Add `length_grouped_sampler.rst` (#511) * update * update * update * Update length_grouped_sampler.md * update rst * Update length_grouped_sampler.rst Co-authored-by: whcao <[email protected]> --------- Co-authored-by: whcao <[email protected]> * [Docs] Add accelerate related (#504) * add accelerate related * split accelerate docs * fix comments * add speed benchmark * explain why qlora can not be used with zero3 * refine doc * fix configs * refine doc * refine doc * refine configs * add benchmark to index.rst * refine doc * add hyper-param docs * refine doc * add explanation about memory cost optimization when using zero * add figure to show the speed comparison * refine figures * refine doc * fix figures * refine figures * update figures and benchmark configs * add pack rst * delete pack md * replace md with rst * replace md with rst * replace md with rst * replace md with rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst --------- Co-authored-by: pppppM <[email protected]> * [Docs] Add visualization docs (#516) * add visualization docs * delete other visualization tools and add explanation about how to use tensorboard * replace md with rst --------- Co-authored-by: pppppM <[email protected]> * [Docs] Add docs about SFT with custom dataset (#514) * add custom sft dataset docs * add custom dataset template configs * add openai data format * refine doc * update (#2) * replace md with rst --------- Co-authored-by: Zhihao Lin <[email protected]> Co-authored-by: pppppM <[email protected]> * [Docs] Add `docs/zh_cn/training/open_source_dataset.rst` (#502) * update * update * update * update * format table * fix typo * update rst --------- Co-authored-by: pppppM <[email protected]> * [Docs] Add `docs/zh_cn/preparation/prompt_template.rst` (#475) * update * update * Update prompt_template.md * Update prompt_template.md * update * add tips * update * update rst --------- Co-authored-by: pppppM <[email protected]> * [Docs] Add Sequence Parallel documents (#505) * add sp related * add sequence parallel supported models * refine doc * Update docs/zh_cn/training/training_extreme_long_sequence.md Co-authored-by: Haian Huang(深度眸) <[email protected]> * refine doc * refine doc * test the capability boundary of zero3 * refine doc * test rst * test rst * add training speed figure * delete debug rst * sp need flash_attn * WIP * replace md with rst * refine rst * refine rst * add explanation about why pt 2.1 is not accepted * refine rst * refine rst * add loss curve --------- Co-authored-by: Haian Huang(深度眸) <[email protected]> Co-authored-by: pppppM <[email protected]> * [Docs] Update `docs/zh_cn` outline (#556) update * [Docs] Update `docs/en` theme (#557) * update * update * update * update * update * update * update * update * [Docs] Add tokenizer to sft in Case 2 (#584) add tokenizer to sft in Case 2 * [Docs] Improve the Rendering Effect of Readthedocs (#664) * refine get_start and training * fix acceleration * update maxdepth * refine internevo migration * refine internevo * fix typos * fix lint --------- Co-authored-by: zhengjie.xu <[email protected]> Co-authored-by: Ma Zhiming <[email protected]> Co-authored-by: fanqiNO1 <[email protected]> Co-authored-by: Jianfeng777 <[email protected]> Co-authored-by: Zhihao Lin <[email protected]> Co-authored-by: RangiLyu <[email protected]> Co-authored-by: whcao <[email protected]> Co-authored-by: gzlong96 <[email protected]> Co-authored-by: zilong.guo <[email protected]> Co-authored-by: Ko Sung <[email protected]> Co-authored-by: 不要葱姜蒜 <[email protected]> Co-authored-by: PommesPeter <[email protected]> Co-authored-by: LKJacky <[email protected]> Co-authored-by: jacky <[email protected]> Co-authored-by: xzw <[email protected]> Co-authored-by: Haian Huang(深度眸) <[email protected]>

set target_modules

(InternLM#384) update * [Fix] Update InternLM2 `apply_rotary_pos_emb` (InternLM#383) update * [Feature] support saving eval output before save checkpoint (InternLM#385) * support saving eval output before save checkpoint * refactor * [Fix] lr scheduler setting (InternLM#394) * fix lr scheduler setting * fix more --------- Co-authored-by: zilong.guo <[email protected]> Co-authored-by: LZHgrla <[email protected]> * [Fix] Remove pre-defined `system` of `alpaca_zh_map_fn` (InternLM#395) fix * [Feature] Support `Qwen1.5` (InternLM#407) * rename * update docs * update template * update * add cfgs * update * update * [Fix] Fix no space in chat output using InternLM2. (InternLM#357) (InternLM#404) * [Fix] Fix no space in chat output using InternLM2. (InternLM#357) * Update chat.py * Update utils.py * Update utils.py * fix pre-commit --------- Co-authored-by: Zhihao Lin <[email protected]> Co-authored-by: LZHgrla <[email protected]> * [Fix] typo: `--system-prompt` to `--system-template` (InternLM#406) fix * [Improve] Add `output_with_loss` for dataset process (InternLM#408) update * [Fix] Fix dispatch to support transformers>=4.36 & Add USE_TRITON_KERNEL environment variable (InternLM#411) * dispatch support transformers>=4.36 * add USE_TRITON_KERNEL environment variable * raise RuntimeError use triton kernels on cpu * fix lint * [Feature]Add InternLM2-1_8b configs (InternLM#396) * [Feature]Add InternLM2-Chat-1_8b full config * [Feature]Add InternLM2-Chat-1_8b full config * update --------- Co-authored-by: LZHgrla <[email protected]> Co-authored-by: Zhihao Lin <[email protected]> * [Fix] Fix `extract_json_objects` (InternLM#419) * [Fix] Fix pth_to_hf error (InternLM#426) fix * [Feature] Support `Gemma` (InternLM#429) * added gemma config and template * check config and make sure the consistancy * Update xtuner/configs/gemma/gemma_2b_base/gemma_2b_base_qlora_alpaca_e3.py Co-authored-by: Zhihao Lin <[email protected]> * Update 
xtuner/configs/gemma/gemma_2b_base/gemma_2b_base_full_alpaca_e3.py Co-authored-by: Zhihao Lin <[email protected]> * Update xtuner/configs/gemma/gemma_7b_base/gemma_7b_base_full_alpaca_e3.py Co-authored-by: Zhihao Lin <[email protected]> * Update xtuner/configs/gemma/gemma_7b_base/gemma_7b_base_qlora_alpaca_e3.py Co-authored-by: Zhihao Lin <[email protected]> * Update xtuner/utils/templates.py Co-authored-by: Zhihao Lin <[email protected]> * update * added required version * update * update --------- Co-authored-by: Zhihao Lin <[email protected]> Co-authored-by: LZHgrla <[email protected]> * add refcoco to llava (InternLM#425) * add base dataset * update dataset generation * update refcoco * add convert refcooc * add eval_refcoco * add config * update dataset * fix bug * fix bug * update data prepare * fix error * refactor eval_refcoco * fix bug * fix error * update readme * add entry_point * update config * update config * update entry point * update * update doc * update --------- Co-authored-by: jacky <[email protected]> * [Fix] Inconsistent BatchSize of `LengthGroupedSampler` (InternLM#436) update * bump version to v0.1.14 (InternLM#431) update * set dev version (InternLM#437) * Update version.py * Update version.py * [Bugs] Fix bugs when using EpochBasedRunner (InternLM#439) fix bugs when using epochbasedrunner * [Feature] Support processing ftdp dataset and custom dataset offline (InternLM#410) * support smart_tokenizer_and_embedding_resize * replace ast with json.loads * support list_dataset_format cli * add doc about ftdp and custom dataset * add custom dataset template * add args name to process_hf_dataset * use new process_untokenized_datasets * support tokenize_ftdp_datasets * add mistral_7b_w_tokenized_dataset config * update doc * update doc * add comments * fix data save path * smart_tokenizer_and_embedding_resize support zero3 * fix lint * add data format to internlm2_7b_full_finetune_custom_dataset_e1.py * add a data format example to configs 
associated with finetuning custom dataset * add a data format example to configs associated with finetuning custom dataset * fix lint * Update prompt_template.md (InternLM#441) Fixed a typo * [Doc] Split finetune_custom_dataset.md to 6 parts (InternLM#445) * split finetune_custom_dataset.md to 6 parts * refactor custom_dataset and ftdp_dataset related docs * fix comments * fix pre-commit --------- Co-authored-by: pppppM <[email protected]> Co-authored-by: RangiLyu <[email protected]> Co-authored-by: whcao <[email protected]> Co-authored-by: pppppM <[email protected]> Co-authored-by: gzlong96 <[email protected]> Co-authored-by: zilong.guo <[email protected]> Co-authored-by: Ko Sung <[email protected]> Co-authored-by: 不要葱姜蒜 <[email protected]> Co-authored-by: fanqiNO1 <[email protected]> Co-authored-by: PommesPeter <[email protected]> Co-authored-by: LKJacky <[email protected]> Co-authored-by: jacky <[email protected]> Co-authored-by: xzw <[email protected]> * [Docs] Add `docs/zh_cn/preparation/pretrained_model.md` (InternLM#462) * fix pre-commit * update * Update pretrained_model.md * Update pretrained_model.md * fix pre-commit * Update pretrained_model.md * update * update * update * update * Update pretrained_model.md * [Docs] Add `docs/zh_cn/training/multi_modal_dataset.md` (InternLM#503) * update * update * [Docs] Improve readthedocs style (InternLM#545) * update style * update style * fix requirements * fix * fix * add logo * update * update * update * [Docs] `.md` to `.rst` (InternLM#544) * update rst * update rst * update rst * [Docs] Add `docs/zh_cn/training/custom_pretrain_dataset.rst` (InternLM#535) * update * update * update rst * [Docs] Add docs about training on large scale dataset (InternLM#517) * add train_on_large_scale_dataset doc * refine doc * add llava offline doc * refine doc * replace md with rst * refine rst * refine rst * [Docs] Add internevo migration related documents (InternLM#506) * add internevo related * fix comments * refine doc * rename 
internlm2_7b_w_tokenized_dataset.py to internlm2_7b_w_internevo_dataset.py * refine doc * replace md with rst * refine rst * refine rst * [Docs] Add `docs/zh_cn/training/modify_settings.rst` (InternLM#490) * update * update * update * update * update * update * Update modify_settings.md * Update modify_settings.md * update * Update docs/zh_cn/training/modify_settings.md Co-authored-by: Haian Huang(深度眸) <[email protected]> * update deepspeed * update rst * update rst --------- Co-authored-by: Haian Huang(深度眸) <[email protected]> * [Docs] Add `length_grouped_sampler.rst` (InternLM#511) * update * update * update * Update length_grouped_sampler.md * update rst * Update length_grouped_sampler.rst Co-authored-by: whcao <[email protected]> --------- Co-authored-by: whcao <[email protected]> * [Docs] Add accelerate related (InternLM#504) * add accelerate related * split accelerate docs * fix comments * add speed benchmark * explain why qlora can not be used with zero3 * refine doc * fix configs * refine doc * refine doc * refine configs * add benchmark to index.rst * refine doc * add hyper-param docs * refine doc * add explanation about memory cost optimization when using zero * add figure to show the speed comparison * refine figures * refine doc * fix figures * refine figures * update figures and benchmark configs * add pack rst * delete pack md * replace md with rst * replace md with rst * replace md with rst * replace md with rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst * refine rst --------- Co-authored-by: pppppM <[email protected]> * [Docs] Add visualization docs (InternLM#516) * add visualization docs * delete other visualization tools and add explanation about how to use tensorboard * replace md with rst --------- Co-authored-by: pppppM <[email protected]> * [Docs] Add docs about SFT with custom dataset (InternLM#514) * add custom sft 
dataset docs * add custom dataset template configs * add openai data format * refine doc * update (InternLM#2) * replace md with rst --------- Co-authored-by: Zhihao Lin <[email protected]> Co-authored-by: pppppM <[email protected]> * [Docs] Add `docs/zh_cn/training/open_source_dataset.rst` (InternLM#502) * update * update * update * update * format table * fix typo * update rst --------- Co-authored-by: pppppM <[email protected]> * [Docs] Add `docs/zh_cn/preparation/prompt_template.rst` (InternLM#475) * update * update * Update prompt_template.md * Update prompt_template.md * update * add tips * update * update rst --------- Co-authored-by: pppppM <[email protected]> * [Docs] Add Sequence Parallel documents (InternLM#505) * add sp related * add sequence parallel supported models * refine doc * Update docs/zh_cn/training/training_extreme_long_sequence.md Co-authored-by: Haian Huang(深度眸) <[email protected]> * refine doc * refine doc * test the capability boundary of zero3 * refine doc * test rst * test rst * add training speed figure * delete debug rst * sp need flash_attn * WIP * replace md with rst * refine rst * refine rst * add explanation about why pt 2.1 is not accepted * refine rst * refine rst * add loss curve --------- Co-authored-by: Haian Huang(深度眸) <[email protected]> Co-authored-by: pppppM <[email protected]> * [Docs] Update `docs/zh_cn` outline (InternLM#556) update * [Docs] Update `docs/en` theme (InternLM#557) * update * update * update * update * update * update * update * update * [Docs] Add tokenizer to sft in Case 2 (InternLM#584) add tokenizer to sft in Case 2 * [Docs] Improve the Rendering Effect of Readthedocs (InternLM#664) * refine get_start and training * fix acceleration * update maxdepth * refine internevo migration * refine internevo * fix typos * fix lint --------- Co-authored-by: zhengjie.xu <[email protected]> Co-authored-by: Ma Zhiming <[email protected]> Co-authored-by: fanqiNO1 <[email protected]> Co-authored-by: Jianfeng777 
<[email protected]> Co-authored-by: Zhihao Lin <[email protected]> Co-authored-by: RangiLyu <[email protected]> Co-authored-by: whcao <[email protected]> Co-authored-by: gzlong96 <[email protected]> Co-authored-by: zilong.guo <[email protected]> Co-authored-by: Ko Sung <[email protected]> Co-authored-by: 不要葱姜蒜 <[email protected]> Co-authored-by: PommesPeter <[email protected]> Co-authored-by: LKJacky <[email protected]> Co-authored-by: jacky <[email protected]> Co-authored-by: xzw <[email protected]> Co-authored-by: Haian Huang(深度眸) <[email protected]>

set target_modules
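
As a rough illustration of what setting `target_modules` achieves here, the sketch below shows how an explicit list of names keeps the MoE router (`gate`) out of the set of LoRA-wrapped layers. It is not the PR's diff. The module paths follow the Hugging Face Mixtral layout as commonly published, the target list is an assumed example, and `select_lora_modules` is a hypothetical helper that only approximates suffix-style name matching.

```python
# Illustrative sketch only. The module paths below assume the Hugging Face
# Mixtral layout; the target list is an assumption, not copied from the PR.
MIXTRAL_LINEAR_MODULES = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.self_attn.o_proj",
    "model.layers.0.block_sparse_moe.gate",          # MoE router
    "model.layers.0.block_sparse_moe.experts.0.w1",
    "model.layers.0.block_sparse_moe.experts.0.w2",
    "model.layers.0.block_sparse_moe.experts.0.w3",
]

# Explicit targets: attention projections and expert MLP weights, no `gate`.
TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj", "w1", "w2", "w3"]


def select_lora_modules(module_names, target_modules):
    """Hypothetical helper: keep names whose last path component is a target."""
    return [
        name
        for name in module_names
        if any(name == t or name.endswith("." + t) for t in target_modules)
    ]


selected = select_lora_modules(MIXTRAL_LINEAR_MODULES, TARGET_MODULES)
gate_included = any(name.endswith(".gate") for name in selected)

for name in selected:
    print(name)
print("gate wrapped with LoRA:", gate_included)
```

Without an explicit list, a "wrap every linear layer" policy would also pick up `block_sparse_moe.gate`, which is the situation the PR description says it wants to avoid.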

6bd9ddd

pppppM approved these changes Jan 12, 2024


LZHgrla merged commit 8ab2762 into InternLM:main Jan 12, 2024
1 check passed

llkn-2 pushed a commit to llkn-2/xtuner that referenced this pull request Jul 31, 2024

[Fix] Fix Mixtral LoRA setting (InternLM#312)

0a824b7

set target_modules

LZHgrla commented Jan 11, 2024 (edited)
