Sampler #2905

tastelikefeet · 2025-01-13T08:48:20Z

PR type

Bug Fix
New Feature
Document Updates
More Models or Datasets Support

PR information

Write the detailed information belonging to this PR.

Experiment results

Paste your experiment results here (if needed).

* main:
  Support LoRA-GA (modelscope#2650)
  fix swift/Infinity-Instruct (modelscope#2651)
  update truncation_strategy (modelscope#2647)
  fix bugs (modelscope#2645)
  Fix post encode (modelscope#2643)
  fix app-ui (modelscope#2641)
  fix bugs & support openbuddy llama3.3 & update docs (modelscope#2638)
  fix dataset (modelscope#2636)
  fix add_default_tag (modelscope#2631)
  support reward model (modelscope#2628)

* commit '07f10d2a94e7342413fa7762b6ce6b101b93d130': (86 commits)
  Move optimizer to create_optimizer (modelscope#2851)
  support reward_model (modelscope#2849)
  1. fix hub ignore-pattern (modelscope#2848)
  Fix bugs (modelscope#2838)
  Update base_to_chat shell (modelscope#2833)
  Update padding side (modelscope#2832)
  Fix glm4v suffix (modelscope#2829)
  add 'right' option for 'truncation_strategy' (modelscope#2754)
  update docs (specific model arguments) (modelscope#2822)
  fix enable_cache (modelscope#2813)
  fix citest (modelscope#2812)
  support ZhipuAI/cogagent-9b-20241220 (modelscope#2810)
  fix swift deploy log error (repeat log) (modelscope#2808)
  fix glm4v (modelscope#2806)
  update base_model deploy example (modelscope#2803)
  fix world_size (modelscope#2801)
  fix (modelscope#2800)
  support swift app (modelscope#2792)
  fix some web-ui bugs (modelscope#2794)
  fix stream infer (modelscope#2793)
  ...

# Conflicts:
#	examples/train/multi-gpu/ddp/train.sh
#	swift/llm/__init__.py
#	swift/llm/argument/rlhf_args.py
#	swift/llm/template/base.py
#	swift/llm/template/template_inputs.py
#	swift/llm/template/utils.py
#	swift/llm/train/tuner.py
#	swift/trainers/mixin.py

* commit 'a0d0351400d522392fb4535567bab83d8b9d45b2':
  Support infer n parameter (modelscope#2893)
  support multi round dpo (modelscope#2884)
  fix docs (modelscope#2882)
  update qlora shell (modelscope#2880)
  fix bugs (modelscope#2876)
  fix citest (modelscope#2873)
  Support ppo (modelscope#2783)
  fix bugs (modelscope#2869)
  Update agent demo (modelscope#2867)
  support mps (modelscope#2866)
  fix vllm video (modelscope#2864)
  support reward model train (modelscope#2862)
  fix jsonl writer (modelscope#2860)
  Support quant bert reward (modelscope#2859)

# Conflicts:
#	examples/train/rlhf/ppo.sh
#	swift/trainers/__init__.py
#	swift/trainers/mixin.py
#	swift/trainers/rlhf_trainer/ppo_trainer.py

* commit '65e4b26cc433878dbb4d67b0d1ae97287814bfa4':
  fix link & bug (modelscope#2902)
  Add phi4 (modelscope#2895)
  fix infer engine (modelscope#2898)
  Fix qwen vl eval (modelscope#2892)

tastelikefeet added 30 commits December 11, 2024 20:49

a first version for rlft

fc387d3

wip

e3edef6

fix

bf72b27

fix

022eaa1

fix

33d3754

fix

0c2b8b8

fix

f865713

fix

c74abcf

fix

a500ab3

fix

cdb3e63

fix

3c92780

fix

58831f4

fix

5d5e52d

fix

5b3879c

test

05bff80

fix

07ced40

fix

ab33782

fix

99d386d

wip

fb6e419

wip

0a6feac

wip

d5edfab

wip

0fbd0e8

fix

3227061

wip

0ea123d

wip

49b9d0d

fix

b6935ca

wip

14336d4

wip

a65503c

tastelikefeet and others added 28 commits January 6, 2025 14:24

fix

6818b0f

fix

6d601cc

support dataset cache

977fafa

fix

6f27a51

fix

090f91a

fix

962546a

fix

3070a3b

fix

cd0d075

fix

cac1045

wip

06f1b71

wip

741bb1c

wip

a7cf321

fix

b1875d5

fix

9aa858d

wip

e0116d6

wip

7a9049e

fix

9b0cd55

fix

bb82859

wip

75deaae

wip

ba274a2

ready for pr

1ad56df

lint

738e506

revert some code

4f135c9

Merge commit '65e4b26cc433878dbb4d67b0d1ae97287814bfa4' into feat/rlft

a88b1c5

* commit '65e4b26cc433878dbb4d67b0d1ae97287814bfa4':
  fix link & bug (modelscope#2902)
  Add phi4 (modelscope#2895)
  fix infer engine (modelscope#2898)
  Fix qwen vl eval (modelscope#2892)

fix

62166fd

remove useless dataset

49ca582

fix comments

1bc636c

Jintao-Huang approved these changes Jan 13, 2025

tastelikefeet merged commit e9f4f9f into modelscope:main Jan 13, 2025
1 of 2 checks passed
