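Same dataset and model, same parameter settings, trained twice: at 0.5 epoch, can the different order in which the model sees the data cause a large difference in results? #7200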

Open
1 task done
tiphaineeee opened this issue Mar 7, 2025 · 0 comments
Labels
invalid This doesn't seem right

Comments

@tiphaineeee

Reminder

  • I have read the above rules and searched the existing issues.

System Info

model

model_name_or_path: qwen/Qwen2-1.5B-Instruct

method

stage: sft
do_train: true
finetuning_type: full

lora_target: all

deepspeed: examples/deepspeed/ds_z3_config.json

dataset

dataset: xxx
template: qwen
cutoff_len: 1024
max_samples: 1000000000 #1000
overwrite_cache: true
preprocessing_num_workers: 16

output

output_dir: xxx
logging_steps: 10
save_steps: 200
plot_loss: true
overwrite_output_dir: true
report_to: tensorboard
logging_dir: xxx

train

per_device_train_batch_size: 4
gradient_accumulation_steps: 16
learning_rate: 1.0e-4
num_train_epochs: 5
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000

eval

val_size: 0.002
per_device_eval_batch_size: 4
eval_strategy: steps
eval_steps: 200 #500
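
To make the question concrete, here is a minimal sketch of how a seeded shuffle determines which samples a run sees in its first 0.5 epoch. It is not LLaMA-Factory's actual data loader: the helper `first_half_epoch`, the dataset size of 1000, and the seeds 42 and 7 are all hypothetical, and it assumes the data is shuffled once per epoch by a seeded random number generator. Note that the config above does not set a seed explicitly.

```python
import random


def first_half_epoch(num_samples: int, seed: int) -> list:
    """Return the sample indices seen in the first 0.5 epoch.

    Hypothetical helper: assumes one seeded shuffle per epoch.
    """
    order = list(range(num_samples))
    random.Random(seed).shuffle(order)  # deterministic for a fixed seed
    return order[: num_samples // 2]


# Two runs with the same (assumed) seed, one run with a different seed.
run_a = first_half_epoch(1000, seed=42)
run_b = first_half_epoch(1000, seed=42)
run_c = first_half_epoch(1000, seed=7)

# Same seed: identical order, so identical data up to 0.5 epoch.
same_seed_identical = run_a == run_b

# Different seed: only part of the first half-epoch is shared.
overlap = len(set(run_a) & set(run_c)) / len(run_a)

print(same_seed_identical, overlap)
```

Under these assumptions, two runs with the same seed see exactly the same samples in the same order, so a large gap at 0.5 epoch would have to come from something else (for example nondeterministic kernels or a changed setting). With different seeds, the two runs share only part of the data at 0.5 epoch, which is one plausible source of divergence in mid-epoch checkpoints.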

Reproduction

Put your message here.

Others

No response

@tiphaineeee added the bug (Something isn't working) and pending (This problem is yet to be addressed) labels on Mar 7, 2025
@tiphaineeee changed the title from "Same dataset and model, same parameter settings, trained twice: at 0.5 epoch, can the order in which the model sees the data cause a large difference in results?" to "Same dataset and model, same parameter settings, trained twice: at 0.5 epoch, can the different order in which the model sees the data cause a large difference in results?" on Mar 7, 2025
@hiyouga added the invalid (This doesn't seem right) label and removed the bug and pending labels on Mar 7, 2025
@tiphaineeee reopened this on Mar 11, 2025