tiphaineeee
changed the title
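With the same dataset and model and identical parameter settings, if I train twice, can the order in which the model sees the data cause a large difference in results at 0.5 epoch?
With the same dataset and model and identical parameter settings, if I train twice, can a difference in the order in which the model sees the data cause a large difference in results at 0.5 epoch?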
Mar 7, 2025
Reminder
System Info
### model
model_name_or_path: qwen/Qwen2-1.5B-Instruct
### method
stage: sft
do_train: true
finetuning_type: full
lora_target: all
deepspeed: examples/deepspeed/ds_z3_config.json
### dataset
dataset: xxx
template: qwen
cutoff_len: 1024
max_samples: 1000000000 #1000
overwrite_cache: true
preprocessing_num_workers: 16
### output
output_dir: xxx
logging_steps: 10
save_steps: 200
plot_loss: true
overwrite_output_dir: true
report_to: tensorboard
logging_dir: xxx
### train
per_device_train_batch_size: 4
gradient_accumulation_steps: 16
learning_rate: 1.0e-4
num_train_epochs: 5
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
### eval
val_size: 0.002
per_device_eval_batch_size: 4
eval_strategy: steps
eval_steps: 200 #500
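
For context on the question in the title, here is a minimal sketch of how a shuffling seed determines the order in which examples are seen. It does not use LLaMA-Factory or the actual trainer; `epoch_order`, `effective_batch_size`, the dataset size of 1000, and the device count of 8 are all hypothetical, since the issue states neither the seed nor the number of GPUs. It only illustrates that two runs with the same seed see the same first half-epoch, while runs with different seeds see only a partially overlapping subset by 0.5 epoch.

```python
import random


def epoch_order(num_examples, seed):
    """Return a shuffled list of example indices for one epoch (hypothetical helper)."""
    rng = random.Random(seed)  # private RNG so the result depends only on `seed`
    order = list(range(num_examples))
    rng.shuffle(order)
    return order


def effective_batch_size(per_device, grad_accum, num_devices):
    """Examples consumed per optimizer step across all devices."""
    return per_device * grad_accum * num_devices


# Assumed values for illustration only: 1000 examples, 8 devices.
num_examples = 1000
half = num_examples // 2  # examples seen by 0.5 epoch

run_a = epoch_order(num_examples, seed=42)
run_b = epoch_order(num_examples, seed=42)   # same seed as run_a
run_c = epoch_order(num_examples, seed=123)  # different seed

# Same seed: the first half-epoch is identical.
same_seed_identical = run_a[:half] == run_b[:half]

# Different seed: only part of the first half-epoch overlaps.
overlap = len(set(run_a[:half]) & set(run_c[:half]))

# per_device_train_batch_size=4 and gradient_accumulation_steps=16 come from
# the config above; num_devices=8 is an assumption.
batch = effective_batch_size(per_device=4, grad_accum=16, num_devices=8)

print("same seed, identical first half:", same_seed_identical)
print("different seed, shared examples in first half:", overlap, "of", half)
print("effective batch size (assuming 8 devices):", batch)
```

If the two training runs did not share a seed, the checkpoints at 0.5 epoch would have been trained on different subsets of the data, which is one possible source of the gap; nondeterministic GPU kernels can also cause differences even when the seed is fixed.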
Reproduction
Others
No response