@stas00 stas00 commented Jul 23, 2021

Was getting:

 File "/mnt/nvme1/code/huggingface/Megatron-DeepSpeed-master/megatron/training.py", line 132, in pretrain
    = build_train_valid_test_data_iterators(
  File "/mnt/nvme1/code/huggingface/Megatron-DeepSpeed-master/megatron/training.py", line 842, in build_train_valid_test_data_iterators
    assert args.train_samples is None, \
AssertionError: only backward compatiblity support for iteration-based training

on restart after a very short first run.
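The failure mode can be sketched in isolation. This is a minimal, hypothetical stand-in, not the Megatron-DeepSpeed source. It assumes (the PR text does not show this) that the guard treats a zero validation-sample counter at a non-zero iteration as a sign of an old iteration-based checkpoint. `restore_sample_counters`, `resumed_args` and every field other than `train_samples` are illustrative names, and the numbers are made up.

```python
from types import SimpleNamespace

# Message copied verbatim (including its spelling) from the traceback above.
MSG = "only backward compatiblity support for iteration-based training"


def restore_sample_counters(args):
    """Hypothetical legacy-checkpoint guard; NOT the actual training.py code."""
    # Assumption: a zero validation counter at a non-zero iteration is read as
    # "checkpoint predates sample counters", which only makes sense for
    # iteration-based training (train_samples unset).
    if args.iteration > 0 and args.consumed_valid_samples == 0:
        assert args.train_samples is None, MSG
        args.consumed_valid_samples = (
            (args.iteration // args.eval_interval)
            * args.eval_iters
            * args.global_batch_size
        )
    return args


def resumed_args(train_samples):
    """State after a very short first run: saved before any evaluation ran."""
    return SimpleNamespace(
        iteration=5,               # checkpoint written at iteration 5
        eval_interval=100,         # first eval would only happen at iteration 100
        eval_iters=10,
        global_batch_size=16,
        consumed_valid_samples=0,  # legitimately zero: no eval has run yet
        train_samples=train_samples,
    )
```

Under these assumptions, `restore_sample_counters(resumed_args(train_samples=1000))` raises the `AssertionError` shown in the traceback, while `resumed_args(train_samples=None)` passes through with the counter left at zero.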

  • Redoing this PR because the first version had a problem, and we want the fix to be self-contained so it can be forwarded upstream.

@stas00 stas00 merged commit 9189c4e into main Jul 23, 2021
@stas00 stas00 deleted the restarting-with-no-eval branch July 23, 2021 22:36
janEbert pushed a commit to janEbert/Megatron-DeepSpeed that referenced this pull request Dec 13, 2022
adammoody pushed a commit to adammoody/Megatron-DeepSpeed that referenced this pull request Jun 21, 2023
…rkshop#100)

* staging_data_efficiency_v1 (bigscience-workshop#12)

* refactor and clean

* script refactor

* fix

* fix

* fix

* fix

* refactor

* script

* CL diff type

* script cleanup

* fix for MP

* refactor

* refactor

* fix

* apply feedback