Contributor

@stas00 stas00 commented Jun 3, 2022

Fixes #17151 so that from_pretrained runs with an emulated dist env.

While it was working on my setup, on CI it failed with:

tests/deepspeed/test_deepspeed.py:756: in test_load_best_model
    model = T5ForConditionalGeneration.from_pretrained(T5_TINY)
src/transformers/modeling_utils.py:2116: in from_pretrained
    init_contexts = [deepspeed.zero.Init(config_dict_or_path=deepspeed_config())] + init_contexts
/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/zero/partition_parameters.py:693: in __init__
    self.local_device = torch.device('cuda:{}'.format(os.environ["LOCAL_RANK"]))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = environ({'NPP_VERSION': '11.3.2.139', 'NVIDIA_VISIBLE_DEVICES': 'all', 'DALI_BUILD': '2054952', 'GITHUB_WORKSPACE': '/...RRENT_TEST': 'tests/deepspeed/test_deepspeed.py::TrainerIntegrationDeepSpeed::test_load_best_model_zero3_fp16 (call)'})
key = 'LOCAL_RANK'
    def __getitem__(self, key):
        try:
            value = self._data[self.encodekey(key)]
        except KeyError:
            # raise KeyError with the original key value
>           raise KeyError(key) from None
E           KeyError: 'LOCAL_RANK'

The test is exactly the same; I just moved a big chunk of it into the "with mockenv_context" block. No code changes.
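For readers unfamiliar with the approach, here is a minimal sketch of emulating a distributed environment around a call that reads LOCAL_RANK. It uses only the standard library (unittest.mock.patch.dict) as a stand-in for the mockenv_context helper, and the environment values, the port, and the function names are assumptions for illustration, not the ones used in the actual test.

```python
import os
from unittest import mock

# Assumed values for a single-process "distributed" run (illustrative only).
DIST_ENV_1_GPU = {
    "MASTER_ADDR": "localhost",
    "MASTER_PORT": "10999",
    "RANK": "0",
    "LOCAL_RANK": "0",
    "WORLD_SIZE": "1",
}


def local_device_name():
    # Same kind of lookup as in the traceback above:
    # raises KeyError when LOCAL_RANK is not set.
    return "cuda:{}".format(os.environ["LOCAL_RANK"])


def device_name_with_emulated_env():
    # Temporarily add the distributed variables; they are restored on exit.
    with mock.patch.dict(os.environ, DIST_ENV_1_GPU):
        return local_device_name()


def lookup_fails_without_local_rank():
    # Run the lookup in a copy of the environment that lacks LOCAL_RANK.
    env = {k: v for k, v in os.environ.items() if k != "LOCAL_RANK"}
    with mock.patch.dict(os.environ, env, clear=True):
        try:
            local_device_name()
        except KeyError:
            return True
        return False
```

Anything that reads LOCAL_RANK at construction time has to be called inside the patched block, which is why the larger chunk of the test needs to sit under the context manager.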

@sgugger

Collaborator

@sgugger sgugger left a comment

Thanks for fixing!

HuggingFaceDocBuilderDev commented Jun 3, 2022

The documentation is not available anymore as the PR was closed or merged.

@stas00 stas00 merged commit 26e5e12 into main Jun 3, 2022
@stas00 stas00 deleted the load_best_model-2 branch June 3, 2022 18:19
Narsil pushed a commit to Narsil/transformers that referenced this pull request Jun 7, 2022
elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022
amyeroberts pushed a commit to amyeroberts/transformers that referenced this pull request Jun 16, 2022