
deps: pull in transformers latest version#235

Closed
anhuong wants to merge 3 commits into foundation-model-stack:main from anhuong:unpin-transformers

Conversation


@anhuong anhuong commented Jul 8, 2024

Description of the change

  • set lower limit of transformers to 4.41.0, which uses the TrainingArguments `eval_strategy` argument instead of `evaluation_strategy`
  • set upper limit of transformers to <5.0, thus pulling in new 4.x versions of transformers
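Since the required kwarg name depends on the installed transformers version, a minimal stdlib-only sketch of picking the right one (the helper name and version parsing are assumptions for illustration, not part of this repo):

```python
def eval_strategy_kwarg(transformers_version: str, strategy: str = "epoch") -> dict:
    """Pick the TrainingArguments keyword for the evaluation strategy.

    transformers v4.41.0 renamed ``evaluation_strategy`` to ``eval_strategy``
    (the old name is deprecated), so the keyword to pass depends on the
    installed version. Hypothetical helper, for illustration only.
    """
    # Compare major/minor/patch numerically, e.g. "4.41.0" -> (4, 41, 0).
    parsed = tuple(int(p) for p in transformers_version.split(".")[:3])
    if parsed >= (4, 41, 0):
        return {"eval_strategy": strategy}
    return {"evaluation_strategy": strategy}
```

With the `>=4.41.0,<5.0` range in this PR, only the new `eval_strategy` branch would ever be taken; the fallback is shown for contrast with the pre-4.41 pin.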

Related issue number

Previously we had seen degraded performance in v4.41 of transformers, as described in issue #201. After further evaluation described in the issue, the throughput degradation on llama3-8B was only 17%, occurred only on 2 A100s and not at all on 4 A100s, and memory utilization was significantly improved by moving to 4.41.

How to verify the PR

Was the PR tested

  • I have added >=1 unit test(s) for every new method I have added.
  • I have ensured all unit tests pass

Signed-off-by: Anh-Uong <anh.uong@ibm.com>

anhuong commented Jul 8, 2024

Note that starting with transformers v4.42.0, unit tests fail when running inference with the error KeyError: 'Cache only has 0 layers, attempted to access layer with index 0'. Tuning itself succeeded; only the test cases that ran Prompt Tuning and then ran inference afterwards failed, while the LoRA and fine-tuning tests succeeded.
See the unit tests for the list of dependency versions. Full error:

            # Run inference on the text
>           output_inference = loaded_model.run(
                "### Text: @NortonSupport Thanks much.\n\n### Label:", max_new_tokens=50
            )

tests/test_sft_trainer.py:248: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
scripts/run_inference.py:246: in run
    peft_outputs = self.peft_model.generate(
.tox/py/lib/python3.10/site-packages/peft/peft_model.py:1493: in generate
    outputs = self.base_model.generate(**kwargs)
.tox/py/lib/python3.10/site-packages/torch/utils/_contextlib.py:115: in decorate_context
    return func(*args, **kwargs)
.tox/py/lib/python3.10/site-packages/transformers/generation/utils.py:1914: in generate
    result = self._sample(
.tox/py/lib/python3.10/site-packages/transformers/generation/utils.py:2648: in _sample
    model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
.tox/py/lib/python3.10/site-packages/peft/peft_model.py:1522: in prepare_inputs_for_generation
    if model_kwargs["past_key_values"][0][0].shape[-2] >= model_kwargs["input_ids"].shape[1]:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = DynamicCache(), layer_idx = 0

    def __getitem__(self, layer_idx: int) -> List[Tuple[torch.Tensor]]:
        """
        Support for backwards-compatible `past_key_value` indexing, e.g. `past_key_value[0][0].shape[2]` to get the
        sequence length.
        """
        if layer_idx < len(self):
            return (self.key_cache[layer_idx], self.value_cache[layer_idx])
        else:
>           raise KeyError(f"Cache only has {len(self)} layers, attempted to access layer with index {layer_idx}")
E           KeyError: 'Cache only has 0 layers, attempted to access layer with index 0'

.tox/py/lib/python3.10/site-packages/transformers/cache_utils.py:314: KeyError

With transformers v4.41.2, the unit tests succeed.

Also found a related issue in the transformers repo, which is closed but has new comments from people hitting the same error, though it discusses an older version of transformers (v4.38).
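The traceback boils down to PEFT's `prepare_inputs_for_generation` indexing `past_key_values[0][0]` while the cache is still empty. A minimal stand-in mirroring the `DynamicCache.__getitem__` semantics from the traceback (a stub for illustration, not the real transformers class; `has_cached_tokens` is a hypothetical guard):

```python
class MiniDynamicCache:
    """Minimal stand-in for transformers' DynamicCache indexing behavior."""

    def __init__(self):
        self.key_cache = []
        self.value_cache = []

    def __len__(self):
        # Number of layers that currently hold cached key/value tensors.
        return len(self.key_cache)

    def __getitem__(self, layer_idx):
        # Backwards-compatible tuple-style indexing, as in cache_utils.py.
        if layer_idx < len(self):
            return (self.key_cache[layer_idx], self.value_cache[layer_idx])
        raise KeyError(
            f"Cache only has {len(self)} layers, "
            f"attempted to access layer with index {layer_idx}"
        )


def has_cached_tokens(cache) -> bool:
    # Guard before indexing: an empty cache has no layer 0, so cache[0]
    # raises KeyError exactly as in the traceback above.
    return len(cache) > 0
```

On an empty cache, `cache[0]` raises the KeyError shown in the test failure; checking `len(cache)` first avoids it.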

Abhishek-TAMU and others added 2 commits July 16, 2024 15:53
Signed-off-by: Abhishek Maurya <124327945+Abhishek-TAMU@users.noreply.github.com>

anhuong commented Jul 17, 2024

closing in favor of #246

@anhuong anhuong closed this Jul 17, 2024