
deps: pull in transformers latest version#235

Closed
anhuong wants to merge 3 commits into foundation-model-stack:main from anhuong:unpin-transformers

Conversation


@anhuong anhuong commented Jul 8, 2024

Description of the change

  • set lower limit of transformers to 4.41.0, which uses the TrainingArguments `eval_strategy` argument instead of `evaluation_strategy`
  • set upper limit of transformers to <5.0, thus pulling in new 4.x versions of transformers
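Since the required kwarg name depends on the installed transformers version, a minimal stdlib-only sketch of picking the right one (the helper name and version parsing are assumptions for illustration, not part of this repo):

```python
def eval_strategy_kwarg(transformers_version: str, strategy: str = "epoch") -> dict:
    """Pick the TrainingArguments keyword for the evaluation strategy.

    transformers v4.41.0 renamed ``evaluation_strategy`` to ``eval_strategy``
    (the old name is deprecated), so the keyword to pass depends on the
    installed version. Hypothetical helper, for illustration only.
    """
    # Compare major/minor/patch numerically, e.g. "4.41.0" -> (4, 41, 0).
    parsed = tuple(int(p) for p in transformers_version.split(".")[:3])
    if parsed >= (4, 41, 0):
        return {"eval_strategy": strategy}
    return {"evaluation_strategy": strategy}
```

With the `>=4.41.0,<5.0` range in this PR, only the new `eval_strategy` branch would ever be taken; the fallback is shown for contrast with the pre-4.41 pin.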

Related issue number

Previously we had seen degraded performance in v4.41 of transformers, as described in issue #201. After further evaluation described in the issue, the throughput degradation on llama3-8B was only 17%, occurred only on 2 A100s and not at all on 4 A100s, and memory utilization was significantly improved by moving to 4.41.

How to verify the PR

Was the PR tested

  • I have added >=1 unit test(s) for every new method I have added.
  • I have ensured all unit tests pass

Signed-off-by: Anh-Uong <anh.uong@ibm.com>

anhuong commented Jul 8, 2024

Note that starting with transformers v4.42.0, unit tests fail when running inference with the error KeyError: 'Cache only has 0 layers, attempted to access layer with index 0'. Tuning itself succeeded; only the test cases that ran Prompt Tuning and then ran inference afterwards failed, while the LoRA and fine-tuning tests succeeded.
See the unit tests for the list of dependency versions. Full error:

            # Run inference on the text
>           output_inference = loaded_model.run(
                "### Text: @NortonSupport Thanks much.\n\n### Label:", max_new_tokens=50
            )

tests/test_sft_trainer.py:248: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
scripts/run_inference.py:246: in run
    peft_outputs = self.peft_model.generate(
.tox/py/lib/python3.10/site-packages/peft/peft_model.py:1493: in generate
    outputs = self.base_model.generate(**kwargs)
.tox/py/lib/python3.10/site-packages/torch/utils/_contextlib.py:115: in decorate_context
    return func(*args, **kwargs)
.tox/py/lib/python3.10/site-packages/transformers/generation/utils.py:1914: in generate
    result = self._sample(
.tox/py/lib/python3.10/site-packages/transformers/generation/utils.py:2648: in _sample
    model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
.tox/py/lib/python3.10/site-packages/peft/peft_model.py:1522: in prepare_inputs_for_generation
    if model_kwargs["past_key_values"][0][0].shape[-2] >= model_kwargs["input_ids"].shape[1]:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = DynamicCache(), layer_idx = 0

    def __getitem__(self, layer_idx: int) -> List[Tuple[torch.Tensor]]:
        """
        Support for backwards-compatible `past_key_value` indexing, e.g. `past_key_value[0][0].shape[2]` to get the
        sequence length.
        """
        if layer_idx < len(self):
            return (self.key_cache[layer_idx], self.value_cache[layer_idx])
        else:
>           raise KeyError(f"Cache only has {len(self)} layers, attempted to access layer with index {layer_idx}")
E           KeyError: 'Cache only has 0 layers, attempted to access layer with index 0'

.tox/py/lib/python3.10/site-packages/transformers/cache_utils.py:314: KeyError

With transformers v4.41.2, the unit tests succeed.

Also found a related issue in the transformers repo, which is closed but has new comments from people hitting the same error, though it discusses an older version of transformers (v4.38).
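The traceback boils down to PEFT's `prepare_inputs_for_generation` indexing `past_key_values[0][0]` while the cache is still empty. A minimal stand-in mirroring the `DynamicCache.__getitem__` semantics from the traceback (a stub for illustration, not the real transformers class; `has_cached_tokens` is a hypothetical guard):

```python
class MiniDynamicCache:
    """Minimal stand-in for transformers' DynamicCache indexing behavior."""

    def __init__(self):
        self.key_cache = []
        self.value_cache = []

    def __len__(self):
        # Number of layers that currently hold cached key/value tensors.
        return len(self.key_cache)

    def __getitem__(self, layer_idx):
        # Backwards-compatible tuple-style indexing, as in cache_utils.py.
        if layer_idx < len(self):
            return (self.key_cache[layer_idx], self.value_cache[layer_idx])
        raise KeyError(
            f"Cache only has {len(self)} layers, "
            f"attempted to access layer with index {layer_idx}"
        )


def has_cached_tokens(cache) -> bool:
    # Guard before indexing: an empty cache has no layer 0, so cache[0]
    # raises KeyError exactly as in the traceback above.
    return len(cache) > 0
```

On an empty cache, `cache[0]` raises the KeyError shown in the test failure; checking `len(cache)` first avoids it.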

Abhishek-TAMU and others added 2 commits July 16, 2024 15:53
Signed-off-by: Abhishek Maurya <124327945+Abhishek-TAMU@users.noreply.github.com>

anhuong commented Jul 17, 2024

closing in favor of #246

@anhuong anhuong closed this Jul 17, 2024