
Error when running llama2_fine_tuning_inference & Intel_Gaudi_Fine_Tuning examples #1467

Closed
epage480 opened this issue Oct 31, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@epage480

System Info

Google Colab (CPU runtime)

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Copy llama2_fine_tuning_inference.ipynb and upload/open it in Google Colab
  2. Add an additional code cell below the "exit()" cell with the following: !git clone https://github.com/HabanaAI/Gaudi-tutorials.git
  3. Replace <your_hugging_face_token_here> with a valid Hugging Face token
  4. Run all cells up to "python3 ../gaudi_spawn.py..."
  5. You should see an error:
    Traceback (most recent call last):
      File "/content/Gaudi-tutorials/PyTorch/llama2_fine_tuning_inference/optimum-habana/examples/language-modeling/../gaudi_spawn.py", line 34, in <module>
        from optimum.habana.distributed import DistributedRunner
      File "/usr/local/lib/python3.10/dist-packages/optimum/habana/__init__.py", line 34, in <module>
        check_synapse_version()
      File "/usr/local/lib/python3.10/dist-packages/optimum/habana/utils.py", line 207, in check_synapse_version
        habana_frameworks_version_number = get_habana_frameworks_version()
      File "/usr/local/lib/python3.10/dist-packages/optimum/habana/utils.py", line 245, in get_habana_frameworks_version
        return version.parse(output.stdout.split("\n")[0].split()[-1])
    IndexError: list index out of range
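
For context on the last frame of the traceback: the IndexError comes from chaining splits on an empty string. If the subprocess that optimum-habana runs to query the SynapseAI version prints nothing (the likely case on a CPU-only Colab runtime with no Gaudi stack installed, though that empty stdout is an assumption), the final `[-1]` indexes an empty list:

```python
# Minimal sketch of the failing line from optimum/habana/utils.py:
#   version.parse(output.stdout.split("\n")[0].split()[-1])
# Assumption: the version command produces no output on Colab.
stdout = ""  # empty stdout: no Gaudi drivers/tools present

first_line = stdout.split("\n")[0]  # "" -> [""] -> ""
tokens = first_line.split()         # "" -> [] (no whitespace-separated tokens)

try:
    version_str = tokens[-1]        # [-1] on an empty list raises IndexError
except IndexError:
    print("IndexError: list index out of range")
```

So the error is a symptom of the environment lacking the Gaudi software stack, not of the script itself.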

Expected behavior

I would expect it to run and fine-tune the model with no errors.

I found this forum post where someone hit the same problem, but it doesn't describe or link to how it was resolved: https://discuss.huggingface.co/t/error-when-running-examples-in-optimum-habana/74944

This is probably user error; any pointers would be appreciated!

@epage480 epage480 added the bug Something isn't working label Oct 31, 2024
@regisss
Collaborator

regisss commented Nov 1, 2024

Here is the link to the former issue: #741
Not sure why it doesn't link to that in the forum discussion.

Can you please provide the outputs of the commands suggested in that issue? You're running your script on Colab, which is not set up with the Gaudi libraries. You probably need to run it in a Docker container using one of the images published by Intel. For example:

docker pull vault.habana.ai/gaudi-docker/1.18.0/ubuntu24.04/habanalabs/pytorch-installer-2.4.0:latest
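
Once the image is pulled, launching it on an actual Gaudi host would look something like the following (a sketch; the flags follow the Intel Gaudi documentation's usual recommendations but are an assumption here and may need adjusting for your setup — and this will not work on Colab, which has no Gaudi hardware):

```shell
# Sketch: start an interactive container from the pulled image on a Gaudi host.
# Requires Gaudi hardware and the habana container runtime (assumption: flags
# per the Intel Gaudi docs; adjust to your environment).
docker run -it \
  --runtime=habana \
  -e HABANA_VISIBLE_DEVICES=all \
  --cap-add=sys_nice \
  --net=host \
  --ipc=host \
  vault.habana.ai/gaudi-docker/1.18.0/ubuntu24.04/habanalabs/pytorch-installer-2.4.0:latest
```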

@regisss regisss closed this as completed Dec 16, 2024