
Error when running llama2_fine_tuning_inference & Intel_Gaudi_Fine_Tuning examples #1467

Closed
epage480 opened this issue Oct 31, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@epage480

System Info

Google Colab (CPU runtime)

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Copy llama2_fine_tuning_inference.ipynb and upload/open it in Google Colab
  2. Add an additional code cell below the "exit()" cell with the following: !git clone https://github.com/HabanaAI/Gaudi-tutorials.git
  3. Replace <your_hugging_face_token_here> with a valid Hugging Face token
  4. Run all cells up to "python3 ../gaudi_spawn.py..."
  5. You should see an error:
    Traceback (most recent call last):
      File "/content/Gaudi-tutorials/PyTorch/llama2_fine_tuning_inference/optimum-habana/examples/language-modeling/../gaudi_spawn.py", line 34, in <module>
        from optimum.habana.distributed import DistributedRunner
      File "/usr/local/lib/python3.10/dist-packages/optimum/habana/__init__.py", line 34, in <module>
        check_synapse_version()
      File "/usr/local/lib/python3.10/dist-packages/optimum/habana/utils.py", line 207, in check_synapse_version
        habana_frameworks_version_number = get_habana_frameworks_version()
      File "/usr/local/lib/python3.10/dist-packages/optimum/habana/utils.py", line 245, in get_habana_frameworks_version
        return version.parse(output.stdout.split("\n")[0].split()[-1])
    IndexError: list index out of range
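
For context on the last frame of the traceback: the IndexError comes from chaining splits on an empty string. If the subprocess that optimum-habana runs to query the SynapseAI version prints nothing (the likely case on a CPU-only Colab runtime with no Gaudi stack installed, though that empty stdout is an assumption), the final `[-1]` indexes an empty list:

```python
# Minimal sketch of the failing line from optimum/habana/utils.py:
#   version.parse(output.stdout.split("\n")[0].split()[-1])
# Assumption: the version command produces no output on Colab.
stdout = ""  # empty stdout: no Gaudi drivers/tools present

first_line = stdout.split("\n")[0]  # "" -> [""] -> ""
tokens = first_line.split()         # "" -> [] (no whitespace-separated tokens)

try:
    version_str = tokens[-1]        # [-1] on an empty list raises IndexError
except IndexError:
    print("IndexError: list index out of range")
```

So the error is a symptom of the environment lacking the Gaudi software stack, not of the script itself.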

Expected behavior

I would expect it to run and fine-tune the model with no errors.

I found this forum post where someone hit the same problem, but it doesn't describe or link to how it was resolved: https://discuss.huggingface.co/t/error-when-running-examples-in-optimum-habana/74944

This is probably user error; any pointers would be appreciated!

@epage480 epage480 added the bug Something isn't working label Oct 31, 2024
@regisss
Collaborator

regisss commented Nov 1, 2024

Here is the link to the former issue: #741
Not sure why it doesn't link to that in the forum discussion.

Can you please provide the outputs of the commands suggested in that issue? You're running your script on Colab, which is not set up with the Gaudi libraries. You probably need to run it in a Docker container using one of the images published by Intel. For example:

docker pull vault.habana.ai/gaudi-docker/1.18.0/ubuntu24.04/habanalabs/pytorch-installer-2.4.0:latest
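
Once the image is pulled, launching it on an actual Gaudi host would look something like the following (a sketch; the flags follow the Intel Gaudi documentation's usual recommendations but are an assumption here and may need adjusting for your setup — and this will not work on Colab, which has no Gaudi hardware):

```shell
# Sketch: start an interactive container from the pulled image on a Gaudi host.
# Requires Gaudi hardware and the habana container runtime (assumption: flags
# per the Intel Gaudi docs; adjust to your environment).
docker run -it \
  --runtime=habana \
  -e HABANA_VISIBLE_DEVICES=all \
  --cap-add=sys_nice \
  --net=host \
  --ipc=host \
  vault.habana.ai/gaudi-docker/1.18.0/ubuntu24.04/habanalabs/pytorch-installer-2.4.0:latest
```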

@regisss regisss closed this as completed Dec 16, 2024