Finetune Problem #235

Open
ChinnYu opened this issue Dec 5, 2023 · 12 comments

@ChinnYu

ChinnYu commented Dec 5, 2023

I wanted to extend my sincere appreciation for your project. As a devoted fan of your work, I have thoroughly enjoyed being part of this journey. However, I have recently encountered an issue with the latest version. Specifically, when I attempt code finetuning and click "Run Filter," it displays a failure message. When I click "Run Finetune," by contrast, it shows that the process completed successfully, and the code even indicates acceptance, yet the finetune run is interrupted immediately. This problem did not exist in the previous version.
MicrosoftTeams-image (9)
MicrosoftTeams-image (10)
image

I understand that software development can be complex, and unforeseen issues may arise. Therefore, I kindly request your assistance in resolving this matter. If there are any potential solutions or suggestions you could provide to prevent the interruption during the run finetune process, it would be greatly appreciated.

Once again, thank you for your dedication and hard work. I eagerly anticipate future updates and improvements to this remarkable project.

Thank you sincerely,

@hazratisulton

Hello, @ChinnYu!
Can you provide logs?
Are you using the release Docker image or the source code?
What message do you see when you click "Run Filter"?

@ChinnYu
Author

ChinnYu commented Dec 9, 2023

Hello, @hazratisulton. I used the source code and the provided Dockerfile to build the image. I noticed that the built image shows an error when I press 'run filter'.
image

@JegernOUTT
Member

@hazratisulton have you managed to reproduce it?

@hazratisulton

> @hazratisulton have you managed to reproduce it?

No, I couldn't. I asked @mitya52 to take a look, maybe he could offer some ideas.

@ChinnYu
Author

ChinnYu commented Dec 28, 2023

Hi @hazratisulton @JegernOUTT, I built the image from the latest source code (12/28) on the 'dev' branch, and the same issue persists. If you need a specific log for analysis, please let me know and I'll provide it.

image

@ChinnYu
Author

ChinnYu commented Jan 2, 2024

After numerous code modifications, I discovered that changing 'aux' to '_aux' allows Python to locate the module. However, that change brings about two further issues:
1. 'Index out of bounds' occurs when pressing 'run filter' with 'codellama' selected.
2. 'AssertionError: You have to have more files to process than processes' is raised at the beginning of Finetune, even though I do have more files than processes.
The first one is resolved by switching to transformers==4.34.0; for the second one, the allocation rules need to be modified.
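For readers unfamiliar with the second error, here is a minimal sketch of the kind of check that message describes. It is not the project's actual code: the helper name `split_among_processes`, the round-robin allocation rule, and the exact comparison are all assumptions made for illustration.

```python
from typing import List, Sequence


def split_among_processes(files: Sequence[str], world_size: int) -> List[List[str]]:
    """Round-robin split of `files` into `world_size` shards.

    Hypothetical helper: the real allocation rule in the project may differ.
    """
    if world_size < 1:
        raise ValueError("world_size must be at least 1")
    # Every process needs at least one file, otherwise a worker would sit idle.
    # The message text is taken from the error reported in this thread.
    assert len(files) >= world_size, \
        "You have to have more files to process than processes"
    # Process `rank` takes files rank, rank + world_size, rank + 2 * world_size, ...
    return [list(files[rank::world_size]) for rank in range(world_size)]


if __name__ == "__main__":
    for rank, shard in enumerate(split_among_processes(["a.py", "b.py", "c.py"], 2)):
        print(rank, shard)
```

A check like this counts whatever list reaches it, so when it fires it is worth logging both `len(files)` and `world_size` at that point before changing the allocation rule.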

image

image

@olegklimov
Contributor

Interesting! But it works in nightly without any changes 🤔 Let's ask what @JegernOUTT and @mitya52 think.

@ChinnYu
Author

ChinnYu commented Jan 10, 2024

Hi, I'd like to ask another question. I'm attempting to integrate 'deepseek-ai/deepseek-coder-33b-base' into Refact. I added the 33B model to these two files: 'refact/known_models_db/refact_known_models/huggingface.py' and 'self_hosting_machinery/finetune/configuration/supported_models.py'. The modification follows the existing 'deepseek-coder/5.7b/mqa-base' entry, and I've also added the model to 'known_models' in 'refact-lsp'. However, Visual Studio Code (vscode) continues to report the following errors. I have checked the main page and confirmed that the model was initialized successfully. Do the experts have any debugging suggestions for this issue?
image

Additionally, there is a warning as shown in the second image. What should be configured in this case? Thank you.

image

@olegklimov
Contributor

@ChinnYu awesome that you are trying this! You might need a change in refact-lsp: just add the model there by analogy with the other models.

There was this idea to try new models using a "works like this other known model" option in settings. But then it turned out to be not very practical (the best setting is no setting, because it gets outdated, needs tech support to remove unnecessary settings once they're there, needs server-side changes, etc.). Or maybe we could return to this idea, because it allows trying a model quickly without recompiling the LSP.
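The "works like this other known model" idea could look roughly like the sketch below. Everything in it is hypothetical: the registry `KNOWN_MODELS`, the function `resolve_model_settings`, the `works_like` parameter, the model name `deepseek-coder/33b/base`, and the placeholder setting values are assumptions for illustration and do not reflect how refact-lsp stores its model list.

```python
from typing import Any, Dict, Optional

# Hypothetical registry of built-in models; the fields and values are placeholders.
KNOWN_MODELS: Dict[str, Dict[str, Any]] = {
    "deepseek-coder/5.7b/mqa-base": {"context_size": 4096, "completion": True},
}


def resolve_model_settings(name: str, works_like: Optional[str] = None) -> Dict[str, Any]:
    """Return settings for `name`, borrowing from `works_like` if `name` is unknown."""
    if name in KNOWN_MODELS:
        # A built-in entry always wins over a user-supplied alias.
        return dict(KNOWN_MODELS[name])
    if works_like is not None and works_like in KNOWN_MODELS:
        # Unknown model: reuse the settings of the model the user says it resembles.
        return dict(KNOWN_MODELS[works_like])
    raise KeyError(f"unknown model {name!r} and no usable 'works_like' fallback")


if __name__ == "__main__":
    print(resolve_model_settings("deepseek-coder/33b/base",
                                 works_like="deepseek-coder/5.7b/mqa-base"))
```

Letting the built-in entry take precedence would address part of the "settings get outdated" concern: once a model is added to the registry, a stale alias in a user's settings is simply ignored.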

@JegernOUTT
Member

> I discovered that changing 'aux' to '_aux' allows locating the Python module

That's interesting: we never had import names like that. Check this out:
https://github.com/smallcloudai/refact/blob/main/self_hosting_machinery/finetune/scripts/finetune_filter.py#L16
Are you sure you haven't changed them yourself accidentally?

@ChinnYu
Author

ChinnYu commented Feb 6, 2024

Hi @JegernOUTT, I obtained the GitHub code by downloading the zip file, and I'm using 7-Zip to decompress it, since it seems that the built-in zip tool in Windows cannot decompress it. I've tried several times, but extraction keeps generating '_aux', as shown in the figure below.

image

@JegernOUTT
Member

@ChinnYu
Sorry for the delay, I completely forgot about this.
Yes, I see the problem: it's due to some legacy Windows folder naming limitations.
We'll rename those folders in the next release. Sorry for the inconvenience.
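
For context, Windows reserves a set of legacy device names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9) that cannot be used as file or folder names, which is consistent with the extraction tool renaming 'aux' to '_aux'. A minimal sketch for spotting such names in a checkout before it is extracted on Windows; the helper names `is_reserved_windows_name` and `find_reserved_paths` are illustrative and not part of the project:

```python
import os
from typing import Iterator

# Legacy device names reserved by Windows. The check is case-insensitive and
# also applies when an extension follows (for example "aux.py").
RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def is_reserved_windows_name(component: str) -> bool:
    """True if a single path component collides with a reserved device name."""
    stem = component.split(".", 1)[0].rstrip(" ").upper()
    return stem in RESERVED


def find_reserved_paths(root: str) -> Iterator[str]:
    """Yield every file or directory under `root` whose name is reserved."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if is_reserved_windows_name(name):
                yield os.path.join(dirpath, name)


if __name__ == "__main__":
    for path in find_reserved_paths("."):
        print(path)
```

Running it from the repository root on Linux or macOS lists the paths that would need renaming for the archive to extract cleanly on Windows.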
