Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenAI CLI Tools for Chat Fine-Tuning #622

Open
henriqueln7 opened this issue Sep 21, 2023 · 6 comments
Open

OpenAI CLI Tools for Chat Fine-Tuning #622

henriqueln7 opened this issue Sep 21, 2023 · 6 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@henriqueln7
Copy link

henriqueln7 commented Sep 21, 2023

Describe the feature or improvement you're requesting

Hello everyone,

When using legacy fine-tuning, I find the OpenAI CLI extremely helpful due to its numerous tools.
For instance, the Prepare Data Helper and the Create Fine-Tuning are particularly useful.

However, these tools only apply to legacy models, which consist of JSON with prompt and completion keys.

I propose the addition of operations to the existing CLI that can perform the same functions for the new chat fine-tuning.

My Proposal

  • For the sake of backwards compatibility, we could create a new subcommand called chat_fine_tunes.
    • This subcommand would inherit all operations that fine_tunes can perform, such as assisting with data preparation, etc. We can simply replicate the existing operations with minor modifications to suit the new format.

Additional context

I am open to working on this feature if it is approved.

@mina6765

This comment has been minimized.

@rattrayalex rattrayalex added the enhancement New feature or request label Nov 10, 2023
@rattrayalex
Copy link
Collaborator

Hi @henriqueln7 , do you remain interested in working on this? What interface would you propose?

@henriqueln7
Copy link
Author

Hey, @rattrayalex. I indeed remain interested in working on this :)

I propose the creation of a new subcommand called chat_fine_tunes. It would function as follows:

# This subcommand would assist with `.json, .jsonl` files. The formats `.csv, .txt, .tsv, .xlsx` seem incompatible with this new format (I am open to suggestions here).
# The new subcommand will perform the same operations that already exist:
# - Checking for potential improvements (removing duplicates, verifying the presence of system messages)
# - Generating a `file_prepared.jsonl` file suitable for fine-tuning
openai tools chat_fine_tunes.prepare_data -f <LOCAL_FILE>

# Create a fine_tune job
openai api chat_fine_tunes.create -t <TRAIN_FILE_ID_OR_PATH> -m <BASE_MODEL>

# List existing fine-tunings
openai api chat_fine_tunes.list

# Retrieve the status of a fine-tuning job. The output includes
# the job status (which can be pending, running, succeeded, or failed),
# among other details.
openai api chat_fine_tunes.get -i <YOUR_FINE_TUNE_JOB_ID>

# Cancel a fine-tuning job
openai api chat_fine_tunes.cancel -i <YOUR_FINE_TUNE_JOB_ID>

Questions

When I initially proposed this change, version 1.0 of the CLI had not been introduced. I noticed that all openai api fine_tunes commands were removed (although they are still mentioned in the documentation). Are there plans to also phase out the existing support for data preparation in the legacy manner? If that's the case, maybe it would be better for me to adapt the existing command rather than creating a new one.

@rattrayalex
Copy link
Collaborator

Thanks @henriqueln7 ! We'd be open to PR's for this. @jhallard can help with questions.

@rattrayalex rattrayalex added the good first issue Good for newcomers label Mar 3, 2024
@aanaseer
Copy link

aanaseer commented Mar 8, 2024

Hi, I see this issue has been pending for a while. I have developed a solution and would like to contribute by submitting a PR. Would that be alright with everyone involved @rattrayalex?

@rattrayalex
Copy link
Collaborator

Please do! PRs are always welcome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

4 participants