diff --git a/docs/guides/sft.md b/docs/guides/sft.md index 534c6b1702..1ae95dad1f 100644 --- a/docs/guides/sft.md +++ b/docs/guides/sft.md @@ -6,9 +6,9 @@ The script [examples/run_sft.py](../../examples/run_sft.py) can be used to launc Be sure to launch the job using `uv`. The command to launch an SFT job is as follows: ```bash -uv run examples/run_sft.py --config --output-dir +uv run examples/run_sft.py --config ``` -If not specified, `config` will default to [examples/configs/sft.yaml](../../examples/configs/sft.yaml) and `output-dir` will default to `./outputs`. +If not specified, `config` will default to [examples/configs/sft.yaml](../../examples/configs/sft.yaml). ## Configuration @@ -17,16 +17,16 @@ Reinforcer allows users to configure experiments using `yaml` config files. An e To override a value in the config, either update the value in the `yaml` file directly, or pass the override via the command line. For example: ```bash -python examples/run_sft.py \ - data.max_input_seq_length=8192 \ - logger.wandb.name="sft-dev-sl-8192" +uv run examples/run_sft.py \ + cluster.gpus_per_node=1 \ + logger.wandb.name="sft-dev-1-gpu" ``` ## Datasets SFT datasets in Reinforcer are encapsulated using classes. Each SFT data class is expected to have the following attributes: - - `formatted_ds`: The dictionary of formatted datasets. This dictionary should contain `train` and `validation` splits, and each split should conform to the format described below. - - `task_spec`: The `TaskDataSpec` for this dataset. This should specify the name you choose for this dataset as well as the `custom_template` for this dataset. More on custom templates below. + 1. `formatted_ds`: The dictionary of formatted datasets. This dictionary should contain `train` and `validation` splits, and each split should conform to the format described below. + 2. `task_spec`: The `TaskDataSpec` for this dataset. This should specify the name you choose for this dataset as well as the `custom_template` for this dataset. More on custom templates below. SFT datasets are expected to follow the HuggingFace chat format. Refer to the [chat dataset document](../design_docs/chat_datasets.md) for details. If your data is not in the correct format, simply write a preprocessing script to convert the data into this format. [data/hf_datasets/squad.py](../../nemo_reinforcer/data/hf_datasets/squad.py) has an example: @@ -62,4 +62,7 @@ task_spec = TaskDataSpec( ) ``` -By default, NeMo-Reinforcer has support for `Squad` and `OpenAssistant` datasets. If you would like to use a custom dataset, create a new dataset class with the expected attributes. +By default, NeMo-Reinforcer has support for `Squad` and `OpenAssistant` datasets. Both of these datasets are downloaded from HuggingFace and preprocessed on-the-fly, so there's no need to provide a path to any datasets on disk. + +Adding a new dataset is a straightforward process. +As long as your custom dataset has the `formatted_ds` and `task_spec` attributes described above, it can serve as a drop-in replacement for Squad and OpenAssistant. \ No newline at end of file