Use LoRA to reproduce Alpaca.
The main dependencies are OpenRLHF and lm-evaluation-harness (lm-eval).
Refer to OpenRLHF.
Refer to lm-evaluation-harness.
The data generation method is inspired by Alpaca, with several simplifications. Using the DeepSeek Chat API for generation, the asynchronous requests completed in approximately 17.5 hours, and the total API cost stayed under $35 USD (approximately 250 CNY).
To generate data, run the following command:
python generate_instructions_async.py
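As a rough illustration of what asynchronous generation can look like, the sketch below fans out requests with a bounded number of concurrent calls and retries failures with backoff. It is not the contents of generate_instructions_async.py: `fake_chat_completion`, `MAX_CONCURRENCY`, and the prompt strings are hypothetical placeholders, and the real script would call the DeepSeek Chat API where the stub sleeps.

```python
import asyncio
import random

MAX_CONCURRENCY = 8  # assumed limit for illustration, not taken from the repository


async def fake_chat_completion(prompt: str) -> str:
    """Hypothetical stand-in for a real chat API call."""
    await asyncio.sleep(random.uniform(0.001, 0.005))  # simulate network latency
    return f"instruction for: {prompt}"


async def generate_one(prompt: str, semaphore: asyncio.Semaphore, retries: int = 3) -> str:
    """Run one request under the concurrency limit, retrying on failure."""
    async with semaphore:
        for attempt in range(retries):
            try:
                return await fake_chat_completion(prompt)
            except Exception:
                if attempt == retries - 1:
                    raise  # give up after the last attempt
                await asyncio.sleep(0.01 * 2 ** attempt)  # exponential backoff


async def generate_all(prompts: list[str]) -> list[str]:
    """Issue all requests concurrently, at most MAX_CONCURRENCY at a time."""
    semaphore = asyncio.Semaphore(MAX_CONCURRENCY)
    tasks = [generate_one(p, semaphore) for p in prompts]
    return await asyncio.gather(*tasks)  # results keep the order of `prompts`


def run_demo() -> list[str]:
    prompts = [f"seed task {i}" for i in range(20)]  # placeholder prompts
    return asyncio.run(generate_all(prompts))
```

Calling `run_demo()` returns one generated string per prompt, in input order. The semaphore is the main design choice here: it caps the number of requests in flight so a long run does not exceed the provider's rate limits.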
OpenRLHF is used for easy LoRA fine-tuning.
./train_llama3_8b_sft_lora.sh
lm-evaluation-harness is used for easy evaluation.
./eval.sh