remove unused field for chat_template.default for DPO training#2755
Conversation
WalkthroughThe Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant ChatTemplate as chat_template.py
Caller->>ChatTemplate: Call default()
ChatTemplate-->>Caller: Return (transform_fn, {"remove_columns": [field_messages]})
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms (8)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
"messages" field present in final dataset causes issues with DPO training otherwise
d7d1e53 to
b5b9507
Compare
winglian
left a comment
There was a problem hiding this comment.
The caller of the transform_fn is going to need to unpack the updated return (tuple)
|
I think I fixed the linting issues and tests that use the updated return values |
|
oh thanks @winglian ! I'll take a stab at it tonight as well, looks like some of the tests are still failing |
"messages" field present in final dataset causes issues with DPO training otherwise lint and fix tests for new return value fix for updated expected fields for dpo remove unused field for chat_template.default "messages" field present in final dataset causes issues with DPO training otherwise fix test still expecting "messages" field
a0fa71c to
2e7fac1
Compare
|
@winglian can you run the tests on the pipeline again please? Adjusted the failed test, since the "conversation" field is removed now |
Codecov ReportAll modified and coverable lines are covered by tests ✅ 📢 Thoughts on this report? Let us know! |
Description
DPO trainer expects only prompt+chosen+rejected columns present in the final dataset. However, the
chat_template.defaultimplementation passes an extra columnmessageswhich causes validations to fail.Motivation and Context
Fixes issue #2415
How has this been tested?
Tested using sample config:
then by running
axolotl preprocessandaxolotl traincommands.Without the fix, it fails with the following errors:
Screenshots (if appropriate)
Types of changes
Social Handles (Optional)
Summary by CodeRabbit