fix: correct bug on sentence-transformers trainer with a list of values in the records #4211
@plaguss, could you add some small tests for this too?
        sample_keys = set(data[0].keys())
    else:
        raise ValueError(f"The type is not supported: {type(data[0])}.")

    if sample_keys == {"label", "sentence-1", "sentence-2"}:
Maybe we should review these checks too, so that they compare against a `set`? Also, there seems to be a faulty line at 1618 that checks for `sample_keys == sample_keys`.
I didn't notice the error in line 1618. Thanks!
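To illustrate the two points raised in this review thread, here is a minimal sketch, not the project's actual code. It assumes, based on the PR description, that the first element of `data` may be either a single dict or a list of dicts. The names `get_sample_keys`, `is_labelled_pair`, and `LABELLED_PAIR_KEYS` are hypothetical; only the key set `{"label", "sentence-1", "sentence-2"}` comes from the diff above.

```python
from typing import Any, List, Set

# The only key set taken from the diff above; everything else is illustrative.
LABELLED_PAIR_KEYS = {"label", "sentence-1", "sentence-2"}


def get_sample_keys(data: List[Any]) -> Set[str]:
    """Return the keys of the first sample (hypothetical helper).

    Assumes the first element is a dict, or a non-empty list of dicts.
    """
    first = data[0]
    if isinstance(first, list) and first and isinstance(first[0], dict):
        return set(first[0].keys())
    if isinstance(first, dict):
        return set(first.keys())
    raise ValueError(f"The type is not supported: {type(first)}.")


def is_labelled_pair(sample_keys: Set[str]) -> bool:
    """Compare against a set, so the order of the keys does not matter."""
    return sample_keys == LABELLED_PAIR_KEYS


flat = [{"sentence-1": "a", "sentence-2": "b", "label": 1}]
nested = [[{"label": 0, "sentence-2": "d", "sentence-1": "c"}]]

print(is_labelled_pair(get_sample_keys(flat)))
print(is_labelled_pair(get_sample_keys(nested)))

# A self-comparison is a tautology: it holds for any key set,
# so it cannot be used to tell one record format from another.
arbitrary_keys = {"foo", "bar"}
print(arbitrary_keys == arbitrary_keys)
```

All three `print` calls output `True`. The last one shows why a check like `sample_keys == sample_keys` is faulty: it passes regardless of which keys the records actually contain.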
@plaguss, could you also check if the other implementations allow for passing generators too?
We only have tests for generators in the |
I was speaking with @sdiazlor and it seems that the generators are not working properly with the other frameworks (she mentioned the |
@plaguss, I agree. Feel free to close this and create a separate issue.
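The thread does not say how generator support is or should be implemented. As one possible approach, assuming the code needs to inspect the first sample the way `data[0]` does in the snippet above, the sketch below peeks at the first item of any iterable without consuming it. `peek_first` is a hypothetical helper, not part of the project; it is relevant because indexing a generator with `data[0]` raises `TypeError`.

```python
import itertools
from typing import Any, Iterable, Iterator, Tuple


def peek_first(samples: Iterable[Any]) -> Tuple[Any, Iterator[Any]]:
    """Return the first sample and an iterator that still yields every sample.

    Hypothetical helper: works for lists and generators alike.
    """
    iterator = iter(samples)
    try:
        first = next(iterator)
    except StopIteration:
        raise ValueError("No samples were provided.") from None
    # Put the consumed item back in front of the remaining ones.
    return first, itertools.chain([first], iterator)


def sample_generator():
    """Toy generator standing in for a stream of records."""
    yield {"label": 1, "sentence-1": "a", "sentence-2": "b"}
    yield {"label": 0, "sentence-1": "c", "sentence-2": "d"}


first, all_samples = peek_first(sample_generator())
print(set(first.keys()) == {"label", "sentence-1", "sentence-2"})
print(len(list(all_samples)))
```

The key check can then run on `first`, while `all_samples` is passed on to whatever consumes the full data set.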
Codecov Report

Additional details and impacted files

@@             Coverage Diff              @@
##           develop    #4211       +/-   ##
============================================
+ Coverage    65.07%   91.27%   +26.19%
============================================
  Files          319      319
  Lines        18464    18468        +4
============================================
+ Hits         12016    16857     +4841
+ Misses        6448     1611     -4837

☔ View full report in Codecov by Sentry.
The URL of the deployed environment for this PR is https://argilla-quickstart-pr-4211-ki24f765kq-no.a.run.app
…es in the records (#4211)

# Description

This PR solves an error in the `ArgillaTrainer` for `sentence-similarity` when the records include nested lists.

Closes #4088

**Type of change**

(Please delete options that are not relevant. Remember to title the PR according to the type of change)

- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And ideally, reference `tests`)

Tested locally

**Checklist**

- [x] I followed the style guidelines of this project
- [x] I did a self-review of my code
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK) (see text above)
- [ ] I have added relevant notes to the `CHANGELOG.md` file (See https://keepachangelog.com/)