fix: correct bug on sentence-transformers trainer with a list of values in the records #4211

plaguss · 2023-11-13T16:45:27Z

Description

This PR solves an error in the ArgillaTrainer for sentence-similarity when the records include nested lists.

Closes #4088

Type of change

(Please delete options that are not relevant. Remember to title the PR according to the type of change)

- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)

How Has This Been Tested

(Please describe the tests that you ran to verify your changes. And ideally, reference tests)

Tested locally

Checklist

- [x] I followed the style guidelines of this project
- [x] I did a self-review of my code
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [ ] I filled out the contributor form (see text above)
- [ ] I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)


davidberenstein1957 · 2023-11-13T16:54:34Z

@plaguss, could you add some small tests for this too?

davidberenstein1957 · 2023-11-13T16:56:52Z

src/argilla/client/feedback/training/schemas.py

+            sample_keys = set(data[0].keys())
+        else:
+            raise ValueError(f"The type is not supported: {type(data[0])}.")
+
        if sample_keys == {"label", "sentence-1", "sentence-2"}:


Maybe we should review these checks too, so that they align with a set? Also, there seems to be a faulty line at 1618 that checks for sample_keys == sample_keys.

plaguss · 2023-11-13

I didn't notice the error in line 1618, thanks.
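The key check discussed in this thread can be sketched roughly as below. This is a minimal illustration, not the code in `schemas.py`: the helper names `get_sample_keys` and `is_labelled_pair` are hypothetical, and it assumes that a nested list holds dict samples. The only key set taken from the diff is `{"label", "sentence-1", "sentence-2"}`. Note that a comparison such as `sample_keys == sample_keys` is always true, so each branch has to compare against an explicit set.

```python
from typing import Any, Dict, List, Set, Union

Sample = Dict[str, Any]


def get_sample_keys(data: List[Union[Sample, List[Sample]]]) -> Set[str]:
    """Return the keys of the first sample, unwrapping one level of nesting.

    Hypothetical helper: assumes a nested list contains dict samples.
    """
    if not data:
        raise ValueError("No data to inspect.")
    first = data[0]
    if isinstance(first, list):
        if not first:
            raise ValueError("The first nested list is empty.")
        first = first[0]
    if isinstance(first, dict):
        return set(first.keys())
    raise ValueError(f"The type is not supported: {type(first)}.")


def is_labelled_pair(sample_keys: Set[str]) -> bool:
    """Compare against an explicit set, never against the variable itself."""
    return sample_keys == {"label", "sentence-1", "sentence-2"}


flat = [{"label": 1, "sentence-1": "a", "sentence-2": "b"}]
nested = [[{"label": 1, "sentence-1": "a", "sentence-2": "b"}]]
print(is_labelled_pair(get_sample_keys(flat)))
print(is_labelled_pair(get_sample_keys(nested)))
```

Comparing sets rather than lists keeps the check independent of key order.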

davidberenstein1957 · 2023-11-14T08:39:40Z

@plaguss could you also check if the other implementations allow for passing generators too?

plaguss · 2023-11-14T08:59:39Z


We only have tests for generators in the text-classification frameworks. I'm not sure if trl/openai work, let me check

plaguss · 2023-11-14T11:25:04Z


I was speaking with @sdiazlor and it seems that the generators are not working properly with the other frameworks (she mentioned the chat-completion task I think). Maybe we can close this one and tackle that in a different issue?
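For context on why generators are awkward here: a check that reads `data[0]` cannot index into a generator. One generic way to inspect the first sample without losing it is to peek and then chain it back, sketched below. This is only an illustration of the general technique, not how Argilla handles generators, and `peek_first` is a hypothetical name.

```python
import itertools
from typing import Any, Iterable, Iterator, Tuple


def peek_first(samples: Iterable[Any]) -> Tuple[Any, Iterator[Any]]:
    """Return the first sample and an iterator that still yields every sample."""
    iterator = iter(samples)
    try:
        first = next(iterator)
    except StopIteration:
        raise ValueError("No samples were provided.") from None
    # Put the consumed element back in front of the remaining ones.
    return first, itertools.chain([first], iterator)


generator = ({"sentence-1": str(i), "sentence-2": str(i + 1)} for i in range(3))
first, restored = peek_first(generator)
print(set(first.keys()))
print(len(list(restored)))
```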


davidberenstein1957 · 2023-11-14T11:53:47Z

@plaguss, I agree. Feel free to close this and create a separate issue.

codecov · 2023-11-14T11:57:05Z

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (967e9e7) 65.07% compared to head (ebe19fd) 91.27%.

Files	Patch %	Lines
src/argilla/client/feedback/training/schemas.py	83.33%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##           develop    #4211       +/-   ##
============================================
+ Coverage    65.07%   91.27%   +26.19%     
============================================
  Files          319      319               
  Lines        18464    18468        +4     
============================================
+ Hits         12016    16857     +4841     
+ Misses        6448     1611     -4837


github-actions · 2023-11-14T11:57:22Z

The URL of the deployed environment for this PR is https://argilla-quickstart-pr-4211-ki24f765kq-no.a.run.app


fix: correct bug on sentence-transformers trainer when a list of values is returned

7a6acad

plaguss requested review from davidberenstein1957 and sdiazlor November 13, 2023 16:45

chore: remove unused import

83fc478

davidberenstein1957 reviewed Nov 13, 2023


fix: added test for the case of nested lists and fixed if/else case

21e44ff

plaguss marked this pull request as ready for review November 14, 2023 11:25

plaguss added 2 commits November 14, 2023 12:25

Merge branch 'develop' of github.com:argilla-io/argilla into fix/trainer-rag

cc3f31e

chore: add changelog entry

ebe19fd

sdiazlor approved these changes Nov 14, 2023


davidberenstein1957 merged commit 9347598 into develop Nov 14, 2023

davidberenstein1957 deleted the fix/trainer-rag branch November 14, 2023 13:35
