[FEATURE-BRANCH] feat: dataset export to the Hub #5730
# Description

This PR adds a new endpoint to support exporting an Argilla dataset to the Hugging Face Hub. The endpoint is `POST /api/v1/datasets/:dataset_id/export`, with the following expected body:

- `name`: the name of the dataset to be created on the HF Hub, e.g. `username/my-new-dataset`.
- `subset` (optional, default: `default`): the subset of the dataset to be created.
- `split` (optional, default: `train`): the split of the dataset to be created.
- `private` (optional, default: `false`): whether the dataset should be private or not.
- `token`: the HF Hub API access token used to create the dataset; it must have WRITE access.

**Type of change**

- New feature (non-breaking change which adds functionality)

**How Has This Been Tested**

- [x] Testing manually exporting some datasets.

**Checklist**

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm my changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature works
- I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)
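As a rough illustration of the request described in this PR, the following sketch builds the JSON body with the documented defaults and prepares a `POST` to the export endpoint. The helper names (`build_export_body`, `build_export_request`), the base URL, and the `X-Argilla-Api-Key` authentication header are assumptions made for this example and are not taken from the PR; only the path and the body fields come from the description.

```python
import json
import urllib.request


def build_export_body(name, token, subset="default", split="train", private=False):
    """Build the export body using the fields and defaults listed in the PR description."""
    if not name or not token:
        raise ValueError("both 'name' and 'token' are required")
    return {
        "name": name,        # e.g. "username/my-new-dataset"
        "subset": subset,    # optional, defaults to "default"
        "split": split,      # optional, defaults to "train"
        "private": private,  # optional, defaults to False
        "token": token,      # HF Hub token with WRITE access
    }


def build_export_request(base_url, dataset_id, api_key, body):
    """Prepare (but do not send) a POST request to the export endpoint."""
    url = f"{base_url.rstrip('/')}/api/v1/datasets/{dataset_id}/export"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the server authenticates with an API key header.
            "X-Argilla-Api-Key": api_key,
        },
    )


# Hypothetical usage (requires a running Argilla server and valid credentials):
# body = build_export_body("username/my-new-dataset", token="hf_...")
# req = build_export_request("http://localhost:6900", "<dataset_id>", "<api_key>", body)
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

Keeping the body construction separate from the HTTP call makes the defaults easy to check without touching the network.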
Codecov Report

Coverage diff:

| | develop | #5730 | +/- |
|---|---|---|---|
| Coverage | 92.25% | 92.10% | -0.15% |
| Files | 161 | 161 | |
| Lines | 6676 | 6894 | +218 |
| Hits | 6159 | 6350 | +191 |
| Misses | 517 | 544 | +27 |
Co-authored-by: Leire Aguirre <[email protected]>
The URL of the deployed environment for this PR is https://argilla-quickstart-pr-5730-ki24f765kq-no.a.run.app
# Description

This PR includes a suite of tests for the `HubDatasetExporter` class, checking the most common scenarios for exporting datasets from Argilla to the Hub.

**Type of change**

- Improvement (change adding some improvement to an existing functionality)

**How Has This Been Tested**

- [ ] Running the test suite. While writing the last examples I hit a rate-limiting error, so I will try to run them on GitHub.

**Checklist**

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm my changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature works
- I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)
)

# Description

Add support to export suggestions to the Hub with `HubDatasetExport`.

**Type of change**

- New feature (non-breaking change which adds functionality)

**How Has This Been Tested**

- [x] Adding new tests for every type of question including suggestions.

**Checklist**

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm my changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature works
- I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)
This includes some fixes and improvements to the export-to-Hub feature UI:

- [x] Fix private / public info on exporting
- [x] Prevent the dialog from closing when dragging on text inputs
- [x] Update exporting dialog styles
- [x] New Tooltip plugin
- [x] Update translations

---------

Co-authored-by: Damián Pumar <[email protected]>
# Description

Only adding the record's `inserted_at` and `updated_at` to the list of record attributes used with `HubDatasetExport`.

**Type of change**

- New feature (non-breaking change which adds functionality)

**How Has This Been Tested**

- [x] Changing the test suite for `HubDatasetExporter`.

**Checklist**

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm my changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature works
- I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)
…o the hub (#5759)

# Description

Some questions were not rendering values correctly when generating the README dataset card. This PR includes some changes to fix that problem.

**Type of change**

- Improvement (change adding some improvement to an existing functionality)

**How Has This Been Tested**

- [x] Manually uploading a dataset to the hub with all kinds of questions.

**Checklist**

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm my changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature works
- I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)
This reverts commit b8a71ea.
This reverts commit 6d051b3.
Description
This is the feature branch that joins the UI and backend efforts for the new feature allowing users to export datasets from Argilla to the HF Hub.