Skip to content

feat: Implement status bar with tqdm when parsing records in from_huggingface #4132

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

Racso-3141
Copy link
Contributor

@Racso-3141 Racso-3141 commented Nov 4, 2023

Description

It is a new feature for function from_huggingface. I implement a progress bar with tqdm for the records parsing process.

Closes #3888

Type of change

(Please delete options that are not relevant. Remember to title the PR according to the type of change)

  • New feature (non-breaking change which adds functionality)
  • Refactor (change restructuring the codebase without changing functionality)
  • Improvement (change adding some improvement to an existing functionality)

How Has This Been Tested

  1. Run pip install -e . in the repo after making changes.
  2. Run from_huggingface in a ipynb file.
  3. The progress bar of parsing record is shown.

Checklist

  • I added relevant documentation
  • I followed the style guidelines of this project
  • I did a self-review of my code
  • I made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I filled out the contributor form (see text above)
  • I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

@davidberenstein1957
Copy link
Contributor

Hi @Racso-3141, this looks great already. Would you be able to update the CHANGELOG.md and work on making the tqdm bar optional via a flag?

@davidberenstein1957
Copy link
Contributor

@kursathalat, can you also explain and highlight the disable vs enable difference we found in tqdm this morning?

@Racso-3141 Racso-3141 force-pushed the feature/from-huggingface-parse-records-progress-bar branch from 26d119f to bb59adf Compare November 6, 2023 15:58
Copy link

codecov bot commented Nov 6, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Files Coverage Δ
...lient/feedback/integrations/huggingface/dataset.py 89.94% <100.00%> (+0.05%) ⬆️

📢 Thoughts on this report? Let us know!.

@Racso-3141 Racso-3141 force-pushed the feature/from-huggingface-parse-records-progress-bar branch from be3777f to 317ab0e Compare November 6, 2023 16:30
@davidberenstein1957
Copy link
Contributor

davidberenstein1957 commented Nov 6, 2023

@Racso-3141 thanks for the update could you align with the implementation and naming used here?

for i in trange(
0, len(records), PUSHING_BATCH_SIZE, desc="Updating records in Argilla...", disable=not show_progress

@Racso-3141
Copy link
Contributor Author

Hi, I added option flag for the progress bar and updated the CHANGE.LOG

@Racso-3141
Copy link
Contributor Author

Racso-3141 commented Nov 6, 2023 via email

@davidberenstein1957
Copy link
Contributor

davidberenstein1957 commented Nov 6, 2023

@Racso-3141 we have the style guide defined via pre-commit. You can see the process here: https://docs.argilla.io/en/latest/community/contributing.html and https://docs.argilla.io/en/latest/community/developer_docs.html#install-code-formatting-tools. I would say show_progress is fine, but we don't have anything for naming yet. Do you have any suggestions on how to communicate this better?

@davidberenstein1957
Copy link
Contributor

@Racso-3141 Also thanks a lot for the help and flexibility. It is appreciated.

@Racso-3141 Racso-3141 force-pushed the feature/from-huggingface-parse-records-progress-bar branch from 34f0b27 to e1d627b Compare November 6, 2023 16:51
@Racso-3141 Racso-3141 force-pushed the feature/from-huggingface-parse-records-progress-bar branch from c98cd25 to d82c25b Compare November 6, 2023 16:56
@davidberenstein1957 davidberenstein1957 requested review from davidberenstein1957 and removed request for alvarobartt November 6, 2023 17:02
@davidberenstein1957 davidberenstein1957 merged commit 8a07acf into argilla-io:develop Nov 7, 2023
@davidberenstein1957
Copy link
Contributor

Thanks @Racso-3141! The contribution is very welcome :) If you want to find another good challenge. Feel free to reach out.

@davidberenstein1957
Copy link
Contributor

@Racso-3141 Thanks again for this PR! Just to double-check, did you fill out our contributor form? https://tally.so/r/n9XrxK

@Racso-3141
Copy link
Contributor Author

Racso-3141 commented Nov 16, 2023 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE] Notify dataset progress when downloading datasets from huggingface
2 participants