Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add trainer callbacks for Transformer NER #377

Merged
merged 2 commits into from
Dec 5, 2023
Merged

Add trainer callbacks for Transformer NER #377

merged 2 commits into from
Dec 5, 2023

Conversation

baixiac
Copy link
Member

@baixiac baixiac commented Dec 4, 2023

This PR enables the client side to collect metrics during the training on the Transformers NER model.

@tomolopolis
Copy link
Member

Copy link
Collaborator

@mart-r mart-r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR looks good to me.

Do we want to have some part of the tutorials to cover this use case as well?
Currently I don't think there's anything that uses TransformersNER explicitly in the tutorials?
In any case, not a necessity but rather food for thoughts.

@baixiac
Copy link
Member Author

baixiac commented Dec 5, 2023

Good point. Given the sole use case ATM is the De-ID, that could be added to the upcoming tutorial when the model has a GA release.

@baixiac baixiac merged commit 7fddac0 into master Dec 5, 2023
5 checks passed
@baixiac baixiac deleted the CU-86938vf30 branch December 5, 2023 17:36
mart-r added a commit that referenced this pull request Jan 8, 2024
* Bump urllib3 from 1.26.5 to 1.26.17 in /webapp/webapp

Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.5 to 1.26.17.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](urllib3/urllib3@1.26.5...1.26.17)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* Cu 8692wbcq5 docs builds (#359)

* CU-8692wbcq5: Pin max version of numpy

* CU-8692wbcq5: Pin max version of numpy in setup.py

* CU-8692wbcq5: Bump python version for readthedocs workflow

* CU-8692wbcq5: Pin all requirement versions in docs requirements

* CU-8692wbcq5: Move docs requirements before setuptools

* CU-8692wbcq5: Fix typo in docs requirements

* CU-8692wbcq5: Remove some less relevant stuff from docs requirements

* CU-8692wbcq5: Add back sphinx-based requirements to docs requirements

* CU-8692wbcq5: Move back to python 3.9 on docs build workflow

* CU-8692wbcq5: Bump sphinx-autoapi version

* CU-8692wbcq5: Bump sphinx version

* CU-8692wbcq5: Bump python version back to 3.10 for future-proofing

* CU-8692wbcq5: Undo pinning numpy to max version in setup.py

* CU-8692wbcq5: Remove docs-build specific dependencies in setup.py

* Bump urllib3 from 1.26.17 to 1.26.18 in /webapp/webapp

Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.17 to 1.26.18.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](urllib3/urllib3@1.26.17...1.26.18)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* CU-8692uznvd: Allow empty-dict config.linking.filters.cuis and convert to set in memory (#352)

* CU-8692uznvd: Allow empty-dict config.linking.filters.cuis and convert to set in memory

* CU-8692uznvd: Move the empty-set detection and conversion from validator to init

* CU-8692uznvd: Remove unused import

* CU-8692t3fdf separate config on save (#350)

* CU-8692t3fdf Move saving config outside of the cdb.dat; Add test to make sure the config does not get saved with the CDB; patch a few existing tests

* CU-8692t3fdf Use class methods on class instead of instance in a few tests

* CU-8692t3fdf Fix typing issue

* CU-8692t3fdf Add additional tests for 2 configs and zero configs when loading model pack

* CU-8692t3fdf: Make sure CDB is linked to the correct config; Treat incorrect configs as dirty CDBs and force a recalc of the hash

* CU-2cdpd4t: Unify default addl_info in different methdos. (#363)

* Bump django from 3.2.20 to 3.2.23 in /webapp/webapp

Bumps [django](https://github.com/django/django) from 3.2.20 to 3.2.23.
- [Commits](django/django@3.2.20...3.2.23)

---
updated-dependencies:
- dependency-name: django
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* Changing cdb.add_concept to a protected method

* Re-added deprecated method  with deprecated flag and addtional comments

* Initial commit for merge_cdb method

* Added indentation to make merge_cdb a class method

* fixed syntax issues

* more lint fixes

* more lint fixes

* bug fixes of merge_cdb

* removed print statements

* CU-86931prq4: Update GHA versions (checkout and setup-python) to v4 (#368)

* Cu 1yn0v9e duplicate multiprocessing methods (#364)

* CU-1yn0v9e: Rename and deprecate one of the multiprocessing methods;

Add docstring. Trying to be more explicit regarding usage and differences between different methods

* CU-1yn0v9e: Rename and deprecate the multiprocessing_pipe method;

Add docstring. Trying to be more explicit regarding usage and differences between different methods

* CU-1yn0v9e: Fix typo in docstring; more consistent naming

* 869377m3u: Add comment regarding demo link load times to README (#376)

* intermediate changes of merge_cdb and testing function

* Added README.md documentation for CPU only installations (#365)

* changed README.md to reflect installation options.

* added setup script to demonstrate how wrapper could look for CPU installations

* removed setup.sh as unnessescary for cpu only builds

* Initial commit for merge_cdb method

* Added indentation to make merge_cdb a class method

* fixed syntax issues

* more lint fixes

* more lint fixes

* bug fixes of merge_cdb

* removed print statements

* Added commentary on disk space usage of pytorch-gpu

* removed merge_cdb from branch

---------

Co-authored-by: adam-sutton-1992 <[email protected]>

* Cu 8692zguyq no preferred name (#367)

* CU-8692zguyq: Slight simplification of minimum-name-length logic

* CU-8692zguyq: Add some tests for prepare_name preprocessor

* CU-8692zguyq: Add warning if no preferred name was added along a new CUI

* CU-8692zguyq: Add additional warning messages when adding/training a new CUI with no preferred name

* CU-8692zguyq: Make no preferred name warnings only run if name status is preferred

* CU-8692zguyq: Add tests for no-preferred name warnings

* CU-8692zguyq: Add Vocab.make_unigram_table to CAT tests

* CU-8692zguyq: Move to built in asserting for logging instead of patching the method

* CU-8692zguyq: Add workaround for assertNoLogs on python 3.8 and 3.9

* Add trainer callbacks for Transformer NER (#377)

CU-86938vf30 add trainer callbacks for Transformer NER

* changes to merge_cdb and adding unit tests for method

* fixing lint issues

* fixing flake8 linting

* bug fixes, additional tests, and more documentation

* moved set up of cdbs to be merged to tests.helper

* moved merge_cdb to utils and created test_cdb_utils

* removed class wrapper in cdb utils and fixed class set up in tests

* changed test object setup to class setup

* removed erroneous new line

* CU-2e77a31 improve print stats (#366)

* Add base class for CAT

* Add CDB base class

* Some whitespace fixes for base modules

* CU-2e77a31: Move print stats to their own module and class

* CU-2e77a31: Fix issues introduced by moving print stats

* CU-2e77a31: Rename print_stats to get_stats and add option to avoid printed output

* CU-2e77a31: Add test for print_stats

* CU-2e77a31: Remove unused import

* CU-2e77a31: Add new package to setup.py

* CU-2e77a31: Fix a bunch of typing issues

* CU-2e77a31: Revert CAT and CDB abstraction

* Load stopwords in Defaults before spacy model

* CU-8693az82g Remove cdb tests side effects (#380)

* 8693az82g: Add method to CDBMaker to reset the CDB

* 8693az82g: Add test in CDB tests to ensure a new CDB is used for each test

* 8693az82g: Reset CDB in CDB tests before each test to avoid side effects

* Added tests

* CU-8693bpq82 fallback spacy model (#384)

* CU-8693bpq82: Add fallback spacy model along with test

* CU-8693bpq82: Remove debug output

* CU-8693bpq82: Add exception info to warning upon spacy model load failure and fallback

* Remove tests of internals where possible

* Add test for skipping of stopwords

* Avoid supporting only English for stopwords

* Remove debug output

* Make sure stopwords language getter works for file-path spacy models

* CU-8693cv3w0 Fix fallback spacy model existance on pip installs (#386)

* CU-8693cv3w0: Add method to ensure spacy model and use it when falling back to default model

* CU-8693cv3w0: Add logged output when installing/downloading spacy model

* CU-8693b0a61 Add method to get spacy model version (#381)

* CU-8693b0a61: Add method to find spacy folder in model pack along with some tests

* CU-8693b0a61: Add test for spacy folder finding (full path)

* CU-8693b0a61: Add method for finding spacy model in model pack along with tests

* CU-8693b0a61: Add method for finding current spacy version

* CU-8693b0a61: Add method for getting spacy model version installed

* CU-8693b0a61: Fix getting spacy model folder return path

* CU-8693b0a61: Add method to get name and meta of spacy model based on model pack

* CU-8693b0a61: Add missing fake spacy model meta

* CU-8693b0a61: Add missing docstrings

* CU-8693b0a61: Change name of method for clarity

* CU-8693b0a61: Add method to get spacy model name and version from model pack path

* CU-8693b0a61: Fix a few typing issues

* CU-8693b0a61: Add a missing docstring

* CU-8693b0a61: Match folder name of fake spacy model to its name

* CU-8693b0a61: Make the final method return true name of spacy model instead of folder name

* Add additional output to method for getting spacy model version - the compatible spacy versions

* CU-8693b0a61: Add method for querying whether the spacy version is compatible with a range

* CU-8693b0a61: Add better abstraction for spacy version mocking in tests

* CU-8693b0a61: Add some more abstraction for fake model pack in tests

* CU-8693b0a61: Add method for checking whethera model pack has a spacy model compatible with installed spacy version

* CU-8693b0a61: Improve abstraction within tests

* CU-8693b0a61: Add method to check which of two versions is older

* CU-8693b0a61: Fix fake spacy model versioning

* CU-8693b0a61: Add method for determining whether a model pack has semi-compatible spacy model

* CU-8693b0a61: Add missing word in docstring.

* CU-8693b0a61: Change some method to protected ones

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: tomolopolis <[email protected]>
Co-authored-by: adam-sutton-1992 <[email protected]>
Co-authored-by: adam-sutton-1992 <[email protected]>
Co-authored-by: Xi Bai <[email protected]>
Co-authored-by: jenniferajiang <[email protected]>
Co-authored-by: Jennifer Jiang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants