v1.10.2
This patch release primarily addresses another issue with the DeID model(s).
The underlying RoBERTa models have a 512-token limit, so later parts of longer documents would fail to be de-identified.
This release (specifically, PR #405) fixes the issue by chunking the input and letting the user specify the number of overlapping tokens between chunks (defaulting to 5).
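The fix in PR #405 relies on the Hugging Face pipeline's chunking support; the sliding-window idea behind it can be sketched as below. This is a minimal illustration, not the actual implementation: `chunk_tokens`, the window size of 512, and the overlap of 5 mirror the numbers mentioned in this release, but the real code works through the Hugging Face pipeline rather than splitting token lists by hand.

```python
def chunk_tokens(tokens, max_len=512, overlap=5):
    """Split a token list into windows of at most max_len tokens.

    Consecutive windows share `overlap` tokens, so an entity that
    straddles a chunk boundary is still fully visible to the model
    in at least one window. (Illustrative sketch only.)
    """
    if max_len <= overlap:
        raise ValueError("max_len must exceed overlap")
    chunks = []
    step = max_len - overlap  # advance by the non-overlapping part
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # last window already reaches the end
    return chunks

# A 1000-token document exceeds the 512-token model limit,
# so it is split into two windows that share 5 tokens.
tokens = list(range(1000))
chunks = chunk_tokens(tokens)
```

With these defaults, a 1000-token input yields two windows: tokens 0-511 and tokens 507-999, with tokens 507-511 appearing in both so no boundary-spanning entity is lost.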
What's Changed
- CU-8693v3tt6 SNOMED OPCS refset selection by @mart-r in #402
- CU-8693v6epd: Move typing imports away from pydantic by @mart-r in #403
- CU-8693qx9yp Deid chunking - hugging face pipeline approach by @shubham-s-agarwal in #405
New Contributors
- @shubham-s-agarwal made their first contribution in #405
Full Changelog: v1.10.1...v1.10.2