This repository has been archived by the owner on Dec 21, 2022. It is now read-only.
0.12.0 (2020-01-25)
- Improved the docs to reflect the SequenceTokenizerSpec that was added in 0.11.0.
- Made max length optional for the tokenizer.
- Added a CLI that parses using the SequencePiece library.
- Began versioning the Docker build and made pushing easier during the build process.
- Made the tokenizer resolve named alphabets.
- Switched to Poetry, along with general updates to the build pipeline.