This repository has been archived by the owner on Jan 15, 2024. It is now read-only.
Do you mean exposing an option in nlp_process or changing the defaults in nlp_process? As English is a special case that doesn't care much about accents, I suggest we keep the option to preserve accents in nlp_process.
________________________________
From: Leonard Lausen <[email protected]>
Sent: Monday, February 22, 2021 7:37:12 AM
To: dmlc/gluon-nlp <[email protected]>
Cc: Xingjian SHI <[email protected]>; Author <[email protected]>
Subject: Re: [dmlc/gluon-nlp] strip_accents should be None by default in WordPiece (#1528)
Thus, we may try to turn it off in nlp_process.
Do you mean exposing an option in nlp_process or changing the defaults in nlp_process? As English is a special case that doesn't care much about accents, I suggest we keep the option to preserve accents in nlp_process.
Description
@leezu @szha @xinyual I noticed that we may need to set strip_accents to None in gluon-nlp/src/gluonnlp/data/tokenizers/huggingface.py (line 564 in 223f1f6) for the case where lowercase is True. This may impact the performance.
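To make the interaction between the two flags concrete, here is a minimal sketch in plain Python. It is not the gluon-nlp or HuggingFace implementation. It assumes the BERT-style convention in which strip_accents=None means "follow the value of lowercase", and the helper names strip_accent_marks and normalize are hypothetical.

```python
import unicodedata
from typing import Optional


def strip_accent_marks(text: str) -> str:
    # Decompose characters (NFD) and drop combining marks (category "Mn").
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def normalize(text: str, lowercase: bool,
              strip_accents: Optional[bool] = None) -> str:
    # Hypothetical helper. Assumption: None means "follow lowercase",
    # as in the original BERT preprocessing.
    if strip_accents is None:
        strip_accents = lowercase
    if lowercase:
        text = text.lower()
    if strip_accents:
        text = strip_accent_marks(text)
    return text


# With the None default, lowercasing also strips accents.
print(normalize("Café", lowercase=True))
# An explicit False keeps accents even when lowercasing.
print(normalize("Café", lowercase=True, strip_accents=False))
# A cased setup leaves the text untouched under the None default.
print(normalize("Café", lowercase=False))
```

Under these assumptions, a default of None reproduces the original BERT behavior for uncased models, while an explicit True or False still lets callers override it, for example for languages where accents carry meaning.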