test tokenize json fail #534

Closed
MXueguang opened this issue May 3, 2021 · 5 comments

Comments

@MXueguang
Member

/home/xueguang/anaconda3/envs/pyserini_head/bin/python /home/xueguang/.local/share/JetBrains/PyCharm2020.3/python/helpers/pycharm/_jb_unittest_runner.py --target test_tokenize_json.TestTokenizeJson
Testing started at 8:20 p.m. ...
Launching unittests with arguments python -m unittest test_tokenize_json.TestTokenizeJson in /home/xueguang/Research/pyserini/tests

Converted 0 docs, writing into file ./test_out_tokenize_json/docs00.json
Converted 0 docs, writing into file ./test_out_tokenize_json/docs01.json


a new gp ##u ! != i have a new gp ##u !

Expected :i have a new gp ##u !
Actual   :a new gp ##u !
@lintool
Member

lintool commented May 3, 2021

Hrm, interesting - works for me?

$ python -m unittest tests/test_tokenize_json.py 
Converted 0 docs, writing into file ./test_out_tokenize_json/docs00.json
Converted 0 docs, writing into file ./test_out_tokenize_json/docs01.json
.Converted 0 docs, writing into file out_test_bert_single_file.json
.
----------------------------------------------------------------------
Ran 2 tests in 2.851s

OK

@MXueguang
Member Author

What are your sentencepiece and transformers versions?

@lintool
Member

lintool commented May 3, 2021

(python36) iMac-Pro:pyserini jimmylin$ pip list | grep transformers
transformers                  4.0.0
(python36) iMac-Pro:pyserini jimmylin$ pip list | grep sentencepiece
sentencepiece                 0.1.95

@MXueguang
Member Author

The reason is that os.listdir returns entries in arbitrary order, so the directory listing on this line should be sorted:

for i, inf in enumerate(sorted(os.listdir(args.input))):
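
To illustrate why the sort matters, here is a minimal sketch, not the actual pyserini code. The helper names `list_input_files` and `demo` are hypothetical; the file names are borrowed from the log output above. It creates the files in reverse order and shows that wrapping `os.listdir` in `sorted` gives the same index-to-file pairing regardless of what order the filesystem reports them in.

```python
import os
import tempfile


def list_input_files(input_dir):
    """Return the entries of input_dir in lexicographic order.

    os.listdir makes no ordering guarantee, so the raw result can differ
    between machines and filesystems; sorted() makes it deterministic.
    """
    return sorted(os.listdir(input_dir))


def demo():
    """Create two files in reverse order and enumerate them deterministically."""
    with tempfile.TemporaryDirectory() as tmp:
        # Deliberately create docs01 before docs00.
        for name in ("docs01.json", "docs00.json"):
            with open(os.path.join(tmp, name), "w") as f:
                f.write("{}")
        # Same pattern as the fixed loop: enumerate over the sorted listing.
        return list(enumerate(list_input_files(tmp)))


if __name__ == "__main__":
    for i, inf in demo():
        print(i, inf)
```

With the sort in place, index 0 always pairs with docs00.json, which is presumably what the test expects when it compares output files against fixed expected strings.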

@MXueguang
Member Author

Closed by #535.
