Annotation and pipeline for lexicon-free custom dataset #410
@AumXIV 1. If you want to use a new dataset, you can remove ./data/char_dict/*.json. The char_dict.json and ord_map.json files will be generated automatically when you make the tfrecords. 2. As for the repetition in ord_map.json, that may be due to repeated characters in your char_dict.txt file. The ord_map.json file records each character's ord number :)
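To illustrate the point above, here is a minimal, hypothetical sketch (not the repo's actual local_utils/establish_char_dict.py, whose json layout may differ): a duplicated character in the char_dict.txt character list leads to a repeated ord number in the generated map, and deduplicating the list removes the repeat.

```python
# Hypothetical sketch of how an ord map is derived from a character list.
# A duplicate character (here the space, ord 32) produces a repeated ord
# value, matching the repeated "32" reported for ord_map_th.json.

def build_ord_map(chars):
    """Map each list index to the character's ord number (as strings)."""
    return {str(i): str(ord(c)) for i, c in enumerate(chars)}

chars_with_dup = ['A', 'B', ' ', ' ', 'C']   # the space appears twice
ord_map = build_ord_map(chars_with_dup)
print(ord_map)  # '32' appears twice among the values

# Deduplicating the character list before generating the json fixes it.
deduped = list(dict.fromkeys(chars_with_dup))
print(build_ord_map(deduped))  # '32' now appears only once
```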
@AumXIV The second column in the annotation file should be the row index in the lexicon file :)
@AumXIV Right. But I'm not sure whether the row index begins with 0 or 1. You may check it yourself :)
Okay, now I can generate the tfrecords and train the model. Thanks a lot. :)
@MaybeShewill-CV So the row index begins with 0, right?
Yes, the row index begins with 0.
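Putting the advice above together, here is a hypothetical sketch (names and file layout are illustrative, not from the repo) of turning raw "image_path label" annotation lines into the Synth90k-style "image_path lexicon_index" format, where the second column is the 0-based row index of the label in the lexicon file:

```python
# Hypothetical helper: build a lexicon and a Synth90k-style annotation list
# from raw "image_path label" lines. The second column of each annotation
# line is the 0-based row index of the label in the lexicon.

def build_lexicon_annotation(raw_lines):
    """raw_lines: ['img_0001.jpg ABC123', ...] -> (lexicon, annotation)."""
    lexicon = []
    index_of = {}          # label -> 0-based row index in the lexicon
    annotation = []
    for line in raw_lines:
        path, label = line.split(maxsplit=1)
        if label not in index_of:
            index_of[label] = len(lexicon)
            lexicon.append(label)
        annotation.append('{} {}'.format(path, index_of[label]))
    return lexicon, annotation

lexicon, ann = build_lexicon_annotation(
    ['img_0001.jpg ABC123', 'img_0002.jpg XY789', 'img_0003.jpg ABC123'])
print(ann)   # second column is the 0-based lexicon row index
```

Writing `lexicon` out line by line and `ann` as the annotation file then reproduces the layout the Synth90k pipeline expects.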
Hello, I'm working on license plates, which are lexicon-free. I now understand the Synth90k dataset, which uses a lexicon, and my annotation is shown below. How can I use this dataset with CRNN?
I wonder whether the Chinese recognition example could help, but I can't download that dataset. Does the Chinese dataset use the same pipeline as Synth90k, and the same tfrecords?
I can also write char_dict_th.json and ord_map_th.json from my char_dict_th.txt by using local_utils/establish_char_dict.py to generate the json files. But ord_map_th.json contains the number 32 repeatedly, and I don't know whether it is usable. Please give me some suggestions. Thanks. :)