Annotation and pipeline for lexicon-free custom dataset #410
@AumXIV 1. If you want to use a new dataset, you can remove ./data/char_dict/*.json. The char_dict.json and ord_map.json files will be generated automatically when you make the tfrecords. 2. As for the repetition in ord_map.json, that may be due to repeated characters in your char_dict.txt file. The ord_map.json file records each character's ord number :)
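To illustrate the point above, here is a minimal, hypothetical sketch (not the repo's actual local_utils/establish_char_dict.py, whose json layout may differ): a duplicated character in the char_dict.txt character list leads to a repeated ord number in the generated map, and deduplicating the list removes the repeat.

```python
# Hypothetical sketch of how an ord map is derived from a character list.
# A duplicate character (here the space, ord 32) produces a repeated ord
# value, matching the repeated "32" reported for ord_map_th.json.

def build_ord_map(chars):
    """Map each list index to the character's ord number (as strings)."""
    return {str(i): str(ord(c)) for i, c in enumerate(chars)}

chars_with_dup = ['A', 'B', ' ', ' ', 'C']   # the space appears twice
ord_map = build_ord_map(chars_with_dup)
print(ord_map)  # '32' appears twice among the values

# Deduplicating the character list before generating the json fixes it.
deduped = list(dict.fromkeys(chars_with_dup))
print(build_ord_map(deduped))  # '32' now appears only once
```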
@AumXIV The second column in the annotation file should be the row index in the lexicon file :)
@AumXIV Right. But I'm not sure whether the row index begins with 0 or 1. You may check it yourself :)
Okay, now I can generate the tfrecords and train the model. Thanks a lot. :)
@MaybeShewill-CV So the row index begins with 0, right?
Yes, the row index begins with 0.
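Putting the advice above together, here is a hypothetical sketch (names and file layout are illustrative, not from the repo) of turning raw "image_path label" annotation lines into the Synth90k-style "image_path lexicon_index" format, where the second column is the 0-based row index of the label in the lexicon file:

```python
# Hypothetical helper: build a lexicon and a Synth90k-style annotation list
# from raw "image_path label" lines. The second column of each annotation
# line is the 0-based row index of the label in the lexicon.

def build_lexicon_annotation(raw_lines):
    """raw_lines: ['img_0001.jpg ABC123', ...] -> (lexicon, annotation)."""
    lexicon = []
    index_of = {}          # label -> 0-based row index in the lexicon
    annotation = []
    for line in raw_lines:
        path, label = line.split(maxsplit=1)
        if label not in index_of:
            index_of[label] = len(lexicon)
            lexicon.append(label)
        annotation.append('{} {}'.format(path, index_of[label]))
    return lexicon, annotation

lexicon, ann = build_lexicon_annotation(
    ['img_0001.jpg ABC123', 'img_0002.jpg XY789', 'img_0003.jpg ABC123'])
print(ann)   # second column is the 0-based lexicon row index
```

Writing `lexicon` out line by line and `ann` as the annotation file then reproduces the layout the Synth90k pipeline expects.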
Hello, I'm working on license plates, which are lexicon-free. I now understand the Synth90k dataset, which uses a lexicon, and my annotation is shown below. How can I use this dataset with CRNN?
I wonder whether the Chinese recognition example could help, but I can't download that dataset. Does the Chinese dataset use the same pipeline as Synth90k, and the same tfrecords?
I can also write char_dict_th.json and ord_map_th.json from my char_dict_th.txt by using local_utils/establish_char_dict.py to generate the json files. But ord_map_th.json contains the number 32 repeatedly, and I don't know whether it is usable. Please give me some suggestions. Thanks. :)