training a classifier should overwrite the .lex #484

Open
kordjamshidi opened this issue Aug 3, 2017 · 9 comments

kordjamshidi (Member) commented Aug 3, 2017

It seems that if the .lex file of a classifier was created earlier and exists in the default path, retraining the classifier adds features to the same lexicon; that is, the lexicon is not overwritten.
(We need tests for load, save, and for classifiers created from scratch. Related to #411.)
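For anyone hitting this, a minimal workaround sketch in Scala, assuming the model files sit in a default models/ directory; the directory, the classifier name, and the file names below are placeholders, not the actual defaults:

```scala
import java.nio.file.{Files, Paths}

// Hypothetical default model directory and classifier name; adjust to the real setup.
val modelDir = Paths.get("models")
val name     = "myClassifier"

// Delete the lexicon/model files left over from a previous run so that
// train() rebuilds the lexicon instead of appending to the old one.
Seq(s"$name.lex", s"$name.lc")
  .map(f => modelDir.resolve(f))
  .foreach(p => Files.deleteIfExists(p))
```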

kordjamshidi (Member Author)

@danyaljj do you have any comments on this?

danyaljj (Member) commented Aug 4, 2017

Just to clarify: are you saying that training a model writes to disk (the lexicon file) before/without calling save()?

kordjamshidi (Member Author)

No, with or without save() is not the issue. The issue is that when a .lex file already exists from a previous run, train() simply reuses it and adds new features to it, which makes the lexicon size explode as we run the app and call train() repeatedly (in separate, independent runs).

danyaljj (Member) commented Aug 4, 2017

I see. So you think we should always remove the lexicon file at the beginning of train()?

kordjamshidi (Member Author)

I expected it to be overwritten by default; we need a way to indicate whether we want to continue training or to train from scratch. Simply removing those files at the beginning of training would be problematic when we want to initialize models from existing .lex and .lc files.
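Something like the following is what I have in mind (just a sketch, not the current API; fromScratch, the model directory, and the file names are made-up placeholders):

```scala
import java.nio.file.{Files, Paths}

// Sketch of the expected behaviour; all names below are hypothetical placeholders.
object TrainFromScratchSketch {
  private val modelDir = Paths.get("models")
  private val name     = "myClassifier"

  // Drop any previously saved lexicon/model files for this classifier.
  private def removeModelFiles(): Unit =
    Seq(s"$name.lex", s"$name.lc")
      .map(f => modelDir.resolve(f))
      .foreach(p => Files.deleteIfExists(p))

  def train(iterations: Int, fromScratch: Boolean = true): Unit = {
    if (fromScratch) removeModelFiles() // rebuild the lexicon from scratch
    // else: keep the existing .lex/.lc and continue training on top of them
    // ... the usual learning loop would go here ...
  }
}
```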

danyaljj (Member) commented Aug 4, 2017

Right, I agree it's tricky.
We could ask the user at the beginning of training:

Do you want to remove existing model files? [Y/N]

What do you think?

kordjamshidi (Member Author)

Sounds good to me. @Rahgooy might have comments.

Rahgooy (Collaborator) commented Aug 4, 2017

I think that is fine for training a single model, but when we want to train multiple models, say in a loop, the user would have to wait for each model to finish training and then enter [Y/N]. IMO, the better option is to have it as a parameter or something similar.
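For the multi-model case, roughly something like this (again only a sketch; the classifier names, trainOne, and the flag are placeholders):

```scala
// Sketch: train several classifiers in a loop, controlling the overwrite
// behaviour with a flag instead of an interactive [Y/N] prompt.
val overwriteExisting = true
val classifierNames   = Seq("classifierA", "classifierB", "classifierC")

classifierNames.foreach { name =>
  if (overwriteExisting) {
    // remove name.lex / name.lc here before training (as discussed above)
  }
  // trainOne(name) // placeholder for the actual per-model training call
}
```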

kordjamshidi (Member Author) commented Aug 4, 2017

In fact, for joint training we have the init parameter: here
