Feature/nominal dtype checks in transform #162
Thanks @TommyMatthews. I agree this is all quite messy and could do with a further-reaching refactor. Going forward I would like to see less reliance on super init/fit/transform calls and a move towards error handling that lives in more specific methods called by the child classes (like you have done with the columns_check method here).
I don't understand why moving the super.transform call to the start broke check_mappable_rows. Can you elaborate on why it failed, please? I'm happy with the proposed solution (i.e. a direct call to columns_check), but I want to understand this to help with future refactoring of BaseNominalTransformer, as it seems like odd behaviour.
@davidhopkinson26 I'm not 100% sure myself. If I run … So I guess mappable rows is now being calculated incorrectly? (I can't see it changing X.shape[0]?)
Figured this one out: I hadn't fully understood the inheritance setup of the transformer, but super.transform is actually calling the transform method of … I think the best solution is still just calling the …
That makes sense. I agree that this is the best solution.
Addresses issue #145 in part. I was unable to replicate the error I found in the generic testing where the fit method did not raise an appropriate error for X not being a pandas.DataFrame object.

This bug definitely exists for the transform method, so I have fixed that. super.transform was always getting called, but always after the check_mappable_rows method inherited from BaseNominalTransformer, which was making checks that assumed a dataframe. This is a good example of an implementation test (super transform call) not testing for the behaviour we want (error if X is not a pd dataframe).

In the case of MeanResponseTransformer I just moved the call of super.transform to the start of the transform method. However, in OrdinalEncoderTransformer and NominalToIntegerTransformer this led to X not passing the check_mappable_rows check, so I just called the relevant columns_check method from BaseTransformer at the start of the transform method, which raises the appropriate error. I think the fact that I have had to do it like this reflects either a lack of understanding on my part or a need to refactor BaseNominalTransformer. I have added tests for this behaviour (copied from the BaseTransformer test suite).

The issue still mentions the possibility of adding nominal type checks (maybe with a warning) for nominal transformers. I think it is probably best to get the bug fix in, close the issue, and open a new issue for this feature.
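
For readers following along, here is a minimal sketch of the ordering fix described in this PR. The classes SketchBase, SketchNominalBase, and SketchEncoder are hypothetical stand-ins and are not the library's actual code. Only the method names columns_check and check_mappable_rows come from the discussion; their bodies and error messages here are assumptions made for illustration.

```python
import pandas as pd


class SketchBase:
    """Hypothetical stand-in for a base transformer (not the real class)."""

    def __init__(self, columns):
        self.columns = columns

    def columns_check(self, X):
        # Fail early with a clear message before any DataFrame-only logic runs.
        if not isinstance(X, pd.DataFrame):
            raise TypeError("X should be a pd.DataFrame")
        missing = [c for c in self.columns if c not in X.columns]
        if missing:
            raise ValueError(f"variables {missing} not in X")


class SketchNominalBase(SketchBase):
    """Hypothetical stand-in for a nominal base class holding level mappings."""

    def __init__(self, columns, mappings):
        super().__init__(columns)
        self.mappings = mappings

    def check_mappable_rows(self, X):
        # Assumes X is a DataFrame; on other inputs it fails with an unrelated error.
        for c in self.columns:
            if not X[c].isin(list(self.mappings[c])).all():
                raise ValueError(f"nulls would be introduced into column {c}")


class SketchEncoder(SketchNominalBase):
    def transform(self, X):
        # The fix: run the type and column check first, then the row check.
        self.columns_check(X)
        self.check_mappable_rows(X)
        X = X.copy()
        for c in self.columns:
            X[c] = X[c].map(self.mappings[c])
        return X
```

With this ordering, passing a non-DataFrame (for example a plain list) to transform raises the explicit type error from columns_check instead of whichever error check_mappable_rows happens to hit first. A behaviour-level test can then assert on that error directly rather than on whether a super call was made.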