Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/one hot encoder default int8 #175

Merged
merged 6 commits into from
Feb 6, 2024

Conversation

limlam96
Copy link
Contributor

@limlam96 limlam96 commented Feb 2, 2024

Edited OneHotEncodingTransformer to default as outputting int8 types. Output cols should be all 0/1, so do not see a downside to this. Will be significant memory savings compared to default float64.

Copy link
Contributor

@davidhopkinson26 davidhopkinson26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Comments on duplicate changes from #173 and changelog update needing fixing.

CHANGELOG.rst Outdated
------------------
Added
^^^^^
- Update OneHotEncodingTransformer to default to returning int8 columns https://github.com/lvgig/tubular/pull/175
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please update link to same format as rest of changelog

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated here 72cd91f

@@ -469,6 +469,6 @@ def transform(self, X: pd.DataFrame) -> pd.DataFrame:
X = super().transform(X)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to update NullIndicator in this PR as well? Happy to have both in 1 PR if you just want to abandon #173 and update this PR to reflect that both are here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or assuming changes are same in both might be simplest to just merge #173 in first since it looks fine. Should be fine so long as we don't squash commit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops sorry, not sure how this happened, must have been in a rush on Friday. Sounds good, can complete 173 then revisit this one to see if there are any issues

Copy link
Contributor

@davidhopkinson26 davidhopkinson26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@davidhopkinson26 davidhopkinson26 merged commit cebe0c0 into main Feb 6, 2024
1 check passed
@davidhopkinson26 davidhopkinson26 deleted the feature/one_hot_encoder_default_int8 branch February 6, 2024 09:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants