Categorical and ordinal feature specification and encoding #121
Comments
Should the dataset profiles work off of label-encoded data, or one-hot encoded data?
Hi @weixuanfu, this seems great. The only update I see is that the API response for getting a dataset will return the ordinal and categorical features in a slightly different format. Once the code has been updated for this API spec and there is a unit test that mocks the API response with ordinal features, this should be ready for a pull request or direct merge into master. Example response from a GET to
@hjwilli OK, I will push a commit to support this API format.
cat/ordinal api tests References #121
Need to update tests/validation to fail for strings in fields that have not been marked as cat/ord, and for ord cols with values that were not explicitly provided References #121
cat/ord datasets can be uploaded via api References #121
validation for datasets that have string data in cols not defined as categorical, and for ordinal cols that contain values not explicitly defined References #121
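The validation rules named in the commits above could look roughly like the sketch below. This is not the project's code: the function name `validate_dataset`, the row-as-dict layout, and the shape of the `categorical` and `ordinal` arguments are all assumptions made for illustration.

```python
def validate_dataset(rows, categorical=(), ordinal=None):
    """Return a list of error strings; an empty list means the data passed.

    rows        -- list of dicts mapping column name to value (assumed layout)
    categorical -- names of columns declared categorical (hypothetical argument)
    ordinal     -- dict of column name -> ordered list of allowed values
                   (hypothetical argument)
    """
    ordinal = ordinal or {}
    errors = []
    for i, row in enumerate(rows):
        for col, value in row.items():
            if col in ordinal:
                # Ordinal columns may only hold values that were explicitly declared.
                if value not in ordinal[col]:
                    errors.append(
                        f"row {i}: {col!r} has undeclared ordinal value {value!r}"
                    )
            elif isinstance(value, str) and col not in categorical:
                # Strings are only allowed in columns marked cat/ord.
                errors.append(
                    f"row {i}: {col!r} holds string {value!r} "
                    "but is not marked categorical or ordinal"
                )
    return errors


rows = [
    {"age": 31, "color": "red", "size": "small"},
    {"age": "old", "color": "blue", "size": "huge"},
]
errors = validate_dataset(
    rows,
    categorical={"color"},
    ordinal={"size": ["small", "medium", "large"]},
)
for message in errors:
    print(message)
```

Here the second row fails twice: `age` holds a string without being marked cat/ord, and `size` contains a value that was never declared.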
Be able to handle data that has categorical or ordinal features and preprocess it appropriately for the algorithm being run (label vs. one-hot encoding).
Needed for some features of #119
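For reference, a minimal pure-Python sketch of the two encodings and of choosing between them per algorithm. The helper names and the `ONE_HOT_ALGORITHMS` set are hypothetical, and which algorithms actually need one-hot input is an assumption here, not something decided in this issue.

```python
def label_encode(values, order=None):
    """Map each value to an integer; `order` fixes the ranking for ordinal data."""
    levels = list(order) if order is not None else sorted(set(values))
    index = {level: i for i, level in enumerate(levels)}
    return [index[v] for v in values]


def one_hot_encode(values):
    """Map each value to a 0/1 vector with one slot per distinct level."""
    levels = sorted(set(values))
    return [[1 if v == level else 0 for level in levels] for v in values]


# Hypothetical: algorithms assumed to treat integer codes as magnitudes,
# so unordered categories are one-hot encoded for them.
ONE_HOT_ALGORITHMS = {"LogisticRegression", "SVC"}


def encode_column(values, algorithm, order=None):
    """Pick an encoding for one column based on its type and the algorithm."""
    if order is not None:
        # Ordinal feature: keep the declared order as integer codes.
        return label_encode(values, order)
    if algorithm in ONE_HOT_ALGORITHMS:
        return one_hot_encode(values)
    return label_encode(values)


sizes = label_encode(["small", "large", "medium"], order=["small", "medium", "large"])
colors = one_hot_encode(["red", "blue", "red"])
print(sizes)   # [0, 2, 1]
print(colors)  # [[0, 1], [1, 0], [0, 1]]
```

Label encoding keeps one column and preserves a declared order, which suits ordinal features. One-hot encoding avoids implying an order between unordered categories, at the cost of one column per level.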