Improve compatibility with the set_output API from scikit-learn #46

Open
paucablop opened this issue Sep 30, 2023 · 0 comments
Assignees
Labels
🪲 bug Something isn't working 💪 enhancement New feature or request hacktoberfest hacktoberfest 2023 🥺 help wanted Extra attention is needed
Milestone

Comments

@paucablop
Owner

Description

All chemotools transformers are meant to be compatible with scikit-learn; that is the objective of chemotools 👍. One of the most recent releases of scikit-learn introduced the set_output API, which lets the user request pandas output from a transformer. The transformer then returns a pandas.DataFrame instead of the default numpy.ndarray. This works fine with most chemotools transformers, but I have run into two specific issues:

👉 The column names are lost after the transformation

When I configure a chemotools transformer to produce a pandas.DataFrame, it does not keep the input column names, so the output comes back without them. I have compared this with scikit-learn's own transformers (such as StandardScaler()), and they do keep the column names in the output.

👉 The API does not work when the transformer reduces the number of features

Some transformers reduce the number of features in the dataset (e.g., they select a subset of its columns). These are the variable selection transformers. I don't really know how to fix this issue.

Hacktoberfest Challenge

We invite open source developers to contribute to our project during Hacktoberfest. The goal is to improve compatibility with the set_output API.

How to Contribute

Here are the contributing guidelines.

Contact

[We can have the conversation in the Issue or the Discussion](#45)

Resources

👉 Link to set_output API from scikit-learn

👉 Link to problem description

@paucablop paucablop added 🪲 bug Something isn't working 💪 enhancement New feature or request 🥺 help wanted Extra attention is needed hacktoberfest hacktoberfest 2023 labels Sep 30, 2023
@paucablop paucablop added this to the Hacktoberfest 2023 milestone Sep 30, 2023
@paucablop paucablop self-assigned this Sep 30, 2023
@paucablop paucablop modified the milestones: Hacktoberfest 2023, v0.2.0 Nov 22, 2023
Projects
Status: To be done