For ease of use and versatility, the min_features and max_features parameters should be replaced by two new parameters:
- feature_range: (min_val, max_val) or "all"
- recipe: "best" or "parsimonious"
Regarding feature_range, if "all" is selected, feature subsets of all sizes will be considered as candidates, and the final subset is then chosen according to the rule specified under recipe.
Regarding recipe, if "best" is provided, the feature selector will return the feature subset with the best cross-validation performance. If "parsimonious" is provided, the smallest feature subset whose performance is within one standard error of the best cross-validation performance will be selected.
For example, if feature_range=(3, 5) and recipe='best', the feature subset with the best cross-validation performance will be selected, and this subset can have 3, 4, or 5 features.
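The selection rule described above could be sketched roughly as follows. This is only an illustration, not the library's implementation: the helper name select_subset, its signature, and the cv_scores input format (a dict mapping feature-index tuples to per-fold scores, higher being better) are assumptions made for this example. The standard error is assumed to be the sample standard deviation of the best subset's fold scores divided by the square root of the number of folds.

```python
from math import sqrt
from statistics import mean, stdev


def select_subset(cv_scores, feature_range="all", recipe="best"):
    """Pick a feature subset from per-subset cross-validation scores.

    Hypothetical helper: cv_scores maps a tuple of feature indices to a
    list of fold scores (higher is better, at least two folds).
    """
    # Restrict the candidates to the requested subset sizes.
    if feature_range == "all":
        candidates = dict(cv_scores)
    else:
        min_val, max_val = feature_range
        candidates = {
            subset: scores
            for subset, scores in cv_scores.items()
            if min_val <= len(subset) <= max_val
        }
    if not candidates:
        raise ValueError("no feature subset falls within feature_range")

    # Subset with the highest mean cross-validation score.
    best = max(candidates, key=lambda s: mean(candidates[s]))
    if recipe == "best":
        return best

    if recipe == "parsimonious":
        # Assumed definition: standard error of the best subset's scores.
        best_scores = candidates[best]
        std_err = stdev(best_scores) / sqrt(len(best_scores))
        threshold = mean(best_scores) - std_err
        eligible = [s for s in candidates if mean(candidates[s]) >= threshold]
        # Smallest eligible subset; ties broken by higher mean score.
        return min(eligible, key=lambda s: (len(s), -mean(candidates[s])))

    raise ValueError("recipe must be 'best' or 'parsimonious'")


# Made-up scores for illustration only.
example_scores = {
    (0,): [0.70, 0.72, 0.71],
    (0, 1): [0.80, 0.84, 0.82],
    (0, 1, 2): [0.86, 0.90, 0.88],
    (0, 1, 2, 3): [0.87, 0.91, 0.89],
}
best_subset = select_subset(example_scores, "all", "best")
small_subset = select_subset(example_scores, "all", "parsimonious")
ranged_subset = select_subset(example_scores, (1, 2), "best")
```

With these made-up scores, "best" picks the four-feature subset, while "parsimonious" falls back to the three-feature subset because its mean score lies within one standard error of the best one.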
Note that it would be best to deprecate min_features and max_features and default them to None. However, if min_features and max_features are not None, they should take priority over the new parameters to avoid breaking existing code bases.
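One possible way to handle this backward compatibility is sketched below. The function name resolve_feature_range is hypothetical, and the choice to raise an error when only one of the deprecated parameters is set is an assumption of this sketch, since the proposal only covers the case where both are given.

```python
import warnings


def resolve_feature_range(feature_range="all", min_features=None, max_features=None):
    """Return the effective feature range (hypothetical helper).

    The deprecated min_features/max_features take priority over
    feature_range when both are provided.
    """
    if min_features is None and max_features is None:
        # New-style usage: rely on feature_range only.
        return feature_range

    if min_features is None or max_features is None:
        # Assumption of this sketch: the partial case is rejected.
        raise ValueError("min_features and max_features must be set together")

    warnings.warn(
        "min_features and max_features are deprecated; use feature_range instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return (min_features, max_features)


# Legacy arguments override the new parameter.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_range = resolve_feature_range((3, 5), min_features=1, max_features=4)

new_range = resolve_feature_range((3, 5))
```

Existing calls that pass min_features and max_features would keep their behavior and only see a deprecation warning, while new code would pass feature_range alone.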