Data Imputation Fix in erroranalysis #2436

Merged
merged 11 commits into from
Dec 13, 2023

Conversation

@Advitya17 (Collaborator) commented Dec 4, 2023

Generating the multilabel and object detection (OD) dashboards from notebooks raised an uncaught exception:

Input X contains NaN.
Traceback (most recent call last):
  File "c:\workspace\rai\responsible-ai-toolbox\erroranalysis\erroranalysis\analyzer\error_analyzer.py", line 480, in compute_importances
    importances = self._compute_error_correlation(
  File "c:\workspace\rai\responsible-ai-toolbox\erroranalysis\erroranalysis\analyzer\error_analyzer.py", line 519, in _compute_error_correlation
    return mutual_info_classif(
  File "c:\Users\agemawat\Anaconda3\envs\rai\lib\site-packages\sklearn\utils\_param_validation.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "c:\Users\agemawat\Anaconda3\envs\rai\lib\site-packages\sklearn\feature_selection\_mutual_info.py", line 493, in mutual_info_classif
    return _estimate_mi(X, y, discrete_features, True, n_neighbors, copy, random_state)
  File "c:\Users\agemawat\Anaconda3\envs\rai\lib\site-packages\sklearn\feature_selection\_mutual_info.py", line 258, in _estimate_mi
    X, y = check_X_y(X, y, accept_sparse="csc", y_numeric=not discrete_target)
  File "c:\Users\agemawat\Anaconda3\envs\rai\lib\site-packages\sklearn\utils\validation.py", line 1147, in check_X_y
    X = check_array(
  File "c:\Users\agemawat\Anaconda3\envs\rai\lib\site-packages\sklearn\utils\validation.py", line 959, in check_array
    _assert_all_finite(
  File "c:\Users\agemawat\Anaconda3\envs\rai\lib\site-packages\sklearn\utils\validation.py", line 124, in _assert_all_finite
    _assert_all_finite_element_wise(
  File "c:\Users\agemawat\Anaconda3\envs\rai\lib\site-packages\sklearn\utils\validation.py", line 173, in _assert_all_finite_element_wise
    raise ValueError(msg_err)
ValueError: Input X contains NaN.

This happened because the imputer in erroranalysis did not replace NaNs when the input array had a non-numeric dtype. This PR enforces a numeric dtype so that imputation succeeds and the exception is no longer raised.
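The failure mode can be reproduced in isolation. The sketch below assumes a NumPy-style imputer such as `np.nan_to_num`; the PR text does not name the imputer that erroranalysis uses, so that choice, the sample array `X_obj`, and the helper `count_nans` are illustrative assumptions. NumPy documents that `nan_to_num` makes no replacements when the input is not an inexact (float or complex) type, which is the case for an object-dtype array.

```python
import numpy as np

# Hypothetical feature matrix that ended up with a non-numeric (object) dtype,
# for example after mixing numeric and non-numeric columns.
X_obj = np.array([[1.0, np.nan], [2.0, 3.0]], dtype=object)

# Assumed imputer for illustration: nan_to_num leaves object arrays unchanged.
imputed = np.nan_to_num(X_obj)


def count_nans(arr):
    """Count NaN entries by hand, since np.isnan rejects object arrays."""
    return sum(1 for v in arr.ravel() if isinstance(v, float) and v != v)


# The NaN survives "imputation", so a later finite-value check would fail.
print(imputed.dtype, count_nans(imputed))
```

With the NaN still present, a downstream validator that requires finite input would reject the array, which matches the `ValueError` in the traceback.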

Copilot:
This pull request adds code under erroranalysis/erroranalysis that converts the input_data array to the float data type in the compute_importances method of the ErrorAnalyzer class. The conversion lets imputation replace NaN values before the importances are calculated.

  • erroranalysis/erroranalysis/analyzer/error_analyzer.py: Added code to convert input_data array to float data type in the compute_importances method of the ErrorAnalyzer class.
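A minimal sketch of the kind of change described here, not the actual diff: the helper name `impute_as_float` is hypothetical, and zero-filling through `np.nan_to_num` is an assumed imputation strategy. The only step taken from the PR description is the cast of the input array to float before imputing.

```python
import numpy as np


def impute_as_float(input_data):
    """Cast to float so NaN replacement applies, then impute NaNs with 0.

    Hypothetical helper; erroranalysis performs the cast inside
    ErrorAnalyzer.compute_importances according to the PR description.
    """
    numeric = np.asarray(input_data).astype(float)
    # On a float array, nan_to_num replaces NaN with 0.0 (and inf with
    # large finite values), so the result contains only finite numbers.
    return np.nan_to_num(numeric)


# Hypothetical object-dtype input containing a NaN.
X_obj = np.array([[1.0, np.nan], [2.0, 3.0]], dtype=object)
X_clean = impute_as_float(X_obj)

# The cleaned array now satisfies a finite-value requirement.
assert np.isfinite(X_clean).all()
```

Note that `astype(float)` raises if the array holds values that cannot be parsed as numbers, so this approach assumes categorical columns have already been encoded.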

Description

Checklist

  • I have added screenshots above for all UI changes.
  • I have added e2e tests for all UI changes.
  • Documentation was updated if it was needed.

@gaugup (Contributor) left a comment

Could you add a test case for this scenario?

@codecov-commenter commented Dec 4, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (1ced6ef) 89.74% compared to head (697d121) 92.40%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2436      +/-   ##
==========================================
+ Coverage   89.74%   92.40%   +2.66%     
==========================================
  Files         122      108      -14     
  Lines        6747     5415    -1332     
==========================================
- Hits         6055     5004    -1051     
+ Misses        692      411     -281     
Flag Coverage Δ
unittests 92.40% <100.00%> (+2.66%) ⬆️


@Advitya17 Advitya17 merged commit 095b2ab into main Dec 13, 2023
@Advitya17 Advitya17 deleted the agemawat/ea_imputer_fix branch December 13, 2023 15:52