multivariate drift bugfix and doc update #37

nikml · 2022-03-14T18:57:14Z

* Fixes #36: Data Reconstruction fails when selected features doesn't include a categorical feature.
* Documentation update

codecov · 2022-03-14T18:59:18Z

Codecov Report

Merging #37 (6dcd725) into main (caa9849) will increase coverage by 0.09%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main      #37      +/-   ##
==========================================
+ Coverage   68.64%   68.74%   +0.09%     
==========================================
  Files          28       28              
  Lines        1263     1267       +4     
  Branches      239      243       +4     
==========================================
+ Hits          867      871       +4     
  Misses        392      392              
  Partials        4        4
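
The percentages in the diff above can be re-derived from the hit and line counts. The snippet below is a minimal sketch of that arithmetic, not Codecov's implementation: the helper names are hypothetical, and truncating (rather than rounding) to two decimals is an assumption inferred only from the figures in this report.

```python
import math


def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage of covered lines."""
    return 100 * hits / lines


def truncate_2dp(value: float) -> float:
    """Cut a value to two decimals without rounding up (assumed display rule)."""
    return math.floor(value * 100) / 100


# Hits and Lines taken from the report: main (caa9849) and #37 (6dcd725).
base = coverage_pct(867, 1263)
head = coverage_pct(871, 1267)

# The delta is computed on the unrounded percentages, then truncated.
print(truncate_2dp(base), truncate_2dp(head), truncate_2dp(head - base))
```

Under that truncation assumption the three printed values match the 68.64%, 68.74% and +0.09% shown in the report; plain rounding would instead give 68.65%, 68.75% and 0.10%. In both columns Hits + Misses + Partials adds up to Lines (867 + 392 + 4 = 1263 and 871 + 392 + 4 = 1267).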

| Impacted Files | Coverage Δ | |
|---|---|---|
| nannyml/drift/data_reconstruction/calculator.py | `98.96% <100.00%> (+0.04%)` | ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

nnansters · 2022-03-15T09:18:26Z

Good job on the added tests 👍

* Created step plot functionality, created artificial endpoint generation, separated legend label arguments from hover label arguments, small improvements to legend generation, added incomplete target functionality, created reference implementation for: (1) target distribution monitoring (2) realised performance monitoring (3) correct naming of plotting elements.
* multivariate drift bugfix and doc update (#37)
* [skip ci] Updated the changelog
* doc and testing updates (#38)
* 39 continuous distribution plots scaling got wrong (#40)
* fix scaling
* update docs plots
* Check if calibration is needed before performing CBPE estimation (#42)
*
  - CBPE will check if calibration is beneficial during fitting. If not, calibration will not be performed.
  - Calibration is not required when roc_auc_score == 1 (perfect predictor)
* Deal with indexing issues when using StratifiedShuffleSplit indexes on subsets
* needs_calibration threshold with some margin
* Debug results messing up fitting
* Include realized performance in CBPE results
* Plot realized performance for reference period
* Don't exclude analysis data from realized performance calculation (future work)
* Allow for different functionalities of NannyML to set their own minimum chunk size (#43)
* wip1: default min chunk size function for BaseDriftCalculator
* wip2: min default chunk size for BasePerformanceEstimator
* wip3 - perf est?
* Big chunker refactor. Minimum chunk size moved to split function as optional argument. Multiple other refactors as a consequence + test adjustments.
* wip: update BasePerfEstimator to not have functions regarding minimum chunk size
* make CBPE set its own min chunk size
* wip: min chunk size for multivariate
* add docs for minimum chunk size
* Move chunker.split to inheriting drift calculator classes
* Fix missing target values during (old) _minimum_chunk_size calculation
*
  - Move chunk splitting to PerformanceEstimator subclasses
  - Move roc-auc based min_chunk_size predictor to CBPE (also ROC-AUC based).

  Co-authored-by: Niels Nuyttens <[email protected]>
* Updated CHANGELOG.md
* Bump version: 0.2.0 → 0.2.1
* Update CHANGELOG.md
* Update CHANGELOG.md
* Feature: performance calculation (#44)
* Add predicted probabilities to metadata
* Predicted labels should be predicted scores for CBPE

  Co-authored-by: Nikolaos Perrakis <[email protected]>
  Co-authored-by: jakubnml <[email protected]>
* typo fix (#45)
* Stricter constraints for scipy
* Fixes:
  - using predicted labels during univariate continuous drift calculation
  - exclude detected predicted probabilities columns from feature list during metadata extraction
  - use predicted probabilities during drift results plotting
  - use predicted probabilities during drifting features ranking
* Fixes:
  - Still using predicted labels instead of predicted probabilities in CBPE
  - Added test to run CBPE with synthetic example data
* Fixes:
  - Added check for metadata.predicted_probability_column_name in univariate drift calculator construction + test
  - Fix some broken tests
* Replace line plots by step plots

  Co-authored-by: Wiljan Cools <[email protected]>
  Co-authored-by: Nikolaos Perrakis <[email protected]>
  Co-authored-by: jakubnml <[email protected]>

multivariate drift bugfix and doc update

6dcd725

nikml requested a review from nnansters March 14, 2022 18:57

nikml self-assigned this Mar 14, 2022

nnansters merged commit 6388d0e into main Mar 15, 2022

nikml deleted the multivariate-drift-update-02 branch March 15, 2022 15:19

nnansters pushed a commit that referenced this pull request Mar 22, 2022

multivariate drift bugfix and doc update (#37)

8cc0ec7

nnansters pushed a commit that referenced this pull request Mar 23, 2022

multivariate drift bugfix and doc update (#37)

79e9f57
