[feat]: Add common functions for prediction files validation #26

Merged · 5 commits · May 17, 2024

vpchung (Member) commented May 16, 2024

Fixes #25

Changelog

  • Functions added:

| Function Name | Description | Use case |
|---|---|---|
| check_missing_keys | Check for missing keys (participant IDs, patient IDs, etc.) | There is at least one prediction for every participant / patient / etc. |
| check_unknown_keys | Check for unknown keys | There are no predictions without a corresponding groundtruth value |
| check_duplicate_keys | Check for duplicate keys | There is exactly one prediction for each participant / patient / etc. |
| check_nan_values | Check for NaN values | There are no missing and/or null prediction values |
| check_binary_values | Check for binary values | Predictions can only be 0 (no disease) or 1 (disease) |
| check_values_range | Check that values are between min and max values | Prediction values are within the expected range, e.g. probability of disease from 0 to 1 |

  • Add reference to docs site
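
For readers who want a feel for what the six checks in the table cover, here is a minimal sketch in plain Python. It is not the code merged in this PR: the signatures, the convention of returning an error string (empty when the check passes), the `_is_missing` helper, and the choice of lists over whatever types the real functions accept are all assumptions made for illustration.

```python
import math
from collections import Counter


def _is_missing(value):
    """Hypothetical helper: True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def check_missing_keys(expected_keys, pred_keys):
    """Every expected key should have at least one prediction."""
    missing = sorted(set(expected_keys) - set(pred_keys))
    return f"Found {len(missing)} missing key(s): {missing}" if missing else ""


def check_unknown_keys(expected_keys, pred_keys):
    """No prediction should use a key absent from the groundtruth."""
    unknown = sorted(set(pred_keys) - set(expected_keys))
    return f"Found {len(unknown)} unknown key(s): {unknown}" if unknown else ""


def check_duplicate_keys(pred_keys):
    """Each key should appear no more than once in the predictions."""
    dupes = sorted(k for k, n in Counter(pred_keys).items() if n > 1)
    return f"Found {len(dupes)} duplicate key(s): {dupes}" if dupes else ""


def check_nan_values(values):
    """No prediction value should be None or NaN."""
    count = sum(1 for v in values if _is_missing(v))
    return f"Found {count} missing or NaN value(s)" if count else ""


def check_binary_values(values):
    """Values should be 0 or 1; missing values also count as non-binary here."""
    bad = [v for v in values if v not in (0, 1)]
    return f"Found {len(bad)} non-binary value(s)" if bad else ""


def check_values_range(values, min_val=0, max_val=1):
    """Values should lie in [min_val, max_val]; missing values are skipped
    because check_nan_values reports them separately (an assumption)."""
    bad = [v for v in values
           if not _is_missing(v) and not (min_val <= v <= max_val)]
    return (f"Found {len(bad)} value(s) outside [{min_val}, {max_val}]"
            if bad else "")


# Made-up example data: one duplicate, one unknown, two missing keys.
truth_ids = ["p1", "p2", "p3"]
pred_ids = ["p1", "p1", "p4"]
pred_values = [0.2, None, 1.7]

for message in (
    check_missing_keys(truth_ids, pred_ids),
    check_unknown_keys(truth_ids, pred_ids),
    check_duplicate_keys(pred_ids),
    check_nan_values(pred_values),
    check_values_range(pred_values, 0, 1),
):
    if message:
        print(message)
```

Returning a message instead of raising lets a caller run every check and report all problems at once, which tends to suit submission feedback better than stopping at the first failure.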

Preview

[Screenshot 2024-05-16 at 5 06 31 PM]

Future work

  • Add a tutorial page on how these functions can be used to write a validation script.
  • Use Great Expectations or SodaCL for validation, as suggested by @thomasyu888?
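
As a rough idea of what such a tutorial might walk through, the sketch below reads a predictions file, runs a couple of checks, and collects any error messages into a single result. Everything in it is hypothetical: the `validate` function, the `id` and `probability` column names, the `VALIDATED` / `INVALID` status labels, and the two stand-in checks, which only mimic the names of the functions in this PR and are not their actual implementations.

```python
import csv
import io
import json


def check_missing_keys(expected_keys, pred_keys):
    """Stand-in check: returns an error message, or "" when it passes."""
    missing = sorted(set(expected_keys) - set(pred_keys))
    return f"Missing keys: {missing}" if missing else ""


def check_values_range(values, min_val, max_val):
    """Stand-in check: counts values outside [min_val, max_val]."""
    count = sum(1 for v in values if not (min_val <= v <= max_val))
    return f"{count} value(s) outside [{min_val}, {max_val}]" if count else ""


def validate(pred_file, truth_ids):
    """Run every check and gather the non-empty messages.

    Assumes a CSV with `id` and `probability` columns (hypothetical names)
    and that every probability cell parses as a float.
    """
    rows = list(csv.DictReader(pred_file))
    ids = [row["id"] for row in rows]
    probs = [float(row["probability"]) for row in rows]

    messages = [
        check_missing_keys(truth_ids, ids),
        check_values_range(probs, 0.0, 1.0),
    ]
    errors = [m for m in messages if m]
    return {"status": "INVALID" if errors else "VALIDATED", "errors": errors}


# An in-memory file stands in for a real submission.
pred_csv = io.StringIO("id,probability\np1,0.2\np2,1.7\n")
result = validate(pred_csv, ["p1", "p2", "p3"])
print(json.dumps(result, indent=2))
```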

vpchung (Member, Author) commented May 16, 2024

Any other use cases we should consider? Lmk!

vpchung self-assigned this May 16, 2024

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

vpchung merged commit 293eb51 into main on May 17, 2024
2 checks passed
vpchung deleted the feat-25 branch on May 17, 2024 19:43
Linked issue

[feat] Add validation toolkit