Add correlations to Facets charts/tables #66
Comments
Two pieces here:
|
@jameswex We are currently planning to compute correlation statistics in TFDV, and will probably update the TF.Metadata statistics proto to capture them.
Any updates on this / where it is on the roadmap? I agree, this tool is excellent and the correlations are the only thing missing at the moment. As such, I was happy to see that it was already raised. Cheers.
We already have a stats generator (tensorflow_data_validation/statistics/generators/cross_feature_stats_generator.py). You can try enabling it by specifying it in StatsOptions.generators. Currently, however, Facets does not visualize the results. We could attach the cross stats as custom stats (like the LiftStatsGenerator does).
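For anyone who wants to try this, here is a minimal sketch of what enabling the generator might look like. It assumes the module exposes a `CrossFeatureStatsGenerator` class that can be constructed with default arguments; the helper name `generate_stats_with_cross_features` is made up for illustration, and none of this has been confirmed by the maintainers in this thread.

```python
def generate_stats_with_cross_features(df):
    """Return TFDV statistics for a pandas DataFrame with cross-feature stats enabled.

    Hypothetical helper: the name and the default-constructed generator are
    assumptions, not something stated in this issue.
    """
    # Imported inside the function so the sketch can be defined without TFDV
    # installed; calling it requires tensorflow_data_validation.
    import tensorflow_data_validation as tfdv
    from tensorflow_data_validation.statistics.generators import (
        cross_feature_stats_generator,
    )

    # Pass the extra generator through StatsOptions.generators, as suggested above.
    options = tfdv.StatsOptions(
        generators=[cross_feature_stats_generator.CrossFeatureStatsGenerator()]
    )
    return tfdv.generate_statistics_from_dataframe(df, stats_options=options)
```

If the generator produces output, it would presumably have to be read from the returned statistics proto directly, since `tfdv.visualize_statistics()` does not display it yet.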
Hello, is there any update on the possibility of showing correlations in tfdv.visualize_statistics()?
TensorFlow Data Validation is a great tool for looking at the data. One feature that might make it even better is computing correlations among the variables, so that if two variables are highly correlated you can avoid multicollinearity by dropping one of them. Having that available in the Facets visualization would make it easier to spot issues with the data.
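To make the request concrete, here is a rough, library-free sketch of the kind of check meant here: compute the Pearson correlation for every pair of numeric columns and list the pairs above a threshold. The function names and the 0.9 threshold are arbitrary choices for illustration, and this is not how TFDV or Facets compute anything.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant column.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def highly_correlated_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for column pairs with |r| >= threshold."""
    pairs = []
    for (name_a, xs), (name_b, ys) in combinations(columns.items(), 2):
        r = pearson(xs, ys)
        # NaN never satisfies the comparison, so constant columns are skipped.
        if abs(r) >= threshold:
            pairs.append((name_a, name_b, r))
    return pairs


data = {
    "x": [1, 2, 3, 4],
    "y": [2, 4, 6, 8],    # exact multiple of x
    "z": [1, -1, 1, -1],  # only weakly related to x and y
}
flagged = highly_correlated_pairs(data)
```

With this toy data only the ("x", "y") pair is flagged, which is the sort of pair where one of the two variables could be dropped.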