I'm having trouble specifying which features to include or exclude when visualizing statistics in TFDV. The allowlist_features and denylist_features arguments appear to require tensorflow_data_validation.types.FeaturePath objects, which took me a while to figure out how to construct. This doesn't seem very user friendly. Was the intent to allow a list of strings to be passed?
Code to reproduce
I can reproduce the problem in the public colab example. In the "Compute and Visualize Statistics" section of the above notebook, change the visualize_statistics call to: tfdv.visualize_statistics(train_stats, denylist_features=['pickup_community_area']). The denylisted feature (the first feature in the statistics) shouldn't appear in the visualization (if I'm calling this correctly).
Workaround code
To make this work, I have to manually construct a tensorflow_data_validation.types.FeaturePath object. Perhaps it would be better to do the filter comparison on each feature's path string?
# Show string name of feature
first_feat = train_stats.datasets[0].features[0]
print(first_feat.path)
# Construct the FeaturePath object needed to make the `allowlist_features` filter work
from tensorflow_data_validation import types
print(types.FeaturePath.from_proto(first_feat.path))
# docs-infra: no-execute
tfdv.visualize_statistics(train_stats, allowlist_features=[types.FeaturePath.from_proto(first_feat.path)])
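Until strings are accepted directly, the workaround above could be wrapped in a small helper. This is only a sketch: to_feature_paths and its path_cls parameter are hypothetical names and not part of TFDV. It assumes each string names a top-level feature, so it becomes a single-step FeaturePath, and it relies on the FeaturePath constructor taking an iterable of steps.

```python
from typing import Any, Iterable, List, Optional


def to_feature_paths(features: Iterable[Any],
                     path_cls: Optional[type] = None) -> List[Any]:
    """Wrap plain feature-name strings in FeaturePath objects.

    Hypothetical helper, not part of TFDV.
    """
    if path_cls is None:
        # Imported lazily so the default is TFDV's FeaturePath, while a
        # stand-in class can still be supplied explicitly.
        from tensorflow_data_validation import types
        path_cls = types.FeaturePath
    # Assumption: every string is a top-level feature name, so it maps to a
    # single-step path. Items that are not strings pass through unchanged.
    return [path_cls([f]) if isinstance(f, str) else f for f in features]
```

With this, the call from the reproduction would become tfdv.visualize_statistics(train_stats, denylist_features=to_feature_paths(['pickup_community_area'])). Nested features would still need an explicitly constructed FeaturePath, since the helper does not try to split strings into multiple steps.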