Releases: tensorflow/data-validation
TensorFlow Data Validation 1.16.1
Version 1.16.1
Major Features and Improvements
- N/A
Bug Fixes and Other Changes
- Relax dependency on Protobuf to include version 5.x
Known Issues
- N/A
Breaking Changes
- N/A
Deprecations
- N/A
TensorFlow Data Validation 1.16.0
Version 1.16.0
Major Features and Improvements
- N/A
Bug Fixes and Other Changes
- Depends on tensorflow>=2.16,<2.17.
Known Issues
- N/A
Breaking Changes
- N/A
Deprecations
- N/A
TensorFlow Data Validation 1.15.1
Major Features and Improvements
- N/A
Bug Fixes and Other Changes
- Depends on tensorflow>=2.15,<2.16.
Known Issues
- N/A
Breaking Changes
- N/A
Deprecations
- N/A
TensorFlow Data Validation 1.15.0
Major Features and Improvements
- N/A
Bug Fixes and Other Changes
- When computing cross feature statistics, skip configured crosses that
include features of unsupported types (i.e., are not univalent numeric
features).
- Update the minimum Bazel version required to build TFDV to 6.1.0.
- Modifies get_statistics_html() utility function to return a value indicating
a dataset has no examples.
- Outputs both a standard and a quantiles histogram for level N value list
length statistics.
- Add a macos_arm64 config setting to the TFDV build file. NOTE: At this
time, any M1 support for TFDV is experimental and untested.
- Bumps the pybind11 version to 2.11.1.
- Depends on tensorflow~=2.15.0.
- Depends on apache-beam[gcp]>=2.53.0,<3 for Python 3.11 and on
apache-beam[gcp]>=2.47.0,<3 for 3.9 and 3.10.
- Depends on protobuf>=4.25.2,<5 for Python 3.11 and on protobuf>3.20.3,<5
for 3.9 and 3.10.
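The Python-version-dependent pins above can be summarized in a small lookup. This is only an illustrative sketch: the function name `tfdv_1_15_0_pins` is hypothetical and is not part of TFDV or its build files; the requirement strings are copied from the notes above.

```python
import sys


def tfdv_1_15_0_pins(python_version):
    """Return the Beam and Protobuf pins listed in the 1.15.0 notes.

    `python_version` is a (major, minor) tuple. This helper is hypothetical
    and exists only to restate the release notes in executable form.
    """
    if python_version == (3, 11):
        return ["apache-beam[gcp]>=2.53.0,<3", "protobuf>=4.25.2,<5"]
    if python_version in ((3, 9), (3, 10)):
        return ["apache-beam[gcp]>=2.47.0,<3", "protobuf>3.20.3,<5"]
    # The notes list no pins for other interpreter versions.
    raise ValueError(f"No pins listed for Python {python_version}")


if __name__ == "__main__":
    current = tuple(sys.version_info[:2])
    try:
        print(tfdv_1_15_0_pins(current))
    except ValueError as err:
        print(err)
```

In a real `setup.py` the same split is usually written with PEP 508 environment markers rather than a runtime branch.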
Known Issues
- N/A
Breaking Changes
- N/A
Deprecations
- Deprecated Python 3.8 support.
- Deprecated Windows support.
TensorFlow Data Validation 1.14.0
Major Features and Improvements
- N/A
Bug Fixes and Other Changes
- Bumped the Ubuntu version on which TFX-BSL is tested to 20.04 (previously
was 16.04).
- Use @platforms instead of @bazel_tools//platforms to specify constraints in
OSS build.
- Depends on pyarrow>=10,<11.
- Depends on apache-beam>=2.47,<3.
- Depends on numpy>=1.22.0.
- Depends on tensorflow>=2.13.0,<3.
Known Issues
- N/A
Breaking Changes
- Moves some non-public arrow_util functions to TFX-BSL.
- Changes SkewPair proto to store tf.Examples in serialized format.
Deprecations
- N/A
TensorFlow Data Validation 1.13.0
Major Features and Improvements
- Introduces a Schema option HistogramSelection to allow numeric drift/skew
calculations to use QUANTILES histograms, which are more robust to outliers.
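To see why quantile-based buckets are more robust to outliers than equal-width buckets, consider the sketch below. It is plain Python, not TFDV code: the helper names are hypothetical, and the boundary computation is a simplified stand-in for whatever TFDV does internally.

```python
def standard_boundaries(values, num_buckets):
    """Equal-width bucket boundaries between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_buckets
    return [lo + i * width for i in range(num_buckets + 1)]


def quantile_boundaries(values, num_buckets):
    """Boundaries chosen so each bucket holds roughly the same number of values."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return [ordered[(i * last) // num_buckets] for i in range(num_buckets + 1)]


# 99 typical values plus a single extreme outlier.
values = list(range(1, 100)) + [10_000]

standard = standard_boundaries(values, 4)
quantiles = quantile_boundaries(values, 4)

# With equal-width buckets, the outlier stretches the range so far that
# every typical value lands in the first bucket.
in_first_standard_bucket = sum(1 for v in values if v <= standard[1])

print(standard)
print(quantiles)
print(in_first_standard_bucket)
```

With equal-width buckets, the shape of the typical values is lost because one outlier dictates the bucket width. The quantile boundaries still split the typical values into comparable groups.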
Bug Fixes and Other Changes
- Rename statistics_io_impl and default_record_sink (not part of public API).
- Update the minimum Bazel version required to build TFDV to 5.3.0.
- Depends on numpy~=1.22.0.
- Depends on pyfarmhash>=0.2.2,<0.4.
- Depends on tensorflow>=2.12.0,<2.13.
- Depends on protobuf>=3.20.3,<5.
- Depends on tfx-bsl>=1.13.0,<1.14.0.
- Depends on tensorflow-metadata>=1.13.1,<1.14.0.
Known Issues
- N/A
Breaking Changes
- Jensen-Shannon divergence now treats NaN values as always contributing to
higher drift score.
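These notes do not spell out how NaN values are folded into the score. The sketch below assumes, purely for illustration, that NaNs are counted in an extra bucket of their own, so that a dataset that gains NaNs diverges from one that has none. The function and the bucket layout are hypothetical, not TFDV's implementation.

```python
import math


def js_divergence(p_counts, q_counts):
    """Jensen-Shannon divergence (base 2) between two aligned bucket counts.

    Both inputs must have the same length and a positive total.
    """
    p_total, q_total = sum(p_counts), sum(q_counts)
    p = [c / p_total for c in p_counts]
    q = [c / q_total for c in q_counts]
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability mass contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Bucket counts: [low values, high values, NaN]. The NaN bucket is an assumption.
baseline = [50, 50, 0]
serving = [40, 40, 20]

with_nan_bucket = js_divergence(baseline, serving)
# Dropping the NaN bucket hides the change entirely: the remaining
# proportions are identical in both datasets.
nan_dropped = js_divergence(baseline[:2], serving[:2])

print(with_nan_bucket, nan_dropped)
```

Under this assumption, NaNs that would be invisible if silently dropped raise the divergence, which matches the direction of the change described above.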
Deprecations
- Deprecated Python 3.7 support.
TensorFlow Data Validation 1.12.0
Major Features and Improvements
- N/A
Bug Fixes and Other Changes
- TFDV is now tested against macOS 12.5 (Monterey).
Known Issues
- N/A
Breaking Changes
- Depends on tensorflow>=2.11,<3.
- Depends on tfx-bsl>=1.12.0,<1.13.0.
- Depends on tensorflow-metadata>=1.12.0,<1.13.0.
Deprecations
- N/A
TensorFlow Data Validation 1.11.0
Major Features and Improvements
- This is the last version that supports TensorFlow 1.15.x. TF 1.15.x support
will be removed in the next version. Please check the TF2 migration guide to
migrate to TF2.
- Add a custom_validate_statistics function to the validation API, and
support passing custom validations to validate_statistics. Note that
custom validation is not supported on Windows.
Bug Fixes and Other Changes
- Fix bug in implementation of semantic_domain_stats_sample_rate.
- Add Beam metrics on string length.
- Determine whether to calculate string statistics based on the
is_categorical field in the schema string domain.
- Histogram counts should now be more accurate for distributions with few
distinct values, or frequent individual values.
- Nested list length histogram counts are no longer based on the number of
values one up in the nested list hierarchy.
- Support using Jensen-Shannon divergence to detect drift and skew for string
and categorical features.
- get_drift_skew_dataframe now includes a threshold column.
- Adds support for NormalizedAbsoluteDifference comparator.
- Depends on tensorflow>=1.15.5,<2 or tensorflow>=2.10,<3.
- Depends on joblib>=1.2.0.
Known Issues
- N/A
Breaking Changes
- Histogram semantics are slightly changed, so that buckets include their
upper bound instead of their lower bound. STANDARD histograms will no longer
generate buckets that contain infinite and finite endpoints together.
- Introduces StatsOptions.use_sketch_based_topk_uniques replacing
experimental_use_sketch_based_topk_uniques. The latter option can still be
written, but not read.
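The change in histogram bucket semantics only affects values that fall exactly on a bucket boundary. The sketch below contrasts the two conventions in plain Python. The helper names are hypothetical, and the treatment of the overall minimum and maximum (clamped into the first and last bucket) is an assumption, not something these notes state.

```python
import bisect


def bucket_upper_inclusive(value, boundaries):
    """Index of the bucket (low, high] containing `value`.

    `boundaries` is a sorted list of n + 1 edges for n buckets. The overall
    minimum is clamped into the first bucket (assumption).
    """
    return max(bisect.bisect_left(boundaries, value) - 1, 0)


def bucket_lower_inclusive(value, boundaries):
    """Index of the bucket [low, high) containing `value`.

    The overall maximum is clamped into the last bucket (assumption).
    """
    last_bucket = len(boundaries) - 2
    return min(bisect.bisect_right(boundaries, value) - 1, last_bucket)


boundaries = [0.0, 1.0, 2.0, 3.0]  # three buckets

# A value sitting exactly on an interior edge moves down one bucket
# under the upper-inclusive convention.
print(bucket_lower_inclusive(1.0, boundaries), bucket_upper_inclusive(1.0, boundaries))
# Values strictly inside a bucket are unaffected.
print(bucket_lower_inclusive(1.5, boundaries), bucket_upper_inclusive(1.5, boundaries))
```

Distributions with many repeated values on bucket edges (for example, integer-valued features) are where counts are most likely to shift between versions.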
Deprecations
- N/A
TensorFlow Data Validation 1.10.0
Major Features and Improvements
- N/A
Bug Fixes and Other Changes
- Skew pipeline supports counting pairs of feature values in base/test.
- Depends on apache-beam[gcp]>=2.40,<3.
- Depends on pyarrow>=6,<7.
- Depends on tfx-bsl>=1.10.1,<1.11.0.
- Depends on tensorflow-metadata>=1.10.0,<1.11.0.
Known Issues
- N/A
Breaking Changes
- N/A
Deprecations
- N/A
TensorFlow Data Validation 1.9.0
Major Features and Improvements
- N/A
Bug Fixes and Other Changes
- Depends on tensorflow>=1.15.5,<2 or tensorflow>=2.9,<3.
- Depends on tfx-bsl>=1.9.0,<1.10.0.
- Depends on tensorflow-metadata>=1.9.0,<1.10.0.
Known Issues
- N/A
Breaking Changes
- Some fields in feature skew results proto changed names to be more generic.
Deprecations
- N/A