Fix TAP and Kokoro tests caused by NumPy v2 migration.
1. To keep tests passing under both NumPy v1 and v2, the comparison tolerance is now 1e-4, which absorbs the small floating-point differences (around 1e-4) between the two versions. The expected proto floats have also been updated to match NumPy v2 results.
2. For mutual_information, NumPy v2 can handle values > 2**53 when the min and max of the examples are the same, whereas NumPy v1 cannot. Because both versions must be supported, the related unit tests check the NumPy version before running.

PiperOrigin-RevId: 681598675
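
The tolerance change in item 1 can be illustrated with a minimal sketch. This is not the project's test helper: `assert_close` is a hypothetical name, and treating 1e-4 as an absolute tolerance is an assumption, since the commit message does not say whether the tolerance is absolute or relative. The sample values are old/new expectation pairs taken from the diff below.

```python
import numpy as np

# Tolerance named in the commit message. Treating it as an absolute
# tolerance is an assumption made for this sketch.
TOLERANCE = 1e-4


def assert_close(actual, expected, tol=TOLERANCE):
    """Hypothetical helper: passes when |actual - expected| <= tol."""
    np.testing.assert_allclose(actual, expected, rtol=0, atol=tol)


# Old/new expectation pairs from the diff; each differs by less than 1e-4.
assert_close(6.6666665, 6.666666666666667)   # bucket boundary
assert_close(10.0220747, 10.0220751)         # sample_count
assert_close(99.6683884, 99.668325)          # sample_count, largest gap
```

With a tolerance of this size, either set of expected values would be accepted, which is what lets one test file serve both NumPy versions.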
tfx-copybara committed Oct 2, 2024
1 parent bca0f85 commit 052aec8
Showing 7 changed files with 165 additions and 189 deletions.
76 changes: 0 additions & 76 deletions .github/workflows/build.yml

This file was deleted.

8 changes: 4 additions & 4 deletions README.md
@@ -66,9 +66,9 @@ tested at Google.

### 1. Install Docker

Please first install `docker` and `docker compose` by following the directions:
Please first install `docker` and `docker-compose` by following the directions:
[docker](https://docs.docker.com/install/);
[docker compose](https://docs.docker.com/compose/install/).
[docker-compose](https://docs.docker.com/compose/install/).

### 2. Clone the TFDV repository

@@ -86,8 +86,8 @@ branch), pass `-b <branchname>` to the `git clone` command.
Then, run the following at the project root:

```bash
sudo docker compose build manylinux2010
sudo docker compose run -e PYTHON_VERSION=${PYTHON_VERSION} manylinux2010
sudo docker-compose build manylinux2010
sudo docker-compose run -e PYTHON_VERSION=${PYTHON_VERSION} manylinux2010
```
where `PYTHON_VERSION` is one of `{39, 310, 311}`.

@@ -267,18 +267,18 @@
histograms {
buckets {
low_value: 5.0
high_value: 6.666666666666667
sample_count: 10.0220751
high_value: 6.6666665
sample_count: 10.0220747
}
buckets {
low_value: 6.666666666666667
high_value: 8.333333333333334
low_value: 6.6666665
high_value: 8.333333
sample_count: 0.0220751
}
buckets {
low_value: 8.333333333333334
low_value: 8.333333
high_value: 10.0
sample_count: 9.9558499
sample_count: 9.9558363
}
}
histograms {
@@ -340,18 +340,18 @@
histograms {
buckets {
low_value: 1.0
high_value: 1.3333333333333333
sample_count: 10.0220751
high_value: 1.3333334
sample_count: 10.0220747
}
buckets {
low_value: 1.3333333333333333
low_value: 1.3333334
high_value: 1.6666666666666665
sample_count: 0.0220751
}
buckets {
low_value: 1.6666666666666665
high_value: 2.0
sample_count: 9.9558499
sample_count: 9.9558363
}
}
histograms {
@@ -787,18 +787,18 @@
histograms {
buckets {
low_value: 5.0
high_value: 6.666666666666667
sample_count: 10.0220751
high_value: 6.6666665
sample_count: 10.0220747
}
buckets {
low_value: 6.666666666666667
high_value: 8.333333333333334
low_value: 6.6666665
high_value: 8.333333
sample_count: 0.0220751
}
buckets {
low_value: 8.333333333333334
low_value: 8.333333
high_value: 10.0
sample_count: 9.9558499
sample_count: 9.9558363
}
}
histograms {
@@ -826,18 +826,18 @@
histograms {
buckets {
low_value: 5.0
high_value: 6.666666666666667
sample_count: 50.1658375
high_value: 6.6666665
sample_count: 50.1658363
}
buckets {
low_value: 6.666666666666667
high_value: 8.333333333333334
low_value: 6.6666665
high_value: 8.333333
sample_count: 0.1658375
}
buckets {
low_value: 8.333333333333334
low_value: 8.333333
high_value: 10.0
sample_count: 99.668325
sample_count: 99.6683884
}
}
histograms {
@@ -905,18 +905,18 @@
histograms {
buckets {
low_value: 1.0
high_value: 1.3333333333333333
sample_count: 10.0220751
high_value: 1.3333334
sample_count: 10.0220747
}
buckets {
low_value: 1.3333333333333333
low_value: 1.3333334
high_value: 1.6666666666666665
sample_count: 0.0220751
}
buckets {
low_value: 1.6666666666666665
high_value: 2.0
sample_count: 9.9558499
sample_count: 9.9558363
}
}
histograms {
@@ -944,18 +944,18 @@
histograms {
buckets {
low_value: 1.0
high_value: 1.3333333333333333
sample_count: 50.1658375
high_value: 1.3333334
sample_count: 50.1658363
}
buckets {
low_value: 1.3333333333333333
low_value: 1.3333334
high_value: 1.6666666666666665
sample_count: 0.1658375
}
buckets {
low_value: 1.6666666666666665
high_value: 2.0
sample_count: 99.668325
sample_count: 99.6683884
}
}
histograms {
@@ -1864,18 +1864,18 @@ def test_with_multiple_features(self):
histograms {
buckets {
low_value: 1.0
high_value: 2.3333333
high_value: 2.3333335
sample_count: 3.0049751
}
buckets {
low_value: 2.3333333
low_value: 2.3333335
high_value: 3.6666667
sample_count: 1.0049751
sample_count: 1.0049746
}
buckets {
low_value: 3.6666667
high_value: 5.0
sample_count: 1.9900498
sample_count: 1.9900484
}
type: STANDARD
}
@@ -1903,7 +1903,9 @@ def test_with_multiple_features(self):
type: QUANTILES
}
}
""", statistics_pb2.FeatureNameStatistics()),
""",
statistics_pb2.FeatureNameStatistics(),
),
types.FeaturePath(['b']): text_format.Parse(
"""
path {
@@ -1938,7 +1940,9 @@ def test_with_multiple_features(self):
}
avg_length: 1.71428571
}
""", statistics_pb2.FeatureNameStatistics()),
""",
statistics_pb2.FeatureNameStatistics(),
),
types.FeaturePath(['c']): text_format.Parse(
"""
path {
@@ -2018,7 +2022,10 @@ def test_with_multiple_features(self):
type: QUANTILES
}
}
""", statistics_pb2.FeatureNameStatistics())}
""",
statistics_pb2.FeatureNameStatistics(),
),
}
generator = basic_stats_generator.BasicStatsGenerator(
num_values_histogram_buckets=3, num_histogram_buckets=3,
num_quantiles_histogram_buckets=4, epsilon=0.001)
@@ -228,8 +228,9 @@ def _encode_multivalent_numeric_feature(
_, histogram_bin_boundaries = np.histogram(
flattened_feature_values, bins=encoding_length - 1)
except IndexError as e:
# np.histogram cannot handle values > 2**53 if the min and max of the
# examples are the same. https://github.com/numpy/numpy/issues/8627
# For NumPy version 1.x.x, np.histogram cannot handle values > 2**53 if the
# min and max of the examples are the same.
# https://github.com/numpy/numpy/issues/8627
logging.exception("Unable to encode examples: %s with error: %s",
flattened_feature_values, e)
return None
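
The version check described in the commit message could look roughly like the sketch below. All names here (`NUMPY_MAJOR`, `histogram_boundaries_or_none`, `LargeValueHistogramTest`) are hypothetical and are not taken from the TFDV code base. The helper mirrors the `try`/`except IndexError` pattern in the hunk above, and the test input is an illustrative guess: whether it triggers the `IndexError` on every NumPy 1.x release has not been verified here.

```python
import unittest

import numpy as np

# Major version of the installed NumPy, e.g. 1 or 2.
NUMPY_MAJOR = int(np.__version__.split(".")[0])


def histogram_boundaries_or_none(values, bins):
    """Hypothetical stand-in for the guarded np.histogram call above.

    Returns the bin boundaries, or None when np.histogram raises IndexError.
    """
    try:
        _, boundaries = np.histogram(values, bins=bins)
    except IndexError:
        return None
    return boundaries


class LargeValueHistogramTest(unittest.TestCase):

    # Per the commit message, NumPy v2 handles this input, so the
    # failure-path expectation is only checked on NumPy v1.
    @unittest.skipIf(NUMPY_MAJOR >= 2, "failure path only applies to NumPy v1")
    def test_large_identical_values_are_not_encoded(self):
        # Assumed input: identical values above 2**53 (min == max).
        values = np.array([2**53 + 1, 2**53 + 1])
        self.assertIsNone(histogram_boundaries_or_none(values, bins=3))
```

Saved as a module, the test can be run with `python -m unittest`; under NumPy v2 it is reported as skipped rather than failed.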