Add tests for auto-(de)compression #195

goeffthomas · 2024-12-16T21:50:33Z

Notes:

- This is follow-on work from Fix auto-compressed dataset downloads #194
- Also added some integration tests
- Note that not all stubs have been moved to this redirect path. Only the paths that could have compressed URLs in prod are affected (Comps and Datasets). We can move all download paths to use this if we'd like.

http://b/379756505

rosbo

We can move all download paths to use this if we'd like.

Given that we also have integration tests that test it e2e, it is probably fine for models to not have the GCS redirect in our unit tests, since it never auto-compresses files.

rosbo · 2024-12-16T22:00:18Z

integration_tests/test_competition_download.py

+    def test_auto_decompress_file(self) -> None:
+        with create_test_cache():
+            # sample_submission.csv is an auto-compressed CSV with the following columns
+            expected_columns = ["TransactionId", "isFraud"]


Thanks for adding integration tests with deeper assertions to ensure we don't regress on this!

rosbo · 2024-12-16T22:02:09Z

integration_tests/utils.py

-def assert_files(test_case: unittest.TestCase, path: str, expected_files: list[str]) -> bool:
+def list_columns(path: str) -> list[str]:
+    """Assuming the path is a CSV, list all columns sorted lexicographically"""
+    with open(path) as file:


Alternatively, you can use csv.reader from the Python standard library.

https://docs.python.org/3/library/csv.html#csv.reader

But given this is really simple, I will leave it up to you.

Yeah, I think given the simplicity, I'll leave it for now. If we want more robust features later or run into issues that the library better handles, I'd be happy to switch it up.
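
For reference, a minimal sketch of what the `csv.reader` variant suggested above might look like. The full body of `list_columns` is not shown in this excerpt, so this is an illustration under the assumption that the helper only needs the header row; it is not the merged implementation.

```python
import csv


def list_columns(path: str) -> list[str]:
    """Assuming the path is a CSV, list all columns sorted lexicographically."""
    # The csv module docs recommend opening files with newline="" for csv.reader.
    with open(path, newline="") as file:
        # The first row holds the column names; fall back to [] for an empty file.
        header = next(csv.reader(file), [])
    return sorted(header)
```

Compared with splitting the first line on commas by hand, `csv.reader` also handles quoted column names that contain commas, which is the kind of robustness the review comment alludes to.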

goeffthomas requested review from rosbo and neshdev December 16, 2024 21:50

goeffthomas added 2 commits December 16, 2024 21:56

Fix integration test

1f751be

Remove unused import

bbce02d

rosbo approved these changes Dec 16, 2024

goeffthomas added 2 commits December 16, 2024 22:12

Fix lint/tests

fcf9e22

Fix typo

0d779f4

goeffthomas merged commit 5fdb159 into main Dec 16, 2024
6 checks passed

goeffthomas deleted the add-more-decompression-tests branch December 16, 2024 22:20
