@huaxingao
Contributor

TestCompressionSettings#testWriteDataWithDifferentSetting() should reset the Spark configuration between tests so that earlier settings do not influence later ones. For instance, consider two consecutive configurations.
The first configuration uses:

ImmutableMap.of(COMPRESSION_CODEC, "zstd", COMPRESSION_LEVEL, "1")

The second configuration uses:

ImmutableMap.of(COMPRESSION_CODEC, "gzip")

Currently, the configuration is not reset, so the second run still carries the COMPRESSION_LEVEL from the first and effectively uses COMPRESSION_CODEC = "gzip" together with COMPRESSION_LEVEL = "1". As a result, the test may not exercise the settings it declares.
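To make the leak concrete, here is a minimal sketch rather than the actual test code. It assumes a plain in-memory map as a stand-in for the shared Spark session configuration; the class name `ConfResetSketch`, the helper names, and the key strings are all hypothetical placeholders, not the constants or APIs used in the Iceberg test.

```java
import java.util.Map;
import java.util.TreeMap;

public class ConfResetSketch {
  // Placeholder keys (assumption): the real test uses constants from the Iceberg Spark module.
  static final String COMPRESSION_CODEC = "compression-codec";
  static final String COMPRESSION_LEVEL = "compression-level";

  // Stand-in for the session configuration, which outlives a single test case.
  static final Map<String, String> sessionConf = new TreeMap<>();

  // Applies one case's properties, optionally clearing the compression keys first.
  static Map<String, String> applyCase(Map<String, String> props, boolean resetFirst) {
    if (resetFirst) {
      sessionConf.remove(COMPRESSION_CODEC);
      sessionConf.remove(COMPRESSION_LEVEL);
    }
    sessionConf.putAll(props);
    // Return a snapshot of what this case would actually run with.
    return new TreeMap<>(sessionConf);
  }

  // Runs the two configurations from the description back to back
  // and returns the effective settings seen by the second one.
  static Map<String, String> runCases(boolean resetFirst) {
    sessionConf.clear();
    applyCase(Map.of(COMPRESSION_CODEC, "zstd", COMPRESSION_LEVEL, "1"), resetFirst);
    return applyCase(Map.of(COMPRESSION_CODEC, "gzip"), resetFirst);
  }

  public static void main(String[] args) {
    System.out.println("without reset: " + runCases(false));
    System.out.println("with reset:    " + runCases(true));
  }
}
```

Without the reset, the second case still sees the compression level set by the first; with the reset, it sees only the codec it asked for.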

@github-actions github-actions bot added the spark label Oct 16, 2024
@huaxingao
Contributor Author

cc @szehon-ho

Member

@RussellSpitzer RussellSpitzer left a comment

LGTM, and I really appreciate that you added asserts to make sure the config is in use!

@RussellSpitzer RussellSpitzer merged commit 043757c into apache:main Oct 23, 2024
@huaxingao
Contributor Author

Thanks a lot @RussellSpitzer

@huaxingao huaxingao deleted the compression_setting branch October 23, 2024 21:08
@RussellSpitzer
Member

Thank you @huaxingao! It's always great to fix those hidden bad tests!
