@rmnskb
Contributor

@rmnskb rmnskb commented Oct 7, 2025

Rationale for this change

Please see #47441 and #41476.
The ArrowWriterProperties.write_time_adjusted_to_utc flag is available in C++ but is not accessible from Python. This PR exposes the flag in the Python API as well.

What changes are included in this PR?

Exposes the write_time_adjusted_to_utc boolean argument in Python's Parquet writer API.

Are these changes tested?

Yes, with Parquet roundtrip tests covering every combination of time type and its supported time units.

Are there any user-facing changes?

Users will be able to set this flag directly from the Python API.

@github-actions

github-actions bot commented Oct 7, 2025

⚠️ GitHub issue #47411 has been automatically assigned in GitHub to PR creator.

@rmnskb rmnskb changed the title from "GH-47411: [Python][Parquet] Allow passing write_time_adjusted_to_utc to Python's ParquetWriter" to "GH-47441: [Python][Parquet] Allow passing write_time_adjusted_to_utc to Python's ParquetWriter" Oct 7, 2025
@github-actions

github-actions bot commented Oct 7, 2025

⚠️ GitHub issue #47441 has no components, please add labels for components.

@rmnskb
Contributor Author

rmnskb commented Oct 8, 2025

Hey @pitrou, I've exposed the UTC flag in Python's Parquet API.
However, I wanted to ask you if it would make sense to test these changes? The original PR that I've linked tests them by checking the ArrowWriterProperties, which are not available to Python from the pq.ParquetWriter.writer. Would it make sense to write an end to end test?
Thank you for your input in advance!

Member

@pitrou pitrou left a comment

You'll need to update the docstrings as well (see _parquet_writer_arg_docs)

@pitrou
Member

pitrou commented Oct 8, 2025

> However, I wanted to ask you if it would make sense to test these changes? The original PR that I've linked tests them by checking the ArrowWriterProperties, which are not available to Python from the pq.ParquetWriter.writer. Would it make sense to write an end to end test?

We can indeed write a simple end-to-end test. Showing that the data roundtrips is IMHO sufficient.

@github-actions github-actions bot added the "awaiting committer review" label and removed the "awaiting review" label Oct 8, 2025
@rmnskb
Contributor Author

rmnskb commented Oct 8, 2025

> We can indeed write a simple end-to-end test. Showing that the data roundtrips is IMHO sufficient.

Thanks!
Is there any way from the Python API to check the metadata if the UTC flag was successfully applied? I could not find anything in the documentation.

@pitrou
Member

pitrou commented Oct 8, 2025

> Is there any way from the Python API to check the metadata if the UTC flag was successfully applied? I could not find anything in the documentation.

I don't think so, no.

@github-actions

github-actions bot commented Oct 8, 2025

⚠️ GitHub issue #47441 has no components, please add labels for components.

@rmnskb rmnskb marked this pull request as ready for review October 8, 2025 21:16
@rmnskb rmnskb requested review from AlenkaF, raulcd and rok as code owners October 8, 2025 21:16
@rmnskb rmnskb requested a review from pitrou October 8, 2025 21:16
@rmnskb rmnskb requested a review from pitrou October 9, 2025 15:51
Member

@pitrou pitrou left a comment

LGTM, just one small suggestion

@pitrou pitrou merged commit ed91f6f into apache:main Oct 14, 2025
12 of 13 checks passed
@pitrou pitrou removed the "awaiting committer review" label Oct 14, 2025
@conbench-apache-arrow

After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit ed91f6f.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 1 possible false positive for unstable benchmarks that are known to sometimes produce them.

zanmato1984 pushed a commit to zanmato1984/arrow that referenced this pull request Oct 15, 2025
…o_utc to Python's ParquetWriter (apache#47745)

* GitHub Issue: apache#47441

Lead-authored-by: Bogdan Romenskii
Co-authored-by: Antoine Pitrou
Signed-off-by: Antoine Pitrou