Set timestamp time zone in parquet writer #10474
JkSelf wants to merge 8 commits into facebookincubator:main
@pedroerp @jinchengchenghh I have resolved all your comments. Could you please review again? Thanks.
@pedroerp @jinchengchenghh Thanks for your review. I have resolved all your comments. Could you please review again? Thanks.
pedroerp left a comment:
A few small comments, but overall looks good.
@jinchengchenghh please take one last pass when you have a moment, then I can get it merged.
Cc: @majetideepak
@jinchengchenghh @pedroerp Thanks for your review. I have resolved all your comments. Could you please review again? Thanks.
@jinchengchenghh Could you please review again? Thanks.
majetideepak left a comment:
Changes look good! Just need to address the outdated code comment.
@jinchengchenghh @majetideepak @pedroerp I have resolved all your comments. Could you please review again? Thanks.
Thanks~
@kagamiori has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@kagamiori Could you please help merge this? Thanks.
Hi @JkSelf, could you please rebase the PR onto the latest main? That would help with merging. Thanks!
@kagamiori Yes, rebased onto the latest main. Thanks.
@kagamiori has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@kagamiori Is the failed unit test related to this PR?
No problem. I was out of office yesterday. Merging it now.
@kagamiori merged this pull request in 9355109.
Conbench analyzed the 1 benchmark run on the commit. There were no benchmark performance regressions. 🎉 The full Conbench report has more details.
When we use Gluten's Parquet writer to write timestamp data, the timestamps read back differ from vanilla Spark's by 7 hours, which matches a time zone offset. Spark adjusts timestamps to UTC when writing them. Velox's Parquet writer makes no such adjustment, because the time zone is not set when the Arrow TimestampType is created. To address this, this PR sets the time zone in the Arrow TimestampType based on the kSessionTimezone configuration.
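To make the 7-hour gap concrete, here is a minimal sketch of how the same stored value can be shown differently depending on whether the timestamp type carries a time zone. It is a simplified model, not the code in this PR: `TimestampTypeSketch`, `isAdjustedToUtc`, and `displayedMicros` are hypothetical names, the UTC-7 session offset is an assumption (the description only mentions a 7-hour difference), and the rule that a non-empty time zone marks the column as UTC-adjusted reflects my understanding of how the Arrow Parquet writer maps the type.

```cpp
#include <cstdint>
#include <string>

constexpr int64_t kMicrosPerHour = 3600LL * 1000 * 1000;

// Hypothetical stand-in for a timestamp type descriptor. Only the time zone
// string matters for this sketch; this is not the Arrow class itself.
struct TimestampTypeSketch {
  std::string timezone; // Empty means no time zone was set.
};

// Assumed rule: a non-empty time zone marks the column as UTC-adjusted
// (values are instants); an empty one marks values as local wall-clock time.
bool isAdjustedToUtc(const TimestampTypeSketch& type) {
  return !type.timezone.empty();
}

// Simplified reader model: returns the wall-clock microseconds that a reader
// in a session with the given UTC offset would show for a stored value.
int64_t displayedMicros(
    int64_t storedMicros,
    const TimestampTypeSketch& type,
    int64_t sessionOffsetMicros) {
  // UTC-adjusted values are shifted into the session zone for display;
  // values without a time zone are shown unchanged.
  return isAdjustedToUtc(type) ? storedMicros + sessionOffsetMicros
                               : storedMicros;
}
```

With a stored value of 12 hours and an assumed session offset of -7 hours, the two interpretations differ by exactly the session offset, which is the kind of mismatch the description reports.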