
Fix an issue while accessing Symlink tables#25307

Merged
imjalpreet merged 1 commit into prestodb:master from imjalpreet:fixSymlinkConfigObject
Jun 19, 2025

Conversation

@imjalpreet
Member

Description

The introduction of WrapperJobConf and CopyOnFirstWriteConfiguration led to issues while accessing Symlink tables, since the actual configuration object is now wrapped inside another configuration object.
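
To illustrate how a wrapped configuration can hide settings, here is a minimal Java sketch. The classes `SimpleConf` and `WrappingConf` and the key `fs.example.endpoint` are hypothetical stand-ins and are not Presto's WrapperJobConf or CopyOnFirstWriteConfiguration. The copy-constructor failure shown is an assumption about one way this class of bug can surface, not a description of the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

public class WrappedConfigSketch {
    // Hypothetical stand-in for a Hadoop-style configuration object.
    static class SimpleConf {
        protected final Map<String, String> props = new HashMap<>();

        SimpleConf() {}

        // Copy constructor that reads the other object's internal map directly,
        // so it bypasses any overridden get() on a wrapper.
        SimpleConf(SimpleConf other) {
            props.putAll(other.props);
        }

        String get(String key) {
            return props.get(key);
        }

        void set(String key, String value) {
            props.put(key, value);
        }
    }

    // Hypothetical wrapper: reads and writes go to the delegate,
    // while the wrapper's own internal map stays empty.
    static class WrappingConf extends SimpleConf {
        private final SimpleConf delegate;

        WrappingConf(SimpleConf delegate) {
            this.delegate = delegate;
        }

        @Override
        String get(String key) {
            return delegate.get(key);
        }

        @Override
        void set(String key, String value) {
            delegate.set(key, value);
        }

        SimpleConf unwrap() {
            return delegate;
        }
    }

    public static void main(String[] args) {
        SimpleConf base = new SimpleConf();
        base.set("fs.example.endpoint", "http://localhost:9000");
        WrappingConf wrapped = new WrappingConf(base);

        // Copying the wrapper copies its empty internal map: the setting is lost.
        SimpleConf copyOfWrapper = new SimpleConf(wrapped);
        // Copying the unwrapped delegate keeps the setting.
        SimpleConf copyOfDelegate = new SimpleConf(wrapped.unwrap());

        System.out.println("via wrapper:      " + wrapped.get("fs.example.endpoint"));
        System.out.println("copy of wrapper:  " + copyOfWrapper.get("fs.example.endpoint"));
        System.out.println("copy of delegate: " + copyOfDelegate.get("fs.example.endpoint"));
    }
}
```

In this sketch, code that only calls `get()` on the wrapper behaves correctly, while code that copies or inspects the wrapper's internal state sees nothing. This would be consistent with the problem appearing only on code paths that depend on modified settings.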

Motivation and Context

Fixes #25306

Test Plan

Ran manual tests and was able to access the Symlink tables on the S3 filesystem.

presto:tpch_sf1000_parquet_delta> select * from customer limit 2;
  custkey  |        name        |             address             | nationkey |      phone      | acctbal | mktsegment |                                    comment                                     
-----------+--------------------+---------------------------------+-----------+-----------------+---------+------------+--------------------------------------------------------------------------------
  99281969 | Customer#099281969 | 2zrmoAd9sF10aS4id8zLiin0pGxZyJ6 |        19 | 29-421-607-7641 | -230.99 | HOUSEHOLD  | c instructions cajole along the ruthlessly ironic platelets; blithely ironic d 
 122862757 | Customer#122862757 | EWzBgsC,AZcCdAG8j               |         9 | 19-250-866-9749 | -946.34 | AUTOMOBILE | egular theodolites. furiously ironic ide                                       

(2 rows)

Query 20250611_003932_00002_kthgg, FINISHED, 1 node
Splits: 217 total, 96 done (44.24%)
[Latency: client-side: 1:07, server-side: 1:06] [34 rows, 461MB] [0 rows/s, 6.93MB/s]

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

Hive Connector Changes
* Fix an issue while accessing Symlink tables

@imjalpreet imjalpreet self-assigned this Jun 12, 2025
@imjalpreet imjalpreet requested a review from a team as a code owner June 12, 2025 20:58
@imjalpreet imjalpreet requested a review from jaystarshot June 12, 2025 20:58
@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Jun 12, 2025
@prestodb-ci prestodb-ci requested review from a team, Dilli-Babu-Godari and namya28 and removed request for a team June 12, 2025 20:58
@imjalpreet imjalpreet force-pushed the fixSymlinkConfigObject branch from 7244684 to 81a6edf Compare June 13, 2025 17:42
Member

@agrawalreetika agrawalreetika left a comment

Is there a specific scenario in which the config causes an issue? There is an existing symlink test here https://github.com/prestodb/presto/blob/master/presto-product-tests/src/main/java/com/facebook/presto/tests/hive/TestSymlinkTableListCaching.java#L72 which uses a catalog with a config resource and works fine.

@imjalpreet
Member Author

One example is included in the issue linked in the description. The problem occurs only when a query relies on Hadoop configurations that Presto has modified. The integration tests run on HDFS and Hadoop, which probably work fine with the default Hadoop Configuration object.

@tdcmeehan
Contributor

Any way to unit test this?

@imjalpreet
Member Author

imjalpreet commented Jun 19, 2025

> Any way to unit test this?

Yeah, there aren't any existing tests to catch these edge cases. To catch this issue, we need to use a filesystem other than HDFS, something like S3 or GCS, which are supported by Presto out of the box. I will look to either add an integration test using MinIO (which simulates S3) or explore adding a unit test specifically to verify the Configuration object being used in that part of the code.

I will add it in a follow-up PR, as discussed. Thanks.

@imjalpreet imjalpreet merged commit aaf3d3c into prestodb:master Jun 19, 2025
225 of 229 checks passed
@prestodb-ci prestodb-ci mentioned this pull request Jul 28, 2025
6 tasks

Labels

from:IBM PR from IBM

Development

Successfully merging this pull request may close these issues.

Issue while accessing Symlink Tables on S3

4 participants