
Handle S3 paths where bucket name contains "dot digit" #19911

Merged
findepi merged 1 commit into master from
findepi/handle-s3-paths-where-bucket-name-contains-dot-digit-f77f05
Nov 27, 2023

@findepi findepi commented Nov 27, 2023

The `HadoopPaths` compatibility layer was failing when an S3 bucket name
contained a dot followed by a digit:

```
java.net.URISyntaxException: Illegal character in hostname at index 10: s3://test.123/abc//xyz.csv#abc//xyz.csv
	at java.base/java.net.URI$Parser.fail(URI.java:2974)
	at java.base/java.net.URI$Parser.parseHostname(URI.java:3517)
	at java.base/java.net.URI$Parser.parseServer(URI.java:3358)
	at java.base/java.net.URI$Parser.parseAuthority(URI.java:3277)
	at java.base/java.net.URI$Parser.parseHierarchical(URI.java:3219)
	at java.base/java.net.URI$Parser.parse(URI.java:3175)
	at java.base/java.net.URI.<init>(URI.java:708)
	at java.base/java.net.URI.<init>(URI.java:809)
	at io.trino.filesystem.hdfs.HadoopPaths.toPathEncodedUri(HadoopPaths.java:46)
```

Using a different `URI` constructor avoids the problem.
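
For illustration, a minimal sketch of the difference between two `java.net.URI` constructors. The description does not name the constructors involved, so this is an assumption: the failing call is taken to be the four-argument host-based constructor (consistent with the `parseServer` and nested `URI.<init>` frames in the trace), and the replacement is taken to be the five-argument authority-based one. `BucketUriDemo` and its methods are hypothetical names, not code from this change.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class BucketUriDemo {
    // Assumed failing variant: URI(scheme, host, path, fragment) requires the
    // authority to parse as a server host name, and a last label that starts
    // with a digit (e.g. "test.123") is rejected.
    static boolean hostBasedConstructorRejects(String bucket, String path) {
        try {
            new URI("s3", bucket, path, null);
            return false;
        } catch (URISyntaxException e) {
            return true;
        }
    }

    // Assumed fix: URI(scheme, authority, path, query, fragment) accepts the
    // same text as a registry-based authority when it is not a valid host.
    static URI authorityBased(String bucket, String path) {
        try {
            return new URI("s3", bucket, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String bucket = "test.123";
        String path = "/abc//xyz.csv";
        System.out.println("host-based rejects: " + hostBasedConstructorRejects(bucket, path));
        URI uri = authorityBased(bucket, path);
        System.out.println("authority-based URI: " + uri);
        System.out.println("authority: " + uri.getAuthority());
        System.out.println("host: " + uri.getHost());
    }
}
```

One consequence worth noting: with the authority-based constructor, `getHost()` returns `null` for such a bucket, so callers would need to read the bucket from `getAuthority()` instead.
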
@findepi findepi merged commit b825b96 into master Nov 27, 2023
@findepi findepi deleted the findepi/handle-s3-paths-where-bucket-name-contains-dot-digit-f77f05 branch November 27, 2023 20:22
@github-actions github-actions bot added this to the 434 milestone Nov 27, 2023