Improve docs for support hadoop compatible file system when use HDFS … #24918

Merged
mosabua merged 1 commit into trinodb:master from hqbhoho:feature/improve_docs_for_exchange_hdfs
Feb 7, 2025

Contributor

@hqbhoho hqbhoho commented Feb 6, 2025

Description

Follow up to #24627

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

@cla-bot cla-bot bot added the cla-signed label Feb 6, 2025
@github-actions github-actions bot added the docs label Feb 6, 2025
@hqbhoho hqbhoho requested a review from mosabua February 6, 2025 03:32
Contributor Author

hqbhoho commented Feb 6, 2025

@losipiuk @mosabua Could you help review this? Thanks!

Member

@mosabua mosabua left a comment

We should clarify whether this is about Alluxio or some other partially Hadoop-compatible system. If we make Alluxio the explicit, concrete example, we should probably also link to where to get the JAR files and say what they are called.

Member

Suggested change
- Skip directory scheme validation to support hadoop compatible file system.
- Skip directory scheme validation to support Hadoop-compatible file system.

Arguably "partially Hadoop-compatible", right?

Member

Suggested change
You can enable `exchange.hdfs.skip-directory-scheme-validation` to support hadoop compatible file system. Please do the following steps:
You can enable `exchange.hdfs.skip-directory-scheme-validation` to support other Hadoop-compatible file systems:

Member

Suggested change
1. Configure AbstractFileSystem implementation in `core-site.xml`.
1. Configure the `AbstractFileSystem` implementation in `core-site.xml`.

Member

Suggested change
2. Put the relevant client jars into the directory `${Trino_HOME}/plugin/exchange-hdfs` on all Trino servers.
2. Add the relevant client JAR files into the directory `${Trino_HOME}/plugin/exchange-hdfs` on all Trino cluster nodes.

Member

Suggested change
as the spooling storage destination.
as the spooling storage location.

Member

Why is that not done via the Alluxio file system support instead of HDFS? We probably have to explain that.

Contributor Author

@hqbhoho hqbhoho Feb 6, 2025

Thank you for the feedback. Since the HDFS client supports accessing other Hadoop-compatible file systems, I believe adding this config gives users more options. By modifying `core-site.xml` and adding the relevant client JARs, users can freely choose a Hadoop-compatible file system. Different file systems require different configurations and client JARs. Here, Alluxio is only an example.
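
To illustrate the `core-site.xml` step for the Alluxio case, a minimal sketch might look like the following. The property name and class are assumptions taken from Alluxio's Hadoop client conventions, not from this PR, so they should be checked against the Alluxio client version in use. Other file systems would need their own scheme and class.

```xml
<!-- core-site.xml: hypothetical fragment for using Alluxio through the HDFS client -->
<configuration>
  <!-- Assumption: maps the alluxio:// scheme to Alluxio's AbstractFileSystem class.
       Verify the class name against the Alluxio client JAR you deploy. -->
  <property>
    <name>fs.AbstractFileSystem.alluxio.impl</name>
    <value>alluxio.hadoop.AlluxioFileSystem</value>
  </property>
</configuration>
```

With a file like this referenced from the exchange manager configuration and `exchange.hdfs.skip-directory-scheme-validation` enabled, a base directory with a scheme other than `hdfs://` should presumably be accepted, provided the matching client JAR is in the `exchange-hdfs` plugin directory on every node.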

Member

Fair enough. We should call that out in the written docs.

Contributor Author

I've made the suggested changes. Please let me know if there's anything else.

@hqbhoho hqbhoho force-pushed the feature/improve_docs_for_exchange_hdfs branch from 27bb725 to 46c04bd Compare February 6, 2025 12:02
@hqbhoho hqbhoho requested a review from mosabua February 6, 2025 12:10
Member

@mosabua mosabua left a comment

Looks good now.

@mosabua mosabua merged commit 1afa2b8 into trinodb:master Feb 7, 2025
@github-actions github-actions bot added this to the 471 milestone Feb 7, 2025