[SPARK-20070][SQL] Redact DataSourceScanExec treeString #17397
Conversation
cc @liufengdb
LGTM
Test build #75093 has finished for PR 17397 at commit
lgtm
/**
 * Shorthand for calling redactString() without specifying redacting rules
 * */
Nit: * */ -> */
Maybe we can verify the user input or capture the exception? If a user supplies an invalid value, they could get a confusing error message later.
Test build #75100 has finished for PR 17397 at commit
Thanks! @hvanhovell Could we also output the conf name?
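For illustration, the validation requested in this thread (reject an invalid regex early and name the configuration key in the error) could be sketched roughly as follows. The object and method names here are hypothetical and are not taken from the patch itself.

```scala
import java.util.regex.PatternSyntaxException
import scala.util.matching.Regex

object RegexConfSketch {
  // Hypothetical helper (not Spark's API): compile a regex-valued setting
  // and, on failure, report which configuration key held the bad value.
  def parseRegexConf(key: String, value: String): Regex =
    try value.r // compiles the pattern eagerly, so bad syntax fails here
    catch {
      case e: PatternSyntaxException =>
        throw new IllegalArgumentException(
          s"$key should be a regex, but was $value", e)
    }
}
```

Failing at the point where the setting is read keeps the error close to its cause, instead of surfacing later when the plan string is rendered.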
Test build #75166 has finished for PR 17397 at commit
LGTM pending Jenkins
Test build #75174 has finished for PR 17397 at commit
Thanks! Merging to master.
@hvanhovell This seems to have broken the 2.10 build:
I see, thanks!
@hvanhovell
Sorry, I did not catch it. Let me submit a follow-up to fix this test case issue.
Submitted the PR #17448. Please check whether the fix is appropriate.
What changes were proposed in this pull request?
The explain output of DataSourceScanExec can contain sensitive information (like Amazon keys). Such information should not end up in logs, or be exposed to non-privileged users.
This PR addresses this by adding a redaction facility for DataSourceScanExec.treeString. A user can enable this by setting a regex in the spark.redaction.string.regex configuration.
How was this patch tested?
Added a unit test to check the output of DataSourceScanExec.
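As a rough sketch of the idea described above, regex-based redaction of a plan string could look like the following. This is a minimal illustration under assumptions, not the code from the patch: the object name, the method signature, and the replacement text are all chosen here for the example.

```scala
import scala.util.matching.Regex

object RedactionSketch {
  // Illustrative placeholder text; the real replacement string is an assumption.
  val ReplacementText = "*********(redacted)"

  // Replace every match of the optional pattern in `text`.
  // With no pattern configured, the text is returned unchanged.
  def redact(pattern: Option[Regex], text: String): String = pattern match {
    case Some(r) if text != null && text.nonEmpty =>
      // quoteReplacement guards against '$' or '\' in the replacement text
      r.replaceAllIn(text, Regex.quoteReplacement(ReplacementText))
    case _ => text
  }
}
```

For example, with a pattern such as `(?i)secret=\S+`, a scan description containing `secret=abc` would have that fragment replaced by the placeholder, while leaving the setting unset keeps the output as it was.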