[SPARK-6305][BUILD] Migrate from log4j1 to log4j2 #34895
Conversation
Kubernetes integration test starting
Kubernetes integration test status failure
laychengtan1401
left a comment
log4j 2.12 is still exposed to the vulnerability unless it is changed to 2.15
Kubernetes integration test status failure
pom.xml (outdated)
<sbt.project.name>spark</sbt.project.name>
<slf4j.version>1.7.30</slf4j.version>
<log4j.version>1.2.17</log4j.version>
<log4j.version>2.12.1</log4j.version>
bump to 2.16.0 please
Kubernetes integration test starting
Test build #146178 has finished for PR 34895 at commit
Kubernetes integration test starting
Kubernetes integration test status failure
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #146346 has finished for PR 34895 at commit
dongjoon-hyun
left a comment
+1, LGTM. Merged to master for Apache Spark 3.3.
Thank you, @viirya, @dbtsai, @srowen, @laychengtan1401, @ChethanGB, @FranciscoBorges, @enigmatic00, and all.
Thank you @dongjoon-hyun and all. The log4j1 properties files are no longer needed now. I will open another PR to remove them.
Test build #146349 has finished for PR 34895 at commit
Awesome job @viirya! Looks like there is another CVE in 2.16: https://logging.apache.org/log4j/2.x/security.html. Should we upgrade to 2.17?
@sunchao yea, I will upgrade to 2.17.
public static class LogAppender extends AbstractAppender {

  protected LogAppender(String name,
      Filter filter,
indent
Indent seems fine here
ok
// If Log4j 1.2 is being used, but is not initialized, load a default properties file
if (Logging.isLog4j12()) {
  val log4j12Initialized = LogManager.getRootLogger.getAllAppenders.hasMoreElements
if (!Logging.isLog4j12()) {
The code guarded by if(!Logging.isLog4j12()) only works with Log4J2 and not other SLF4J bindings (such as Logback). I think it might be a bit clearer and slightly more correct to replace this with a positive if(Logging.isLog4j2()) check which checks that the Log4J2 binding is in use (e.g. that the binder class is org.apache.logging.slf4j.Log4jLoggerFactory).
Similarly, I think log4j12Initialized should be renamed to log4j2Initialized (since we're not using Log4J 1.2 in the branch where that variable is defined).
Agreed. I will fix them.
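For reference, a minimal sketch of the positive check suggested above, assuming it keys off the SLF4J binder's factory class (the enclosing object and method names here are placeholders, not Spark's actual Logging internals):

import org.slf4j.LoggerFactory

object LoggingBindingSketch {
  // Sketch only: treat "the log4j2 binding is in use" as "the SLF4J binder
  // factory is log4j2's Log4jLoggerFactory", per the suggestion above.
  def isLog4j2(): Boolean = {
    "org.apache.logging.slf4j.Log4jLoggerFactory" ==
      LoggerFactory.getILoggerFactory.getClass.getName
  }
}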
import org.apache.spark.SparkFunSuite
import org.apache.spark.util.Utils

class LoggingSuite extends SparkFunSuite {
Why did we need to delete this suite? I think this suite's removal means that we no longer have unit test coverage of SparkShellLoggingFilter?
ah, I think I removed it at the beginning of the attempt. I will add it back along with fixes for your other comments.
val fa = Log4jFileAppender.createAppender(localLogFile, "false", "false",
  DriverLogger.APPENDER_NAME, "true", "false", "false", "4000", layout, null,
  "false", null, config);
My IDE says that this createAppender method is deprecated. Since the old log4j 1.x code was only setting the filename and layout, could this instead be
val fa = Log4jFileAppender.newBuilder().withFileName(localLogFile).setLayout(layout).build()
or do we need to set some of those other arguments?
I think we can rewrite it using the builder.
val logger = LogManager.getRootLogger().asInstanceOf[Logger]
val fa = logger.getAppenders.get(DriverLogger.APPENDER_NAME)
logger.removeAppender(fa)
fa.stop()
Should we omit this fa.stop() since we have a Utils.tryLogNonFatalError(fa.stop()) on the following line?
okay.
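For clarity, a sketch of the agreed cleanup, assuming the surrounding DriverLogger context (DriverLogger.APPENDER_NAME and Utils.tryLogNonFatalError are Spark helpers referenced from the diff above, not defined here):

import org.apache.logging.log4j.LogManager
import org.apache.logging.log4j.core.Logger

// Sketch only: detach the driver-log appender and stop it exactly once,
// via the guarded helper, dropping the redundant bare fa.stop().
val logger = LogManager.getRootLogger().asInstanceOf[Logger]
val fa = logger.getAppenders.get(DriverLogger.APPENDER_NAME)
logger.removeAppender(fa)
Utils.tryLogNonFatalError(fa.stop())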
val rootLogger = org.apache.logging.log4j.LogManager.getRootLogger()
  .asInstanceOf[org.apache.logging.log4j.core.Logger]
rootLogger.setLevel(l)
rootLogger.get().setLevel(l)
Do we know if this is the best way to programmatically change the root logger's level? I noticed that https://stackoverflow.com/questions/23434252/programmatically-change-log-level-in-log4j2 mentions several techniques which differ from the one used here.
I ask because I'm not very familiar with Log4J2 and I'm unsure whether there are subtle pitfalls to avoid (e.g. will this work if we don't call updateLoggers()?).
If we wind up changing this code then we should also update similar code in SparkFunSuite.
Hmm, this is a good question. I feel the same about the log4j2 API. This is the way I finally made it work. I will look into whether there is a more proper way to do it.
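For what it's worth, a small sketch of two commonly documented ways to change the root level programmatically with log4j2 (whether either is preferable to the approach in the diff above is exactly the open question here):

import org.apache.logging.log4j.{Level, LogManager}
import org.apache.logging.log4j.core.LoggerContext
import org.apache.logging.log4j.core.config.Configurator

// Option 1: Configurator updates the configuration and refreshes loggers in one call.
Configurator.setRootLevel(Level.WARN)

// Option 2: go through the LoggerContext explicitly; updateLoggers() is the
// step that propagates the config change to live loggers.
val ctx = LogManager.getContext(false).asInstanceOf[LoggerContext]
ctx.getConfiguration.getRootLogger.setLevel(Level.WARN)
ctx.updateLoggers()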
  logger.setLevel(level.get)
  logger.get().setLevel(level.get)
}
case _ =>
Since loggers is a Seq[Logger], is this branch ever reachable? Unless I'm mistaken, I think we'll always hit the case logger: Logger branch.
while (logger.getParent() != null) {
  if (logger.getLevel != null || logger.getAllAppenders.hasMoreElements) {
    return Filter.NEUTRAL
private class SparkShellLoggingFilter extends Filter {
I think we could potentially simplify this class by having it extend AbstractFilter instead of Filter:
From AbstractFilter's Javadoc:
/**
* Users should extend this class to implement filters. Filters can be either context wide or attached to
* an appender. A filter may choose to support being called only from the context or only from an appender in
* which case it will only implement the required method(s). The rest will default to return {@link org.apache.logging.log4j.core.Filter.Result#NEUTRAL}.
* <p>
* Garbage-free note: the methods with unrolled varargs by default delegate to the
* {@link #filter(Logger, Level, Marker, String, Object...) filter method with vararg parameters}.
* Subclasses that want to be garbage-free should override these methods to implement the appropriate filtering
* without creating a vararg array.
* </p>
*/
public abstract class AbstractFilter extends AbstractLifeCycle implements Filter {
Okay. I will try to rewrite this with AbstractFilter. I think the functionality should be the same but yea it looks better to use AbstractFilter here.
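To make the shape concrete, here is a hypothetical filter built on AbstractFilter (the condition is illustrative only, not SparkShellLoggingFilter's real logic; only the LogEvent overload is overridden since the other variants default to NEUTRAL):

import org.apache.logging.log4j.Level
import org.apache.logging.log4j.core.{Filter, LogEvent}
import org.apache.logging.log4j.core.filter.AbstractFilter

// Illustrative only: accept WARN and above, stay NEUTRAL otherwise so that
// logger levels and other filters still apply.
class SparkShellLoggingFilterSketch extends AbstractFilter {
  override def filter(event: LogEvent): Filter.Result = {
    if (event.getLevel.isMoreSpecificThan(Level.WARN)) {
      Filter.Result.ACCEPT
    } else {
      Filter.Result.NEUTRAL
    }
  }
}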
<!--
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-api</artifactId>
Should we remove this commented-out block since it is no longer being used?
Yeah, they were removed in a later follow-up.
### What changes were proposed in this pull request?
Change log level of tests to WARN
### Why are the changes needed?
After #34895, the log level of tests becomes either "INFO" or "DEBUG", which increases the log output file to over 100MB. E.g. https://github.com/gengliangwang/spark/runs/4623214509?check_suite_focus=true This makes it difficult to find the CI test failures. Changing the log level of tests to WARN should help.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
It's just a conf change.
Closes #35011 from gengliangwang/loglevel.
Authored-by: Gengliang Wang <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
Pretty bold timing. I would have expected private builds to simply ship with a patched version of log4j 1.2.17, as we do: https://mvnrepository.com/artifact/log4j/log4j/1.2.17-cloudera1. Given all projects using log4j 2.x have had to do three emergency releases in the space of a week, you are potentially adding more work in the immediate future. In Hadoop we are now looking at Logback. Important: have you made sure any/all transitive dependencies on log4j 1.x have been excluded, including its SLF4J binding library? There may be some transitive dependencies waiting to cause problems.
### _Why are the changes needed?_
This patch proposes to migrate from log4j1 to log4j2 in Kyuubi. Refer to the Spark patch: apache/spark#34895
### Does this PR introduce any user-facing change?
Yes. Users may need to rewrite their log4j properties file for log4j2. As of version 2.4, log4j now supports configuration via properties files. Note that the property syntax is not the same as the syntax used in log4j 1, but during the migration I found the syntax is pretty close so the migration should be straightforward.
### _How was this patch tested?_
Passed all existing tests.
Closes #1769 from turboFei/log4j2. Closes #1769
8613779 [Fei Wang] remove log4j dependencies from spark-sql-engine
b2fe6db [Fei Wang] Use String to present default log level
762e2d0 [Fei Wang] only remove org.apache.logging.log4j:log4j-slf4j-impl
8a91208 [Fei Wang] remove dependencies from spark-sql-engine
7e3a498 [Fei Wang] address commments
051f49f [Fei Wang] address comments
85316a0 [Fei Wang] Keep compatible with log4j12
01d1a84 [Fei Wang] for log4j1
b9e17e1 [Fei Wang] refactor
e24391e [Fei Wang] revert log count
3880300 [Fei Wang] add log4j2.properties.template
4f0b22f [Fei Wang] save
7ce8411 [Fei Wang] modify log level
1ea5ca5 [Fei Wang] add log4j to engine
c4a86d4 [Fei Wang] use AbstractFilter
27b08b6 [Fei Wang] remove more
8cc15ae [Fei Wang] reformat
c13ec29 [Fei Wang] save temporarily
33a38e2 [Fei Wang] exclude log4j12 from spark-sql
9129a64 [Fei Wang] refactor
5362b43 [Fei Wang] make it run at first
7f27f51 [Fei Wang] more
56f4f1f [Fei Wang] fix logging
a74b6d3 [Fei Wang] start appender
dea964a [Fei Wang] fix build erorr at first
e20b750 [Fei Wang] address comments
2ec02b4 [Fei Wang] fix LogDivertAppender
dded129 [Fei Wang] more
c63e000 [Fei Wang] add log4j2.properties
Authored-by: Fei Wang <[email protected]>
Signed-off-by: Fei Wang <[email protected]>
### What changes were proposed in this pull request?
This patch proposes to migrate from log4j1 to log4j2 in Spark.
### Why are the changes needed?
Apache Spark uses log4j 1.x and hence is not susceptible to the recent log4j v2 vulnerability. However, log4j 1.x has reached end of life and is no longer supported. Vulnerabilities reported after August 2015 against log4j 1.x were not checked and will not be fixed. As a result, it is recommended to upgrade Spark to Log4j 2 as soon as possible.
### Does this PR introduce _any_ user-facing change?
Yes. Users may need to rewrite their log4j properties file for log4j2. As of version 2.4, log4j now supports configuration via properties files. Note that the property syntax is not the same as the syntax used in log4j 1, but during the migration I found the syntax is pretty close so the migration should be straightforward.
### How was this patch tested?
Passed all existing tests.
Closes apache#34895 from viirya/log4j2.
Authored-by: Liang-Chi Hsieh <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit ca5cabe)
appender.console.type = Console
appender.console.name = STDOUT
appender.console.target = SYSTEM_OUT
Random thing I just started noticing: the default logging now goes to stdout instead of stderr. Was that intentional? The log4j2.properties.template uses stderr too.
# Set everything to be logged to the console
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

rootLogger.level = info
rootLogger.appenderRef.stdout.ref = STDOUT
appender.console.type = Console
appender.console.name = STDOUT
appender.console.target = SYSTEM_OUT
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n%ex
These two look different, WDYT? @viirya
It might be from when I copied across the properties file and missed changing it.
I roughly scanned and it seems only this place changed from stderr to stdout; #37854 tries to fix this.
What changes were proposed in this pull request?
This patch proposes to migrate from log4j1 to log4j2 in Spark.
Why are the changes needed?
Apache Spark uses log4j 1.x and hence is not susceptible to the recent log4j v2 vulnerability. However, log4j 1.x has reached end of life and is no longer supported. Vulnerabilities reported after August 2015 against log4j 1.x were not checked and will not be fixed. As a result, it is recommended to upgrade Spark to Log4j 2 as soon as possible.
Does this PR introduce any user-facing change?
Yes. Users may need to rewrite their log4j properties file for log4j2. As of version 2.4, log4j now supports configuration via properties files. Note that the property syntax is not the same as the syntax used in log4j 1, but during the migration I found the syntax is pretty close so the migration should be straightforward.
How was this patch tested?
Passed all existing tests.