Skip to content

HDFS-16638. Add isDebugEnabled check for debug blockLogs in BlockManager#4480

Merged
ayushtkn merged 1 commit intoapache:trunkfrom
cxzl25:HDFS-16638
Jul 3, 2022
Merged

HDFS-16638. Add isDebugEnabled check for debug blockLogs in BlockManager#4480
ayushtkn merged 1 commit intoapache:trunkfrom
cxzl25:HDFS-16638

Conversation

@cxzl25
Copy link
Contributor

@cxzl25 cxzl25 commented Jun 21, 2022

Description of PR

There are lots of concatenating Strings using blockLog.debug in BlockManager.

How was this patch tested?

exist UT

For code changes:

  • Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@cxzl25 cxzl25 changed the title HDFS-16637. Add isDebugEnabled check for debug blockLogs in BlockManager HDFS-16638 Add isDebugEnabled check for debug blockLogs in BlockManager Jun 21, 2022
@cxzl25 cxzl25 changed the title HDFS-16638 Add isDebugEnabled check for debug blockLogs in BlockManager HDFS-16638. Add isDebugEnabled check for debug blockLogs in BlockManager Jun 21, 2022
@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 0s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 40m 59s trunk passed
+1 💚 compile 1m 48s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 compile 1m 38s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 21s trunk passed
+1 💚 mvnsite 1m 45s trunk passed
+1 💚 javadoc 1m 18s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 1m 48s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 4m 3s trunk passed
+1 💚 shadedclient 26m 45s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 25s the patch passed
+1 💚 compile 1m 32s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javac 1m 32s the patch passed
+1 💚 compile 1m 25s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 2s the patch passed
+1 💚 mvnsite 1m 34s the patch passed
+1 💚 javadoc 1m 1s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 1m 39s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 45s the patch passed
+1 💚 shadedclient 26m 30s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 343m 52s hadoop-hdfs in the patch passed.
+1 💚 asflicense 1m 0s The patch does not generate ASF License warnings.
464m 21s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4480/1/artifact/out/Dockerfile
GITHUB PR #4480
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 0adfd20b1193 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / d2d3b01
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4480/1/testReport/
Max. process+thread count 1837 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4480/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Copy link
Member

@ayushtkn ayushtkn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ZanderXu
Copy link
Contributor

Thanks @cxzl25 for your patch. I have a question about it and looking forward your feedback.
Can this change improve the namenode performance? I see that some places specially remove this judgment, and use log.info("{}", XXX) replace it.

Or can you do some performance test for it? use this judgment or not.

@cxzl25
Copy link
Contributor Author

cxzl25 commented Jun 23, 2022

Thanks @cxzl25 for your patch. I have a question about it and looking forward your feedback. Can this change improve the namenode performance? I see that some places specially remove this judgment, and use log.info("{}", XXX) replace it.

Or can you do some performance test for it? use this judgment or not.

We have some JIRAs that use the isDebugEnabled judgment.
At the beginning, I found that the log of Removing stale replica has meaningless string splicing, and using NameNode.blockStateChangeLog can be replaced by blockLog, because this debug-level log is called a lot every day.
So I fixed the other problems by the way.

Copy link
Contributor

@ZanderXu ZanderXu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cxzl25 for your comment. LGTM

@virajjasani
Copy link
Contributor

I am not sure if having isLogEnabled() would help with much perf improvement so long as we use arguments {} to let the logger library take care of the final String concat.

Looking at slf4j-api and log4j-slf4j-impl, Log4jLogger has this implementation:

    @Override
    public void debug(final String format, final Object... args) {
        logger.logIfEnabled(FQCN, Level.DEBUG, null, format, args);
    }

which would call log4j-api implementation of AbstractLogger:

    @Override
    public void logIfEnabled(final String fqcn, final Level level, final Marker marker, final String message,
            final Object... params) {
        if (isEnabled(level, marker, message, params)) {
            logMessage(fqcn, level, marker, message, params);
        }
    }

Hence, as long as we use parameterized args {}, the final log concat operation happens only if the logger level (DEBUG in our case) is enabled.
The best practice to use isDebugEnabled or isTraceEnabled is when we don't use parameterized args {} and instead use full string concat all by ourselves, in which case string cancat with big values happen regardless of whether log level is enabled, but we are already using params for these loggers, hence I don't think we would gain much here.
Thoughts?

@ZanderXu
Copy link
Contributor

Thanks @virajjasani.

blockLog.debug("BLOCK* invalidateBlocks: postponing " +
            "invalidation of {} on {} because {} replica(s) are located on " +
            "nodes with potentially out-of-date block reports", b, dn,
            nr.replicasOnStaleNodes());

Like this debug log, in order to satisfy the checkStyle, we must split the log into multiple lines. So even we use parameterized args {}, there is still string concatenation.

Like this debug log, do you have some good ideas?

@virajjasani
Copy link
Contributor

virajjasani commented Jun 27, 2022

Thanks @ZanderXu, IMHO breaking the small strings like the one mentioned above (to satisfy checkstyle) is not an overhead for JVM. The overhead comes when big objects (or let's say variable size of collections) are added in the log lines.
For instance, when b, dn and nr.replicasOnStaleNodes() are concatenated with + in the log line, it could have been bit of an overhead. But constant string concatenation is pretty simple given that String objects are allocated from heap memory.

@ZanderXu
Copy link
Contributor

ZanderXu commented Jun 27, 2022

But constant string concatenation is pretty simple given that String objects are allocated from heap memory.

Yes, your are right.

blockLog.debug("BLOCK* invalidateBlocks: postponing " +
            "invalidation of {} on {} because {} replica(s) are located on " +
            "nodes with potentially out-of-date block reports", b, dn,
            nr.replicasOnStaleNodes());
if(blockLog.isDebugEnabled()) {
        blockLog.debug("BLOCK* invalidateBlocks: postponing " +
            "invalidation of {} on {} because {} replica(s) are located on " +
            "nodes with potentially out-of-date block reports", b, dn,
            nr.replicasOnStaleNodes());
      }

So maybe we should have a performance test to confirm that which of the above two modes has better performance.
Do you think so? Or if you have a clear result, please let me know. Thanks @virajjasani

@ZanderXu
Copy link
Contributor

ZanderXu commented Jun 27, 2022

@cxzl25 @virajjasani @ayushtkn Share the results of my stress test here.

Test code:

    org.slf4j.Logger LOG = LoggerFactory.getLogger("Test");
    Block b = new Block();
    DatanodeInfo dn = new DatanodeInfo(EMPTY_DATANODE_ID);
    List<DatanodeInfo> staleNodes = new ArrayList<>();
    staleNodes.add(dn);
    long beginTime1 = Time.monotonicNowNanos();
    for (int i = 0; i < 100000000; i++) {
      LOG.debug("BLOCK* invalidateBlocks: postponing " +
              "invalidation of {} on {} because {} replica(s) are located on " +
              "nodes with potentially out-of-date block reports",
          b, dn, staleNodes);
    }
    long endTime1 = Time.monotonicNowNanos();
    LOG.info("Time1: {}", endTime1 - beginTime1);

    long beginTime2 = Time.monotonicNowNanos();
    for (int i = 0; i < 100000000; i++) {
      if (LOG.isDebugEnabled()) {
        LOG.debug("BLOCK* invalidateBlocks: postponing " +
                "invalidation of {} on {} because {} replica(s) are located on " +
                "nodes with potentially out-of-date block reports",
            b, dn, staleNodes);
      }
    }
    long endTime2 = Time.monotonicNowNanos();
    LOG.info("Time2: {}", endTime2 - beginTime2);

    long beginTime3 = Time.monotonicNowNanos();
    for (int i = 0; i < 100000000; i++) {
      LOG.debug("BLOCK* invalidateBlocks: postponing invalidation of {}", b);
    }
    long endTime3 = Time.monotonicNowNanos();
    LOG.info("Time3: {}", endTime3 - beginTime3);

And the result is:

- Time1(ns) Time2(ns) Time3(ns)
First Run 353681417 287066916 270287253
Second Run 356797081 308013207 266873810
Third Run 366995010 287111940 259907404
Avg 359157836 294064021 265689489

So this PR is help with much performance improvement.

@cxzl25
Copy link
Contributor Author

cxzl25 commented Jun 27, 2022

Thank you for your help with the stress test. @ZanderXu
It is helpful to improve performance for string concat where the toString cost of some objects is relatively high.

@ayushtkn
Copy link
Member

ayushtkn commented Jul 3, 2022

Well I didn't expect it to be that better as the above result shows, but I don't think concat will be having a Zero overhead anyway, if the code path is critical and something called at very high frequency, even small improvements can help.
I am ok with this change, but if others feels it isn't good enough to get it in or would have negligible impact. I am fine avoiding it as well, not something that critical....

@virajjasani
Copy link
Contributor

I think it's fine moving forward because in the presence of this patch, the worst case scenario would be calculating if DEBUG level is enabled twice, which is nothing compared to even static String concat. Hence, no objections from my side. Thanks @cxzl25 for the work!

@ayushtkn
Copy link
Member

ayushtkn commented Jul 3, 2022

Cool, So we are good here. @cxzl25 Can you assign the jira as well on your name, or if you can't do due to some permissions, drop a comment on the jira or let me know your jira id.

Will commit post that,

@cxzl25
Copy link
Contributor Author

cxzl25 commented Jul 3, 2022

Thank you all for your review and help! @ayushtkn @ZanderXu @virajjasani

let me know your jira id.

My jira id is dzcxzl
I don't have jira related permissions.

@ZanderXu
Copy link
Contributor

ZanderXu commented Jul 3, 2022

Thanks @cxzl25 @virajjasani @ayushtkn . I will create a new PR to correct any debug logs in HDFS if needed.

@ayushtkn ayushtkn merged commit ea46f49 into apache:trunk Jul 3, 2022
HarshitGupta11 pushed a commit to HarshitGupta11/hadoop that referenced this pull request Nov 28, 2022
…ger. (apache#4480). Contributed by  dzcxzl.

Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants