Conversation

@thangTang (Contributor)

No description provided.

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|:---------:|:-------:|:--------|
| +0 🆗 | reexec | 1m 14s | Docker mode activated. |
| _ Prechecks _ ||||
| +1 💚 | dupname | 0m 0s | No case conflicting files found. |
| +1 💚 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 💚 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| _ branch-1 Compile Tests _ ||||
| +1 💚 | mvninstall | 4m 17s | branch-1 passed |
| +1 💚 | compile | 0m 21s | branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19 |
| +1 💚 | compile | 0m 22s | branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10 |
| +1 💚 | checkstyle | 0m 27s | branch-1 passed |
| +1 💚 | shadedjars | 2m 52s | branch has no errors when building our shaded downstream artifacts. |
| +1 💚 | javadoc | 0m 23s | branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19 |
| +1 💚 | javadoc | 0m 22s | branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10 |
| +0 🆗 | spotbugs | 1m 7s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 💚 | findbugs | 1m 4s | branch-1 passed |
| _ Patch Compile Tests _ ||||
| +1 💚 | mvninstall | 2m 8s | the patch passed |
| +1 💚 | compile | 0m 21s | the patch passed with JDK Azul Systems, Inc.-1.8.0_262-b19 |
| +1 💚 | javac | 0m 21s | the patch passed |
| +1 💚 | compile | 0m 24s | the patch passed with JDK Azul Systems, Inc.-1.7.0_272-b10 |
| +1 💚 | javac | 0m 24s | the patch passed |
| +1 💚 | checkstyle | 0m 27s | the patch passed |
| +1 💚 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 💚 | shadedjars | 2m 48s | patch has no errors when building our shaded downstream artifacts. |
| +1 💚 | hadoopcheck | 4m 53s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2. |
| +1 💚 | javadoc | 0m 21s | the patch passed with JDK Azul Systems, Inc.-1.8.0_262-b19 |
| +1 💚 | javadoc | 0m 22s | the patch passed with JDK Azul Systems, Inc.-1.7.0_272-b10 |
| +1 💚 | findbugs | 1m 14s | the patch passed |
| _ Other Tests _ ||||
| +1 💚 | unit | 2m 30s | hbase-common in the patch passed. |
| +1 💚 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 31m 31s | |

| Subsystem | Report/Notes |
|:----------|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4233/1/artifact/out/Dockerfile |
| GITHUB PR | #4233 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux f89a2a8fc107 5.4.0-90-generic #101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-4233/out/precommit/personality/provided.sh |
| git revision | branch-1 / 70e695b |
| Default Java | Azul Systems, Inc.-1.7.0_272-b10 |
| Multi-JDK versions | /usr/lib/jvm/zulu-8-amd64:Azul Systems, Inc.-1.8.0_262-b19 /usr/lib/jvm/zulu-7-amd64:Azul Systems, Inc.-1.7.0_272-b10 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4233/1/testReport/ |
| Max. process+thread count | 143 (vs. ulimit of 10000) |
| modules | C: hbase-common U: hbase-common |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4233/1/console |
| versions | git=2.17.1 maven=3.6.0 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|:---------:|:-------:|:--------|
| +0 🆗 | reexec | 1m 4s | Docker mode activated. |
| _ Prechecks _ ||||
| +1 💚 | dupname | 0m 0s | No case conflicting files found. |
| +1 💚 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 💚 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| _ branch-1 Compile Tests _ ||||
| +1 💚 | mvninstall | 2m 53s | branch-1 passed |
| +1 💚 | compile | 0m 13s | branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19 |
| +1 💚 | compile | 0m 17s | branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10 |
| +1 💚 | checkstyle | 0m 19s | branch-1 passed |
| +1 💚 | shadedjars | 1m 45s | branch has no errors when building our shaded downstream artifacts. |
| +1 💚 | javadoc | 0m 16s | branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19 |
| +1 💚 | javadoc | 0m 16s | branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10 |
| +0 🆗 | spotbugs | 0m 44s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 💚 | findbugs | 0m 42s | branch-1 passed |
| _ Patch Compile Tests _ ||||
| +1 💚 | mvninstall | 1m 14s | the patch passed |
| +1 💚 | compile | 0m 13s | the patch passed with JDK Azul Systems, Inc.-1.8.0_262-b19 |
| +1 💚 | javac | 0m 13s | the patch passed |
| +1 💚 | compile | 0m 16s | the patch passed with JDK Azul Systems, Inc.-1.7.0_272-b10 |
| +1 💚 | javac | 0m 16s | the patch passed |
| +1 💚 | checkstyle | 0m 18s | the patch passed |
| +1 💚 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 💚 | shadedjars | 1m 43s | patch has no errors when building our shaded downstream artifacts. |
| +1 💚 | hadoopcheck | 2m 55s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2. |
| +1 💚 | javadoc | 0m 13s | the patch passed with JDK Azul Systems, Inc.-1.8.0_262-b19 |
| +1 💚 | javadoc | 0m 16s | the patch passed with JDK Azul Systems, Inc.-1.7.0_272-b10 |
| +1 💚 | findbugs | 0m 47s | the patch passed |
| _ Other Tests _ ||||
| +1 💚 | unit | 1m 53s | hbase-common in the patch passed. |
| +1 💚 | asflicense | 0m 12s | The patch does not generate ASF License warnings. |
| | | 20m 11s | |

| Subsystem | Report/Notes |
|:----------|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4233/2/artifact/out/Dockerfile |
| GITHUB PR | #4233 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 7c6178f9c255 5.4.0-90-generic #101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-4233/out/precommit/personality/provided.sh |
| git revision | branch-1 / aa9cba3 |
| Default Java | Azul Systems, Inc.-1.7.0_272-b10 |
| Multi-JDK versions | /usr/lib/jvm/zulu-8-amd64:Azul Systems, Inc.-1.8.0_262-b19 /usr/lib/jvm/zulu-7-amd64:Azul Systems, Inc.-1.7.0_272-b10 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4233/2/testReport/ |
| Max. process+thread count | 161 (vs. ulimit of 10000) |
| modules | C: hbase-common U: hbase-common |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4233/2/console |
| versions | git=2.17.1 maven=3.6.0 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.

if (nodeToIndex.containsKey(node)) {
short index = nodeToIndex.get(node);
node = indexToNode[index];
moveToHead(node);
Contributor

According to the discussion in the email, if the previous node is always reused here, then how can it be ensured that the previous node is a completed one?

@thangTang (Contributor Author), Mar 18, 2022

Sorry, I didn't understand what "completed one" means.
But this patch does not actually change the logic of how this LRUCache was previously used:
RingBufferEventHandler#onEvent -> RingBufferEventHandler#append -> ProtobufLogWriter#append -> CompressedKvEncoder#write -> LRUDictionary#findEntry
On this actually-used write path, findEntry reuses the previously existing node.
We did see an NPE on the read path, but the root cause is not the implementation of this LRUCache (this patch); it is that the LRUCache gets polluted.
I just unified the logic of addEntry with the actual logic on the write path, which I think is more elegant.

Contributor

The call trace is WALEntryStream#tryAdvanceEntry -> ProtobufLogReader#readNext -> CompressedKvDecoder#readIntoArray -> LRUDictionary#addEntry -> and then here, the changed BidirectionalLRUMap#put.
Your change only means that a new identical node will not be added to the dictionary, but the old identical node may hold incomplete data, e.g. when tailing the WAL.

Contributor Author

Oh, I got your point. As we discussed in the email, to solve the problem you mentioned we need to rebuild the LRUCache every time we re-seek to somewhere (this will be done in HBASE-26849), and in the future we could try to implement a "versioned" cache for replication.
This patch is just a code optimization, not a fix for that problem. So it's an "Improvement", not a "Bug".
Does this answer your question?

Contributor

Since the new value may be the completed one, this improvement cannot guarantee that using the old value is always better than using the new value, aside from the performance gain.
I think an umbrella issue should be created to track the problem mentioned in the email, and this issue can be a child of it. That way, before the umbrella issue is completed, all the child changes can be tested together.
Thanks.

Contributor Author

From a stability or performance standpoint, I don't think this is a good/bad or right/wrong question, since it doesn't change the existing logic.
But from a code-architecture point of view, I think this way is better. The original implementation puts the "find the existing node and return it" logic in findEntry while exposing addEntry directly to the outside, which allows the two to behave inconsistently. So I think we can fully encapsulate the same logic in addEntry (although this brings no stability improvement for now).
But if you would like to wait for HBASE-26849 to finish and review them together, I think that's OK~

@apurtell (Contributor), Apr 17, 2022

I am not sure I follow the discussion because in the proposed improvement the old node is not reused if the contents being stored are different.

  Node node = new Node();
  node.setContents(stored, 0, stored.length);
  if (nodeToIndex.containsKey(node)) {
     // new logic reusing existing entry and index
     // ...
  } else {
     // original logic adding new entry
     // ...
  }      

containsKey will use the hashCode of Node, which is Bytes.hashCode over the contents. A previous short read and a current full read will have different contents, and therefore different hash codes, right? If so, this just reuses an entry that has equivalent data, which I agree is an improvement.
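To make the hash-identity point above concrete, here is a minimal, hypothetical sketch (not the real HBase classes): a stripped-down Node whose hashCode/equals are computed over its stored bytes, standing in for Bytes.hashCode. It shows that a short read and a full read of the same key do not collide, while identical contents do find the existing entry.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplification: the real LRUDictionary.Node uses
// Bytes.hashCode/equals; java.util.Arrays stands in for it here.
public class NodeIdentityDemo {
  static final class Node {
    byte[] contents;
    void setContents(byte[] array, int offset, int length) {
      contents = Arrays.copyOfRange(array, offset, offset + length);
    }
    @Override public int hashCode() { return Arrays.hashCode(contents); }
    @Override public boolean equals(Object o) {
      return o instanceof Node && Arrays.equals(contents, ((Node) o).contents);
    }
  }

  public static void main(String[] args) {
    byte[] full = "row-key-0001".getBytes();

    // A previous short read stored only a prefix of the same bytes.
    Node shortRead = new Node();
    shortRead.setContents(full, 0, full.length - 4);

    Map<Node, Short> nodeToIndex = new HashMap<>();
    nodeToIndex.put(shortRead, (short) 0);

    // Different contents -> different hashCode/equals -> no reuse.
    Node fullRead = new Node();
    fullRead.setContents(full, 0, full.length);
    System.out.println(nodeToIndex.containsKey(fullRead));

    // Identical contents -> the existing entry is found and can be reused.
    Node sameRead = new Node();
    sameRead.setContents(full, 0, full.length - 4);
    System.out.println(nodeToIndex.containsKey(sameRead));
  }
}
```

So, under these assumptions, put only ever reuses an entry whose bytes are exactly equal, never a stale short read of the same key.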

Contributor Author

> the old node is not reused if the contents being stored are different.

Completely correct.
This patch only reuses the SAME node.
Actually, in the previous implementation, if the nodes were the same, the existing node would also be reused; the only difference is that this logic lived in findIdx:

private short findIdx(byte[] array, int offset, int length) {
  Short s;
  final Node comparisonNode = new Node();
  comparisonNode.setContents(array, offset, length);
  if ((s = nodeToIndex.get(comparisonNode)) != null) {
    moveToHead(indexToNode[s]);
    return s;
  } else {
    return -1;
  }
}

For the write path:

CompressedKvEncoder#write -> LRUDictionary#findEntry (LRUDictionary#findIdx) -> LRUDictionary#addEntry

But for the read path:

CompressedKvDecoder#readIntoArray -> LRUDictionary#addEntry

We can see that on the read path it just calls addEntry directly, without findIdx (i.e. without reusing an existing identical node).
So I just thought it would be cleaner to write it this way.
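For illustration, the unification discussed in this thread could be sketched as follows. This is a hypothetical, simplified toy, not the actual BidirectionalLRUMap: it keeps only the nodeToIndex map (no LRU list, no moveToHead, no eviction) and shows put() going through findIdx-style reuse before falling back to adding a new node.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Toy sketch of unifying the read path's put() with the write path's
// findIdx() reuse. Names mirror the snippets quoted above; everything
// else (field layout, eviction, LRU ordering) is omitted for brevity.
public class UnifiedPutDemo {
  static final class Node {
    byte[] contents;
    void setContents(byte[] a, int off, int len) {
      contents = Arrays.copyOfRange(a, off, off + len);
    }
    @Override public int hashCode() { return Arrays.hashCode(contents); }
    @Override public boolean equals(Object o) {
      return o instanceof Node && Arrays.equals(contents, ((Node) o).contents);
    }
  }

  private final Map<Node, Short> nodeToIndex = new HashMap<>();
  private short nextIndex = 0;

  // Write-path lookup: index of an identical existing node, or -1.
  // (The real findIdx also calls moveToHead on a hit.)
  short findIdx(byte[] array, int offset, int length) {
    Node comparisonNode = new Node();
    comparisonNode.setContents(array, offset, length);
    Short s = nodeToIndex.get(comparisonNode);
    if (s != null) {
      return s;
    }
    return -1;
  }

  // Unified put: reuse an identical node instead of always adding a new one.
  short put(byte[] array, int offset, int length) {
    short existing = findIdx(array, offset, length);
    if (existing != -1) {
      return existing;
    }
    Node node = new Node();
    node.setContents(array, offset, length);
    nodeToIndex.put(node, nextIndex);
    return nextIndex++;
  }

  public static void main(String[] args) {
    UnifiedPutDemo map = new UnifiedPutDemo();
    byte[] cf = "info".getBytes();
    short first = map.put(cf, 0, cf.length);
    short second = map.put(cf, 0, cf.length); // identical bytes: reused
    System.out.println(first == second);
    System.out.println(map.nodeToIndex.size());
  }
}
```

With this shape, the read path calling put() gets the same reuse behavior the write path already had via findEntry/findIdx, which is the symmetry the patch argues for.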
