Add Checksum CRC32 to Exchange SerializedPage by jbroll · Pull Request #15686 · prestodb/presto

jbroll · 2021-02-05T16:43:47Z

Added Feature "exchange.checksum-enabled"
Maps to System property "exchange_checksum"

Uses CRC32 to compute the checksum.

The checksum is read and written only when the feature is enabled. A flag in PageCodecMarker indicates whether a page carries a checksum.

Refactored TestExchangeClient, adding tests for the checksum-ok and checksum-fail cases.
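
For readers following along, here is a minimal sketch of the idea described above: a CRC32 over the page bytes, guarded by a marker bit. It is not the code from this PR. The class name, the `CHECKSUMMED_BIT` value, and the method names are hypothetical; only `java.util.zip.CRC32` is a real API.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Illustrative only: the real PageCodecMarker layout is not shown in this thread.
final class PageChecksumSketch {
    // Hypothetical marker bit meaning "this page carries a checksum".
    static final byte CHECKSUMMED_BIT = 0x04;

    // CRC32 over the serialized page bytes.
    static long computeChecksum(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    static boolean isChecksummed(byte markers) {
        return (markers & CHECKSUMMED_BIT) != 0;
    }

    // Pages without the marker bit are accepted without any check,
    // which mirrors "only read / written when enabled".
    static boolean verify(byte markers, byte[] payload, long storedChecksum) {
        if (!isChecksummed(markers)) {
            return true;
        }
        return computeChecksum(payload) == storedChecksum;
    }

    public static void main(String[] args) {
        byte[] payload = "123456789".getBytes(StandardCharsets.US_ASCII);
        long checksum = computeChecksum(payload);
        System.out.println(verify(CHECKSUMMED_BIT, payload, checksum));
        payload[0] ^= 0x01; // flip one bit to simulate corruption
        System.out.println(verify(CHECKSUMMED_BIT, payload, checksum));
    }
}
```

The reader decides from the marker bit alone whether a checksum follows, so pages written with the feature disabled stay readable.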

== RELEASE NOTES ==
* Added Feature "exchange.checksum-enabled"
* Maps to System property "exchange_checksum"

Enables CRC32 checking for SerializedPage on the Exchange data path.

aweisberg · 2021-02-10T16:48:22Z

Is the page the right place to be calculating checksums? Should we be doing it at the transport layer so the entire message is checksummed? I know we are using HTTP and streaming, so it's a bit more complex than that. I haven't done checksumming for HTTP.

jbroll · 2021-02-10T17:01:22Z

This is what we decided to do. Checksumming the HTTP body is more or less the same thing as this, but looks harder to implement.

aweisberg · 2021-02-10T18:24:13Z

I get that it is easier, but there is corruption this won't detect, corruption that can lead to incorrect query results. Even if the page is correct, the other data in a message, which is necessary to deliver the page to the correct place and process it correctly, is still unprotected.

We don't want to calculate the checksum twice, so having it on pages is really only valuable for the case where we want to persist just a page (such as to local disk).

There are also message flows that aren't just exchanging pages.

jbroll · 2021-02-10T18:55:19Z

I don't have any arguments to make regarding where the 'right' place to checksum is.

aweisberg · 2021-02-10T20:08:22Z

This is not the hill I want to die on either, but if we are going to do an 80% solution, let's consciously decide that we're going with that.

To give an example of other important message flows: file paths and statistics can flow between the coordinator and workers. Corruption there could be silent, with the wrong paths recorded in metadata or statistics containing corrupt values such as the highest and lowest value.

Checksums on pages cover 99% of the data by volume, so you are likely to catch most bitflips, but that 1% contains things that, when corrupted, can affect larger amounts of data. Page-level checksums are very high value and a good quick win though. They will provide a strong signal about the reliability of our hosts and network.

I just don't want to lose sight of the other 20%. I think we will save more time on the other end with explicit checksum failures than we gain by saving a little time now.

tdcmeehan · 2021-02-10T20:31:55Z

@aweisberg: going to the point @cemcayiroglu raised, data integrity should be guaranteed end to end for all communication flows once internal communication uses TLS. How do you feel about TLS being our 100% solution, with data integrity checks for exchange and scans being our 99% solution?

aweisberg · 2021-02-10T20:38:21Z

Ah, I completely forgot about our move to TLS. Yes, since we are going to move to TLS everywhere intra-cluster, I think this is totally fine.

Will TLS use a 32-bit checksum or a >32-bit checksum? And when it detects corruption, will it signal an error or silently retransmit?

presto-main/src/test/java/com/facebook/presto/execution/buffer/TestingPagesSerdeFactory.java

presto-main/src/test/java/com/facebook/presto/operator/TestExchangeClient.java

presto-spi/src/main/java/com/facebook/presto/spi/page/PagesSerdeUtil.java

bhhari

Minor comments

presto-spi/src/main/java/com/facebook/presto/spi/page/PagesSerdeUtil.java

presto-main/src/main/java/com/facebook/presto/execution/buffer/PagesSerdeFactory.java

presto-main/src/test/java/com/facebook/presto/operator/TestExchangeClient.java

tdcmeehan · 2021-02-19T05:17:22Z

Is this change still WIP? If not, we'll need to update the commit message.

bhhari · 2021-02-19T18:51:39Z

@jbroll the PR looks good, please update the commit message as @tdcmeehan suggested.

arhimondr · 2021-04-08T09:15:39Z

presto-spi/src/main/java/com/facebook/presto/spi/page/PagesSerdeUtil.java

+        if (CHECKSUMMED.isSet(page.getPageCodecMarkers())) {
+            output.writeLong(page.getChecksum());
+            try {
+                String host = Inet6Address.getLocalHost().getHostAddress();


Inet6Address.getLocalHost().getHostAddress() is actually quite expensive. Could potentially be too expensive to be done for every single page.

Also are we sure this is the right way of extracting a hostname? Wouldn't it return a 127.0.0.1 analogue in IPv6?

CC: @bhhari @tdcmeehan

This is the right way to get hostname, I have tested it, it will give twshared...
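
One way to address the cost concern raised above is to resolve the local address once and reuse it for every page. The sketch below assumes that approach; the class and method names are hypothetical and not from this PR. Note that `Inet6Address.getLocalHost()` resolves to the static `InetAddress.getLocalHost()`, so the sketch calls that directly. What it returns depends on the host's name resolution setup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative only: resolves the local address once instead of once per page.
final class LocalHostCacheSketch {
    private static volatile String cached;

    static String localHostAddress() {
        String value = cached;
        if (value == null) {
            try {
                // Potentially slow lookup; done at most a handful of times under races.
                value = InetAddress.getLocalHost().getHostAddress();
            }
            catch (UnknownHostException e) {
                // Assumption: a placeholder is acceptable for diagnostics.
                value = "unknown";
            }
            cached = value;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(localHostAddress());
        // Second call returns the cached string without another lookup.
        System.out.println(localHostAddress());
    }
}
```

Concurrent first calls may each perform the lookup, but all later calls return the stored value, which seems an acceptable trade-off for a diagnostic field.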

arhimondr · 2021-04-08T09:18:50Z

presto-spi/src/main/java/com/facebook/presto/spi/page/PagesSerdeUtil.java

+            int length = sliceInput.readInt();
+            byte[] hostAddress = new byte[length];
+            sliceInput.read(hostAddress);
+            host = new String(hostAddress);


nit: decoding is unnecessary unless a checksum error is encountered
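
A minimal sketch of the nit above, assuming the host bytes are kept as a raw `byte[]` and turned into a `String` only on the failure path. The class and method names and the error message are hypothetical, not taken from this PR.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Illustrative only: defers decoding the sender's host until a mismatch is found.
final class LazyHostDecodeSketch {
    static void verify(byte[] payload, long expectedChecksum, byte[] hostBytes) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        long actual = crc.getValue();
        if (actual != expectedChecksum) {
            // Decode only here, on the rare error path.
            String host = new String(hostBytes, StandardCharsets.UTF_8);
            throw new IllegalStateException(String.format(
                    "Checksum mismatch for page from host %s: expected %d, got %d",
                    host, expectedChecksum, actual));
        }
    }

    public static void main(String[] args) {
        byte[] payload = "123456789".getBytes(StandardCharsets.US_ASCII);
        byte[] host = "worker-1".getBytes(StandardCharsets.UTF_8);
        verify(payload, 0xCBF43926L, host); // matches, so no decoding happens
        System.out.println("ok");
    }
}
```

The sketch also passes an explicit charset, since `new String(byte[])` without one uses the platform default.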

jbroll requested review from bhhari, mbasmanova and tdcmeehan February 5, 2021 16:44

jbroll changed the title ~~[WIP] Add Checksum CRC32 to Exchange SerializedPage~~ Add Checksum CRC32 to Exchange SerializedPage Feb 10, 2021