Add Checksum CRC32 to Exchange SerializedPage#15686
Add Checksum CRC32 to Exchange SerializedPage#15686arhimondr merged 1 commit intoprestodb:masterfrom jbroll:exchange-crc32
Conversation
|
Is page the right place to be calculating checksums? Should we be doing it at the transport layer so the entire message is checksummed. I know we are using HTTP and streaming so it's a bit more complex than that. I haven't done checksumming for HTTP. |
|
This is what we decided to do. Check summing the HTTP body is more or less the same thing as this, but looks harder to implement. |
|
I get that it is easier, but there is corruption this won't detect, corruption that can lead to incorrect query results. Even if the page is correct the other data in a message necessary to deliver the page to the correct place and process it correctly is still unprotected. We don't want to calculate the checksum twice so having it on pages is really only valuable for the case where we want to persist just a page (such as local disk). There are also message flows that aren't just exchanging pages. |
|
I don't have any arguments to make regarding where the 'right' place to checksum is. |
|
This is not the hill I want to die on either, but if we are going to do an 80% solution lets consciously decide that we're going with that. Just to give an example of other important message flows. File paths and statistics can flow between coordinator and workers. This could lead to silent corruption as the wrong paths are recorded in metadata or statistics contain corrupt values like highest and lowest value. Checksums on pages covers 99% of the data by volume so you are likely to catch most bitflips but that 1% contains things that when corrupted can cover larger amounts of data. Page level checksums are very high value and good quick win though. They will provide strong signal about the reliability of our hosts and network. I just don't want to lose sight of the other 20%. I think we will save more time on the other end with explicit checksum failures then we gain by saving a little time now. |
|
@aweisberg: going to the point @cemcayiroglu raised, data integrity should be guaranteed end to end for all communication flows once internal communication uses TLS. How do you feel about TLS being our 100% solution, with data integrity checks for exchange and scans being our 99% solution? |
|
Ah, I completely forgot about our moving to TLS. Yes I think since we are going to move to TLS everywhere intra-cluster this is totally fine. Will TLS use a 32-bit checksum or a >32-bit checksum and when it detects corruption will it signal an error or silently retransmit? |
presto-main/src/test/java/com/facebook/presto/execution/buffer/TestingPagesSerdeFactory.java
Outdated
Show resolved
Hide resolved
presto-main/src/test/java/com/facebook/presto/execution/buffer/TestingPagesSerdeFactory.java
Outdated
Show resolved
Hide resolved
presto-main/src/test/java/com/facebook/presto/operator/TestExchangeClient.java
Outdated
Show resolved
Hide resolved
presto-spi/src/main/java/com/facebook/presto/spi/page/PagesSerdeUtil.java
Outdated
Show resolved
Hide resolved
presto-spi/src/main/java/com/facebook/presto/spi/page/PagesSerdeUtil.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/execution/buffer/PagesSerdeFactory.java
Outdated
Show resolved
Hide resolved
presto-main/src/test/java/com/facebook/presto/operator/TestExchangeClient.java
Outdated
Show resolved
Hide resolved
presto-main/src/test/java/com/facebook/presto/operator/TestExchangeClient.java
Outdated
Show resolved
Hide resolved
|
Is this change still WIP? If not, we'll need to update the commit message. |
|
@jbroll the PR looks good, please update the commit message as @tdcmeehan suggested. |
| if (CHECKSUMMED.isSet(page.getPageCodecMarkers())) { | ||
| output.writeLong(page.getChecksum()); | ||
| try { | ||
| String host = Inet6Address.getLocalHost().getHostAddress(); |
There was a problem hiding this comment.
Inet6Address.getLocalHost().getHostAddress() is actually quite expensive. Could potentially be too expensive to be done for every single page.
Also are we sure this is the right way of extracting a hostname? Wouldn't it return a 127.0.0.1 analogue in IPv6?
CC: @bhhari @tdcmeehan
There was a problem hiding this comment.
This is the right way to get hostname, I have tested it, it will give twshared...
| int length = sliceInput.readInt(); | ||
| byte[] hostAddress = new byte[length]; | ||
| sliceInput.read(hostAddress); | ||
| host = new String(hostAddress); |
There was a problem hiding this comment.
nit: decoding is unnecessary unless a checksum error is encountered
Added Feature "exchange.checksum-enabled"
Maps to System property "exchange_checksum"
Uses CRC32 to compute the checksum.
Checksum is only read / written when enabled. The checksum existence in a page is indicated in PageCodecMarker.
Refactored TestExchangeClient adding a test for checksum ok and checksum fail.