[SPARK-19556][core] Do not encrypt block manager data in memory. #17295

vanzin · 2017-03-14T21:39:55Z

This change modifies the way block data is encrypted to make the more
common cases faster, while penalizing an edge case. As a side effect
of the change, all data that goes through the block manager is now
encrypted only when needed, including the previous path (broadcast
variables) where that did not happen.

The way the change works is by not encrypting data that is stored in
memory; so if a serialized block is in memory, it will only be encrypted
once it is evicted to disk.

The penalty comes when transferring that encrypted data from disk. If the
data ends up in memory again, it is as efficient as before; but if the
evicted block needs to be transferred directly to a remote executor, then
there's now a performance penalty, since the code now uses a custom
FileRegion implementation to decrypt the data before transferring.

This also means that block data transferred between executors now is
not encrypted (and thus relies on the network library encryption support
for secrecy). Shuffle blocks are still transferred in encrypted form,
since they're handled in a slightly different way by the code. This also
keeps compatibility with existing external shuffle services, which transfer
encrypted shuffle blocks, and avoids having to make the external service
aware of encryption at all.

The serialization and deserialization APIs in the SerializerManager now
do not do encryption automatically; callers need to explicitly wrap their
streams with an appropriate crypto stream before using those.
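
To illustrate what "explicitly wrap their streams" can look like, here is a minimal sketch using only JDK classes. It is not Spark's CryptoStreamUtils: the object name, the `AES/CTR/NoPadding` transformation, and the 16-byte IV written in the clear ahead of the ciphertext are all assumptions made for the example.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, InputStream, OutputStream}
import java.security.SecureRandom
import javax.crypto.{Cipher, CipherInputStream, CipherOutputStream}
import javax.crypto.spec.{IvParameterSpec, SecretKeySpec}

// Hypothetical stand-in for a crypto helper; not Spark's CryptoStreamUtils.
object CryptoWrapSketch {
  private val Transformation = "AES/CTR/NoPadding" // assumed for this example
  private val IvLength = 16

  private def cipher(mode: Int, key: Array[Byte], iv: Array[Byte]): Cipher = {
    val c = Cipher.getInstance(Transformation)
    c.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv))
    c
  }

  // Writes a fresh IV in the clear, then returns a stream that encrypts what follows.
  def wrapForEncryption(out: OutputStream, key: Array[Byte]): OutputStream = {
    val iv = new Array[Byte](IvLength)
    new SecureRandom().nextBytes(iv)
    out.write(iv)
    new CipherOutputStream(out, cipher(Cipher.ENCRYPT_MODE, key, iv))
  }

  // Reads the IV back, then returns a stream that decrypts the remainder.
  def wrapForDecryption(in: InputStream, key: Array[Byte]): InputStream = {
    val iv = new Array[Byte](IvLength)
    new DataInputStream(in).readFully(iv)
    new CipherInputStream(in, cipher(Cipher.DECRYPT_MODE, key, iv))
  }

  // The write and read logic knows nothing about encryption; the caller chooses to wrap.
  def roundTrip(data: Array[Byte], key: Array[Byte]): Array[Byte] = {
    val sink = new ByteArrayOutputStream()
    val enc = wrapForEncryption(sink, key)
    enc.write(data)
    enc.close()
    val dec = wrapForDecryption(new ByteArrayInputStream(sink.toByteArray), key)
    val result = new ByteArrayOutputStream()
    val buf = new Array[Byte](4096)
    var n = dec.read(buf)
    while (n != -1) { result.write(buf, 0, n); n = dec.read(buf) }
    result.toByteArray
  }
}
```

The point of the sketch is the division of labor: code that does not care about encryption works on plain streams, and only the code path that writes to or reads from disk adds the wrapper.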

As a result of these changes, some of the workarounds added in SPARK-19520
are removed here.

Testing: a new trait ("EncryptionFunSuite") was added that provides an easy
way to run a test twice, with encryption on and off; broadcast, block manager
and caching tests were modified to use this new trait so that the existing
tests exercise both encrypted and non-encrypted paths. I also ran some
applications with encryption turned on to verify that they still work,
including streaming tests that failed without the fix for SPARK-19520.
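
The "run a test twice" idea can be sketched without any test framework. The helper below is hypothetical and only shows the shape of such a trait: the real EncryptionFunSuite is built on Spark's test infrastructure, and the `encryptionTest` name and label format here are assumptions.

```scala
// Hypothetical sketch; the real trait builds on Spark's test base classes.
object EncryptionTestSketch {
  // Runs `body` once per encryption setting and returns the labels of the runs.
  def encryptionTest(name: String)(body: Boolean => Unit): Seq[String] = {
    Seq(false, true).map { encrypt =>
      val label = s"$name (encryption = ${if (encrypt) "on" else "off"})"
      body(encrypt) // the body would typically set the encryption flag on its conf
      label
    }
  }
}
```

A test written this way receives the flag as a parameter, so one test body covers both the encrypted and the non-encrypted path.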

Another change in the disk store is that it now stores a tiny metadata file next to the file holding the block data; this is needed to accurately account for the decrypted block size, which may be significantly different from the size of the encrypted file on disk.

mridulm · 2017-03-14T23:33:11Z

I have not looked at the implementation in detail, but can you comment on why block data is now sent to remote executors in plain text? Isn't it simpler to transmit block contents in encrypted format without decryption?

Remote fetch of RDD blocks is not uncommon (for any task other than PROCESS_LOCAL); and I wanted to better understand why this is required.

vanzin · 2017-03-14T23:50:11Z

> Isn't it simpler to transmit block contents in encrypted format without decryption?

First, keep in mind that there's no metadata that tells the receiver whether a block is encrypted or not. This means that methods like BlockManager.get, which can read block data from either local or remote sources, need to return data that is either always encrypted or always not encrypted for the same block ID.

This leaves two choices:

- Encrypt the data in all stores (memory & disk); this is what the current code does, and it requires all code that uses the BlockManager to deal with encryption. This is what caused SPARK-19520, and I filed SPARK-19556 to cover yet another case of a code path that did not do the right thing when encryption is enabled.
- Make all non-shuffle block data read from the BlockManager not encrypted. This means non-shuffle code calling the BlockManager does not have to care about encryption, since it will always read unencrypted data, and can always put unencrypted data in the BlockManager and it will be encrypted when needed (i.e. when writing to disk).
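
The second choice can be pictured with a toy model. This is not Spark code: the class, its method names, and the use of XOR as a placeholder for real encryption are assumptions chosen to keep the example short.

```scala
import scala.collection.mutable

// Toy model, not Spark code: XOR stands in for real encryption.
class TwoTierStoreSketch(key: Byte) {
  private val memory = mutable.Map.empty[String, Array[Byte]] // holds plaintext
  private val disk = mutable.Map.empty[String, Array[Byte]]   // holds "encrypted" bytes

  private def xor(bytes: Array[Byte]): Array[Byte] = bytes.map(b => (b ^ key).toByte)

  def put(id: String, plain: Array[Byte]): Unit = memory(id) = plain.clone()

  // Eviction is the only point where encryption happens.
  def evictToDisk(id: String): Unit =
    memory.remove(id).foreach(plain => disk(id) = xor(plain))

  // Callers always get plaintext, wherever the block currently lives.
  def get(id: String): Option[Array[Byte]] =
    memory.get(id).map(_.clone()).orElse(disk.get(id).map(xor))

  // Exposed only so the example can show what is on "disk".
  def rawDiskBytes(id: String): Option[Array[Byte]] = disk.get(id).map(_.clone())
}
```

Because `get` returns the same plaintext for a given block ID regardless of where the block lives, callers need no metadata about encryption.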

> Remote fetch of RDD blocks is not uncommon

That's fine. This change makes the data read from the BlockManager instance not encrypted. But when transmitting the data over to another executor, there's RPC-level encryption (spark.authenticate.enableSaslEncryption or spark.network.crypto.enabled), which means the data is still encrypted on the wire.

mridulm · 2017-03-15T00:16:52Z

> First, keep in mind that there's no metadata that tells the receiver whether a block is encrypted or not. This means that methods like BlockManager.get, which can read block data from either local or remote sources, need to return data that is either always encrypted or always not encrypted for the same block ID.

This can be solved by tagging the block data with a prefix byte - we do something similar for MapStatus (direct or broadcast).
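
For reference, the tagging alternative could look roughly like the sketch below. The tag values and names are invented for illustration; they are not taken from MapStatus or from any Spark code.

```scala
// Hypothetical sketch of the tagging idea; the tag values are made up for illustration.
object BlockTagSketch {
  val Plain: Byte = 0
  val Encrypted: Byte = 1

  // Prepends one byte that tells the receiver how to interpret the payload.
  def tag(payload: Array[Byte], encrypted: Boolean): Array[Byte] =
    (if (encrypted) Encrypted else Plain) +: payload

  // Returns (isEncrypted, payload); fails on an empty block or an unknown tag.
  def untag(block: Array[Byte]): (Boolean, Array[Byte]) = {
    require(block.nonEmpty, "tagged block must contain at least the tag byte")
    block.head match {
      case Encrypted => (true, block.tail)
      case Plain => (false, block.tail)
      case other => throw new IllegalArgumentException(s"unknown tag: $other")
    }
  }
}
```

The cost of this approach is that every reader and writer of block data has to agree on the tag format, which is the intrusiveness trade-off discussed in the replies.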

SparkQA · 2017-03-15T00:22:30Z

Test build #74555 has finished for PR 17295 at commit 3aa752f.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

mridulm · 2017-03-15T00:23:32Z

Just to be clear, I would prefer that we do things consistently: either encrypt all blocks while transferring (irrespective of whether SASL is enabled), or depend only on SASL for channel encryption.
But given the current behavior, I am not sure whether it was by design or by accident, and what the tradeoffs of ensuring consistency are.

(The workaround is the tagging I mentioned above.)

vanzin · 2017-03-15T00:24:21Z

> This can be solved by tagging the block data with a prefix byte

Sure, it could be solved in different ways. I just happened to prefer the one in this patch, since I think it's less intrusive; if you look closely, the majority of changes are in a single class (DiskStore), and there's mostly minor adjustments in other places.

vanzin · 2017-03-15T00:29:46Z

> Just to be clear, I would prefer that we do things consistently: either encrypt all blocks while transferring (irrespective of whether SASL is enabled), or depend only on SASL for channel encryption.

Not really sure what you mean here. But transferring encrypted data without RPC encryption is not really secure, since the encryption key is transferred to executors using an RPC. There's even a warning message if RPC encryption is not on and you enable disk encryption.

Shuffle is a different beast - I explain why the shuffle blocks are transferred in encrypted form in the PR description.

mridulm · 2017-03-15T00:33:00Z

> Not really sure what you mean here. But transferring encrypted data without RPC encryption is not really secure, since the encryption key is transferred to executors using an RPC. There's even a warning message if RPC encryption is not on and you enable disk encryption.

Good point, I overlooked that.
So to summarize, after this change, RDD blocks will always be transferred in plain text, with an implicit requirement that RPC encryption is strongly preferred to be enabled.
Is there any supported case where data is transferred in encrypted form? (Cases being: broadcast, RDD block transfer, replication, anything else?)
I wanted to ensure I understand what the final expected behavior/state would be, and how consistent we will become.

I agree about shuffle being special-cased; I was looking at only non-shuffle blocks.

vanzin · 2017-03-15T00:35:20Z

> Is there any supported case where data is transferred in encrypted form?

No, with these changes, only shuffle data is transferred in encrypted form.

sameeragarwal · 2017-03-16T00:33:04Z

cc @cloud-fan @ueshin

cloud-fan · 2017-03-16T02:32:58Z

> The penalty comes when transferring that encrypted data from disk. If the
> data ends up in memory again, it is as efficient as before; but if the
> evicted block needs to be transferred directly to a remote executor, then
> there's now a performance penalty, since the code now uses a custom
> FileRegion implementation to decrypt the data before transferring.

What's the actual difference? Did we previously transfer encrypted data?

vanzin · 2017-03-16T17:18:57Z

> What's the actual difference? Did we previously transfer encrypted data?

Yes. The previous version of the code would transfer the encrypted file over to the receiver, and the encrypted data for serialized blocks would also be stored in MemoryStore (and then decrypted on every use). That means the files could just be mmap'ed for transfer, which is faster than the ReadableByteChannel path even without encryption in the picture. (If you consider the previous code had to decrypt from the MemoryStore on every read, you can end up with better performance overall with this patch.)

But this caused all the other issues with making the BlockManager harder to use when encryption was on, so I think this is a better solution.

cloud-fan · 2017-03-17T04:56:01Z

Makes sense. One more question: ideally, shall we also transfer shuffle blocks after decryption?

cloud-fan · 2017-03-17T05:42:56Z

core/src/main/scala/org/apache/spark/security/CryptoStreamUtils.scala

-    new CryptoInputStream(transformationStr, properties, is,
-      new SecretKeySpec(key, "AES"), new IvParameterSpec(iv))
+    var read = 0
+    while (read < iv.length) {


what does this while loop do?

It avoids issues with short reads. It's unlikely to happen but I always write read code like this to be safe.

Yeah, you can just use ByteStreams.readFully(is, iv).

Ah, missed that one. +1 for shorter code.
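
For readers unfamiliar with the short-read issue, the sketch below shows the behavior that a `readFully`-style helper provides: keep reading until the buffer is full, and fail on premature end of stream. The object and helper names are hypothetical; in practice Guava's `ByteStreams.readFully` or the JDK's `DataInputStream.readFully` do the same job.

```scala
import java.io.{ByteArrayInputStream, EOFException, InputStream}

object ReadFullySketch {
  // Keeps reading until `dst` is full; a single read() may legally return fewer bytes.
  def readFully(in: InputStream, dst: Array[Byte]): Unit = {
    var read = 0
    while (read < dst.length) {
      val n = in.read(dst, read, dst.length - read)
      if (n < 0) throw new EOFException(s"Stream ended after $read of ${dst.length} bytes.")
      read += n
    }
  }

  // Example helper: an input stream that hands out at most one byte per call.
  def trickle(bytes: Array[Byte]): InputStream = new ByteArrayInputStream(bytes) {
    override def read(b: Array[Byte], off: Int, len: Int): Int =
      super.read(b, off, math.min(len, 1))
  }
}
```

The `trickle` stream exists only to provoke short reads; with it, a single `read` call would fill just one byte of the IV, while `readFully` still fills the whole array.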

cloud-fan · 2017-03-17T05:44:21Z

core/src/main/scala/org/apache/spark/security/CryptoStreamUtils.scala

+  /**
+   * This class is a workaround for CRYPTO-125, that forces all bytes to be written to the
+   * underlying channel. Since the callers of this API are using blocking I/O, there are no
+   * concerns with regards to CPU usage here.


Is it a separate bug fix?

No. As the comment states, it's a workaround for a bug in the commons-crypto library, which would affect the code being added.
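
The general shape of such a workaround, a channel wrapper that keeps writing until the source buffer is drained, can be sketched as follows. This is an illustration of the technique under the assumption of blocking I/O, not the class added by the patch; both class names are hypothetical, and the second class exists only to simulate partial writes.

```scala
import java.io.ByteArrayOutputStream
import java.nio.ByteBuffer
import java.nio.channels.WritableByteChannel

// Hypothetical sketch of a "write everything" wrapper for blocking channels.
class WriteFullyChannelSketch(sink: WritableByteChannel) extends WritableByteChannel {
  // Loops until `src` is drained, so callers never observe a partial write.
  override def write(src: ByteBuffer): Int = {
    val total = src.remaining()
    while (src.hasRemaining) {
      sink.write(src)
    }
    total
  }

  override def isOpen(): Boolean = sink.isOpen
  override def close(): Unit = sink.close()
}

// Example helper: a sink that accepts at most `limit` bytes per call.
class PartialSinkSketch(limit: Int) extends WritableByteChannel {
  val received = new ByteArrayOutputStream()

  override def write(src: ByteBuffer): Int = {
    val n = math.min(limit, src.remaining())
    var i = 0
    while (i < n) { received.write(src.get().toInt); i += 1 }
    n
  }

  override def isOpen(): Boolean = true
  override def close(): Unit = ()
}
```

Looping like this is only reasonable with blocking channels; on a non-blocking channel the same loop would spin on the CPU, which is the concern the code comment addresses.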

cloud-fan · 2017-03-17T06:00:40Z

core/src/main/scala/org/apache/spark/storage/DiskStore.scala

+      assert(blockSize <= Int.MaxValue, "Block is too large to be wrapped in a byte buffer.")
+      val is = toInputStream()
+      try {
+        ByteBuffer.wrap(ByteStreams.toByteArray(is))


will we read all data out here?

There's a comment explaining it a few lines above...

vanzin · 2017-03-17T16:29:04Z

> shall we also transfer shuffle blocks after decryption?

No. That's explained in the PR description.

mridulm

I did an initial pass and added some comments/queries. Overall, as I mentioned earlier, I like the fact that we have a more consistent approach to transferring data.
Thanks for the work @vanzin!

mridulm · 2017-03-15T00:46:52Z

core/src/main/scala/org/apache/spark/security/CryptoStreamUtils.scala

+    val params = new CryptoParams(key, sparkConf)
+    val iv = createInitializationVector(params.conf)
+    val buf = ByteBuffer.wrap(iv)
+    while (buf.remaining() > 0) {


nit: buf.hasRemaining for this pattern of use

mridulm · 2017-03-15T19:42:31Z

core/src/main/scala/org/apache/spark/security/CryptoStreamUtils.scala

+        throw new EOFException("Failed to read IV from stream.")
+      }
+      read += _read
+    }


ByteStreams.readFully instead of the loop

mridulm · 2017-03-15T20:01:26Z

core/src/main/scala/org/apache/spark/security/CryptoStreamUtils.scala

+      key: Array[Byte]): ReadableByteChannel = {
+    val iv = new Array[Byte](IV_LENGTH_IN_BYTES)
+    val buf = ByteBuffer.wrap(iv)
+    buf.clear()


nit: The clear is not required.

mridulm · 2017-03-15T20:03:10Z

core/src/main/scala/org/apache/spark/security/CryptoStreamUtils.scala

  }
+
+  /**
+   * This class is a workaround for CRYPTO-125, that forces all bytes to be written to the


This is a lousy bug! Good thing that we don't seem to be hit by it (yet).

There's a pretty nasty workaround for it in the network library... (the non-blocking workaround is a lot worse than this.)

mridulm · 2017-03-18T10:19:44Z

core/src/main/scala/org/apache/spark/storage/BlockManager.scala

+  override def toManagedBuffer(): ManagedBuffer = new NettyManagedBuffer(buffer.toNetty)
+
+  override def toByteBuffer(allocator: Int => ByteBuffer): ChunkedByteBuffer = {
+    buffer.copy(allocator)


autoDispose is not honored for toManagedBuffer and toByteBuffer?
On first pass, it looks like it is not...

Also, is the expectation that the invoker must manually invoke dispose when not using toInputStream?
It would be good to add a comment about this to the BlockData trait detailing the expectation.

So I had traced through that stuff 2 or 3 times, and now I did it again and I think I finally understood all that's going on. Basically, the old code was really bad at explicitly disposing of the buffers, meaning a bunch of paths (like the ones that used managed buffers) didn't bother to do it and just left the work to the GC.

I changed the code a bit to make the dispose more explicit and added comments in a few key places.

mridulm · 2017-03-18T11:31:26Z

core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala

 */
 private[spark] class DiskBlockManager(conf: SparkConf, deleteFilesOnStop: Boolean) extends Logging {

+  private val METADATA_FILE_SUFFIX = ".meta"


Assuming I am not missing something, shuffle does not use (require) block length from meta file.
If yes, for all others, why not simply keep the block size in memory ? On executor failure, the on disk block is lost anyway, and we already maintain block info for each block in executor.

Hmm, good point... there's currently no metadata kept in the DiskStore class, but then this shouldn't be a lot of data.
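
Keeping the sizes in memory could be as simple as the sketch below. The class and method names are hypothetical and block IDs are plain strings here; the only point is that the plaintext size is recorded when the block is written and dropped when the block is removed.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical sketch: tracks plaintext block sizes in memory instead of in ".meta" files.
class BlockSizeTrackerSketch {
  private val sizes = new ConcurrentHashMap[String, java.lang.Long]()

  // Record the decrypted size at write time, when it is known exactly.
  def recordPut(blockId: String, plainSize: Long): Unit = sizes.put(blockId, plainSize)

  // Returns the recorded size, or None for an unknown block.
  def getSize(blockId: String): Option[Long] = Option(sizes.get(blockId)).map(_.longValue)

  // Removing the block must also drop its size entry, or the map grows without bound.
  def remove(blockId: String): Unit = sizes.remove(blockId)
}
```

Since the on-disk block does not survive an executor failure anyway, losing this map together with the process costs nothing.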

mridulm · 2017-03-18T11:48:38Z

core/src/main/scala/org/apache/spark/storage/DiskStore.scala

    } finally {
      try {
-        Closeables.close(fileOutputStream, threwException)
+        Closeables.close(out, threwException)


IOException can be thrown in close(), we will need to remove block (and meta) in that case as well.

This was the previous behavior, but well, doesn't hurt to fix it.
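
One way to express "remove the block if either the write or the close fails" is sketched below. The object, the `open` parameter, and the use of a plain `File` are assumptions for the example; Spark's DiskStore has its own structure.

```scala
import java.io.{File, FileOutputStream, IOException, OutputStream}

object SafeWriteSketch {
  // Writes via `writeFunc`; if writing or closing fails, the partial file is deleted.
  // `open` is injectable only so that the example can simulate a failing close().
  def put(file: File, open: File => OutputStream = f => new FileOutputStream(f))
      (writeFunc: OutputStream => Unit): Unit = {
    val out = open(file)
    var success = false
    try {
      writeFunc(out)
      success = true
    } finally {
      try {
        out.close()
      } catch {
        case _: IOException if !success => // the write already failed; keep that exception
        case e: IOException =>
          success = false
          throw e
      } finally {
        if (!success) file.delete()
      }
    }
  }
}
```

With this shape, a block file is left on disk only when both the write and the close completed without an exception.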

mridulm · 2017-03-18T11:58:24Z

core/src/main/scala/org/apache/spark/storage/DiskStore.scala

+          Utils.tryWithSafeFinally {
+            val buf = ByteBuffer.allocate(blockSize.toInt)
+            while (buf.remaining() > 0) {
+              channel.read(buf)


We need to handle the case where read() returns EOF (-1), e.g. because of data corruption or the file being removed from underneath us; otherwise we will end up in an infinite loop.

I might have missed more places where this pattern exists in this change.
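
A channel-based read loop with the EOF check being asked for might look like the sketch below. The names are hypothetical and the in-memory channel is only there to make the example self-contained.

```scala
import java.io.{ByteArrayInputStream, EOFException}
import java.nio.ByteBuffer
import java.nio.channels.{Channels, ReadableByteChannel}

object ChannelReadSketch {
  // Fills `dst` from `channel`, failing fast instead of spinning forever on EOF.
  def readFully(channel: ReadableByteChannel, dst: ByteBuffer): Unit = {
    val expected = dst.remaining()
    while (dst.hasRemaining) {
      if (channel.read(dst) < 0) {
        throw new EOFException(
          s"Not enough bytes in channel (expected $expected, missing ${dst.remaining()}).")
      }
    }
  }

  // Convenience for the example: a channel over an in-memory byte array.
  def channelOf(bytes: Array[Byte]): ReadableByteChannel =
    Channels.newChannel(new ByteArrayInputStream(bytes))
}
```

Without the `< 0` check, a truncated file would leave `dst.hasRemaining` true forever, which is exactly the infinite loop described above.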

mridulm · 2017-03-18T12:06:54Z

core/src/main/scala/org/apache/spark/storage/DiskStore.scala

+        remaining -= chunkSize
+
+        while (chunk.remaining() > 0) {
+          source.read(chunk)


as mentioned above, needs EOF error handling.

mridulm · 2017-03-18T12:11:52Z

core/src/main/scala/org/apache/spark/storage/DiskStore.scala

+    written
+  }
+
+  override def deallocate(): Unit = source.close()


release buffer as well.

StorageUtils.dispose specifically checks for mapped buffers, which is not the case here. It could be changed, but in this case I wonder if it's necessary or if waiting for GC is good enough.

SparkQA · 2017-03-20T22:54:27Z

Test build #74906 has finished for PR 17295 at commit 1b2a3e4.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

vanzin · 2017-03-20T22:59:35Z

retest this please

SparkQA · 2017-03-20T23:06:05Z

Test build #74905 has finished for PR 17295 at commit 1428fcd.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-03-21T01:05:06Z

Test build #74911 has finished for PR 17295 at commit 1b2a3e4.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-03-21T16:51:48Z

Test build #74989 has finished for PR 17295 at commit 6848a59.

This patch fails to build.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-03-21T23:00:58Z

Test build #74997 has finished for PR 17295 at commit 6bda670.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2017-03-22T02:46:49Z

core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala

+            }
+            obj
+          } finally {
+            blocks.foreach(_.dispose())


ah good catch! we should dispose the blocks here

cloud-fan · 2017-03-22T02:48:46Z

core/src/main/scala/org/apache/spark/security/CryptoStreamUtils.scala

+    val params = new CryptoParams(key, sparkConf)
+    val iv = createInitializationVector(params.conf)
+    val buf = ByteBuffer.wrap(iv)
+    while (buf.hasRemaining()) {


is there any possibility this may be an infinite loop?

Actually, this logic is the same as CryptoHelperChannel.write. Shall we create the CryptoHelperChannel first and simply call helper.write(buf) here?

No, there's no infinite loop here, because a failure would cause an exception. Yeah, using the helper should work too.

cloud-fan · 2017-03-22T02:50:40Z

core/src/main/scala/org/apache/spark/security/CryptoStreamUtils.scala

+      key: Array[Byte]): ReadableByteChannel = {
+    val iv = new Array[Byte](IV_LENGTH_IN_BYTES)
+    val buf = ByteBuffer.wrap(iv)
+    JavaUtils.readFully(channel, buf)


Why not use ByteStreams.readFully? The buf is not used elsewhere.

There's no ByteStreams.readFully for ReadableByteChannel that I'm aware of.

cloud-fan · 2017-03-22T02:57:23Z

core/src/main/scala/org/apache/spark/serializer/SerializerManager.scala

    val byteStream = new BufferedOutputStream(outputStream)
    val autoPick = !blockId.isInstanceOf[StreamBlockId]
    val ser = getSerializer(implicitly[ClassTag[T]], autoPick).newInstance()
-    ser.serializeStream(wrapStream(blockId, byteStream)).writeAll(values).close()


the wrapStream and wrapForEncryption methods can be removed from this class

They're still used in a bunch of places.

cloud-fan · 2017-03-22T02:59:04Z

core/src/main/scala/org/apache/spark/storage/BlockManager.scala

+  def toNetty(): Object
+
+  def toChunkedByteBuffer(allocator: Int => ByteBuffer): ChunkedByteBuffer
+


It would be great to add some documentation for these 4 methods about when they will be called.

I added scaladoc for toNetty(), but the others seem self-explanatory to me.

cloud-fan · 2017-03-22T03:02:34Z

core/src/main/scala/org/apache/spark/storage/BlockManagerManagedBuffer.scala


 /**
- * This [[ManagedBuffer]] wraps a [[ChunkedByteBuffer]] retrieved from the [[BlockManager]]
+ * This [[ManagedBuffer]] wraps a ManagedBuffer retrieved from the [[BlockManager]]


wraps a [[BlockData]]

cloud-fan · 2017-03-22T03:05:12Z

core/src/main/scala/org/apache/spark/storage/BlockManagerManagedBuffer.scala

+    data: BlockData,
+    dispose: Boolean) extends ManagedBuffer {
+
+  private val refCount = new AtomicInteger(1)


maybe we should mention it in the class doc that the BlockData will be disposed automatically via reference count.
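
The reference-counting behavior under discussion can be sketched as follows. The class is hypothetical and deliberately omits the read-lock release that the real buffer also performs; it only shows the "last release disposes, if asked to" rule.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical sketch; names mirror the discussion but this is not Spark's class.
class RefCountedBufferSketch(onDispose: () => Unit, dispose: Boolean) {
  private val refCount = new AtomicInteger(1) // the creator holds the first reference

  def retain(): this.type = {
    refCount.incrementAndGet()
    this
  }

  // Drops one reference; the last release triggers disposal if `dispose` is set.
  def release(): this.type = {
    if (refCount.decrementAndGet() == 0 && dispose) {
      onDispose()
    }
    this
  }
}
```

The `dispose` flag expresses intent ("this holder wants the data disposed"), which is the naming argument made in the thread.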

cloud-fan · 2017-03-22T03:07:08Z

core/src/main/scala/org/apache/spark/storage/BlockManagerManagedBuffer.scala

    blockId: BlockId,
-    chunkedBuffer: ChunkedByteBuffer) extends NettyManagedBuffer(chunkedBuffer.toNetty) {
+    data: BlockData,
+    dispose: Boolean) extends ManagedBuffer {


needDispose may be a better name

Hmm, I prefer dispose, because it's not about needing to dispose the buffer, but wanting to dispose the buffer.

cloud-fan · 2017-03-22T03:31:37Z

core/src/main/scala/org/apache/spark/storage/BlockManager.scala

            replicate(blockId, bytesToReplicate, level, remoteClassTag)
          } finally {
-            bytesToReplicate.unmap()
+            bytesToReplicate.dispose()


why change unmap to dispose?

Because there's no BlockData.unmap().

I'm afraid this may counteract the effort we made in #16499

Ideally unmap and dispose do different things

cc @mallman

BlockData.dispose calls ChunkedByteBuffer.unmap.

cloud-fan · 2017-03-22T03:37:11Z

core/src/main/scala/org/apache/spark/storage/DiskStore.scala

-        Closeables.close(fileOutputStream, threwException)
+        out.close()
+      } catch {
+        case ioe: IOException =>


why this? threwException starts with true

The code needs to catch any exception thrown by out.close() and also remove the block in that case. That wasn't done before.

cloud-fan · 2017-03-22T03:39:51Z

core/src/main/scala/org/apache/spark/storage/BlockManager.scala

+private[spark] trait BlockData {
+
+  def toInputStream(): InputStream
+


why the return type is Object?

See ManagedBuffer.convertToNetty().

SparkQA · 2017-03-24T00:22:53Z

Test build #75120 has finished for PR 17295 at commit 00b6d00.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2017-03-25T12:41:42Z

core/src/main/scala/org/apache/spark/storage/BlockManager.scala

+  override def toByteBuffer(): ByteBuffer = buffer.toByteBuffer
+
+  override def size: Long = buffer.size
+


Can we define the semantics of BlockData.dispose clearly? It's quite confusing here that the dispose method calls buffer.unmap while ChunkedByteBuffer also has a dispose method.

I think BlockData.dispose() is pretty well defined. "Release any resources held by the object." What's confusing is that there's both dispose() and unmap() in ChunkedByteBuffer, when there used to be only dispose(). It's confusing to have two different methods for releasing resources, and that confusion is not being caused by this patch.

BlockData is not just a wrapper around ChunkedByteBuffer; if it were, there wouldn't be a need for it. That is why calling the method unmap() wouldn't make any sense here, since that's very specific to memory-mapped byte buffers.

BTW I'm really starting to think the fix in #16499, while technically correct, is more confusing than it should be. The problem is not that the code was disposing of off-heap buffers; the problem is that buffers read from the memory store should not be disposed of, while buffers read from the disk store should.

So it's not really a matter of dispose vs. unmap, but a matter of where the buffer comes from. (Which is kinda what I had in this patch with the autoDispose parameter to ByteBufferBlockData. Perhaps I should revive that and get rid of StorageUtils.unmap, which is just confusing.)

It is not the type of buffer that defines whether it should be disposed or not, but rather where it comes from: if it comes from the memory store, it should not be disposed. Any other case (disk store, temporary serialized buffers, etc), the buffer should be disposed. It just happened that this sort of aligned with the types (buffers from the memory store are normal buffers, buffers from the disk store are mapped buffers in certain cases). But the origin defines who owns the buffer and, thus, who should dispose of it.
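
The "origin defines ownership" rule can be written down in a few lines. The type and case names below are hypothetical labels for the origins named in the comment, not part of the BlockData API.

```scala
// Hypothetical sketch of "origin decides ownership"; not the actual BlockData API.
object BufferOwnershipSketch {
  sealed trait Origin
  case object MemoryStore extends Origin     // the store keeps owning the buffer
  case object DiskStore extends Origin       // the reader owns the buffer it was handed
  case object TempSerialized extends Origin  // temporary buffers belong to their creator

  // Only buffers that the memory store still owns must be left alone.
  def shouldDispose(origin: Origin): Boolean = origin match {
    case MemoryStore => false
    case DiskStore | TempSerialized => true
  }
}
```

Keying the decision on origin rather than on buffer type avoids relying on the coincidence that disk-store buffers happen to be memory-mapped in some cases.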

vanzin · 2017-03-26T00:13:50Z

I removed StorageUtils.unmap() in my last commit (see commit message for details). That makes the confusion go away.

The replication tests fail from time to time, but they seem to be flaky even without this patch. See:
https://spark-tests.appspot.com/test-details?suite_name=org.apache.spark.storage.BlockManagerProactiveReplicationSuite&test_name=proactive+block+replication+-+5+replicas+-+4+block+manager+deletions

SparkQA · 2017-03-26T02:57:25Z

Test build #75226 has finished for PR 17295 at commit d4013f9.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2017-03-27T03:54:25Z

core/src/main/scala/org/apache/spark/storage/BlockManagerManagedBuffer.scala

 * so that the corresponding block's read lock can be released once this buffer's references
 * are released.
 *
+ * If `dispose` is set to try, the [[BlockData]]will be disposed when the buffer's reference


is set to try -> is set to true

cloud-fan · 2017-03-27T04:01:45Z

LGTM, cc @mallman to check the unmap part

SparkQA · 2017-03-27T19:37:20Z

Test build #75267 has finished for PR 17295 at commit ab4b5dd.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-03-28T20:37:44Z

Test build #75323 has finished for PR 17295 at commit 4a39cb2.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2017-03-29T12:27:58Z

thanks, merging to master!

mallman · 2017-04-11T18:12:12Z

> LGTM, cc @mallman to check the unmap part

LGTM, too. Sorry for the late reply... I've been away the past two weeks.

cloud-fan reviewed Mar 17, 2017

mridulm reviewed Mar 18, 2017

Marcelo Vanzin added 5 commits March 20, 2017 10:35

Feedback: style.

db4074c

Add helper method to read buffer from channel.

d697398

Store DiskStore block sizes in memory.

5dba0eb

Fix BlockData lifecycle management.

1428fcd

This wasn't really caused by the new code, but by old code that was not consistent in its disposal of BlockManager buffers. This commit fixes the places where the underlying buffer was just left for the GC instead of being explicitly disposed.

Remove block from size map in DiskStore.remove().

1b2a3e4

Merge branch 'master' into SPARK-19556

6848a59

Need to use ChunkedByteBuffer.unmap() instead of dispose().

6bda670

cloud-fan reviewed Mar 22, 2017

Feedback.

00b6d00

cloud-fan reviewed Mar 25, 2017

cloud-fan reviewed Mar 27, 2017

Fix typo.

ab4b5dd

Merge branch 'master' into SPARK-19556

4a39cb2

asfgit closed this in b56ad2b Mar 29, 2017

vanzin deleted the SPARK-19556 branch March 29, 2017 17:07
