Skip to content

Conversation

@yaooqinn
Copy link
Member

What changes were proposed in this pull request?

Shuffle Index temporay file is used for atomic creating shuffle index file, it is not needed when the index file already exists after another attempts of same task had it done.

How was this patch tested?

exitsting ut

cc @squito

@srowen
Copy link
Member

srowen commented Jan 29, 2018

Is this just trying to reuse a file that should have been cleaned up after prior failure? If so is that possible as a more direct solution? I wonder if there aren't corner cases here where the file exists and it is still being written to by another process. This could result in corruption. But I am not familiar with this mechanism

@jerryshao
Copy link
Contributor

I think it is necessary to add unit test to verify the changes.

@SparkQA
Copy link

SparkQA commented Jan 29, 2018

Test build #86768 has finished for PR 20422 at commit 98ea6a7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito
Copy link
Contributor

squito commented Jan 30, 2018

thanks for taking a look at this @yaooqinn . To clarify -- there is no bug you are trying to fix here, is there? Its just an optimization? From a quick glance I think the change seems correct ... but also seems like such a minor improvement that I'm not sure I see the value in changing this.

@yaooqinn
Copy link
Member Author

thanks guys for reviewing. yes, this is just a minor improvement which I guess code here seem not very logical when I was trying to do some optimizations for my customer's heavy shuffle case. If it's too trivial, I can close it.

@jerryshao
Copy link
Contributor

I agree with @squito , unless there's a bug in it, it is risky and unnecessary to change the logic in this critical path.

@squito
Copy link
Contributor

squito commented Jan 30, 2018

Jenkins, ok to test

Copy link
Contributor

@squito squito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think this is OK, though I'm still somewhat wary of changing this code for minor gains.

It is already tested in IndexShuffleBlockResolverSuite. Btw, I noticed that the comment here is wrong:

https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/shuffle/sort/IndexShuffleBlockResolverSuite.scala#L126

It should say "the dataFile should be the new one, since we deleted the dataFile from the first attempt".

It would also be good to expand that test to actually read the index file and make sure it has the right values.

}
indexTmp.delete()
} else {
val out = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(indexTmp)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this below the comment "This is the first successul attempt".

I'd also include a comment about why we write to a temporary file, even though we're always going to rename (because in case the task dies somehow, we'd prefer to not leave a half-written index file in the final location).

@SparkQA
Copy link

SparkQA commented Jan 30, 2018

Test build #86792 has finished for PR 20422 at commit 98ea6a7.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jan 30, 2018

Test build #86806 has finished for PR 20422 at commit 6196770.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yaooqinn
Copy link
Member Author

retest this please

@SparkQA
Copy link

SparkQA commented Jan 30, 2018

Test build #86831 has finished for PR 20422 at commit 6196770.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jan 31, 2018

Test build #86858 has finished for PR 20422 at commit 87e6dc0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yaooqinn
Copy link
Member Author

@squito add a test for index file. plz check it again, thanks.

@squito
Copy link
Contributor

squito commented Jan 31, 2018

@jerryshao are you ok with making this change? I think our original comments corssed paths as I was taking a closer look

Copy link
Contributor

@squito squito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the update @yaooqinn . i left some style comments, but also waiting to see @jerryshao 's opinion. Sorry for the back and forth, in any case its always useful to have more people looking at this code critically.

val idxName = s"shuffle_${shuffleId}_${mapId}_0.index"
val resolver = new IndexShuffleBlockResolver(conf, blockManager)

val lengths = (1 to 2).map(_ => 8L).toArray
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could do Array.fill(2)(8L)

assert(firstByte2(0) === 2)
}

test("SPARK-23253: index files should be created properly") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for adding this, but actually I'm not sure this is covering any cases in the previous test, is it? I was thinking of just adding something to read the actual index file, and make sure it had the right values to go with the update to the data file (or no updates in some cases).

you may have added a couple more asserts than the original test -- if so, maybe they can just be added to the original?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 we can add check the index file in the original test case.

@@ -123,7 +123,7 @@ class IndexShuffleBlockResolverSuite extends SparkFunSuite with BeforeAndAfterEa
assert(dataFile.length() === 35)
assert(!dataTmp2.exists())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not related to the change, but this should be assert(!dataTmp3.exists())

package org.apache.spark.shuffle.sort

import java.io.{File, FileInputStream, FileOutputStream}
import java.io._
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this change is not necessary

assert(firstByte(0) === 0)

// The index file should not change
val secondBytes = new Array[Byte](8)
Copy link
Contributor

@jiangxb1987 jiangxb1987 Feb 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should have a better name, perhaps indexSecondValue ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure whether we should change this, just my two cents.

@SparkQA
Copy link

SparkQA commented Feb 1, 2018

Test build #86905 has finished for PR 20422 at commit a96f6c4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 1, 2018

Test build #86906 has finished for PR 20422 at commit 246dbca.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 1, 2018

Test build #86907 has finished for PR 20422 at commit f3f3627.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@jerryshao jerryshao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Contributor

@squito squito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

} {
indexIn.close()
}
assert(secondValueOffset(7) === 10, "The index file should not change")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: here and below, would be more clear if you use DataInputStream.readLong() (no magic 7 offset, and you check the rest of the bytes):

val indexIn = new DataInputStream( newFileInputStream(indexFile))
Utils.tryWithSafeFinally {
  indexIn.readLong()  // first offset is always 0
  assert(10 === indexIn.readLong(),"The index file should not change")
}

@jiangxb1987
Copy link
Contributor

LGTM

@SparkQA
Copy link

SparkQA commented Feb 2, 2018

Test build #86958 has finished for PR 20422 at commit 5677448.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 2, 2018

Test build #86959 has finished for PR 20422 at commit a3b85f1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito
Copy link
Contributor

squito commented Feb 2, 2018

merged to master.

thanks @yaooqinn for doing and updating the tests too

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants