Closed
@@ -32,6 +32,9 @@ import org.apache.spark.util.Utils
*/
object DriverRegistry extends Logging {

// Initialize DriverManager first to prevent potential deadlocks between DriverManager and Driver
Contributor

We need to say more about why this can avoid deadlock.

Contributor

DriverManager.getDrivers
Member Author

Hi, @rxin.
Could you give me some comments on this?

Member

More context about this change? Based on your PR description, this is to resolve the deadlocks among executors? How does it work after applying this change?

Member Author

@dongjoon-hyun Jan 24, 2018

It's the same situation as the STORM case in the PR description, and it occurs in Spark, too.
In the Spark executor, the stack traces show the deadlock between DriverManager and Driver.

  • The Phoenix library calls DriverManager.loadInitialDrivers().
  • Spark's DriverRegistry calls the PhoenixDriver constructor before DriverManager has been initialized.
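
To make the ordering argument concrete, here is a minimal sketch of the idea behind the patch. It is not the actual DriverRegistry code: `DriverInitOrderSketch`, `loadDriver`, and `registeredDriverNames` are hypothetical names used only for illustration. The assumption (based on JDK 8 behavior, where `DriverManager`'s static initializer loads drivers and a driver's static initializer typically calls `DriverManager.registerDriver`) is that touching `DriverManager` first lets its class initialization finish before any driver class is instantiated reflectively.

```scala
import java.sql.{Driver, DriverManager}

// Hypothetical sketch, not the real DriverRegistry.
object DriverInitOrderSketch {

  // Assumption: calling any static method forces DriverManager's class
  // initialization to complete on this thread, before loadDriver below
  // can trigger a driver class's own static initializer.
  DriverManager.getDrivers

  // Instantiate a driver by class name, as a registry might do.
  // Class.forName may run the driver's static initializer, which
  // usually calls DriverManager.registerDriver.
  def loadDriver(className: String): Driver = {
    val cls = Class.forName(className)
    cls.getConstructor().newInstance().asInstanceOf[Driver]
  }

  // List the class names of the drivers currently registered.
  def registeredDriverNames(): List[String] = {
    val names = List.newBuilder[String]
    val drivers = DriverManager.getDrivers
    while (drivers.hasMoreElements) {
      names += drivers.nextElement().getClass.getName
    }
    names.result()
  }
}
```

With this ordering, a thread that first touches the object runs `DriverManager.getDrivers` before any call to `loadDriver`, so the two class initializers should no longer wait on each other. Whether this fully removes the reported deadlock still needs to be confirmed on a real cluster.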

Member Author

Unfortunately, this log is all I have so far. It's difficult to reproduce this deadlock.

Member

How did you test it?

Member

We need a test; otherwise, it is not the right thing to merge such a PR.

Member Author

Do you mean a unit test case?

Member Author

I'm wondering whether you could merge this PR if I can test this patch somewhere on one of our customer clusters. :)

Contributor

@cloud-fan Jan 25, 2018

If it's too hard to write a unit test, I think a manual test is also fine.

Member Author

Thank you for the review, @cloud-fan.
So far, this deadlock has been reported only intermittently, without any logs.


private val wrapperMap: mutable.Map[String, DriverWrapper] = mutable.Map.empty

def register(className: String): Unit = {