
Conversation

@bhavanki

The logic in SparkContext for creating a new task scheduler now looks for a "spark.local.maxFailures" property specifying the number of task failures that will cause a local job to fail. Its default is the prior hard-coded value of 1 (no retries).

The patch includes documentation updates and new unit tests.
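Under this patch, the retry count would be set like any other Spark property. A sketch (note: `spark.local.maxFailures` is the property this PR proposes; the PR was ultimately closed, so stock Spark does not recognize it):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch assuming this PR's proposed property; stock Spark
// ignores spark.local.maxFailures since the PR was not merged.
val conf = new SparkConf()
  .setMaster("local[4]")
  .setAppName("retry-test")
  .set("spark.local.maxFailures", "3") // allow up to 3 task failures locally
val sc = new SparkContext(conf)
```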

@AmplabJenkins

Can one of the admins verify this patch?

Contributor

I'd rename the variable to maxTaskFailures.

Author

Will do.

@roji

roji commented Aug 10, 2014

+1, this is important to us for locally testing exception logic before running on a real cluster.

@JoshRosen
Contributor

I think there's already a mechanism to set this: use local[N, maxFailures] as the master when creating your SparkContext:

// Regular expression for local[N, maxRetries], used in tests with failing tasks
val LOCAL_N_FAILURES_REGEX = """local\[([0-9]+)\s*,\s*([0-9]+)\]""".r

// ...

case LOCAL_N_FAILURES_REGEX(threads, maxFailures) =>
  val scheduler = new TaskSchedulerImpl(sc, maxFailures.toInt, isLocal = true)
  val backend = new LocalBackend(scheduler, threads.toInt)
  scheduler.initialize(backend)
  scheduler
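For illustration, the master-string pattern above can be exercised on its own, outside Spark. This standalone sketch copies `LOCAL_N_FAILURES_REGEX` from the snippet; the `parse` helper and demo object are hypothetical names introduced here:

```scala
object LocalMasterRegexDemo {
  // Same pattern as in SparkContext: local[N, maxFailures]
  val LOCAL_N_FAILURES_REGEX = """local\[([0-9]+)\s*,\s*([0-9]+)\]""".r

  // Returns (threads, maxFailures) when the master string matches
  def parse(master: String): Option[(Int, Int)] = master match {
    case LOCAL_N_FAILURES_REGEX(threads, maxFailures) =>
      Some((threads.toInt, maxFailures.toInt))
    case _ => None
  }

  def main(args: Array[String]): Unit = {
    // "local[4, 3]": 4 threads, a task may fail up to 3 times
    println(parse("local[4, 3]"))
    // Plain local[N] does not match; Spark falls back to the fixed default
    println(parse("local[4]"))
  }
}
```

Scala's `Regex` extractor anchors to the whole string, so only an exact `local[N, M]` form matches; `local[4]` falls through to the default case.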

@AmplabJenkins

Can one of the admins verify this patch?

@bhavanki
Author

@JoshRosen You are right, the local[N, maxFailures] mechanism already works, but the filer of SPARK-2083 stated that "it is not documented and hard to manage", and suggested a patch like this one.

@SparkQA

SparkQA commented Sep 5, 2014

Can one of the admins verify this patch?

@andrewor14
Contributor

@kbzod Why do we need a separate config for the local case? I think the correct solution is to use the same config, but set a different default value for local mode. Right now this doesn't work because we pass in a hard-coded value of 1, but we can change that to take in spark.task.maxFailures instead (and then default to 1).
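A sketch of the suggested change, against the pattern match in `createTaskScheduler` (hypothetical; the exact shape of the surrounding Spark code may differ):

```scala
case "local" =>
  // Reuse the cluster-wide setting instead of a hard-coded 1;
  // defaulting to 1 keeps plain local mode's old no-retry behavior.
  val maxFailures = sc.conf.getInt("spark.task.maxFailures", 1)
  val scheduler = new TaskSchedulerImpl(sc, maxFailures, isLocal = true)
  val backend = new LocalBackend(scheduler, 1)
  scheduler.initialize(backend)
  scheduler
```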

Also, can you rebase to master when you have the chance?

@andrewor14
Contributor

add to whitelist

@pwendell
Contributor

If this is being used for testing, I don't see a compelling reason to add a config over using the constructor.

@pwendell
Contributor

I'm going to close this issue as wontfix.

@asfgit asfgit closed this in f73b56f Nov 10, 2014