Conversation

@navis (Contributor) commented Sep 18, 2015

Kryo fails with a buffer overflow even with the maximum buffer value (2G).

{noformat}
org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 1
Serialization trace:
containsChild (org.apache.spark.sql.catalyst.expressions.BoundReference)
child (org.apache.spark.sql.catalyst.expressions.SortOrder)
array (scala.collection.mutable.ArraySeq)
ordering (org.apache.spark.sql.catalyst.expressions.InterpretedOrdering)
interpretedOrdering (org.apache.spark.sql.types.StructType)
schema (org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema). To avoid this, increase spark.kryoserializer.buffer.max value.
at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:263)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:240)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{noformat}
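
For reference, the limit the error message points at is the Kryo serializer buffer ceiling. A minimal sketch of how it is configured (the app name is a placeholder; even the hard ceiling of just under 2048m is not enough here, which is why the patch stops serializing interpretedOrdering instead):

import org.apache.spark.SparkConf;

// Sketch of the configuration knobs named in the error message.
SparkConf conf = new SparkConf()
        .setAppName("kryo-buffer-demo") // placeholder
        .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .set("spark.kryoserializer.buffer", "64k")        // initial buffer size
        .set("spark.kryoserializer.buffer.max", "2047m"); // ceiling, must be < 2048m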

@SparkQA commented Sep 18, 2015

Test build #1772 has finished for PR 8808 at commit a26512b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class TaskCommitDenied(
    • class Interaction(override val uid: String) extends Transformer
    • abstract class LocalNode(conf: SQLConf) extends QueryPlan[LocalNode] with Logging

@rxin (Contributor) commented Sep 18, 2015

Thanks - I've merged this.

@asfgit closed this in e3b5d6c Sep 18, 2015
@JoshRosen (Contributor) commented Sep 18, 2015

@rxin, should this also go into 1.5.1?

@rxin (Contributor) commented Sep 18, 2015

Yes - I cherry-picked it now. Thanks.

asfgit pushed a commit that referenced this pull request Sep 18, 2015
…ialized


Author: navis.ryu <[email protected]>

Closes #8808 from navis/SPARK-10684.

(cherry picked from commit e3b5d6c)
Signed-off-by: Reynold Xin <[email protected]>
@rxin (Contributor) commented Sep 18, 2015

@navis can you give us the data type that caused this problem?

@navis (Contributor, Author) commented Sep 22, 2015

@rxin It's just a table with 100+ string columns, partitioned by a string key. It happened with a simple query like select <100+ columns> from <table> where <partition-key condition>.
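
A hypothetical reconstruction of that shape, for readers who want to reproduce it (every table, column, and value name below is invented; the 1.5.x Java API is assumed):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class WideTableRepro {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("wide-table-repro") // placeholder
                .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new SQLContext(sc);

        // A table with 100+ string columns, partitioned by a string key and
        // queried with a simple partition filter, is the shape that triggered
        // the overflow. wide_table and partition_key are stand-ins.
        DataFrame result = sqlContext.sql(
                "SELECT * FROM wide_table WHERE partition_key = 'some-value'");
        result.show();
        sc.stop();
    }
}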

@uhonnavarkar commented

Does this problem still exist in Spark 1.5.2 / 1.6.0?

@JoshRosen (Contributor) commented

@uhonnavarkar, this patch was incorporated into 1.5.1 and 1.6.0, if that's what you're asking.
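
If in doubt about which release a cluster is actually running, the version can be printed from the context (a minimal sketch; the master URL and app name are placeholders):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class VersionCheck {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
                new SparkConf().setMaster("local[*]").setAppName("version-check"));
        // The fix landed in 1.5.1 and 1.6.0, so anything older still has the bug.
        System.out.println("Running Spark " + sc.version());
        sc.stop();
    }
}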

@uhonnavarkar commented

I have two questions:

  1. I have downloaded the pre-built spark-1.5.2-bin-hadoop2.6.tgz from the Spark website (http://spark.apache.org/downloads.html), but I still see the stack trace in question #2 below. Does this package contain the fix?

  2. When I try to query the table (DataFrame), I get the exception. It is also inconsistent: sometimes it works, sometimes it does not.
    Stack trace:
    Job aborted due to stage failure: Task 0 in stage 1063.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1063.0 (TID 15469, XX.XXX.XX.XXX): org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 2. To avoid this, increase spark.kryoserializer.buffer.max value.
            at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:263)
            at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:240)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            at java.lang.Thread.run(Thread.java:745)
    Driver stacktrace:

And this is how I am creating the Spark context:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// sparkMaster, spark_home, spark_executor_memory, and the other variables
// are supplied elsewhere in the application.
sparkConf = new SparkConf().setMaster(sparkMaster)
        .setAppName("ABCTesting")
        .set("spark.home", spark_home)
        .set("spark.shuffle.consolidateFiles", "true")
        .set("spark.shuffle.manager", "sort")
        .set("spark.shuffle.spill", "false")
        .set("spark.executor.memory", spark_executor_memory)
        .set("spark.executor.extraClassPath", spark_executor_extra_classpath)
        .set("spark.cores.max", spark_cores_max)
        .set("spark.sql.shuffle.partitions", "15")
        .set("spark.driver.memory", spark_driver_memory)
        .set("spark.default.parallelism", "90")
        .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
sparkContext = new JavaSparkContext(sparkConf);
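
One thing stands out in that configuration: spark.kryoserializer.buffer.max is never set, so the default ceiling still applies. On a build without this patch, raising it explicitly is the stopgap the error message itself suggests (a minimal sketch; 512m is an arbitrary example value, and the hard limit is just under 2048m):

// Stopgap for unpatched builds, appended to the sparkConf built above.
// 512m is an arbitrary example; the real fix is the patch in this PR.
sparkConf.set("spark.kryoserializer.buffer.max", "512m");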

ashangit pushed a commit to ashangit/spark that referenced this pull request Oct 19, 2016
…ialized


Author: navis.ryu <[email protected]>

Closes apache#8808 from navis/SPARK-10684.

(cherry picked from commit e3b5d6c)
Signed-off-by: Reynold Xin <[email protected]>
(cherry picked from commit 2c6a51e)