This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Get a core dump when trying to run the Scala example from the website #4540

Closed
danimateos opened this issue Jan 5, 2017 · 1 comment


@danimateos

I've been trying to get the MNIST Scala example from the website to run, with no luck. I fixed a compilation error, but now I get a core dump when I run it. I don't know how to proceed.

Environment info

Operating System:
Ubuntu 16.10 amd64

Compiler:
sbt

Package used (Python/R/Scala/Julia):
Scala package
scalaVersion 2.11.8

MXNet version:
"ml.dmlc.mxnet" % "mxnet-full_2.10-linux-x86_64-gpu" % "0.1.1"

Error Message:

$ sbt 'runMain playground.DigitsExample'
[info] Loading global plugins from /home/dani/myconfigs/.sbt/0.13/plugins
[info] Set current project to mxnet-playground (in build file:/home/dani/repos/mxnet-playground/)
[info] Updating {file:/home/dani/repos/mxnet-playground/}mxnet-playground...
[info] Resolving jline#jline;2.12.1 ...
[info] Done updating.
[info] Compiling 1 Scala source to /home/dani/repos/mxnet-playground/target/scala-2.11/classes...
[info] Running playground.DigitsExample 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/dani/.ivy2/cache/ml.dmlc.mxnet/mxnet-full_2.10-linux-x86_64-cpu/jars/mxnet-full_2.10-linux-x86_64-cpu-0.1.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/dani/.ivy2/cache/org.slf4j/slf4j-log4j12/jars/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
log4j:WARN No appenders could be found for logger (MXNetJVM).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
[11:54:06] src/io/iter_mnist.cc:94: MNISTIter: load 60000 images, shuffle=1, shape=(50,1,28,28)
[11:54:06] src/io/iter_mnist.cc:94: MNISTIter: load 10000 images, shuffle=1, shape=(50,1,28,28)
[11:54:06] /home/ubuntu/release/mxnet/dmlc-core/include/dmlc/logging.h:245: [11:54:06] /home/ubuntu/release/mxnet/mshadow/mshadow/./././dot_engine-inl.h:431: Check failed: dst.size(0) == sleft[0] && dst.size(1) == sright[1] && sleft[1] == sright[0] dot-gemm: matrix shape mismatch
[11:54:06] /home/ubuntu/release/mxnet/dmlc-core/include/dmlc/logging.h:245: [11:54:06] src/engine/./threaded_engine.h:295: [11:54:06] /home/ubuntu/release/mxnet/mshadow/mshadow/./././dot_engine-inl.h:431: Check failed: dst.size(0) == sleft[0] && dst.size(1) == sright[1] && sleft[1] == sright[0] dot-gemm: matrix shape mismatch
An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPEto NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
terminate called after throwing an instance of 'dmlc::Error'
  what():  [11:54:06] src/engine/./threaded_engine.h:295: [11:54:06] /home/ubuntu/release/mxnet/mshadow/mshadow/./././dot_engine-inl.h:431: Check failed: dst.size(0) == sleft[0] && dst.size(1) == sright[1] && sleft[1] == sright[0] dot-gemm: matrix shape mismatch
An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPEto NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
/usr/share/sbt-launcher-packaging/bin/sbt-launch-lib.bash: line 41: 24698 Aborted                 (core dumped) "$@"
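
For readers unfamiliar with the failing check: the condition printed in the log (`dst.size(0) == sleft[0] && dst.size(1) == sright[1] && sleft[1] == sright[0]`) is the usual shape requirement for a 2-D matrix product `dst = left x right`. The snippet below is only a plain-Scala restatement of that condition, not the mshadow code. The batch size 50 and the 28x28 image size come from the `MNISTIter` log lines; the hidden size 128 and the names `GemmShapeCheck` and `gemmShapesAgree` are made up for illustration.

```scala
object GemmShapeCheck {
  /** True when dst = left x right is a well-formed 2-D matrix product.
    * Each shape is (rows, columns); this mirrors the condition in the log. */
  def gemmShapesAgree(dst: (Int, Int), left: (Int, Int), right: (Int, Int)): Boolean =
    dst._1 == left._1 &&   // dst.size(0) == sleft[0]
    dst._2 == right._2 &&  // dst.size(1) == sright[1]
    left._2 == right._1    // sleft[1]    == sright[0]

  def main(args: Array[String]): Unit = {
    // 50 flattened 28x28 images (784 values each) times a hypothetical 784x128 weight matrix.
    println(gemmShapesAgree(dst = (50, 128), left = (50, 784), right = (784, 128)))
    // Same product, but the weight matrix has the wrong number of rows (hypothetical).
    println(gemmShapesAgree(dst = (50, 128), left = (50, 784), right = (100, 128)))
  }
}
```

The first call satisfies all three equalities and the second fails on the inner dimension. In the reported run, some layer's input, weight, or output shape must have violated one of these equalities, but the log does not say which.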

Minimum reproducible example

danimateos/mxnet-playground@8836995

Steps to reproduce

  1. Clone https://github.com/danimateos/mxnet-playground/tree/master @ 8836995
  2. Download the MNIST data to data/, using for example the script in mxnet/
  3. sbt 'runMain playground.DigitsExample'

What have you tried to solve it?

I have set java environment variables as in the mxnet/scala-package/examples/scripts/run_gan_mnist.sh script from mxnet. I have set them in build.sbt using the following syntax:

// Needed for javaOptions
fork in run := true

javaOptions += "-Xmx4G"

envVars := Map(  
  "mnist-data-path" -> "data",
  "gpu" -> "1",
  "output-path" -> "out"
)
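
For comparison, here is a fuller build.sbt sketch in sbt 0.13 syntax. The dependency coordinates and project name are copied from this report; two other details are assumptions that nothing in this thread confirms. First, the artifact name carries a `_2.10` suffix while the project compiles with Scala 2.11.8, so the sketch pins `scalaVersion` to a 2.10.x release. Second, it sets `MXNET_ENGINE_TYPE` through `envVars`, which sbt applies only to forked runs. Neither is known to be the cause of the shape mismatch.

```scala
// build.sbt (sketch; see the assumptions stated above)
name := "mxnet-playground"

// Assumption: match the compiler to the _2.10 suffix of the MXNet artifact.
scalaVersion := "2.10.6"

libraryDependencies += "ml.dmlc.mxnet" % "mxnet-full_2.10-linux-x86_64-gpu" % "0.1.1"

// javaOptions and envVars only take effect when the run is forked.
fork in run := true

javaOptions += "-Xmx4G"

// Make the native engine synchronous while debugging; remove afterwards.
envVars := Map("MXNET_ENGINE_TYPE" -> "NaiveEngine")
```

Setting the variable in the build rather than with `export` keeps the debugging configuration next to the other run settings, at the cost of having to remember to delete it.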

I have also tried export MXNET_ENGINE_TYPE=NaiveEngine but the stack trace wasn't very helpful:

sbt 'runMain playground.DigitsExample'
[info] Loading global plugins from /home/dani/myconfigs/.sbt/0.13/plugins
[info] Set current project to mxnet-playground (in build file:/home/dani/repos/mxnet-playground/)
[info] Running playground.DigitsExample 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/dani/.ivy2/cache/ml.dmlc.mxnet/mxnet-full_2.10-linux-x86_64-cpu/jars/mxnet-full_2.10-linux-x86_64-cpu-0.1.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/dani/.ivy2/cache/org.slf4j/slf4j-log4j12/jars/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
log4j:WARN No appenders could be found for logger (MXNetJVM).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
[12:01:21] src/io/iter_mnist.cc:94: MNISTIter: load 60000 images, shuffle=1, shape=(50,1,28,28)
[12:01:21] src/engine/engine.cc:36: MXNet start using engine: NaiveEngine
[12:01:22] src/io/iter_mnist.cc:94: MNISTIter: load 10000 images, shuffle=1, shape=(50,1,28,28)
[12:01:22] /home/ubuntu/release/mxnet/dmlc-core/include/dmlc/logging.h:245: [12:01:22] /home/ubuntu/release/mxnet/mshadow/mshadow/./././dot_engine-inl.h:431: Check failed: dst.size(0) == sleft[0] && dst.size(1) == sright[1] && sleft[1] == sright[0] dot-gemm: matrix shape mismatch
[error] (run-main-0) ml.dmlc.mxnet.MXNetError: [12:01:22] /home/ubuntu/release/mxnet/mshadow/mshadow/./././dot_engine-inl.h:431: Check failed: dst.size(0) == sleft[0] && dst.size(1) == sright[1] && sleft[1] == sright[0] dot-gemm: matrix shape mismatch
ml.dmlc.mxnet.MXNetError: [12:01:22] /home/ubuntu/release/mxnet/mshadow/mshadow/./././dot_engine-inl.h:431: Check failed: dst.size(0) == sleft[0] && dst.size(1) == sright[1] && sleft[1] == sright[0] dot-gemm: matrix shape mismatch
	at ml.dmlc.mxnet.Base$.checkCall(Base.scala:108)
	at ml.dmlc.mxnet.Executor.forward(Executor.scala:162)
	at ml.dmlc.mxnet.DataParallelExecutorManager$$anonfun$forward$3.apply(Executor.scala:405)
	at ml.dmlc.mxnet.DataParallelExecutorManager$$anonfun$forward$3.apply(Executor.scala:404)
	at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
	at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
	at ml.dmlc.mxnet.DataParallelExecutorManager.forward(Executor.scala:404)
	at ml.dmlc.mxnet.Model$$anonfun$trainMultiDevice$1.apply$mcVI$sp(Model.scala:246)
	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
	at ml.dmlc.mxnet.Model$.trainMultiDevice(Model.scala:232)
	at ml.dmlc.mxnet.FeedForward.fit(FeedForward.scala:289)
	at ml.dmlc.mxnet.FeedForward.fit(FeedForward.scala:214)
	at ml.dmlc.mxnet.FeedForward$Builder.build(FeedForward.scala:555)
	at playground.DigitsExample$.delayedEndpoint$playground$DigitsExample$1(DigitsExample.scala:44)
	at playground.DigitsExample$delayedInit$body.apply(DigitsExample.scala:6)
	at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
	at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
	at scala.App$$anonfun$main$1.apply(App.scala:76)
	at scala.App$$anonfun$main$1.apply(App.scala:76)
	at scala.collection.immutable.List.foreach(List.scala:381)
	at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
	at scala.App$class.main(App.scala:76)
	at playground.DigitsExample$.main(DigitsExample.scala:6)
	at playground.DigitsExample.main(DigitsExample.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
[trace] Stack trace suppressed: run last compile:runMain for the full output.
java.lang.RuntimeException: Nonzero exit code: 1
	at scala.sys.package$.error(package.scala:27)
[trace] Stack trace suppressed: run last compile:runMain for the full output.
[error] (compile:runMain) Nonzero exit code: 1
[error] Total time: 2 s, completed Jan 5, 2017 12:01:22 PM
@yajiedesign
Contributor

This issue is closed due to lack of activity in the last 90 days. Feel free to reopen if this is still an active issue. Thanks!
