Long-running script messes up output after a while #5106
Comments
Is this from the juliareleases PPA? (Or just paste in the output from Oh wait, Debian. Did Debian get the final 0.2 release in their repositories? |
Yes, Debian unstable ships 0.2 release:
|
@maleadt, is it possible to reproduce this if you update to the latest from git? Understand that this may be difficult depending on how long "long-running" means. |
This is probably the same/similar intermittent slippery memory bug that still plagues the tests from time to time. |
The debugger would be nice to have here. If the crash dumps you into a debugger, there is some chance of figuring out what is going on. |
It might be enough to run this with MEMDEBUG and AddressSanitizer enabled (if you are willing to do that, I'll post instructions). |
I just tried running with a master checkout, but couldn't replicate the issue... However, the build differs quite a lot from the Debian-shipped one I use (system libraries, etc). If I can't replicate it I'll have a look at compiling master with the same build options from the Debian package. EDIT: scratch that, I couldn't replicate it with the Debian build. I'll post here when I have more information. |
@JeffBezanson could this be related: 4cb1c98 (should be cherry-picked for 0.2) |
I doubt it, since this crash was during a script. |
Actually, this seems more related: #5069 prior to that fix, |
That wasn't in the 0.2 release though. |
@JeffBezanson where is the GC root for such a thing if it could be JIT-compiled into the code? No, but it seems to be exactly the same error, and some tuples were not given GC roots even before your changes. |
Finally reproduced it on 47a1ae5 (master as of this morning). @loladiro How exactly do I build this debug version? I'm guessing CC=clang CFLAGS="-DMEMDEBUG -fsanitize=address"? |
I tried with the following
... but it fails to link
Any hints? |
If somebody can use a reproducible test case to trigger this, please ping me (I'll have to add you to a private repo). Also, you can find the valgrind output of that test case here: https://gist.github.com/rened/8472496 |
59bf492 seems to have fixed this! I can no longer reproduce this error. |
Oh, that's excellent news! |
That's a relief. It's definitely possible that that commit fixes this. |
If you do notice it happening again, please do reopen (or open a new issue). |
FYI: I got a strange Travis Error which displays the same |
I haven't been able to reproduce this bug for a while -- it has always been quite difficult -- so I can't confirm whether 59bf492 has changed anything... |
Something like this still happens on travis intermittently. |
FYI I got a similar error on |
This may or may not be relevant: on a fork of master with considerable changes in multi.jl, I am seeing my feature-specific tests fail or segfault, and each run fails differently.
|
Record backtrace generates these false positives because of the way it is designed. If you trap calls to jl_throw, you might get more info on the error. |
It looks like the Gadfly tests fail sometimes with the same error, but only occasionally and seemingly at random. Scroll to the bottom here: https://travis-ci.org/dcjones/Gadfly.jl/jobs/19685997 |
PackageEval hit this last night when testing Optim, julia v0.2.1+2. It does something similar to the original issue opener's script: it is long-running and shells out to run new copies of Julia on test scripts. |
Believed fixed by #6085. |
Woot! It seems a bit unlikely that this will be the last time we have a gc-unrooted bug. I really don't know the guts of the gc at all, but might it be possible to introduce a mode (through an optional compiler flag, perhaps set to true by |
I have a pretty long-running script (several hours), which is responsible for invoking executables (including `julia` itself again) and capturing their output. After some hours though, the parent process manages to mess up its output, crashing in some core part library:
This list of `<?::`'s goes on for quite a while (hundreds of repetitions), and then it stops. Afterwards, new `print` entries seem to work again: although I don't see the output on my screen, they do end up in the log file `STDOUT` is redirected to (see invocation command below). This might be a `tee` issue though.
I'm using julia 0.2 from the Debian repositories, and the script is launched as follows: