Corruption caused by ReadOnlyMemoryError test #11691

Closed
JeffBezanson opened this issue Jun 12, 2015 · 21 comments · Fixed by #14020

Comments

@JeffBezanson
Member

https://travis-ci.org/JuliaLang/julia/jobs/66569134 in file test
https://travis-ci.org/JuliaLang/julia/jobs/66559939 in file test
https://travis-ci.org/JuliaLang/julia/jobs/66554918 in llvmcall test

All crashed in different places.

@yuyichao
Contributor

All on 32-bit...

@carnaval
Contributor

If you try to reproduce the same environment, remember that the VM is 64-bit and cross-compiles for i686.

If no one beats me to it, I'll try to have a look in the afternoon.

@yuyichao
Contributor

I'm booting up my container but you'll probably beat me at finding the problem =)

@yuyichao
Contributor

I've got a segfault but it looks pretty random....

@yuyichao
Contributor

Hmm. Actually, it was quite repeatable. This is an out-of-bounds access. @mbauman

It happens on Array{UInt8, 1}[0x48, 0x65, 0x6c, 0x6c, 0x78, 0x20, 0x57, 0x6f, 0x72, 0x6c, 0x64][5] = 0x78

@carnaval
Contributor

Does --check-bounds=yes catch it?

@yuyichao
Contributor

I'll try that. Note that the access itself looks perfectly normal, though, so the corruption may be somewhere else...

Dump of assembler code for function julia_setindex!_6675:
   0xf6511130 <+0>:   push   %ebp
   0xf6511131 <+1>:   mov    %esp,%ebp
   0xf6511133 <+3>:   push   %ebx
   0xf6511134 <+4>:   push   %esi
   0xf6511135 <+5>:   call   0xf651113a <julia_setindex!_6675+10>
   0xf651113a <+10>:  pop    %ebx
   0xf651113b <+11>:  add    $0x1e50fe,%ebx
   0xf6511141 <+17>:  mov    0x10(%ebp),%ecx
   0xf6511144 <+20>:  mov    0x8(%ebp),%eax
   0xf6511147 <+23>:  lea    -0x1(%ecx),%edx
   0xf651114a <+26>:  cmp    0x4(%eax),%edx
   0xf651114d <+29>:  jae    0xf651115e <julia_setindex!_6675+46>
   0xf651114f <+31>:  mov    0xc(%ebp),%cl
   0xf6511152 <+34>:  mov    (%eax),%esi
=> 0xf6511154 <+36>:  mov    %cl,(%esi,%edx,1)
   0xf6511157 <+39>:  lea    -0x8(%ebp),%esp
   0xf651115a <+42>:  pop    %esi
   0xf651115b <+43>:  pop    %ebx
   0xf651115c <+44>:  pop    %ebp
   0xf651115d <+45>:  ret    
   0xf651115e <+46>:  mov    %esp,%edx
   0xf6511160 <+48>:  lea    -0x10(%edx),%esi
   0xf6511163 <+51>:  mov    %esi,%esp
   0xf6511165 <+53>:  mov    %ecx,-0x10(%edx)
   0xf6511168 <+56>:  sub    $0x4,%esp
   0xf651116b <+59>:  push   $0x1
   0xf651116d <+61>:  push   %esi
   0xf651116e <+62>:  push   %eax
   0xf651116f <+63>:  call   0xf64236f0 <jl_bounds_error_ints@plt>
   0xf6511174 <+68>:  add    $0x10,%esp
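
For readers following the dump: the `lea -0x1(%ecx),%edx` / `cmp 0x4(%eax),%edx` / `jae` sequence appears to be a single unsigned comparison that covers both ends of the 1-based bounds check before the store at `<+36>`. Below is a minimal Python sketch of that arithmetic, assuming 32-bit registers. The name `in_bounds_1based` is made up for illustration and is not part of Julia.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit register wraparound (assumption: i686 target)


def in_bounds_1based(i, length):
    """Return True when the 1-based index i is valid for `length` elements.

    Mirrors the shape of `lea -1(%ecx),%edx; cmp 4(%eax),%edx; jae <error>`:
    subtract one, then compare as unsigned. An index of 0 wraps around to
    0xFFFFFFFF, so one comparison rejects both i < 1 and i > length.
    """
    zero_based = (i - 1) & MASK32
    return zero_based < (length & MASK32)


print(in_bounds_1based(5, 11))   # True
print(in_bounds_1based(0, 11))   # False
print(in_bounds_1based(12, 11))  # False
```

Index 5 of an 11-element array passes this check, so the faulting store would not be a bounds violation in the Julia array sense.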

@yuyichao
Contributor

No, it doesn't. (This is in a precompiled function, though. Would the command-line option affect this?)

@yuyichao
Contributor

P.S. I said it was an out-of-bounds access because of this:

#0  0xf6511154 in julia_setindex!_6675 (
    x=<error reading variable: access outside bounds of object referenced via synthetic pointer>, i0=<optimized out>) at array.jl:321

Not necessarily out of bounds in the Julia array sense.

@yuyichao
Contributor

And it seems I haven't mentioned this yet: this is the file test on my gc-debug branch. It is 100% reproducible (and GC_VERIFY didn't complain about anything...).

@Keno
Member

Keno commented Jun 12, 2015

That GDB error doesn't mean much, so it's not necessarily an out of bounds error.

@carnaval
Contributor

I think you're hitting the mmap segfault handler test.
The value being stored, 0x78 = 120 = 'x' (see line 210 of test/file.jl).
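
To make that connection concrete, the sketch below decodes the bytes quoted earlier in the thread and then tries to store a byte into a read-only memory map. It uses Python's `mmap` module purely as a stand-in: Python rejects the write with a `TypeError` at the API level, whereas the Julia test in question relies on a real page fault that the signal handler turns into a `ReadOnlyMemoryError`. The helper name `write_to_readonly_map` and the temporary file are assumptions for illustration, not the contents of test/file.jl.

```python
import mmap
import tempfile

# Bytes quoted earlier in this thread.
data = bytes([0x48, 0x65, 0x6c, 0x6c, 0x78, 0x20, 0x57, 0x6f, 0x72, 0x6c, 0x64])
print(data.decode("ascii"))  # Hellx World
print(0x78, chr(0x78))       # 120 x


def write_to_readonly_map(payload, index, value):
    """Map a temp file read-only and try to store one byte into it.

    Returns True if the store was rejected. Hypothetical helper, not Julia's test.
    """
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            try:
                m[index] = value  # Julia's 1-based [5] is index 4 here
            except TypeError:
                return True
            return False


print(write_to_readonly_map(data, 4, 0x78))  # True
```

In other words, a fault at that store is the expected outcome of the test; the open question in this thread is what happens to process state after the fault is handled.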

@yuyichao
Contributor

You are right. I'll comment out that test and try again. Or is it possible that the signal handler does something bad? (#11003?)

@timholy
Member

timholy commented Jun 12, 2015

#6877 (comment)

@tkelman
Contributor

tkelman commented Jun 13, 2015

I think this is due to #11491. I've been getting the file segfault locally on win64 and just finished bisecting it down to a2b6943 (not sure why I see it locally but not on AppVeyor, though). As Tim remembered from last year, the one failure that was in llvmcall was on the same worker that had previously run the file test.

@quinnj
Member

quinnj commented Jun 13, 2015

#11491 is a particularly seg-faulty PR, though I was able to test it several times locally on Windows and OSX 64-bit. There isn't anything in there that should be 32/64-bit sensitive though AFAIK, so not sure...

@tkelman
Contributor

tkelman commented Jun 14, 2015

So do we need to comment out that test?

@yuyichao
Contributor

Is there a branch that removes these tests to see if Travis is indeed happy?

@tkelman
Contributor

tkelman commented Jun 14, 2015

Not yet? Go for it?

yuyichao added a commit that referenced this issue Jun 14, 2015
@yuyichao changed the title from "recent segfaults in tests" to "Corruption caused by ReadOnlyMemoryError test on 32-bit" Jun 14, 2015
@yuyichao added the "system:32-bit" label (Affects only 32-bit systems) Jun 14, 2015
@tkelman
Contributor

tkelman commented Jun 14, 2015

also happens on win64 btw, so it's not only a 32 bit issue - it just appears to be way less common on linux 64 (and mac 64?)

@yuyichao changed the title from "Corruption caused by ReadOnlyMemoryError test on 32-bit" to "Corruption caused by ReadOnlyMemoryError test" Jun 14, 2015
@yuyichao removed the "system:32-bit" label (Affects only 32-bit systems) Jun 14, 2015
@yuyichao
Contributor

32-bit tag removed (also from the title...). #11003 still makes me suspect LLVM is doing something bad (if I understand that issue correctly).

yuyichao added a commit that referenced this issue Jun 28, 2015
…o cause some corruption on 32bit linux #11691"

This reverts commit 224829e.
yuyichao added a commit that referenced this issue Nov 29, 2015
…o cause some corruption on 32bit linux #11691"

This reverts commit 224829e.

(cherry picked from commit c65bff9)