as.matrix seg fault #1460

Open
paciorek opened this issue Jun 3, 2024 · 1 comment

paciorek commented Jun 3, 2024

A user reported a seg fault at the end of MCMC:

"""
I am having an "caught segfault" error on the execution of the MCMC. After a few hours of the execution of the first chain the following error pops up.

*** caught segfault ***
address 0x7fcfa3bc8240, cause 'memory not mapped'

Traceback:
1: as.matrix.CmodelValues(mcmc$mvSamples)
2: as.matrix(mcmc$mvSamples)
3: runMCMC(cMy.MCMC, niter = 120000, nburnin = 80000, nchains = 3, summary = TRUE)
An irrecoverable exception occurred. R is aborting now ...
/appl/opt/csc-cli-utils/bin/singularity_wrapper: line 42: 1711121 Segmentation fault apptainer --silent exec $SING_FLAGS $SING_IMAGE "${@:2}"
"""

I have reproduced this with the user's code (side note: it doesn't seem to be related to using Singularity). However, it doesn't occur with shorter runs, and the full run (120k iterations, 80k burn-in) takes something like 1.5 days, so I am still trying to track it down. When browsing in as.matrix.CmodelValues, the seg fault occurs the first time that fastMatrixInsert inserts a single column, though I don't know if that is related to the problem.

This is of course heavily used code, so the failure is quite curious. At the moment I'm trying to see if there is anything odd about the actual values being inserted in that single column.

paciorek commented Jun 4, 2024

A couple of other tidbits:

  1. The user subsequently reported that the problem went away when they increased the available memory. That is odd, since a seg fault is not normally a symptom of running out of memory, though I have a vague feeling I have encountered such behavior before. Perhaps with more memory there is less chance that a write to unallocated memory causes a problem.
  2. However, when I ran on a machine with a ton of memory I could still reproduce the problem.
  3. I stopped the R code just before the seg-faulting call to fastMatrixInsert. All the dimensions looked fine as did the values in the matrices.
  4. I inserted some print statements into fastMatrixInsert. Sure enough, I didn't get the seg fault.

So I suspect that if I invoke the C++ debugger I won't see the error, but that is the next thing to try.
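
Since print statements (and possibly a debugger) seem to perturb the failure, another option is to make the insert itself fail loudly when an index is out of range, rather than relying on the write to crash. The sketch below is only an illustration of that idea: `insertColumnChecked` is a hypothetical helper, not nimble's `fastMatrixInsert`, and it assumes a column-major matrix of doubles stored in a `std::vector`.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (NOT nimble's fastMatrixInsert): copy `col` into column
// `colIndex` of a column-major nrow x ncol matrix, starting at row `rowStart`.
// Throws instead of writing outside the target buffer.
void insertColumnChecked(std::vector<double>& target,
                         std::size_t nrow, std::size_t ncol,
                         const std::vector<double>& col,
                         std::size_t rowStart, std::size_t colIndex) {
    if (target.size() != nrow * ncol)
        throw std::invalid_argument("target size does not match nrow * ncol");
    if (colIndex >= ncol)
        throw std::out_of_range("column index out of range");
    if (rowStart > nrow || col.size() > nrow - rowStart)
        throw std::out_of_range("rows to insert exceed matrix rows");

    // Offsets are computed in std::size_t, so a large nrow * colIndex product
    // does not wrap the way a 32-bit int offset could.
    const std::size_t offset = colIndex * nrow + rowStart;
    for (std::size_t i = 0; i < col.size(); ++i)
        target[offset + i] = col[i];
}
```

With checks like these, a bad dimension or offset would surface as a deterministic exception at the point of the insert, whether or not the stray write happens to land on unmapped memory in a given run.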
