Bug: OSError: memory map must have a non-zero length in load_to_np_impl #70

Open
mrvollger opened this issue Jan 24, 2023 · 1 comment

@mrvollger
Contributor

mrvollger commented Jan 24, 2023

Hi @38 and @arq5x,

I am getting an error when trying to open a d4 matrix in pyd4:

OSError: memory map must have a non-zero length

I have tried remaking the input file a few times, but I keep getting this error. Interestingly, if I use the command-line tool d4tools, I get no error when accessing the same region. I have also used the same Python code successfully on three other samples, but it fails on this one, so I am at a bit of a loss.

I include details and inputs below, thanks in advance!

Details:
Here is the full traceback of the error:

python test.d4.py 
Traceback (most recent call last):
  File "/mmfs1/gscratch/stergachislab/mvollger/projects/GM12878_aCRE_2022-08-16/test.d4.py", line 14, in <module>
    matrix["chr1", 0, 1000]
  File "/mmfs1/gscratch/stergachislab/mvollger/miniconda3/envs/fiberseq-smk/lib/python3.9/site-packages/pyd4/__init__.py", line 100, in __getitem__
    data = [track[key] for track in self.tracks]
  File "/mmfs1/gscratch/stergachislab/mvollger/miniconda3/envs/fiberseq-smk/lib/python3.9/site-packages/pyd4/__init__.py", line 100, in <listcomp>
    data = [track[key] for track in self.tracks]
  File "/mmfs1/gscratch/stergachislab/mvollger/miniconda3/envs/fiberseq-smk/lib/python3.9/site-packages/pyd4/__init__.py", line 430, in __getitem__
    return self.load_to_np(key)
  File "/mmfs1/gscratch/stergachislab/mvollger/miniconda3/envs/fiberseq-smk/lib/python3.9/site-packages/pyd4/__init__.py", line 513, in load_to_np
    return self._for_each_region(regions, load_to_np_impl)
  File "/mmfs1/gscratch/stergachislab/mvollger/miniconda3/envs/fiberseq-smk/lib/python3.9/site-packages/pyd4/__init__.py", line 454, in _for_each_region
    ret.append(func(name, begin, end))
  File "/mmfs1/gscratch/stergachislab/mvollger/miniconda3/envs/fiberseq-smk/lib/python3.9/site-packages/pyd4/__init__.py", line 507, in load_to_np_impl
    self.load_values_to_buffer(name, begin, end, buf_addr)
OSError: memory map must have a non-zero length

But when I access the same region with d4tools it works fine:

$ d4tools view results/Phased_GM12878_pat/fdr.coverages.d4 chr1:0-100000 | head
chr1    0       10000   0       0       0       0       0       0       0       0       0       0       0       0
chr1    10000   10001   0       0       0       1       0       0       0       0       0       1       5       13
chr1    10001   10003   0       0       0       1       0       0       0       0       0       1       5       14
chr1    10003   10009   0       0       0       1       0       0       0       0       0       1       5       16
chr1    10009   10012   0       0       0       1       0       0       0       0       0       1       4       17
chr1    10012   10014   0       0       0       1       0       0       0       0       0       0       4       18
chr1    10014   10031   0       0       0       1       0       0       0       0       0       0       3       19
chr1    10031   10032   0       0       0       1       0       0       0       0       0       0       4       18
chr1    10032   10033   0       0       0       1       0       0       0       0       0       0       3       19
chr1    10033   10043   0       0       0       1       0       0       0       0       0       0       2       20
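
Until the mapped path works, one possible stopgap is to read the region through `d4tools view` and expand its interval rows into per-base values. The following is only a sketch: `parse_d4tools_view` and `view_region` are hypothetical helper names, and it assumes the output layout shown above (chromosome, start, end, then one value per track, whitespace-separated).

```python
import subprocess


def parse_d4tools_view(text, begin, end):
    """Expand interval rows into one list of per-base values per track.

    Assumes each row is: chrom, start, end, then one value per track.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if not rows:
        return []
    n_tracks = len(rows[0]) - 3  # columns after chrom, start, end
    out = [[0] * (end - begin) for _ in range(n_tracks)]
    for fields in rows:
        start, stop = int(fields[1]), int(fields[2])
        # Clip the interval to the requested window.
        lo, hi = max(start, begin), min(stop, end)
        if lo >= hi:
            continue
        values = [int(v) for v in fields[3:]]
        for t in range(n_tracks):
            for pos in range(lo, hi):
                out[t][pos - begin] = values[t]
    return out


def view_region(path, chrom, begin, end):
    """Hypothetical wrapper: run `d4tools view` and parse its output."""
    region = f"{chrom}:{begin}-{end}"
    result = subprocess.run(
        ["d4tools", "view", path, region],
        capture_output=True, text=True, check=True,
    )
    return parse_d4tools_view(result.stdout, begin, end)
```

This is much slower than a working `matrix[...]` call for large regions, so it is only meant as a way to keep going while the bug is open.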

Here is a link to the file:
https://eichlerlab.gs.washington.edu/help/mvollger/tracks/fiberseq/fdr.coverages.d4
And here is the Python code that gives the error:

import pyd4
import sys
import logging
import os

in_d4 = "./results/Phased_GM12878_pat/fdr.coverages.d4"
logging.info(f"Reading in d4 file: {in_d4}")
file = pyd4.D4File(in_d4)
logging.info(f"Opened d4 file: {in_d4}")
chroms = file.chroms()
matrix = file.open_all_tracks()  # one matrix spanning every track in the file
track_names = matrix.track_names
logging.info("Trying to open d4 matrix")
matrix["chr1", 0, 100000]  # raises the OSError shown in the traceback above
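
Because the traceback shows the matrix reading every track in a single list comprehension, one failing track is enough to abort the whole query. A rough way to narrow it down might be to query the tracks one at a time. This sketch relies only on `matrix.tracks` (seen in the traceback) and `matrix.track_names` (used above), and it assumes the two are in the same order; `find_failing_tracks` is a hypothetical helper, not part of pyd4.

```python
def find_failing_tracks(matrix, region):
    """Query each track separately and collect the ones that raise OSError.

    `region` is a key such as ("chr1", 0, 100000).
    Assumes matrix.track_names and matrix.tracks share the same order.
    """
    failures = {}
    for name, track in zip(matrix.track_names, matrix.tracks):
        try:
            track[region]
        except OSError as err:
            failures[name] = str(err)
    return failures
```

Calling `find_failing_tracks(matrix, ("chr1", 0, 100000))` on the matrix above should then return a dict mapping each failing track name to its error message, or an empty dict if every track loads.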
38 added a commit that referenced this issue Feb 20, 2023
@38
Owner

38 commented Feb 20, 2023

Thanks for reporting the issue. It seems this is a bug in the mapped IO interface. d4tools view doesn't hit it because it uses the streamed IO instead. I've committed a potential fix to the repo; please let me know whether the latest commit solves your issue.

Thanks!
Hao
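
For readers unfamiliar with the mapped versus streamed distinction, the difference can be illustrated in general terms with Python's standard library. This is not pyd4 or d4 code, just an analogy under the assumption that some region being mapped has zero length: a streamed read of empty data returns an empty result, while a memory map of zero length is rejected. The helper names are hypothetical.

```python
import mmap
import os
import tempfile


def read_streamed(path):
    """Streamed IO: reading an empty file just yields empty bytes."""
    with open(path, "rb") as handle:
        return handle.read()


def read_mapped(path):
    """Mapped IO: length 0 means 'map the whole file'."""
    with open(path, "rb") as handle:
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[:]


def compare_on_empty_file():
    """Return (streamed bytes, name of the error raised by the mapped read)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)  # leave a zero-length file on disk
    try:
        streamed = read_streamed(path)
        try:
            read_mapped(path)
            mapped_error = None
        except (ValueError, OSError) as err:
            mapped_error = type(err).__name__
        return streamed, mapped_error
    finally:
        os.remove(path)
```

The same kind of mismatch would explain why `d4tools view` reads the region without complaint while the pyd4 call fails.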
