running megalodon with all context DNA methylation models #319

Open
AlineMuyle opened this issue Aug 29, 2022 · 3 comments

Comments

@AlineMuyle

I am using Megalodon to infer 5mC in all contexts (CG, CHG and CHH). After some investigation, it seems that Megalodon version 2.5.0 with Guppy basecall server 6.2.1 does not work with older Rerio models such as res_dna_r941_min_modbases-all-context_v001.cfg or res_dna_r941_min_modbases_5mC_v001.cfg.

This is a known issue that I have seen in several posts, such as #292 and https://bytemeta.vip/repo/nanoporetech/megalodon/issues/266
The usual workaround is to fall back to older versions (Megalodon v2.4.2 + Guppy 5.0.16) to get the analyses to run.

I would like to use the latest versions of both programs. Could you please make the changes needed for an all-context model to work with the current versions?

For your information, my current installation works with the CG-context-only model dna_r9.4.1_450bps_fast.cfg, but with all-context models the job fails. More details are below.

I use the following configuration:

  • Python 3.9.5
  • guppy3 3.1.2
  • ont-pyguppy-client-lib 6.2.1
  • Guppy basecall server 6.2.1
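For anyone comparing setups, the Python-side versions listed above can be collected in one step. This is a minimal sketch using only the standard library. The distribution names passed in are assumptions taken from the list above (plus `megalodon`), and a package that is not installed is reported as None instead of raising an error. The Guppy basecall server is a separate binary and is not covered.

```python
from importlib import metadata


def installed_versions(names):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions based on the list in this report.
    names = ["megalodon", "ont-pyguppy-client-lib", "guppy3"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```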

And the following command line:
megalodon fast5/ \
  --guppy-params "-d ./rerio/basecall_models/" \
  --guppy-config res_dna_r941_min_modbases-all-context_v001.cfg \
  --mod-motif m C 0 \
  --guppy-server-path ./ont-guppy_6.2.1/bin/guppy_basecall_server \
  --output-directory ./male_megalodon_results_all_contexts --overwrite --mod-output-formats bedmethyl \
  --outputs basecalls mods mod_basecalls per_read_mods mod_mappings mappings \
  --reference scaffolds.fasta \
  --devices 0 1 \
  --processes 44 \
  --guppy-concurrent-reads 40 \
  --guppy-timeout 120 \
  --output-directory output-dir \
  --num-read-enumeration-threads 1 \
  --num-extract-signal-processes 2
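One thing worth checking in a long command like the one above is whether any option is given more than once: `--output-directory` appears twice here. With Python's argparse defaults the last value would silently win, but whether Megalodon behaves that way is an assumption, not something verified in this thread. A small sketch for spotting repeats, using only the standard library (the function name is illustrative):

```python
import shlex
from collections import Counter


def repeated_options(command):
    """Return the sorted long options (--name) that occur more than once.

    Note: a quoted value that itself starts with "--" would also be counted.
    """
    tokens = shlex.split(command)
    counts = Counter(tok.split("=", 1)[0] for tok in tokens if tok.startswith("--"))
    return sorted(opt for opt, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Shortened version of the command from this report.
    cmd = (
        "megalodon fast5/ "
        '--guppy-params "-d ./rerio/basecall_models/" '
        "--output-directory ./male_megalodon_results_all_contexts --overwrite "
        "--reference scaffolds.fasta "
        "--output-directory output-dir"
    )
    print(repeated_options(cmd))
```

Run on the shortened command, this reports `--output-directory` as the only repeated option.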

The job fails and I get the following guppy log:

2022-08-22 06:31:06.579141 [guppy/message] ONT Guppy basecall server software version 6.2.1+6588110, client-server API version 11.0.0, minimap2 version 2.22-r1101
log path: /home/amuyle/male_megalodon_results_all_contexts/guppy_log
chunk size: 2000
chunks per runner: 512
max queued reads: 2000
num basecallers: 4
num socket threads: 2
max returned events: 50000
gpu device: cuda:0 cuda:1
kernel path:
runners per device: 4
Use of this software is permitted solely under the terms of the end user license agreement (EULA).By running, copying or accessing this software, you are demonstrating your acceptance of the EULA.
The EULA may be found in /home/amuyle/ont-guppy_6.2.1/bin
2022-08-22 06:31:06.579653 [guppy/info] crashpad_handler not supported on this platform.
2022-08-22 06:31:06.580498 [guppy/info] Listening on port ipc:///tmp/ddf6-cc72-0497-139e.
2022-08-22 06:31:08.677890 [guppy/message]
Config loaded:
config file: /home/amuyle/ont-guppy_6.2.1/data/res_dna_r941_min_modbases_5mC_v001.cfg
model file: /home/amuyle/ont-guppy_6.2.1/data/res_dna_r941_min_modbases_5mC_v001.jsn
model version id None
adapter scaler model file: /home/amuyle/ont-guppy_6.2.1/data/adapter_scaling_dna_r9.4.1_min.jsn
2022-08-22 06:31:08.860612 [guppy/info] CUDA device 0 (compute 7.0) initialised, memory limit 16945709056B (16623796224B free)
2022-08-22 06:31:08.960819 [guppy/info] CUDA device 1 (compute 7.0) initialised, memory limit 16945709056B (16623796224B free)
2022-08-22 06:31:08.967429 [guppy/info] lamp_arrangements arrangement folder not found: /home/amuyle/ont-guppy_6.2.1/data/read_splitting/lamp_arrangements
2022-08-22 06:31:09.126687 [guppy/message] Starting server on port: ipc:///tmp/ddf6-cc72-0497-139e
2022-08-22 06:31:09.139738 [guppy/info] client connection request. ["res_dna_r941_min_modbases_5mC_v001:>timeout_interval=15000>client_name=>barcode_kits=>detect_barcodes=0>move_and_trace_enabled=1>post_out=1"]
2022-08-22 06:31:09.141183 [guppy/info] New client connected Client 1 anonymous_client_1 id: 0a549260-51c4-4d0b-9088-5040248d105e (connection string = 'res_dna_r941_min_modbases_5mC_v001:>timeout_interval=15000>client_name=>barcode_kits=>detect_barcodes=0>move_and_trace_enabled=1>post_out=1').
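The `Config loaded:` block in the log above names `res_dna_r941_min_modbases_5mC_v001.cfg`, whereas the command line requests `res_dna_r941_min_modbases-all-context_v001.cfg`. The excerpt may simply come from a different run, but it can be worth confirming which config a given run actually loaded. A minimal sketch, assuming the log keeps the `config file:` line format shown above; the function names are illustrative and not part of Guppy or Megalodon:

```python
import re
from pathlib import PurePosixPath

# Matches lines of the form "config file: /path/to/name.cfg".
CONFIG_LINE = re.compile(r"^[ \t]*config file:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def loaded_configs(log_text):
    """Return the base names of every config file reported in the log text."""
    return [PurePosixPath(path).name for path in CONFIG_LINE.findall(log_text)]


def config_matches(log_text, requested):
    """True if the requested config name is among those the log reports."""
    return requested in loaded_configs(log_text)


if __name__ == "__main__":
    # Two lines copied from the log excerpt in this report.
    sample = (
        "Config loaded:\n"
        "config file: /home/amuyle/ont-guppy_6.2.1/data/res_dna_r941_min_modbases_5mC_v001.cfg\n"
        "model file: /home/amuyle/ont-guppy_6.2.1/data/res_dna_r941_min_modbases_5mC_v001.jsn\n"
    )
    print(loaded_configs(sample))
    print(config_matches(sample, "res_dna_r941_min_modbases-all-context_v001.cfg"))
```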

@marcus1487
Collaborator

This is likely an issue with the interface between Megalodon and Guppy. As Megalodon is being deprecated, I would recommend trying the latest Remora all-context model with Bonito. The next release of Remora will make model conversion much easier, so that the all-context model can be used with Guppy as well (and with Dorado in the near future).

@AlineMuyle
Author

Could you please explain what you mean by the latest Remora all-context model?
I just reinstalled Remora, and the command 'remora model list_pretrained' only shows CG models.

@AlineMuyle
Author

I think I found it!
https://github.com/nanoporetech/rerio#remora-models
Thank you for your help
