Hello, I am attempting to run Megalodon for 5hmC calling on hundreds of cancer nanopore samples, and so far I have gotten a couple of runs to work. However, each whole-genome run has taken about a week to complete. I am wondering which parameters would be ideal for optimizing Megalodon's efficiency. Here are the current run parameters I am using:

#SBATCH --mem-per-cpu=64gb
#SBATCH --gres=gpu:2

megalodon <path_to_fast5s_folder> \
    --guppy-server-path <path_to_guppy_6.38_server> \
    --guppy-config dna_r9.4.1_450bps_sup_prom.cfg \
    --reference <path_to_reference> \
    --remora-modified-bases dna_r9.4.1_e8 sup 0.0.0 5hmc_5mc CG 0 \
    --device 0 1 \
    --outputs per_read_mods basecalls \
    --chunk-size 500 \
    --max-concurrent-chunks 100

Currently, I have tried using 1-10 fast5 files per run, separating each set of fast5 files into its own folder to cover the whole genome. I am doing these runs on a local cluster of GPUs with fairly limited resources. Please let me know if you think there are any flaws with this approach and what I could alter to improve efficiency. I know there are the fast Remora models, but ideally I don't want to compromise any accuracy. Let me know what your thoughts are, thanks!
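Since each fast5 folder is an independent input, one common pattern for this kind of setup (a hedged sketch, not something from the Megalodon docs) is to submit the per-folder runs as a SLURM job array so they execute in parallel as GPUs become available. The folder naming scheme, array size, and output directory names below are assumptions you would need to adapt to your cluster:

#!/bin/bash
# Hypothetical SLURM array: one Megalodon run per fast5 folder,
# one GPU per task. Assumes folders named fast5_batch_000 ... fast5_batch_099.
#SBATCH --job-name=megalodon-array
#SBATCH --array=0-99
#SBATCH --gres=gpu:1
#SBATCH --mem-per-cpu=64gb

# Pick this task's input folder from the array index.
BATCH_DIR=$(printf "fast5_batch_%03d" "${SLURM_ARRAY_TASK_ID}")

# --device 0 refers to the single GPU SLURM allocates to this task
# (SLURM restricts visible devices per task via CUDA_VISIBLE_DEVICES).
megalodon "${BATCH_DIR}" \
    --guppy-server-path <path_to_guppy_6.38_server> \
    --guppy-config dna_r9.4.1_450bps_sup_prom.cfg \
    --reference <path_to_reference> \
    --remora-modified-bases dna_r9.4.1_e8 sup 0.0.0 5hmc_5mc CG 0 \
    --device 0 \
    --outputs per_read_mods basecalls \
    --output-directory "megalodon_out_${SLURM_ARRAY_TASK_ID}" \
    --chunk-size 500 \
    --max-concurrent-chunks 100

With many small independent jobs, the scheduler can backfill them onto idle GPUs, which usually beats one long serial run on a fixed pair of devices.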
I would recommend using Guppy or Dorado for modified base calling going forward. Megalodon is no longer being supported, and you are likely to get much better performance from the production basecallers, where the Remora models have been integrated and optimized. If something is missing from the production basecallers' outputs in terms of modified base support, please raise those issues there.
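For reference, a minimal sketch of the equivalent Dorado workflow (my assumptions, not an official recipe: a recent Dorado with pod5 input, and the r9.4.1 sup model with a 5mCG_5hmCG modified-base model available; check dorado download --list for the exact model names for your chemistry):

# Convert existing fast5s to pod5 (requires the pod5 package: pip install pod5)
pod5 convert fast5 ./fast5s/ --output reads.pod5

# Fetch the basecalling model into the current directory.
dorado download --model dna_r9.4.1_e8_sup@v3.3

# Sup-accuracy basecalling with 5mC/5hmC calls in CpG context, aligned to the
# reference; output is a modBAM with MM/ML tags on stdout.
dorado basecaller dna_r9.4.1_e8_sup@v3.3 reads.pod5 \
    --modified-bases 5mCG_5hmCG \
    --reference <path_to_reference> > calls.bam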