--cov_corr value #19

palomo11 · 2018-02-26T19:27:20Z

Hi,

I'm using RefineM with my genomes and 24 samples.

In the readme, it is suggested that:

"If you have more than 6 data point (i.e. BAM files) comprising your coverage profiles you may wish to consider using the coverage correlation criteria (--cov_corr) instead of or in addition to this absolute error criteria"

What value would you recommend? 0.95?
Is it better to combine both or just use cov_corr?

In addition, could you explain a bit how exactly the cov_perc and the cov_corr are calculated?

Thank you very much in advance.

donovan-h-parks · 2018-02-26T20:24:43Z

Hello.

I'm not sure about the best threshold to use. I haven't had a chance to play with data where this filtering is relevant. It really depends on how conservative you want to be. My gut feeling is something a bit more lenient than 0.95 though. Perhaps 0.8???

The median coverage of a bin is the median across all contigs comprising a bin.

cov_corr: Pearson's r between the median coverage of a bin in each sample vs. the coverage of each individual contig.

cov_perc: This is the mean absolute percent error in coverage between a contig and the median coverage of a bin, taken over all samples. For each sample this is given by: abs(coverage_contig - median_coverage_bin) * 100 / median_coverage_bin.
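
As a rough illustration of the two statistics described above (Pearson's r against the bin's median coverage profile, and the mean absolute percent error that the question calls cov_perc), here is a minimal Python sketch. It is not RefineM's actual code: the function names, the dictionary layout and the toy coverage values are all made up for the example, and it assumes every contig has one coverage value per sample.

```python
from math import sqrt
from statistics import mean, median


def bin_median_profile(contig_cov):
    """Median coverage of the bin in each sample, taken across all contigs.

    contig_cov maps a contig id to its list of per-sample coverages
    (hypothetical layout chosen for this sketch).
    """
    n_samples = len(next(iter(contig_cov.values())))
    return [median(cov[i] for cov in contig_cov.values()) for i in range(n_samples)]


def pearson_r(x, y):
    """Pearson's r between two equal-length lists; NaN if either is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def mean_abs_perc_error(contig, bin_profile):
    """Mean over samples of abs(contig - bin median) * 100 / bin median.

    Assumes the bin median is non-zero in every sample.
    """
    return mean(abs(c - m) * 100 / m for c, m in zip(contig, bin_profile))


# Toy bin: three contigs, four samples (invented numbers).
coverage = {
    "contig_1": [10, 20, 30, 40],
    "contig_2": [12, 22, 28, 44],
    "contig_3": [40, 10, 35, 5],
}

profile = bin_median_profile(coverage)
stats = {
    name: (pearson_r(cov, profile), mean_abs_perc_error(cov, profile))
    for name, cov in coverage.items()
}

for name, (r, perc) in stats.items():
    print(f"{name}: r = {r:.3f}, mean abs. % error = {perc:.1f}")
```

In this toy bin, contig_3 rises and falls opposite to the other two contigs, so its correlation with the bin profile is negative and it would be flagged by any positive correlation threshold, whether 0.8 or 0.95.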
