I am writing to ask about an issue I encountered when running PGAP on my genome data. As part of the pipeline, PGAP reports CheckM results. However, the CheckM results reported by PGAP differ substantially from those I obtained by running CheckM standalone on the same genome: PGAP reports 49.85% completeness, whereas standalone CheckM reports 94.51%.
PGAP command: ./pgap.py -n -o /result -g fasta/1C88.fasta -s "Staphylococcus epidermidis" --no-internet --debug --docker singularity --container-path ./pgap.sif --ignore-all-errors
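PGAP result:

| Bin Id | Marker lineage | # genomes | # markers | # marker sets | 0 | 1 | 2 | 3 | 4 | 5+ | Completeness | Contamination | Strain heterogeneity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| annotation | Staphylococcus epidermidis (6) | 20 | 933 | 208 | 467 | 465 | 0 | 1 | 0 | 0 | 49.85 | 0.32 | 33.33 |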
CheckM command: checkm analyze staphylococcus_epidermidis.ms ./1C88_bins ./1C88_staphylococcus_outputtttt -x fna -t 30
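CheckM result:

| Bin Id | Marker lineage | # genomes | # markers | # marker sets | 0 | 1 | 2 | 3 | 4 | 5+ | Completeness | Contamination | Strain heterogeneity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1C88_PGAP | Staphylococcus epidermidis (6) | 20 | 933 | 208 | 53 | 873 | 5 | 2 | 0 | 0 | 94.51 | 1.12 | 18.18 |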
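As a rough cross-check of the two CheckM tables above, the copy-number columns (0, 1, 2, 3, 4, 5+) can be summarized directly. The sketch below is not how CheckM computes completeness: as far as I understand, CheckM weights markers by collocated marker sets, so its reported values (49.85 and 94.51) differ slightly from the plain fraction computed here. The names `RUNS` and `marker_summary` are my own, for illustration only.

```python
# Copy-number columns copied from the two CheckM tables above.
# Key "0" means the marker was not found; "5+" means five or more copies.
RUNS = {
    "PGAP": {"0": 467, "1": 465, "2": 0, "3": 1, "4": 0, "5+": 0},
    "standalone": {"0": 53, "1": 873, "2": 5, "3": 2, "4": 0, "5+": 0},
}
TOTAL_MARKERS = 933  # "# markers" column, identical in both runs


def marker_summary(counts, total=TOTAL_MARKERS):
    """Return (percent of markers found at least once, markers found in 2+ copies).

    This is a plain per-marker fraction, NOT CheckM's marker-set-weighted
    completeness, so it only approximates the reported values.
    """
    # Sanity check: the copy-number columns should account for every marker.
    assert sum(counts.values()) == total
    found = total - counts["0"]
    multi_copy = sum(n for key, n in counts.items() if key not in ("0", "1"))
    return 100.0 * found / total, multi_copy


for name, counts in RUNS.items():
    pct, multi_copy = marker_summary(counts)
    print(f"{name}: {pct:.2f}% of markers found, {multi_copy} multi-copy")
```

With these inputs, the PGAP run is missing 467 of the 933 markers, while the standalone run is missing only 53. The difference therefore appears to lie in how many markers were detected, not in how the scores were derived from the counts.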
The genome data I used for both analyses is identical. In addition, GC content, total genome size, N50, and gene count show no notable differences from the NCBI complete reference genome.
I also ran CheckM2, with the following result:

| Name | Completeness | Contamination | Completeness_Model_Used | Translation_Table_Used | Coding_Density | Contig_N50 | Average_Gene_Length | Genome_Size | GC_Content | Total_Coding_Sequences | Total_Contigs | Max_Contig_Length | Additional_Notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1C88_PGAP | 81.25 | 6.64 | Neural Network (Specific Model) | 11 | 0.797 | 2487636 | 199.0716743119266 | 2589755 | 0.32 | 3488 | 5 | 2487636 | None |

Are there any specific parameters or configurations in PGAP's implementation of CheckM that could explain the observed differences?
Could this discrepancy be related to differences in the CheckM database versions or other factors?
Your guidance on this matter would be greatly appreciated. Please let me know if you require any additional details about the genome or the analysis setup.
Thank you for your support.