BAF from BAM #18
Comments
Following up on this with more detail. We currently generate BAF in
For these reasons, it would be a huge improvement to generate BAF directly from the CRAMs. Since BAF is only used for random forest training in
Some of this is dependent on this GATK PR, which adds sample metadata to
The current BAF calculation relies on the genotype specified in the VCF (the _is_het function). We'll need some way of classifying samples as hets from DP and AD alone.
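One possible way to call hets from DP and AD alone is sketched below. This is an illustration, not the pipeline's existing _is_het logic: the function name `classify_het`, the minimum depth of 10, and the 0.2 to 0.8 allele-fraction window are all assumptions chosen for the example and would need tuning against real data.

```python
def classify_het(ad_ref, ad_alt, dp=None, min_depth=10, min_baf=0.2, max_baf=0.8):
    """Guess whether a site is heterozygous using only read depths.

    ad_ref, ad_alt: reads supporting the reference and alternate alleles (AD).
    dp: total depth (DP); falls back to ad_ref + ad_alt when not given.
    Thresholds are illustrative assumptions, not values from the pipeline.
    """
    informative = ad_ref + ad_alt
    depth = dp if dp is not None else informative
    # Too little coverage, or no allele-supporting reads: do not call a het.
    if depth < min_depth or informative == 0:
        return False
    # B-allele frequency: fraction of allele-supporting reads that are alt.
    baf = ad_alt / informative
    return min_baf <= baf <= max_baf
```

A binomial test against an expected allele fraction of 0.5 could replace the fixed window if a statistically grounded cutoff is preferred.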
Update sv_pipeline_docker.yml
Addressed in #351
BAF generation could be improved by running GATK's CollectAllelicCounts on sample BAMs rather than deriving it from gVCFs/VCFs. The gVCF/VCF route is not only more costly but also operationally more complicated (e.g., see #8). I propose collecting only at gnomAD SNP sites with >0.1% AF.
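As a rough sketch of the proposal, the snippet below keeps only sites whose allele frequency exceeds 0.1% and computes BAF from per-site ref/alt read counts. The record layout (dicts with `af`, `ref_count`, `alt_count` keys) and the function names are hypothetical, used only for illustration; they do not reflect the gnomAD file format or the exact CollectAllelicCounts output.

```python
AF_THRESHOLD = 0.001  # 0.1% allele frequency, as proposed above


def select_sites(sites, min_af=AF_THRESHOLD):
    """Keep sites whose population allele frequency is strictly above min_af."""
    return [s for s in sites if s["af"] > min_af]


def baf(ref_count, alt_count):
    """B-allele frequency from read counts; None when the site has no coverage."""
    total = ref_count + alt_count
    return alt_count / total if total else None


# Hypothetical per-site records for one sample.
sites = [
    {"chrom": "chr1", "pos": 1000, "af": 0.0500, "ref_count": 12, "alt_count": 12},
    {"chrom": "chr1", "pos": 2000, "af": 0.0005, "ref_count": 20, "alt_count": 0},
    {"chrom": "chr1", "pos": 3000, "af": 0.0020, "ref_count": 0, "alt_count": 0},
]

kept = select_sites(sites)
bafs = [baf(s["ref_count"], s["alt_count"]) for s in kept]
```

Restricting collection to common sites keeps the site list small while still sampling positions where a given sample has a reasonable chance of being heterozygous.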