AddFlowBaseQuality tool #8235
@dror27 - can you please annotate this tool as belonging to the FlowBasedTools group?
We did this for FlowFeatureMapper.
src/main/java/org/broadinstitute/hellbender/tools/walkers/groundtruth/AddFlowBaseQuality.java
@ilyasoifer - I have addressed your comments. Please review.
@meganshand - can you take a look please? It is a nice tool that converts the indel qualities to base qualities. Sometimes people are interested in that.
final double[][] errorProbBands = extractErrorProbBands(fbRead, minErrorRate);
final double[] result = new double[fbRead.getBasesNoCopy().length];

// loop over hmers via flow key
This section isn't clear to me. Can you add some documentation about the flow key? I think I'm just missing the structure of the flow error probabilities. You can either add some more comments here, or just point the reader to something if it already exists elsewhere.
Added a javadoc link to FlowBasedRead#flowMatrix where the flow probabilities are described.
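For readers without the linked javadoc at hand, here is a minimal sketch of what a flow key is, assuming the usual flow-based encoding: one entry per flow, holding the length of the homopolymer (hmer) incorporated at that flow. The class name `FlowKeySketch`, the helper `basesToFlowKey`, and the flow order `TGCA` are illustrative assumptions, not the tool's actual code.

```java
import java.util.ArrayList;
import java.util.List;

class FlowKeySketch {

    // Hypothetical helper: cycle through flowOrder and record, for each flow,
    // how many consecutive read bases match that flow's nucleotide (0 if none).
    public static int[] basesToFlowKey(String bases, String flowOrder) {
        // Reject bases that never appear in the flow order; otherwise the
        // loop below could never consume them.
        for (int i = 0; i < bases.length(); i++) {
            if (flowOrder.indexOf(bases.charAt(i)) < 0) {
                throw new IllegalArgumentException("base not in flow order: " + bases.charAt(i));
            }
        }
        List<Integer> key = new ArrayList<>();
        int baseIndex = 0;
        int flow = 0;
        while (baseIndex < bases.length()) {
            char flowBase = flowOrder.charAt(flow % flowOrder.length());
            int hmerLength = 0;
            while (baseIndex < bases.length() && bases.charAt(baseIndex) == flowBase) {
                hmerLength++;
                baseIndex++;
            }
            key.add(hmerLength);
            flow++;
        }
        return key.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        // Walk the key to recover which bases belong to each hmer.
        String bases = "TTGCAA";
        int[] key = basesToFlowKey(bases, "TGCA");
        int baseIndex = 0;
        for (int flow = 0; flow < key.length; flow++) {
            System.out.println("flow " + flow + ": hmer length " + key[flow]
                    + ", bases [" + baseIndex + ", " + (baseIndex + key[flow]) + ")");
            baseIndex += key[flow];
        }
    }
}
```

Looping over the key rather than over individual bases is what lets per-flow error probabilities be assigned to all bases of the same hmer in one step.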
@Argument(fullName = MAXIMAL_QUALITY_SCORE_LONG_NAME, doc = "clip quality score to the given value (phred)")
public int maxQualityScore = 126;

@Argument(fullName = REPLACE_QUALITY_MODE_LONG_NAME, doc = "replace existing base qualities while saving previous qualities to OQ (when true) or simply write to BQ (when false) ") |
I'm not sure I understand the use case of writing to BQ without changing the actual base qualities. Both of these options make the file larger and contain the same information (since you continue to store the old base qualities either in the QUAL field or in the OQ tag, and you store the new qualities either in the BQ tag or the QUAL field). Is there a reason to not replace the QUAL field and only keep the new qualities in the BQ tag?
Additionally, the BQ tag is reserved in the spec for "Offset to base alignment quality (BAQ)", so it might not be the best place to put these new base qualities.
As I understand the two modes, they do not involve storing the same information twice.
In the 'replace' mode, the QUAL field is replaced with qualities computed from the flow probabilities. The old QUAL field is saved in OQ.
In the 'non-replace' mode, the QUAL field is preserved while a new quality string, computed from the flow probabilities, is saved in BQ.
It is assumed that the computed qualities will differ from the original QUAL, which is the reason for this tool.
As to BQ being reserved, I have replaced it with XQ, which is in the user-defined tag space.
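To make the two modes concrete, here is a minimal sketch under stated assumptions: the `Read` class is a hypothetical stand-in for a SAM record (not the htsjdk or GATK type), qualities are kept as raw phred values rather than ASCII-encoded strings, and `toPhred` and `applyQualities` are illustrative names rather than the tool's methods. The clipping mirrors the `maxQualityScore` argument shown earlier.

```java
import java.util.HashMap;
import java.util.Map;

class QualityModeSketch {

    // Hypothetical stand-in for a SAM record: a QUAL array plus optional tags.
    static final class Read {
        byte[] qual;
        final Map<String, byte[]> tags = new HashMap<>();
        Read(byte[] qual) { this.qual = qual; }
    }

    // Convert per-base error probabilities to phred scores, clipped to [0, maxQuality].
    public static byte[] toPhred(double[] errorProbs, int maxQuality) {
        byte[] out = new byte[errorProbs.length];
        for (int i = 0; i < errorProbs.length; i++) {
            double q = -10.0 * Math.log10(errorProbs[i]);
            out[i] = (byte) Math.min(maxQuality, Math.max(0, Math.round(q)));
        }
        return out;
    }

    // replace == true : QUAL is overwritten and the previous QUAL is saved in OQ.
    // replace == false: QUAL is left untouched and the new qualities go to XQ.
    public static void applyQualities(Read read, byte[] newQual, boolean replace) {
        if (replace) {
            read.tags.put("OQ", read.qual);
            read.qual = newQual;
        } else {
            read.tags.put("XQ", newQual);
        }
    }

    public static void main(String[] args) {
        byte[] computed = toPhred(new double[]{0.1, 0.001, 1e-20}, 126);
        Read read = new Read(new byte[]{20, 20, 20});
        applyQualities(read, computed, true);
        System.out.println("QUAL replaced, OQ present: " + read.tags.containsKey("OQ"));
    }
}
```

In both modes each quality array is stored exactly once; the only difference is which of the two ends up in the QUAL field.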
- added link to flow based prob doc
- renamed BQ to XQ to avoid clash with standard
A new tool, for flow-based files, that reads a SAM-format file (SAM/BAM/CRAM) and writes the reads that pass the filtering criteria to a new file while adding a base-quality attribute (BQ).