GatherGeneGCLength breaks on transcript writing #305

Open
1 task done
jamesnemesh opened this issue May 9, 2022 · 0 comments

Affected tool(s)

GatherGeneGCLength

At Broad:
Exception in thread "main" java.lang.IllegalStateException: the input sequence name 'S100A1' has already been added
    at htsjdk.samtools.reference.FastaReferenceWriter.startSequence(FastaReferenceWriter.java:408)
    at htsjdk.samtools.reference.FastaReferenceWriter.startSequence(FastaReferenceWriter.java:364)
    at org.broadinstitute.dropseqrna.utils.FastaSequenceFileWriter.writeSequence(FastaSequenceFileWriter.java:68)
    at org.broadinstitute.dropseqrna.annotation.GatherGeneGCLength.writeTranscriptSequence(GatherGeneGCLength.java:219)
    at org.broadinstitute.dropseqrna.annotation.GatherGeneGCLength.doWork(GatherGeneGCLength.java:138)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:308)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
    at org.broadinstitute.dropseqrna.cmdline.DropSeqMain.main(DropSeqMain.java:42)

Affected version(s)

  • Latest development/master branch as of [5/9/2022]

Description

When GatherGeneGCLength is asked to emit transcript sequences, the program fails.

Steps to reproduce

Run on our human metadata:

/broad/mccarroll/software/dropseq/priv/GatherGeneGCLength \
    ANNOTATIONS_FILE=/broad/mccarroll/software/metadata/individual_reference/GRCh38.89/GRCh38.gtf \
    REFERENCE_SEQUENCE=/broad/mccarroll/software/metadata/individual_reference/GRCh38.89/GRCh38.fasta.gz \
    O=GRCh38.gc.txt \
    OUTPUT_TRANSCRIPT_LEVEL=true \
    OUTPUT_TRANSCRIPT_SEQUENCES=GRCh38_maskedAlt.transcript_sequences.txt \
    VALIDATION_STRINGENCY=LENIENT

Note: the failure is triggered by the functionality behind OUTPUT_TRANSCRIPT_SEQUENCES. If that argument is removed, the program runs successfully.

Expected behavior

The transcript file should be written.

Actual behavior

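The program aborts with the IllegalStateException stack trace shown under "Affected tool(s)"; the transcript file is not completed.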
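
Possible direction for a fix (a sketch only, not the actual change): the exception shows that FastaReferenceWriter rejects a sequence name it has already seen, and the rejected name, 'S100A1', looks like a gene symbol rather than a transcript ID. One guess is that several transcripts end up being written under the same name. If that is the cause, the names passed to the writer would need to be made unique before writing. The class below is a hypothetical standalone helper, not part of Drop-seq tools or htsjdk, and the "_2", "_3" suffix scheme is an assumption chosen purely for illustration; using the transcript ID as the sequence name may well be the better fix.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical helper (not part of Drop-seq tools or htsjdk):
 * hands out sequence names that are guaranteed not to repeat.
 */
public class UniqueSequenceNamer {
    // Every name that has already been handed out.
    private final Set<String> used = new HashSet<>();

    /**
     * Returns the name unchanged the first time it is seen.
     * On a repeat, appends _2, _3, ... until an unused name is found.
     */
    public String uniqueName(String name) {
        String candidate = name;
        int suffix = 2;
        // Set.add returns false when the candidate is already present.
        while (!used.add(candidate)) {
            candidate = name + "_" + suffix++;
        }
        return candidate;
    }

    public static void main(String[] args) {
        UniqueSequenceNamer namer = new UniqueSequenceNamer();
        // Two records sharing the name from the stack trace, plus an unrelated one.
        for (String n : new String[] {"S100A1", "S100A1", "S100A13"}) {
            System.out.println(n + " -> " + namer.uniqueName(n));
        }
    }
}
```

In this sketch each name would be passed through uniqueName before it reaches the FASTA writer, so a repeated name no longer triggers the "has already been added" check. Whether silently renaming is acceptable, or whether the duplicate should instead be reported as an annotation problem, is a design question for the maintainers.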
