[ ] Count the number of reads in the original FASTQ file (there is no way to do that later on in processing) for a better RPKM score.
[ ] Potentially create a diff (for example, to keep only the viruses found in the tumor) or merge the two CSV files so you can compare them yourself (https://stackoverflow.com/questions/16265831/merging-two-csv-files-using-python, https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html)
[ ] "Remove the ones mapped to phage" (what does Xiaowei mean by this?)
[x] Find the virus name based on the loci and annotate the output with it (the top hits can all be labeled Homo: these are potentially human contigs that were not in the reference sequence, in other words, unlikely to be viral)
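
Sketch for the read-count item: a minimal way to count reads in the original FASTQ file and feed that total into an RPKM calculation. It assumes the common four-lines-per-record FASTQ layout (no wrapped sequence lines). The function names `count_fastq_reads` and `rpkm` are made up for illustration. Note that RPKM is often normalized by mapped reads; using the total from the original file is the choice these notes suggest.

```python
import gzip


def count_fastq_reads(path):
    """Count records in a FASTQ file (plain or .gz).

    Assumption: every record is exactly 4 lines (header, sequence, '+', quality).
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def rpkm(mapped_reads, contig_length_bp, total_reads):
    """Reads per kilobase of contig per million reads in the library."""
    return mapped_reads * 1e9 / (contig_length_bp * total_reads)
```

The count has to be taken before any filtering step, since the original total is not recoverable from the downstream files.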
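
Sketch for the diff/merge item: one possible approach using `pandas.DataFrame.merge` with `indicator=True`, which adds a `_merge` column showing whether each row came from the left table, the right table, or both. The key column name `virus`, the function name `compare_virus_tables`, and the file names in the comments are assumptions, not taken from the actual output files.

```python
import pandas as pd


def compare_virus_tables(tumor, normal, key="virus"):
    """Return (merged, tumor_only) for two result tables.

    Assumption: both tables share a column named `key` identifying the virus.
    """
    # Outer merge keeps every virus from either table; `_merge` records its origin.
    merged = tumor.merge(
        normal,
        on=key,
        how="outer",
        suffixes=("_tumor", "_normal"),
        indicator=True,
    )
    # The "diff": rows present in the tumor table but absent from the normal one.
    tumor_only = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
    return merged, tumor_only


# Hypothetical usage (file names are placeholders):
# tumor = pd.read_csv("tumor.csv")
# normal = pd.read_csv("normal.csv")
# merged, tumor_only = compare_virus_tables(tumor, normal)
# tumor_only.to_csv("tumor_only.csv", index=False)
```

Keeping the merged table as well as the tumor-only subset covers both options in the item: the diff for filtering, and the side-by-side view for checking by eye.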