Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate command line for report without external call to ps #117

Merged
merged 1 commit into from
Sep 30, 2022

Conversation

boulund
Copy link
Contributor

@boulund boulund commented Sep 30, 2022

An improvement of the suggestion in #116
I originally discovered this issue when trying to run KrakenUniq in Singularity with the Biocontainer built from the conda releases (in this case the container with tag 1.0.0--pl5321h19e8d03_0) where the version of ps used apparently doesn't support the same command line arguments as were used in the original report generation code to produce a string with the command line used to call krakenuniq.

Here are examples of the command lines of the report:

Original code

# CL:perl /ceph/home/boulund/work/stag-uniq/.snakemake/conda/207ba047bd7ea05bc7d1265affc0c46c/bin/krakenuniq --db /ceph/db/krakenuniq/minikraken_20171013_4GB --threads 4 --output output_dir/krakenuniq/test1.kraken.gz --report-file output_dir/krakenuniq/test1.kreport --paired output_dir/host_removal/test1_1.fq.gz output_dir/host_removal/test1_2.fq.gz

This PR

# CL:/ceph/home/boulund/work/stag-uniq/.snakemake/conda/207ba047bd7ea05bc7d1265affc0c46c/bin/krakenuniq --db /ceph/db/krakenuniq/minikraken_20171013_4GB --threads 4 --output output_dir/krakenuniq/test1.kraken.gz --report-file output_dir/krakenuniq/test1.kreport --preload-size 2G --paired output_dir/host_removal/test1_1.fq.gz output_dir/host_removal/test1_2.fq.gz

I know the initial perl isn't included, but it's easy to add if you find it important to keep.

@@ -230,7 +231,7 @@ if ($paired) {
my $cmd = $use_exact_counting? $CLASSIFY_EXACT : $CLASSIFY;
print STDERR "$cmd @flags\n";
if (defined $report_file) {
my $mycmd = qx/ps -o args= $$/;
my $mycmd = join(' ', $0, @cmdline);
open(my $RF, ">", $report_file) or die "Could not open report file for writing";
print $RF "# KrakenUniq v#####=VERSION=##### DATE:$date DB:@db_prefix DB_SIZE:$db_size WD:$wd\n# CL:$mycmd\n";
Copy link
Contributor Author

@boulund boulund Sep 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noted an unintended side-effect of the suggestion in this PR: the report file no longer includes an empty line on line 3. If this is desired to ensure compatibility with older versions it should probably be added in again.

Edit: If possible, I would prefer not to have an empty line after the first two lines that are prefixed with # even if it breaks compatibility. It makes it more straightforward to parse it as a TSV using a library that ignores leading lines with comments before the real header line. The old format required skipping past the empty line to start parsing at the real header line at line 4.

Suggested change
print $RF "# KrakenUniq v#####=VERSION=##### DATE:$date DB:@db_prefix DB_SIZE:$db_size WD:$wd\n# CL:$mycmd\n";
print $RF "# KrakenUniq v#####=VERSION=##### DATE:$date DB:@db_prefix DB_SIZE:$db_size WD:$wd\n# CL:$mycmd\n\n";

@boulund
Copy link
Contributor Author

boulund commented Sep 30, 2022

@ctb suggested I put an explanation about the original issue in the PR, in the interest of overexplaining :)

The original code was using a non-portable way to generate a string with the command line used to call the program. This string is put as the second line of output in the report file (see my examples earlier). It was using the perl function qx to call an external program to generate that command line string, in this case ps.
However, it's not the optimal way to go about it when generating such a string, mainly since it's an unnecessary call to an external program for something that can be done entirely in Perl. The reason that this is an issue was made clear when I tried running KrakenUniq in a container environment that uses a different version of ps that apparently doesn't support the same arguments.
The suggested PR simply generates the same string entirely in Perl, avoiding the call to the external ps program and thus avoiding an unnecessary dependency on an external program (that also wasn't included in the list of declared dependencies as far as I know).

@alekseyzimin alekseyzimin merged commit cc1ab3e into fbreitwieser:master Sep 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants