Feature request: option for listing content with full file paths #685

Open
nylander opened this issue Mar 14, 2024 · 0 comments

Hi,

Feature request: provide an option for dds data ls to print full file paths.

Description

To download only part of the file hierarchy in a data project, one can use the option --source-path-file FILE with dds data get. FILE is expected to contain a list of file paths on the file server.
To create such a file for "cherry picking", one first needs to know the full paths on the server (available from dds data ls --tree, optionally in combination with --json), and then exclude all content that is not wanted for download.
Creating such a file from the current dds output is challenging: one has to cut and paste manually, or write a parser for the tree-like output or for the JSON format, and then delete the unwanted entries.

On the other hand, if we had full file paths, one could, for example, pipe the output to grep and easily select parts of the file hierarchy (and know the full file paths for dds data get).
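To illustrate the kind of selection meant here, a minimal Python sketch, assuming a plain list of full paths were available. The helper name select_paths is hypothetical and not part of dds, and the paths are copied from the example further down. Whether --source-path-file expects the leading project name in each path is not something I have verified.

```python
from fnmatch import fnmatchcase


def select_paths(paths, pattern):
    """Keep only the paths that match a shell-style wildcard pattern."""
    return [path for path in paths if fnmatchcase(path, pattern)]


# Paths in the shape of the desired output (copied from the example in this issue).
listing = [
    "xxxx/P30753/00-Reports/P30753_runs.xml",
    "xxxx/P30753/ACKNOWLEDGEMENTS.txt",
    "xxxx/P30753/P30753_1001/02-FASTQ/20240216_LH00202_0058_B22GHKHLT3/P30753_1001_S93_L006_R1_001.fastq.gz",
    "xxxx/P30753/P30753_1001/02-FASTQ/20240216_LH00202_0058_B22GHKHLT3/P30753_1001_S93_L006_R2_001.fastq.gz",
]

# Select only the R1 reads; in fnmatch patterns "*" also matches "/".
forward_reads = select_paths(listing, "*_R1_001.fastq.gz")
print("\n".join(forward_reads))
```

The printed lines could then be saved to a file and passed to dds data get via --source-path-file.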

I started trying to add the feature to data_lister.py, options.py, and __main__.py, but the recursive calls to the API, etc., quickly made the code too complicated for me :-)

Example

Current dds data ls --tree output

    $ dds data ls --project "xxxx" --tree
    ...
    Files & directories in project: xxxx/
    └── P30753/
        ├── 00-Reports/
        │   ├── xyz_24_06_lanes_info.txt
        │   ├── xyz_24_06_library_info.txt
        │   ├── xyz_24_06_project_summary.html
        │   ├── xyz_24_06_project_summary.md
        │   ├── xyz_24_06_qc_multiqc_report.html
        │   ├── xyz_24_06_sample_info.txt
        │   ├── P30753_experiments.xml
        │   ├── P30753_runs.xml
        │   └── manifestFiles/
        │       ├── P30753_1001.02-FASTQ.20240216_LH00202_0058_B22GHKHLT3.P30753_1001_S93_L006_manifest.txt
        │       ├── P30753_1001.02-FASTQ.20240216_LH00202_0058_B22GHKHLT3.P30753_1001_S93_L007_manifest.txt
        [...]
        ├── ACKNOWLEDGEMENTS.txt
        ├── DELIVERY.README.RAW_DATA.txt
        ├── P30753_1001/
        │   └── 02-FASTQ/
        │       └── 20240216_LH00202_0058_B22GHKHLT3/
        │           ├── P30753_1001_S93_L006_R1_001.fastq.gz
        │           ├── P30753_1001_S93_L006_R2_001.fastq.gz
        │           ├── P30753_1001_S93_L007_R1_001.fastq.gz
        │           └── P30753_1001_S93_L007_R2_001.fastq.gz
        [...]

Desired usage and output

    $ dds data ls --project "xxxx" --file-paths

    xxxx/P30753/00-Reports/xyz_24_06_lanes_info.txt
    xxxx/P30753/00-Reports/xyz_24_06_library_info.txt
    xxxx/P30753/00-Reports/xyz_24_06_project_summary.html
    xxxx/P30753/00-Reports/xyz_24_06_project_summary.md
    xxxx/P30753/00-Reports/xyz_24_06_qc_multiqc_report.html
    xxxx/P30753/00-Reports/xyz_24_06_sample_info.txt
    xxxx/P30753/00-Reports/P30753_experiments.xml
    xxxx/P30753/00-Reports/P30753_runs.xml
    xxxx/P30753/00-Reports/manifestFiles/P30753_1001.02-FASTQ.20240216_LH00202_0058_B22GHKHLT3.P30753_1001_S93_L006_manifest.txt
    xxxx/P30753/00-Reports/manifestFiles/P30753_1001.02-FASTQ.20240216_LH00202_0058_B22GHKHLT3.P30753_1001_S93_L007_manifest.txt
    [...]
    xxxx/P30753/ACKNOWLEDGEMENTS.txt
    xxxx/P30753/DELIVERY.README.RAW_DATA.txt
    xxxx/P30753/P30753_1001/02-FASTQ/20240216_LH00202_0058_B22GHKHLT3/P30753_1001_S93_L006_R1_001.fastq.gz
    xxxx/P30753/P30753_1001/02-FASTQ/20240216_LH00202_0058_B22GHKHLT3/P30753_1001_S93_L006_R2_001.fastq.gz
    xxxx/P30753/P30753_1001/02-FASTQ/20240216_LH00202_0058_B22GHKHLT3/P30753_1001_S93_L007_R1_001.fastq.gz
    xxxx/P30753/P30753_1001/02-FASTQ/20240216_LH00202_0058_B22GHKHLT3/P30753_1001_S93_L007_R2_001.fastq.gz
    [...]
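As a possible interim workaround, here is a rough Python sketch that flattens the tree-style output shown above into full paths. It is not part of dds. It assumes the listing uses exactly the guides seen in the example ("│   " or four spaces per level, followed by "├── " or "└── ") and that directory names end with "/". The function name tree_to_paths and its root parameter are hypothetical.

```python
import re

# One entry per line: zero or more 4-character guide columns, a connector, then the name.
ENTRY = re.compile(r"^(?P<guides>(?:│   |    )*)(?:├── |└── )(?P<name>.+)$")


def tree_to_paths(tree_text, root=""):
    """Convert tree-style listing text into a list of full file paths."""
    prefix = [root.rstrip("/")] if root else []
    stack = []  # directory names from the top level down to the current depth
    paths = []
    for line in tree_text.splitlines():
        match = ENTRY.match(line.rstrip())
        if match is None:
            continue  # header line, "[...]" markers, blank lines
        depth = len(match["guides"]) // 4
        name = match["name"]
        del stack[depth:]  # forget directories deeper than this entry
        if name.endswith("/"):
            stack.append(name.rstrip("/"))
        else:
            paths.append("/".join(prefix + stack + [name]))
    return paths


example = """\
Files & directories in project: xxxx/
└── P30753/
    ├── 00-Reports/
    │   ├── P30753_experiments.xml
    │   └── P30753_runs.xml
    ├── ACKNOWLEDGEMENTS.txt
    └── P30753_1001/
        └── 02-FASTQ/
            └── 20240216_LH00202_0058_B22GHKHLT3/
                └── P30753_1001_S93_L006_R1_001.fastq.gz
"""

paths = tree_to_paths(example, root="xxxx")
print("\n".join(paths))
```

A parser like this depends on the exact layout of the tree output and would break if that layout changes, which is one reason a built-in option would be preferable.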

Cheers
Johan
