Update spatial decomposition resource scripts (#429)
* add resource scripts

* add spatial data summary to simulated dataset

* add simulated dataset properties in comment

* update id

* remove id from params.yaml

Co-authored-by: Kai Waldrant <[email protected]>

* add label highmem to config

* check if .obs contains key is_primary_data

---------

Co-authored-by: Kai Waldrant <[email protected]>
Former-commit-id: 949427c
sainirmayi and KaiWaldrant authored Apr 16, 2024
1 parent 6f7ef2c commit c698895
Showing 5 changed files with 61 additions and 13 deletions.
5 changes: 3 additions & 2 deletions src/tasks/spatial_decomposition/dataset_simulator/script.py
@@ -186,10 +186,11 @@ def filter_genes_cells(adata):
 umi_lb=par['umi_lb'],
 umi_ub=par['umi_ub']
 )
-adata.uns["spatial_data_summary"] = f"Dirichlet alpha={par['alpha']}"
+adata_merged.uns["spatial_data_summary"] = f"Dirichlet alpha={par['alpha']}"
 filter_genes_cells(adata_merged)
 adata_merged.X = None
-adata_merged.obs['is_primary_data'] = adata_merged.obs['is_primary_data'].fillna(False)
+if "is_primary_data" in adata_merged.obs:
+    adata_merged.obs['is_primary_data'] = adata_merged.obs['is_primary_data'].fillna(False)
 
 print("Writing output to file")
 adata_merged.write_h5ad(par["simulated_data"])
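The guard added in this hunk can be tried in isolation. The following is a minimal sketch, assuming `adata_merged.obs` behaves like a plain pandas `DataFrame`; the helper name `fill_primary_data_flag` and the toy frames are hypothetical and not part of the commit.

```python
import pandas as pd


def fill_primary_data_flag(obs: pd.DataFrame) -> pd.DataFrame:
    """Fill missing is_primary_data values with False, but only if the column exists."""
    # `in` on a DataFrame tests column labels, so a missing column is skipped
    # instead of raising KeyError as the unguarded assignment would.
    if "is_primary_data" in obs:
        obs["is_primary_data"] = obs["is_primary_data"].fillna(False)
    return obs


# Hypothetical inputs: one frame with the column, one without it.
with_flag = fill_primary_data_flag(pd.DataFrame({"is_primary_data": [True, None, False]}))
without_flag = fill_primary_data_flag(pd.DataFrame({"cell_type": ["a", "b"]}))
```

Datasets that never carried `is_primary_data` pass through unchanged, which appears to be the case the commit message ("check if .obs contains key is_primary_data") is addressing.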
@@ -0,0 +1,36 @@
#!/bin/bash

# Simulating spot-resolution spatial data with alpha = 1

cat > /tmp/params.yaml << 'HERE'
id: spatial_decomposition_process_datasets
input_states: s3://openproblems-data/resources/datasets/**/state.yaml
settings: '{"output_spatial_masked": "$id/spatial_masked.h5ad", "output_single_cell": "$id/single_cell_ref.h5ad", "output_solution": "$id/solution.h5ad", "alpha": 1.0, "simulated_data": "$id/dataset_simulated.h5ad"}'
rename_keys: 'input:output_dataset'
output_state: "$id/state.yaml"
publish_dir: s3://openproblems-data/resources/spatial_decomposition/datasets
HERE

cat > /tmp/nextflow.config << HERE
process {
executor = 'awsbatch'
withName:'.*publishStatesProc' {
memory = '16GB'
disk = '100GB'
}
withLabel:highmem {
memory = '350GB'
}
}
HERE

tw launch https://github.com/openproblems-bio/openproblems-v2.git \
--revision main_build \
--pull-latest \
--main-script target/nextflow/spatial_decomposition/workflows/process_datasets/main.nf \
--workspace 53907369739130 \
--compute-env 1pK56PjjzeraOOC2LDZvN2 \
--params-file /tmp/params.yaml \
--entry-name auto \
--config /tmp/nextflow.config \
# --labels spatial_decomposition,process_datasets
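The `settings` entry in `params.yaml` above is a JSON object whose output paths contain a `$id` placeholder. The sketch below only illustrates how such a string could be resolved for one dataset id; the actual substitution happens inside the workflow and is not shown in this commit. `resolve_settings` and the id `example_dataset` are hypothetical, and `string.Template` is used here as a stand-in.

```python
import json
from string import Template

# Copied from the params.yaml written by the script above.
SETTINGS = (
    '{"output_spatial_masked": "$id/spatial_masked.h5ad", '
    '"output_single_cell": "$id/single_cell_ref.h5ad", '
    '"output_solution": "$id/solution.h5ad", '
    '"alpha": 1.0, '
    '"simulated_data": "$id/dataset_simulated.h5ad"}'
)


def resolve_settings(settings_json: str, dataset_id: str) -> dict:
    """Parse the JSON settings and replace $id in every string value (assumed behaviour)."""
    settings = json.loads(settings_json)
    return {
        key: Template(value).substitute(id=dataset_id) if isinstance(value, str) else value
        for key, value in settings.items()
    }


resolved = resolve_settings(SETTINGS, "example_dataset")
```

Because the heredoc delimiter is quoted (`'HERE'`), the shell leaves `$id` untouched when writing the file, so the placeholder reaches the workflow intact.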
22 changes: 22 additions & 0 deletions src/tasks/spatial_decomposition/resources_scripts/run_benchmark.sh
@@ -0,0 +1,22 @@
#!/bin/bash

RUN_ID="run_$(date +%Y-%m-%d_%H-%M-%S)"
publish_dir="s3://openproblems-data/resources/spatial_decomposition/results/${RUN_ID}"

cat > /tmp/params.yaml << HERE
input_states: s3://openproblems-data/resources/spatial_decomposition/datasets/**/state.yaml
rename_keys: 'input_single_cell:output_single_cell,input_spatial_masked:output_spatial_masked,input_solution:output_solution'
output_state: "state.yaml"
publish_dir: "$publish_dir"
HERE

tw launch https://github.com/openproblems-bio/openproblems-v2.git \
--revision main_build \
--pull-latest \
--main-script target/nextflow/spatial_decomposition/workflows/run_benchmark/main.nf \
--workspace 53907369739130 \
--compute-env 1pK56PjjzeraOOC2LDZvN2 \
--params-file /tmp/params.yaml \
--entry-name auto \
--config src/wf_utils/labels_tw.config \
--labels spatial_decomposition,full
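The `rename_keys` value in this script is a comma-separated list of `a:b` pairs. The sketch below assumes each pair reads `new_key:old_key`, so that the `output_*` entries from the dataset state are exposed to the benchmark as `input_*`; that reading is an assumption, and both helper functions and the sample state are hypothetical.

```python
def parse_rename_keys(spec: str) -> dict:
    """Turn 'new1:old1,new2:old2' into {new1: old1, new2: old2}."""
    pairs = (item.split(":", 1) for item in spec.split(",") if item)
    return {new.strip(): old.strip() for new, old in pairs}


def rename_state(state: dict, spec: str) -> dict:
    """Return a copy of state with old keys renamed to new keys (assumed semantics)."""
    mapping = parse_rename_keys(spec)
    # Keep entries that are not being renamed, then add the renamed ones.
    renamed = {k: v for k, v in state.items() if k not in mapping.values()}
    for new, old in mapping.items():
        if old in state:
            renamed[new] = state[old]
    return renamed


SPEC = (
    "input_single_cell:output_single_cell,"
    "input_spatial_masked:output_spatial_masked,"
    "input_solution:output_solution"
)
# Hypothetical state entries for one processed dataset.
state = {
    "id": "example_dataset",
    "output_single_cell": "single_cell_ref.h5ad",
    "output_spatial_masked": "spatial_masked.h5ad",
    "output_solution": "solution.h5ad",
}
renamed = rename_state(state, SPEC)
```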
Empty file.
@@ -11,17 +11,6 @@ cd "$REPO_ROOT"
 
 set -e
 
-# nextflow run . \
-# -main-script target/nextflow/spatial_decomposition/workflows/process_datasets/main.nf \
-# -profile docker \
-# -entry auto \
-# -c src/wf_utils/labels_ci.config \
-# --id run_test \
-# --input_states "resources_test/common/**/state.yaml" \
-# --rename_keys 'input:output_dataset' \
-# --settings '{"output_spatial_masked": "$id/spatial_masked.h5ad", "output_single_cell": "$id/single_cell_ref.h5ad", "output_solution": "$id/solution.h5ad"}' \
-# --publish_dir "resources_test/spatial_decomposition"
-
 # generate spatial dataset
 nextflow run . \
 -main-script target/nextflow/spatial_decomposition/workflows/process_datasets/main.nf \
