Task denoising method dca #431

Merged
Changes from 174 commits
189 commits
dc0c756
Create alra.py
wes-lewis Mar 22, 2021
6b19b08
pre-commit
github-actions[bot] Mar 22, 2021
8d922dc
import alra
scottgigante-immunai Mar 16, 2022
431110e
pre-commit
github-actions[bot] Mar 16, 2022
72b04e8
set alra version
scottgigante-immunai Mar 16, 2022
41246b7
Merge branch 'scottgigante/bugfix/alra_import' of github.com:scottgig…
scottgigante-immunai Mar 16, 2022
9e88b0b
Merge branch 'main' into scottgigante/bugfix/alra_import
scottgigante-immunai Mar 16, 2022
718dd93
split up alra oneliner
scottgigante-immunai Mar 17, 2022
09f2a4e
debug
scottgigante-immunai Mar 17, 2022
9616ee7
fix syntax error
scottgigante-immunai Mar 17, 2022
21835a3
pre-commit
github-actions[bot] Mar 17, 2022
3f8afb7
use dgCMatrix
scottgigante-immunai Mar 17, 2022
01e904a
Merge branch 'scottgigante/bugfix/alra_import' of github.com:scottgig…
scottgigante-immunai Mar 17, 2022
5936885
Merge branch 'main' into scottgigante/bugfix/alra_import
scottgigante Mar 17, 2022
f776042
output is stored in obsm
scottgigante-immunai Mar 29, 2022
a45ebb4
remove prints
scottgigante-immunai Mar 29, 2022
29d1ebc
Merge branch 'main' into scottgigante/bugfix/alra_import
scottgigante-immunai Apr 1, 2022
6cb82e5
Merge branch 'main' into scottgigante/bugfix/alra_import
scottgigante-immunai Apr 7, 2022
47d4147
Merge branch 'method-alra' into scottgigante/bugfix/alra_import
wes-lewis Apr 11, 2022
13a7c0b
pre-commit
github-actions[bot] Apr 11, 2022
858065a
Merge pull request #1 from scottgigante-immunai/scottgigante/bugfix/a…
wes-lewis Apr 11, 2022
7fe93fb
Merge branch 'main' into method-alra
scottgigante-immunai Apr 12, 2022
655d1ec
Update alra.py
wes-lewis Apr 19, 2022
c8a28af
fix csr casting
wes-lewis Apr 19, 2022
abe564c
Update alra.py
wes-lewis Apr 19, 2022
00e3b12
pre-commit
github-actions[bot] Apr 19, 2022
b31052c
add ValueError
wes-lewis Apr 19, 2022
c7665a2
pre-commit
github-actions[bot] Apr 19, 2022
3104532
simplify ValueError to avoid errors
wes-lewis Apr 19, 2022
0a6bd4e
pre-commit
github-actions[bot] Apr 19, 2022
3143e25
cast to array for MSE
wes-lewis Apr 20, 2022
a18f887
pre-commit
github-actions[bot] Apr 20, 2022
8776950
separate error line functions
wes-lewis Apr 25, 2022
0070c9d
Remove to_array()
wes-lewis Apr 26, 2022
6b0386f
pre-commit
github-actions[bot] Apr 26, 2022
4e2e7c2
Merge branch 'main' into method-alra
scottgigante-immunai May 2, 2022
f18319c
Merge branch 'main' into method-alra
scottgigante-immunai May 5, 2022
8702150
try casting to a matrix one more time
wes-lewis May 9, 2022
38f6ee6
notate that wes' ALRA fork must be used instead
wes-lewis May 9, 2022
87cf784
pre-commit
github-actions[bot] May 9, 2022
f978871
source from wes' code
wes-lewis May 9, 2022
36fb615
Merge branch 'main' into method-alra
scottgigante-immunai May 9, 2022
2bcfb8c
fix URL
wes-lewis May 9, 2022
3e54ab7
shorten line lengths
scottgigante-immunai May 10, 2022
62f8359
Check output is ndarray
scottgigante-immunai May 10, 2022
06db9eb
Merge branch 'main' into method-alra
scottgigante-immunai May 13, 2022
8c4f17c
Fix typo
scottgigante-immunai May 14, 2022
0649df5
Return dense data
scottgigante-immunai May 14, 2022
e5d4228
don't need tocsr now that the data is dense
scottgigante-immunai May 14, 2022
bc40e57
Merge branch 'main' into method-alra
scottgigante-immunai May 14, 2022
2e774d1
Return directly to denoised
scottgigante-immunai May 14, 2022
4bacd02
code cleanup
scottgigante-immunai May 14, 2022
710d55f
Revert debugging
scottgigante-immunai May 14, 2022
9c50ef0
Don't edit adata.obsm['train']
scottgigante-immunai May 14, 2022
14d3ca3
access train_norm
scottgigante-immunai May 14, 2022
856b45f
Add warning about editing adata.obsm['train']
scottgigante-immunai May 14, 2022
438b208
pre-commit
github-actions[bot] May 14, 2022
4c8caaf
check train and test are not modified
scottgigante-immunai May 14, 2022
4d14a4c
pre-commit
github-actions[bot] May 14, 2022
4153803
Retry ALRA on failure
scottgigante-immunai May 14, 2022
26a3299
pre-commit
github-actions[bot] May 14, 2022
7e818f4
Switch t(as.matrix()) order
scottgigante-immunai May 14, 2022
3a59d98
Check dense data
scottgigante-immunai May 14, 2022
11f7290
Return sparse data
scottgigante-immunai May 14, 2022
ee148d1
Check input data is sparse
scottgigante-immunai May 14, 2022
1ff077e
Fix typo
scottgigante-immunai May 14, 2022
b7fa68b
pre-commit
github-actions[bot] May 14, 2022
91503da
Don't send the full AnnData to R
scottgigante-immunai May 14, 2022
ff7a7a4
Expect sparse input, dense array output
scottgigante-immunai May 14, 2022
a97aa51
train and test must be floats
scottgigante-immunai May 14, 2022
73121bd
Convert back to float
scottgigante-immunai May 14, 2022
a9754aa
Fail on final attempt
scottgigante-immunai May 14, 2022
6c5bd78
put the retry inside python
scottgigante-immunai May 14, 2022
7c52f57
Remove the retry from R
scottgigante-immunai May 14, 2022
ba7c4f5
pre-commit
github-actions[bot] May 14, 2022
5a7f1da
layers['counts'] might not be sparse
scottgigante-immunai May 15, 2022
71e1a98
pre-commit
github-actions[bot] May 15, 2022
a9bffa4
Log error each time
scottgigante-immunai May 15, 2022
4344cc7
import logging
scottgigante-immunai May 15, 2022
ba9e877
pre-commit
github-actions[bot] May 15, 2022
d19a78b
Better way to check matrices
scottgigante-immunai May 15, 2022
56fbf5a
pre-commit
github-actions[bot] May 15, 2022
dfa2874
fix array equal comparison
scottgigante-immunai May 15, 2022
1803505
add explicit comment
scottgigante-immunai May 15, 2022
ab39e50
More explicit toarray
scottgigante-immunai May 15, 2022
1abb2b8
Can't check for untouched train/test
scottgigante-immunai May 15, 2022
2bb2414
Don't import scprep
scottgigante-immunai May 15, 2022
6f21958
Merge branch 'main' into method-alra
scottgigante-immunai May 15, 2022
70837ee
Just use a fixed target_sum
scottgigante-immunai May 15, 2022
b23945b
Sample data should match API
scottgigante-immunai May 15, 2022
b984c90
pre-commit
github-actions[bot] May 15, 2022
caded4a
flake8
scottgigante-immunai May 16, 2022
7b8660b
no_denoising still needs to densify
scottgigante-immunai May 16, 2022
0987352
convert to csc
scottgigante-immunai May 16, 2022
55921bd
pre-commit
github-actions[bot] May 16, 2022
208a0d3
Convert to csr
scottgigante-immunai May 16, 2022
2acd17d
conversion of sparse doesn't work, try anndata
scottgigante-immunai May 16, 2022
8dfd0c8
accept sce
scottgigante-immunai May 16, 2022
cad5391
pre-commit
github-actions[bot] May 16, 2022
bff9fa4
Convert to dense
scottgigante-immunai May 16, 2022
1fdaf16
pre-commit
github-actions[bot] May 16, 2022
b63619a
Convert to dense
scottgigante-immunai May 16, 2022
b71d239
pre-commit
github-actions[bot] May 16, 2022
35ab901
Try `.tocsr()`
scottgigante-immunai May 16, 2022
7dba013
Create dca.py
wes-lewis May 17, 2022
ff029ab
pre-commit
github-actions[bot] May 17, 2022
bbaf46a
Create dca.py
wes-lewis May 17, 2022
221f417
pre-commit
github-actions[bot] May 17, 2022
cf95450
add dca
wes-lewis May 17, 2022
2b71c93
add dca
wes-lewis May 17, 2022
ff44c8f
Update dca.py
wes-lewis May 17, 2022
97f47ad
Update dca.py
wes-lewis May 17, 2022
82b28d3
pre-commit
github-actions[bot] May 17, 2022
584ea3f
Update dca.py
wes-lewis May 17, 2022
e214e98
Update dca.py
wes-lewis May 17, 2022
87e3c09
Delete dca.py
wes-lewis May 17, 2022
6c80a67
Update requirements.txt
wes-lewis May 17, 2022
72209dd
Update __init__.py
wes-lewis May 17, 2022
6c5d803
pre-commit
github-actions[bot] May 17, 2022
84a9ae7
Update dca.py
wes-lewis May 17, 2022
551201d
pre-commit
github-actions[bot] May 17, 2022
81cd986
Update dca.py
wes-lewis May 17, 2022
61064f4
pre-commit
github-actions[bot] May 17, 2022
6e9e03e
put dca import inside method
wes-lewis May 19, 2022
25d0869
pre-commit
github-actions[bot] May 19, 2022
b0c7c00
Update dca.py
wes-lewis May 19, 2022
6917611
Merge branch 'method-alra' of https://github.com/wes-lewis/SingleCell…
wes-lewis Jun 7, 2022
d5513fb
Merge branch 'master' into task-denoising-method-dca
wes-lewis Jun 7, 2022
9d36c83
Merge pull request #7 from openproblems-bio/main
wes-lewis Jun 7, 2022
51af989
Update requirements.txt
wes-lewis Jun 13, 2022
792ae3c
pre-commit
github-actions[bot] Jun 13, 2022
181fb62
Merge branch 'main' into task-denoising-method-dca
scottgigante-immunai Jun 13, 2022
18eb84c
Create README.md
wes-lewis Jun 13, 2022
a2604e3
Update README.md
wes-lewis Jun 13, 2022
72ef3dd
Create Dockerfile
wes-lewis Jun 13, 2022
4c815ed
Create requirements.txt
wes-lewis Jun 13, 2022
ba764e2
pre-commit
github-actions[bot] Jun 13, 2022
3d5d5f5
Create requirements.txt
wes-lewis Jun 13, 2022
02a103f
pre-commit
github-actions[bot] Jun 13, 2022
70c9c08
remove dca from python-extras readme
wes-lewis Jun 14, 2022
f101cd6
fix image specification
wes-lewis Jun 14, 2022
f2b94b9
remove dca from here
wes-lewis Jun 14, 2022
ea5765d
Update Dockerfile
wes-lewis Jun 14, 2022
5c82a75
pin dca 0.3*
wes-lewis Jun 14, 2022
eb7c857
Update dca.py
wes-lewis Jun 14, 2022
6410a87
Update __init__.py
wes-lewis Jun 14, 2022
3f90c05
Update requirements.txt
wes-lewis Jun 14, 2022
cb21310
Update README.md
wes-lewis Jun 14, 2022
e15b50b
Update README.md
wes-lewis Jun 14, 2022
45c68b9
Update README.md
wes-lewis Jun 14, 2022
c1c22cf
Update requirements.txt
wes-lewis Jun 14, 2022
1aa43c8
Update `check_version` api
scottgigante-immunai Jun 14, 2022
6d3840d
Merge branch 'main' into task-denoising-method-dca
scottgigante-immunai Jun 14, 2022
e1cc3a3
Require pyyaml==5.4.1 to prevent kopt error
scottgigante-immunai Jun 14, 2022
7c84ae5
pre-commit
github-actions[bot] Jun 14, 2022
3dbc058
Fix keras version
scottgigante-immunai Jun 14, 2022
193d724
Update dca.py
wes-lewis Jun 15, 2022
4838543
pre-commit
github-actions[bot] Jun 15, 2022
40a6f5d
Update dca.py
wes-lewis Jun 15, 2022
81a66c4
pre-commit
github-actions[bot] Jun 15, 2022
fa8de52
Merge pull request #9 from openproblems-bio/main
wes-lewis Jun 20, 2022
dd57e02
Update dca.py
wes-lewis Jun 21, 2022
d67622b
pre-commit
github-actions[bot] Jun 21, 2022
7e4e68f
Update dca.py
wes-lewis Jun 21, 2022
019220f
pre-commit
github-actions[bot] Jun 21, 2022
bd497c9
Add test args
wes-lewis Jun 21, 2022
e20b6c7
Merge branch 'main' into task-denoising-method-dca
scottgigante-immunai Jun 21, 2022
fd45114
fix thread count and pass epochs to dca
wes-lewis Jun 27, 2022
5811dbb
pre-commit
github-actions[bot] Jun 27, 2022
4ceed52
add in masking
wes-lewis Jun 27, 2022
8df6f8a
pre-commit
github-actions[bot] Jun 27, 2022
d5b946b
Merge branch 'main' into task-denoising-method-dca
scottgigante-immunai Jun 27, 2022
6c63c97
Update README.md
wes-lewis Jun 28, 2022
cd81217
Update README.md
wes-lewis Jun 28, 2022
cac88e7
add removezeros and insert_at functions
wes-lewis Jun 28, 2022
ab1fcfb
pre-commit
github-actions[bot] Jun 28, 2022
ca0e758
Update dca.py
wes-lewis Jun 28, 2022
73e4355
pre-commit
github-actions[bot] Jun 28, 2022
a3d423d
Remove zero counts from train data
scottgigante-immunai Jul 5, 2022
ecfbc67
Remove filtering from DCA
scottgigante-immunai Jul 5, 2022
852a5dd
Remove unused code
scottgigante-immunai Jul 5, 2022
77f4de9
pre-commit
github-actions[bot] Jul 5, 2022
234090b
Don't need a line break
scottgigante-immunai Jul 5, 2022
7667668
Update utils.py
scottgigante-immunai Jul 5, 2022
d07e2a1
pre-commit
github-actions[bot] Jul 5, 2022
1da6d7c
Use epochs if passed
scottgigante-immunai Jul 12, 2022
4475d55
Fix metric descriptions
scottgigante-immunai Jul 12, 2022
001bbf8
Merge branch 'main' into task-denoising-method-dca
scottgigante-immunai Jul 12, 2022
2749be8
don't compute coverage on non-test args
scottgigante-immunai Jul 12, 2022
15 changes: 15 additions & 0 deletions docker/openproblems-python-tf2.4/Dockerfile
@@ -0,0 +1,15 @@
FROM singlecellopenproblems/openproblems:latest

ARG NB_USER="sagemaker-user"
ARG NB_UID="1000"
ARG NB_GID="100"

USER root
WORKDIR /

# Install Python packages
COPY ./docker/openproblems-python-tf2.4/requirements.txt ./requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

USER $NB_UID
WORKDIR /home/$NB_USER
14 changes: 14 additions & 0 deletions docker/openproblems-python-tf2.4/README.md
@@ -0,0 +1,14 @@
# openproblems-python-tf2.4 Docker image

Base image: singlecellopenproblems/openproblems

OS: Debian Stretch

Python: 3.8

Python packages:


* keras >=2.4,<2.6
* tensorflow >=2.4,<2.5
* dca
4 changes: 4 additions & 0 deletions docker/openproblems-python-tf2.4/requirements.txt
@@ -0,0 +1,4 @@
dca==0.3.*
keras>=2.4,<2.6 # pinned in dca
pyyaml==5.4.1 # pinned in #431
tensorflow==2.4.* # pinned in dca
7 changes: 4 additions & 3 deletions openproblems/tasks/denoising/README.md
@@ -8,10 +8,11 @@ A key challenge in evaluating denoising methods is the general lack of a ground

# The metrics

-Metrics for data denoising aim to
+Metrics for data denoising aim to assess denoising accuracy by comparing the denoised *training* set to the randomly sampled *test* set. Two comparisons have been implemented, *MSE* and *Poisson*, which penalize differences between the denoised *train* and *test* sets under Gaussian or Poisson loss functions, respectively.

-* **TODO**: TODO
-* **TODO**: TODO
+The *MSE* metric multiplies the *denoised* data by the row sums of the *test* data, and divides by the sum of the *train* data. The result is a normalized version of the *denoised* data, which is compared to the *test* data via mean squared error (Gaussian loss).
+
+The *Poisson* metric multiplies the *denoised* data by the row sums of the *test* data, and divides by the sum of the *train* data. The result is a normalized version of the *denoised* data, which is compared to the *test* data via Poisson loss.

## API

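The two metric descriptions in the README change can be illustrated with a small NumPy sketch. This is not the code from the repository: the function names are hypothetical, and the rescaling is one assumed reading of the README text, in which the denoised data is scaled by the ratio of total *test* counts to total *train* counts before comparison. The Poisson loss here is the negative log-likelihood with the prediction-independent term dropped.

```python
import numpy as np


def rescale(denoised, train, test):
    # Assumption: scale by (total test counts) / (total train counts),
    # so the denoised train data is comparable in depth to the test data.
    return denoised * (test.sum() / train.sum())


def mse(denoised, train, test):
    # Gaussian loss: mean squared difference after rescaling.
    return float(np.mean((rescale(denoised, train, test) - test) ** 2))


def poisson_loss(denoised, train, test, eps=1e-8):
    # Poisson negative log-likelihood, omitting the log(test!) term,
    # which does not depend on the prediction.
    pred = rescale(denoised, train, test)
    return float(np.mean(pred - test * np.log(pred + eps)))


# Toy data: the train counts are exactly twice the test counts.
train = np.array([[4.0, 0.0], [2.0, 2.0]])
test = np.array([[2.0, 0.0], [1.0, 1.0]])
flat = np.full_like(train, 2.0)  # an uninformative "denoised" matrix
```

With this toy data, passing `train` itself as the denoised matrix reproduces `test` after rescaling, so both losses should favour it over the uninformative `flat` matrix.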
1 change: 1 addition & 0 deletions openproblems/tasks/denoising/methods/__init__.py
@@ -1,4 +1,5 @@
from .alra import alra
+from .dca import dca
from .magic import magic
from .magic import magic_approx
from .no_denoising import no_denoising
40 changes: 40 additions & 0 deletions openproblems/tasks/denoising/methods/dca.py
@@ -0,0 +1,40 @@
from ....tools.decorators import method
from ....tools.utils import check_version

import numpy as np
import scanpy as sc


def _dca(adata, test=False, epochs=None):
    if test:
        epochs = 30
    else:
        epochs = epochs or 300
    from dca.api import dca

    # find all-zero genes (columns)
    gene_sums = np.asarray(adata.obsm["train"].sum(axis=0)).flatten()
    is_missing = gene_sums == 0
    # make adata object with train counts
    adata2 = sc.AnnData(adata.obsm["train"])
    # mask all-zero genes
    adata2.X[:, is_missing] = 1
    # run DCA
    dca(adata2, epochs=epochs)
    adata.obsm["denoised"] = adata2.X  # adata2.X should now hold DCA's output
    # return masked values to zero
    adata.obsm["denoised"][:, is_missing] = 0
    adata.uns["method_code_version"] = check_version("dca")
    return adata


@method(
    method_name="DCA",
    paper_name="Single-cell RNA-seq denoising using...",
    paper_url="https://www.nature.com/articles/s41467-018-07931-2",
    paper_year=2019,
    code_url="https://github.com/theislab/dca",
    image="openproblems-python-tf2.4",
)
def dca(adata, test=False, epochs=None):
    return _dca(adata, test=test, epochs=epochs)
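
The masking step in `_dca` (set all-zero genes to 1 before denoising, then set them back to 0 afterwards) can be tried in isolation without installing DCA. The sketch below is illustrative only: `denoise_with_zero_gene_mask` and `column_mean_denoiser` are hypothetical names, and the column-mean function is a stand-in for the real denoiser. Unlike the diff above, it works on a copy of the training matrix, which is an added assumption about the desired behaviour.

```python
import numpy as np


def denoise_with_zero_gene_mask(train, denoiser):
    """Run `denoiser` with all-zero genes set to 1, then zero them again."""
    train = np.asarray(train, dtype=float)
    # find all-zero genes (columns)
    is_missing = train.sum(axis=0) == 0
    # work on a copy so the caller's training matrix is left untouched
    masked = train.copy()
    masked[:, is_missing] = 1
    # np.array copies, so the denoiser's own output is not modified either
    denoised = np.array(denoiser(masked), dtype=float)
    # return masked genes to zero
    denoised[:, is_missing] = 0
    return denoised


def column_mean_denoiser(x):
    # Hypothetical stand-in for DCA: replace every entry by its column mean.
    return np.tile(x.mean(axis=0), (x.shape[0], 1))


train = np.array([[1.0, 0.0, 3.0], [3.0, 0.0, 1.0]])
denoised = denoise_with_zero_gene_mask(train, column_mean_denoiser)
```

The middle gene is zero in every cell, so it is zero again in `denoised`, while the other two genes carry the stand-in denoiser's output.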