RuntimeError: random_ expects 'from' to be less than 'to', but got from=0 >= to=0 #74

Open
Shengqian95 opened this issue Oct 16, 2024 · 5 comments

Comments

@Shengqian95

Dear Dr. Rosen,

Many thanks for providing this wonderful integration tool. I have been using SATURN for my cell type evolution project. Other cell types integrated successfully, but I hit an error when integrating fibroblasts across 10 species (mouse and 9 others).

Pretraining...
^M 0%| | 0/200 [00:00<?, ?it/s]^MEpoch 1: L1 Loss 0.0 Rank Loss 97.66471862792969, Avg Loss CPLA: 1212, Avg Loss EBUR: 940,
Saving Pretrain AnnData
^M 0%| | 0/32 [00:00<?, ?it/s]^M 3%|▎ | 1/32 [00:00<00:06, 4.50it/s]^M 6%|▋ | 2/32 [00:00<00:06, 4.83it
STARTING METRIC LEARNING
STARTING METRIC TRAINING
Epoch 1 Iteration 0: Loss = 0.14469335973262787, Number of mined triplets = 128462
Traceback (most recent call last):
File "/share/home/bin/miniconda3/envs/SATURN_CUDA121/SATURN/train-saturn.ncores.aggl.py", line 1104, in <modu
trainer(args)
File "/share/home/bin/miniconda3/envs/SATURN_CUDA121/SATURN/train-saturn.ncores.aggl.py", line 829, in traine
train(metric_model, loss_func, mining_func, device,
File "/share/home/bin/miniconda3/envs/SATURN_CUDA121/SATURN/train-saturn.ncores.aggl.py", line 103, in train
indices_tuple = mining_func(embeddings, labels, species, mnn=mnn)
File "/share/home/bin/miniconda3/envs/SATURN_CUDA121/lib/python3.9/site-packages/torch/nn/modules/module.py",
return self.call_impl(*args, **kwargs)
File "/share/home/bin/miniconda3/envs/SATURN_CUDA121/lib/python3.9/site-packages/torch/nn/modules/module.py",
return forward_call(*args, **kwargs)
File "/share/home/bin/miniconda3/envs/SATURN_CUDA121/SATURN/miners/base_miner.py", line 31, in forward
mining_output = self.mine(embeddings, labels, ref_emb, ref_labels, species, mnn)
File "/share/home/bin/miniconda3/envs/SATURN_CUDA121/SATURN/miners/triplet_margin_miner.py", line 35, in mine
anchor_idx, positive_idx, negative_idx = lmu.get_species_triplet_indices(
File "/share/home/bin/miniconda3/envs/SATURN_CUDA121/SATURN/utils/loss_and_miner_utils.py", line 423, in get

n_rand = torch.randint(0, len(poss_n_ind), (choice_size,))
RuntimeError: random_ expects 'from' to be less than 'to', but got from=0 >= to=0

I wonder whether this is because the number of cells per fibroblast subcluster varies greatly (see the example below). Pairwise integration (e.g. mouse vs. one other species) works fine.

species 1, cell_type_final
c0 2790
c1 18
Name: count, dtype: int64
species 2, cell_type_final
c0 1753
c1 94
Name: count, dtype: int64
species 3, cell_type_final
c0 1067
c1 1004
Name: count, dtype: int64
...
species 10, cell_type_final
cel_type_final
c0 1989
c1 34
Name: count, dtype: int64

I'd appreciate it if you could get back to me. Thank you.

Best wishes,
Sheng

@Yanay1
Collaborator

Yanay1 commented Nov 3, 2024

What batch size are you using? How many cell types are there that could serve as negative examples (e.g., cells from the same species as the anchor cell but with a different cell type)?

@Shengqian95
Author

Many thanks for your reply. The batch size and pretrain batch size are both 1024. The detailed parameters are below. There are only two sub cell types of fibroblast (as shown above, c0 and c1) within each species.

python3 -u /share/home/qiansheng/bin/miniconda3/envs/SATURN_CUDA121/SATURN/train-saturn.ncores.aggl.py \
    --in_data=./${input_data}.csv \
    --device_num=$device \
    --pretrain_batch_size 1024 \
    --batch_size 1024 \
    --in_label_col=cel_type_final \
    --ref_label_col=cel_type_final \
    --work_dir=./multiple_seeds_results/ \
    --num_macrogenes=${nmac} \
    --pretrain \
    --model_dim=256 \
    --polling_freq=201 \
    --epochs=50 \
    --hv_genes=0 \
    --hv_span=1 \
    --pretrain_epochs=200 \
    --pe_sim_penalty=1.0 \
    --l1_penalty=0 \
    --centroid_score_func=default \
    --seed=0 \
    --org=${input_data}l1_0_pe_1.0_ESM2_macrogenes${nmac}_hv_genes_0_centroid_score_func_default_batch_label_split \
    --embedding_model=ESM2 >s_mac${nmac}.log 2>&1

@Yanay1
Collaborator

Yanay1 commented Nov 4, 2024

How many cells of each cell type are there?

@Shengqian95
Author

The number of cells in each sub-cell type (c0, c1, c2, ...) is below.
species1
cel_type_final
c0 3006
c1 281
c2 39
Name: count, dtype: int64

species2
cel_type_final
c0 19918
c1 5631
c2 670
c3 648
c4 409
c5 345
Name: count, dtype: int64

species3
cel_type_final
c0 8410
c1 6952
c2 1324
Name: count, dtype: int64

species4
cel_type_final
c0 5091
c1 459
c2 55
Name: count, dtype: int64

species5
cel_type_final
c0 899
c1 668
c2 591
c3 16
Name: count, dtype: int64

species6
cel_type_final
c0 7016
c1 108
Name: count, dtype: int64

species7
cel_type_final
c0 1247
c1 1030
c2 585
Name: count, dtype: int64

species8
cel_type_final
c0 23188
c1 286
Name: count, dtype: int64

@Yanay1
Collaborator

Yanay1 commented Nov 12, 2024

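I think the most likely cause is that, within a batch, some anchor cells have no valid negative examples for the triplet mining function. In that case poss_n_ind is empty, so torch.randint(0, len(poss_n_ind), ...) is called with an upper bound of 0, which raises the error in your traceback.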
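To get a feel for how often a batch might lack negatives, here is a rough back-of-the-envelope sketch. It rests on assumptions that are not confirmed against the SATURN code: that batches are drawn uniformly at random without replacement from all cells, and that a negative for an anchor is any cell in the batch from the same species with a different cell type. The counts are the ones posted above (only 8 of the 10 species were listed), and `prob_no_negatives` is a hypothetical helper, not part of SATURN.

```python
# Cell counts per species, copied from the counts posted in this thread.
counts = {
    "species1": {"c0": 3006, "c1": 281, "c2": 39},
    "species2": {"c0": 19918, "c1": 5631, "c2": 670, "c3": 648, "c4": 409, "c5": 345},
    "species3": {"c0": 8410, "c1": 6952, "c2": 1324},
    "species4": {"c0": 5091, "c1": 459, "c2": 55},
    "species5": {"c0": 899, "c1": 668, "c2": 591, "c3": 16},
    "species6": {"c0": 7016, "c1": 108},
    "species7": {"c0": 1247, "c1": 1030, "c2": 585},
    "species8": {"c0": 23188, "c1": 286},
}


def prob_no_negatives(counts, species, anchor_type, batch_size):
    """Probability that a uniformly drawn batch contains no cell from
    `species` whose type differs from `anchor_type` (assumed negative pool)."""
    total = sum(sum(per_type.values()) for per_type in counts.values())
    negatives = sum(n for t, n in counts[species].items() if t != anchor_type)
    # Hypergeometric probability of drawing zero of the `negatives` cells
    # when sampling without replacement: C(total - negatives, B) / C(total, B).
    p = 1.0
    for i in range(min(batch_size, total)):
        p *= (total - negatives - i) / (total - i)
    return p


# Anchors of the dominant type c0 depend on the rare types for negatives.
for sp in counts:
    print(sp, prob_no_negatives(counts, sp, "c0", batch_size=1024))
```

Re-running the loop with a larger `batch_size` shows how quickly the risk drops for species whose non-c0 cells are rare, such as species6 and species8. The real miner may restrict candidates further (for example by margin), so treat these numbers as optimistic.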

Without making major modifications to the codebase, you could consider the following steps to try and avoid this:

  1. increase the batch size for metric learning (so that there are more likely to be possible negative examples)
  2. reduce the number of species being integrated
  3. balance the cell type frequencies more, perhaps by re-annotating or by re-sampling the dataset
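For the re-sampling option, one simple approach is to cap the number of cells kept per (species, cell type) group, so the dominant type no longer swamps the rare ones. The sketch below is plain Python and is not part of SATURN; `cap_groups` and the cap of 1000 are hypothetical choices for illustration.

```python
import random
from collections import defaultdict


def cap_groups(species, cell_types, max_per_group, seed=0):
    """Return sorted row indices that keep at most `max_per_group`
    cells for every (species, cell type) combination."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for idx, key in enumerate(zip(species, cell_types)):
        groups[key].append(idx)
    keep = []
    for members in groups.values():
        if len(members) > max_per_group:
            # Randomly downsample over-represented groups.
            members = rng.sample(members, max_per_group)
        keep.extend(members)
    return sorted(keep)


# Toy labels mirroring the species8 imbalance above (23188 c0 vs 286 c1).
species = ["species8"] * (23188 + 286)
cell_types = ["c0"] * 23188 + ["c1"] * 286
keep = cap_groups(species, cell_types, max_per_group=1000)
print(len(keep))
```

With an AnnData object, the returned indices could be used to subset the data (for example `adata[keep].copy()`) before writing the file that SATURN reads. Downsampling discards cells, so it is worth checking that the capped dataset still covers the biology you care about.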

Alternatively, you could check out our method UCE (paper: https://www.biorxiv.org/content/10.1101/2023.11.28.568918v1, code: https://github.com/snap-stanford/UCE), which won't have any of these issues since it doesn't require training a model from scratch.
