
RETRO: RuntimeError: stack expects each tensor to be equal size, but got [2, 32] at entry 0 and [1, 32] at entry 29 #135

Open
mocarsha opened this issue Jul 21, 2022 · 2 comments

mocarsha commented Jul 21, 2022

Hi,

Running the exact code from GitHub for DeepMind's retrieval transformer (RETRO), I get the following error:

RuntimeError: stack expects each tensor to be equal size, but got [2, 32] at entry 0 and [1, 32] at entry 29

Could you please help me with this? I used the same dataset as in the code.
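
For context, the error itself only means that torch.stack was given tensors of unequal shape: entry 0 of the list has shape [2, 32] while entry 29 has shape [1, 32] (presumably two retrieved neighbors for one chunk vs. only one for another). A minimal standalone snippet, purely illustrative and not the RETRO dataset code, reproduces the same message:

import torch

# Two elements with different first dimensions, matching the shapes in the error message.
a = torch.zeros(2, 32)   # e.g. a chunk that got two retrieved neighbors
b = torch.zeros(1, 32)   # e.g. a chunk that got only one
torch.stack([a, b])      # RuntimeError: stack expects each tensor to be equal size,
                         # but got [2, 32] at entry 0 and [1, 32] at entry 1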

mocarsha reopened this Jul 21, 2022

vpj commented Aug 7, 2022

Can you please provide the full error?

vpj self-assigned this Aug 7, 2022
vpj added the "question" (Further information is requested) label Aug 7, 2022
vpj closed this as completed Aug 27, 2022
Zahin112 commented Jun 26, 2024

I am having the same issue while running train.py. Here's the full error in detail:

Load data...[DONE] 2.39ms
Tokenize...[DONE] 29.36ms
Build vocabulary...[DONE] 0.62ms
Load BERT tokenizer...[DONE] 340.26ms
Load BERT model...[DONE] 882.21ms
Load index...[DONE] 69.50ms
2024-06-25 11:59:59.603955: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-06-25 11:59:59.604002: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-06-25 11:59:59.605299: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-06-25 11:59:59.611551: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-25 12:00:00.750669: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
No labml server url specified. Please start a labml server and specify the URL. Docs: https://github.com/labmlai/labml/tree/master/app

retro_small: 706e157632ea11ef989a0242ac1c000c
[clean]: "cleanup notebooks"
116: Train: 5% 88,760ms loss.train: 3.71168 88,760ms 0:00m/ 0:47m
Traceback (most recent call last):
  File "/content/annotated_deep_learning_paper_implementations/labml_nn/transformers/retro/train.py", line 225, in <module>
    train()
  File "/content/annotated_deep_learning_paper_implementations/labml_nn/transformers/retro/train.py", line 213, in train
    trainer()
  File "/content/annotated_deep_learning_paper_implementations/labml_nn/transformers/retro/train.py", line 134, in __call__
    for i, (src, tgt, neighbors) in monit.enum('Train', self.dataloader):
  File "/usr/local/lib/python3.10/dist-packages/labml/internal/monitor/iterator.py", line 84, in __next__
    next_value = next(self._iterator)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 631, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 675, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/labml_nn/transformers/retro/dataset.py", line 131, in __getitem__
    neighbors = torch.stack([torch.stack([self.tds.text_to_i(n) for n in chunks]) for chunks in s[2]])
RuntimeError: stack expects each tensor to be equal size, but got [2, 32] at entry 0 and [1, 32] at entry 31

Also, it says "No labml server url specified. Please start a labml server and specify the URL." Do I need to create the server? Is it required? Can you explain, please?
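
From the traceback, the failure is in __getitem__ in dataset.py, where the per-chunk neighbor lists are stacked; the message suggests one chunk (entry 31) ended up with a single neighbor while the others have two, so torch.stack cannot form a uniform tensor. A minimal local workaround sketch, assuming s[2] is the list of per-chunk neighbor texts as the traceback implies (an illustration only, not the upstream fix), would be to truncate every chunk to the smallest neighbor count before stacking:

# Hypothetical edit inside __getitem__ in dataset.py; names taken from the traceback above.
min_n = min(len(chunks) for chunks in s[2])   # smallest neighbor count across chunks
neighbors = torch.stack([
    torch.stack([self.tds.text_to_i(n) for n in chunks[:min_n]])
    for chunks in s[2]
])

Padding the shorter neighbor lists to a common count would be an alternative that keeps all retrieved neighbors.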

vpj reopened this Jun 27, 2024