The training model reported an error! #1
Here is my configuration file:

```yaml
batch_size: 64        # batch size for training
fine_tune_from:       # path to pre-trained model to fine-tune from
std_margin: 1         # margin for the std values in the loss
optimizer: adam       # optimizer to use. Options: adam, adamw
dataset:              # dataset parameters
scatnet:              # scatnet parameters
encoder_type: resnet  # choices: resnet, deit
hog:                  # histogram of oriented gradients parameters
comment: stl10_self_supervised_scale_z
```
You are using 16-bit precision while training on CPU. I do not think that is possible. Try changing the precision to 32.
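For reference, a minimal sketch of that change, assuming the `Trainer` is constructed directly in `main_self_supervised.py` (the real code may instead take the precision from the YAML config or command line):

```python
import pytorch_lightning as pl

# Request full 32-bit precision on CPU; precision=16 triggers native AMP,
# which Lightning rewrites to bf16 when no GPU is available (see the
# warning in the log below).
trainer = pl.Trainer(accelerator="cpu", precision=32)
```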
```
python main_self_supervised.py --config configs\stl10_self_supervised.yaml
E:\Anaconda\envs\mv_mr\lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
  warnings.warn(
E:\Anaconda\envs\mv_mr\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing `weights=None`.
  warnings.warn(msg)
E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\trainer\connectors\accelerator_connector.py:898: UserWarning: You are running on single node with no parallelization, so distributed has no effect.
rank_zero_warn("You are running on single node with no parallelization, so distributed has no effect.")
E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\trainer\connectors\accelerator_connector.py:658: UserWarning: You passed `Trainer(accelerator='cpu', precision=16)` but native AMP is not supported on CPU. Using `precision='bf16'` instead.
  rank_zero_warn(
Using bfloat16 Automatic Mixed Precision (AMP)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\trainer\configuration_validator.py:291: LightningDeprecationWarning: Base `Callback.on_train_batch_end` hook signature has changed in v1.5. The `dataloader_idx` argument will be removed in v1.7.
  rank_zero_deprecation(
Files already downloaded and verified
Files already downloaded and verified
Missing logger folder: lightning_logs\2023-09-19-15-52-00_stl10_self_supervised_scale_z
Files already downloaded and verified
Files already downloaded and verified
  | Name             | Type                | Params
---------------------------------------------------
0 | _encoder         | ResnetMultiProj     | 174 M
1 | _loss_dc         | DistanceCorrelation | 0
2 | _scatnet         | Scattering2D        | 0
3 | _hog             | HOGLayer            | 0
4 | _identity        | Identity            | 0
5 | online_finetuner | Linear              | 20.5 K
---------------------------------------------------
174 M     Trainable params
0         Non-trainable params
174 M     Total params
698.194   Total estimated model params size (MB)
Validation sanity check: 0it [00:00, ?it/s]
Files already downloaded and verified
Validation sanity check:   0%|          | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
File "F:\Fangweijie\mv-mr-main\main_self_supervised.py", line 92, in
main(args)
File "F:\Fangweijie\mv-mr-main\main_self_supervised.py", line 74, in main
trainer.fit(module, ckpt_path=path_checkpoint)
File "E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 740, in fit
self._call_and_handle_interrupt(
File "E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 685, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 777, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1199, in _run
self._dispatch()
File "E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1279, in _dispatch
self.training_type_plugin.start_training(self)
File "E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\plugins\training_type\training_type_plugin.py", line 202, in start_training
self._results = trainer.run_stage()
File "E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1289, in run_stage
return self._run_train()
File "E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1311, in _run_train
self._run_sanity_check(self.lightning_module)
File "E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1375, in _run_sanity_check
self._evaluation_loop.run()
File "E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\loops\base.py", line 145, in run
self.advance(*args, **kwargs)
File "E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\loops\dataloader\evaluation_loop.py", line 110, in advance
dl_outputs = self.epoch_loop.run(dataloader, dataloader_idx, dl_max_batches, self.num_dataloaders)
File "E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\loops\base.py", line 145, in run
self.advance(*args, **kwargs)
File "E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\loops\epoch\evaluation_epoch_loop.py", line 122, in advance
output = self._evaluation_step(batch, batch_idx, dataloader_idx)
File "E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\loops\epoch\evaluation_epoch_loop.py", line 217, in _evaluation_step
output = self.trainer.accelerator.validation_step(step_kwargs)
File "E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\accelerators\accelerator.py", line 239, in validation_step
return self.training_type_plugin.validation_step(*step_kwargs.values())
File "E:\Anaconda\envs\mv_mr\lib\site-packages\pytorch_lightning\plugins\training_type\training_type_plugin.py", line 219, in validation_step
return self.model.validation_step(*args, **kwargs)
File "F:\Fangweijie\mv-mr-main\src\model\self_supervised_module.py", line 277, in validation_step
return self.step(batch, batch_idx, stage='val')
File "F:\Fangweijie\mv-mr-main\src\model\self_supervised_module.py", line 260, in step
loss_dc = self._step_dc(im_orig, z_scaled, representation)
File "F:\Fangweijie\mv-mr-main\src\model\self_supervised_module.py", line 194, in _step_dc
im_orig_hog = self._hog(im_orig_r)
File "E:\Anaconda\envs\mv_mr\lib\site-packages\torch\nn\modules\module.py", line 1130, in call_impl
return forward_call(*input, **kwargs)
File "F:\Fangweijie\mv-mr-main\src\model\hog.py", line 39, in forward
out.scatter(1, phase_int.floor().long() % self.nbins, norm)
RuntimeError: scatter(): Expected self.dtype to be equal to src.dtype
```
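The crash itself is a dtype mismatch inside `HOGLayer.forward`: with bf16 AMP active, `norm` arrives as bfloat16 while `out` is float32, and `scatter()` requires both tensors to share a dtype. A minimal reproduction with a cast-based workaround, where the shapes and `nbins` value below are made-up stand-ins rather than values taken from the repo:

```python
import torch

# Stand-in shapes; only the dtype behaviour matters here.
nbins = 9
out = torch.zeros(4, nbins, 8, 8)                 # float32 destination buffer
phase_int = torch.rand(4, 1, 8, 8) * nbins        # stand-in for the gradient phase
norm = torch.rand(4, 1, 8, 8).to(torch.bfloat16)  # stand-in for the AMP-cast magnitude

idx = phase_int.floor().long() % nbins
# out.scatter(1, idx, norm)                    # raises the RuntimeError above
out = out.scatter(1, idx, norm.to(out.dtype))  # casting the source to match avoids it
```

Switching the run to `precision=32`, as suggested above, should sidestep the mismatch without touching `hog.py` at all.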