dependabot bot commented on behalf of github Feb 14, 2021

Bumps pytorch-lightning from 1.0.3 to 1.1.8.

Release notes

Sourced from pytorch-lightning's releases.

Standard weekly patch release

[1.1.8] - 2021-02-08

Fixed

  • Separate epoch validation from step validation (#5208)
  • Fixed toggle_optimizers not handling all optimizer parameters (#5775)

Contributors

@ananthsub, @rohitgr7

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

[1.1.7] - 2021-02-03

Fixed

  • Fixed TensorBoardLogger not closing SummaryWriter on finalize (#5696)
  • Fixed filtering of pytorch "unsqueeze" warning when using DP (#5622)
  • Fixed num_classes argument in F1 metric (#5663)
  • Fixed log_dir property (#5537)
  • Fixed a race condition in ModelCheckpoint when checking if a checkpoint file exists (#5144)
  • Remove unnecessary intermediate layers in Dockerfiles (#5697)
  • Fixed auto learning rate ordering (#5638)

Contributors

@awaelchli @guillochon @noamzilo @rohitgr7 @SkafteNicki @sumanthratna

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

[1.1.6] - 2021-01-26

Changed

  • Increased TPU check timeout from 20s to 100s (#5598)
  • Ignored step param in Neptune logger's log_metric method (#5510)
  • Pass batch outputs to on_train_batch_end instead of epoch_end outputs (#4369)

Fixed

  • Fixed toggle_optimizer to reset requires_grad state (#5574)
  • Fixed FileNotFoundError for best checkpoint when using DDP with Hydra (#5629)
  • Fixed an error when logging a progress bar metric with a reserved name (#5620)
  • Fixed Metric's state_dict not included when child modules (#5614)
  • Fixed Neptune logger creating multiple experiments when GPUs > 1 (#3256)
  • Fixed duplicate logs appearing in console when using the python logging module (#5509)
  • Fixed tensor printing in trainer.test() (#5138)
  • Fixed not using dataloader when hparams present (#4559)

... (truncated)

Changelog

Sourced from pytorch-lightning's changelog.

[1.1.8] - 2021-02-08

Fixed

  • Separate epoch validation from step validation (#5208)
  • Fixed toggle_optimizers not handling all optimizer parameters (#5775)

[1.1.7] - 2021-02-03

Fixed

  • Fixed TensorBoardLogger not closing SummaryWriter on finalize (#5696)
  • Fixed filtering of pytorch "unsqueeze" warning when using DP (#5622)
  • Fixed num_classes argument in F1 metric (#5663)
  • Fixed log_dir property (#5537)
  • Fixed a race condition in ModelCheckpoint when checking if a checkpoint file exists (#5144)
  • Remove unnecessary intermediate layers in Dockerfiles (#5697)
  • Fixed auto learning rate ordering (#5638)

[1.1.6] - 2021-01-26

Changed

  • Increased TPU check timeout from 20s to 100s (#5598)
  • Ignored step param in Neptune logger's log_metric method (#5510)
  • Pass batch outputs to on_train_batch_end instead of epoch_end outputs (#4369)

Fixed

  • Fixed toggle_optimizer to reset requires_grad state (#5574)
  • Fixed FileNotFoundError for best checkpoint when using DDP with Hydra (#5629)
  • Fixed an error when logging a progress bar metric with a reserved name (#5620)
  • Fixed Metric's state_dict not included when child modules (#5614)
  • Fixed Neptune logger creating multiple experiments when GPUs > 1 (#3256)
  • Fixed duplicate logs appearing in console when using the python logging module (#5509)
  • Fixed tensor printing in trainer.test() (#5138)
  • Fixed not using dataloader when hparams present (#4559)

[1.1.5] - 2021-01-19

Fixed

  • Fixed a visual bug in the progress bar display initialization (#4579)
  • Fixed logging on_train_batch_end in a callback with multiple optimizers (#5521)
  • Fixed reinit_scheduler_properties with correct optimizer (#5519)
  • Fixed val_check_interval with fast_dev_run (#5540)

... (truncated)

Commits

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

dependabot bot added the dependencies label (Pull requests that update a dependency file) Feb 14, 2021
dependabot bot force-pushed the dependabot/pip/python/requirements/pytorch-lightning-1.1.8 branch from 63128ec to 1c83739 February 16, 2021 03:38
dependabot bot force-pushed the dependabot/pip/python/requirements/pytorch-lightning-1.1.8 branch from 1c83739 to ad0e4b5 February 18, 2021 07:15

dependabot bot commented on behalf of github Feb 20, 2021

Superseded by #13.

dependabot bot closed this Feb 20, 2021
dependabot bot deleted the dependabot/pip/python/requirements/pytorch-lightning-1.1.8 branch February 20, 2021 08:02
rkooo567 pushed a commit that referenced this pull request Jul 27, 2022
We encountered a SIGSEGV when running the Python test `python/ray/tests/test_failure_2.py::test_list_named_actors_timeout`. The stack is:

```
#0  0x00007fffed30f393 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&) ()
   from /lib64/libstdc++.so.6
#1  0x00007fffee707649 in ray::RayLog::GetLoggerName() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#2  0x00007fffee70aa90 in ray::SpdLogMessage::Flush() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#3  0x00007fffee70af28 in ray::RayLog::~RayLog() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#4  0x00007fffee2b570d in ray::asio::testing::(anonymous namespace)::DelayManager::Init() [clone .constprop.0] ()
   from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#5  0x00007fffedd0d95a in _GLOBAL__sub_I_asio_chaos.cc () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#6  0x00007ffff7fe282a in call_init.part () from /lib64/ld-linux-x86-64.so.2
#7  0x00007ffff7fe2931 in _dl_init () from /lib64/ld-linux-x86-64.so.2
#8  0x00007ffff7fe674c in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
#9  0x00007ffff7b82e79 in _dl_catch_exception () from /lib64/libc.so.6
#10 0x00007ffff7fe5ffe in _dl_open () from /lib64/ld-linux-x86-64.so.2
#11 0x00007ffff7d5f39c in dlopen_doit () from /lib64/libdl.so.2
#12 0x00007ffff7b82e79 in _dl_catch_exception () from /lib64/libc.so.6
#13 0x00007ffff7b82f13 in _dl_catch_error () from /lib64/libc.so.6
#14 0x00007ffff7d5fb09 in _dlerror_run () from /lib64/libdl.so.2
#15 0x00007ffff7d5f42a in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#16 0x00007fffef04d330 in py_dl_open (self=<optimized out>, args=<optimized out>)
    at /tmp/python-build.20220507135524.257789/Python-3.7.11/Modules/_ctypes/callproc.c:1369
```

The root cause is that loading `_raylet.so` initializes `static DelayManager _delay_manager`, which executes `RAY_LOG(ERROR) << "RAY_testing_asio_delay_us is set to " << delay_env;`. At that point, however, the static variables defined in `logging.cc` have not been initialized yet (in this case, `std::string RayLog::logger_name_ = "ray_log_sink"`), so `RayLog::GetLoggerName()` copies a string that has not been constructed.

It's better not to rely on the initialization order of static variables in different compilation units, because that order is not guaranteed. I propose changing all `RAY_LOG` calls in `DelayManager::Init()` to `std::cerr`.
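
A minimal sketch of the proposed change, not the actual Ray code: the `sketch_cerr` namespace, `ParseDelayUs`, and the plain-integer reading of the variable are assumptions made for illustration. The point it shows is that a constructor running during static initialization writes to `std::cerr` instead of calling into a logger whose own statics live in another compilation unit. Including `<iostream>` before the static object is defined ensures the standard streams are ready by then.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <iostream>

namespace sketch_cerr {

// Hypothetical helper: treats the value as a plain non-negative integer.
// The real format of RAY_testing_asio_delay_us may differ.
std::int64_t ParseDelayUs(const char *text) {
  assert(text != nullptr);
  char *end = nullptr;
  const long long value = std::strtoll(text, &end, 10);
  const bool parsed = (end != text) && (*end == '\0') && (value >= 0);
  return parsed ? static_cast<std::int64_t>(value) : 0;
}

// Simplified stand-in for the real DelayManager.
class DelayManager {
 public:
  DelayManager() { Init(); }

  // Runs while this translation unit's statics are being initialized, so it
  // must not touch non-local statics defined in other translation units.
  void Init() {
    const char *delay_env = std::getenv("RAY_testing_asio_delay_us");
    if (delay_env == nullptr) {
      return;
    }
    // std::cerr instead of RAY_LOG: no dependency on the logging statics.
    std::cerr << "RAY_testing_asio_delay_us is set to " << delay_env << std::endl;
    delay_us_ = ParseDelayUs(delay_env);
  }

  std::int64_t delay_us() const { return delay_us_; }

 private:
  std::int64_t delay_us_ = 0;
};

// Initialized when the shared object is loaded, as in the stack trace above.
static DelayManager _delay_manager;

std::int64_t get_delay_us() { return _delay_manager.delay_us(); }

}  // namespace sketch_cerr
```

The trade-off is that this one message bypasses the normal log sink, which seems acceptable for a testing-only setting.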

The crash happens in Ant's internal codebase. I'm not sure why this test case passes in the community version, though.

BTW, I also tried two other approaches:

1. Using a static local variable in `get_delay_us` and removing the global variable. This doesn't work because `init()` needs to access the variable as well.
2. Defining the global variable as a `std::unique_ptr<DelayManager>` and initializing it in `get_delay_us`. This works, but it requires a lock to be thread-safe.
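
For comparison, the second approach could look roughly like the sketch below. It is an illustration under assumptions, not code from the commit: the `sketch_lazy` namespace, the `g_`-prefixed names, and the placeholder `DelayManager` are hypothetical. `std::mutex` and `std::unique_ptr` both have `constexpr` default constructors, so the two globals are set up before any dynamic initialization runs, and the manager itself is only constructed on the first call.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>

namespace sketch_lazy {

// Placeholder for the real DelayManager; a real constructor would read
// RAY_testing_asio_delay_us here instead of using a fixed value.
class DelayManager {
 public:
  DelayManager() : delay_us_(0) {}
  std::int64_t delay_us() const { return delay_us_; }

 private:
  std::int64_t delay_us_;
};

// Both globals are constant-initialized, so their order relative to statics
// in other compilation units does not matter.
static std::mutex g_delay_manager_mutex;
static std::unique_ptr<DelayManager> g_delay_manager;

std::int64_t get_delay_us() {
  // The lock mentioned above: it makes the lazy construction thread-safe.
  std::lock_guard<std::mutex> guard(g_delay_manager_mutex);
  if (!g_delay_manager) {
    g_delay_manager.reset(new DelayManager());
  }
  assert(g_delay_manager != nullptr);
  return g_delay_manager->delay_us();
}

}  // namespace sketch_lazy
```

Every call to `get_delay_us` takes the mutex in this form, which is the cost the `std::cerr` change avoids.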
rkooo567 pushed a commit that referenced this pull request Jul 22, 2024
…e script and matching RLModule example class (tiny CNN).. (ray-project#45774)