Weekly patch release
App
Added
- Added the `start` method to the work (#15523)
- Added a `MultiNode` Component to run distributed computation with any framework (#15524)
- Exposed `RunWorkExecutor` to the work and provided default ones for the `MultiNode` Component (#15561)
- Added a `start_with_flow` flag to the `LightningWork` which can be disabled to prevent the work from starting at the same time as the flow (#15591)
- Added support for running Lightning App with VSCode IDE debugger (#15590)
- Added bi-directional delta updates between the flow and the works (#15582)
- Added `--setup` flag to the `lightning run app` CLI command allowing for dependency installation via app comments (#15577)
- Auto-upgrade / detect environment mismatch from the CLI (#15434)
- Added Serve component (#15609)
Changed
- Changed `flow.flows` to be recursive to align its behavior with `flow.works` (#15466)
- The `params` argument in `TracerPythonScript.run` no longer prepends `--` automatically to parameters (#15518)
- Only check versions / env when not in the cloud (#15504)
- Periodically sync database to the drive (#15441)
- Slightly safer multi node (#15538)
- Reuse existing commands when running connect more than once (#15471)
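
Because `TracerPythonScript.run` no longer adds the `--` prefix itself, callers that relied on it may need to add the prefix before passing `params`. The following is a minimal sketch, assuming `params` is a mapping from parameter names to values; the helper name `add_dashes` is hypothetical and not part of Lightning.

```python
from typing import Any, Dict


def add_dashes(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `params` in which every key starts with '--'.

    Hypothetical helper for illustration only; keys that already carry
    the prefix are left unchanged.
    """
    return {
        (key if key.startswith("--") else f"--{key}"): value
        for key, value in params.items()
    }


# Keys are prefixed explicitly before being handed to the script runner.
prefixed = add_dashes({"lr": 0.1, "--epochs": 3})
```

The resulting mapping can then be passed as `params`, so that the prefix is applied by the caller rather than by the component.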
Fixed
- Fixed writing app name and id in connect.txt file for the command CLI (#15443)
- Fixed missing root flow among the flows of the app (#15531)
- Fixed a bug with the Multi Node Component and added some examples (#15557)
- Fixed a bug where payload would take a very long time locally (#15557)
- Fixed an issue with the `lightning` CLI taking a long time to error out when the cloud is not reachable (#15412)
Lite
Fixed
- Fixed an issue with the SLURM `srun` detection causing permission errors (#15485)
- Fixed the import of `lightning_lite` causing a warning 'Redirects are currently not supported in Windows or MacOs' (#15610)
PyTorch
Fixed
- Fixed `TensorBoardLogger` not validating the input array type when logging the model graph (#15323)
- Fixed an attribute error in `ColossalAIStrategy` at import time when `torch.distributed` is not available (#15535)
- Fixed an issue when calling `fs.listdir` with file URI instead of path in `CheckpointConnector` (#15413)
- Fixed an issue with the `BaseFinetuning` callback not setting the `track_running_stats` attribute for batch normalization layers (#15063)
- Fixed an issue with `WandbLogger(log_model=True|'all')` raising an error and not being able to serialize tensors in the metadata (#15544)
- Fixed the gradient unscaling logic when using `Trainer(precision=16)` and fused optimizers such as `Adam(..., fused=True)` (#15544)
- Fixed model state transfer in multiprocessing launcher when running multi-node (#15567)
- Fixed manual optimization raising `AttributeError` with Bagua Strategy (#12534)
- Fixed the import of `pytorch_lightning` causing a warning 'Redirects are currently not supported in Windows or MacOs' (#15610)
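
The `fs.listdir` fix above concerns the difference between a file URI and a plain path. As a rough illustration of that distinction only (this is not the code from the fix, and the helper name `strip_file_scheme` is hypothetical), a `file://` URI can be reduced to a local path with the standard library:

```python
from urllib.parse import unquote, urlparse


def strip_file_scheme(location: str) -> str:
    """Return the local path for a `file://` URI; leave other inputs as-is.

    Hypothetical helper for illustration; assumes POSIX-style paths.
    """
    parsed = urlparse(location)
    if parsed.scheme == "file":
        # Percent-encoded characters such as %20 are decoded.
        return unquote(parsed.path)
    return location


local_dir = strip_file_scheme("file:///tmp/my%20checkpoints")
```

Plain paths and URIs with other schemes pass through unchanged, so the helper is safe to apply before any call that expects a path.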
Full Changelog: 1.8.0...1.8.1