Weekly Patch Release v1.9.2 (#16725)
* Add .git-blame-ignore-revs (#16709)

Co-authored-by: Jirka Borovec <[email protected]>

* Fix strategy type validation in connectors (#16693)

* Disable strict loading in multiprocessing launcher (#16365)


Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka <[email protected]>

* Fix min-epochs and early-stopping triggering too many validation runs (#16719)

Co-authored-by: Jirka Borovec <[email protected]>

* Update hydra-core requirement from <1.3.0,>=1.0.5 to >=1.0.5,<1.4.0 in /requirements (#16736)

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [App] Add support for private data (#16738)

Co-authored-by: thomas <[email protected]>

* [App] Add rm one level below project level (#16740)

Co-authored-by: Ethan Harris <[email protected]>
Co-authored-by: Justus Schock <[email protected]>
Co-authored-by: thomas <[email protected]>

* ci: cleaning caches (#16752)

* CI: Update colossalai version (#16747)

Co-authored-by: Carlos Mocholí <[email protected]>
type

* Update version and changelog for 1.9.2

---------

Co-authored-by: Akihiro Nitta <[email protected]>
Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: thomas chaton <[email protected]>
Co-authored-by: thomas <[email protected]>
Co-authored-by: Ethan Harris <[email protected]>
Co-authored-by: Justus Schock <[email protected]>
10 people authored Feb 15, 2023
1 parent 3add657 commit c5b836a
Showing 38 changed files with 478 additions and 201 deletions.
7 changes: 1 addition & 6 deletions .azure/gpu-tests-pytorch.yml
@@ -100,19 +100,14 @@ jobs:

- bash: pip uninstall -y -r requirements/pytorch/strategies.txt
condition: eq(variables['scope'], '')
displayName: 'UnInstall strategies'
displayName: 'Uninstall strategies'

- bash: |
set -e
CUDA_VERSION_BAGUA=$(python -c "print([ver for ver in [116,113,111,102] if $CUDA_VERSION_MM >= ver][0])")
pip install "bagua-cuda$CUDA_VERSION_BAGUA"
CUDA_VERSION_MM_COLOSSALAI=$(python -c "import torch ; print(''.join(map(str, torch.version.cuda)))")
CUDA_VERSION_COLOSSALAI=$(python -c "print([ver for ver in [11.3, 11.1] if $CUDA_VERSION_MM_COLOSSALAI >= ver][0])")
pip install "colossalai==0.1.12+torch${PYTORCH_VERSION}cu${CUDA_VERSION_COLOSSALAI}" --find-links https://release.colossalai.org
pip install -r requirements/pytorch/strategies.txt --find-links ${TORCH_URL}
python requirements/pytorch/check-avail-strategies.py
condition: eq(variables['scope'], 'strategies')
displayName: 'Install strategies'
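
The `CUDA_VERSION_BAGUA` line in this hunk selects a wheel with an inline list comprehension: it walks the supported builds from newest to oldest and takes the first one that does not exceed the installed CUDA version. A minimal sketch of that selection, assuming `CUDA_VERSION_MM` is an integer such as `117` (the function name `pick_supported_build` is hypothetical and not part of the repository):

```python
from typing import Sequence


def pick_supported_build(installed: int, supported: Sequence[int]) -> int:
    """Return the newest supported build that does not exceed `installed`.

    Hypothetical helper mirroring the inline expression used in the CI step.
    """
    # Sort newest-first so the first match is the closest supported build.
    candidates = [ver for ver in sorted(supported, reverse=True) if installed >= ver]
    if not candidates:
        # The inline CI expression would fail with an IndexError here instead.
        raise ValueError(f"no supported build for CUDA version {installed}")
    return candidates[0]


if __name__ == "__main__":
    print(pick_supported_build(117, [116, 113, 111, 102]))
```

Unlike the one-liner, this sketch sorts the list itself and raises an explicit error when no build matches, which may be easier to diagnose in CI logs.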
3 changes: 2 additions & 1 deletion .azure/ipu-tests.yml
@@ -71,7 +71,8 @@ jobs:
for fpath in `ls requirements/**/*.txt`; do \
python ./requirements/pytorch/adjust-versions.py $fpath; \
done
pip install -e .[dev]
pip install -e .[extra,examples,test]
pip list
env:
PACKAGE_NAME: "pytorch"
16 changes: 16 additions & 0 deletions .git-blame-ignore-revs
@@ -0,0 +1,16 @@
# copyright Lightning AI team (#16647)
770b7929255389503e907350e2380ff449229816
# [App] Add Missing Copyright (#16625)
2bab2bac01694680b6c3e4f3a19d5bcd361fcaf4
# adding license (#16450)
e4c3441b25a8c194a873c8850e9507771de7053c
# update copyright in PL & Fabric (#16481)
98f7696d1681974d34fad59c03b4b58d9524ed13
# add copyr (#6661)
d471fa30b3bf95cfe601014bac544754067241ca
# add copyright to tests (#5143)
35401706bf0b89b07bc1748fdc2df612baa2be2a
# added copyright notices (#3062)
f43028f3ae5333b4ef0b08cc34f5560736381962
# copyright (#2710)
44d85c12191098b9bad40536375b29b154d91a47
34 changes: 34 additions & 0 deletions .github/workflows/cleanup-caches.yml
@@ -0,0 +1,34 @@
# https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#force-deleting-cache-entries
name: cleanup caches by a branch
on:
pull_request:
types: [closed]

jobs:

pr-cleanup:
runs-on: ubuntu-latest
steps:
- name: Check out code
uses: actions/checkout@v3

- name: Cleanup
run: |
gh extension install actions/gh-actions-cache
REPO=${{ github.repository }}
BRANCH="refs/pull/${{ github.event.pull_request.number }}/merge"
echo "Fetching list of cache key"
cacheKeysForPR=$(gh actions-cache list -R $REPO -B $BRANCH | cut -f 1 )
## Setting this to not fail the workflow while deleting cache keys.
set +e
echo "Deleting caches..."
for cacheKey in $cacheKeysForPR
do
gh actions-cache delete $cacheKey -R $REPO -B $BRANCH --confirm
done
echo "Done"
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
5 changes: 1 addition & 4 deletions dockers/base-cuda/Dockerfile
@@ -152,10 +152,7 @@ RUN \
# install ColossalAI
# TODO: 1.13 wheels are not released, remove skip once they are
if [[ $PYTORCH_VERSION != "1.13" ]]; then \
CUDA_VERSION_MM_COLOSSALAI=$(python -c "import torch ; print(''.join(map(str, torch.version.cuda)))") ; \
CUDA_VERSION_COLOSSALAI=$(python -c "print([ver for ver in [11.3, 11.1] if $CUDA_VERSION_MM_COLOSSALAI >= ver][0])") ; \
pip install "colossalai==0.1.12+torch${PYTORCH_VERSION}cu${CUDA_VERSION_COLOSSALAI}" "setuptools==59.5.0" \
--find-links https://release.colossalai.org ; \
pip install "colossalai==0.2.3"; \
python -c "import colossalai; print(colossalai.__version__)" ; \
fi

2 changes: 1 addition & 1 deletion dockers/base-xla/tpu_workflow_pytorch.jsonnet
@@ -42,7 +42,7 @@ local tputests = base.BaseTest {
for fpath in `ls requirements/**/*.txt`; do
python requirements/pytorch/adjust-versions.py $fpath {PYTORCH_VERSION};
done
PACKAGE_NAME=pytorch pip install .[dev]
PACKAGE_NAME=pytorch pip install .[extra,test]
pip list
echo $KUBE_GOOGLE_CLOUD_TPU_ENDPOINTS
2 changes: 1 addition & 1 deletion requirements/app/base.txt
@@ -1,4 +1,4 @@
lightning-cloud>=0.5.24
lightning-cloud>=0.5.26
packaging
typing-extensions>=4.0.0, <=4.4.0
deepdiff>=5.7.0, <6.2.4
2 changes: 1 addition & 1 deletion requirements/pytorch/extra.txt
@@ -4,7 +4,7 @@
# extended list of package dependencies to reach full functionality
matplotlib>3.1, <3.6.2
omegaconf>=2.0.5, <2.4.0
hydra-core>=1.0.5, <1.3.0
hydra-core>=1.0.5, <1.4.0
jsonargparse[signatures]>=4.18.0, <4.19.0
rich>=10.14.0, !=10.15.0.a, <13.0.0
tensorboardX>=2.2, <=2.5.1 # min version is set by torch.onnx missing attribute
4 changes: 2 additions & 2 deletions requirements/pytorch/strategies.txt
@@ -1,9 +1,9 @@
# NOTE: the upper bound for the package version is only set for CI stability, and it is dropped while installing this package
# in case you want to preserve/enforce restrictions on the latest compatible version, add "strict" as an in-line comment

# colossalai>=0.1.12
colossalai>=0.2.0, <=0.2.4
fairscale>=0.4.5, <0.4.13
deepspeed>=0.6.0, <=0.8.0
deepspeed>=0.6.0, <0.8.0 # TODO: Include 0.8.x after https://github.com/microsoft/DeepSpeed/commit/b587c7e85470329ac25df7c7c2521ff9b2833db7 gets released
# no need to install with [pytorch] as pytorch is already installed
horovod>=0.21.2, !=0.24.0, <=0.26.1
hivemind==1.1.5; sys_platform == 'linux'
27 changes: 4 additions & 23 deletions src/lightning_app/CHANGELOG.md
@@ -5,32 +5,13 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).


## [1.9.2] - 2023-02-DD
## [1.9.2] - 2023-02-15

### Added

-


### Changed

-


### Deprecated

-


### Removed

-


### Fixed

-

- Added Storage Commands ([#16740](https://github.com/Lightning-AI/lightning/pull/16740))
* `rm`: Delete files from your Cloud Platform Filesystem
- Added `lightning connect data` to register data connection to private s3 buckets ([#16738](https://github.com/Lightning-AI/lightning/pull/16738))


## [1.9.1] - 2023-02-10
7 changes: 6 additions & 1 deletion src/lightning_app/cli/commands/cp.py
@@ -57,7 +57,7 @@ def cp(src_path: str, dst_path: str, r: bool = False, recursive: bool = False) -
if pwd == "/" or len(pwd.split("/")) == 1:
return _error_and_exit("Uploading files at the project level isn't allowed yet.")

client = LightningClient()
client = LightningClient(retry=False)

src_path, src_remote = _sanitize_path(src_path, pwd)
dst_path, dst_remote = _sanitize_path(dst_path, pwd)
@@ -87,6 +87,9 @@ def _upload_files(live, client: LightningClient, local_src: str, remote_dst: str
else:
project_id = _get_project_id_from_name(remote_dst)

if len(remote_splits) > 2:
remote_dst = os.path.join(*remote_splits[2:])

local_src = Path(local_src).resolve()
upload_paths = []

@@ -101,6 +104,8 @@ def _upload_files(live, client: LightningClient, local_src: str, remote_dst: str

clusters = client.projects_service_list_project_cluster_bindings(project_id)

live.stop()

for upload_path in upload_paths:
for cluster in clusters.clusters:
filename = str(upload_path).replace(str(os.getcwd()), "")[1:]
6 changes: 3 additions & 3 deletions src/lightning_app/cli/commands/ls.py
@@ -18,8 +18,8 @@
from typing import Generator, List, Optional

import click
import lightning_cloud
import rich
from fastapi import HTTPException
from lightning_cloud.openapi import Externalv1LightningappInstance
from rich.console import Console
from rich.live import Live
@@ -65,7 +65,7 @@ def ls(path: Optional[str] = None, print: bool = True, use_live: bool = True) ->
lines = f.readlines()
root = lines[0].replace("\n", "")

client = LightningClient()
client = LightningClient(retry=False)
projects = client.projects_service_list_memberships()

if root == "/":
@@ -256,7 +256,7 @@ def _collect_artifacts(
page_token=response.next_page_token,
tokens=tokens,
)
except HTTPException:
except lightning_cloud.openapi.rest.ApiException:
# Note: This is triggered when the request is wrong.
# This is currently happening due to looping through the user clusters.
pass
104 changes: 104 additions & 0 deletions src/lightning_app/cli/commands/rm.py
@@ -0,0 +1,104 @@
# Copyright The Lightning AI team.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os

import click
import lightning_cloud
import rich

from lightning_app.cli.commands.ls import _add_colors, _get_prefix
from lightning_app.cli.commands.pwd import _pwd
from lightning_app.utilities.app_helpers import Logger
from lightning_app.utilities.cli_helpers import _error_and_exit
from lightning_app.utilities.network import LightningClient

logger = Logger(__name__)


@click.argument("rm_path", required=True)
@click.option("-r", required=False, hidden=True)
@click.option("--recursive", required=False, hidden=True)
def rm(rm_path: str, r: bool = False, recursive: bool = False) -> None:
"""Delete files on the Lightning Cloud filesystem."""

root = _pwd()

if rm_path in (".", ".."):
return _error_and_exit('rm "." and ".." may not be removed')

if ".." in rm_path:
return _error_and_exit('rm ".." or higher may not be removed')

root = os.path.join(root, rm_path)
splits = [split for split in root.split("/") if split != ""]

if root == "/" or len(splits) == 1:
return _error_and_exit("rm at the project level isn't supported")

client = LightningClient(retry=False)
projects = client.projects_service_list_memberships()

project = [project for project in projects.memberships if project.name == splits[0]]

# This happens if the user changes cluster and the project doesn't exist.
if len(project) == 0:
return _error_and_exit(
f"There isn't any Lightning Project matching the name {splits[0]}." " HINT: Use `lightning cd`."
)

project_id = project[0].project_id

# Parallelise calls
lit_apps = client.lightningapp_instance_service_list_lightningapp_instances(project_id=project_id, async_req=True)
lit_cloud_spaces = client.cloud_space_service_list_cloud_spaces(project_id=project_id, async_req=True)

lit_apps = lit_apps.get().lightningapps
lit_cloud_spaces = lit_cloud_spaces.get().cloudspaces

lit_ressources = [lit_resource for lit_resource in lit_cloud_spaces if lit_resource.name == splits[1]]

if len(lit_ressources) == 0:

lit_ressources = [lit_resource for lit_resource in lit_apps if lit_resource.name == splits[1]]

if len(lit_ressources) == 0:
_error_and_exit(f"There isn't any Lightning Resource matching the name {splits[1]}.")

lit_resource = lit_ressources[0]

prefix = "/".join(splits[2:])
prefix = _get_prefix(prefix, lit_resource)

clusters = client.projects_service_list_project_cluster_bindings(project_id)
succeeded = False

for cluster in clusters.clusters:
try:
client.lightningapp_instance_service_delete_project_artifact(
project_id=project_id,
cluster_id=cluster.cluster_id,
filename=prefix,
)
succeeded = True
break
except lightning_cloud.openapi.rest.ApiException:
pass

prefix = os.path.join(*splits)

if succeeded:
rich.print(_add_colors(f"Successfully deleted `{prefix}`.", color="green"))
else:
return _error_and_exit(f"No file or folder named `{prefix}` was found.")
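
The path checks at the top of `rm` can be read in isolation from the cloud client calls. The following is a rough sketch of that guard logic only, not the shipped implementation: the function name `validate_rm_path` is hypothetical, it raises `ValueError` where the original calls `_error_and_exit`, and it uses `posixpath.join` instead of `os.path.join` so the result does not depend on the host platform.

```python
import posixpath
from typing import List


def validate_rm_path(pwd: str, rm_path: str) -> List[str]:
    """Return the path components to delete, or raise if the path is not allowed.

    Hypothetical stand-in for the guards in `rm`; errors are raised instead of
    being passed to `_error_and_exit`.
    """
    if rm_path in (".", ".."):
        raise ValueError('rm "." and ".." may not be removed')

    # Same substring check as the original: any ".." in the path is rejected.
    if ".." in rm_path:
        raise ValueError('rm ".." or higher may not be removed')

    full = posixpath.join(pwd, rm_path)
    splits = [part for part in full.split("/") if part != ""]

    # A single component is a project name; deleting at that level is refused.
    if full == "/" or len(splits) == 1:
        raise ValueError("rm at the project level isn't supported")

    return splits


if __name__ == "__main__":
    print(validate_rm_path("/my-project", "my-app/logs.txt"))
```

In this sketch `splits[0]` corresponds to the project name, `splits[1]` to the app or cloud space name, and `splits[2:]` to the artifact prefix, matching how the command indexes `splits` above. Because the `".."` check is a substring match, a file name such as `a..b` would also be rejected.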