
[ray local cluster] nodes marked as uninitialized #39565

Open
jmakov opened this issue Sep 11, 2023 · 66 comments
Labels
bug (Something that is supposed to be working; but isn't) · core (Issues that should be addressed in Ray Core) · core-clusters (For launching and managing Ray clusters/jobs/kubernetes) · P2 (Important issue, but not time-critical) · stability
Milestone
Autoscaler V2
Comments

@jmakov
Contributor

jmakov commented Sep 11, 2023

What happened + What you expected to happen

Running ray up ray.yaml, I'd expect all 4 worker nodes to be set up and join the cluster, since I've set min_workers: 4. However, ray monitor ray.yaml shows the nodes as uninitialized.

Versions / Dependencies

ray 2.6.4
python 3.9.18
manjaro

Reproduction script

ray.yaml

# A unique identifier for the head node and workers of this cluster.
cluster_name: test

# Running Ray in Docker images is optional (this docker section can be commented out).
# This executes all commands on all nodes in the docker container,
# and opens all the necessary ports to support the Ray cluster.
# Empty string means disabled. Assumes Docker is installed.
#docker:
#    image: "rayproject/ray-ml:latest-gpu" # You can change this to latest-cpu if you don't need GPU support and want a faster startup
#    # image: rayproject/ray:latest-gpu   # use this one if you don't need ML dependencies, it's faster to pull
#    container_name: "ray_container"
#    # If true, pulls latest version of image. Otherwise, `docker run` will only pull the image
#    # if no cached version is present.
#    pull_before_run: True
#    run_options:   # Extra options to pass into "docker run"
#        - --ulimit nofile=65536:65536

provider:
    type: local
    head_ip: 192.168.0.101
    # You may need to supply a public ip for the head node if you need
    # to run `ray up` from outside of the Ray cluster's network
    # (e.g. the cluster is in an AWS VPC and you're starting ray from your laptop)
    # This is useful when debugging the local node provider with cloud VMs.
    # external_head_ip: YOUR_HEAD_PUBLIC_IP
    worker_ips:
      - 192.168.0.106
      - 192.168.0.107
      - 192.168.0.108
      - 192.168.0.110
    # Optional when running automatic cluster management on prem. If you use a coordinator server,
    # then you can launch multiple autoscaling clusters on the same set of machines, and the coordinator
    # will assign individual nodes to clusters as needed.
    #    coordinator_address: "<host>:<port>"

# How Ray will authenticate with newly launched nodes.
auth:
    ssh_user: myuser
    # You can comment out `ssh_private_key` if the following machines don't need a private key for SSH access to the Ray
    # cluster:
    #   (1) The machine on which `ray up` is executed.
    #   (2) The head node of the Ray cluster.
    #
    # The machine that runs ray up executes SSH commands to set up the Ray head node. The Ray head node subsequently
    # executes SSH commands to set up the Ray worker nodes. When you run ray up, ssh credentials sitting on the ray up
    # machine are copied to the head node -- internally, the ssh key is added to the list of file mounts to rsync to head node.
    # ssh_private_key: ~/.ssh/id_rsa

# The minimum number of worker nodes to launch in addition to the head
# node. This number should be >= 0.
# Typically, min_workers == max_workers == len(worker_ips).
# This field is optional.
min_workers: 4

# The maximum number of worker nodes to launch in addition to the head node.
# This takes precedence over min_workers.
# Typically, min_workers == max_workers == len(worker_ips).
# This field is optional.
#max_workers: 4
# The default behavior for manually managed clusters is
# min_workers == max_workers == len(worker_ips),
# meaning that Ray is started on all available nodes of the cluster.
# For automatically managed clusters, max_workers is required and min_workers defaults to 0.

# The autoscaler will scale up the cluster faster with higher upscaling speed.
# E.g., if the task requires adding more nodes then autoscaler will gradually
# scale up the cluster in chunks of upscaling_speed*currently_running_nodes.
# This number should be > 0.
upscaling_speed: 1.0

idle_timeout_minutes: 5

# Files or directories to copy to the head and worker nodes. The format is a
# dictionary from REMOTE_PATH: LOCAL_PATH. E.g. you could save your conda env to an environment.yaml file, mount
# that directory to all nodes and call `conda -n my_env -f /path1/on/remote/machine/environment.yaml`. In this
# example paths on all nodes must be the same (so that conda can be called always with the same argument)
file_mounts: {
    "/mnt/ray": ".",
}

# Files or directories to copy from the head node to the worker nodes. The format is a
# list of paths. The same path on the head node will be copied to the worker node.
# This behavior is a subset of the file_mounts behavior. In the vast majority of cases
# you should just use file_mounts. Only use this if you know what you're doing!
cluster_synced_files: []

# Whether changes to directories in file_mounts or cluster_synced_files in the head node
# should sync to the worker node continuously
file_mounts_sync_continuously: False

# Patterns for files to exclude when running rsync up or rsync down
rsync_exclude:
    - "**/.git"
    - "**/.git/**"

# Pattern files to use for filtering out files when running rsync up or rsync down. The file is searched for
# in the source directory and recursively through all subdirectories. For example, if .gitignore is provided
# as a value, the behavior will match git's behavior for finding and using .gitignore files.
rsync_filter:
    - ".gitignore"

# List of commands that will be run before `setup_commands`. If docker is
# enabled, these commands will run outside the container and before docker
# is setup.
initialization_commands: []

# List of shell commands to run to set up each node.
setup_commands:
    # If we have e.g. conda dependencies stored in "/path1/on/local/machine/environment.yaml", we can prepare the
    # work environment on each worker by:
    #   1. making sure each worker has access to this file i.e. see the `file_mounts` section
    #   2. adding a command here that creates a new conda environment on each node or if the environment already exists,
    #     it updates it:
    #      conda env create -q -n my_venv -f /path1/on/local/machine/environment.yaml || conda env update -q -n my_venv -f /path1/on/local/machine/environment.yaml
    #
    # Ray developers:
    # you probably want to create a Docker image that
    # has your Ray repo pre-cloned. Then, you can replace the pip installs
    # below with a git checkout <your_sha> (and possibly a recompile).
    # To run the nightly version of ray (as opposed to the latest), either use a rayproject docker image
    # that has the "nightly" (e.g. "rayproject/ray-ml:nightly-gpu") or uncomment the following line:
    # - pip install -U "ray[default] @ https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-3.0.0.dev0-cp37-cp37m-manylinux2014_x86_64.whl"
    - source ~/mambaforge-pypy3/etc/profile.d/conda.sh && mamba env update -f /mnt/ray/env.yaml --prune

# Custom commands that will be run on the head node after common setup.
head_setup_commands: []

# Custom commands that will be run on worker nodes after common setup.
worker_setup_commands: []

# Command to start ray on the head node. You don't need to change this.
head_start_ray_commands:
  # If we have e.g. conda dependencies, we could create on each node a conda environment (see `setup_commands` section).
  # In that case we'd have to activate that env on each node before running `ray`:
  # - conda activate my_venv && ray stop
  # - conda activate my_venv && ulimit -c unlimited && ray start --head --port=6379 --autoscaling-config=~/ray_bootstrap_config.yaml
    - source ~/mambaforge-pypy3/etc/profile.d/conda.sh && conda activate test && ray stop
    - source ~/mambaforge-pypy3/etc/profile.d/conda.sh && conda activate test && ulimit -c unlimited && ray start --head --disable-usage-stats --port=6379 --autoscaling-config=~/ray_bootstrap_config.yaml --system-config='{"automatic_object_spilling_enabled":true,"max_io_workers":8,"min_spilling_size":104857600,"object_spilling_config":"{\"type\":\"filesystem\",\"params\":{\"directory_path\":\"/mnt/ray/object_spilling\"}}"}'


# Command to start ray on worker nodes. You don't need to change this.
worker_start_ray_commands:
  # If we have e.g. conda dependencies, we could create on each node a conda environment (see `setup_commands` section).
  # In that case we'd have to activate that env on each node before running `ray`:
  # - conda activate my_venv && ray stop
  # - ray start --address=$RAY_HEAD_IP:6379
    - source ~/mambaforge-pypy3/etc/profile.d/conda.sh && conda activate test && ray stop
    - source ~/mambaforge-pypy3/etc/profile.d/conda.sh && conda activate test && ulimit -c unlimited && ray start --address=$RAY_HEAD_IP:6379 --disable-usage-stats

Issue Severity

High: It blocks me from completing my task.

@jmakov jmakov added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Sep 11, 2023
@jjyao jjyao added the core Issues that should be addressed in Ray Core label Sep 25, 2023
@rkooo567
Contributor

cc @rickyyx can you follow up with the investigation?

    type: local
    head_ip: 192.168.0.101
    # You may need to supply a public ip for the head node if you need
    # to run `ray up` from outside of the Ray cluster's network
    # (e.g. the cluster is in an AWS VPC and you're starting ray from your laptop)
    # This is useful when debugging the local node provider with cloud VMs.
    # external_head_ip: YOUR_HEAD_PUBLIC_IP
    worker_ips:
      - 192.168.0.106
      - 192.168.0.107
      - 192.168.0.108
      - 192.168.0.110

Can you tell us what exactly this is for?

@rickyyx rickyyx self-assigned this Sep 25, 2023
@rickyyx rickyyx added this to the Autoscaler V2 milestone Sep 25, 2023
@rickyyx
Contributor

rickyyx commented Sep 25, 2023

Hey @jmakov - would you be able to share any monitor.* logs that were generated? That would be helpful for debugging.
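
(For anyone else gathering these, a rough sketch of how the monitor logs can be pulled, assuming the cluster config file is named ray.yaml and the launcher machine can reach the head node:)

# Stream the autoscaler log through the launcher (as used elsewhere in this thread):
ray monitor ray.yaml

# Or tail the log files directly on the head node over SSH:
ray exec ray.yaml "tail -n 200 /tmp/ray/session_latest/logs/monitor.*"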

@jmakov
Contributor Author

jmakov commented Sep 26, 2023

Didn't see anything exciting happening there, only monitor.log has some entries:

2023-09-22 21:37:07,546 INFO monitor.py:699 -- Starting monitor using ray installation: /home/jernej_m/mambaforge-pypy3/envs/test_ray/lib/python3.10/site-packages/ray/__init__.py
2023-09-22 21:37:07,546 INFO monitor.py:700 -- Ray version: 2.6.3
2023-09-22 21:37:07,546 INFO monitor.py:701 -- Ray commit: {{RAY_COMMIT_SHA}}
2023-09-22 21:37:07,546 INFO monitor.py:702 -- Monitor started with command: ['/home/jernej_m/mambaforge-pypy3/envs/test_ray/lib/python3.10/site-packages/ray/autoscaler/_private/monitor.py', '--logs-dir=/tmp/ray/session_2023-09-22_21-37-05_827384_110848/logs', '--logging-rotate-bytes=536870912', '--logging-rotate-backup-count=5', '--gcs-address=192.168.0.101:6379', '--autoscaling-config=~/ray_bootstrap_config.yaml', '--monitor-ip=192.168.0.101']
2023-09-22 21:37:07,552 INFO monitor.py:167 -- session_name: session_2023-09-22_21-37-05_827384_110848
2023-09-22 21:37:07,554 INFO monitor.py:199 -- Starting autoscaler metrics server on port 44217
2023-09-22 21:37:07,556 INFO monitor.py:224 -- Monitor: Started
2023-09-22 21:37:07,571 INFO node_provider.py:53 -- ClusterState: Loaded cluster state: []
2023-09-22 21:37:07,572 INFO node_provider.py:114 -- ClusterState: Writing cluster state: ['192.168.0.106', '192.168.0.107', '192.168.0.108', '192.168.0.110', '192.168.0.101']
2023-09-22 21:37:07,572 INFO autoscaler.py:274 -- disable_node_updaters:False
2023-09-22 21:37:07,572 INFO autoscaler.py:282 -- disable_launch_config_check:False
2023-09-22 21:37:07,572 INFO autoscaler.py:294 -- foreground_node_launch:False
2023-09-22 21:37:07,572 INFO autoscaler.py:304 -- worker_liveness_check:True
2023-09-22 21:37:07,572 INFO autoscaler.py:312 -- worker_rpc_drain:True
2023-09-22 21:37:07,573 INFO autoscaler.py:362 -- StandardAutoscaler: {'cluster_name': 'test', 'auth': {'ssh_user': 'jernej_m', 'ssh_private_key': '~/ray_bootstrap_key.pem'}, 'upscaling_speed': 1.0, 'idle_timeout_minutes': 5, 'docker': {}, 'initialization_commands': [], 'setup_commands': ['source ~/mambaforge-pypy3/etc/profile.d/conda.sh && mamba env update -f /mnt/ray/mount/env.yaml -n test_ray --prune'], 'head_setup_commands': ['source ~/mambaforge-pypy3/etc/profile.d/conda.sh && mamba env update -f /mnt/ray/mount/env.yaml -n test_ray --prune'], 'worker_setup_commands': ['source ~/mambaforge-pypy3/etc/profile.d/c>
2023-09-22 21:37:07,574 INFO monitor.py:394 -- Autoscaler has not yet received load metrics. Waiting.
2023-09-22 21:37:12,588 INFO autoscaler.py:141 -- The autoscaler took 0.0 seconds to fetch the list of non-terminated nodes.
2023-09-22 21:37:12,588 INFO load_metrics.py:161 -- LoadMetrics: Removed ip: 192.168.0.108.
2023-09-22 21:37:12,588 INFO load_metrics.py:164 -- LoadMetrics: Removed 1 stale ip mappings: {'192.168.0.108'} not in {'192.168.0.101'}
2023-09-22 21:37:12,589 INFO autoscaler.py:421 --
======== Autoscaler status: 2023-09-22 21:37:12.589294 ========
Node status
---------------------------------------------------------------
Healthy:
 1 local.cluster.node
Pending:
 (no pending nodes)
Recent failures:
 (no failures)

Resources
---------------------------------------------------------------
Usage:
 0.0/32.0 CPU
 0.0/2.0 GPU
 0B/77.60GiB memory
 0B/37.25GiB object_store_memory

Demands:
 (no resource demands)
2023-09-22 21:37:12,590 INFO autoscaler.py:1368 -- StandardAutoscaler: Queue 4 new nodes for launch
2023-09-22 21:37:12,590 INFO autoscaler.py:464 -- The autoscaler took 0.003 seconds to complete the update iteration.
2023-09-22 21:37:12,591 INFO node_launcher.py:174 -- NodeLauncher0: Got 4 nodes to launch.
2023-09-22 21:37:12,592 INFO monitor.py:424 -- :event_summary:Resized to 56 CPUs, 4 GPUs.
2023-09-22 21:37:12,594 INFO node_provider.py:114 -- ClusterState: Writing cluster state: ['192.168.0.106', '192.168.0.107', '192.168.0.108', '192.168.0.110', '192.168.0.101']
2023-09-22 21:37:12,594 INFO node_provider.py:114 -- ClusterState: Writing cluster state: ['192.168.0.106', '192.168.0.107', '192.168.0.108', '192.168.0.110', '192.168.0.101']
2023-09-22 21:37:12,595 INFO node_provider.py:114 -- ClusterState: Writing cluster state: ['192.168.0.106', '192.168.0.107', '192.168.0.108', '192.168.0.110', '192.168.0.101']
2023-09-22 21:37:12,596 INFO node_provider.py:114 -- ClusterState: Writing cluster state: ['192.168.0.106', '192.168.0.107', '192.168.0.108', '192.168.0.110', '192.168.0.101']
2023-09-22 21:37:12,596 INFO node_launcher.py:174 -- NodeLauncher0: Launching 4 nodes, type local.cluster.node.
2023-09-22 21:37:17,608 INFO autoscaler.py:141 -- The autoscaler took 0.001 seconds to fetch the list of non-terminated nodes.
2023-09-22 21:37:17,609 INFO autoscaler.py:421 --
======== Autoscaler status: 2023-09-22 21:37:17.609649 ========
Node status
---------------------------------------------------------------
Healthy:
 2 local.cluster.node
Pending:
 192.168.0.106: local.cluster.node, uninitialized
 192.168.0.107: local.cluster.node, uninitialized
 192.168.0.110: local.cluster.node, uninitialized
Recent failures:
 (no failures)

Resources
---------------------------------------------------------------
Usage:
 0.0/56.0 CPU
 0.0/4.0 GPU
 0B/98.01GiB memory
 0B/46.00GiB object_store_memory

Demands:
 (no resource demands)
2023-09-22 21:37:17,619 INFO autoscaler.py:1316 -- Creating new (spawn_updater) updater thread for node 192.168.0.106.
2023-09-22 21:37:17,620 INFO autoscaler.py:1316 -- Creating new (spawn_updater) updater thread for node 192.168.0.107.
2023-09-22 21:37:17,620 INFO autoscaler.py:1316 -- Creating new (spawn_updater) updater thread for node 192.168.0.108.
2023-09-22 21:37:17,620 INFO autoscaler.py:1316 -- Creating new (spawn_updater) updater thread for node 192.168.0.110.
Running everything manually works. It would be nice to have a working cluster launcher for on-prem clusters.
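
For reference, "running everything manually" on a local cluster like this boils down to something like the following sketch (adapted from the start commands in the yaml above; the conda activation is omitted and the head IP is the one from the config):

# On the head node (192.168.0.101):
ray stop
ray start --head --port=6379

# On each worker node:
ray stop
ray start --address=192.168.0.101:6379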

@ajaichemmanam

+1, same issue for me, even with systems in the cloud (a third-party cloud, not AWS/GCS/Azure). I opened all ports; sometimes the workers get connected, sometimes they show as uninitialized.

@rickyyx
Contributor

rickyyx commented Oct 2, 2023

cc @gvspraveen could someone from the cluster team help take a look? I believe this is more relevant to the cluster launcher than to the actual autoscaling logic, since "running everything manually works".

@rickyyx rickyyx assigned gvspraveen and unassigned rickyyx Oct 2, 2023
@jmakov
Contributor Author

jmakov commented Oct 2, 2023

@rickyyx not to mention manually starting Ray not working and the cluster launcher not working. I'm wondering how Ray works at all for anybody. As someone who has used Ray for more than a year: every other release breaks a core part.

@rkooo567
Contributor

rkooo567 commented Oct 2, 2023

cc @anyscalesam can you triage this issue with @gvspraveen?

@rickyyx rickyyx added core-clusters For launching and managing Ray clusters/jobs/kubernetes and removed core Issues that should be addressed in Ray Core labels Oct 2, 2023
@architkulkarni architkulkarni added P1 Issue that should be fixed within a few weeks and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Oct 3, 2023
@architkulkarni
Contributor

architkulkarni commented Oct 5, 2023

I'm able to reproduce this on AWS using pip install "ray[default]"==2.7.0 in the setup commands and the latest ray master on the client side for the cluster launcher. [See below - it was just a port issue on my end.]

@jmakov do you happen to remember if this was working for you on a previous version of Ray, and if so which one?

@jmakov
Contributor Author

jmakov commented Oct 6, 2023

The cluster launcher worked for me for the last 2+ years using a local cluster (without Docker, just a conda env). I think it was 2.6.0 before I made the mistake of upgrading, if I remember correctly. I think I'll just start writing my own tests and run them before every upgrade.
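
A minimal pre-upgrade smoke test along those lines might look like this (a sketch; assumes the config file is ray.yaml and the launcher machine has a matching Ray version installed):

ray up -y ray.yaml                 # bring the cluster up non-interactively
ray exec ray.yaml "ray status"     # check on the head node that all workers registered
ray down -y ray.yaml               # tear everything down again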

@ajaichemmanam

ajaichemmanam commented Oct 9, 2023

2023-10-09 11:46:28,208 INFO node_provider.py:53 -- ClusterState: Loaded cluster state: ['216.48.179.215', '164.52.201.70']
Fetched IP: 164.52.201.70
Warning: Permanently added '164.52.201.70' (ED25519) to the list of known hosts.
==> /tmp/ray/session_latest/logs/monitor.err <==

==> /tmp/ray/session_latest/logs/monitor.log <==
2023-10-08 23:13:33,485 INFO monitor.py:690 -- Starting monitor using ray installation: /home/ray/anaconda3/lib/python3.11/site-packages/ray/__init__.py
2023-10-08 23:13:33,485 INFO monitor.py:691 -- Ray version: 2.7.1
2023-10-08 23:13:33,485 INFO monitor.py:692 -- Ray commit: 9f07c12615958c3af3760604f6dcacc4b3758a47
2023-10-08 23:13:33,486 INFO monitor.py:693 -- Monitor started with command: ['/home/ray/anaconda3/lib/python3.11/site-packages/ray/autoscaler/_private/monitor.py', '--logs-dir=/tmp/ray/session_2023-10-08_23-13-32_012785_2484/logs', '--logging-rotate-bytes=536870912', '--logging-rotate-backup-count=5', '--gcs-address=164.52.201.70:6379', '--autoscaling-config=/home/ray/ray_bootstrap_config.yaml', '--monitor-ip=164.52.201.70']
2023-10-08 23:13:33,489 INFO monitor.py:159 -- session_name: session_2023-10-08_23-13-32_012785_2484
2023-10-08 23:13:33,490 INFO monitor.py:191 -- Starting autoscaler metrics server on port 44217
2023-10-08 23:13:33,491 INFO monitor.py:216 -- Monitor: Started
2023-10-08 23:13:33,506 INFO node_provider.py:53 -- ClusterState: Loaded cluster state: []
2023-10-08 23:13:33,507 INFO node_provider.py:114 -- ClusterState: Writing cluster state: ['216.48.179.215', '164.52.201.70']
2023-10-08 23:13:33,507 INFO autoscaler.py:274 -- disable_node_updaters:False
2023-10-08 23:13:33,507 INFO autoscaler.py:282 -- disable_launch_config_check:False
2023-10-08 23:13:33,507 INFO autoscaler.py:294 -- foreground_node_launch:False
2023-10-08 23:13:33,507 INFO autoscaler.py:304 -- worker_liveness_check:True
2023-10-08 23:13:33,507 INFO autoscaler.py:312 -- worker_rpc_drain:True
2023-10-08 23:13:33,508 INFO autoscaler.py:362 -- StandardAutoscaler: {'cluster_name': 'default', 'auth': {'ssh_user': 'user', 'ssh_private_key': '~/ray_bootstrap_key.pem'}, 'upscaling_speed': 1.0, 'idle_timeout_minutes': 30, 'docker': {'image': 'rayproject/ray:2.7.1.9f07c1-py311-gpu', 'worker_image': 'rayproject/ray:2.7.1.9f07c1-py311-gpu', 'container_name': 'ray_container', 'pull_before_run': True, 'run_options': ['--ulimit nofile=65536:65536']}, 'initialization_commands': [], 'setup_commands': ['sudo apt-get update', 'sudo apt-get install gcc ffmpeg libsm6 libxext6  -y', 'pip install -r "/app/requirements-gpu.txt"'], 'head_setup_commands': ['sudo apt-get update', 'sudo apt-get install gcc ffmpeg libsm6 libxext6  -y', 'pip install -r "/app/requirements-gpu.txt"'], 'worker_setup_commands': ['sudo apt-get update', 'sudo apt-get install gcc ffmpeg libsm6 libxext6  -y', 'pip install -r "/app/requirements-gpu.txt"'], 'head_start_ray_commands': ['ray stop', 'ulimit -c unlimited && export RAY_health_check_timeout_ms=30000 && ray start --head --node-ip-address=164.52.201.70 --port=6379 --object-manager-port=8076 --autoscaling-config=~/ray_bootstrap_config.yaml --dashboard-host=0.0.0.0 --disable-usage-stats --log-color=auto -v'], 'worker_start_ray_commands': ['ray stop', 'ray start --address=164.52.201.70:6379 --object-manager-port=8076'], 'file_mounts': {'~/.ssh/id_rsa': '/home/ray/.ssh/id_rsa', '/app/requirements-gpu.txt': '/app/requirements-gpu.txt'}, 'cluster_synced_files': [], 'file_mounts_sync_continuously': False, 'rsync_exclude': ['**/.git', '**/.git/**'], 'rsync_filter': ['.gitignore'], 'provider': {'type': 'local', 'head_ip': '164.52.201.70', 'worker_ips': ['216.48.179.215']}, 'available_node_types': {'local.cluster.node': {'node_config': {}, 'resources': {}, 'min_workers': 1, 'max_workers': 1}}, 'head_node_type': 'local.cluster.node', 'max_workers': 1, 'no_restart': False}
2023-10-08 23:13:33,509 INFO monitor.py:385 -- Autoscaler has not yet received load metrics. Waiting.
2023-10-08 23:13:38,522 INFO autoscaler.py:141 -- The autoscaler took 0.0 seconds to fetch the list of non-terminated nodes.
2023-10-08 23:13:38,522 INFO autoscaler.py:421 -- 
======== Autoscaler status: 2023-10-08 23:13:38.522726 ========
Node status
---------------------------------------------------------------
Healthy:
 1 local.cluster.node
Pending:
 (no pending nodes)
Recent failures:
 (no failures)

Resources
---------------------------------------------------------------
Usage:
 0.0/12.0 CPU
 0.0/1.0 GPU
 0B/28.57GiB memory
 0B/14.29GiB object_store_memory

Demands:
 (no resource demands)
2023-10-08 23:13:38,524 INFO autoscaler.py:1379 -- StandardAutoscaler: Queue 1 new nodes for launch
2023-10-08 23:13:38,524 INFO autoscaler.py:464 -- The autoscaler took 0.002 seconds to complete the update iteration.
2023-10-08 23:13:38,524 INFO node_launcher.py:177 -- NodeLauncher0: Got 1 nodes to launch.
2023-10-08 23:13:38,525 INFO monitor.py:415 -- :event_summary:Resized to 12 CPUs, 1 GPUs.
2023-10-08 23:13:38,526 INFO node_provider.py:114 -- ClusterState: Writing cluster state: ['216.48.179.215', '164.52.201.70']
2023-10-08 23:13:38,526 INFO node_launcher.py:177 -- NodeLauncher0: Launching 1 nodes, type local.cluster.node.
2023-10-08 23:13:43,534 INFO autoscaler.py:141 -- The autoscaler took 0.0 seconds to fetch the list of non-terminated nodes.
2023-10-08 23:13:43,534 INFO autoscaler.py:421 -- 
======== Autoscaler status: 2023-10-08 23:13:43.534774 ========
Node status
---------------------------------------------------------------
Healthy:
 1 local.cluster.node
Pending:
 216.48.179.215: local.cluster.node, uninitialized
Recent failures:
 (no failures)

Resources
---------------------------------------------------------------
Usage:
 0.0/12.0 CPU
 0.0/1.0 GPU
 0B/28.57GiB memory
 0B/14.29GiB object_store_memory

Demands:
 (no resource demands)
2023-10-08 23:13:43,537 INFO autoscaler.py:1326 -- Creating new (spawn_updater) updater thread for node 216.48.179.215.

@ajaichemmanam

The above log is for
2023-10-08 23:13:33,485 INFO monitor.py:691 -- Ray version: 2.7.1
2023-10-08 23:13:33,485 INFO monitor.py:692 -- Ray commit: 9f07c12

@jmakov
Contributor Author

jmakov commented Oct 9, 2023

This issue is still present in ray 2.7.1

@ajaichemmanam

Let us know if any other details are required

@architkulkarni
Contributor

architkulkarni commented Oct 19, 2023

Actually, when I reproduced the issue earlier, I had forgotten to open all the ports. After opening all ports, I wasn't able to reproduce the issue.

@jmakov or @ajaichemmanam if you're able to reproduce the issue and you have time, it would potentially be very helpful if you could amend your YAML file as follows:

worker_start_ray_commands:
    - ray stop
    - "echo \"Executing: ray start --address=$RAY_HEAD_IP:6379\" >> ray_worker_output.txt"
    - ray start --address=$RAY_HEAD_IP:6379 >> ray_worker_output.txt 2>&1

And share the ray_worker_output.txt from the failing worker nodes. (Or do modify the commands in any way you see fit, as long as we can see the output of ray start --address=...)

@jmakov
Contributor Author

jmakov commented Oct 20, 2023

@architkulkarni I've added ulimit -c unlimited && ray start --address=$RAY_HEAD_IP:6379 --disable-usage-stats >> /tmp/ray_worker_output.txt 2>&1 and got:

ls /tmp/ray_worker_output.txt
ls: cannot access '/tmp/ray_worker_output.txt': No such file or directory

@MatteoCorvi

MatteoCorvi commented Jun 13, 2024

Worker nodes are almost always stuck as launching/uninitialized, or there is no cluster status at all.
The only way a recent version (2.22) seems to work for me is to start from a conda env with an old version of Ray (2.3) and then pip install -U ray==2.22. 100% success creating a working on-prem cluster so far. The new dashboard and logging are there, and the cluster seems more stable, so I assume the improvements from newer versions came through.
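
If I understand the workaround correctly, it is roughly the following (a sketch; the env name and Python version are illustrative):

conda create -n ray-onprem python=3.10 -y
conda activate ray-onprem
pip install "ray[default]==2.3.0"      # start from the old, known-good version
pip install -U "ray[default]==2.22.0"  # then upgrade Ray only, keeping its older dependencies where possible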

@jacksonjacobs1

Hi @MatteoCorvi,

Glad to hear you were able to get this working, but I'm a little confused about your solution. How is this different from simply installing ray version 2.22?

@MatteoCorvi

Hi @jacksonjacobs1,
I'm not sure, but aside from Ray not much else was changed, if I recall correctly, so just updating might have kept old versions of the dependencies that don't cause issues.

@jacksonjacobs1

Interesting, thanks.

It would be fantastic if a Ray dev from the cluster team could comment on why newer versions of ray seem to break on-prem cluster launching & cleanup.

@anyscalesam What would be your recommendation for resolving this issue?

@Tipmethewink

Tipmethewink commented Jul 23, 2024

I'm running Ray on AWS EC2 instances and hit the same issue. ray up launches the head node, but there are no further logs (nothing about setting up nodes), the head node sits in uninitialized status, and eventually ray up times out and everything shuts down. If I commented out file_mounts, the cluster came up fine. That led me to realise that Ray doesn't use rsync over SSH (my assumption); it uses the default port 873, which I hadn't opened (it's not documented here). As soon as I opened 873, it all sprang to life.
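
If the missing piece is an EC2 security-group rule, opening an extra port looks roughly like this with the AWS CLI (the group ID and CIDR are placeholders; whether port 873 is really what's needed will depend on your setup):

aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp \
    --port 873 \
    --cidr 10.0.0.0/16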

@jacksonjacobs1

Hi @Tipmethewink, are you using existing EC2 instances (equivalent to an on-prem cluster) or using ray cluster launcher to provision new EC2 instances?

@Tipmethewink

Tipmethewink commented Jul 23, 2024 via email

@jyakaranda

I got this same painful issue today. After retrieving the code and logs from the Ray dashboard, I finally got my worker node started.
I'm not sure whether this will solve your problem, but I still want to share my debugging process.

  1. If you can start the head node via ray up cluster.yaml, check monitor.log and monitor.out in the dashboard at http://127.0.0.1:8265/#/logs (forwarded by ray dashboard cluster.yaml); sometimes these logs will tell you whether the worker node is starting or hanging. In my case, the head node was hanging on a simple SSH issue;

(screenshot omitted)

  2. The SSH hanging issue is tricky. In my case, it was because Ray uses the same auth for the head node and all worker nodes, but I hadn't created the same user on the worker node as on the head node. After creating the same user on the worker node and uncommenting ssh_private_key, the worker node could finally be SSHed into and started from the head node.

  3. As a former comment mentions, if the worker node didn't stop its container properly, the head node still can't start the worker node properly either, so you might need to docker stop RAY_CONTAINER_NAME manually before ray up.

Hope these findings help.
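
A rough sketch of those two fixes as commands (the user name, public key, and container name are illustrative and should match what's in your cluster yaml):

# On each worker node: create the same user the head node connects as, and authorize the same key
sudo useradd -m -s /bin/bash myuser
sudo mkdir -p /home/myuser/.ssh
echo "ssh-ed25519 AAAA... launcher-key" | sudo tee -a /home/myuser/.ssh/authorized_keys
sudo chown -R myuser:myuser /home/myuser/.ssh
sudo chmod 700 /home/myuser/.ssh
sudo chmod 600 /home/myuser/.ssh/authorized_keys

# If a previous run left a Ray container behind, stop it before the next ray up
docker stop ray_container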

@olly-writes-code

olly-writes-code commented Oct 16, 2024

Hey folks, I ran into a similar issue when trying to set up an "On Prem" 1 click cluster via Lambda Labs.

I could start the cluster successfully when not using a docker image. But as soon as I switched to the docker image, I ran into the uninitialized issue.

I would get something like

poetry run ray status
======== Autoscaler status: 2024-10-16 21:28:29.027359 ========
Node status
---------------------------------------------------------------
Active:
 1 local.cluster.node
Pending:
 scrubbed_ip: local.cluster.node, uninitialized
Recent failures:
 (no failures)

Resources
---------------------------------------------------------------
Usage:
 0.0/8.0 CPU
 0B/18.61GiB memory
 0B/9.31GiB object_store_memory

Demands:
 (no resource demands)

Here's the config.yaml I was using.

cluster_name: test-cluster

upscaling_speed: 1.0

docker:
  container_name: basic-ray-ml-image
  image: rayproject/ray-ml:latest-gpu
  pull_before_run: true

provider:
 type: local
 head_ip: scrubbed_ip
 worker_ips:
  - scrubbed_ip

auth:
 ssh_user: ubuntu
 ssh_private_key: ~/.ssh/keypair

min_workers: 1
max_workers: 1

setup_commands:
 - pip install ray[default]

head_start_ray_commands:
 - ray stop
 - ray start --head --port=6379 --autoscaling-config=~/ray_bootstrap_config.yaml --dashboard-host=0.0.0.0

worker_start_ray_commands:
 - ray stop
 - ray start --address=$RAY_HEAD_IP:6379

I managed to fix this by:

  1. Manually rebooting the node that wouldn't initialize.
  2. Noticing, when SSHing into that node, that docker ps would return "permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock". I fixed this by running sudo usermod -aG docker $USER, exiting the machine, and then SSHing in again (see the shell recap after this list). This might be a Lambda Labs thing.
  3. Re-running ray up from the head node.
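
For anyone hitting the same Docker permission problem, a rough recap of those steps as commands (assumes an Ubuntu worker with Docker already installed and a launcher config named cluster.yaml):

# 1. Reboot the stuck worker node
sudo reboot

# 2. After SSHing back in, let the SSH user talk to the Docker daemon,
#    then log out and back in so the new group membership takes effect
sudo usermod -aG docker $USER
exit

# 3. From the machine running the launcher, bring the cluster up again
ray up cluster.yaml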

Maybe this helps some people!

I feel like this stems from poor logging / error reporting from the other nodes.

@olly-writes-code

Additionally I don't see any logging or log file.

Even though the instruction from poetry run ray monitor my_cluster.yaml is to find logs at

==> /tmp/ray/session_latest/logs/monitor.out <==

I don't see such a file on any of the nodes

cat /tmp/ray/session_latest/logs/monitor.out
cat: /tmp/ray/session_latest/logs/monitor.out: No such file or directory

@olly-writes-code

olly-writes-code commented Oct 17, 2024

I want to flag that the worker node always gets stuck in the Pending: uninitialized state when trying to run ray up with a custom docker image from AWS ECR.

A few things to note:

  • My head node spins up successfully, pulling my custom docker image.
  • My worker node is logged into the Docker registry, so it shouldn't be a permissions issue

Here's an example of my cluster.yaml

cluster_name: test-cluster

max_workers: 4
upscaling_speed: 1.0

docker:
  image: xyz.dkr.ecr.region.amazonaws.com/ray-worker:latest
  container_name: ray-worker
  pull_before_run: true

provider:
 type: local
 head_ip: scrubbed_ip
 worker_ips:
  - scrubbed_ip

auth:
 ssh_user: ubuntu
 ssh_private_key: ~/.ssh/keypair

min_workers: 1
max_workers: 1

setup_commands:
 - pip install ray

head_start_ray_commands:
 - ray stop
 - ray start --head --port=6379 --autoscaling-config=~/ray_bootstrap_config.yaml --dashboard-host=0.0.0.0

worker_start_ray_commands:
 - ray stop
 - ray start --address=$RAY_HEAD_IP:6379

If I try to force starting the worker node I get this

ray start --address='scrubbed_ip:6379'
Local node IP: scrubbed_ip
[2024-10-16 23:17:07,188 E 14956 14956] gcs_rpc_client.h:179: Failed to connect to GCS at address scrubbed_ip:6379 within 5 seconds.
[2024-10-16 23:17:38,201 W 14956 14956] gcs_client.cc:177: Failed to get cluster ID from GCS server: TimedOut: Timed out while waiting for GCS to become available.

ray monitor my_cluster.yaml looks like

Loaded cached provider configuration
If you experience issues with the cloud provider, try re-running the command with --no-config-cache.
2024-10-17 00:50:30,507	INFO node_provider.py:53 -- ClusterState: Loaded cluster state: ['IP_REMOVED', 'IP_REMOVED']
Fetched IP: IP_REMOVED
==> /tmp/ray/session_latest/logs/monitor.err <==

==> /tmp/ray/session_latest/logs/monitor.log <==
2024-10-17 00:45:00,544	INFO monitor.py:688 -- Starting monitor using ray installation: /usr/local/lib/python3.11/dist-packages/ray/__init__.py
2024-10-17 00:45:00,545	INFO monitor.py:689 -- Ray version: 2.30.0
2024-10-17 00:45:00,545	INFO monitor.py:690 -- Ray commit: 97c37298df9e997b86ca9efed824e27024f3bd60
2024-10-17 00:45:00,545	INFO monitor.py:691 -- Monitor started with command: ['/usr/local/lib/python3.11/dist-packages/ray/autoscaler/_private/monitor.py', '--logs-dir=/tmp/ray/session_2024-10-17_00-44-58_905796_119/logs', '--logging-rotate-bytes=536870912', '--logging-rotate-backup-count=5', '--gcs-address=IP_REMOVED:6379', '--autoscaling-config=/root/ray_bootstrap_config.yaml', '--monitor-ip=IP_REMOVED']
2024-10-17 00:45:00,554	INFO monitor.py:159 -- session_name: session_2024-10-17_00-44-58_905796_119
2024-10-17 00:45:00,556	INFO monitor.py:191 -- Starting autoscaler metrics server on port 44217
2024-10-17 00:45:00,569	INFO monitor.py:216 -- Monitor: Started
2024-10-17 00:45:00,585	INFO node_provider.py:53 -- ClusterState: Loaded cluster state: []
2024-10-17 00:45:00,586	INFO node_provider.py:114 -- ClusterState: Writing cluster state: ['IP_REMOVED', 'IP_REMOVED']
2024-10-17 00:45:00,586	INFO autoscaler.py:280 -- disable_node_updaters:False
2024-10-17 00:45:00,586	INFO autoscaler.py:288 -- disable_launch_config_check:False
2024-10-17 00:45:00,586	INFO autoscaler.py:300 -- foreground_node_launch:False
2024-10-17 00:45:00,586	INFO autoscaler.py:310 -- worker_liveness_check:True
2024-10-17 00:45:00,586	INFO autoscaler.py:318 -- worker_rpc_drain:True
2024-10-17 00:45:00,589	INFO autoscaler.py:368 -- StandardAutoscaler: {'cluster_name': 'test-cluster', 'auth': {'ssh_user': 'ubuntu', 'ssh_private_key': '~/ray_bootstrap_key.pem'}, 'upscaling_speed': 1.0, 'idle_timeout_minutes': 5, 'docker': {'image': 'xyz.dkr.ecr.region.amazonaws.com/ray-worker:latest', 'container_name': 'ray-worker', 'pull_before_run': True}, 'initialization_commands': [], 'setup_commands': ['pip install ray[default]==2.30.0'], 'head_setup_commands': ['pip install ray[default]==2.30.0'], 'worker_setup_commands': ['pip install ray[default]==2.30.0'], 'head_start_ray_commands': ['ray stop', 'ray start --head --port=6379 --autoscaling-config=~/ray_bootstrap_config.yaml --dashboard-host=0.0.0.0'], 'worker_start_ray_commands': ['ray stop', 'ray start --address=$RAY_HEAD_IP:6379'], 'file_mounts': {}, 'cluster_synced_files': [], 'file_mounts_sync_continuously': False, 'rsync_exclude': [], 'rsync_filter': [], 'max_workers': 1, 'provider': {'type': 'local', 'head_ip': 'IP_REMOVED', 'worker_ips': ['IP_REMOVED']}, 'available_node_types': {'local.cluster.node': {'node_config': {}, 'resources': {}, 'min_workers': 1, 'max_workers': 1}}, 'head_node_type': 'local.cluster.node', 'no_restart': False}
2024-10-17 00:45:00,592	INFO monitor.py:383 -- Autoscaler has not yet received load metrics. Waiting.
2024-10-17 00:45:05,606	INFO autoscaler.py:147 -- The autoscaler took 0.001 seconds to fetch the list of non-terminated nodes.
2024-10-17 00:45:05,607	INFO autoscaler.py:427 --
======== Autoscaler status: 2024-10-17 00:45:05.607296 ========
Node status
---------------------------------------------------------------
Active:
 1 local.cluster.node
Pending:
 (no pending nodes)
Recent failures:
 (no failures)

Resources
---------------------------------------------------------------
Usage:
 0.0/8.0 CPU
 0B/18.59GiB memory
 0B/9.30GiB object_store_memory

Demands:
 (no resource demands)
2024-10-17 00:45:05,611	INFO autoscaler.py:1389 -- StandardAutoscaler: Queue 1 new nodes for launch
2024-10-17 00:45:05,612	INFO autoscaler.py:470 -- The autoscaler took 0.006 seconds to complete the update iteration.
2024-10-17 00:45:05,612	INFO node_launcher.py:177 -- NodeLauncher0: Got 1 nodes to launch.
2024-10-17 00:45:05,615	INFO monitor.py:413 -- :event_summary:Resized to 8 CPUs.
2024-10-17 00:45:05,665	INFO node_provider.py:114 -- ClusterState: Writing cluster state: ['IP_REMOVED', 'IP_REMOVED']
2024-10-17 00:45:05,666	INFO node_launcher.py:177 -- NodeLauncher0: Launching 1 nodes, type local.cluster.node.
2024-10-17 00:45:10,639	INFO autoscaler.py:147 -- The autoscaler took 0.001 seconds to fetch the list of non-terminated nodes.
2024-10-17 00:45:10,640	INFO autoscaler.py:427 --
======== Autoscaler status: 2024-10-17 00:45:10.640625 ========
Node status
---------------------------------------------------------------
Active:
 1 local.cluster.node
Pending:
 IP_REMOVED: local.cluster.node, uninitialized
Recent failures:
 (no failures)

Resources
---------------------------------------------------------------
Usage:
 0.0/8.0 CPU
 0B/18.59GiB memory
 0B/9.30GiB object_store_memory

Demands:
 (no resource demands)
2024-10-17 00:45:10,647	INFO autoscaler.py:1336 -- Creating new (spawn_updater) updater thread for node IP_REMOVED.

==> /tmp/ray/session_latest/logs/monitor.out <==

@olly-writes-code

olly-writes-code commented Oct 17, 2024

Anything to help debug things would be very useful!

@olly-writes-code

Fixed! I've managed to ray up the cluster from a private docker image! It looks like the Ray version in my Docker image was different from the one being pip installed on the worker node.

@jacksonjacobs1

Thanks @olly-writes-code . Were you able to successfully tear down and re-initialize your cluster?

I ask because your ray version incompatibility issue was definitely not the case for me - I pulled the pre-built rayproject/ray docker image onto all nodes.

On the first attempt, the cluster spun up without any issues. It was only after running ray down and ray up again that the issue started.

@olly-writes-code

Interesting. It seems that running ray down doesn't stop the docker container on the worker node. I had to SSH into the node and kill the docker container manually. Maybe it's related to this issue: #17689.
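
A manual cleanup along those lines might look like this (worker IPs, SSH user, and container name are placeholders; the container name should match the one in the cluster yaml):

# Stop any leftover Ray containers on the workers before re-running ray up
for ip in 192.0.2.10 192.0.2.11; do
    ssh ubuntu@"$ip" "docker stop ray-worker || true; docker rm ray-worker || true"
done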

@DmitriGekhtman
Contributor

Local node provider is not actively maintained.
I'd recommend looking into alternative strategies for managing Ray on-prem.

@olly-writes-code

Ahh, damn, okay. Is it correct to say that Ray is not recommended for running training on a cluster of GPU machines provided by someone other than AWS, Azure, or GCP?

@DmitriGekhtman
Contributor

DmitriGekhtman commented Oct 18, 2024

Ahh, damn, okay. Is it correct to say that Ray is not recommended for running training on a cluster of GPU machines provided by someone other than AWS, Azure, or GCP?

You can use Ray in any on-prem or cloud environment, but I'd recommend figuring out another way to orchestrate the process of pulling images and running Ray start.
One strategy to run Ray on-prem (or on an unsupported cloud provider) is to first figure out how to run Kubernetes in your environment, then use KubeRay to manage Ray clusters in the Kubernetes cluster.
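
For reference, the KubeRay route usually starts from the project's Helm charts, roughly like this (a sketch; the chart and repo names come from the KubeRay project, and the resulting demo cluster will need tailoring before real use):

helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update
helm install kuberay-operator kuberay/kuberay-operator   # install the KubeRay operator
helm install raycluster kuberay/ray-cluster              # create a sample RayCluster
kubectl get pods                                         # head and worker pods should appear once reconciled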

@olly-writes-code

I see. Thanks for the clarification @DmitriGekhtman :)

For future users, it would be great if the deprecation of local clusters could be made clear in the docs.

@olly-writes-code

This doc is clearly false now https://docs.ray.io/en/latest/cluster/vms/user-guides/launching-clusters/on-premises.html

"This document describes how to set up an on-premise Ray cluster, i.e., to run Ray on bare metal machines, or in a private cloud."

@DmitriGekhtman
Contributor

Ah, I think I might have spoken too soon on another thread about this functionality being officially deprecated.

However, based on my experience with the Ray project, this functionality is not very well maintained.

You will likely get better results by "setting up manually", i.e. running ray start on each of the machines in the cluster. If you have ssh access to each of the machines, you can write a for loop to do this.
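
Such a loop might look roughly like this (a sketch; the head IP, worker IPs, and SSH user are placeholders, and it assumes Ray is already installed on every machine):

HEAD_IP=192.0.2.1

# Start the head node
ssh ubuntu@"$HEAD_IP" "ray stop; ray start --head --port=6379"

# Start each worker and point it at the head
for ip in 192.0.2.10 192.0.2.11 192.0.2.12; do
    ssh ubuntu@"$ip" "ray stop; ray start --address=$HEAD_IP:6379"
done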

@olly-writes-code

However, based on my experience with the Ray project, this functionality is not very well maintained.

Yes, that is my experience, having worked with the code. The docs are badly maintained and the code is very flaky. We have burned 3 days battling this code, only to find out it is poorly supported.

Due to this experience and confusion over the state of deprecation, etc., we will not use any Ray (or Anyscale) anytime soon.

@DmitriGekhtman
Contributor

cc @anyscalesam on the poor UX here. My recommendation to the maintainers would be to officially deprecate local node provider.

The best maintained and most popular method for using Ray in the OSS is to run Ray on Kubernetes using KubeRay.

The Anyscale product is also quite stable and reliable (as it's used by paying customers of Anyscale.)

@olly-writes-code

Right, but the sense is that due to the incentives at play, support for Ray OSS will weaken to encourage people to pay for Anyscale, precisely as described above. This leaves a very bitter taste, which means we won't even try Ray and, therefore, would never consider Anyscale.

@jjyao jjyao added P2 Important issue, but not time-critical and removed P1 Issue that should be fixed within a few weeks labels Oct 30, 2024
@flyingfalling

Yes, it is all very confusing, since local bare metal is the most "obvious" way of running Ray to people from a supercomputer background (like OpenMPI). I've written a suite of Ansible scripts to start up Ray in an OpenMPI-like fashion and to figure out sub-GPU chunks and custom resources as well (https://github.com/flyingfalling/pyraygputils). However, it is VERY hackish (I am polishing it in parallel with some other related projects). It would be very unfortunate if bare-metal Ray were deprecated...

@monsieurzhang

Same issue with "worker nodes uninitialized".
In short, for me, this problem is fixed by removing one line, so that the updater loop becomes:

for t in T:
    t.start()
    t.join()

This changes the initialization of the workers from parallel to serial.
Related issue: #38718

The last line in the log file monitor.log shows:
Creating new (spawn_updater) updater thread for node xxx

In https://github.com/ray-project/ray/blob/master/python/ray/autoscaler/_private/autoscaler.py#L717-L719
it is noted that a similar problem has been detected, but maybe it wasn't the focus for "local" nodes:
"Spawning these threads directly seems to cause problems"

PR-5903 adds back the multi-thread processing.
So we just need to initialize the workers one by one.
