[Core] ray installed from conda does not include the dashboard #30834

Closed
YarShev opened this issue Dec 1, 2022 · 14 comments
Labels: bug (Something that is supposed to be working; but isn't), core (Issues that should be addressed in Ray Core), dependencies (Pull requests that update a dependency file), P1 (Issue that should be fixed within a few weeks)

Comments

@YarShev

YarShev commented Dec 1, 2022

What happened + What you expected to happen

When initializing Ray with ray.init(), I get the following errors.

2022-12-01 14:30:35,649 ERROR services.py:1403 -- Failed to start the dashboard: Failed to start the dashboard, return code 0
 The last 10 lines of /tmp/ray/session_2022-12-01_14-30-33_539844_204848/logs/dashboard.log:
    raise ex
  File "miniconda3/envs/test-env/lib/python3.8/site-packages/ray/dashboard/http_server_head.py", line 78, in __init__
    build_dir = setup_static_dir()
  File "miniconda3/envs/test-env/lib/python3.8/site-packages/ray/dashboard/http_server_head.py", line 45, in setup_static_dir
    raise dashboard_utils.FrontendNotFoundError(
ray.dashboard.utils.FrontendNotFoundError: [Errno 2] Dashboard build directory not found. If installing from source, please follow the additional steps required to build the dashboard(cd python/ray/dashboard/client && npm install && npm ci && npm run build): 'miniconda3/envs/test-env/lib/python3.8/site-packages/ray/dashboard/client/build'

2022-12-01 14:30:35,465 ERROR base_events.py:1707 -- Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7fdaf99987f0>
2022-12-01 14:30:35,465 ERROR base_events.py:1707 -- Unclosed client session
2022-12-01 14:30:35,649 ERROR services.py:1404 -- Failed to start the dashboard, return code 0
 The last 10 lines of /tmp/ray/session_2022-12-01_14-30-33_539844_204848/logs/dashboard.log:
    raise ex
  File "miniconda3/envs/test-env/lib/python3.8/site-packages/ray/dashboard/http_server_head.py", line 78, in __init__
    build_dir = setup_static_dir()
  File "miniconda3/envs/test-env/lib/python3.8/site-packages/ray/dashboard/http_server_head.py", line 45, in setup_static_dir
    raise dashboard_utils.FrontendNotFoundError(
ray.dashboard.utils.FrontendNotFoundError: [Errno 2] Dashboard build directory not found. If installing from source, please follow the additional steps required to build the dashboard(cd python/ray/dashboard/client && npm install && npm ci && npm run build): 'miniconda3/envs/test-env/lib/python3.8/site-packages/ray/dashboard/client/build'

2022-12-01 14:30:35,465 ERROR base_events.py:1707 -- Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7fdaf99987f0>
2022-12-01 14:30:35,465 ERROR base_events.py:1707 -- Unclosed client session
Traceback (most recent call last):
  File "miniconda3/envs/test-env/lib/python3.8/site-packages/ray/_private/services.py", line 1389, in start_api_server
    raise Exception(err_msg + last_log_str)
Exception: Failed to start the dashboard, return code 0
 The last 10 lines of /tmp/ray/session_2022-12-01_14-30-33_539844_204848/logs/dashboard.log:
    raise ex
  File "miniconda3/envs/test-env/lib/python3.8/site-packages/ray/dashboard/http_server_head.py", line 78, in __init__
    build_dir = setup_static_dir()
  File "miniconda3/envs/test-env/lib/python3.8/site-packages/ray/dashboard/http_server_head.py", line 45, in setup_static_dir
    raise dashboard_utils.FrontendNotFoundError(
ray.dashboard.utils.FrontendNotFoundError: [Errno 2] Dashboard build directory not found. If installing from source, please follow the additional steps required to build the dashboard(cd python/ray/dashboard/client && npm install && npm ci && npm run build): 'miniconda3/envs/test-env/lib/python3.8/site-packages/ray/dashboard/client/build'

2022-12-01 14:30:35,465 ERROR base_events.py:1707 -- Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7fdaf99987f0>
2022-12-01 14:30:35,465 ERROR base_events.py:1707 -- Unclosed client session
2022-12-01 14:30:35,770 INFO worker.py:1528 -- Started a local Ray instance.
RayContext(dashboard_url=None, python_version='3.8.15', ray_version='2.1.0', ray_commit='{{RAY_COMMIT_SHA}}', address_info={'node_ip_address': '10.241.129.69', 'raylet_ip_address': '10.241.129.69', 'redis_address': None, 'object_store_address': '/tmp/ray/session_2022-12-01_14-30-33_539844_204848/sockets/plasma_store', 'raylet_socket_name': '/tmp/ray/session_2022-12-01_14-30-33_539844_204848/sockets/raylet', 'webui_url': None, 'session_dir': '/tmp/ray/session_2022-12-01_14-30-33_539844_204848', 'metrics_export_port': 50699, 'gcs_address': '10.241.129.69:53293', 'address': '10.241.129.69:53293', 'dashboard_agent_listen_port': 52365, 'node_id': 'debb9178951b5b35bfef390be04087813790aa34678c5619e910becc'})

(raylet) [2022-12-01 14:30:43,908 E 206329 206375] (raylet) agent_manager.cc:134: The raylet exited immediately because the Ray agent failed. The raylet fate shares with the agent. This can happen because the Ray agent was unexpectedly killed or failed. See `dashboard_agent.log` for the root cause.
2022-12-01 14:31:13,565 WARNING worker.py:1839 -- The node with node id: debb9178951b5b35bfef390be04087813790aa34678c5619e910becc and address: 10.241.129.69 and node name: 10.241.129.69 has been marked dead because the detector has missed too many heartbeats from it. This can happen when a   (1) raylet crashes unexpectedly (OOM, preempted node, etc.)
        (2) raylet has lagging heartbeats due to slow network or busy workload.
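
The FrontendNotFoundError in the output above names a missing `ray/dashboard/client/build` directory inside the installed package. A minimal sketch for checking this without starting Ray, assuming the frontend is expected at exactly the relative path shown in the error message; the helper names `find_dashboard_build` and `installed_ray_root` are hypothetical and not part of Ray:

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_dashboard_build(package_root: Path) -> Optional[Path]:
    """Return the dashboard build directory under package_root, or None if absent.

    The relative path is taken from the FrontendNotFoundError message.
    """
    build_dir = package_root / "dashboard" / "client" / "build"
    return build_dir if build_dir.is_dir() else None


def installed_ray_root() -> Optional[Path]:
    """Locate the installed ray package without importing it; None if not found."""
    spec = importlib.util.find_spec("ray")
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


if __name__ == "__main__":
    root = installed_ray_root()
    if root is None:
        print("ray is not installed in this environment")
    elif find_dashboard_build(root) is None:
        print(f"dashboard frontend missing under {root}")
    else:
        print(f"dashboard frontend found under {root}")
```

Running this in the affected environment should report whether the frontend files were shipped with the package, which separates a packaging problem from a runtime problem.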

Versions / Dependencies

Ray 2.1.0 (installed from conda-forge)
Python 3.8.15

The issue happens both on

NAME="Ubuntu"
VERSION="20.04.4 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.4 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

and on

PRETTY_NAME="Ubuntu 22.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.1 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

Reproduction script

import ray
ray.init()

Issue Severity

High: It blocks me from completing my task.

YarShev added the bug and triage labels Dec 1, 2022
@fishbone
Contributor

fishbone commented Dec 2, 2022

Hi @YarShev, we have work underway to fix this issue. @SongGuyang is leading this effort.

fishbone added the core label and removed the triage label Dec 2, 2022
fishbone added this to the Core Backlog milestone Dec 2, 2022
fishbone added the P1 label Dec 2, 2022
@SongGuyang
Contributor

@YarShev Can you check the log of dashboard agent? The log path is /tmp/ray/session_latest/logs/dashboard_agent.log
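
To pull the end of that log for pasting into the thread, a small sketch can help. It assumes only the log path quoted above; `tail_lines` is a hypothetical helper, not a Ray API:

```python
from collections import deque
from pathlib import Path
from typing import List


def tail_lines(path: Path, count: int = 10) -> List[str]:
    """Return the last `count` lines of a text file, without trailing newlines."""
    with path.open("r", errors="replace") as handle:
        # A bounded deque keeps only the final `count` lines while streaming the file.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


if __name__ == "__main__":
    # Path quoted in the comment above.
    log_path = Path("/tmp/ray/session_latest/logs/dashboard_agent.log")
    if log_path.exists():
        print("\n".join(tail_lines(log_path, 20)))
    else:
        print(f"{log_path} does not exist")
```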

@YarShev
Author

YarShev commented Dec 2, 2022

2022-12-01 21:50:07,055 INFO agent.py:102 -- Parent pid is 13983
2022-12-01 21:50:07,056 INFO agent.py:128 -- Dashboard agent grpc address: 0.0.0.0:45325
2022-12-01 21:50:07,058 INFO utils.py:112 -- Get all modules by type: DashboardAgentModule
2022-12-01 21:50:07,668 WARNING tune_head.py:23 -- tune module is not available: ray.tune in ray > 0.7.5 requires 'tabulate'. Please re-run 'pip install ray[tune]' or 'pip install ray[rllib]'.
2022-12-01 21:50:07,671 INFO utils.py:145 -- Available modules: [<class 'ray.dashboard.modules.event.event_agent.EventAgent'>, <class 'ray.dashboard.modules.healthz.healthz_agent.HealthzAgent'>, <class 'ray.dashboard.modules.job.job_agent.JobAgent'>, <class 'ray.dashboard.modules.log.log_agent.LogAgent'>, <class 'ray.dashboard.modules.log.log_agent.LogAgentV1Grpc'>, <class 'ray.dashboard.modules.reporter.reporter_agent.ReporterAgent'>, <class 'ray.dashboard.modules.runtime_env.runtime_env_agent.RuntimeEnvAgent'>, <class 'ray.dashboard.modules.serve.serve_agent.ServeAgent'>]
2022-12-01 21:50:07,671 INFO agent.py:157 -- Loading DashboardAgentModule: <class 'ray.dashboard.modules.event.event_agent.EventAgent'>
2022-12-01 21:50:07,671 INFO event_agent.py:28 -- Event agent cache buffer size: 10240
2022-12-01 21:50:07,671 INFO agent.py:157 -- Loading DashboardAgentModule: <class 'ray.dashboard.modules.healthz.healthz_agent.HealthzAgent'>
2022-12-01 21:50:07,671 INFO agent.py:157 -- Loading DashboardAgentModule: <class 'ray.dashboard.modules.job.job_agent.JobAgent'>
2022-12-01 21:50:07,672 INFO agent.py:157 -- Loading DashboardAgentModule: <class 'ray.dashboard.modules.log.log_agent.LogAgent'>
2022-12-01 21:50:07,672 INFO agent.py:157 -- Loading DashboardAgentModule: <class 'ray.dashboard.modules.log.log_agent.LogAgentV1Grpc'>
2022-12-01 21:50:07,672 INFO agent.py:157 -- Loading DashboardAgentModule: <class 'ray.dashboard.modules.reporter.reporter_agent.ReporterAgent'>
2022-12-01 21:50:07,673 INFO agent.py:157 -- Loading DashboardAgentModule: <class 'ray.dashboard.modules.runtime_env.runtime_env_agent.RuntimeEnvAgent'>
2022-12-01 21:50:07,674 INFO agent.py:157 -- Loading DashboardAgentModule: <class 'ray.dashboard.modules.serve.serve_agent.ServeAgent'>
2022-12-01 21:50:07,674 INFO agent.py:162 -- Loaded 8 modules.
2022-12-01 21:50:07,678 INFO http_server_agent.py:74 -- Dashboard agent http address: 0.0.0.0:52365
2022-12-01 21:50:07,678 INFO http_server_agent.py:81 -- <ResourceRoute [GET] <PlainResource  /api/local_raylet_healthz> -> <function HealthzAgent.health_check at 0x7fe6940ef940>
2022-12-01 21:50:07,678 INFO http_server_agent.py:81 -- <ResourceRoute [OPTIONS] <PlainResource  /api/local_raylet_healthz> -> <bound method _PreflightHandler._preflight_handler of <aiohttp_cors.cors_config._CorsConfigImpl object at 0x7fe68dc9d880>>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [POST] <PlainResource  /api/job_agent/jobs/> -> <function JobAgent.submit_job at 0x7fe6940a85e0>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [OPTIONS] <PlainResource  /api/job_agent/jobs/> -> <bound method _PreflightHandler._preflight_handler of <aiohttp_cors.cors_config._CorsConfigImpl object at 0x7fe68dc9d880>>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [POST] <DynamicResource  /api/job_agent/jobs/{job_or_submission_id}/stop> -> <function JobAgent.stop_job at 0x7fe6940a8790>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [OPTIONS] <DynamicResource  /api/job_agent/jobs/{job_or_submission_id}/stop> -> <bound method _PreflightHandler._preflight_handler of <aiohttp_cors.cors_config._CorsConfigImpl object at 0x7fe68dc9d880>>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [GET] <DynamicResource  /api/job_agent/jobs/{job_or_submission_id}/logs> -> <function JobAgent.get_job_logs at 0x7fe6940a8940>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [OPTIONS] <DynamicResource  /api/job_agent/jobs/{job_or_submission_id}/logs> -> <bound method _PreflightHandler._preflight_handler of <aiohttp_cors.cors_config._CorsConfigImpl object at 0x7fe68dc9d880>>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [GET] <DynamicResource  /api/job_agent/jobs/{job_or_submission_id}/logs/tail> -> <function JobAgent.tail_job_logs at 0x7fe6940a8af0>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [OPTIONS] <DynamicResource  /api/job_agent/jobs/{job_or_submission_id}/logs/tail> -> <bound method _PreflightHandler._preflight_handler of <aiohttp_cors.cors_config._CorsConfigImpl object at 0x7fe68dc9d880>>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [GET] <PlainResource  /api/ray/version> -> <function ServeAgent.get_version at 0x7fe697f248b0>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [OPTIONS] <PlainResource  /api/ray/version> -> <bound method _PreflightHandler._preflight_handler of <aiohttp_cors.cors_config._CorsConfigImpl object at 0x7fe68dc9d880>>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [GET] <PlainResource  /api/serve/deployments/> -> <function ServeAgent.get_all_deployments at 0x7fe697f24940>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [OPTIONS] <PlainResource  /api/serve/deployments/> -> <bound method _PreflightHandler._preflight_handler of <aiohttp_cors.cors_config._CorsConfigImpl object at 0x7fe68dc9d880>>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [GET] <PlainResource  /api/serve/deployments/status> -> <function ServeAgent.get_all_deployment_statuses at 0x7fe697f24af0>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [OPTIONS] <PlainResource  /api/serve/deployments/status> -> <bound method _PreflightHandler._preflight_handler of <aiohttp_cors.cors_config._CorsConfigImpl object at 0x7fe68dc9d880>>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [DELETE] <PlainResource  /api/serve/deployments/> -> <function ServeAgent.delete_serve_application at 0x7fe697f24ca0>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [PUT] <PlainResource  /api/serve/deployments/> -> <function ServeAgent.put_all_deployments at 0x7fe697f24e50>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [OPTIONS] <PlainResource  /api/serve/deployments/> -> <bound method _PreflightHandler._preflight_handler of <aiohttp_cors.cors_config._CorsConfigImpl object at 0x7fe68dc9d880>>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [GET] <StaticResource  /logs -> PosixPath('/tmp/ray/session_2022-12-01_21-50-03_516652_13936/logs')> -> <bound method StaticResource._handle of <StaticResource  /logs -> PosixPath('/tmp/ray/session_2022-12-01_21-50-03_516652_13936/logs')>>
2022-12-01 21:50:07,679 INFO http_server_agent.py:81 -- <ResourceRoute [OPTIONS] <StaticResource  /logs -> PosixPath('/tmp/ray/session_2022-12-01_21-50-03_516652_13936/logs')> -> <bound method _PreflightHandler._preflight_handler of <aiohttp_cors.cors_config._CorsConfigImpl object at 0x7fe68dc9d880>>
2022-12-01 21:50:07,679 INFO http_server_agent.py:82 -- Registered 21 routes.
2022-12-01 21:50:07,707 ERROR event_agent.py:53 -- Connect to dashboard failed.
Traceback (most recent call last):
  File "miniconda3/envs/test-env/lib/python3.8/site-packages/ray/dashboard/modules/event/event_agent.py", line 44, in _connect_to_dashboard
    dashboard_rpc_address = dashboard_rpc_address.decode()
AttributeError: 'NoneType' object has no attribute 'decode'
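
The final traceback shows `.decode()` being called on `None`. A minimal sketch of that failure mode follows. It assumes the address lookup returned `None` because the dashboard head never started and so never published its address; this is an inference that the log does not confirm. `decode_rpc_address` is a hypothetical helper, not Ray's code:

```python
from typing import Optional


def decode_rpc_address(raw: Optional[bytes]) -> str:
    """Decode a stored address, failing with a descriptive error when it is missing."""
    if raw is None:
        # Calling raw.decode() here would raise:
        # AttributeError: 'NoneType' object has no attribute 'decode'
        raise RuntimeError("dashboard RPC address not found; is the dashboard running?")
    return raw.decode()
```

Guarding the `None` case this way would turn the opaque AttributeError into a message that points back at the dashboard startup failure.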

@SongGuyang
Contributor

No more other logs after the traceback?

@SongGuyang
Contributor

SongGuyang commented Dec 2, 2022

ray.dashboard.utils.FrontendNotFoundError: [Errno 2] Dashboard build directory not found.

This error is a separate issue. I think the dashboard frontend should be included in the installed package if you use a released conda package. @rkooo567 Do you know why this exception occurs on the user side?

@YarShev
Author

YarShev commented Dec 2, 2022

No more other logs after the traceback?

Unfortunately, no.

@YarShev
Author

YarShev commented Dec 7, 2022

Just to let you know that if I install ray-default=2.1.0 from PyPI, everything is okay.

@YarShev
Author

YarShev commented Dec 7, 2022

When will the next release happen on conda-forge?

@SongGuyang
Contributor

Just to let you know that if I install ray-default=2.1.0 from PyPI, everything is okay.

@iycheng Who can help check this? It seems to be a release issue.

@rkooo567
Contributor

rkooo567 commented Dec 8, 2022

It seems the frontend is not properly packaged for the conda-forge release. I am looking for the owner of the conda-forge release now.

@mattip
Contributor

mattip commented Dec 9, 2022

The dashboard is packaged separately: you need to install ray-dashboard.

$ conda create -c conda-forge -n ray_test python=3.8
$ conda activate ray_test
$ conda install -c conda-forge ray
# fails, there is no meta-package named ray
$ conda install -c conda-forge ray-default
# installs ray-core, ray-default but not the dashboard
$ python -c "import ray; ray.init(include_dashboard=True)"
# fails with messages about "missing dashboard"
$ conda install -c conda-forge ray-dashboard
$ python -c "import ray; ray.init(include_dashboard=True)"
# succeeds

The master documentation is wrong, there is no conda package called ray, and installing ray-default does not install the dashboard. See conda-forge/ray-packages-feedstock#82
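
The package split described above can be checked in an existing environment by inspecting the output of `conda list --json`. A minimal sketch, assuming the three package names mentioned in this comment (`ray-core`, `ray-default`, `ray-dashboard`); the helper `missing_packages` is hypothetical and the sample data is made up for illustration:

```python
import json
from typing import Iterable, List

# Package names as listed in the comment above.
EXPECTED = ("ray-core", "ray-default", "ray-dashboard")


def missing_packages(conda_list_json: str, expected: Iterable[str] = EXPECTED) -> List[str]:
    """Return the expected package names absent from `conda list --json` output."""
    installed = {entry["name"] for entry in json.loads(conda_list_json)}
    return [name for name in expected if name not in installed]


if __name__ == "__main__":
    # Made-up sample mimicking an environment where only ray-default was installed.
    sample = json.dumps(
        [
            {"name": "ray-core", "version": "2.1.0"},
            {"name": "ray-default", "version": "2.1.0"},
        ]
    )
    print(missing_packages(sample))  # prints ['ray-dashboard']
```

In practice the string would come from running `conda list --json` in the environment under test rather than from the inline sample.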

mattip changed the title from "[Core] The raylet exited immediately because the Ray agent failed" to "[Core] ray installed from conda does not include the dashboard" Dec 9, 2022
@mattip
Contributor

mattip commented Dec 9, 2022

I changed the issue title to reflect the root cause of the failure.

scv119 added the dependencies label Feb 16, 2023
@SongGuyang
Contributor

@mattip So we only need to explain this in the Ray documentation and close this issue?

@mattip
Contributor

mattip commented Feb 22, 2023

I will close this. This was fixed in #30237. The latest docs (2.2.0) correctly refer to conda install -c conda-forge "ray-default".

mattip closed this as completed Feb 22, 2023

6 participants