[Dashboard] tail_job_logs() exits normally when WebSocket connection is lost unexpectedly #57002

@machichima

What happened + What you expected to happen

While tail_job_logs() is streaming logs for a submitted job, if the connection to the Ray head node is lost, tail_job_logs() returns normally instead of raising an error. I would expect it to raise an error in this case.
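One way to frame the expected behavior is in terms of WebSocket close codes: RFC 6455 defines 1000 as a normal closure and 1006 as a connection that dropped without a close frame. The sketch below is hypothetical and is not Ray's implementation. `LogStreamClosedError` and `raise_on_abnormal_close` are made-up names, and it assumes the client can read a close code from its WebSocket library and that the server closes with 1000 when the job finishes.

```python
from typing import Optional

NORMAL_CLOSURE = 1000    # RFC 6455: normal closure
ABNORMAL_CLOSURE = 1006  # RFC 6455: connection lost without a close frame


class LogStreamClosedError(RuntimeError):
    """Hypothetical error for a log stream that ended unexpectedly."""


def raise_on_abnormal_close(close_code: Optional[int]) -> None:
    """Raise unless the WebSocket reported a normal closure.

    Assumption: a missing close code (None) is treated as abnormal too.
    """
    if close_code == NORMAL_CLOSURE:
        return
    raise LogStreamClosedError(
        f"log stream closed unexpectedly (close code: {close_code})"
    )
```

With a check like this at the end of the receive loop, a caller's `except` block would see the lost connection instead of a silent return.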

Versions / Dependencies

ray: 3.0.0.dev0 (editable install)

Reproduction script

  1. Start the Ray head node with ray start --head.
  2. Run the following script to submit a job and tail its logs.
  3. While the script is running, run ray stop. The tail_job_logs() call exits without an error.

Script:

import asyncio
from ray.job_submission import JobSubmissionClient


async def test_tail_job_logs_with_restart():
    """Test the tail_job_logs function behavior when dashboard restarts"""

    try:
        # Create job submission client
        client = JobSubmissionClient("http://127.0.0.1:8265")

        # Submit a long-running job
        job_id = client.submit_job(
            entrypoint='python -c \'import time; print("Job started"); time.sleep(60); print("Job completed")\''
        )
        print(f"Job submitted with ID: {job_id}")

        # Start tailing logs
        print("Starting to tail job logs...")
        log_lines = []

        async for line in client.tail_job_logs(job_id):
            print(f"LOG: {line.strip()}")
            log_lines.append(line)

            job_info = client.get_job_info(job_id)
            print(f"job status: {job_info.status}")

        print("tail_job_logs() returned normally (no exception)")

    except Exception as e:
        print(f"Exception during tail_job_logs: {e}")


if __name__ == "__main__":
    asyncio.run(test_tail_job_logs_with_restart())
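Until the behavior changes, a caller-side check could look like the sketch below. It wraps any async iterator of log lines and, once the stream ends, asks for the job status. If the job is not in a terminal state, it raises. `tail_or_raise` and `LogStreamInterrupted` are hypothetical names, not part of Ray. The terminal status names are assumed to match Ray's JobStatus values.

```python
from typing import AsyncIterator, Callable

# Assumed terminal job states, named after Ray's JobStatus values.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "STOPPED"}


class LogStreamInterrupted(RuntimeError):
    """Hypothetical error: the log stream ended before the job did."""


async def tail_or_raise(
    lines: AsyncIterator[str], get_status: Callable[[], object]
) -> AsyncIterator[str]:
    """Yield log lines, then raise if the job is not in a terminal state."""
    async for line in lines:
        yield line
    status = get_status()
    # Accept either a plain string or an enum member with a .value attribute.
    name = str(getattr(status, "value", status))
    if name not in TERMINAL_STATUSES:
        raise LogStreamInterrupted(
            f"log stream ended while job status was {name}"
        )
```

In the script above this would be used as `async for line in tail_or_raise(client.tail_job_logs(job_id), lambda: client.get_job_status(job_id))`. If the head node is down, the status request itself will probably fail with a connection error, which also surfaces the problem to the caller.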

Result screenshot:

Image

Issue Severity

None

Metadata

Labels

P1: Issue that should be fixed within a few weeks
bug: Something that is supposed to be working; but isn't
community-backlog
core: Issues that should be addressed in Ray Core
observability: Issues related to the Ray Dashboard, Logging, Metrics, Tracing, and/or Profiling
stability
usability

