Closed
Labels
- P1: Issue that should be fixed within a few weeks
- bug: Something that is supposed to be working; but isn't
- community-backlog
- core: Issues that should be addressed in Ray Core
- observability: Issues related to the Ray Dashboard, Logging, Metrics, Tracing, and/or Profiling
- stability
- usability
Description
What happened + What you expected to happen
If the connection to the Ray head breaks while tail_job_logs() is running on a submitted job, tail_job_logs() returns without raising any error. I would expect an error to be raised in this case.
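As a possible client-side workaround until the behavior changes, the caller can check the job status once the log stream ends and treat a non-terminal status as an interrupted stream. The sketch below is a minimal illustration, not part of Ray: the helper tail_until_terminal, the exception LogStreamInterrupted, and the TERMINAL_STATUSES set are hypothetical names introduced here, and the set assumes that SUCCEEDED, FAILED, and STOPPED are the terminal job states.

```python
import asyncio
from typing import AsyncIterator, Callable, List

# Assumption: these are the terminal job states; adjust if the real set differs.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "STOPPED"}


class LogStreamInterrupted(RuntimeError):
    """Hypothetical error: the log stream ended before the job finished."""


async def tail_until_terminal(
    log_stream: AsyncIterator[str],
    get_status: Callable[[], object],
) -> List[str]:
    """Consume log_stream, then verify that the job really finished.

    get_status is called once after the stream ends. If it reports a
    non-terminal status, the stream is assumed to have been cut short
    and LogStreamInterrupted is raised instead of returning silently.
    """
    lines: List[str] = []
    async for line in log_stream:
        lines.append(line)

    status = get_status()
    # Accept plain strings as well as enum members that carry a .value.
    status_name = str(getattr(status, "value", status))
    if status_name not in TERMINAL_STATUSES:
        raise LogStreamInterrupted(
            f"log stream ended while job status was {status_name}"
        )
    return lines
```

With the client from the reproduction script, this could be wired up as tail_until_terminal(client.tail_job_logs(job_id), lambda: client.get_job_status(job_id)). If the head is already unreachable, the status call itself is likely to raise a connection error, which surfaces the failure as well.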
Versions / Dependencies
ray: 3.0.0.dev0 (editable install)
Reproduction script
- Start the Ray head with ray start --head
- Run the following script to submit a job and tail its logs
- Run ray stop; tail_job_logs() then exits without an error
Script:
import asyncio

from ray.job_submission import JobSubmissionClient


async def test_tail_job_logs_with_restart():
    """Test the tail_job_logs function behavior when dashboard restarts"""
    try:
        # Create job submission client
        client = JobSubmissionClient("http://127.0.0.1:8265")

        # Submit a long-running job
        job_id = client.submit_job(
            entrypoint='python -c \'import time; print("Job started"); time.sleep(60); print("Job completed")\''
        )
        print(f"Job submitted with ID: {job_id}")

        # Start tailing logs
        print("Starting to tail job logs...")
        log_lines = []
        async for line in client.tail_job_logs(job_id):
            print(f"LOG: {line.strip()}")
            log_lines.append(line)
            job_info = client.get_job_info(job_id)
            print(f"job status: {job_info.status}")

        print("tail_job_logs() returned normally (no exception)")
    except Exception as e:
        print(f"Exception during tail_job_logs: {e}")


if __name__ == "__main__":
    asyncio.run(test_tail_job_logs_with_restart())

Result screenshot:
Issue Severity
None