[core] add entrypoint log for jobs #58300
    # Open in append mode to avoid overwriting runtime_env setup logs for the
    # supervisor actor, which are also written to the same file.
    with open(logs_path, "a") as logs_file:
        self._logger.info(f"Running entrypoint for job {self._job_id}: {self._entrypoint}\n")
I haven't touched this code in a long while, but I believe self._logger goes to a log file for this supervisor actor, not the log file for the job itself. You'll want to write this to the logs_file that was opened a line above instead.
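To illustrate the distinction raised in this comment, here is a minimal sketch, not Ray's actual code. It assumes the supervisor's logger has a handler pointing at a separate file; the file names `job-driver-<id>.log` and `supervisor.log`, the logger name, and the function `run_entrypoint_logging_demo` are all hypothetical. A message sent through the logger lands in the logger's destination, while only a direct `write` on the opened file handle lands in the job's log file.

```python
import logging
import os
import tempfile
from typing import Tuple


def run_entrypoint_logging_demo(job_id: str, entrypoint: str) -> Tuple[str, str]:
    """Write one message via a logger and one via the job log file handle.

    Returns (job_log_contents, supervisor_log_contents).
    """
    tmp_dir = tempfile.mkdtemp()
    # Hypothetical file names, chosen only for this illustration.
    job_log_path = os.path.join(tmp_dir, f"job-driver-{job_id}.log")
    supervisor_log_path = os.path.join(tmp_dir, "supervisor.log")

    # Stand-in for the supervisor actor's logger (assumption: it writes to
    # its own file rather than to the job's log file).
    logger = logging.getLogger(f"supervisor-demo-{job_id}")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.FileHandler(supervisor_log_path)
    logger.addHandler(handler)

    with open(job_log_path, "a") as logs_file:
        # Goes to the logger's handler, not to logs_file.
        logger.info("supervisor-side message")
        # Goes to the job's own log file.
        logs_file.write(f"Running entrypoint for job {job_id}: {entrypoint}\n")

    handler.close()
    logger.removeHandler(handler)

    with open(job_log_path) as f:
        job_log = f.read()
    with open(supervisor_log_path) as f:
        supervisor_log = f.read()
    return job_log, supervisor_log
```

In this sketch the entrypoint line appears only in the job log, which is the behavior the comment asks for.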
Signed-off-by: Chris Fellowes <[email protected]>
Signed-off-by: chrisfellowes <[email protected]> Signed-off-by: Chris Fellowes <[email protected]>
3724561 to b7afa9a
edoakes left a comment
LGTM assuming tests pass. I just added the go label, which will run the full CI tests.
    # Open in append mode to avoid overwriting runtime_env setup logs for the
    # supervisor actor, which are also written to the same file.
    with open(logs_path, "a") as logs_file:
        logs_file.write(f"Running entrypoint for job {self._job_id}: {self._entrypoint}\n")
nit:

    - logs_file.write(f"Running entrypoint for job {self._job_id}: {self._entrypoint}\n")
    + logs_file.write(f"Running entrypoint for job '{self._job_id}': {self._entrypoint}\n")
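The code comment in the diff explains why the file is opened in append mode. As a small standalone sketch of that point (the helper names `write_line` and `append_vs_overwrite_demo` and the log text are made up for illustration), appending preserves an earlier setup line, while opening with `"w"` would discard it:

```python
import os
import tempfile
from typing import Tuple


def write_line(path: str, mode: str, line: str) -> None:
    """Write a single line to path using the given open() mode."""
    with open(path, mode) as f:
        f.write(line)


def append_vs_overwrite_demo() -> Tuple[str, str]:
    """Return file contents after an append, then after a truncating write."""
    path = os.path.join(tempfile.mkdtemp(), "job.log")
    entrypoint_line = "Running entrypoint for job 'job_1': python script.py\n"

    # An earlier writer (for example, runtime_env setup) has already logged.
    write_line(path, "w", "Runtime env is setting up.\n")

    # Append mode keeps the existing line and adds the new one after it.
    write_line(path, "a", entrypoint_line)
    with open(path) as f:
        appended = f.read()

    # Write mode truncates the file first, so the setup line is lost.
    write_line(path, "w", entrypoint_line)
    with open(path) as f:
        overwritten = f.read()

    return appended, overwritten
```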
Linter is failing: https://buildkite.com/ray-project/microcheck/builds/30180#019a35ad-f7f5-4910-941d-7a244b4f1c05/186-305

Instructions for local linting here: https://docs.ray.io/en/latest/ray-contribute/getting-involved.html#lint-and-formatting
…sfellowes-anyscale/master
Signed-off-by: Edward Oakes <[email protected]>
Head branch was pushed to by a user without write access
This helps prevent an edge case when using file-based log exporters like Vector that use fingerprinting [ref](https://vector.dev/docs/reference/configuration/sources/file/#fingerprint) to identify unique files.

Example edge case that this fixes: two jobs are submitted to a cluster and begin executing at the same time. Both contain an invalid entrypoint that references a nonexistent file.

Before fix:
- both jobs have the identical "Runtime env is setting up" log with identical timestamps
- both jobs have identical entrypoint failure logs

As a result, the log files for these jobs are identical, so Vector will only export one.

After fix:
- both jobs have the identical "Runtime env is setting up" log with identical timestamps
- each job has a **unique** entrypoint log containing its job_id
- both jobs have identical entrypoint failure logs

Vector can differentiate between these two files, so both will be exported.

---------

Signed-off-by: Chris Fellowes <[email protected]>
Signed-off-by: chrisfellowes <[email protected]>
Signed-off-by: Edward Oakes <[email protected]>
Co-authored-by: Edward Oakes <[email protected]>
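Description
This helps prevent an edge case when using file-based log exporters like Vector that use fingerprinting ([ref](https://vector.dev/docs/reference/configuration/sources/file/#fingerprint)) to identify unique files.
Example edge case that this fixes:
Two jobs are submitted to a cluster and begin executing at the same time. Both contain an invalid entrypoint that references a nonexistent file.
Before fix:
- both jobs have the identical "Runtime env is setting up" log with identical timestamps
- both jobs have identical entrypoint failure logs

As a result, the log files for these jobs are identical, so Vector will only export one.
After fix:
- both jobs have the identical "Runtime env is setting up" log with identical timestamps
- each job has a **unique** entrypoint log containing its job_id
- both jobs have identical entrypoint failure logs

Vector can differentiate between these two files, so both will be exported.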
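The edge case can be sketched roughly as follows. This is a simplified stand-in, not Vector's actual fingerprinting algorithm: it assumes a fingerprint is a CRC32 over the first bytes of a file, and the timestamps, log text, job IDs, and the helpers `fingerprint` and `build_log` are invented for illustration.

```python
import zlib

# Illustrative log lines; the real messages and timestamps will differ.
SETUP_LINE = "2025-01-01 00:00:00 Runtime env is setting up.\n"
FAILURE_LINE = "python: can't open file 'missing.py': No such file or directory\n"


def fingerprint(contents: str, num_bytes: int = 256) -> int:
    """Checksum the first num_bytes of a file's contents (simplified model)."""
    return zlib.crc32(contents.encode("utf-8")[:num_bytes])


def build_log(job_id: str, entrypoint: str, include_entrypoint_line: bool) -> str:
    """Assemble the contents of one job's log file."""
    lines = [SETUP_LINE]
    if include_entrypoint_line:
        # The line added by this PR: it contains the job ID, so it is unique per job.
        lines.append(f"Running entrypoint for job '{job_id}': {entrypoint}\n")
    lines.append(FAILURE_LINE)
    return "".join(lines)


entrypoint = "python missing.py"

# Before the fix: both files have identical contents, so they collide.
before_a = fingerprint(build_log("job_a", entrypoint, include_entrypoint_line=False))
before_b = fingerprint(build_log("job_b", entrypoint, include_entrypoint_line=False))

# After the fix: the job ID in the entrypoint line makes the contents differ.
after_a = fingerprint(build_log("job_a", entrypoint, include_entrypoint_line=True))
after_b = fingerprint(build_log("job_b", entrypoint, include_entrypoint_line=True))

print("before fix, fingerprints equal:", before_a == before_b)
print("after fix, fingerprints equal:", after_a == after_b)
```

Under this model, any exporter that deduplicates files by a content checksum would treat the two pre-fix files as one and the two post-fix files as distinct.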