@ericl ericl commented Sep 13, 2021

Why are these changes needed?

Currently, core worker ERROR and FATAL logs are sent only to python-core-worker-* files in /tmp/ray. This makes error conditions and check failures hard to detect, since the user only gets a message telling them to "check logs".

This change configures spdlog to also send messages at err level and above unconditionally to stderr, so the normal log routing machinery picks them up.

Related issue number

Closes #12893

  uint32_t delay = RayConfig::instance().task_retry_delay_ms();
- RAY_LOG(ERROR) << "Will resubmit task after a " << delay
-                << "ms delay: " << spec.DebugString();
+ RAY_LOG(INFO) << "Will resubmit task after a " << delay

Avoid excess log spam since we print a message elsewhere on retry already.

@ericl ericl merged commit 3e0ae38 into ray-project:master Sep 14, 2021
edoakes added a commit to edoakes/ray that referenced this pull request Sep 14, 2021
edoakes added a commit that referenced this pull request Sep 14, 2021
ericl added a commit to ericl/ray that referenced this pull request Sep 14, 2021
Successfully merging this pull request may close these issues.

[Logging] Core worker ERRORs/check failures not streamed to drivers.
