Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is there a way to add information to the error/warning #52

Closed
1beb opened this issue Mar 17, 2020 · 12 comments
Closed

Is there a way to add information to the error/warning #52

1beb opened this issue Mar 17, 2020 · 12 comments

Comments

@1beb
Copy link
Contributor

1beb commented Mar 17, 2020

Great package!

I'm wondering if there's an easy way to add additional information to the error / warning message. Here's some psuedo code for example:

pid <- "Some process!
tryCatchLog({
  warning("Hello this is a warning!")
}, warning = function(w) paste(w, "- from -", pid)) 

And the pseudo although desired result would be:

WARN [2020-03-16 23:12:11] [WARN] Hello this is a warning - from - Some process!
@aryoda
Copy link
Owner

aryoda commented Mar 17, 2020

Great idea!

The "default" logging of tryCatchLog is not able to do this and just meant as a basic implementation ("don't reinvent the wheel") but it should be simple to add the PID (which is really important during parallel processing).

Do you have more use cases for additional text to be added since I would "inject" the process ID
via the logging framework normally (not tryCatchLog). Logging frameworks have support for custom logging headers.

@1beb
Copy link
Contributor Author

1beb commented Mar 17, 2020

My PR has functioning example code of adding this to tryCatchLog. It was pretty straightforward to do at the tryCatchLog level.

I was trying to figure out how to inject via futile.logger but nothing obvious came to mind. Do you have an example of how you might approach this given your extra experience with fl?

@aryoda
Copy link
Owner

aryoda commented Mar 17, 2020

Thanks for doing the work and sending a PR! I have done a first short review (just for myself so far) to understand your needs.

Just to be sure I understand your requirements right:

  1. Is it OK to use Sys.getpid to log the process ID? It is unique if two R sessions run simultaneously but may possibly be the same for two R sessions running at different times!

  2. Do you always want to log the PID or do you also have other examples of writing other "IDs" or "labels" into the log file? Background: For always logging the PID it could suffice to do that internally in tryCatchLog or configure the logging framework used (instead of passing a PID via an argument to tryCatchLog()).

BTW: I can remember I already had an issue regarding process IDs when writing dump files: #39
So your PR gives me the chance to reconsider tryCatchLog for better overall handling parallel processes. Thanks again for your commitment :-)

PS: Tomorrow I will add examples on how to write the PID into the logging output by using just logging frameworks...

@1beb
Copy link
Contributor Author

1beb commented Mar 17, 2020

For my use case the PID is useful, but the ability to add some type of generic text is what I really need. I can see a tonne of reasons for this and how it might be better than just a PID.

foreach(i = 1:100) %dopar% {
  result = tryCatchLog({ 
    some_fun() 
  }
 )
}

# ERROR [datetime] Some error occured
#
# callstack

In the case above, knowing the pid is useful, but knowing the value of i allows you to reproduce the error. So my idea, generally, is that arbitrary text should be able to pass through warnings/errors into the output. The code adjustments that I made in the PR #53 allow for this:

foreach(i = 1:100) %dopar% {
  result = tryCatchLog({ 
    some_fun() 
  }, pid = i
 )
}

# ERROR [datetime] Some error occured during run 99
#
# callstack

Now the error message gives you a callstack AND a debug start point which is perfect for paralllel operations where the things that fail are always edgecases that you only find during processing.

@aryoda
Copy link
Owner

aryoda commented Mar 17, 2020

THX I see + you convinced me :-)

Give me a few days to settle

  • if there is a another possibly simpler or more generic way to do this
  • about the API (eg. the argument name "pid" should be named differently then, for example additional.log.msg or execution.context, execution.context.msg, tag... to indicate the usage more intuitively)

@aryoda
Copy link
Owner

aryoda commented Mar 17, 2020

Regarding futile.logger and logging the PID there is "simple" solution which requires an update of futile.logger from github since the CRAN version is too old:

# The CRAN version of futile.logger is quite old (v1.4.3 from 2016-07-10 as of today/March 17, 2020):
# The github version has quite more features. To install it use:
# devtools::install_github("zatonovo/futile.logger")  # installs version 1.4.4
library(futile.logger)
library(tryCatchLog)
flog.layout(layout.simple.parallel)     # Use a default format with a process id
flog.info(paste0("PID=", Sys.getpid())) # The logged PID should be the R PID
tryCatchLog(warning("Something is strange..."), include.full.call.stack = FALSE, include.compact.call.stack = FALSE)

Output:

> flog.info(paste0("PID=", Sys.getpid()))
INFO [2020-03-17 21:33:11 30423] PID=30423
> tryCatchLog(warning("Something is strange..."), include.full.call.stack = FALSE, include.compact.call.stack = FALSE)
WARN [2020-03-17 21:33:11 30423] [WARN] Something is strange...

Warning message:
In withCallingHandlers(expr, error = cond.handler, warning = cond.handler,  :
  Something is strange...

@1beb
Copy link
Contributor Author

1beb commented Mar 17, 2020

I think it's important to remember that the log is updated concurrently by tasks in parallel execution. So your info logging may happen interstitially with other task messages. This would work in sequential exectution but it would be pretty unreliable for a fast task queue. It really needs to be appened to the message for it to work.

Let's take 3 processes where 1 is successful and 2 and 3 error but take different amounts of concurrent time. You log might look like this:

info 2 pid # (parallel execution is not exactly ordered)
info 1 pid
warn from 2 
info 3 pid
error from 3 

@aryoda
Copy link
Owner

aryoda commented Mar 17, 2020

BTW: For a list of R logging frameworks and their popularity see: #42

@1beb
Copy link
Contributor Author

1beb commented Mar 17, 2020

I used your analysis to just "go with fultile.logger"! Thanks for that.

@aryoda
Copy link
Owner

aryoda commented Mar 17, 2020

futile.logger is a good choice! If have just updated the statistics of the top n logging framework packages but futile.logger is still the winner (but also debugme a good newcomer)...

Regarding "interstitial" logging: THX for mentioning this use case, it is a key argument to have an "ID" in the tryCatchLog() function + log output!

@aryoda aryoda added the work in progess is currently in the implementation phase label Mar 17, 2020
@aryoda aryoda added implemented and removed work in progess is currently in the implementation phase labels Apr 13, 2020
@aryoda
Copy link
Owner

aryoda commented Apr 13, 2020

Implemented via PR #53 into master for final user tests. RfC.

@aryoda aryoda closed this as completed Apr 20, 2020
@aryoda
Copy link
Owner

aryoda commented Nov 25, 2020

@1beb Since today this feature is contained in the new CRAN version (took very long, sorry!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants