Logging framework #134
I think starting with the built-in logging makes sense. It works pretty well, though there are some rough edges when dealing with multiple processes.
Thanks, will look into customizing it!
Let's keep this issue open until we've figured it out.
Sounds good. I just closed it as you answered my question. |
Right now our hack in the dycore is to redirect stdout. Would be better to have something nicer. Could be useful to also have the info statement "registered" in some way so that we know what is dumping the message (atmosdycore, oceandycore, landmodel, etc.).
I remember thinking a while ago that this should all be possible with a fairly thin layer on top of
Hey guys, I just saw this issue and, being the author of the standard logging frontend, I think it's flexible enough to do what you want (yes, I am biased ;-) ). If it's not, I think we should tweak it to make it so! To have truly composable logging, I think it's critical that all packages agree on a common front end for emitting logs, with pluggable backends depending on the use case. The lack of functionality in the stdlib
Fairly rich metadata is already present with each log record, including the originating module; you just need to customize the printing. It's all about composability.
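To make that concrete, here is a minimal sketch of customized printing using only the stdlib `Logging` and `Dates` (the `timestamped_metafmt` name is just an illustration, not an existing API): it reuses `Logging.default_metafmt` and prepends a timestamp, and the originating module arrives for free as the `_module` argument.

```julia
using Logging, Dates

# Illustrative formatter (hypothetical name): reuse the default prefix/suffix
# and prepend a wall-clock timestamp; `_module` carries the originating module.
function timestamped_metafmt(level, _module, group, id, file, line)
    color, prefix, suffix = Logging.default_metafmt(level, _module, group, id, file, line)
    stamp = Dates.format(Dates.now(), "yyyy-mm-dd HH:MM:SS")
    return color, "[$stamp] [$_module] $prefix", suffix
end

logger = ConsoleLogger(stderr, Logging.Info; meta_formatter=timestamped_metafmt)
with_logger(logger) do
    @info "dycore step complete"
end
```

Since the formatter is just a function, each component (atmos dycore, ocean dycore, land model) shows up automatically via its module name without any per-package registration.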
It seems that this open issue handles what was mentioned at our meeting yesterday.
Can Memento be set up to extract these details automatically?
My $0.02: the standard logging front-end is the right way forward. An info dump of the details of the run, as suggested by Simone, should certainly be standardized, but it is orthogonal to the logging mechanism. For the back-end, though, please do note that as soon as you get up to 64 ranks or more, you essentially have to turn logging off, or redirect to file and then grep through a bunch of log files. As you get to higher rank counts, you will overwhelm any parallel file system with the sheer number of log files. And remember that debugging at high levels of parallelism pretty much requires logs. We have dealt with these issues in the past using Dlog. It does require some helper tools, but nothing else we tried worked at scale. I propose to implement this as a Logging back-end which is selected above some threshold of ranks. Below that threshold, a simple MPI-awareness tweak to the standard back-end will suffice.
A custom backend sounds ideal for big distributed jobs, and Dlog seems nice and simple (I suppose with sparse files a large ...). It should be easy and scalable to install runtime filtering rules on the task that produces the logs, but I haven't thought about the scalability of sinking the resulting records. It might also be handy to sink detailed logs using Dlog but additionally send a low-volume selection of high-priority warn/error/progress messages across the network for real-time monitoring.
Notes from discussion with @vchuravy, @leios, @charleskawczynski and @skandalaCLIMA: we can build an "MPI Logger" on top of MPI I/O: specifically, open a shared file pointer and write messages to the file through it. We would need to expose the relevant MPI I/O routines. Some commercial/3rd-party tools we might want to explore:
See also the related discussion about distributed logging at JuliaLogging/TerminalLoggers.jl#22. A question for you: will you
We could. MPI I/O is just writing bytes (or any other MPI datatype), but we would need to define some sort of header for each message.
Doesn't the Serialization stdlib do most of that work for you?

```julia
julia> using Serialization
       io = IOBuffer()
       serialize(io, Any[("message", (x=1.0, i=10, s="blah"))])
       data = take!(io)
       # put data in a file or whatever

julia> deserialize(IOBuffer(data))
1-element Array{Any,1}:
 ("message", (x = 1.0, i = 10, s = "blah"))
```
I meant that we would need a header to denote what MPI rank the message originated from. |
I think
Right 👍 If it's on a per-log-record basis you could alternatively just tag each message coming from the rank with an extra key.
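To make the per-message header idea concrete, here is a sketch (the `write_record`/`read_record` names are hypothetical, and a plain `IOBuffer` stands in for an MPI shared file pointer): each record is framed as source rank, payload length, then the `Serialization` payload, so a reader can demultiplex one shared log file.

```julia
using Serialization

# Hypothetical framing: [rank::Int32][payload length::Int32][serialized record].
# A plain IO stands in here for an MPI shared file pointer.
function write_record(io::IO, rank::Integer, record)
    buf = IOBuffer()
    serialize(buf, record)
    payload = take!(buf)
    write(io, Int32(rank), Int32(length(payload)))
    write(io, payload)
end

function read_record(io::IO)
    rank = read(io, Int32)
    len = read(io, Int32)
    payload = read(io, Int(len))
    return Int(rank), deserialize(IOBuffer(payload))
end

# Round-trip two records from different "ranks" through one stream:
log = IOBuffer()
write_record(log, 0, ("message", (x = 1.0,)))
write_record(log, 3, ("warning", (i = 10,)))
seekstart(log)
```

With a real MPI shared file pointer, each rank's writes interleave at record granularity, and the rank field in the header tells the post-processing tool who wrote what.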
Ah, cool. Is there a list of packages which implement different logging backends somewhere? I'd be keen to see what others do.
We've got some brand new tooling for REPL-based use (see TerminalLoggers.jl) which I'm excited about, but I'd say we're still figuring out what's possible and desirable for "serious projects". I've got a rough list of tooling at https://github.com/c42f/MicroLogging.jl/#julia-logging-ecosystem but this may not be very complete. The CLIMA case may be the first large-scale HPC application using this stuff. Just in the last week I've been chatting to @tanmaykm about backends for some Julia Computing internal stuff (JuliaTeam/JuliaRun), which I think is another interesting case study and shares some of the same "serious application" needs, though it runs on completely different infrastructure from what you'll have.

Here's a snippet which would allow you to tag all log messages from a given block of code:

```julia
using LoggingExtras

# Something like this tool should go in a package somewhere...
function tag_logs(f; kws...)
    function add_keys(log)
        merge(log, (kwargs = merge(values(log.kwargs), values(kws)),))
    end
    with_logger(f, TransformerLogger(add_keys, current_logger()))
end
```

Then for example (with the standard REPL logger backend):

```julia
julia> function run_simulation()
           @info "Starting"
           @warn "A thing"
       end
run_simulation (generic function with 1 method)

julia> run_simulation()
[ Info: Starting
┌ Warning: A thing
└ @ Main REPL[8]:3

julia> mpi_rank = 10; # MPI.Comm_rank(...) I guess

julia> tag_logs(mpi_rank=mpi_rank) do
           run_simulation()
       end
┌ Info: Starting
└   mpi_rank = 10
┌ Warning: A thing
│   mpi_rank = 10
└ @ Main REPL[8]:3
```

Note that
The silent mode for running the model would be useful for me when generating the tutorial markdown pages. For example, here I just want to showcase the plots.
Posting some of my thoughts from Slack (really a bump of my previous comment: #134 (comment)). It would certainly be nice to have more fine-grained control than
Also making it MPI-friendly with something like @inforoot vs @info, where @inforoot only dumps data on the root rank (what we currently do) as opposed to all ranks (not possible right now). Awesome bonus would be module-specific debug, e.g., dump debug info from the driver but not the mesh. To make all this work we need to be in the habit of curating output statements added to PRs.
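Module-specific filtering like "driver but not mesh" is already expressible on top of the stdlib front end. Here is a rough sketch (the `ModuleFilterLogger` name is made up, and LoggingExtras.jl's `EarlyFilteredLogger` is a more polished version of the same idea): wrap any logger and drop records whose originating module is not in an allow-list.

```julia
using Logging

# Hypothetical wrapper: only pass records whose originating module is allowed.
# LoggingExtras.jl's EarlyFilteredLogger is a production-grade version of this.
struct ModuleFilterLogger{L<:AbstractLogger} <: AbstractLogger
    inner::L
    allowed::Set{Module}
end

Logging.min_enabled_level(l::ModuleFilterLogger) = Logging.min_enabled_level(l.inner)
Logging.catch_exceptions(l::ModuleFilterLogger) = Logging.catch_exceptions(l.inner)
Logging.shouldlog(l::ModuleFilterLogger, level, _module, group, id) =
    _module in l.allowed && Logging.shouldlog(l.inner, level, _module, group, id)
Logging.handle_message(l::ModuleFilterLogger, args...; kwargs...) =
    Logging.handle_message(l.inner, args...; kwargs...)

# e.g. only let records from Main through:
io = IOBuffer()
with_logger(ModuleFilterLogger(SimpleLogger(io, Logging.Debug), Set([Main]))) do
    @info "driver message"
end
```

Because the filter runs in `shouldlog`, records from disallowed modules are dropped before any formatting work happens. An @inforoot-style root-only variant could use the same pattern, gating on the MPI rank instead of the module.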
There are a couple of different ways we might want to print messages:
There are also fancy things we can do where some output goes to one sink and some to another. One question is how this should be exposed at the driver level.
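One way to sketch the multi-sink idea with only the stdlib (the `Tee` type here is illustrative; LoggingExtras.jl's `TeeLogger` combined with `MinLevelLogger` is the packaged equivalent): fan each record out to several loggers, each with its own minimum level, so terse output reaches the terminal while verbose output lands in a file.

```julia
using Logging

# Illustrative fan-out logger; LoggingExtras.TeeLogger is the packaged version.
struct Tee <: AbstractLogger
    loggers::Vector{AbstractLogger}
end

Logging.min_enabled_level(t::Tee) =
    minimum(Logging.min_enabled_level(l) for l in t.loggers)
Logging.shouldlog(t::Tee, level, _module, group, id) =
    any(Logging.shouldlog(l, level, _module, group, id) for l in t.loggers)
function Logging.handle_message(t::Tee, level, message, _module, group, id,
                                file, line; kwargs...)
    # Forward only to sinks whose own level and filter accept the record.
    for l in t.loggers
        if Logging.min_enabled_level(l) <= level &&
           Logging.shouldlog(l, level, _module, group, id)
            Logging.handle_message(l, level, message, _module, group, id,
                                   file, line; kwargs...)
        end
    end
end

# "Terminal" gets Info and above; "file" gets everything (IOBuffers stand in):
console_io, file_io = IOBuffer(), IOBuffer()
with_logger(Tee([SimpleLogger(console_io, Logging.Info),
                 SimpleLogger(file_io, Logging.Debug)])) do
    @debug "solver residual detail"
    @info "timestep finished"
end
```

At the driver level this composes naturally: the driver builds one `Tee` from whatever sinks the run configuration asks for and installs it once with `with_logger`, and no component code needs to change.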
Maybe more of a question for @charleskawczynski @simonbyrne, but any suggestions for a logging library? It's probably good software practice to move away from `print`/`@printf` statements to a logging framework that provides detailed logs for debugging (and for users, with a lower severity). I quite like Memento.jl, but maybe it's better to rely on Julia's built-in logging? The default format isn't very good for detailed logs, but I'm sure we can set a custom format with timestamps, line numbers, etc.