Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unified logging infrastructure #1426

Merged
merged 15 commits into from
Dec 17, 2024
Merged

unified logging infrastructure #1426

merged 15 commits into from
Dec 17, 2024

Conversation

cyrush
Copy link
Member

@cyrush cyrush commented Nov 21, 2024

No description provided.

@cyrush
Copy link
Member Author

cyrush commented Dec 4, 2024

New logging infrastructure handles both log messages as well as performance annotations. When caliper is on, we will get caliper output files. When logging is on, we will also get timing info in the log files. The logs are yaml based and support nesting. (An evolution of the VTK-h timing logs)

Here are the current Ascent options related to logging and timing info:

  • messages - controls if info messages echoed (std out) (values quiet | verbose)
  • exceptions - controls if exceptions are thrown or captured (values: forward | catch)
  • timings - enables timers for flow filters (values: true |false)

With the new logging infrastructure, we are adding log files to the mix.

Default case will still be no log file, the messages and exceptions options still apply.

The logger allows you to control the level of echo messages independent of what is written to to a log file. The message levels are: debug, info, warn, error.

We can provide more control with the new system, here is a strawman for options:

messages:
    echo_threshold: all | debug |  info | warn | error | none

Also, for the mpi case - we can limit which tasks actually echo, another strawman:

messages:
    echo_threshold: all | debug | info | warn | error |  none
    ranks: all | root

I suggest that:

messages: verbose equates to:

messages:
    echo_threshold: all
    ranks: all

and

messages: quiet equates to:

messages:
    echo_threshold: none

Of the existing logging macros, INFO and ERROR are used the most. There are a handful of WARN cases (and a few should be ERROR. We can add DEBUG cases for extra verbose output.

To support the log files, we should add new logging options:

Here are some examples:

logging:
    file_pattern: "ascent_log_out.yaml"
    log_threshold: info
logging:
    file_pattern: "ascent_log_out_{rank}.yaml"
    log_threshold: all
logging:
    file_pattern: "ascent_log_out_{rank:05d}.yaml"

With this PR, I don't plan to change the "timings" behavior.

VTK-h and Devil ray have their own timing modes, and those will be ported to the new logging infrastructure over time.

We also want a simple way to enable logging with sensible defaults, along the lines of:

logging: true

which would be equivalent to:

logging:
    file_pattern: "somethign_we_pick.yaml | somethign_we_pick_{rank:05d}.yaml"
    log_threshold: info

@nicolemarsaglia
Copy link
Contributor

I like the logging levels you suggested.

Thinking ahead to caliper usage, do we want to apply logging levels to caliper? Ex: caliper + logging with info will get you caliper annotations, whereas caliper + logging with debug will also give you caliper loops and iterations. I think I favor the alternative of: caliper is on and you get what we give you, which is everything. But i can be convinced otherwise.

@cyrush
Copy link
Member Author

cyrush commented Dec 9, 2024

thanks @nicolemarsaglia!

We can explore what type of controls caliper provides to see if we can and gate messages based on level.
Caliper has a good set of runtime controls, so I expect it's possible.

I do think 99% of the time most users will want all or nothing.

We as developers may want to dial it in.

@cyrush
Copy link
Member Author

cyrush commented Dec 9, 2024

I am at a good place with initial support for the new logging.

The new macros are not wired up in Ascent yet, but we are creating logs and the macros are ready to go.

I think we should merge this before we do more refactoring.

Please checkout what I have and let me know if you have comments.

Thanks!

@cyrush
Copy link
Member Author

cyrush commented Dec 12, 2024

For large scale runs, we should support options to control which mpi tasks produce logs:

logging:
    ...
    ranks: all | root | [list of numbers]
logging:
    ...
    ranks: 
      stride: 10 # rank % 10 == 0, create a log

We can also add these to the messages options

@cyrush
Copy link
Member Author

cyrush commented Dec 12, 2024

I also want to add actions that control logging to give users more control than just params in ascent::open

start logging at any time:

action: open_log # open ascent's logging
file_pattern: (support all params described above)

flush logs so they can be seen before ascent is closed out:

action: flush_log # flush current log streams to disk

stop logging and flush current logs:

action: close_log # close current log streams

We will add these actions in a follow on PR, here is an issue to track this:

#1431

ASCENT_LOG_OPEN( file_pattern ) // serial
#else
ASCENT_LOG_OPEN_RANK( file_pattern, par_rank ) // mpi par
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switch?


        #if defined(ASCENT_MPI_ENABLED)
            ASCENT_LOG_OPEN_RANK( file_pattern, par_rank ) // mpi par   
        #else
            ASCENT_LOG_OPEN( file_pattern ) // serial 
        #endif

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for finding this, I am working on tests right now

@cyrush
Copy link
Member Author

cyrush commented Dec 16, 2024

@nicolemarsaglia @emily-howell I think this PR is ready, let me know if you have any more comments or suggestions

conduit::Node n;
n.load(lfname);
n.print();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any check for this test?

Copy link
Contributor

@nicolemarsaglia nicolemarsaglia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@cyrush cyrush merged commit b054d6d into develop Dec 17, 2024
22 checks passed
@cyrush cyrush deleted the task/2024_11_unified_logging branch December 17, 2024 20:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants