Enrich logging through context vars #452

Merged
merged 86 commits into from
Jan 19, 2024

@ankona (Contributor) commented Jan 10, 2024

Previously, the experiment wrote to sys.stdout and sys.stderr. With this change, experiment driver scripts that use the Experiment API also write logs for experiment tracking: the public "factory API" of the Experiment now additionally logs to individual experiment log files.

The implementation uses Python's contextvars.ContextVar to store experiment-specific state. That state is used to dynamically modify experiment-level logging.
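As a rough illustration of the mechanism (not the PR's actual code), the sketch below stores an experiment name in a `ContextVar`, sets it through a decorator, and carries it into a worker thread. The names `ctx_exp_name`, `contextualize`, `ContextCopyingThread`, and `DemoExperiment` are hypothetical and chosen only for this example.

```python
import contextvars
import functools
import threading

# Hypothetical context variable holding the name of the active experiment.
ctx_exp_name = contextvars.ContextVar("ctx_exp_name", default="")


def contextualize(method):
    """Run `method` with ctx_exp_name set to the owning object's `name`."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        token = ctx_exp_name.set(self.name)
        try:
            return method(self, *args, **kwargs)
        finally:
            # Restore the previous value so state does not leak to the caller.
            ctx_exp_name.reset(token)

    return wrapper


class ContextCopyingThread(threading.Thread):
    """Thread that runs its target inside a copy of the creator's context."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Captured here, in the creating thread, where the variable is set.
        self._captured_context = contextvars.copy_context()

    def run(self):
        self._captured_context.run(super().run)


class DemoExperiment:
    """Stand-in for an experiment object; only `name` matters here."""

    def __init__(self, name):
        self.name = name

    @contextualize
    def start(self):
        seen = []
        worker = ContextCopyingThread(
            target=lambda: seen.append(ctx_exp_name.get())
        )
        worker.start()
        worker.join()
        return seen[0]


name_seen_by_worker = DemoExperiment("exp-1").start()
```

The context is copied in `__init__` rather than in `run` because `run` executes in the new thread, which may not see the values set by the thread that created it.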

For example, this driver:

exp1 = smartsim.Experiment('exp-1')
rs1 = exp1.create_runsettings(...)
model1 = exp1.create_model(..., rs1)

exp2 = smartsim.Experiment('other-exp')
rs2 = exp2.create_runsettings(...)
model2 = exp2.create_model(..., rs2)

exp1.start(model1)
exp1.start(model2)

Results in each experiment dynamically registering logging.FileHandler instances that write logs to separate files:

  • /exp-1/.telemetry/smartsim/smartsim.out
  • /other-exp/.telemetry/smartsim/smartsim.out
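One way to get this per-experiment routing is sketched below. It is an assumption-laden illustration, not the PR's `ContextAwareLogger`: it swaps in a single routing `logging.Handler` that looks up the active experiment path in a `ContextVar` and lazily creates one `logging.FileHandler` per path. The names `ctx_exp_path`, `ExperimentFileRouter`, and `log_for` are hypothetical, and a temporary directory stands in for the experiment directories.

```python
import contextvars
import logging
import pathlib
import tempfile

# Hypothetical context variable holding the active experiment's directory.
ctx_exp_path = contextvars.ContextVar("ctx_exp_path", default="")


class ExperimentFileRouter(logging.Handler):
    """Forward each record to a FileHandler chosen by the current context."""

    def __init__(self, relative_log_path):
        super().__init__()
        self._relative = pathlib.Path(relative_log_path)
        self._handlers = {}

    def emit(self, record):
        exp_path = ctx_exp_path.get()
        if not exp_path:
            return  # No experiment is active, so there is no file to write.
        handler = self._handlers.get(exp_path)
        if handler is None:
            log_file = pathlib.Path(exp_path) / self._relative
            log_file.parent.mkdir(parents=True, exist_ok=True)
            handler = logging.FileHandler(log_file)
            self._handlers[exp_path] = handler
        handler.handle(record)

    def close(self):
        for handler in self._handlers.values():
            handler.close()
        super().close()


def log_for(exp_path, message, logger):
    """Log `message` while `exp_path` is the active experiment path."""
    token = ctx_exp_path.set(str(exp_path))
    try:
        logger.info(message)
    finally:
        ctx_exp_path.reset(token)


root = pathlib.Path(tempfile.mkdtemp())
relative = ".telemetry/smartsim/smartsim.out"

logger = logging.getLogger("sketch.experiment")
logger.setLevel(logging.INFO)
logger.propagate = False
router = ExperimentFileRouter(relative)
logger.addHandler(router)

log_for(root / "exp-1", "starting model1", logger)
log_for(root / "other-exp", "starting model2", logger)
router.close()

out1 = (root / "exp-1" / relative).read_text()
out2 = (root / "other-exp" / relative).read_text()
```

Each message ends up only in the file under the experiment that was active when it was logged.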

Key changes:

  1. Decorated the experiment API with a contextualizer to enrich the log context
  2. Created and used ContextThread to ensure threads include the current context information
  3. Created and used ContextAwareLogger to dynamically add file handlers for experiment logs
  4. Updated manifest serialization to include paths to experiment-specific log files
  5. Added LowPassFilter to enable splitting experiment logs across xxx.out and xxx.err
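The out/err split in item 5 can be sketched with the standard `logging.Filter` hook. The class below reuses the name `LowPassFilter`, but its constructor, the `max_level` parameter, and the INFO/WARNING thresholds are assumptions made for this example, and in-memory streams stand in for the `xxx.out` and `xxx.err` files.

```python
import io
import logging


class LowPassFilter(logging.Filter):
    """Accept only records at or below `max_level` (assumed signature)."""

    def __init__(self, max_level=logging.INFO):
        super().__init__()
        self.max_level = max_level

    def filter(self, record):
        return record.levelno <= self.max_level


# In-memory stand-ins for the xxx.out and xxx.err files.
out_stream, err_stream = io.StringIO(), io.StringIO()

# The "out" handler keeps INFO and below by rejecting anything higher.
out_handler = logging.StreamHandler(out_stream)
out_handler.addFilter(LowPassFilter(logging.INFO))

# The "err" handler keeps WARNING and above through its own level.
err_handler = logging.StreamHandler(err_stream)
err_handler.setLevel(logging.WARNING)

logger = logging.getLogger("sketch.split")
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(out_handler)
logger.addHandler(err_handler)

logger.info("model started")
logger.error("model failed")
```

A handler's level only sets a lower bound, so an upper-bound filter is what keeps errors out of the `.out` stream.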

Additional minor changes:

  1. Moved serialize.TELMON_SUBDIR constant to Config.telemetry_subdir to make it more universally available

@MattToast (Member) left a comment

A couple of additional small typing things to consider while deciding what to do with the from __future__ import annotations line.

Otherwise, looks about ready to go on my end!!

@ankona requested a review from @MattToast January 16, 2024 23:07
@al-rigazzi (Collaborator) left a comment

Everything looks good semantics-wise! Just two requests:

  • can you please add/complete the missing docstring parameter lists including :return: and :rtype: fields, at least in the more user-facing files such as log.py?
  • consider running make style and then make check-style to make sure the future GH checks will pass

@MattToast (Member) left a comment

Two small typos, but nothing worth holding this PR up over.

LGTM!! Thanks for all the hard work on this one!

@al-rigazzi (Collaborator) left a comment

LGTM! Thanks for doing this!

@ankona merged commit f683521 into CrayLabs:develop Jan 19, 2024
26 checks passed
ashao pushed a commit to ashao/SmartSim that referenced this pull request Jan 27, 2024

Co-authored-by: Matt Drozt <[email protected]>

[ committed by @ankona ]
[ reviewed by @al-rigazzi @MattToast  ]
@ankona deleted the 563ctx branch April 4, 2024 19:32
3 participants