CU-8694wh3d5 track usage #458
…th is correctly monitored
Task linked: CU-8694wh3d5 Track model usage
Consider setting this via an env var, e.g. MEDCAT_LOGS.
If it is set to False or 0, there should be no logs / usage monitoring at all.
Also, we need the logs to persist after the process exits.
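A rough sketch of the on/off switch described above, for illustration only. The function name `usage_logging_enabled` and the default of "enabled when the variable is unset" are assumptions, not what the PR implements; only the variable name MEDCAT_LOGS and the "False or 0 disables" rule come from this comment.

```python
import os

# Variable name taken from the review comment; everything else is hypothetical.
LOGS_ENV = "MEDCAT_LOGS"


def usage_logging_enabled(environ=None, default=True):
    """Return False when MEDCAT_LOGS is set to 'False' or '0', True otherwise.

    `default` (an assumption) is returned when the variable is not set.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(LOGS_ENV)
    if raw is None:
        return default
    # Compare case-insensitively so 'false', 'False' and 'FALSE' all disable.
    return raw.strip().lower() not in {"0", "false"}
```

Passing a plain dict instead of `os.environ` keeps the check easy to test, and reading the value once at model init / load would mean later changes to the environment are ignored.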
@@ -36,6 +36,9 @@ def setUpClass(cls) -> None:
cls.cdb.config.linking.train = True
cls.cdb.config.linking.disamb_length_limit = 5
cls.cdb.config.general.full_unlink = True
cls._temp_logs_folder = tempfile.TemporaryDirectory()
The dir and all its contents will be destroyed when the interpreter process stops(?) We want the logs to persist beyond the lifetime of the running process.
On Linux this could be configurable via an env var, set within MedCATService or CogStack-ModelServe or something.
By default, for *nix we want something like:
~/.local/share/medcat/logs/
and for Windows:
C:\Users\%USERNAME%\.cache\medcat\logs\
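To make the suggestion concrete, here is a minimal sketch of how such a default could be resolved. The function `default_usage_log_dir` and the override variable name MEDCAT_USAGE_LOGS_LOCATION are hypothetical and used only for illustration; the two default paths are the ones proposed above.

```python
import os
import sys
from pathlib import Path

# Hypothetical name for the override variable, used here for illustration.
LOCATION_ENV = "MEDCAT_USAGE_LOGS_LOCATION"


def default_usage_log_dir(environ=None, platform=None, home=None):
    """Pick the usage log folder: env override first, then a per-OS default."""
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform
    home = Path.home() if home is None else Path(home)
    override = environ.get(LOCATION_ENV)
    if override:
        return Path(override)
    if platform.startswith("win"):
        # e.g. C:\Users\<user>\.cache\medcat\logs
        return home / ".cache" / "medcat" / "logs"
    # e.g. ~/.local/share/medcat/logs
    return home / ".local" / "share" / "medcat" / "logs"
```

The caller would still need to create the folder, for example with `path.mkdir(parents=True, exist_ok=True)`, before writing to it.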
Yes, of course we want the logs to persist!
But not during test time. We don't want tests to create too many extra files (mostly full of junk) locally (they already do, actually).
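For reference, the test-time pattern looks roughly like the sketch below. The class and test names are made up for illustration and are not the ones in this PR; the point is only that `tempfile.TemporaryDirectory` removes the folder and its contents once `cleanup()` is called, so test runs leave nothing behind.

```python
import os
import tempfile
import unittest


class UsageLogFolderTests(unittest.TestCase):  # hypothetical test class

    @classmethod
    def setUpClass(cls) -> None:
        # Throwaway folder that only lives for the duration of the test class.
        cls._temp_logs_folder = tempfile.TemporaryDirectory()

    @classmethod
    def tearDownClass(cls) -> None:
        # Explicitly remove the folder and everything written into it.
        cls._temp_logs_folder.cleanup()

    def test_can_write_log_file(self):
        path = os.path.join(self._temp_logs_folder.name, "usage.log")
        with open(path, "w", encoding="utf-8") as f:
            f.write("entry\n")
        self.assertTrue(os.path.exists(path))
```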
got it - yes, didn't read module name
Do we want dynamic changes to the environment variables to be reflected in medcat? Or would it be sufficient to check once upon model init / load? What I'm thinking is something along the following lines:
As for persistence: the current implementation (outside the tests) is persistent.
Sounds good - happy for
Add relevant documentation to log_folder that it's not used on 'auto'
medcat/utils/usage_monitoring.py
from medcat.config import UsageMonitor as UsageMonitorConfig


LOGS_ENV = "MEDCAT_LOGS"
Worth renaming to MEDCAT_USAGE_LOGS and MEDCAT_USAGE_LOGS_LOCATION?
We don't want this to be confused with dev logs(?)
Good point!
lgtm
Add a usage monitor.
What it does is monitor:
The usage monitor buffers inferences and writes them to file once every 10 (configurable) have been logged, to avoid constant IO. It also writes out the rest of the buffer when the usage monitor is dereferenced.
This PR also adds the relevant config part (config.general.usage_monitor). The usage monitoring can be disabled if needed by changing the config.