CU-8694wh3d5 track usage #458

mart-r · 2024-06-21T16:12:06Z

Add a usage monitor.

It monitors:

The input text length
The input preprocessed text length
The number of output entities

The usage monitor buffers its records and writes them to file once every 10 inferences (the number is configurable) to avoid constant IO. It also writes the rest of the buffer when the usage monitor is dereferenced.
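The buffering described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class name `BufferedUsageMonitor`, its method names, and the comma-separated record format are all assumptions made for the example.

```python
from typing import List


class BufferedUsageMonitor:
    """Hypothetical sketch: collect usage records and write them in batches."""

    def __init__(self, log_file: str, batch_size: int = 10) -> None:
        self.log_file = log_file
        self.batch_size = batch_size  # number of inferences per write
        self._buffer: List[str] = []

    def log_inference(self, input_len: int, preprocessed_len: int,
                      num_entities: int) -> None:
        # Record the three monitored values; the CSV layout is an assumption.
        self._buffer.append(f"{input_len},{preprocessed_len},{num_entities}")
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        # Append whatever is buffered to the log file, then empty the buffer.
        if not self._buffer:
            return
        with open(self.log_file, "a", encoding="utf-8") as f:
            f.write("\n".join(self._buffer) + "\n")
        self._buffer.clear()

    def __del__(self) -> None:
        # Best-effort write of the remaining records on dereference.
        # Builtins may already be gone at interpreter shutdown, so any
        # failure here is swallowed.
        try:
            self.flush()
        except Exception:
            pass
```

Python does not guarantee that `__del__` runs at interpreter exit, so a caller that must not lose the tail of the buffer would call `flush()` explicitly in a sketch like this.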

This PR also adds the relevant config part (config.general.usage_monitor). The usage monitoring can be disabled if needed by changing the config.

tomolopolis · 2024-06-21T16:12:11Z

Task linked: CU-8694wh3d5 Track model usage

tomolopolis

setting this via an env var i.e. MEDCAT_LOGS
If this is False or 0, no logs / usage at all
Also we need logs to persist outside of process exit

tomolopolis · 2024-07-10T14:13:21Z

tests/test_cat.py

@@ -36,6 +36,9 @@ def setUpClass(cls) -> None:
        cls.cdb.config.linking.train = True
        cls.cdb.config.linking.disamb_length_limit = 5
        cls.cdb.config.general.full_unlink = True
+        cls._temp_logs_folder = tempfile.TemporaryDirectory()


The dir and all contents will be destroyed on interpreter process stop(?) We want the logs to persist outside of the process running.

For Linux this could be configurable via an env var, set within MedCATService, CogStack-ModelServe, or similar.

By default for *nix we want something like:
~/.local/share/medcat/logs/

Windows:
C:\Users\%USERNAME%\.cache\medcat\logs\
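A minimal sketch of how such per-OS defaults could be resolved. The function name `default_usage_log_folder` is hypothetical and not taken from MedCAT; it only mirrors the two paths suggested above.

```python
import platform
from pathlib import Path


def default_usage_log_folder(system: str = "") -> Path:
    """Hypothetical helper: pick a default log folder based on the OS.

    `system` is what `platform.system()` returns (e.g. "Linux", "Darwin",
    "Windows"); it is a parameter here only so the choice can be tested.
    """
    system = system or platform.system()
    if system == "Windows":
        # Corresponds to C:\Users\%USERNAME%\.cache\medcat\logs\
        return Path.home() / ".cache" / "medcat" / "logs"
    # Corresponds to ~/.local/share/medcat/logs/ on *nix
    return Path.home() / ".local" / "share" / "medcat" / "logs"
```

The caller would still need to create the folder, for example with `folder.mkdir(parents=True, exist_ok=True)`, before writing to it.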

Yes, of course we want the logs to persist!
But not during test time. We don't want tests to create too many extra files (mostly full of junk) locally (they already do, actually).

got it - yes, didn't read module name

mart-r · 2024-07-10T15:07:51Z

setting this via an env var i.e. MEDCAT_LOGS If this is False or 0, no logs / usage at all Also we need logs to persist outside of process exit

Do we want dynamic changes to the environmental variables to be reflected in medcat? Or would it be sufficient to check once upon model init / load?
I think it would make sense to be able to change the behaviour dynamically. However, since os.environ is a snapshot of the environmental variables when the process was started, we'd need to somehow retrieve these changes (don't know if this is trivial).

What I'm thinking is something along the following lines:

Allow enabled to be True, False, or auto
When set to False, no logging (obviously)
When set to True, the config-specified files are used
When set to auto, the behaviour is automatic
- Based on the environmental variable
- The environmental variable is checked periodically (i.e. no more often than once every 10 seconds)
- The file location is picked automatically based on OS
  - I.e. ~/.local/share/medcat/logs/ for *nix and C:\Users\%USERNAME%\.cache\medcat\logs\ for Windows
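The proposal above can be sketched as follows. This is an illustration under stated assumptions, not the merged implementation: the class name `AutoEnabledCheck`, the set of values treated as "off", and the choice to treat an unset variable as disabled are all hypothetical. Note that `os.environ` only reflects changes made from inside the running process; changes made in a parent shell after start-up are not visible to it.

```python
import os
import time
from typing import Callable, Optional, Union

# Assumption: these values (case-insensitive) switch usage logging off.
_FALSY = {"", "0", "false", "no"}


class AutoEnabledCheck:
    """Hypothetical sketch of the True / False / 'auto' behaviour."""

    def __init__(self, enabled: Union[bool, str],
                 env_name: str = "MEDCAT_LOGS",
                 min_interval: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.enabled = enabled            # True, False or "auto"
        self.env_name = env_name
        self.min_interval = min_interval  # seconds between env checks
        self._clock = clock
        self._last_check: Optional[float] = None
        self._cached = False

    def is_enabled(self) -> bool:
        if self.enabled != "auto":
            # Explicit True / False in the config wins outright.
            return bool(self.enabled)
        now = self._clock()
        if (self._last_check is None
                or now - self._last_check >= self.min_interval):
            # Re-read the variable at most once per min_interval.
            value = os.environ.get(self.env_name, "")
            self._cached = value.strip().lower() not in _FALSY
            self._last_check = now
        return self._cached
```

Passing the clock in as a parameter is only a convenience for testing the 10-second throttle without sleeping.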

As for persistent - the current implementation (outside the tests) is persistent.

tomolopolis · 2024-07-16T15:48:35Z

Sounds good - happy for auto, just make sure the docstrings explain the behaviour.

tomolopolis · 2024-07-18T10:54:35Z

medcat/utils/usage_monitoring.py

+from medcat.config import UsageMonitor as UsageMonitorConfig
+
+
+LOGS_ENV = "MEDCAT_LOGS"


Worth renaming to MEDCAT_USAGE_LOGS and MEDCAT_USAGE_LOGS_LOCATION?
Don't want this to be confused with dev logs(?)

Good point!

tomolopolis

lgtm

mart-r added 5 commits June 21, 2024 15:56

CU-8694wh3d5: Add config for usage monitor

ecbfe19

CU-8694wh3d5: Add buffered Usage Monitor along with relevant tests

c3df659

CU-8694wh3d5: Add Usage Monitor to CAT.__call__

9fcf352

CU-8694wh3d5: Add tests for usage monitoring to CAT tests

f651040

CU-8694wh3d5: Add tests for usage monitor to make sure the input length is correctly monitored

9bad965

mart-r added 3 commits July 1, 2024 13:18

CU-8694wh3d5: Disable usage monitor by default

2f5f77f

CU-8694wh3d5: Enable usage monitor during test time

63b40bc

CU-8694wh3d5: Use correct entities when using nested entities

9fcad45

tomolopolis requested changes Jul 10, 2024

mart-r added 4 commits July 17, 2024 13:02

CU-8694wh3d5: Allow 'auto' for usage monitor enable status in config

d642a53

Add relevant documentation to log_folder that it's not used on 'auto'

CU-8694wh3d5: Update config documentation to include environmental variables

205d15b

CU-8694wh3d5: Add automatic usage monitoring

9b660f8

CU-8694wh3d5: Add relevant tests to automatic usage monitoring

166d933

tomolopolis reviewed Jul 18, 2024

mart-r added 2 commits July 18, 2024 12:26

CU-8694wh3d5: Rename usage monitoring environmental variables to be more descriptive

6dcaf70

Merge branch 'master' into CU-8694wh3d5-track-usage

5a409f2

tomolopolis approved these changes Jul 23, 2024

mart-r merged commit 96706c8 into master Jul 23, 2024
8 checks passed

mart-r deleted the CU-8694wh3d5-track-usage branch August 12, 2024 12:50

mart-r mentioned this pull request Aug 22, 2024

Use the loaded model hash for usage monitor instead of recalculating it #477

Merged
