fix: lru_cache issues + meta info missing #72

Merged 3 commits on Aug 11, 2023

Commits on Aug 10, 2023

  1. fix: lru_cache issues + meta info missing

    Context: codecov/engineering-team#119
    
    The real issue with the meta info is fixed in codecov/shared#22.
    Spoiler: reusing the cached report details values and _changing_ them is not a good idea.
    
    However, while debugging that, @matt-codecov pointed out that we were not using lru_cache correctly.
    See this very well-made video: https://www.youtube.com/watch?v=sVjtp6tGo0g
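
The message does not say which misuse was found. As an assumption, one common pitfall is decorating an instance method with `functools.lru_cache`: the cache lives on the class-level function and keys on `self`, so it keeps every instance alive. A minimal sketch, with a hypothetical `Loader` class:

```python
import functools
import gc
import weakref


class Loader:
    """Hypothetical class, used only to illustrate the pitfall."""

    @functools.lru_cache(maxsize=None)
    def load(self, key):
        # `self` becomes part of the cache key, so the cache holds a
        # strong reference to the instance.
        return key.upper()


obj = Loader()
obj.load("a")
ref = weakref.ref(obj)

del obj
gc.collect()
alive_after_del = ref() is not None  # the cache still references the instance

Loader.load.cache_clear()
gc.collect()
freed_after_clear = ref() is None  # released only once the cache is cleared
```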
    
    The present changes upgrade shared, which fixes the meta info problem, AND address the cache issue.
    There are further complications with caching, which explain why I decided to store the cached value on the
    `obj` instead of `self`. There is only 1 instance of `ArchiveField`, shared among ALL instances of
    the model class (for example, all `ReportDetails` instances). This kinda makes sense because we only create an instance
    of `ArchiveField` once, in the declaration of the `ReportDetails` class.
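
The sharing described here is how Python descriptors behave in general. The sketch below is not the real `ArchiveField`; `NaiveArchiveField`, `load`, and `payload` are hypothetical names used to show what goes wrong when the cached value is stored on the descriptor itself:

```python
class NaiveArchiveField:
    """Hypothetical descriptor that caches on itself (the buggy pattern)."""

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if not hasattr(self, "_cached"):
            # Stored on the descriptor, so every model instance sees it.
            self._cached = obj.load()
        return self._cached


class ReportDetails:
    # The descriptor is created once, when the class body runs.
    files = NaiveArchiveField()

    def __init__(self, payload):
        self.payload = payload

    def load(self):
        return self.payload


a = ReportDetails(["a.py"])
b = ReportDetails(["b.py"])

shared_descriptor = ReportDetails.__dict__["files"]
a_files = a.files  # populates the cache on the shared descriptor
b_files = b.files  # returns a's value instead of b's own
```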
    
    Because of that, if the cache lives in the `self` of `ArchiveField`, different `ReportDetails` instances will see stale cached values that belong to other `ReportDetails` instances, and we get wrong values. To fix that I envision 3 possibilities:
    1. Putting the cached value in the `ReportDetails` instance directly (the `obj`), and checking for the presence of that value.
    If it's there, it's guaranteed that we put it there, and we can update it on writes, so we can always use it. Because the cache is
    per `ReportDetails` instance, we always get the correct value, and it is cleared when the instance is garbage collected.
    
    2. Storing an entire table of cached values in the `self` (`ArchiveField`) and using the appropriate cached value when possible. The problem here is that we need to manage the cache ourselves (which is not that hard, honestly) and probably set a maximum size. We would then populate the cache and evict old values over time. The 2nd problem is that the values themselves might be too big to hold in memory (which can be fixed by setting a very small cache size). There's a fine line there, but it's more work than option 1 anyway.
    
    3. Moving the getting and parsing of the value outside `ArchiveField` (so it's a normal function) and using `lru_cache` on that function. Because the `rehydrate` function takes a reference to `obj`, I don't think we should pass it to the cached function. The issue here is that we can't cache the rehydrated value and would have to rehydrate every time (which is currently not expensive at all in any model).
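
For comparison, option 3 could look roughly like the sketch below. `FAKE_ARCHIVE`, `READS`, and `load_archived_json` are hypothetical stand-ins and not part of the actual change; the point is only that `lru_cache` sits on a plain function keyed by a hashable argument rather than on a method:

```python
import functools
import json

# Hypothetical in-memory stand-in for archive storage.
FAKE_ARCHIVE = {"reports/1.json": '{"files": ["a.py"]}'}
READS = []  # records every real read, to show when the cache is hit


@functools.lru_cache(maxsize=128)
def load_archived_json(path: str):
    """Fetch and parse one archived value; cached per path."""
    READS.append(path)
    return json.loads(FAKE_ARCHIVE[path])


first = load_archived_json("reports/1.json")
second = load_archived_json("reports/1.json")  # served from the cache
```

Both calls return the same object, so mutating the result changes what every later caller sees. Rehydration would have to happen outside the cached function, on a copy.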
    
    This is an instance cache, so it shouldn't need to be cleared for the duration of the instance's life
    (because it is updated on SET).
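
A minimal sketch of the chosen approach (option 1), under stated assumptions: `CACHE_ATTR`, `load_from_archive`, and `write_to_archive` are hypothetical names, and the real `ArchiveField` does more than this. The cached value is stored on the model instance and refreshed on every write:

```python
class ArchiveField:
    """Hypothetical sketch: the cache lives on the model instance."""

    CACHE_ATTR = "_archive_field_cached_value"  # assumed attribute name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.CACHE_ATTR not in obj.__dict__:
            # First read for this instance: hit the archive once.
            obj.__dict__[self.CACHE_ATTR] = obj.load_from_archive()
        return obj.__dict__[self.CACHE_ATTR]

    def __set__(self, obj, value):
        obj.write_to_archive(value)
        # Keep the cache in sync so it never needs invalidating.
        obj.__dict__[self.CACHE_ATTR] = value


class ReportDetails:
    files = ArchiveField()

    def __init__(self, stored):
        self.stored = stored
        self.reads = 0

    def load_from_archive(self):
        self.reads += 1
        return self.stored

    def write_to_archive(self, value):
        self.stored = value


a = ReportDetails(["a.py"])
b = ReportDetails(["b.py"])
a.files
a.files  # second read comes from the instance cache
b_files = b.files  # b has its own cache, so it gets its own value
a.files = ["c.py"]  # the write refreshes a's cached value
```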
    
    closes codecov/engineering-team#119
    giovanni-guidini committed Aug 10, 2023
    Commit f3f34c9

Commits on Aug 11, 2023

  1. fix: update archive field cache

    * Handle the case where a single model uses multiple archived fields
    (dynamic cached-property name per archived field)
    * Concentrate getting/setting the cache in the `__get__` and `__set__` methods
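
One way to get a per-field cache name is `__set_name__`, which Python calls on each descriptor when the owning class is created. This is a sketch under assumptions, not the committed code: the `Report` model, the `archive` dict, and the `_<name>_cached_value` naming scheme are all hypothetical.

```python
class ArchiveField:
    """Hypothetical sketch: one cache attribute per archived field."""

    def __set_name__(self, owner, name):
        self.public_name = name
        # Derive a distinct cache attribute from the field name.
        self.cache_attr = f"_{name}_cached_value"

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.cache_attr not in obj.__dict__:
            obj.__dict__[self.cache_attr] = obj.archive[self.public_name]
        return obj.__dict__[self.cache_attr]

    def __set__(self, obj, value):
        obj.archive[self.public_name] = value
        obj.__dict__[self.cache_attr] = value


class Report:
    files = ArchiveField()
    totals = ArchiveField()

    def __init__(self, archive):
        self.archive = archive  # dict standing in for archive storage


r = Report({"files": ["a.py"], "totals": {"lines": 10}})
files_value = r.files
totals_value = r.totals
r.totals = {"lines": 12}  # only the totals cache is updated
```
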
    giovanni-guidini committed Aug 11, 2023
    Commit 7ecd515
  2. Commit 6528bd0