Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] - Fix update of epoch_tags store #1521

Closed
wants to merge 6 commits into from
Closed

[BUG] - Fix update of epoch_tags store #1521

wants to merge 6 commits into from

Conversation

TomDonoghue
Copy link
Contributor

@TomDonoghue TomDonoghue commented Jul 26, 2022

Motivation

This addresses #1449

The way that epoch_tags was updated before didn't play well with strings, which should be a supported input. Since epoch_tags is a set, adding a string gets treated as an iterable, which then separates and adds each character.

This PR fixes this section to check the types of the tags, and deal with them if they are a string.

Note that the string processing line is a direct copy of pynwb.epoch line 47, to make sure strings are processed the same, and that epoch_tags properly supports the possible input of something like 'label1, label2'.

How to test the behavior?

There is example code that shows the problem in #1449. I check that this same code when tested against this PR works as it should.

Checklist

  • Did you update CHANGELOG.md with your changes?
  • Have you checked our Contributing document?
  • Have you ensured the PR clearly describes the problem and the solution?
  • Is your contribution compliant with our coding style? This can be checked running flake8 from the source directory.
  • Have you checked to ensure that there aren't other open Pull Requests for the same change?
  • Have you included the relevant issue number using "Fix #XXX" notation where XXX is the issue number? By including "Fix #XXX" you allow GitHub to close issue #XXX when the PR is merged.

@codecov
Copy link

codecov bot commented Jul 26, 2022

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.53%. Comparing base (d75cc82) to head (ca31f15).
Report is 244 commits behind head on dev.

Additional details and impacted files
@@            Coverage Diff             @@
##              dev    #1521      +/-   ##
==========================================
+ Coverage   90.52%   90.53%   +0.01%     
==========================================
  Files          25       25              
  Lines        2460     2463       +3     
  Branches      456      458       +2     
==========================================
+ Hits         2227     2230       +3     
  Misses        148      148              
  Partials       85       85              
Flag Coverage Δ
integration 69.87% <75.00%> (-0.01%) ⬇️
unit 83.96% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

oruebel
oruebel previously approved these changes Jul 30, 2022
@oruebel
Copy link
Contributor

oruebel commented Jul 30, 2022

Thanks for the bug fix!

@oruebel
Copy link
Contributor

oruebel commented Jul 31, 2022

Merging of this PR is currently blocked because flake8 checks are failing in file not modified by this PR. This appears to be due to tests running a newer version of flake8. I've created #1524 to fix those issues. Once that PR has been merged we can sync this PR with dev and merge.

@rly
Copy link
Contributor

rly commented Aug 2, 2022

Looking at epoch_tags with a closer eye, I do not fully understand why this instance variable exists and whether it works as intended. As far as I can tell, epoch_tags is a cache of the unique tag values in the NWBFile.epochs table. That may be convenient but

  1. it can become out of sync, especially if a user calls nwbfile.epochs.add_interval(...) instead of nwbfile.add_epoch(...), and
  2. it is not populated when loading data from a file -- so it is only cached on write.

Separately, the update method is called on epoch_tags which works only if epoch_tags is a set, but the docval for epoch_tags allows the variable to be a tuple or a list.

@TomDonoghue are you using this field, and if so, how are you using it?

@TomDonoghue
Copy link
Contributor Author

TomDonoghue commented Aug 3, 2022

@rly - I am not actively using this field - I stumbled onto this bug a while ago when I was exploring different ways to store event information (I ended up not actively using epochs). As far as I understand how it's working (having not gone too deep into this), I agree with your notes here that epoch_tags is a potentially useful idea, but that it's not guaranteed to be consistent or populated.

In terms of type, I'm not clear on if / how epoch_tags could end up anything other than a set in practice and calling update is done in the current version of the code in both add_epoch and add_epoch_column, so it seems to me that for this the code expects a set and the documentation is perhaps wrong?

@rly
Copy link
Contributor

rly commented Aug 3, 2022

OK. Good to know, thanks.

In NWBFile.__init__:

pynwb/src/pynwb/file.py

Lines 317 to 318 in d75cc82

{'name': 'epoch_tags', 'type': (tuple, list, set),
'doc': 'A sorted list of tags used across all epochs', 'default': set()},

epoch_tags is allowed to be a tuple or list, but as you mention, the code expects a set, so at the very least we need to update this to allow only sets and rewrite the docstring.

However, given its limited usefulness and potential confusion, I am inclined to raise a warning when it is used to say that 1) this variable is not stored on disk, 2) it is not populated on read, 3) it may be incorrect if epochs are added not through methods on NWBFile, and 4) that it will be deprecated in a future version of PyNWB. @oruebel @bendichter what do you think?

@TomDonoghue
Copy link
Contributor Author

@rly if epoch_tags is supposed to be a shortcut for getting all unique tags that are present, it sorta feels like this might be better accomplished by a property, something like:

@property
def epoch_tags(self):
    return set(self.epochs.tags[:])

I think the above would address both of the concerns with the current approach?
It does read tags every time it's called, but I'm assuming that the list of epochs / tags is unlikely to get so big that this becomes slow / an issue.

@TomDonoghue
Copy link
Contributor Author

@rly - just wanted to follow up here - is this still relevant / useful, or should I close this PR as out of date?

@stephprince stephprince mentioned this pull request Jul 18, 2024
6 tasks
@stephprince
Copy link
Contributor

I believe this was addressed with this PR: #1935. Please reopen if not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants