-
Notifications
You must be signed in to change notification settings - Fork 251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NeuralynxIO: support List[str] for filename parameter #1440
Conversation
I guess one question I would have overall is why overload the filename argument? Unfortunately I don't think we would change (like change filename->filenames) since it is a core part, but why not add another argument include_filenames to mirror the exclude_filenames. I'll ping @samuelgarcia here so he can decide if he is okay with the overloading |
Maybe w could use filenames to avoid confusion ? even if it makes the signature a bit heavy... |
I think having Or are you suggesting switch filename to filenames and have some broken code? |
I don't feel it is necessary to create separate variables for single and multiple input files in the arguments. Maybe we can use The local |
I agree that's the better overall solution, but changing the argument name could break users' code. So my question for Sam was whether he was okay to do a breaking change at the api level or if he wanted to work around the current api to protect the old way of doing things. |
Maybe add a deprecate warning? |
I think this is perhaps an opportunity to formalize IO constructor arguments across all IO classes, before we get to the 1.0 release. I think it will be confusing to have some IO classes that accept a list of things as the first argument, and others only a single thing. To solve the current issue, I support @zm711's suggestion of adding |
Yeah, that makes sense. Shall we ignore |
I don't use NeuralynxIO very much, so I will defer to those who do, but would suggest:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple comments about the current implementation.
Hi, I am trying to run pytest, but it seems the environment cannot be correctly set up. Neither setup.py nor the pyproject.toml works for me. |
I routinely set up new environments building neo from source. What is the exact error you're getting. Note that we are still trying to work on making sure all tests work on Windows so testing may not be perfect on Windows machines (outside of core tests). |
I am running the code on a Mac (M1 chip). The error when I use poetry to install the package is:
I can use pip to install the packages with
I guess this is also due to the incorrect |
If you want to run full tests you should do
You are missing a package. This is fine if you don't want to download a bunch of extra packages, but then rather than run pytest at the root you'll need to run individual tests instead. We don't have a poetry section in our code and I don't use poetry so I'm not sure how to install with poetry. |
Hi, Sorry for the delay. I think this PR is ready to review. Let me know if there are additional comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another round of comments @NxNiki.
It is very close. I think we would want to simplify the logic a bit in a few places where I indicated. You want to try those? and then I'll do another read. Just ping me again when you're ready.
Hi @zm711, thanks for giving valuable suggestions. The PR is ready for another round of review. |
I'll give it another full read this weekend! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last suggestion and one last discussion point, but then I think we are basically there.
elif isinstance(include_filenames, str): | ||
include_filenames = [include_filenames] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the one I'm still a little torn on. I would prevent to force include_filenames to be a list, but I think it is okay if we accept a string in the case of only wanting one file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under the hood, include_filenames
has to be a list, or the iteration at line 220 would loop over each string character.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I wasn't clear. I was saying why should we accept a string at all (so delete line 165 and 166). Since our new argument is filenames
shouldn't we just always make the users give a list (or any iterable)? Why should we accept a single file rather than a list of a single file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I asked on Chatgpt and it seems quite common to allow for flexible input types in this case. I am open to both ways though.
Here is the answer from chatgpt:
In Python, when designing an I/O interface to read a list of files, it's generally user-friendly to allow for flexible input types. Here are the considerations for both approaches you mentioned:
Allowing a string or list as input:
Pros: More convenient for users who often deal with a single file. It reduces the overhead of wrapping a filename in a list, making the function easier to use in scripts or interactive sessions where only one file is processed.
Cons: The function implementation needs to handle different types of input, checking whether the input is a string or a list. This can make the function slightly more complex.
Strictly requiring a list of strings:
Pros: Consistency in input type can simplify the function's logic, as it always expects a list. This can make the code simpler and potentially less error-prone when dealing with multiple files.
Cons: Less convenient for cases where only a single file is involved, as the user must wrap the filename in a list even if it's just one.
Given these considerations, a common practice is to design the interface to accept both strings and lists. This approach enhances usability without significantly complicating the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@apdavison did you have opinions on this? I prefer to be more restrictive, but I know other parts of Neo's API is relatively permissive. I think you prefer more flexibility too @samuelgarcia, so I'm fine with being outvoted here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a tough call, but I can't see that accepting a string will cause major problems, so I'm happy to go with the current interface.
Co-authored-by: Zach McKenzie <[email protected]>
@NxNiki, we have a conference that a bunch of us are attending next week, but we will try to get final review done the week after that! |
Thanks for the update. I hope you have a nice trip! |
@apdavison did you want to check out the argument reorganization on this one before we have our meeting? |
@apdavison, just wanted to ping you again for giving this one a read! |
Hi, any updates? Let me know if there are additional comments or concerns. |
Many thanks @NxNiki for all your work on this, and sorry for the delayed response from my side. |
NP! This is a great learning opportunity for me. Thanks for the discussions and comments! |
@@ -183,33 +213,24 @@ def _parse_header(self): | |||
event_annotations = [] | |||
|
|||
if self.rawmode == "one-dir": | |||
filenames = sorted(os.listdir(self.dirname)) | |||
dirname = self.dirname | |||
filenames = os.listdir(self.dirname) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am realising only now that this little has an bi big impact because the file are unsorted and so channel are not ordered
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See this SpikeInterface/spikeinterface#3248
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You think this is Anthony's issue? This was just merged yesterday, so I guess we could confirm that the issue showed up, but makes sense that could be the problem. You plan to just open the PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry for missing this issue. Do we also want to sort include_filenames
? One edge case would be the user using os.listdir
to collect the files and pass it to include_filenames
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No wory. @Antho2422 did a quick fix.
For include_filenames, I think we do not need the sorter() because it makes that the user can providde a list in the desired order. no ?
Add a feature to allow multiple files as filename.
Now user can create a list of filenames and assign it to the argument
include_filenames
ofneo.io.NeuralynxIO(dirname='xxx', include_filenames=session_data)
to read the specific files.If there are multiple files for a single channel, they should be consistent for all channels.