Add video preprocessing (denoising) feature #83
Conversation
Pull Request Test Coverage Report for Build 12208523886 (Details)
💛 - Coveralls
omfg you already did it!?!??!?!?! u are so fast. cant wait to check this out!!!!!
amazing!!!! Thanks, Takuya, for working on this already!
Agree, I think we want to drop data rather than copy it: it's scientific data, after all, and the numbers matter, so duplicating a frame is effectively fabricating data (even though this wouldn't be a bad strategy if we were streaming a movie and didn't care as much about the pixel values).
This is a decent pair of thoughts that highlights that we might want to separate cosmetic/display-oriented processing stages from signal-repair stages: the frequency filtering is repairing the image, while the minimum projection is just for display purposes (right? I'm assuming that in analysis people will want the full signal). I've been working through the backlog of issues after the holiday, and my first big chunk of work will be on mio, so I'll review this and related stuff this week. Thanks again for this, Takuya :)
Thanks for the comments! I'm stuck with other stuff at the moment, but I'll make this ready for review and ping you guys hopefully next week. Something I'd like to know from @sneakers-the-rat is whether the high-level structure seems mergeable into the pipeline refactor. There's no intention to make this fully compatible, because we need this now and it's an independent module, but the code structure is pretty arbitrary, so it would be good if there is something I can easily anchor to. And yes, Phil mentioned too that dropping broken frames would be good, so that's planned (it might just be an option for dropping, because masking is already there). I was just too lazy to look into how processing pipelines handle this and track dropped buffers. The FFT and minimum projection are just modules, so I think they should simply be available whenever needed. The frequency mask might be used in actual preprocessing before CNMF-E pipelines, but I'm not sure.
Adding a bug note here for the buffer patching. Files used to reproduce this bug: invivo-20241203-1-003, invivo-20241203-3.avi, using commit:
mio/process/video.py (Outdated)
"""

@staticmethod
def denoise(
Seems like there's an opportunity for improvement in terms of SOLID design, especially the single responsibility principle. This denoise method, for example, appears to:
- check for module
- read video
- write video
- noise patch
- freq. filtering
- interactive display
- image stacking
This will make it incredibly painful to debug, maintain, and extend the code. If we want to add a new processing step, for instance:
a. we wouldn't know where to add the new step
b. the whole processing function has to be modified
c. the test for the entire processing step has to be updated
d. if something goes wrong, we wouldn't know where it went wrong, because we modified the entire processing step
And if we want to modify this function, everything downstream that uses it, even code completely irrelevant to the changes, also has to be modified. If we want to remove one of the processing steps from the saved video but keep seeing it in the interactive display, the entire function has to be completely overhauled.
The rule of thumb is normally that if there are more than 2-3 if statements in a function that isn't low-level logic, it's a symptom that the function can be improved for readability and maintainability. :)
I'd hazard a guess that a lot of these if blocks can be packaged into different classes and files with clearer names and more consistent formatting across methods for better reusability.
Agreed. Shouldn't be too tricky to split each of these into separate classes with an __init__ that takes config, a process() that applies the transform, and a finish() (or whatever we want to call it) that handles writing output if requested.
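A minimal sketch of what that common shape could look like (all names below are placeholders for illustration, not existing mio code):

```python
import numpy as np


class FrequencyMaskStage:
    """Hypothetical processing stage: takes its config at init, transforms
    frames one at a time, and handles its own output when finished."""

    def __init__(self, config: dict):
        self.config = config
        self.n_processed = 0

    def process(self, frame: np.ndarray) -> np.ndarray:
        # stand-in for the real transform (e.g. an FFT-domain mask)
        self.n_processed += 1
        return frame

    def finish(self) -> None:
        # a real stage would write its output video here if configured to
        pass


def run_stages(frames: list, stages: list) -> list:
    """Pass each frame through every stage, then let each stage finalize."""
    out = []
    for frame in frames:
        for stage in stages:
            frame = stage.process(frame)
        out.append(frame)
    for stage in stages:
        stage.finish()
    return out
```

With this shape, adding a new processing step is just adding one class and one list entry, and each stage can be tested in isolation.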
Yeah, as you can see, it's literally just a raw script disguised as a class right now. I'll try to be more explicit that drafts like this aren't awaiting formatting review, or avoid uploading PRs in draft status, so we can work efficiently.
Alright, thanks for the patience; I finally got time to review this.
Three changes I think we should make for sure before merging this:
- split out each of the processing methods into separate classes with a common API, and then make the video processor class generic over those classes; details in comments.
- unify the existing Frame and Frames classes with the new NamedFrame class, making separate classes for a single frame vs. collections of frames.
- have tests for all of this.

The rest are perf improvements and style comments.
> whether the high-level structure seems mergeable to the pipeline refactor. No intention to make this really compatible because we need this now and this is an independent module but the code structure is pretty arbitrary so it might be good if there is something I can easily anchor to.

If we split up each of these processing stages into separate classes with a single API, then I can handle putting them into the pipeline system once it drops.
.gitignore (Outdated)
!user_dir/.gitkeep

~/.config/miniscope_io/logs/
Not too much harm in having extra entries in .gitignore, but this entry should hopefully be unnecessary (it's not in the repo directory, so it shouldn't have an effect anyway).
Oh, this is just an accident; I don't know how it got in.
.pdm-python
user_dir/*
What are these directories? We should ideally keep everything we create underneath the single user_dir set up in the config.
You're talking about user_dir? I just wanted somewhere to keep data for my convenience while I work in the repo. I'll get rid of them before taking this out of draft state.
Oh I forgot user_dir is used for config. I might keep a user_data dir just for convenience in dev.
Up to you where you want to put it; no problem having an extra line in .gitignore. If you wanted this to persist, you would set user_dir in the global config for your machine (mio config global path).
@@ -19,6 +19,107 @@
from mio.types import ConfigSource


class VideoWriter:
Sort of an odd object with a single staticmethod, especially since this one returns a cv2.VideoWriter and the VideoReader class wraps it. Fine for now, because it's on me to get the pipeline system in place, where we can make this a predictably structured Sink class.
Yeah, I just pulled this out from the stream DAQ without thinking much here.
mio/process/video.py (Outdated)

if config.frequency_masking.enable:
    freq_filtered_frame, frame_freq_domain = processor.apply_freq_mask(
        img=patched_frame,
If noise_patch.enable == False, this won't be defined. We want to pass a single frame through several processing stages, and mutating that frame is a reasonable expectation within the processing pipeline; making a new copy for each processing stage would get pretty dang memory- and perf-intensive pretty fast.
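One way to avoid the undefined-name case, sketched here with config attribute names mirroring the snippet above (the helper function itself is hypothetical): rebind a single frame variable through each enabled stage, so a skipped stage just leaves the previous result in place.

```python
def process_frame(frame, config, processor):
    # each stage reads and rebinds `frame`, so disabling a stage never
    # leaves a later stage referencing an undefined name
    if config.noise_patch.enable:
        frame = processor.patch_noisy_buffer(frame)
    if config.frequency_masking.enable:
        frame, _freq_domain = processor.apply_freq_mask(img=frame)
    return frame
```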
mio/process/video.py (Outdated)
minimum_projection = VideoProcessor.get_minimum_projection(output_frames)

subtract_minimum = [(frame - minimum_projection) for frame in output_frames]

subtract_minimum = VideoProcessor.normalize_video_stack(subtract_minimum)
definitely a separate class as well
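A sketch of what that separate class might look like, following the init/process/finish shape proposed above (the class name and exact normalization are hypothetical, not the PR's code):

```python
import numpy as np


class MinimumProjectionStage:
    """Hypothetical stage: passes frames through while accumulating them,
    then subtracts the per-pixel minimum and normalizes at the end."""

    def __init__(self):
        self.frames: list = []

    def process(self, frame: np.ndarray) -> np.ndarray:
        self.frames.append(frame)
        return frame  # pass-through: the projection needs the whole stack

    def finish(self) -> list:
        stack = np.stack(self.frames).astype(np.float64)
        subtracted = stack - stack.min(axis=0)  # remove minimum projection
        peak = subtracted.max()
        if peak > 0:
            subtracted = subtracted / peak * 255  # normalize for display
        return [f.astype(np.uint8) for f in subtracted]
```

Because the projection needs the whole stack, this stage is pass-through during process() and does its real work in finish(), which keeps it display-only without touching the frames the other stages see.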
mio/process/video.py (Outdated)
freq_filtered_frames.append(freq_filtered_frame)
output_frame = freq_filtered_frame
output_frames.append(output_frame)
finally:
Same thing here: if each of the processing classes had a finish() (or whatever name) method that handled its output routine, then all we would do here is

for processor in processors:
    processor.finish()
mio/process/video.py (Outdated)
if config.interactive_display.enable:
    videos = [
        raw_video,
        noise_patch,
        patched_video,
        freq_filtered_video,
        freq_domain_video,
        min_proj_frame,
        freq_mask_frame,
    ]
    VideoPlotter.show_video_with_controls(
        videos,
        start_frame=config.interactive_display.start_frame,
        end_frame=config.interactive_display.end_frame,
    )
Here's a good example of why we want all these things to be separable functions/classes/etc. that can work framewise: ideally this would work on demand, where we just open the UI and compute frames as requested in the context of a GUI (not something mio should try to provide), rather than needing to precompute everything and set up a UI for it afterwards.
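A sketch of that on-demand idea (the frame-fetching callable is an assumption, not mio's actual interface): a GUI asks for one processed frame at a time instead of a precomputed video.

```python
def make_on_demand_view(get_frame, stages):
    """Return a callable that computes the processed version of a single
    frame index on request, instead of precomputing the whole video.
    `get_frame(i)` is a placeholder for however raw frames are fetched."""

    def view(index):
        frame = get_frame(index)
        for stage in stages:
            frame = stage.process(frame)
        return frame

    return view
```

A GUI scrubber would then just call view(i) as the user moves through the video, paying only for the frames actually displayed.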
mio/process/video.py (Outdated)
return min_projection

@staticmethod
def normalize_video_stack(image_list: list[np.ndarray]) -> list[np.ndarray]:
This should either be a separate class like the other processor classes, or a function; it's strange to have it here in a different form than all the other video processing methods.
Also, while I'm a frequent violator of this myself 😭, each PR should ideally consist of ~10 commits, with fewer than 500 lines of code. @t-sasatani: here's something that might help that @sneakers-the-rat and I presented during the lab meeting you couldn't attend (link)
Thanks. I'll look at these later, but could you guys hold off on formatting reviews while the PR is marked draft? I commented in this thread last week that I'll change it to ready-for-review and ping you. It's not that I'm bothered or anything, but I do prefer using drafts as drafts, so we can discuss experimental prototypes before thinking about formatting or structure.
And thanks for the info, @raymondwjang. I think I attended that meeting, but I sent a view request because I couldn't access the slides. I can squash commits later if you prefer, but I'd guess it's unlikely that PRs for refactoring or adding new modules (like this one) will fit in 500 lines of code. That rule makes sense for minor additions or fixes, but is it considered better for everything?
Shared! Obviously there are reasonable violations of the rule, but simply seeing 1000+ lines on a PR is pretty threatening as a reviewer. Even within enterprise-level large projects, massive PRs are rare unless there's a sweeping feature change or a complete refactor, as they always introduce exponentially larger review overheads, bug risks, rollback costs, etc. Even with new features, it's not super common to see PRs approaching 1000 lines and above, and that's including the tests and documentation. Also remember that those repos are generally hosted with an extensive suite of CI/CD tools that make sure the code stays healthy (which we don't currently have). At present this PR is 1000 lines, but with tests and documentation it might approach 2000; that can turn into hours of reviewing while still missing bugs and edge cases. I had a similar PR that grew uncontrollably, and I ended up having to chop it into 5 different PRs so that it was somewhat reviewable. As @sneakers-the-rat can attest, I often fail to keep things nicely packaged for reviewers and future me as well, but from what I've learned, it seems like it's always recommended to keep things small, nimble, and modular. Do as I say, not as I do lmao
No problem! I was just getting started working on mio again and wanted to catch up here. Draft status totally understood; I'm giving feedback that hopefully helps with some ideas on different ways to finalize it :) And Raymond, I super appreciate the advocacy for small PRs. There are a few gigantic refactors that need to get done, so I'll be guilty of a few very large PRs myself soon enough, but once we get into a roughly stable code structure we should be able to iterate much more cleanly :)
Ok, I'll try to keep these short, though I'll probably violate it sometimes. It might be easier for me to overlook review comments on draft PRs, sorry for that in advance.
Co-authored-by: Jonny Saunders <[email protected]>
from ..conftest import DATA_DIR, CONFIG_DIR

def test_interactive_display_config():
    config = DenoiseConfig.from_id("denoise_test").interactive_display
@sneakers-the-rat, it might be in the docs somewhere, but could you let me know how these configurations are found in the test dir? I'm putting it into the same directory as the test config for the stream DAQ, but getting this. Does it need some registering or anything?

No config with id denoise_test found in /path/to/somewhere/Application Support/mio/config
Just leaving a note here so I don't forget: I'm skipping these config tests for now.
In general I don't think we need these kinds of tests (if we wanted to ensure that these values don't change, we could just copy them to tests and assert that the copy in tests is equal to the one in the package), but the fact that it isn't being found would be a bug in the config discovery methods or in how we are monkeypatching them to include the tests/data/config directory during testing. I'll check this out in the morning.
I'm switching this to ready for review as I'm done for today. It's ready structure-wise; I changed the "patch" frame function to "drop" frames and added a tracking/export thing. I haven't gone through the review comments yet, more documents/tests need to be completed, and the interfaces, like the naming of the export paths, are unsettled, so I'll return to them later this week. In case anyone wants to take over or make a significant change in the meantime, feel free to do so.
return mask


class ZStackHelper:
i like this idea :)
self.height = height
self.width = width

def split_by_length(self, array: np.ndarray, segment_length: int) -> list[np.ndarray]:
Something like this seems pretty general-purpose, but we don't have a good place to put general array manipulation functions; that would be a utils junk drawer. I like this helper-class idea, bundling some operations together with some staticmethods. I get where this is coming from, being split off from the previous implementation, so I'm fine with this, but in the future I might want to play with this helper-class idea, make it its own module, and make some generic "ArrayHelper"/"BufferHelper" kind of classes.
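For reference, a standalone generic version of such a splitter could look like the sketch below (this is an illustration, not the code in this PR; in particular, keeping the shorter remainder segment at the end is my assumption):

```python
import numpy as np


def split_by_length(array: np.ndarray, segment_length: int) -> list:
    """Split a 1-D array into consecutive segments of segment_length;
    a shorter remainder segment is kept at the end."""
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    return [
        array[i : i + segment_length]
        for i in range(0, len(array), segment_length)
    ]
```

As a free function this drops the class's height/width state, which is what makes it reusable outside the z-stack context.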
This PR adds a denoising module for neural recording videos. It can be used offline, and with some updates it could be used in real time with the streamDaq. It would be better to quickly merge this into the main branch so other processing features can be added in a relatively stable location. Also, nothing depends on this now anyway.
Processing features
Remove broken buffers by copying in the buffer at the same position in the previous frame. The following is an example frame where noisy regions are detected and patched with the prior frame's buffer:
Interface
You can run this with mio process denoise using an example video. The denoising parameters and export settings can be defined with a YAML file.
Minor changes
Moved VideoWriter to the io module because it's pretty generic, and I wanted to use it for the denoising.
📚 Documentation preview 📚: https://miniscope-io--83.org.readthedocs.build/en/83/