Optimize cut_into_windows for long cuts #1150
Great catch!! Perhaps we can simplify the code to always build the tree; I doubt the overhead with small cuts would be significant.
@pzelasko I actually would ultimately opt for always having the supervisions (and tracks in MixCut, and alignments inside a supervision) stored in an interval tree, maybe a thin wrapper around one from the library which can fall back to linear splits for small tree sizes... But that would be a major codebase overhaul, which should be further discussed, though I anticipate that such an approach would make lots of operations more efficient while making lots of code much simpler. In the meantime, I removed the condition, and the IntervalTree is built every time.
Interesting point. Do you see any immediate benefits of transitioning to interval tree to hold the supervisions? Except for arrays/tensors, I generally wanted to keep the APIs free of third-party classes (they're practically limited to built-in types and a few Lhotse basic types).
@pzelasko I do not see particular benefits of keeping limited to Python built-in types. Tbh, I would rather migrate everything related to manifests and their sets into some kind of performant pure immutable persistent trees and sequences with laziness by default (i.e. rewrite everything in Haskell lol). To keep it real, I think it would be quite nice to have a kind of Split is more simple and fundamental than If any type of manifest supports
At some point I contemplated moving basic data types to C++ structs for smaller memory footprint and faster (de)serialization, and maybe actual immutability, but it would have drastically complicated maintenance and further development of the library and limited the number of people who could help out. I like the idea in principle but it's likely not worth the trouble.
I understand the technical solution, but what would be the impact/motivation?
@pzelasko The motivation is to unify and heavily simplify all logic around cutting stuff recursively, throwing out large amounts of repetitive code and stopping caring about performance optimizations on a case-by-case basis. I.e. imagine if
and
where
For the
Imagine how much code could be thrown out and how much smaller the codebase could become.
I think splitting very long utterances into windows is kind of an unusual use-case that shouldn't drive the overall design.
Even though the ability to optimize truncate calls using IntervalTree index on supervisions has been present for quite a while, it hasn't been used in the cut_into_windows method... I don't know why.
I used this optimization when the numbers of windows and supervisions were not very small and achieved dramatic (~50x) improvements for cutting sets of very long recordings into windows:
(Tested on Intel Xeon Gold 5315Y CPU @ 3.20GHz)
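For readers who want the idea in isolation, here is a minimal stdlib-only sketch of why indexing the supervisions once helps when a long cut is split into many windows. It is not the code from this PR and does not use Lhotse or the intervaltree package: supervisions are modeled as plain (start, end) tuples, a sorted-starts index with bisect stands in for a real interval tree, and the names StartIndex, overlapping_linear and windows_sketch are hypothetical.

```python
import math
from bisect import bisect_left
from typing import List, Optional, Tuple

# Hypothetical stand-in for a supervision: (start, end) in seconds.
Interval = Tuple[float, float]


def overlapping_linear(sups: List[Interval], ws: float, we: float) -> List[Interval]:
    """Baseline: check every supervision against the window [ws, we)."""
    return [s for s in sups if s[0] < we and s[1] > ws]


class StartIndex:
    """Built once per cut; each window query then inspects only nearby supervisions.

    This is a simplified substitute for an interval tree, not an interval tree.
    """

    def __init__(self, sups: List[Interval]) -> None:
        self.sups = sorted(sups)
        self.starts = [start for start, _ in self.sups]
        # Longest supervision bounds how far left of the window we must look.
        self.max_dur = max((end - start for start, end in self.sups), default=0.0)

    def overlapping(self, ws: float, we: float) -> List[Interval]:
        # Anything starting before ws - max_dur must end at or before ws.
        lo = bisect_left(self.starts, ws - self.max_dur)
        # Anything starting at or after we cannot overlap [ws, we).
        hi = bisect_left(self.starts, we)
        return [s for s in self.sups[lo:hi] if s[1] > ws]


def windows_sketch(
    sups: List[Interval], total: float, win: float, use_index: bool = True
) -> List[List[Interval]]:
    """Return, for each consecutive window of length `win`, the overlapping supervisions."""
    index: Optional[StartIndex] = StartIndex(sups) if use_index else None
    out = []
    for i in range(math.ceil(total / win)):
        ws, we = i * win, min((i + 1) * win, total)
        if index is not None:
            hits = index.overlapping(ws, we)
        else:
            hits = sorted(overlapping_linear(sups, ws, we))
        out.append(hits)
    return out
```

The linear variant does work proportional to windows times supervisions, while the indexed variant sorts once and then touches only candidates near each window. The bisect-based index degrades if a single supervision is very long, which is one reason a proper interval tree is the more robust choice.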