Add cache for KaldiReader #1004

Merged: 4 commits, Mar 22, 2023


david20181

Make KaldiReader faster by caching the underlying reader objects instead of re-creating them on every read.
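For readers of this thread, here is a minimal sketch of the general idea, not the actual change in this PR. `DummyReader` and `get_reader` are hypothetical names used only for illustration; the sketch assumes that opening a reader is the expensive step and that readers can be keyed by their path.

```python
from functools import lru_cache


class DummyReader:
    """Hypothetical stand-in for an expensive-to-open reader (not lhotse's real class)."""

    opened = 0  # counts how many reader objects were constructed

    def __init__(self, path: str) -> None:
        DummyReader.opened += 1
        self.path = path

    def read(self, key: str) -> str:
        # Placeholder for an actual feature lookup.
        return f"{self.path}:{key}"


@lru_cache(maxsize=None)
def get_reader(path: str) -> DummyReader:
    # Constructed once per distinct path; later calls reuse the cached object.
    return DummyReader(path)


# Three reads against the same path construct only one reader.
for key in ("utt1", "utt2", "utt3"):
    get_reader("feats.scp").read(key)
```

With `maxsize=None` the cache is unbounded, so every distinct path keeps its reader alive until the cache is cleared.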

)


@lru_cache(maxsize=None)
Collaborator


Can you also update

def close_cached_file_handles() -> None:

to clear the objects cached by this function?

Also, maybe I'm wrong, but I think we once supported caching for kaldi_io/kaldi_native_io, and for very large Kaldi data dirs it took a lot of memory.
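A rough sketch of the two points raised in this comment, under stated assumptions: `open_reader` and `close_cached_readers` are hypothetical names, not lhotse functions, and the plain dict stands in for a real reader object. It shows how a function wrapped in `functools.lru_cache` can be emptied with `cache_clear()`, and how a finite `maxsize` limits the number of cached objects.

```python
from functools import lru_cache


@lru_cache(maxsize=2)  # bounded: at most 2 entries; the least recently used one is evicted
def open_reader(path: str) -> dict:
    # Hypothetical placeholder for opening a reader; a plain dict is returned here.
    return {"path": path}


def close_cached_readers() -> None:
    """Hypothetical cleanup hook: drop every object cached by open_reader."""
    open_reader.cache_clear()


open_reader("a.scp")
open_reader("b.scp")
open_reader("c.scp")  # cache is full, so the entry for "a.scp" is evicted

size_before = open_reader.cache_info().currsize
close_cached_readers()
size_after = open_reader.cache_info().currsize
```

Note that eviction and `cache_clear()` only drop the cache's references; if the cached objects hold file handles that need an explicit close, that would have to be handled separately.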

Author


Good suggestion on clearing the cache. I've updated the PR accordingly.

I've tested on some fairly large data dirs and have not observed any memory issues.

Collaborator


Thanks for confirming that! LGTM, please just fix the formatting.

pzelasko merged commit 55da8b4 into lhotse-speech:master on Mar 22, 2023
pzelasko added this to the v1.13 milestone on Mar 22, 2023
@pzelasko
Collaborator

Thanks!
