-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance impact with sparse matrix #68
Comments
Thanks for reporting this issue together with a reproducible example and the benchmark code! My first thought is that the established condition handlers could cost a lot of time but fired conditions may not always be visible (eg. custom condtions). Which version of |
FYI: I can reproduce the behaviour on the current release candidate of If I call the underlying
|
First analysis: The performance loss happens in the function Somehow the print function in call 7 and 8 of the above full call stack has a huge string length ( The goal is now to avoid creating the huge strings (just to cut them later) but this is done in C code inside of ( Edit 1: The R-internal function |
Thank you for checking this and finding the source! Sorry about the missing version number. Here is
When I first ran into this issue while using Thanks for your help with this! |
…t tests not yet fixed (three do fail)
Bug fix implementation detailsImplementation see the new branch: https://github.com/aryoda/tryCatchLog/tree/bug/68_performance_with_matrix How to install the bug fix branch for testing
Minimal reproducible exampleYou can use this code snippet to reproduce the problem and set a breakpoint into the
Background
Each
Now you can find the complete data within
Since
Implemented fixTo avoid this the conversion of the full Luckily the implementation of Now it is easy to change the implementation to extract only the first line (
Open problemsThe stack trace output is shorter now than before because the internal implementation in C uses a different See R's internal implementation:
Expected best solution: With width.cutoff = 60 (R's default) nine unit tests fail! I will hard code the internal width.cutoff (500) in There are "only" three failing unit tests in Next steps
|
This is really amazing! I confirm that the update fixes the issue on my end. I see no lag or hang with large sparse matrices now. Here is the output from the original benchmark test on this new branch:
Times are now on par with and without Thank you for the fix and the thorough explanation! I've learned a ton about R internals from your posts. Very clever to use |
I have analyzed the three failing unit tests now (all related to the call stack output). Analysis results:There are two reasons for the failing unit tests:
Problem 1 is OK, reducing the number or code lines in the stack trace is intentional. Problem 2 is due inconsistent default arguments in the implementation of A direct The SolutionI will apply the following changes:
This is reasonable since
DetailsThe function call path of R's
The
Compare: The stack trace of an error is printed with different Call path:
deparse1WithCutoff(call, FALSE, DEFAULT_Cutoff, backtick, DEFAULTDEPARSE, /* nlines = */ 1); |
BTW:
|
This is really awesome, continuing to learn a ton of R here. Everything looks good from my initial usage of the latest updates on this branch, including the initial case with the large sparse matrix. |
I am preparing the CRAN release new (planned end of October 2021) |
BTW: Core R had a similar performance problem in https://stat.ethz.ch/pipermail/r-devel/2021-October/081159.html |
This is great, I can't thank you enough for it! Been using the fix daily without any issues. Glad Core R has benefited from a similar fix as well! |
FYI: Released at CRAN today as version 1.3.1: https://cran.r-project.org/package=tryCatchLog @yoda-vid Many thanks for your productive and valuable contribution and support! |
Installed the latest release, and it's working great. Thank you again for this fix (and your detailed analysis) and putting out such an excellent and useful package! |
First off, thanks for this package, which has saved me countless hours of debugging in R.
I've encountered a perplexing issue where accessing sparse matrices appears to be slower or even to hang within a
tryCatchLog
block. I can reproduce it by generating a somewhat large sparse matrix with column names, which prints fine on its own but appears to hang when printing it within atryCatchLog
block.For some reason, I only see this issue when including column names (the situation where I first encountered this). Even stranger is if I load a library such as
Seurat
(where I discovered this issue), the problem is only apparent when printing the matrix within a list. Here is a comparison with and withouttryCatchLog
for matrices of increasing size, where the time differences become more dramatic as size increases.Outputs for increasing matrix sizes (in minutes):
Would this have something to do with serializing the matrix in the
tryCatchLog
block?Thanks for your help and for this package!
sessionInfo()
output:The text was updated successfully, but these errors were encountered: