-
Notifications
You must be signed in to change notification settings - Fork 503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce GC'able cache for Roo::Excelx::Extractor#doc #456
Introduce GC'able cache for Roo::Excelx::Extractor#doc #456
Conversation
Additionally, we can look if this optimization can be used elsewhere in the repo. If so, we can extract instance_cache code and optimize other places. But, I think it will be better to do that in separate PR. |
@chopraanmol1 The bottom of the Since we only REALLY support MRI 2.0+, shouldn't we just use that rather than pull in an external dependency? That and I don't see much activity on the WeakRef repo in the past year. |
Memory Profiling script: ``` MemoryProfiler.report{Roo::Excelx.new('test/files/Bibelbund.xlsx').tap{|x|(2..x.last_row).each{|i| x.row(i)}}}.pretty_print ``` Benchmark Script: ``` puts Benchmark.measure{ Roo::Excelx.new('test/files/Bibelbund.xlsx').tap{|x|(2..x.last_row).each{|i| x.row(i)}} } ``` Result on Master: ``` Total allocated: 37131810 bytes (517026 objects) Total retained: 5562913 bytes (103010 objects) allocated memory by gem ----------------------------------- 19288066 roo/lib 11049821 nokogiri-1.8.4 6792403 rubyzip-1.2.2 1304 tmpdir 216 other retained memory by gem ----------------------------------- 5560934 roo/lib 782 rubyzip-1.2.2 725 nokogiri-1.8.4 296 tmpdir 176 other 0.720000 0.020000 0.740000 ( 0.733750) ``` Result after patch: ``` Total allocated: 34561842 bytes (504998 objects) Total retained: 5563553 bytes (103026 objects) allocated memory by gem ----------------------------------- 19254338 roo/lib 8513101 nokogiri-1.8.4 6792403 rubyzip-1.2.2 1304 tmpdir 320 weakref 216 other 160 ref-2.0.0 retained memory by gem ----------------------------------- 5561094 roo/lib 782 rubyzip-1.2.2 725 nokogiri-1.8.4 320 weakref 296 tmpdir 176 other 160 ref-2.0.0 0.610000 0.010000 0.620000 ( 0.618642) ```
c939c01
to
feee367
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great
Discussed in roo-rb#457 (comment) No noticeable performance or memory benefits. I think roo-rb#456 must have solved most performance and memory issue regarding extract_hyperlinks.
pkgsrc change: add "USE_LANGUAGES= # none". ## [2.8.0] 2019-01-18 ### Fixed - Fixed inconsistent column length for CSV [375](roo-rb/roo#375) - Fixed formatted_value with `%` for Excelx [416](roo-rb/roo#416) - Improved Memory consumption and performance [434](roo-rb/roo#434) [449](roo-rb/roo#449) [454](roo-rb/roo#454) [456](roo-rb/roo#456) [458](roo-rb/roo#458) [462](roo-rb/roo#462) [466](roo-rb/roo#466) - Accept both Transitional and Strict Type for Excelx's worksheets [441](roo-rb/roo#441) - Fixed ruby warnings [442](roo-rb/roo#442) [476](roo-rb/roo#476) - Restore support for URL as file identifier for CSV [462](roo-rb/roo#462) - Fixed missing location for Excelx's links [482](roo-rb/roo#482) ### Changed / Added - Drop support for ruby 2.2.x and lower - Updated rubyzip version for fixing security issue. Now minimal version is 1.2.1 - Roo::Excelx::Coordinate now inherits Array [458](roo-rb/roo#458) - Improved Roo::HeaderRowNotFoundError exception's message [461](roo-rb/roo#461) - Added `empty_cell` option which by default disable allocation for Roo::Excelx::Cell::Empty [464](roo-rb/roo#464) - Added support for variable number of decimals for Excelx's formatted_value [387](roo-rb/roo#387) - Added `disable_html_injection` option to disable html injection for shared string in `Roo::Excelx` [392](roo-rb/roo#392) - Added image extraction for Excelx [414](roo-rb/roo#414) [397](roo-rb/roo#397) - Added support for `1e6` as scientific notation for Excelx [433](roo-rb/roo#433) - Added support for Integer as 0 based index for Excelx's `sheet_for` [455](roo-rb/roo#455) - Extended `no_hyperlinks` option for non streaming Excelx methods [459](roo-rb/roo#459) - Added `empty_cell` option to disable Roo::Excelx::Cell::Empty allocation for Excelx [464](roo-rb/roo#464) - Added support for Integer with leading zero for Roo:Excelx [479](roo-rb/roo#479) - Refactored Excelx code [453](roo-rb/roo#453) [477](roo-rb/roo#477) [483](roo-rb/roo#483) [484](roo-rb/roo#484) ### Deprecations - Roo::Excelx::Sheet#present_cells is deprecated [454](roo-rb/roo#454) - Roo::Utils.split_coordinate is deprecated [458](roo-rb/roo#458) - Roo::Excelx::Cell::Base#link is deprecated [457](roo-rb/roo#457)
Memory Profiling script: ``` MemoryProfiler.report{Roo::Excelx.new('test/files/Bibelbund.xlsx').tap{|x|(2..x.last_row).each{|i| x.row(i)}}}.pretty_print ``` Benchmark Script: ``` puts Benchmark.measure{ Roo::Excelx.new('test/files/Bibelbund.xlsx').tap{|x|(2..x.last_row).each{|i| x.row(i)}} } ``` Result on Master: ``` Total allocated: 37131810 bytes (517026 objects) Total retained: 5562913 bytes (103010 objects) allocated memory by gem ----------------------------------- 19288066 roo/lib 11049821 nokogiri-1.8.4 6792403 rubyzip-1.2.2 1304 tmpdir 216 other retained memory by gem ----------------------------------- 5560934 roo/lib 782 rubyzip-1.2.2 725 nokogiri-1.8.4 296 tmpdir 176 other 0.720000 0.020000 0.740000 ( 0.733750) ``` Result after patch: ``` Total allocated: 34561842 bytes (504998 objects) Total retained: 5563553 bytes (103026 objects) allocated memory by gem ----------------------------------- 19254338 roo/lib 8513101 nokogiri-1.8.4 6792403 rubyzip-1.2.2 1304 tmpdir 320 weakref 216 other 160 ref-2.0.0 retained memory by gem ----------------------------------- 5561094 roo/lib 782 rubyzip-1.2.2 725 nokogiri-1.8.4 320 weakref 296 tmpdir 176 other 160 ref-2.0.0 0.610000 0.010000 0.620000 ( 0.618642) ```
…b#459) Discussed in roo-rb#457 (comment) No noticeable performance or memory benefits. I think roo-rb#456 must have solved most performance and memory issue regarding extract_hyperlinks.
Memory Profiling script:
Benchmark Script:
Result on Master:
Result after patch:
Note:
Ruby does have native implementation of WeakRef. But I've choosed to go with 'ref' gem for this following reason:
https://github.com/ruby-concurrency/ref#problems-with-weakref