
Drop offsets when tokenizing with RegularParser #111

Merged · 1 commit · Dec 20, 2024

Conversation

funkjedi
Contributor

Testing with the sample string that was causing memory issues for me, this refactor reduced memory usage by 46%.

A super simple test, showing peak usage after processing (i.e. `memory_get_peak_usage()`) minus the peak prior to processing:

| new code | existing code |
|----------|---------------|
| 110 MB   | 258 MB        |
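The measurement above can be sketched as follows. The helper name and the stand-in workload are illustrative; the original test called `RegularParser::parse()` on a large sample string that is not included in this PR:

```php
<?php
// Sketch of the peak-memory measurement described above. measurePeakDelta
// and the stand-in workload are hypothetical; the original test parsed a
// large sample string with Thunder\Shortcode\Parser\RegularParser.
function measurePeakDelta(callable $work): int
{
    $before = memory_get_peak_usage();
    $work();
    return memory_get_peak_usage() - $before;
}

$delta = measurePeakDelta(function (): void {
    // Stand-in for $parser->parse($sampleString); any allocation-heavy
    // workload illustrates the same before/after peak comparison.
    $chunks = str_split(str_repeat('[b]text[/b] ', 50000), 8);
});

printf("%.1f MB\n", $delta / 1048576);
```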

Also referenced in #110


Note on the changes: I reversed the order of the cases in the switch because when, for example, the marker group is matched, any subsequent capture groups (e.g. separator) won't be present in the matches array.
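A minimal illustration of the `preg_match` behavior behind that reordering; the group names here are made up for the example, not the parser's actual token regex:

```php
<?php
// When a trailing capture group does not participate in the match,
// preg_match omits it from $matches entirely (group names are
// illustrative, not the library's real token pattern).
preg_match('~(?<marker>\[)(?<separator>=)?~', '[', $matches);

var_dump(isset($matches['marker']));    // the matched group is present
var_dump(isset($matches['separator'])); // the trailing unmatched group is absent
```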

@thunderer thunderer force-pushed the memory-usage-optimization branch 3 times, most recently from 4e3a622 to 65f14e2 Compare June 3, 2024 18:14
@thunderer thunderer force-pushed the memory-usage-optimization branch from 65f14e2 to 08c2e6b Compare December 20, 2024 19:21
@thunderer thunderer merged commit 8d66350 into thunderer:master Dec 20, 2024
15 checks passed