Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON.dump: write directly into the provided IO #686

Merged
merged 4 commits into from
Nov 21, 2024
Merged

Conversation

byroot
Copy link
Member

@byroot byroot commented Nov 5, 2024

Ref: #524

Fix: #538
Fix: #524

Rather than to buffer everything in memory.

Unfortunately Ruby doesn't provide an API to write into and IO without first allocating a string, which is a bit wasteful.

FYI: @headius

@headius
Copy link
Contributor

headius commented Nov 20, 2024

Java side is implemented atop this PR in #538.

@headius
Copy link
Contributor

headius commented Nov 20, 2024

We need some benchmarks!

byroot and others added 3 commits November 21, 2024 09:33
Ref: ruby#524

Rather than to buffer everything in memory.

Unfortunately Ruby doesn't provide an API to write into
and IO without first allocating a string, which is a bit
wasteful.
When an IO is given, we should try to write directly to it. This
patch moves that direction by always doing JSON dumping into an
OutputStream, which can be implemented on top of a given IO or by
producing a ByteList via a ByteArrayOutputStream.
This connects up the OutputStream-based generator logic to the
incoming IO parameter (an IO or nil). Also included here are some
small changes to support the new state.rb:

* Require the state.rb file at the end of ext init.
* Move State#configure to _configure.
* Add State#strict? as an alias for strict.
@byroot
Copy link
Member Author

byroot commented Nov 21, 2024

I cherry picked you commits.

Now TruffleRuby head seem to crash for some reason, but I'm having trouble understanding Truffle crash reports, so I'm going to assume it's just a bug in Truffle. FYI: @eregon

Also the TruffleRuby::Generator is still buffering everything in a string before flushing, like other implementations used to.

@byroot byroot merged commit 871e13e into ruby:master Nov 21, 2024
34 of 36 checks passed
@byroot byroot deleted the io-flush branch November 21, 2024 08:51
@eregon
Copy link
Member

eregon commented Nov 21, 2024

Re the TruffleRuby crash, @andrykonchin can you look at that?

@headius
Copy link
Contributor

headius commented Nov 21, 2024

@byroot Thanks! I have some optimizations I'll do in a separate PR now.

@andrykonchin
Copy link
Member

@eregon The issue is fixed in oracle/truffleruby@567aa27.

@byroot
Copy link
Member Author

byroot commented Nov 22, 2024

Urk, I didn't realize rb_gc_mark_locations was using rb_gc_mark_maybe under the hood. I probably should use a loop instead.

@eregon
Copy link
Member

eregon commented Nov 22, 2024

Yes, I've never seen rb_gc_mark_locations() before, I think a simple loop would be better/safer and as fast or faster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Streaming enhancements for dumping
4 participants