
Non-single-logical-flow (multiple pulls) #30

Open
Fishrock123 opened this issue Jun 6, 2019 · 9 comments

Comments

@Fishrock123
Owner

Fishrock123 commented Jun 6, 2019

Moving out from #23 (comment)

It seems that newer network protocols like QUIC desire multiple chunks of data to be in-flight at once (besides considering re-sending).

This probably violates these two core design ideas:

  • One-to-one: The protocol assumes a one-to-one relationship between producer and consumer.
  • In-line errors and EOF: Errors, data, and EOF ("end") should flow through the same call path.

It may also unleash zalgo? lol.

Anyways, I think it is possible to still keep things simple and "pretend" that things are multiplexed, by doing slightly more waiting at the network sink end. I'm not really sure that perf would be considerably impacted in most cases?

Edit: See #30 (comment) for updated thoughts.

@jasnell
Collaborator

jasnell commented Jun 6, 2019

I don't actually think QUIC's design violates these ideas. The data flow is still one-to-one between producer and consumer, and in-line errors, data, and EOF are still in the same call path. The only difference that QUIC introduces is this idea that a chunk of data that I've already seen might need to be called again. At most, we may need to differentiate terminal states such that we separate There-is-no-more-data-to-give-you-go-away vs. There-is-no-additional-data-to-give-you-but-you-can-re-request-data-you-asked-for-before.
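The two terminal states described above could be sketched as an extra status value. A minimal sketch; the status names and the `onNext` consumer are illustrative assumptions, not part of BOB:

```javascript
// Illustrative only: sketching a split of the single terminal status
// into "fully done" vs. "done but re-requestable". Names are hypothetical.
const Status = Object.freeze({
  CONTINUE: 0, // more data may follow
  END: 1,      // there is no more data, ever: tear everything down
  DRAINED: 2,  // no *new* data, but already-seen chunks may be re-requested
});

// A consumer could branch on the two terminal states like so.
function onNext(status, error) {
  if (error) throw error;
  switch (status) {
    case Status.END:
      return 'closed';           // fully terminal: release resources
    case Status.DRAINED:
      return 'retransmit-ready'; // keep state to satisfy re-requests
    default:
      return 'streaming';
  }
}
```

The point of the split is that only `END` lets both sides drop state; `DRAINED` obliges the producer to keep enough state to answer re-requests.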

@Fishrock123
Owner Author

Ok so I just had a wild thought... what if, instead of having multiple pulls, implementations requiring this just use multiple streams?

Maybe this isn't easy if there's no limit to the multiplexing but hear me out...

The fact that currently only one request/response for data can be in flight in the stream at any one time is a big part of what reduces the need for almost any state, especially for error handling. This makes the whole thing a lot cheaper. So, if you can, say, share a file descriptor... you could open N number of streams to it, which would be able to do the work similarly concurrently while being much simpler logically, and without much extra overhead. From my past musings, I am quite certain that making split/join transforms would also be pretty easy to get logically correct, which would allow such a system to talk to a single stream endpoint if necessary. Additionally... maybe that kind of thing could be threaded more easily in C++?

Idk, lmk what you think. I can prototype out a split/join.

@jasnell
Collaborator

jasnell commented Jun 7, 2019

Hmm that could work. Let me stew on it.

@Fishrock123
Owner Author

Fishrock123 commented Jun 11, 2019

Should receiving additional pull()s and/or next()s from components to which an error has not yet bubbled... matter?

i.e. component has errored, error passed along, but data is still flowing around at the same time. It seems to me that preventing post-mortem flow would require a decent amount of extra work.

How does QUIC handle this situation?

@Fishrock123
Owner Author

Fishrock123 commented Jul 22, 2019

So my thoughts have returned to this, mostly from two places:


  1. @mcollina's request for multi-buffer support

I can't really think of a pleasant way to fit multi-buffer support into the existing API; it never really makes sense to me. Within the whole model, multiple buffers, I think, should be separate responses.

However, adding multi-pull support could alleviate this, possibly? If the sink can accept multiple chunks, it could pull multiple times and then get responses accordingly?

(One note on that: we may have to add an additional status.ended to deal with pulls that arrive after end? Extra complexity.)
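One possible shape for a source that answers late pulls with a distinct ended status. The status strings and callback signature below are assumptions for illustration, not the actual spec:

```javascript
// Hypothetical source that distinguishes the first pull past the data
// ('end') from pulls arriving after that ('ended').
function multiPullSource(chunks) {
  let i = 0;
  return {
    pull(cb) {
      if (i < chunks.length) {
        cb(null, 'continue', chunks[i++]); // normal data response
      } else if (i === chunks.length) {
        i++;                               // first pull past the data
        cb(null, 'end', null);
      } else {
        cb(null, 'ended', null);           // late pull: already over
      }
    },
  };
}
```

With multiple pulls outstanding, a sink that pulled N times near the end would see exactly one 'end' and 'ended' for the rest, rather than having to guess which responses are redundant.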


  2. My experience writing crc-transform

I was comparing against my built-in crc32 CLI command (which runs Tcl and Perl in some combination, talking to zlib's native C CRC calculator) and wasn't happy with the numbers.

Getting it "close" to native numbers was much harder than expected, and the best I could do still took 50% longer (and much more CPU). While thinking on potential optimizations I realized that it would be ideal to be making another async filesystem request at the same time as you are currently processing one, necessitating something like multiple pulls via setImmediate()s. It could be tricky to 'get right' but the payoff could be pretty big? Again, more complexity...
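The overlap idea can be sketched as a double-buffered loop: the next read is issued before the current chunk is processed. `readChunk` and `processChunk` are hypothetical stand-ins (e.g. an fs read and a CRC update), not real APIs:

```javascript
// Double-buffered sketch: keep one async read in flight while the
// previously returned chunk is processed on the CPU. readChunk()
// returns a promise for the next chunk, or null at end-of-stream.
function overlappedLoop(readChunk, processChunk, done) {
  (async () => {
    let pending = readChunk(); // first read in flight
    let acc = 0;
    for (;;) {
      const chunk = await pending;
      if (chunk === null) return done(acc);
      pending = readChunk();          // next I/O starts now...
      acc = processChunk(acc, chunk); // ...while this chunk is processed
    }
  })();
}
```

The win comes from the I/O latency of chunk N+1 being hidden behind the CPU time spent on chunk N, which is exactly what a strict one-pull-at-a-time protocol can't express.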

@Raynos
Collaborator

Raynos commented Aug 6, 2019

My understanding of the sink API is that it is not very friendly for writing to: it needs a source, and its backpressure mechanism is to pull() when it wants more data.

For multiple-buffer support you can implement a nextv() API which just writes multiple buffers to the sink.
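A rough guess at what a nextv()-style vectored sink could look like, modeled on a `next(error, status, buffer)` shape. Nothing here is the actual proposed API:

```javascript
// Hypothetical vectored sink: several buffers delivered in one call
// instead of one buffer per next().
function makeVectoredSink() {
  const written = [];
  let ended = false;
  return {
    written,
    get ended() { return ended; },
    nextv(error, status, buffers) {
      if (error) throw error;
      for (const buf of buffers) written.push(buf); // all chunks, one call
      if (status === 'end') ended = true;
    },
  };
}
```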

One piece of contention in the current design of the sink API is who allocates the buffer. If I have data already in buffers that needs to be written to a socket or disk, it doesn't make sense for the sink to allocate a write buffer and tell me to copy values into it.

@Raynos
Collaborator

Raynos commented Aug 6, 2019

> Getting it "close" to native numbers was much harder than expected and the best I could do still took 50% longer (and much more CPU). While thinking on potential optimizations I realized that it would be ideal to be making another async filesystem request at the same time as you are currently processing one, necessitating something like multiple pulls via setImmediate()s. It could be tricky to 'get right' but the payoff could be pretty big? Again, more complexity...

I think it's fine for the implementation of a sink to pre-emptively call pull() immediately once next() is called and then start processing CPU-bound stuff.

It will need a boolean field to guard against re-entry if pull() calls next() synchronously, and it will need a queue of pending buffers to process, etc.

Actually, it needs to keep a counter of pending pull operations in case pull() calls next() synchronously, which calls pull() synchronously, etc., causing the entire source to be pulled before any processing happens at all.
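The pending-pull counter could look roughly like this; the status strings and callback shapes are assumptions. The counter turns synchronous recursion into iterations of a loop, so a fully synchronous source gets pulled and processed chunk by chunk instead of being drained in one recursive burst:

```javascript
// Re-entrancy guard sketch: count pending pull()s so a source that
// answers next() synchronously cannot recurse through pull() and drain
// itself before any processing happens.
function makeEagerSink(source, process) {
  const queue = [];
  let pendingPulls = 0;
  let pulling = false;

  function pull() {
    pendingPulls++;
    if (pulling) return; // already inside pull(): just record the request
    pulling = true;
    while (pendingPulls > 0) {
      pendingPulls--;
      source.pull((err, status, chunk) => {
        if (err) throw err;
        if (status !== 'end') {
          queue.push(chunk);
          pull(); // pre-emptively ask for more before processing
        }
      });
      // Process queued chunks between pulls instead of recursing.
      while (queue.length > 0) process(queue.shift());
    }
    pulling = false;
  }

  return { start: pull };
}
```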

@Fishrock123 Fishrock123 changed the title Non-single-logical-flow (multiple pull requests) Non-single-logical-flow (multiple pulls) Sep 23, 2019
@dominictarr

I tried allowing multiple calls in pull-stream, but decided it added too much complexity. Is this something that needs to be supported along the entire pipeline, or something that can just be an internal detail of the QUIC implementation? On the write side it's easy: just accept multiple writes at a time. On the read side, since BOB has the reader pass in the buffer, it's not gonna work. (My gut feeling is that that's too complicated anyway; an object pool for buffers would have the same advantages but would decouple it from streams, allowing object streams, which make streams more useful.)
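The buffer-pool alternative mentioned above could be as small as this; a sketch of the general technique, not a concrete proposal from this thread:

```javascript
// Tiny buffer pool: reuse allocations without the stream protocol
// dictating who owns the buffer, so streams can carry plain objects.
class BufferPool {
  constructor(size) {
    this.size = size;
    this.free = [];
  }
  acquire() {
    // Reuse a released buffer when possible, allocate otherwise.
    return this.free.pop() || Buffer.allocUnsafe(this.size);
  }
  release(buf) {
    if (buf.length === this.size) this.free.push(buf); // same-size only
  }
}
```

The decoupling point is that allocation reuse lives entirely outside the stream API; producers acquire, consumers release, and the protocol itself never has to say who owns a buffer.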

@jasnell
Collaborator

jasnell commented Dec 12, 2019

The more I think about it the more I think we won't need the multiple reads for quic. So I think we can completely avoid the complexity in that case. The basic protocol just works.
