Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Summit Topic: Node.js Streams promise support #216

Closed
1 of 3 tasks
mcollina opened this issue Dec 1, 2019 · 39 comments
Closed
1 of 3 tasks

Summit Topic: Node.js Streams promise support #216

mcollina opened this issue Dec 1, 2019 · 39 comments
Assignees
Labels
Collaborator Summit Montreal 2019 Session Proposal A session proposal for the Collaboration Summit

Comments

@mcollina
Copy link
Member

mcollina commented Dec 1, 2019

Topic of the session

We have added async iterators to stream a quite successfully! It's time to think of what's missing.

Type of the session

  • Collaborate
  • Workshop
  • Talk

Follow-up / Set-up sessions (if any)

List the sessions that are related.

Level

  • [] Beginner
  • [] Intermediate
  • [] Advanced

Pre-requisite knowledge

List pre-requisite knowledge that it would be required for participants to have.

Describe the session

Session facilitator(s) and Github handle(s)

Additional context (optional)

@Fishrock123
Copy link

I’d wager this is probably “intermediate” and should list my session as a follow-up

@evahowe
Copy link
Contributor

evahowe commented Dec 5, 2019

@mcollina How is this different that the other session on Streams?

@WaleedAshraf
Copy link
Member

@evahowe If you check the current draft here
I have listed them under different titles.

@evahowe
Copy link
Contributor

evahowe commented Dec 6, 2019

@WaleedAshraf - sounds good
@Fishrock123 - made note
@mcollina - How much time?

@mcollina
Copy link
Member Author

mcollina commented Dec 6, 2019

From 30 minutes to 1 hour

@jonchurch
Copy link

jonchurch commented Dec 13, 2019

Promisify writable.write()

Key Constraints:

highWaterMark support

“Proper” Error handling

for await (let chunk of stream) {
  await ???
}

@benjamingr
Copy link

benjamingr commented Dec 13, 2019

Very rough idea:

No promises writeable.write, instead we use duality and do the async iterator API.

Async iterators do:

{
next(value): Promise<nextValueResult>
return(value): Promise<nextIfNoYieldOrFinallyOrLastValueResult>
throw(err): Promise<nextValueResult>
}

You can make a direct dual interface (moving parts around):

{
  next(nextValueResult): Promise<value>
  return(nextIfNoYieldOrFinallyOrLastValueResult): Promise<value>
  throw(nextValueResult): MaybeRejectedPromsie<err>
}

The first thing you will notice is that async iterators as an interface is already a dual interface. We haven't actually changed anything here in orxcer to get the API - since async iterators are bidiractional and kinda-dual already.

That is, if you had an API that did:

writeable.next(value) that returns a promise for the value and have that promise resolve fulfill when the value is ready that would be a nice API - you would need to support encoding and controlling whether or not it needs to wait for the highWatermark (since you have no drain). I would recommend always waiting for the highWatermark and if someone doesn't want that they can just set it to zero.

Error handling would work like in async iterator.

@benjamingr
Copy link

benjamingr commented Dec 13, 2019

What do we put in HTTP.promises? nothing.

Another idea is service workers - but that is not really a server HTTP API.

I would recommend we just promisify the building blocks of HTTP (streams and event emitters).

For the client: probably fetch once we resolve eventemitter vs eventtarget etc.

@davidmarkclements
Copy link

davidmarkclements commented Dec 13, 2019

proof of concept:

Writable.prototype.write[promisify.custom] = function writeProm () {
  return (chunk) => new Promise((resolve, reject) => {
    let isError = false
    const handleErr = (err) => {
      reject(err)
      isError = true
    }
    this.once('error', handleErr)
    const drain = this.write(chunk) === false
    if (drain) {
      this.once('drain', () => {
        this.removeEventListener(handleErr)
        resolve()
      })
      return
    }
    waitForTheRightMoment(() => {
      this.removeEventListener(handleErr)
      resolve()
    })
  })
}

example usage:

const { promisify } = require('util')
const write = promisify(MyWritableStream)
async function run () {
  for await (const item of iterable) {
    await write(item)
  }
}

run.catch((err) => console.error(err))

@ggoodman
Copy link

ggoodman commented Dec 13, 2019

Using a WriteController to hold state for any number for chunks being written to the sink:

const writeController = new WriteController();

for await (const chunk of source) {
  const buffered = sink.write(chunk, { writeController });

  await somethingRisky();

  // Produces a promise that represents the successful flushing of all previous
  // chunks written to the sink and associated with the `WriteController`.
  await writeController.next();
}

If you do not want to block on each chunk but only are interested in knowing if all chunks have been collectively flushed:

const writeController = new WriteController();

for await (const chunk of source) {
  const buffered = sink.write(chunk, { writeController });
  // Maybe interested in knowing about highWaterMark
  await somethingRisky();
}

// Since we're outside the `for await` loop, we can also easily defer waiting for flushing
// here.
await writeController.next();

The idea here could also live totally outside of Writable as well:

const writeController = new WriteController();

for await (const chunk of source) {
  const whouldWait = writeController.write(sink, chunk);

  if (shouldWait) {
    await writeController.draining();
  }
}

// I want to make sure all my chunks have flushed
await writeController.flushed();

@Ethan-Arrowood
Copy link

Ethan-Arrowood commented Dec 13, 2019

Might be incorrect, but this could be what http.promises.get with the response http.IncomingMessage stream could look like:

const http = require('http')
const { promisify } = require('util')

http.promises.get = promisify(http.get)

(async () => {
  try {
    const res = await http.promises.get('localhost:3000')
    for await (let chunk of res) {
      await ...
    }
  } catch (err) {
    console.log(err)
    process.exit(1)
  }
})()

@developit
Copy link
Contributor

Potential ClientRequest.response async getter that essentially returns what the callback is currently given:

const req = request('https://example.com');
const res = await req.response;
for await (let chunk of res.body) {
  console.log(chunk);
}

@benjamingr
Copy link

benjamingr commented Dec 13, 2019

@Ethan-Arrowood fwiw that's basically the fetch API:

import { fetch } from 'http';
const res = await fetch('http://localhost:3000'); // we can use top-level await
for await (let chunk of res) {
  await ...
}

@developit
Copy link
Contributor

Easy one: promisify Server.listen():

const server = http.createServer(() => {   });
const address = await server.listen(0);

@jasnell
Copy link
Contributor

jasnell commented Dec 13, 2019

var chunks = [a,b,c]

for (chunk : chunks) {
  // Returns a promise that resolves when we can write the next chunk (equivalent to true/false return)
  const c = w.write(chunk, () => {})
  // Resolves when writing the chunk has been completed (equivalent to write callback)
  c.done.then(() => {})
  await c;
}

@Fishrock123
Copy link

Using thenables for Writable.write():

async function run () {
  for await (const item of iterable) {
    const queued = w.write(item)
    // await queued.then()
    // await queued.flushed.then()
  }
}

run.catch((err) => console.error(err))

For errors, I think that an error on the chunk should be called to the queued first, and then emitted.

@Ethan-Arrowood
Copy link

Ethan-Arrowood commented Dec 13, 2019

@Ethan-Arrowood fwiw that's basically the fetch API:

this is what i was thinking too. But also .get returns a ClientRequest so maybe that needs to be incorporated in the resolve

@a0viedo
Copy link

a0viedo commented Dec 13, 2019

A few constraints I can think of for promisifying http.request

In the case of a premature connection close before the response is received, the following events will be emitted in the following order:

'socket'
'error' with an error with message 'Error: socket hang up' and code 'ECONNRESET'
'close'

I think the sequence of events is very similar for successful requests and those that errored and if you were to offer a Promise API then you would be able to return control to the caller just once, either with a rejection or resolved Promise.

Issue 1

If you were to reject a promise returned by http.request when the equivalent event emitter for the 'error' event (or "close") then you wouldn't be offering the same parity and missing the 'socket' event.

Issue 2

In the case of a premature connection close after the response is received then the callback API still offers the result of the request through the 'response' event but even if you would have the data to be able to include in the reject it is considered an anti-pattern to include such information in a promise rejection.

@Fishrock123
Copy link

For http.promises, the following code is always a trap:

for await (const connection of http.createServer()) {
   // ...
}

Anything in this block that uses the natural async function flow (via await) will always block the next connection. The only way to avoid this is by creating a callback off of some async primitive.

@developit
Copy link
Contributor

const server = http.createServer((req, res) => {
  let chunks = [];
  for await (let data of req.data) {
    chunks.push(data)
  }
  // or if you only care about the full body, not chunks
  await req.dataEnd();
})

Plus similar async iterator methods for any .on methods that fire repeatedly, or Promise-returning methods for any .on methods that fire once.

@BridgeAR
Copy link

Is it possible to stream this session by any chance? (As in open a zoom session)

@benjamingr
Copy link

@Fishrock123 why would they have to be thenables and not regular promises.

@BridgeAR it recently ended I believe.

@davidmarkclements
Copy link

For errors, I think that an error on the chunk should be called to the queued first, and then emitted.

yeah agreed, I've updated #216 (comment) with "waitForTheRightMoment" to symbolize the order of events. We can use internals to establish the "right moment" and adapt that to a more internally coupled implementation

@BridgeAR
Copy link

@benjamingr yes, it was in parallel with the releases one, so I could not join earlier but I thought maybe they where still ongoing. Thanks though.

@benjamingr
Copy link

Sure thing see you in March

@Fishrock123
Copy link

@benjamingr to avoid unhandled promises when people do not actually need to wait for one or the other.

Other options:

  • Regular promises on a "queued" object
  • Events on a "queued" object

@evahowe
Copy link
Contributor

evahowe commented Dec 13, 2019 via email

@benjamingr
Copy link

@evahowe note that this session has ended a while ago and the next one is streaming :]

@apapirovski
Copy link

apapirovski commented Dec 13, 2019

I feel like something like this works reasonably well, whether you care about errors or not:

Writable.prototype.write[promisify.custom] = () =>
  function promisifiedWrite(chunk, encoding) {
    let syncErr;
    const needDrain = !this.write(chunk, encoding, (err) => {
      syncErr = err
    });

    const stream = this;
    return {
      then() {
        if (syncErr) {
          throw syncErr;
        }

        if (needDrain) {
          return new Promise((resolve, reject) => {
            stream.once('error', reject);
            stream.once('drain', () => {
              stream.removeEventListener(reject);
              resolve();
            });
          });
        }
      }
    };
  };

Wouldn't leave it as a promisified function though, ideally it's first-class.

@benjamingr
Copy link

benjamingr commented Dec 13, 2019

apapirovski did you mean to make that then a getter? If so it needs to return the then of tthe internal promise (and return a rejected promise and not throw an error). If not then the signature of then is then(onFulfilled, onRejected) and both of them perform recursive assimilation. Any reason this can't be?

Writable.prototype.write[promisify.custom] = () => function (chunk, encoding) {
   let syncErr;
   const needDrain = !this.write(chunk, encoding, (err) => {
     syncErr = err
   });
   if (syncErr) return Promise.reject(err); // does this always happen synchronously?
   if (!needDrain) return Promise.resolve(); 
   return EventEmitter.once(this, 'drain'); // once handles errors
};

@apapirovski
Copy link

apapirovski commented Dec 13, 2019

@benjamingr nope, meant that to be as is...

The implementation you've posted will reject a promise regardless. Avoiding that was an explicit design decision.

But also to be fair the throw is kinda sketchy. I don't think that might even work correctly lol. (Although seems to in my testing...) Either way, that line could be swapped for return Promise.reject(syncErr)

Anyway, sounds like you're suggesting something more like this...

Writable.prototype.write[promisify.custom] = () =>
  function promisifiedWrite(chunk, encoding) {
    let syncErr;
    const needDrain = !this.write(chunk, encoding, (err) => {
      syncErr = err
    });

    const stream = this;
    return {
      get then() {
        if (syncErr) {
          return Promise.reject(syncErr);
        }

        if (needDrain) {
          return new Promise((resolve, reject) => {
            stream.once('error', reject);
            stream.once('drain', () => {
              stream.removeEventListener(reject);
              resolve();
            });
          });
        }
      }
    };
  };

Similar in behavior but the downside is that inspecting the returned object in any way can cause the promises to be created.

Either way, my proposal technically can leak memory if someone awaits the promise too late and it required draining... Which I guess brings you to something more like this:

Writable.prototype.write[promisify.custom] = () =>
  function promisifiedWrite(chunk, encoding) {
    let syncErr;
    const needDrain = !this.write(chunk, encoding, (err) => {
      syncErr = err
    });

    let drainFn;
    if (needDrain) {
      stream.once('drain', () => {
        needDrain = false;
        if (drainFn) {
          drainFn();
        }
      });
    }

    const stream = this;
    return {
      then() {
        if (syncErr) {
          return Promise.reject(syncErr);
        }

        if (needDrain) {
          return new Promise((resolve, reject) => {
            stream.once('error', reject);
            drainFn = () => {
              stream.removeEventListener(reject);
              resolve();
            };
          });
        }
      }
    };
  };

@benjamingr
Copy link

The Promises/A+ spec (and thus the ECMAScript spec) explicitly requires calling then as a getter exactly once when chaining a promise though I see a few issues (like adding a listener on each call and also - probably return the then of the returned promises bound to them rather than the promise itself (since then is a function and not a promise).

I think lazy promises are mostly confusing (since promises are not actions) - and I agree that if we explore it we should not execute the action on the then getter but on invocation.


If what we want is to not create a rejection unless someone is listening and to not listen to certain events until someone is listening (which is arguably confusing) - maybe something like:

const lazy = fn => class extends Promise { // so we get catch, finally etc that all call `then`
  #promise = null;
  then(onFulfilled, onRejected) {
    this.#promise = this.#promise || fn();
    return this.#promise.then(onFulfilled, onRejected);
  }
};

Writable.prototype.write[promisify.custom] = () => function (chunk, encoding) {
   let syncErr;
   const needDrain = !this.write(chunk, encoding, (err) => {
     syncErr = err
   });
   if (syncErr) {
     let writeError = Promise.reject(err); 
     writeError.catch(() => {}); // don't care about this rejection no need to track it 
     return  writeError;
   }
   if (!needDrain) return Promise.resolve();  // no effects
   return lazy(() => EventEmitter.once(this, 'drain'));
};

@apapirovski
Copy link

That's much neater haha

which is arguably confusing

Yeah, I was just taking off from where Jeremiah left it. I can see why some people would want to have the option to not get unhandled rejections in certain scenarios when using the promisified API.

(Also need to handle the error event which I forgot to do, but yeah...)

@trivikr
Copy link
Contributor

trivikr commented Dec 14, 2019

Tweet thread of the session https://twitter.com/trivikram/status/1205512468954521600

@christian-bromann
Copy link
Member

@mcollina are there any notes or other artifacts for this session? If so can you share them here? I am happy to make a PR and add them to the summit directory or feel free to raise one by yourself. Thanks!

@mcollina
Copy link
Member Author

mcollina commented Jan 6, 2020

Most of the discussion is captured in this issue. I plan to open a couple of issues on Node in the coming weeks to keep the discussion going.

@christian-bromann
Copy link
Member

Ok, gonna close this issue then.

@ronag
Copy link

ronag commented Jan 15, 2020

This is my take:

Writable.prototype.write[promisify.custom] = () => function (chunk, encoding) {
  return new Promise((resolve, reject) => {
    let complete = false;
    const needDrain = !this.write(chunk, encoding, (err) => {
      complete = true;
      if (err) {
        reject(err)
      } else {
        resolve();
      }
      this.off('error', reject);
    });
    if (complete) {
      // Do nothing
    } else if (needDrain) {
      this.on('error', reject);
    } else {
      resolve();
    }
  });
};

Of course all of these has the downside of:

let bytesSuccessfullyWritten = 0;
await w.write(buf);
bytesSuccessfullyWritten += buf.length; // BUG
// Coming here does NOT mean that the above write was successful

I think an API like this would make more semantic sense:

let bytesSuccessfullyWritten = 0;
let bytesBuffered = 0;
for await (const chunks of source) {
  bufferedBytes += chunk.length;
  if (!w.write(chunk)) {
    await w.flush();
    bytesSuccessfullyWritten += bytesBuffered;
    bytesBuffered = 0;
  }
}
await w.end();

Which would avoid confusion caused by the natural assumption that a successfully resolved promise from write means that the write succeeded.

@ggoodman
Copy link

ggoodman commented Jan 15, 2020

I think the main problem with any variation of Writable.prototype.write[promisify.custom] is that this means that each call to write introduces a Promise even when one isn't really necessary. Further, if this api is designed to return a Promise, it must systematically be awaited even when such waiting is unnecessary. I think the performance impact of such an API might make it a non-starter.

I was thinking that some of the challenges around tracking which writes have been flushed and which haven't. Imagine scenarios in which there are multiple logical bits of logic 'concurrently' writing to the same Writable. In that world, you might be more interested in whether 'your' chunks have been flushed or not and less so in the total, stateful 'flush state' of the `Writable.

In that scenario, I imagine a Promise-friendly Writer object whose role is to effect writes to a Writable and track their flush state. This means the Writer can be a fully Promise-friendly api and no changes need to be made to the Writable interface.


Hypothetical interface

interface Writer {
  /**
   * Create a Writer for a given Writable stream.
   * 
   * _Note: No absolute need for this to be a class-style api. It could
   * easily be `require('stream').createWriter(writable)`._
   */
  new(writable: Stream.Writable): Writer;

  /**
   * Wait for the stream to be drained.
   * 
   * Moral equivalent to `require('events').once(this.writable, 'drain')`.
   */
  drained(): Promise<void>;

  /**
   * Wait for all chunks written via this `Writer` to be flushed.
   */
  flushed(): Promise<void>;

  /**
   * Write a chunk of data to the stream and produce a `WriteResult`.
   * 
   * @param chunk The chunk of data to be written
   * @param encoding Optional chunk encoding
   */
  write(chunk: string | Buffer, encoding?: string): WriteResult;
}

interface WriteResult {
  /**
   * Return a Promise for when this chunk of data is flushed
   * 
   * _Note: Could also be a getter. Important bit is that a promise is
   * only produced when this api is *used*._
   */
  flushed(): Promise<void>;

  /**
   * Indication of whether the consumer should block on the `Promise` returned by `Writer#drained`
   */
  shouldWaitForDrain: boolean;
}

Usage example:

const writer = new Writer(sink);

for await (const chunk of source) {
  const result = writer.write(chunk);

  if (result.shouldWaitForDrain) {
    await writer.drained();
  }

  // If I was interested in making sure each individual chunk was
  // flushed, I could uncomment the following:
  //
  // await result.flushed();
}

// Now that all chunks from source have been written to sink, we want to
// make sure that all those chunks have actually been flushed.
// We're not interested in each specific chunk's flush state but instead
// that _ALL_ chunks have been flushed here so we're not looking at
// `WriteResult#flushed`.
await writer.flushed();

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Collaborator Summit Montreal 2019 Session Proposal A session proposal for the Collaboration Summit
Projects
None yet
Development

No branches or pull requests