
stream.unshift - TypeError: Argument must be a buffer #27192

Closed
marcosc90 opened this issue Apr 11, 2019 · 5 comments · Fixed by #27194
Labels: stream (Issues and PRs related to the stream subsystem)

Comments

marcosc90 (Contributor) commented Apr 11, 2019

  • Version: v10.15.3
  • Platform: Ubuntu 18.04
  • Subsystem: stream

The stream.unshift documentation states:

chunk <Buffer> | <Uint8Array> | <string> | <any> Chunk of data to unshift onto the read queue. For streams not operating in object mode, chunk must be a string, Buffer or Uint8Array. For object mode streams, chunk may be any JavaScript value other than null.

The issue occurs when the stream's encoding is neither passed as an option nor set via setEncoding. In that case state.decoder is not set, and the fromList function may use state.buffer.concat (when state.buffer.length > 1), which internally uses copyBuffer, which requires both the source and the target to be a Buffer or Uint8Array.

When stream.unshift is called with a string (which the documentation states is a valid argument) and the internal buffer holds more than one chunk, state.buffer.concat is triggered and an error is thrown:

buffer.js:636
    return _copy(this, target, targetStart, sourceStart, sourceEnd);
           ^

TypeError: argument must be a buffer
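For context, a minimal sketch of the failing sequence (my own reduction, assuming a paused stream with no encoding set, on v10.15.3; the full repro follows below):

const { Readable } = require('stream');

// Minimal sketch (not the original repro): no encoding is set,
// so state.decoder is never created.
const r = new Readable({ read() {} });
r.push(Buffer.from('a')); // one Buffer chunk in the internal list
r.unshift('b');           // the string is stored unconverted

// With more than one buffered chunk, read() goes through
// state.buffer.concat(), which chokes on the string entry.
r.read(); // TypeError: argument must be a buffer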

Here's the script to reproduce the error. It's the parseHeader example from the documentation, but instead of converting remaining to a Buffer, I pass it directly to .unshift:

const { Readable } = require('stream');

class ArrayReader extends Readable {
  constructor(opt) {
    super(opt);
    const numbers = new Array(16384 * 2).fill(0).map(String);
    this.buffer = ['header', '\n\n', ...numbers];
  }
  _read(size) {

    while (this.buffer.length) {
      const chunk = this.buffer.shift();
      if (!this.buffer.length) {
        this.push(chunk);
        this.push(null);
        return true;
      }
      if (!this.push(chunk))
        return;
    }
  }
}

// Pull off a header delimited by \n\n
// use unshift() if we get too much
// Call the callback with (error, header, stream)
const { StringDecoder } = require('string_decoder');
function parseHeader(stream, callback) {
  stream.on('error', callback);
  stream.on('readable', onReadable);
  const decoder = new StringDecoder('utf8');
  let header = '';
  function onReadable() {
    let chunk;
    while (null !== (chunk = stream.read())) {
      const str = decoder.write(chunk);
      if (str.match(/\n\n/)) {
        const split = str.split(/\n\n/);
        header += split.shift();
        const remaining = split.join('\n\n');
        // const buf = Buffer.from(remaining, 'utf8');
        stream.removeListener('error', callback);
        stream.removeListener('readable', onReadable);
        if (remaining.length) 
          stream.unshift(remaining);

        // Now the body of the message can be read from the stream.
        callback(null, header, stream);
      } else {
        // still reading the header.
        header += str;
      }
    }
  }
}
parseHeader(new ArrayReader(), (err, header, stream) => {
  stream.once('data', chunk => {
    // console.log(stream._readableState.buffer);
  });
});

This can be fixed either by converting the chunk to a Buffer, similar to how it's done in .push (perhaps adding an encoding argument too), or by changing the documentation to state that a string can only be passed if the stream encoding is set.

Readable.prototype.unshift = function(chunk, encoding) {

  const state = this._readableState;
  let skipChunkCheck = false;
  // To avoid repetition between `.push` and `.unshift`, this could be checked inside readableAddChunk.
  if (!state.objectMode) {
    if (typeof chunk === 'string') {
      encoding = encoding || state.defaultEncoding;
      if (encoding !== state.encoding) {
        chunk = Buffer.from(chunk, encoding);
        encoding = '';
      }
      skipChunkCheck = true;
    }
  } else {
    skipChunkCheck = true;
  }

  return readableAddChunk(this, chunk, encoding, true, skipChunkCheck);
};
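With the proposed signature, the caller can state the string's encoding explicitly. A hypothetical usage sketch, reusing remaining from the parseHeader repro above:

// Hypothetical usage of the proposed second argument: the string is
// converted with the given encoding before being placed back on the
// read queue, instead of being coerced to utf8.
if (remaining.length)
  stream.unshift(remaining, 'hex');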
addaleax (Member) commented:

@marcosc90 Would you be interested in opening a PR, since it seems that you already have a solution available?

addaleax added the stream label Apr 11, 2019
marcosc90 (Contributor, Author) commented:

Yes, I have the solution ready, I can submit the PR in a couple of hours once I finish working.

Just to confirm: a new argument must be added to unshift; should the documentation change go in the same commit?

addaleax (Member) commented:

@marcosc90 I’m not sure about the new argument – if the expectation is that the data was previously read from the stream, the encoding should match the encoding used for reading data from the stream (i.e. state.encoding, rather than state.defaultEncoding – the naming clash is unfortunate here).

marcosc90 (Contributor, Author) commented Apr 11, 2019

state.encoding will be null in all cases where this bug occurs. I can do Buffer.from(chunk, state.encoding), but with a null encoding that always defaults to utf8. So unless you set the encoding beforehand (in which case the error does not occur), a string passed to .unshift will always be treated as utf8.

Is this the behaviour you want? If .unshift needs to support strings, I believe an encoding argument is warranted.
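To illustrate the fallback (a minimal sketch; the byte values are just an example):

// With state.encoding === null, Buffer.from falls back to utf8:
Buffer.from('00');        // <Buffer 30 30> (the characters '0', '0')
Buffer.from('00', 'hex'); // <Buffer 00>    (the byte the caller meant)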

In any case, I can submit the PR with whatever functionality you prefer.

I'm working on a few snippets to see whether not adding encoding is going to be problematic. (I know that in most cases when working with strings, utf8 will be the encoding.)

marcosc90 (Contributor, Author) commented Apr 11, 2019

Take the following example, using hex as the encoding instead of utf8:

// (Uses the same ArrayReader and StringDecoder require as the first script.)
function parseHeader(stream, callback) {
  stream.on('error', callback);
  stream.on('readable', onReadable);
  const decoder = new StringDecoder('hex');
  let header = '';
  function onReadable() {
    let chunk;
    while (null !== (chunk = stream.read())) {
      const str = decoder.write(chunk);
      if (str.match(/\n\n/)) {
        const split = str.split(/0a0a/);
        header += split.shift();
        const remaining = split.join('0a0a');
        stream.removeListener('error', callback);
        stream.removeListener('readable', onReadable);
        if (remaining.length) 
          stream.unshift(remaining);

        // Now the body of the message can be read from the stream.
        callback(null, header, stream);
      } else {
        // still reading the header.
        header += str;
      }
    }
  }
}
parseHeader(new ArrayReader(), (err, header, stream) => {
  stream.once('data', chunk => {
     console.log(chunk.toString('utf8'))
  });
});

That will output: 3030303030303030303030303030303030303030

Instead of: 00000000000000000000, since the unshifted chunk is interpreted with the wrong encoding and there is no way to set it.

With the fix I'm proposing, stream.unshift(remaining, 'hex') gives the correct output: 00000000000000000000

So right now, every unshifted string is coerced to utf8.

targos pushed a commit that referenced this issue Jun 3, 2019
`readable.unshift` can take a string as an argument, but that
string wasn't being converted to a Buffer, which caused a
<TypeError: Argument must be a buffer> in some cases. Also if a
string was passed, that string was coerced to utf8 encoding.

A second optional argument `encoding` was added to `unshift` to
fix the encoding issue.

Fixes: #27192
PR-URL: #27194
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Rich Trott <[email protected]>