Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crypto: fix default encoding of LazyTransform #8611

Closed
wants to merge 2 commits into from

Conversation

lmoe
Copy link

@lmoe lmoe commented Sep 17, 2016

Checklist
  • make -j4 test (UNIX), or vcbuild test nosign (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
Affected core subsystem(s)

crypto

Description of change

Change the default encoding from latin1 to utf8
and extend the test-crypto tests.

@@ -53,7 +53,6 @@ assert.throws(function() {
crypto.createHash('sha1').update({foo: 'bar'});
}, /buffer/);


Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Superfluous line removal

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address this.

Copy link
Member

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good so far, thanks for doing this!

/cc @indutny @mscdex ?

});

hash.on('end', () => {
assert.equal(hashValue, assertionHash);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use assert.strictEqual instead of assert.equal here, and maybe wrap the end listener in common.mustCall()?

const assertionHashUtf8 =
'4f53d15bee524f082380e6d7247cc541e7cb0d10c64efdcc935ceeb1e7ea345c';

// Hash of "öäü" in ascii format
Copy link
Member

@addaleax addaleax Sep 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no umlauts in ASCII, but the hash below is valid for latin-1 (aka ISO-8859-1), so I’d recommend naming everything as latin1 here:

$ echo -n öäü | iconv -f utf8 -t iso-8859-1 | sha256sum 
cd37bccd5786e2e76d9b18c871e919e6eb11cc12d868f5ae41c40ccff8e44830  -

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have trouble figuring out your message. What I would do is renaming the ascii part to latin1 but I'm not sure if that's your point. You have marked the Utf8 part and the comment of the ascii format which kind of makes it hard to understand.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmoe yeah, I think you got what I was trying to say. That you see the Utf8 part too is just github’s UI, usually the PR comments just apply to single lines (so, yeah, I’m talking about the part which is currently labelled ascii)

@addaleax addaleax added crypto Issues and PRs related to the crypto subsystem. semver-major PRs that contain breaking changes and should be released in the next major version. labels Sep 17, 2016
@addaleax
Copy link
Member

Also, I’ve tentatively labelled this as semver-major. It aligns the behaviour with our docs, so whether this is actually semver-major is up for debate.

@Fishrock123
Copy link
Contributor

fix default encoding of LazyTransform

I'm not clear on what this fixes? Is there an issue? If not could we please get a description in the commit message? :D

@addaleax
Copy link
Member

I'm not clear on what this fixes? Is there an issue? If not could we please get a description in the commit message? :D

Not really, I just noticed this while going through or code base. There’s #5522 and #5500, which changed the default encoding for strings that are passed to crypto functions to utf8, but only did so for the non-streaming API… that makes .update() and .write() behave weirdly incongruently. I assumed that to be an oversight, and if I understood @indutny correctly today, he agrees.

A bit on that would be cool for the commit message, yes.

Copy link
Member

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


if (!this._options || !this._options.defaultEncoding) {
this._writableState.defaultEncoding = crypto.DEFAULT_ENCODING;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The this._options.defaultEncoding is not used yet :(

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to be working for me though.
I will look into this near the weekend.

Copy link
Member

@jasnell jasnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with green CI.

@addaleax
Copy link
Member

@jasnell
Copy link
Member

jasnell commented Oct 6, 2016

@nodejs/crypto ... any objections on this?

Copy link
Member

@indutny indutny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One style nit, otherwise LGTM

@rvagg rvagg force-pushed the master branch 2 times, most recently from c133999 to 83c7a88 Compare October 18, 2016 17:02
Copy link
Member

@indutny indutny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, if CI is green.

@lmoe
Copy link
Author

lmoe commented Jan 5, 2017

@indutny @addaleax
What should I do about this merge conflict? Just to know it in the future. It seems it occurred a month later or so. Should I resolve this conflict on my own, or will your merge tools handle it on their own when you make a release?

@sam-github
Copy link
Contributor

@lmoe Please resolve and re-push, it distributes the maintenance work, and if there are any complexities, we may ask you to do so anyhow.

@addaleax addaleax added this to the 8.0.0 milestone Feb 3, 2017
@addaleax
Copy link
Member

addaleax commented Feb 3, 2017

bump @lmoe – Could you rebase this against master to resolve the conflict?

@addaleax
Copy link
Member

Ping @lmoe again? Otherwise I’ll pick this up in the next couple of days.

Lukas Möller added 2 commits February 16, 2017 10:00
Change the default encoding from latin1 to utf8
and extend the test-crypto tests.
PullRequest nodejs#5522 and nodejs#5500 described the change
of the default encoding into UTF8 in crypto functions.

This however was only changed for the non-streaming API.
The streaming API still used binary as the default encoding.

This commit will change the default streaming API encoding to UTF8
to make both APIs behave the same.

It will also add tests to validate the behavior.
@lmoe
Copy link
Author

lmoe commented Feb 16, 2017

@addaleax Sorry, I totally forgot about the mail. I hope the branch is corrected now.

Copy link
Member

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, still looks good! Going to land this soon.

@addaleax
Copy link
Member

addaleax commented Feb 16, 2017

CI: https://ci.nodejs.org/job/node-test-commit/7947/
CITGM: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/580/

Edit: The electron-prebuilt failure is real and related, I’m reaching out and fixing it.
Edit²: malept/sumchecker#5

addaleax added a commit to addaleax/sumchecker that referenced this pull request Feb 16, 2017
Up until now, the default encoding for this is `binary`, but
this is scheduled to change in Node 8. Since this module is
currently always passing string input to the `Hash` object,
it needs to account for that.

A better fix would likely involve dropping all strings handling
and treating binary data using Buffer objects, but I kept
this change minimal to avoid any breakage.

Refs: nodejs/node#8611
addaleax added a commit to addaleax/sumchecker that referenced this pull request Feb 21, 2017
Up until now, the default encoding for this is `binary`, but
this is scheduled to change in Node 8. Since this module is
currently always passing string input to the `Hash` object,
it needs to account for that.

A better fix would likely involve dropping all strings handling
and treating binary data using Buffer objects, but I kept
this change minimal to avoid any breakage.

Refs: nodejs/node#8611
@addaleax
Copy link
Member

addaleax commented Mar 10, 2017

Fresh CI:

CI: https://ci.nodejs.org/job/node-test-commit/8359/
CITGM: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/637/
Edit: New CITGM: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/638/ (should be a bit better since the electron-prebuilt failure should be fixed now)

I would like to land this unless any new surprises show up. The electron-prebuilt failure is just waiting on a dependency update, and we should definitely have this in Node 8.

(why? because:)
'use strict';
const { createHash } = require('crypto');

const hash1 = createHash('sha256');
hash1.on('data', out => console.log('hash1 => ', out.toString('hex')));
hash1.end('💩');

const hash2 = createHash('sha256');
hash2.on('data', out => console.log('hash2 => ', out.toString('hex')));
hash2.end('⨽⪩');
// Surprise! Different input, same hash.

@addaleax addaleax added the notable-change PRs with changes that should be highlighted in changelogs. label Mar 11, 2017
@addaleax
Copy link
Member

CITGM failures seem to be known/unrelated. 🎉

Landed in 443691a, thanks for the PR and for bearing with our process!

@addaleax addaleax closed this Mar 11, 2017
addaleax pushed a commit that referenced this pull request Mar 11, 2017
PullRequest #5522 and #5500 described the change
of the default encoding into UTF8 in crypto functions.

This however was only changed for the non-streaming API.
The streaming API still used binary as the default encoding.

This commit will change the default streaming API encoding to UTF8
to make both APIs behave the same.

It will also add tests to validate the behavior.

Refs: #5522
Refs: #5500
PR-URL: #8611
Reviewed-By: Fedor Indutny <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
jungx098 pushed a commit to jungx098/node that referenced this pull request Mar 21, 2017
PullRequest nodejs#5522 and nodejs#5500 described the change
of the default encoding into UTF8 in crypto functions.

This however was only changed for the non-streaming API.
The streaming API still used binary as the default encoding.

This commit will change the default streaming API encoding to UTF8
to make both APIs behave the same.

It will also add tests to validate the behavior.

Refs: nodejs#5522
Refs: nodejs#5500
PR-URL: nodejs#8611
Reviewed-By: Fedor Indutny <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
@jasnell jasnell mentioned this pull request Apr 4, 2017
tniessen added a commit to tniessen/node that referenced this pull request Aug 13, 2023
This only affects the writable side of LazyTransform and should not
change the behavior of any LazyTransform streams (Cipher, Decipher,
Cipheriv, Decipheriv, Hash, Hmac).

If the user does not set defaultEncoding when creating a transform
stream, WritableState uses 'utf8' by default. Only LazyTransform
overwrites this with the return value of getDefaultEncoding(). This
was necessary when crypto.DEFAULT_ENCODING still existed. Now that
DEFAULT_ENCODING has been removed, getDefaultEncoding() always returns
'buffer'. The writable side of LazyTransform appears to treat 'utf8'
and 'buffer' in exactly the same way. Therefore, there seems to be no
need to overwrite _writableState.defaultEncoding at this point.

Nevertheless, because Node.js has failed to hide implementation details
such as _writableState from the ecosystem, we may want to consider this
a breaking change.

Refs: nodejs#47182
Refs: nodejs#8611
tniessen added a commit to tniessen/node that referenced this pull request Aug 20, 2023
This only affects the writable side of LazyTransform and should not
change the behavior of any LazyTransform streams (Cipher, Decipher,
Cipheriv, Decipheriv, Hash, Hmac).

If the user does not set defaultEncoding when creating a transform
stream, WritableState uses 'utf8' by default. Only LazyTransform
overwrites this with 'buffer' for strict backward compatibility. This
was necessary when crypto.DEFAULT_ENCODING still existed. Now that
DEFAULT_ENCODING has been removed, defaultEncoding is always 'buffer'.
The writable side of LazyTransform appears to treat 'utf8' and 'buffer'
in exactly the same way. Therefore, there seems to be no need to
overwrite _writableState.defaultEncoding at this point.

Nevertheless, because Node.js has failed to hide implementation details
such as _writableState from the ecosystem, we may want to consider this
a breaking change.

Refs: nodejs#47182
Refs: nodejs#8611
nodejs-github-bot pushed a commit that referenced this pull request Aug 27, 2023
This only affects the writable side of LazyTransform and should not
change the behavior of any LazyTransform streams (Cipher, Decipher,
Cipheriv, Decipheriv, Hash, Hmac).

If the user does not set defaultEncoding when creating a transform
stream, WritableState uses 'utf8' by default. Only LazyTransform
overwrites this with 'buffer' for strict backward compatibility. This
was necessary when crypto.DEFAULT_ENCODING still existed. Now that
DEFAULT_ENCODING has been removed, defaultEncoding is always 'buffer'.
The writable side of LazyTransform appears to treat 'utf8' and 'buffer'
in exactly the same way. Therefore, there seems to be no need to
overwrite _writableState.defaultEncoding at this point.

Nevertheless, because Node.js has failed to hide implementation details
such as _writableState from the ecosystem, we may want to consider this
a breaking change.

Refs: #47182
Refs: #8611
PR-URL: #49140
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Reviewed-By: Rafael Gonzaga <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
alexfernandez pushed a commit to alexfernandez/node that referenced this pull request Nov 1, 2023
This only affects the writable side of LazyTransform and should not
change the behavior of any LazyTransform streams (Cipher, Decipher,
Cipheriv, Decipheriv, Hash, Hmac).

If the user does not set defaultEncoding when creating a transform
stream, WritableState uses 'utf8' by default. Only LazyTransform
overwrites this with 'buffer' for strict backward compatibility. This
was necessary when crypto.DEFAULT_ENCODING still existed. Now that
DEFAULT_ENCODING has been removed, defaultEncoding is always 'buffer'.
The writable side of LazyTransform appears to treat 'utf8' and 'buffer'
in exactly the same way. Therefore, there seems to be no need to
overwrite _writableState.defaultEncoding at this point.

Nevertheless, because Node.js has failed to hide implementation details
such as _writableState from the ecosystem, we may want to consider this
a breaking change.

Refs: nodejs#47182
Refs: nodejs#8611
PR-URL: nodejs#49140
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Reviewed-By: Rafael Gonzaga <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
crypto Issues and PRs related to the crypto subsystem. notable-change PRs with changes that should be highlighted in changelogs. semver-major PRs that contain breaking changes and should be released in the next major version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants