Reading 24 bytes possibly results in incorrect chunks #13

dlsgusrn7577 · 2019-11-01T00:48:22Z

I apologize in advance if this issue is a false alarm.

When reading 24 bytes from the middle and end of the buffer (in the `getEncodingSync` method),
isn't it possible to get chunks that are cut in the middle of a character?

For instance, suppose a file consists of 75 copies of '한', a Korean character that takes 3 bytes in UTF-8.
If we want to read the middle 24 bytes (chunkLength = 24), then

   chunkBegin =  Math.max(0, Math.floor(75 * 3 / 2) - chunkLength) = 88
   chunkEnd =  Math.min(75 * 3, 88 + chunkLength) = 112

However, since 88 is not a multiple of 3, buffer.toString(encoding, 88, 112) starts in the middle of a character and will not yield a clean sequence of '한' characters.
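A minimal sketch of the scenario above, assuming Node.js's `Buffer` and a `chunkLength` of 24. It only reproduces the arithmetic from this comment and is not the library's actual code; the variable names are chosen for illustration.

```javascript
// Sketch only: reproduces the offsets described above with Node.js's Buffer.
const chunkLength = 24 // assumed chunk size, taken from the issue title
const buffer = Buffer.from('한'.repeat(75), 'utf8') // 225 bytes, 3 per character

const chunkBegin = Math.max(0, Math.floor(buffer.length / 2) - chunkLength)
const chunkEnd = Math.min(buffer.length, chunkBegin + chunkLength)

// Decode the middle chunk exactly as the offsets dictate.
const middleChunk = buffer.toString('utf8', chunkBegin, chunkEnd)

// chunkBegin % 3 === 1, so decoding starts on a continuation byte and
// Node substitutes U+FFFD (the replacement character) for the broken bytes.
const hasReplacementChar = middleChunk.includes('\uFFFD')

console.log({ chunkBegin, chunkEnd, hasReplacementChar })
```

If `hasReplacementChar` is true, the chunk no longer decodes cleanly as UTF-8 even though the file itself is valid text.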

Please let me know if I am misunderstanding.
Thank you very much!

balupton · 2019-11-05T09:57:00Z

Seems plausible. Can you submit a PR with a test that showcases this problem? Then you can update the PR with the fix (I guess flooring to the nearest multiple of 3) and we can do a release.
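Flooring to a multiple of 3 would only help when every character is 3 bytes wide. As a hedged alternative, one possible approach is to move each offset back until it no longer points at a UTF-8 continuation byte (bit pattern `10xxxxxx`). The helper name `alignToCharStart` is hypothetical and not part of the library, and this sketch assumes the buffer really is UTF-8; it is not necessarily what the eventual fix does.

```javascript
// Hypothetical helper, not part of the library's API.
// Moves an offset backward until it no longer points at a UTF-8
// continuation byte (bit pattern 10xxxxxx). Assumes UTF-8 input.
function alignToCharStart(buffer, offset) {
  let i = Math.min(Math.max(0, offset), buffer.length)
  while (i > 0 && i < buffer.length && (buffer[i] & 0xc0) === 0x80) {
    i--
  }
  return i
}

const buffer = Buffer.from('한'.repeat(75), 'utf8')

// Align both ends of the 88..112 range from the report.
const begin = alignToCharStart(buffer, 88)
const end = alignToCharStart(buffer, 112)
const chunk = buffer.toString('utf8', begin, end)

console.log({ begin, end, chunk })
```

Aligning both ends keeps the chunk close to the intended size while making sure it starts and stops on character boundaries. On truly binary data the heuristic is meaningless, so a real fix would still need to decide how to treat invalid sequences.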

sainthkh · 2021-06-01T03:44:52Z

I'm currently working on this. I found error strings like:

12345678901234567890123Ф
1234567890123456789012안

It might take some time because I need a complete list of error cases.
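To illustrate why the two strings above are error cases, here is a small sketch assuming a 24-byte chunk read from the start of the buffer with Node.js's `Buffer`. Both strings are 25 bytes long in UTF-8, so a 24-byte cut lands inside the final multibyte character. The `results` structure is made up for the example.

```javascript
// Sketch only: checks where a 24-byte cut falls in the two sample strings.
const chunkLength = 24 // assumed chunk size, taken from the issue title
const samples = ['12345678901234567890123Ф', '1234567890123456789012안']

const results = samples.map((text) => {
  const buffer = Buffer.from(text, 'utf8')
  // Decode only the first chunkLength bytes, as a fixed-size read would.
  const chunk = buffer.toString('utf8', 0, Math.min(buffer.length, chunkLength))
  return {
    text,
    byteLength: buffer.length,
    // U+FFFD appears when the cut splits a multibyte character.
    truncated: chunk.includes('\uFFFD'),
  }
})

console.log(results)
```

The first string ends in a 2-byte character and the second in a 3-byte character, so the cut can split a character at different depths, which is presumably why a complete list of cases is needed.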

This was referenced Apr 12, 2021

[7.0.1]Bug: cy.intercept POST terminates cypress process cypress-io/cypress#15901

Closed

fix: do not treat utf8 requests as binary cypress-io/cypress#15946

Merged

sainthkh mentioned this issue May 31, 2021

cy.intercept request.body is ArrayBuffer for JSON requests with long Cyrillic strings cypress-io/cypress#16292

Closed

sainthkh mentioned this issue Jun 2, 2021

Fix false negatives caused by multibyte utf8 characters. #214

Merged

balupton closed this as completed in #214 Jul 31, 2021

balupton added a commit that referenced this issue Jul 31, 2021

v6.0.0 - release #214, close #13, speak multibyte chars

0ee4d89
