strip BOM #5

Open
FranklinYu opened this issue May 31, 2018 · 7 comments

Comments

@FranklinYu

If a response body starts with a BOM, the JSON decoder throws an exception. This is fairly common in practice (although the Content-Type header is the right way to declare the encoding over HTTP). Please strip the BOM before feeding the body to the parser.

@JacobReynolds
Contributor

Hey @FranklinYu I think that's a great idea. I'm not too familiar with BOM, would it be valuable to add it back into the request body when they issue the request?

@FranklinYu
Author

It is possible, but then we need to remember which requests had a BOM. Given that information, we can simply prepend those few bytes to the request again.
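
A minimal sketch of that round trip, for illustration only. It assumes a UTF-8 BOM; the class `BomRoundTrip` and its method names are hypothetical and not part of the extension.

```java
import java.util.Arrays;

// Hypothetical helper: remember whether a body had a UTF-8 BOM, strip it for
// parsing, and prepend it again when the edited body is sent.
class BomRoundTrip {
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // True if the body starts with the three UTF-8 BOM bytes EF BB BF.
    static boolean hasUtf8Bom(byte[] body) {
        return body.length >= 3
                && body[0] == UTF8_BOM[0]
                && body[1] == UTF8_BOM[1]
                && body[2] == UTF8_BOM[2];
    }

    // Returns the body without its leading BOM, or the body unchanged if there is none.
    static byte[] strip(byte[] body) {
        return hasUtf8Bom(body) ? Arrays.copyOfRange(body, 3, body.length) : body;
    }

    // Prepends the BOM again if the original body had one.
    static byte[] restore(byte[] edited, boolean hadBom) {
        if (!hadBom) {
            return edited;
        }
        byte[] out = new byte[UTF8_BOM.length + edited.length];
        System.arraycopy(UTF8_BOM, 0, out, 0, UTF8_BOM.length);
        System.arraycopy(edited, 0, out, UTF8_BOM.length, edited.length);
        return out;
    }
}
```

The caller would store the result of `hasUtf8Bom` next to the request, parse `strip(body)`, and call `restore` on the edited bytes before issuing the request.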

@JacobReynolds
Contributor

I'm having some trouble reproducing this. I'm using the Unicode BOM character \uFEFF in this string, but the extension is still able to beautify it.

The string I'm using: {"BOM": "test"}

It also looks like Google's GSON parser has handling for this https://github.com/google/gson/blob/master/gson/src/main/java/com/google/gson/stream/JsonReader.java#L1298

Could you send me an example string that causes this error?

@FranklinYu
Author

FranklinYu commented Jun 8, 2018

It may take me some time to find the Burp record (I can do it over the weekend), but I remember it was a UTF-8 BOM. According to the source you cite, it seems that GSON only handles UTF-16.

@aph3rson

aph3rson commented Dec 27, 2018

Coming back to this, @FranklinYu was right, as I'm having the same issue. A UTF-8 BOM is not handled by the GSON parser - this appears as bytes EF BB BF at the head of the content.

This could be fixed upstream, by filing an issue with google/gson and pulling the fix downstream, or worked around here. The former is likely the better option.
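
For anyone trying to reproduce this, a small sketch of a body that really starts with the bytes EF BB BF, and of how those bytes decode under two charsets. The class name `BomBytesDemo` is hypothetical. It is only meant to show that the result depends on the charset used for decoding; it does not claim to explain the behaviour seen in the extension.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo: build a body with a raw UTF-8 BOM and decode it two ways.
class BomBytesDemo {
    // UTF-8 BOM (EF BB BF) followed by the JSON text {"BOM": "test"}.
    static byte[] sampleBody() {
        byte[] bom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
        byte[] json = "{\"BOM\": \"test\"}".getBytes(StandardCharsets.UTF_8);
        byte[] body = new byte[bom.length + json.length];
        System.arraycopy(bom, 0, body, 0, bom.length);
        System.arraycopy(json, 0, body, bom.length, json.length);
        return body;
    }

    public static void main(String[] args) {
        byte[] body = sampleBody();

        // Decoded as UTF-8, the three BOM bytes become the single char U+FEFF,
        // which Java keeps at the start of the string.
        String asUtf8 = new String(body, StandardCharsets.UTF_8);
        System.out.println(asUtf8.charAt(0) == '\uFEFF');

        // Decoded as ISO-8859-1, the same bytes become three separate chars.
        String asLatin1 = new String(body, StandardCharsets.ISO_8859_1);
        System.out.println(asLatin1.startsWith("\u00EF\u00BB\u00BF"));
    }
}
```

Note that `new String(byte[])` without an explicit charset uses the JVM's default charset, so the same bytes can produce different strings on different systems.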

@FranklinYu
Author

My attempt with Gson: google/gson#1481

@FranklinYu
Author

FranklinYu commented Mar 7, 2019

The Gson team doesn’t seem to want BOM detection as part of their library (and I kind of agree with that). I think the related logic is in:

JsonParser jp = new JsonParser();
JsonElement je = jp.parse(new String(requestResponseBody));
json = gson.toJson(je);

Given that, and assuming:

  1. We only support UTF-8, UTF-16 BE, and UTF-16 LE.
  2. We don’t write the BOM back when the user modifies the body.

I can come up with a simple (naive) solution.
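
A minimal sketch of what such a naive solution could look like under the two assumptions above. The class `BomStripper` and the method `decodeWithoutBom` are hypothetical names, and falling back to UTF-8 when no BOM is present is an additional assumption, not something decided in this thread.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decode a body to a String, dropping a leading BOM.
// Only UTF-8, UTF-16 BE, and UTF-16 LE are recognised (assumption 1), so a
// UTF-32 LE BOM (FF FE 00 00) would be mistaken for UTF-16 LE.
class BomStripper {
    static String decodeWithoutBom(byte[] body) {
        if (startsWith(body, 0xEF, 0xBB, 0xBF)) {
            return new String(body, 3, body.length - 3, StandardCharsets.UTF_8);
        }
        if (startsWith(body, 0xFE, 0xFF)) {
            return new String(body, 2, body.length - 2, StandardCharsets.UTF_16BE);
        }
        if (startsWith(body, 0xFF, 0xFE)) {
            return new String(body, 2, body.length - 2, StandardCharsets.UTF_16LE);
        }
        // No BOM found: fall back to UTF-8 (assumption made for this sketch).
        return new String(body, StandardCharsets.UTF_8);
    }

    // Compares the first bytes of data, as unsigned values, against prefix.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With a helper like this, the parsing call could take `BomStripper.decodeWithoutBom(requestResponseBody)` in place of `new String(requestResponseBody)`. The BOM is not written back afterwards, in line with assumption 2.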

@FranklinYu FranklinYu changed the title [feature request] strip BOM strip BOM Mar 24, 2019
3 participants