Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Binary detection (AWS API Gateway) #258

Open
Guichaguri opened this issue Aug 14, 2024 · 1 comment
Open

Binary detection (AWS API Gateway) #258

Guichaguri opened this issue Aug 14, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@Guichaguri
Copy link

Feature Request

Is your feature request related to a problem? Please describe.

Currently, you have to manually specify which content types are considered "binary" in order to use them with the API Gateway. Not specifying them causes the response to be converted into a UTF-8 string. This is responsible for a data loss that corrupts the file, and it's not clear to the user what happened.

Describe the solution you'd like

Considering that encoding text data into binary does not corrupts the data, but encoding binary data into UTF-8 does, I think the library should consider everything as binary by default, and have a list of text content types instead of binary ones.

Encoding text data as base64 is inefficient, but it still works.
To work around the inefficiency, the library can have a list of common text mime types by default.

Describe alternatives you've considered

Adding to every common binary mime type the DEFAULT_BINARY_CONTENT_TYPES, although it does not solves the unclear indication of the cause, it softens the problem as it will be faced on fewer cases.

Are you willing to resolve this issue by submitting a Pull Request?

Yes, I have the time, and I know how to start.

@Guichaguri Guichaguri added the enhancement New feature or request label Aug 14, 2024
@H4ad
Copy link
Owner

H4ad commented Aug 14, 2024

For a minor release, we can add every common binary mime type to DEFAULT_BINARY_CONTENT_TYPES.

Considering that encoding text data into binary does not corrupts the data, but encoding binary data into UTF-8 does, I think the library should consider everything as binary by default, and have a list of text content types instead of binary ones.

We can do this but it will require a major release since I will invert the way I check the binary.

Also, I think we can rewrite the internal code to handle response body as buffer instead of (buffer->string->buffer) every time, this will not be a breaking change (I think) and it will probably speedup this lib.

Well, feel free to open PRs for each suggestion, the first and the last one I can merge and release as minor, but the second one will require a major release, so I will delay it a little bit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants