Incompatible character encodings: ASCII-8BIT and UTF-8 in EmailReplyFilter #229
Comments
I asked around and I suspect it may be due to
It seems strange that it would affect some emails, but not others. cc @brianmario in case I missed anything.
@jch thanks for asking around. Most of the replies we're working with only use standard ASCII chars so there aren't any issues. This only seems to happen when replies include non-ASCII chars, like the ellipsis above. I think you're right about forcing the encoding to binary being the problem. What's the status on github/email_reply_parser#36?
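For anyone following along, the effect described here can be sketched in plain Ruby. This is a minimal illustration, not code from email_reply_parser or html-pipeline; the helper `try_concat` and the sample strings are made up for the example. It assumes the reply bytes are valid UTF-8 that has been re-tagged as binary (ASCII-8BIT) and is then combined with another UTF-8 string.

```ruby
# Hypothetical helper: reports whether two strings can be concatenated.
def try_concat(left, right)
  left + right
  :ok
rescue Encoding::CompatibilityError
  :incompatible
end

utf8_suffix = " \u2713" # UTF-8 string containing a non-ASCII character

# ASCII-only reply tagged as binary: still compatible with UTF-8 strings.
ascii_reply = "Sounds good".dup.force_encoding(Encoding::BINARY)

# Reply ending in an ellipsis (U+2026), same bytes but tagged as binary.
ellipsis_reply = "Sounds good\u2026".dup.force_encoding(Encoding::BINARY)

ascii_result = try_concat(ascii_reply, utf8_suffix)
ellipsis_result = try_concat(ellipsis_reply, utf8_suffix)

puts "ASCII-only binary + UTF-8: #{ascii_result}"
puts "non-ASCII binary + UTF-8:  #{ellipsis_result}"
```

Ruby only raises when both operands contain non-ASCII bytes and carry different encodings, which would be consistent with the error showing up for some emails and not others.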
@rymohr nice. Thanks for finding the PR. I think it's stalled, but I'm not clear who the owner is on it.
Fixed by github/email_reply_parser#36
We've been running into random encoding errors with email replies for a while now. While I still haven't been able to get to the bottom of it, I've at least been able to reduce it down to a simple test case:
The incoming emails are delivered by Mandrill as JSON and the content appears to be valid UTF-8. Not sure what's going on.
It's worth noting that the error goes away if I inline the reply in the test case instead of reading it from disk. In that case I get the following difference in byte sequence:
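For anyone wanting to run a similar comparison, here is a rough sketch of one way to inspect an inline string against the same text read back from disk. It is not the reporter's test case or output: the `describe` helper and the sample text are hypothetical, and it only assumes that how the file is read (binary versus an explicit UTF-8 encoding) may be one source of a difference.

```ruby
require "tempfile"

# Hypothetical helper: summarizes the properties worth comparing.
def describe(str)
  { encoding: str.encoding, valid: str.valid_encoding?, bytes: str.bytes }
end

inline = "Sounds good\u2026" # inline literal, tagged UTF-8

# Write the bytes to a temp file, then read them back two different ways.
from_disk_binary, from_disk_utf8 = Tempfile.create("reply") do |f|
  f.binmode
  f.write(inline)
  f.flush
  [File.binread(f.path), File.read(f.path, encoding: "UTF-8")]
end

p describe(inline)
p describe(from_disk_binary)
p describe(from_disk_utf8)
```

If the byte arrays match and only the `encoding` entry differs, the data itself is intact and the string has simply been tagged differently on the way in.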