Invalid timezone offsets while parsing dates in old messages. #89

Open
dwsteele opened this issue Oct 19, 2024 · 4 comments · May be fixed by #95

@dwsteele

I have a bunch of rather old messages that I am running through the parser and some of them are giving strange results.

For example, `Date: Thu, 10 Jul 1997 14:53:31 EST5EDT` parses as `DateTime { year: 1997, month: 7, day: 10, hour: 14, minute: 53, second: 31, tz_before_gmt: false, tz_hour: 50, tz_minute: 0 }`. An offset of 50 hours is pretty clearly off.

Another example: `Date: Sun, 27 Oct 2002 23:57:07 EST` yields `{ year: 2002, month: 10, day: 27, hour: 23, minute: 57, second: 7, tz_before_gmt: false, tz_hour: 0, tz_minute: 0 }`. In this case the offset is 0 -- less weird but still incorrect.

Any chance these are correctable? Unfortunately I have a fair number of messages with this issue. For now I have written some code to detect unparsable timezones and throw an error. Maybe the crate could at least return an error if it cannot parse the timezone?
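A minimal sketch of the kind of caller-side check described above, not code from the crate or from the reporter. `ParsedOffset` and `checked_offset_minutes` are hypothetical names; the struct only mirrors the three timezone fields printed in the report. The bounds of -12:00 to +14:00 and the reading of `tz_before_gmt` as "west of UTC" are assumptions.

```rust
/// Hypothetical stand-in for the timezone fields shown in the report.
/// This is not the crate's own type; it only reuses the field names.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ParsedOffset {
    pub tz_before_gmt: bool,
    pub tz_hour: u8,
    pub tz_minute: u8,
}

// Assumption: real-world UTC offsets lie between -12:00 and +14:00.
const MAX_WEST_MINUTES: i32 = 12 * 60;
const MAX_EAST_MINUTES: i32 = 14 * 60;

/// Returns the signed offset in minutes east of UTC,
/// or an error message when the fields are out of range.
pub fn checked_offset_minutes(tz: ParsedOffset) -> Result<i32, String> {
    if tz.tz_minute >= 60 {
        return Err(format!("minute field out of range: {}", tz.tz_minute));
    }
    let magnitude = i32::from(tz.tz_hour) * 60 + i32::from(tz.tz_minute);
    // Assumption: `tz_before_gmt == true` means a negative (western) offset.
    let signed = if tz.tz_before_gmt { -magnitude } else { magnitude };
    if signed < -MAX_WEST_MINUTES || signed > MAX_EAST_MINUTES {
        return Err(format!("implausible UTC offset: {} minutes", signed));
    }
    Ok(signed)
}
```

A range check like this would flag the 50-hour offset from the first example. It would not flag the second example, because an offset of 0 is within range; catching that case would require looking at the zone token in the raw header.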

@mdecimus
Member

This library never returns errors unless the message is completely unparseable. For everything else it makes a best effort to extract as much information as possible from non-conformant messages. As the README explains:

> In general, this library abides by the Postel's law or Robustness Principle which states that an implementation must be conservative in its sending behavior and liberal in its receiving behavior. This means that mail-parser will make a best effort to parse non-conformant e-mail messages as long as these do not deviate too much from the standard.

@dwsteele
Author

> For everything else it makes a best effort to extract as much information as possible from non-conformant messages

OK, fair enough, but 50 is not even a valid timezone offset. If the offset was kept within valid bounds it would make the parsed time at least more accurate. Also, Thunderbird is able to correctly parse these timezones so it seems possible to do so.

@mdecimus
Member

I will keep the issue open and will look into it as soon as I have a chance. Thanks.

On Dec 13, 2024, sftse linked a pull request that will close this issue.

@sftse
Contributor

sftse commented Dec 13, 2024

Can you check if #95 fixes the issue? The EST timezone is specified in RFC 5322, but I have no idea about the EST5EDT you mentioned; that looks malformed.
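
For reference, a minimal sketch of a lookup for the named zones that RFC 5322 lists in its obsolete date and time syntax (section 4.3). This is not the code from #95; `obs_zone_offset_minutes` is a hypothetical name. `EST5EDT` resembles a POSIX TZ-style string (standard name, offset, daylight name), and the sketch returns `None` for it rather than guessing, since choosing between EST and EDT would need daylight-saving rules.

```rust
/// Maps an obsolete RFC 5322 zone name to minutes east of UTC.
/// Returns `None` for anything outside that fixed list, including `EST5EDT`.
pub fn obs_zone_offset_minutes(zone: &str) -> Option<i32> {
    // Assumption: compare names case-insensitively after trimming whitespace.
    let hours = match zone.trim().to_ascii_uppercase().as_str() {
        "UT" | "GMT" => 0,
        "EDT" => -4,
        "EST" | "CDT" => -5,
        "CST" | "MDT" => -6,
        "MST" | "PDT" => -7,
        "PST" => -8,
        _ => return None,
    };
    Some(hours * 60)
}
```

With this table, the `EST` example from the report would map to -300 minutes (five hours west of UTC) instead of 0.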
