Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow limiting Sieve :regex execution time #4767

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ksmurchison
Copy link
Contributor

@ksmurchison ksmurchison commented Dec 8, 2023

This will about from regexec(), and return an error code so that execution of the script will fail (message still delivered to INBOX)
The lmtpd will be terminated after the message has been delivered to all recipients.

Copy link
Contributor

@elliefm elliefm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My setjmp(3) man page says this about calling longjmp and siglongjmp from a signal handler, like we're doing here.

POSIX.1-2008 Technical Corrigendum 2 adds longjmp() and siglongjmp() to the list of async-signal-safe functions. However, the standard recommends avoiding the use of these functions from signal handlers and goes on to point out that if these functions are called from a signal handler that interrupted a call to a non-async-signal-safe function (or some equivalent, such as the steps equivalent to exit(3) that occur upon a return from the initial call to main()), the behavior is undefined if the program subsequently makes a call to a non-async-signal-safe function. The only way of avoiding undefined behavior is to ensure one of the following:

• After long jumping from the signal handler, the program does not call any non-async-signal-safe functions and does not return from the initial call to main().

• Any signal whose handler performs a long jump must be blocked during every call to a non-async-signal-safe function and no non-async-signal-safe functions are called after returning from the initial call to main().

It's not clear to me whether regcomp is a non-async-signal-safe function, and it probably varies depending on which regex library Cyrus is linked against anyway. If we suppose that regcomp is non-async-signal-safe, then it might not be safe to allow lmtpd to finish processing the rest of the recipients after one of them trips a regex timeout -- especially if any of the subsequent recipients also have regex in their sieve, which would definitely call the non-async-signal-safe regcomp again, which this documentation explicitly calls out as undefined.

sieve/comparator.c Outdated Show resolved Hide resolved
sieve/comparator.c Show resolved Hide resolved
@elliefm
Copy link
Contributor

elliefm commented Dec 12, 2023

https://datatracker.ietf.org/doc/html/rfc2033.html#section-5:

The server SHOULD send each reply as soon as possible. If it is
going to spend a nontrivial amount of time handling delivery for the
next recipient, it SHOULD flush any outgoing LMTP buffer, so the
reply may be quickly received by the client.

The client SHOULD process the replies as they come in, instead of
waiting for all of the replies to arrive before processing any of
them. If the connection closes after replies for some, but not all,
recipients have arrived, the client MUST process the replies that
arrived and treat the rest as temporary failures.

I think this means that, as long as lmtpd makes sure to prot_flush() after writing each recipient reply, then at any time it should be okay for it to exit early, and anything it didn't get to will just queue and retry later.

I don't remember the lmtpd architecture in detail though. At the time we're processing some user's sieve regex, have we already sent the replies for any earlier recipients? If we have, then in the case of regexec timeout, it should be fine to just send (and flush) the failed reply for this recipient, and then shut_down(). That would save us worrying about any undefined behaviour while processing the remaining recipients.

It still leaves open a possibility for a user with a pathological regex in their sieve to occasionally delay mail for other users (if those other users are the queued-and-retried later recipients). We might want to recommend that deployments that allow user-supplied sieve and have the "regex" extension enabled should configure their MTA to send one recipient per message, like FM does. Though I'm not sure where in the documentation this recommendation would best fit.

@rsto rsto removed their request for review August 20, 2024 06:13
@rsto rsto self-assigned this Oct 14, 2024
@rsto rsto marked this pull request as draft October 15, 2024 06:49
@rsto rsto removed their assignment Oct 15, 2024
@rsto
Copy link
Member

rsto commented Oct 15, 2024

@ksmurchison This currently fails on CI and it seems unclear if this is the approach we'd should be taking. I have converted this back to draft.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants