-
-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Qualys security scanner kills all cheroot worker threads, hangs server, eats all available file descriptors #365
Conversation
I can confirm this exact same issue with cheroot version A Qualsys scanner probing a SSL CherryPy server that uses the Here is the traceback I am seeing caused by a Qualsys scan when probing a SSL CherryPy server: Exception in thread ('CP Server Thread-22',):
Traceback (most recent call last):
File "cheroot/server.py", line 1290, in communicate
req.parse_request()
File "cheroot/server.py", line 719, in parse_request
success = self.read_request_line()
File "cheroot/server.py", line 760, in read_request_line
request_line = self.rfile.readline()
File "cheroot/server.py", line 310, in readline
data = self.rfile.readline(256)
File "/usr/lib/python3.7/_pyio.py", line 512, in readline
b = self.read(nreadahead())
File "/usr/lib/python3.7/_pyio.py", line 491, in nreadahead
readahead = self.peek(1)
File "/usr/lib/python3.7/_pyio.py", line 1085, in peek
return self._peek_unlocked(size)
File "/usr/lib/python3.7/_pyio.py", line 1092, in _peek_unlocked
current = self.raw.read(to_read)
File "/usr/lib/python3.7/socket.py", line 589, in readinto
return self._sock.recv_into(b)
File "/usr/lib/python3.7/ssl.py", line 1071, in recv_into
return self.read(nbytes, buffer)
File "/usr/lib/python3.7/ssl.py", line 929, in read
return self._sslobj.read(len, buffer)
ssl.SSLError: [SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac (_ssl.c:2570)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "cheroot/workers/threadpool.py", line 125, in run
keep_conn_open = conn.communicate()
File "cheroot/server.py", line 1319, in communicate
self._conditional_error(req, '500 Internal Server Error')
File "cheroot/server.py", line 1362, in _conditional_error
req.simple_response(response)
File "cheroot/server.py", line 1128, in simple_response
self.conn.wfile.write(EMPTY.join(buf))
File "cheroot/makefile.py", line 438, in write
res = super().write(val, *args, **kwargs)
File "cheroot/makefile.py", line 36, in write
self._flush_unlocked()
File "cheroot/makefile.py", line 45, in _flush_unlocked
n = self.raw.write(bytes(self._write_buf))
File "/usr/lib/python3.7/socket.py", line 607, in write
return self._sock.send(b)
File "/usr/lib/python3.7/ssl.py", line 1003, in send
return self._sslobj.write(data)
ssl.SSLError: [SSL: UNKNOWN_STATE] unknown state (_ssl.c:2423) Can someone from the cheroot maintainer team please take a look at this? This is a fairly critical bug and a simple security scanner should not be able to crash a cheroot powered SSL CherryPy server. |
@webknjaz can you please take a look at this issue? t has been open for over a year now and it is fairly critical as it cripples a cheroot running as part of a SSL CherryPy server. I have also confirmed in a production deployment of CherryPy that @DangerPoodle's patch has from this PR works and mitigates the worker thread exhaustion caused by malformed SSL connections. Additional related cheroot issues: |
Hi @DangerPoodle - can I confirm with you that in order for the fix in your PR to work, we have to also enable the error_log with
by default on my side they are set to False. I tried implementing your PR while just leaving those settings above at default, False, and the problem continued to occur. So I am thinking I will set them to True, since your code
seems to require the error_log (at least!) to be turned on. |
β¦int-lockfiles Bump transitive deps in pip-tools-managed lockfiles
Hi, [22/Jan/2023:08:43:24] ENGINE Error in HTTPServer.serve
Traceback (most recent call last):
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1823, in serve
self._connections.run(self.expiration_interval)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\connections.py", line 203, in run
self._run(expiration_interval)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\connections.py", line 246, in _run
new_conn = self._from_server_socket(self.server.socket)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\connections.py", line 300, in _from_server_socket
s, ssl_env = self.server.ssl_adapter.wrap(s)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\ssl\builtin.py", line 277, in wrap
s = self.context.wrap_socket(
File "ssl.py", line 512, in wrap_socket
File "ssl.py", line 1070, in _create
File "ssl.py", line 1341, in do_handshake
ssl.SSLError: [SSL: UNEXPECTED_MESSAGE] unexpected message (_ssl.c:997)
[22/Jan/2023:08:43:46] ENGINE socket.error 8
Traceback (most recent call last):
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1290, in communicate
req.parse_request()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 719, in parse_request
success = self.read_request_line()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 778, in read_request_line
self.simple_response(
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1128, in simple_response
self.conn.wfile.write(EMPTY.join(buf))
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 438, in write
res = super().write(val, *args, **kwargs)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 36, in write
self._flush_unlocked()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 45, in _flush_unlocked
n = self.raw.write(bytes(self._write_buf))
File "socket.py", line 723, in write
File "ssl.py", line 1205, in send
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2396)
Exception in thread ('CP Server Thread-16',):
Traceback (most recent call last):
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1290, in communicate
req.parse_request()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 719, in parse_request
success = self.read_request_line()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 778, in read_request_line
self.simple_response(
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1128, in simple_response
self.conn.wfile.write(EMPTY.join(buf))
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 438, in write
res = super().write(val, *args, **kwargs)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 36, in write
self._flush_unlocked()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 45, in _flush_unlocked
n = self.raw.write(bytes(self._write_buf))
File "socket.py", line 723, in write
File "ssl.py", line 1205, in send
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2396)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "threading.py", line 1009, in _bootstrap_inner
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\workers\threadpool.py", line 125, in run
keep_conn_open = conn.communicate()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1319, in communicate
self._conditional_error(req, '500 Internal Server Error')
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1362, in _conditional_error
req.simple_response(response)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1128, in simple_response
self.conn.wfile.write(EMPTY.join(buf))
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 438, in write
res = super().write(val, *args, **kwargs)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 36, in write
self._flush_unlocked()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 45, in _flush_unlocked
n = self.raw.write(bytes(self._write_buf))
File "socket.py", line 723, in write
File "ssl.py", line 1205, in send
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2396)
[22/Jan/2023:08:43:47] ENGINE socket.error 8
Traceback (most recent call last):
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1290, in communicate
req.parse_request()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 719, in parse_request
success = self.read_request_line()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 778, in read_request_line
self.simple_response(
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1128, in simple_response
self.conn.wfile.write(EMPTY.join(buf))
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 438, in write
res = super().write(val, *args, **kwargs)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 36, in write
self._flush_unlocked()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 45, in _flush_unlocked
n = self.raw.write(bytes(self._write_buf))
File "socket.py", line 723, in write
File "ssl.py", line 1205, in send
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2396)
Exception in thread ('CP Server Thread-14',):
Traceback (most recent call last):
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1290, in communicate
req.parse_request()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 719, in parse_request
success = self.read_request_line()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 778, in read_request_line
self.simple_response(
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1128, in simple_response
self.conn.wfile.write(EMPTY.join(buf))
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 438, in write
res = super().write(val, *args, **kwargs)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 36, in write
self._flush_unlocked()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 45, in _flush_unlocked
n = self.raw.write(bytes(self._write_buf))
File "socket.py", line 723, in write
File "ssl.py", line 1205, in send
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2396)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "threading.py", line 1009, in _bootstrap_inner
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\workers\threadpool.py", line 125, in run
keep_conn_open = conn.communicate()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1319, in communicate
self._conditional_error(req, '500 Internal Server Error')
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1362, in _conditional_error
req.simple_response(response)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1128, in simple_response
self.conn.wfile.write(EMPTY.join(buf))
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 438, in write
res = super().write(val, *args, **kwargs)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 36, in write
self._flush_unlocked()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 45, in _flush_unlocked
n = self.raw.write(bytes(self._write_buf))
File "socket.py", line 723, in write
File "ssl.py", line 1205, in send
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2396)
[22/Jan/2023:08:46:11] ENGINE Error in HTTPServer.serve
Traceback (most recent call last):
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1823, in serve
self._connections.run(self.expiration_interval)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\connections.py", line 203, in run
self._run(expiration_interval)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\connections.py", line 246, in _run
new_conn = self._from_server_socket(self.server.socket)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\connections.py", line 300, in _from_server_socket
s, ssl_env = self.server.ssl_adapter.wrap(s)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\ssl\builtin.py", line 277, in wrap
s = self.context.wrap_socket(
File "ssl.py", line 512, in wrap_socket
File "ssl.py", line 1070, in _create
File "ssl.py", line 1341, in do_handshake
ssl.SSLError: [SSL] internal error (_ssl.c:997)
[22/Jan/2023:09:59:48] ENGINE Error in HTTPServer.serve
Traceback (most recent call last):
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1823, in serve
self._connections.run(self.expiration_interval)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\connections.py", line 203, in run
self._run(expiration_interval)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\connections.py", line 246, in _run
new_conn = self._from_server_socket(self.server.socket)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\connections.py", line 300, in _from_server_socket
s, ssl_env = self.server.ssl_adapter.wrap(s)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\ssl\builtin.py", line 277, in wrap
s = self.context.wrap_socket(
File "ssl.py", line 512, in wrap_socket
File "ssl.py", line 1070, in _create
File "ssl.py", line 1341, in do_handshake
ssl.SSLError: [SSL] internal error (_ssl.c:997)
[22/Jan/2023:09:59:59] ENGINE socket.error 8
Traceback (most recent call last):
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 987, in read_request_headers
self.header_reader(self.rfile, self.inheaders)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 213, in __call__
raise ValueError('Illegal end of headers.')
ValueError: Illegal end of headers.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1290, in communicate
req.parse_request()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 732, in parse_request
success = self.read_request_headers()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 989, in read_request_headers
self.simple_response('400 Bad Request', ex.args[0])
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1128, in simple_response
self.conn.wfile.write(EMPTY.join(buf))
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 438, in write
res = super().write(val, *args, **kwargs)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 36, in write
self._flush_unlocked()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 45, in _flush_unlocked
n = self.raw.write(bytes(self._write_buf))
File "socket.py", line 723, in write
File "ssl.py", line 1205, in send
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2396)
Exception in thread ('CP Server Thread-8',):
Traceback (most recent call last):
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 987, in read_request_headers
self.header_reader(self.rfile, self.inheaders)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 213, in __call__
raise ValueError('Illegal end of headers.')
ValueError: Illegal end of headers.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1290, in communicate
req.parse_request()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 732, in parse_request
success = self.read_request_headers()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 989, in read_request_headers
self.simple_response('400 Bad Request', ex.args[0])
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1128, in simple_response
self.conn.wfile.write(EMPTY.join(buf))
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 438, in write
res = super().write(val, *args, **kwargs)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 36, in write
self._flush_unlocked()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 45, in _flush_unlocked
n = self.raw.write(bytes(self._write_buf))
File "socket.py", line 723, in write
File "ssl.py", line 1205, in send
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2396)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "threading.py", line 1009, in _bootstrap_inner
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\workers\threadpool.py", line 125, in run
keep_conn_open = conn.communicate()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1319, in communicate
self._conditional_error(req, '500 Internal Server Error')
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1362, in _conditional_error
req.simple_response(response)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\server.py", line 1128, in simple_response
self.conn.wfile.write(EMPTY.join(buf))
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 438, in write
res = super().write(val, *args, **kwargs)
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 36, in write
self._flush_unlocked()
File "D:\some\folder\python-windows\Lib\site-packages\cheroot\makefile.py", line 45, in _flush_unlocked
n = self.raw.write(bytes(self._write_buf))
File "socket.py", line 723, in write
File "ssl.py", line 1205, in send
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2396) |
Ran into the same issue recently - until this PR hopefully gets merged and prevents this from happening has anyone confirmed that just updating the same file to catch the SSLError actually works? nvm I see @siris has stated that it does |
This comment was marked as duplicate.
This comment was marked as duplicate.
This comment was marked as duplicate.
This comment was marked as duplicate.
Here's a monkey patch based on this PR that's worked for me. You can put it in your own code to modify cheroot without changing cheroot's files. # Work around HTTPS issues.
# https://github.com/cherrypy/cheroot/pull/365
import cheroot.workers.threadpool, time, ssl
from cheroot.workers.threadpool import _SHUTDOWNREQUEST
def monkeypatched_run(self):
"""Process incoming HTTP connections.
Retrieves incoming connections from thread pool.
"""
self.server.stats['Worker Threads'][self.name] = self.stats
try:
self.ready = True
while True:
conn = self.server.requests.get()
if conn is _SHUTDOWNREQUEST:
return
self.conn = conn
is_stats_enabled = self.server.stats['Enabled']
if is_stats_enabled:
self.start_time = time.time()
keep_conn_open = False
try:
keep_conn_open = conn.communicate()
except ssl.SSLError as ex:
self.server.error_log(f'ignored extra SSLError {repr(ex)}')
finally:
if keep_conn_open:
self.server.put_conn(conn)
else:
conn.close()
if is_stats_enabled:
self.requests_seen += self.conn.requests_seen
self.bytes_read += self.conn.rfile.bytes_read
self.bytes_written += self.conn.wfile.bytes_written
self.work_time += time.time() - self.start_time
self.start_time = None
self.conn = None
except (KeyboardInterrupt, SystemExit) as ex:
self.server.interrupt = ex
cheroot.workers.threadpool.WorkerThread.run = monkeypatched_run |
@webknjaz Does anything need to happen here to get this merged? It's a three-year-old, three-line change that fixes a showstopping bug. |
@Kodiologist I had something locally that helped me reproduce this with tlsfuzzer, but other priorities took over. I'll need to look deeper myself. #518 fixed other related problems that I thought might have influenced this one, but now I think that's orthogonal. Ideally, it'd be useful to have some tlsfuzzer-based testing in CI, but I won't declare that a blocker. Still a regression test for this specific bug would be important. I was holding off on this because I thought that this might not be the correct place to change. I'll try to find more time to look into it again. |
Cool, thank you. |
FWIW we would much rather have a change asap that is in the wrong place, but patches the issue (even if only partially), than no change at all. It can always be put in the "right" place later. There are several issues related to SSL, of which this is only one, that have been known bugs for many years, that have continues to stagnate, and which are making us increasingly uncomfortable over time because of, among other concerns, just the doubt it leaves about the level of commitment the maintainers have to production-grade quality. We would very much appreciate if these longstanding and critical bugs could be given more attention, in particular where the community has already put in the work to provide a solution. |
Just to add that we are in the same situation here: several users end up disabling ssl because of all the issues cherrypy suffers which obviously affects their systems security when they are forced to switch to plain |
FTR I decided to go with suggestions made in #358. |
This patch aims to prevent a situation when the working threads are killed by arbitrary exceptions that bubble up to their entry point layers that aren't handled properly or at all. This was causing DoS in many situations, including TLS errors and attempts to close the underlying sockets erroring out. Fixes cherrypy#358 Ref cherrypy#375 Resolves cherrypy#365
This patch aims to prevent a situation when the working threads are killed by arbitrary exceptions that bubble up to their entry point layers that aren't handled properly or at all. This was causing DoS in many situations, including TLS errors and attempts to close the underlying sockets erroring out. Fixes cherrypy#358 Ref cherrypy#375 Resolves cherrypy#365
This patch aims to prevent a situation when the working threads are killed by arbitrary exceptions that bubble up to their entry point layers that aren't handled properly or at all. This was causing DoS in many situations, including TLS errors and attempts to close the underlying sockets erroring out. Fixes cherrypy#358 Ref cherrypy#375 Resolves cherrypy#365
This patch aims to prevent a situation when the working threads are killed by arbitrary exceptions that bubble up to their entry point layers that aren't handled properly or at all. This was causing DoS in many situations, including TLS errors and attempts to close the underlying sockets erroring out. Fixes cherrypy#358 Ref cherrypy#375 Resolves cherrypy#365
This patch aims to prevent a situation when the working threads are killed by arbitrary exceptions that bubble up to their entry point layers that aren't handled properly or at all. This was causing DoS in many situations, including TLS errors and attempts to close the underlying sockets erroring out. Fixes cherrypy#358 Ref cherrypy#375 Resolves cherrypy#365
This patch aims to prevent a situation when the working threads are killed by arbitrary exceptions that bubble up to their entry point layers that aren't handled properly or at all. This was causing DoS in many situations, including TLS errors and attempts to close the underlying sockets erroring out. Fixes cherrypy#358 Ref cherrypy#375 Resolves cherrypy#365
This patch aims to prevent a situation when the working threads are killed by arbitrary exceptions that bubble up to their entry point layers that aren't handled properly or at all. This was causing DoS in many situations, including TLS errors and attempts to close the underlying sockets erroring out. Fixes cherrypy#358 Ref cherrypy#375 Resolves cherrypy#365
A DoS would happen in many situations, including TLS errors and attempts to close the underlying sockets erroring out. This patch aims to prevent a situation when the worker threads are killed by arbitrary exceptions that bubble up to their entry point layers that aren't handled properly or at all. PR #649 Fixes #358 Fixes #354 Ref #310 Ref #346 Ref #375 Ref #599 Ref #641 Resolves #365
It's great that this fix is now in the master/main branch-- thank you. But is it possible to use this fix on v10.0.0, without having to move to an unstable branch? Given how critical this fix is I wanted to apply just this change as a patch to the latest release version (10.0.0 / https://github.com/cherrypy/cheroot/tree/v10.0.0). However, it seems that threadpool.py file has changed quite a bit from v10.0.0 so applying this fix as a patch (e.g. https://github.com/cherrypy/cheroot/pull/649/files#diff-7e1452f6721288523eb718a38ef7ddf3cc98854f53606f0c02057b72dbd6c5d9) is not straightforward (for example, the line Is there any safe way to apply this critical change, by itself, to v10.0.0? Alternatively, would it be possible to create a v10.0.0-patched branch that only includes critical releases for v10.0.0, so that users of cheroot don't have to wait for the next full release in order to make use of critical fixes such as this one? @webknjaz @rkachach @Kodiologist @Wissperwind @siris @DangerPoodle |
I'd argue that it's more stable than that tag, given the improved evergreen CI and extended test coverage compared to that version. You can always pin to a Git commit SHA if you want it right now.
It's still there: a72385c#diff-7e1452f6721288523eb718a38ef7ddf3cc98854f53606f0c02057b72dbd6c5d9R129. I moved the nested logic from one method into another, dedicated one. This helps look at the abstraction layers more sanely. I think that this amount of nesting was one of the factors that allowed such a bug to sneak in as reviewing highly nested changes involves tremendous mental overhead so it's more prone to human error.
There are actually a few places with related bugs fixed. It doesn't make sense to retrieve only this big and keep other bugs in.
We don't have a process for this, though the automation might be able to handle it, theoretically, if someone were to figure out backporting all the necessary bits (which to me is rather pointless waste of time). I'm doing some maintenance in the repo related to the changelog management, it's nearly complete. I also want to add a few more CI jobs and declare official support for Python 3.12. After that, I should be able to trigger the release automation. This is going to be the fastest way to release, anyway, since the automation is there (I'll need to double check a few things, though, since it's been a while; something probably needs to be adjusted) and manual non-transparent releases haven't been a thing for the past decade. |
I have no problem updating cherrypy. In which version is / will be this update included? |
Should be 11. |
We are also open to update our cherrypy to a version that contains the fix |
I'd prefer this approach as well. Would you have any rough idea of what kind of timing range that might take? Not asking in order to hold you to it, but rather so that we can assess whether we think we should bother considering doing an out-of-release patch in the meantime (I am guessing we won't want to go that way, but if you were to say e.g. >= 6 mo then that may change things). Also-- can we help to get the next release out faster? We have limited resources to spare (manpower and/or $), but are open to contributing, if there is a way to do so without getting in the way. |
@liquidaty thanks for the offer! I think that working with contributors at this point would slow this down because I'd need to coordinate and communicate a lot more than just doing what I already have in my mind. I wanted to write a few words on the state of things and the ideas suggested. Release planning πI'll need to go through the changes and open pull requests to see if anything can be closed with the recent changes and recorded properly in the change log. #654 is mandatory for this, by the way β it won't change anything for the end-users functionally but is going to be vital maintenance-wise. It is already one of my top priorities (next to my normal work commitments; actually other projects in the company want this released too). So I'm planning to cut a new release as soon as the pipeline is functional, if no other blockers emerge. It mostly needs synchronizing bits of automation I normally use in other projects with some adjustments. Helping out πI'm in a position where I have access to way more projects than time/energy which I sometimes try to communicate when context switching. Here's the most recent explanation example: aio-libs/multidict#887 (comment). This is the primary reason for the delays β it's hard to keep up and my time+energy available aren't limitless, sadly. Long-term, though, having extra people capable of looking out for the project / cutting releases would be useful. Still, I'm not sure how the practices would have to change in the post xz-fiasco world... If there's people interested in getting more involved β I'm happy show the ropes, once the release is out. Funding π°As for $$$ β both of the active maintainers are enrolled on Tidelift and have things like GitHub Sponsors/Liberapay/etc set up. People are free to show their appreciation through those. However, personally, I'd be grateful if instead those funds were donated to https://savelife.in.ua/en/donate-en/#donate-army-card-monthly, https://u24.gov.ua or one of many more charities listed here β doing so supports people like me who do open source in more ways than one can imagine. A workaround π©ΉThis should get you more or less what I'll release later. It's a reproducible solution since it's pinned to a specific Git commit SHA which is immutable. And it won't cause build problems since it's pure-python. And more importantly β this is something people can use right now. Oh, and this way it's possible to get some early feedback if it starts causing issues (it shouldn't!). $ pip install git+https://github.com/cherrypy/cheroot.git@1d28a0703dbf6e6f7fa137e6d2f72b99d0bb0377 |
@rkachach so I just published a release candidate (11.0.0rc0). It should be the same as the previous betas functionally, except for the changelog bit. Let me know if you want to test it out first β if not, I'll probably click the release button again and type in the stable version number tomorrow'ish (seeing that it's already past the midnight, I'd say I mean today...). The process is rather quick and completes under 10 minutes (https://github.com/cherrypy/cheroot/actions/runs/8713203309) β so I'll only hold it off for longer if somebody explicitly asks for more time to test. |
UPDATE FOR EVERYONE FOLLOWING THIS DISCUSSION: @rkachach and I have been talking through the work channels and we identified that the Ceph project is still running Python 3.6 and can't migrate to a newer one fast enough. As a result I made an effort to see if it'd be difficult to make a release supporting that runtime and managed to revert the incompatible changes in a separate branch. I've just released v10.0.1rc0 about 10 minutes ago with everything that was included in v11.0.0rc0 but the Python 3.6 drop reverted. The plan is that Ceph folks make some testing tomorrow and I'll cut a stable release as v10.0.1. This is a one-time exception and Python 3.6/3.7 remain dropped from the |
@webknjaz thank you very much for all the big effort and the time you spent to get this done. Really appreciated. Just to followup on the discussion I wanted to confirm that the downgrade of typing_extensions package version to |
UPD: Due to the need of more testing, the stable release is being postponed until Monday. |
I released v10.0.1 finally. Here's a few places on the internet that you may be interested in with regard to what's included:
I think I'll release v11 series with mostly maintenance changes after some more work is done later. That'll finally contain changes that mark Python 3.6/3.7 as unsupported in the metadata. And perhaps better logging around broken TLS connections. P.S. I'm going to attend PyCon US 2024 in like 3 weeks, so if any of you is around β come say hi and I'm happy to hack on the CherryPy things if anybody's up for it during the Sprints (otherwise, I'll probably be mostly involved with things relating to packaging and various mass-maintenance automation). |
β What kind of change does this PR introduce?
π What is the related issue number (starting with
#
)N/A
β What is the current behavior? (You can also link to an open issue here)
This was originally discovered by scanning our cherrypy-based server with the Qualys vulnerability scanner. We later discovered we could reproduce the issue by:
The result is this exception logged:
(Note that I am using a locally-built version of cheroot with added logging to diagnose the error, but the same issue occurs with stock
cheroot
8.5.2. Unfortunately, this does mean the line numbers might not match exactly.)This exception is followed by the death of the worker thread executing this request. Hitting refresh as many times as the number of worker threads causes the server to become unresponsive. However, it will continue to receive and queue requests, until the queued requests consume all available file descriptors. Here is what
dynpool
will log once all worker threads are dead:Note that, although the worker threads are dead,
cheroot
anddynpool
think they are still alive. CallingThreadPool._clear_dead_threads()
would clear the dead threads and allowdynpool
to recover, butclear_dead_threads
is only called fromshrink()
, anddynpool
won't callshrink()
because the number of worker threads is within limits.The above exception is being caught and handled properly in
conn.communicate()
, but the worker thread is being killed by a secondSSLError
exception:β What is the new behavior (if this is a feature change)?
With the proposed change, the SSL exceptions will still be logged, but the worker thread will survive.
π Other information:
This is my first foray into cheroot, so it's possible this isn't the right way to fix this. We can discuss, and I can retest with alternate fixes.
Both the server and the firefox browser used to trigger the problem are running on CentOS 7. The server is using python 3.6.
π Checklist:
and description in grammatically correct, complete sentences
This change isβ