Show session and link details for AMQP 1.0 connection #12670
Merged
ansd force-pushed the amqp-connection-sessions branch 2 times, most recently from b9cf16a to beb6749 (November 6, 2024 16:47)
ansd force-pushed the amqp-connection-sessions branch 9 times, most recently from b8cdf6d to 84d5c4a (November 7, 2024 12:19)
## What?

On the connection page in the Management UI, display detailed session and link information including:

* Link names
* Link target and source addresses
* Link flow control state
* Session flow control state
* Number of unconfirmed and unacknowledged messages

## How?

A new HTTP API endpoint is added:

```
/connections/:connection_name/sessions
```

The HTTP handler first queries the Erlang connection process to find out about all session Pids. The handler then queries each Erlang session process of this connection.

(The table auto-refreshes by default every 5 seconds. The handler querying a single connection with 60 idle sessions, each with 250 links, takes ~100 ms.)

For better user experience in the Management UI, this commit also makes the session process store and expose link names as well as source/target addresses.
This commit fixes two different bugs/crashes.

To repro, prior to this commit:

1. Create an AMQP 1.0 connection on node-1.
2. Open the Management UI on node-2 and open the connection page of this single AMQP 1.0 connection.

The first crash was the following:

```
[error] <0.1297.0> crasher:
[error] <0.1297.0> initial call: cowboy_stream_h:request_process/3
[error] <0.1297.0> pid: <0.1297.0>
[error] <0.1297.0> registered_name: []
[error] <0.1297.0> exception error: no case clause matching
[error] <0.1297.0> {badrpc,
[error] <0.1297.0> {'EXIT',
[error] <0.1297.0> {undef,
[error] <0.1297.0> [{rabbit_connection_tracking,lookup,
[error] <0.1297.0> [<<"[::1]:51729 -> [::1]:5672">>,
[error] <0.1297.0> ['rabbit-1@ABCDDDEEAA']],
[error] <0.1297.0> []}]}}}
[error] <0.1297.0> in function rabbit_connection_tracking:lookup/2 (rabbit_connection_tracking.erl, line 235)
[error] <0.1297.0> in call from rabbit_mgmt_wm_connection_sessions:conn/1 (rabbit_mgmt_wm_connection_sessions.erl, line 72)
[error] <0.1297.0> in call from rabbit_mgmt_wm_connection_sessions:is_authorized/2 (rabbit_mgmt_wm_connection_sessions.erl, line 63)
[error] <0.1297.0> in call from cowboy_rest:call/3 (src/cowboy_rest.erl, line 1590)
[error] <0.1297.0> in call from cowboy_rest:is_authorized/2 (src/cowboy_rest.erl, line 368)
[error] <0.1297.0> in call from cowboy_rest:upgrade/4 (src/cowboy_rest.erl, line 284)
[error] <0.1297.0> in call from cowboy_stream_h:execute/3 (src/cowboy_stream_h.erl, line 306)
[error] <0.1297.0> in call from cowboy_stream_h:request_process/3 (src/cowboy_stream_h.erl, line 295)
```

The second crash was the following:

```
[error] <0.1132.0> crasher:
[error] <0.1132.0> initial call: cowboy_stream_h:request_process/3
[error] <0.1132.0> pid: <0.1132.0>
[error] <0.1132.0> registered_name: []
[error] <0.1132.0> exception error: no case clause matching
[error] <0.1132.0> {tracked_connection,
[error] <0.1132.0> {'rabbit-1@ABCDDDEEAA',
[error] <0.1132.0> <<"[::1]:65505 -> [::1]:5672">>},
[error] <0.1132.0> 'rabbit-1@ABCDDDEEAA',<<"/">>,
[error] <0.1132.0> <<"[::1]:65505 -> [::1]:5672">>,<13661.1110.0>,
[error] <0.1132.0> {1,0},
[error] <0.1132.0> network,
[error] <0.1132.0> {0,0,0,0,0,0,0,1},
[error] <0.1132.0> 65505,<<"guest">>,1730908606089}
[error] <0.1132.0> in function rabbit_connection_tracking:lookup/2 (rabbit_connection_tracking.erl, line 235)
[error] <0.1132.0> in call from rabbit_mgmt_wm_connection_sessions:conn/1 (rabbit_mgmt_wm_connection_sessions.erl, line 72)
[error] <0.1132.0> in call from rabbit_mgmt_wm_connection_sessions:is_authorized/2 (rabbit_mgmt_wm_connection_sessions.erl, line 63)
[error] <0.1132.0> in call from cowboy_rest:call/3 (src/cowboy_rest.erl, line 1590)
[error] <0.1132.0> in call from cowboy_rest:is_authorized/2 (src/cowboy_rest.erl, line 368)
[error] <0.1132.0> in call from cowboy_rest:upgrade/4 (src/cowboy_rest.erl, line 284)
[error] <0.1132.0> in call from cowboy_stream_h:execute/3 (src/cowboy_stream_h.erl, line 306)
[error] <0.1132.0> in call from cowboy_stream_h:request_process/3 (src/cowboy_stream_h.erl, line 295)
```
ansd force-pushed the amqp-connection-sessions branch from 84d5c4a to 124ef69 (November 7, 2024 14:12)
michaelklishin approved these changes (Nov 7, 2024)
I agree that a feature flag to avoid a logged exception would be overkill. If anything, we can focus on adjusting `v4.0.x` accordingly if really necessary.
ansd added a commit that referenced this pull request (Nov 11, 2024)
This commit fixes two different bugs/crashes.

To repro, prior to this commit on `main`:

1. Create an AMQP 1.0 connection on node-1.
2. Open the Management UI on node-2 and open the connection page of this single AMQP 1.0 connection.

The first crash was the following:

```
[error] <0.1297.0> crasher:
[error] <0.1297.0> initial call: cowboy_stream_h:request_process/3
[error] <0.1297.0> pid: <0.1297.0>
[error] <0.1297.0> registered_name: []
[error] <0.1297.0> exception error: no case clause matching
[error] <0.1297.0> {badrpc,
[error] <0.1297.0> {'EXIT',
[error] <0.1297.0> {undef,
[error] <0.1297.0> [{rabbit_connection_tracking,lookup,
[error] <0.1297.0> [<<"[::1]:51729 -> [::1]:5672">>,
[error] <0.1297.0> ['rabbit-1@ABCDDDEEAA']],
[error] <0.1297.0> []}]}}}
[error] <0.1297.0> in function rabbit_connection_tracking:lookup/2 (rabbit_connection_tracking.erl, line 235)
[error] <0.1297.0> in call from rabbit_mgmt_wm_connection_sessions:conn/1 (rabbit_mgmt_wm_connection_sessions.erl, line 72)
[error] <0.1297.0> in call from rabbit_mgmt_wm_connection_sessions:is_authorized/2 (rabbit_mgmt_wm_connection_sessions.erl, line 63)
[error] <0.1297.0> in call from cowboy_rest:call/3 (src/cowboy_rest.erl, line 1590)
[error] <0.1297.0> in call from cowboy_rest:is_authorized/2 (src/cowboy_rest.erl, line 368)
[error] <0.1297.0> in call from cowboy_rest:upgrade/4 (src/cowboy_rest.erl, line 284)
[error] <0.1297.0> in call from cowboy_stream_h:execute/3 (src/cowboy_stream_h.erl, line 306)
[error] <0.1297.0> in call from cowboy_stream_h:request_process/3 (src/cowboy_stream_h.erl, line 295)
```

The second crash was the following:

```
[error] <0.1132.0> crasher:
[error] <0.1132.0> initial call: cowboy_stream_h:request_process/3
[error] <0.1132.0> pid: <0.1132.0>
[error] <0.1132.0> registered_name: []
[error] <0.1132.0> exception error: no case clause matching
[error] <0.1132.0> {tracked_connection,
[error] <0.1132.0> {'rabbit-1@ABCDDDEEAA',
[error] <0.1132.0> <<"[::1]:65505 -> [::1]:5672">>},
[error] <0.1132.0> 'rabbit-1@ABCDDDEEAA',<<"/">>,
[error] <0.1132.0> <<"[::1]:65505 -> [::1]:5672">>,<13661.1110.0>,
[error] <0.1132.0> {1,0},
[error] <0.1132.0> network,
[error] <0.1132.0> {0,0,0,0,0,0,0,1},
[error] <0.1132.0> 65505,<<"guest">>,1730908606089}
[error] <0.1132.0> in function rabbit_connection_tracking:lookup/2 (rabbit_connection_tracking.erl, line 235)
[error] <0.1132.0> in call from rabbit_mgmt_wm_connection_sessions:conn/1 (rabbit_mgmt_wm_connection_sessions.erl, line 72)
[error] <0.1132.0> in call from rabbit_mgmt_wm_connection_sessions:is_authorized/2 (rabbit_mgmt_wm_connection_sessions.erl, line 63)
[error] <0.1132.0> in call from cowboy_rest:call/3 (src/cowboy_rest.erl, line 1590)
[error] <0.1132.0> in call from cowboy_rest:is_authorized/2 (src/cowboy_rest.erl, line 368)
[error] <0.1132.0> in call from cowboy_rest:upgrade/4 (src/cowboy_rest.erl, line 284)
[error] <0.1132.0> in call from cowboy_stream_h:execute/3 (src/cowboy_stream_h.erl, line 306)
[error] <0.1132.0> in call from cowboy_stream_h:request_process/3 (src/cowboy_stream_h.erl, line 295)
```

This commit is a partial backport of #12670 (cherry picked from commit 124ef69)
What?
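On the connection page in the Management UI, display detailed session and link information including:

* Link names
* Link target and source addresses
* Link flow control state
* Session flow control state
* Number of unconfirmed and unacknowledged messages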
The next screenshot shows quiver at full speed with two connections against a single node. On the incoming link, we see nicely how 255 messages are awaiting confirms from the target classic queue, with RabbitMQ preventing the publisher from sending more messages by granting 0 link credits (in AMQP 0.9.1 the connection would have been in `flow` state instead):

How?
A new HTTP API endpoint is added: `/connections/:connection_name/sessions`
The HTTP handler first queries the Erlang connection process to find out about all session Pids. The handler then queries each Erlang session process of this connection.
(The table auto-refreshes by default every 5 seconds. The handler querying a single connection with 60 idle sessions, each with 250 links, takes ~100 ms.)
For better user experience in the Management UI, this commit also makes the session process store and expose link names as well as source/target addresses.
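As a rough illustration of how a client might call the new endpoint, here is a minimal Python sketch. It assumes the endpoint is served under the Management plugin's usual `/api` prefix and accepts HTTP basic auth; the helper names `sessions_url` and `fetch_sessions`, the base URL, and the credentials are hypothetical. The shape of the JSON response is not described in this PR, so the sketch returns the parsed body as-is.

```python
import base64
import json
import urllib.parse
import urllib.request


def sessions_url(base_url: str, connection_name: str) -> str:
    """Build the sessions URL for one connection (hypothetical helper).

    Connection names such as "[::1]:51729 -> [::1]:5672" contain spaces,
    colons, and brackets, so the name is percent-encoded as one path segment.
    """
    encoded = urllib.parse.quote(connection_name, safe="")
    # Assumption: the endpoint lives under the usual "/api" prefix.
    return f"{base_url.rstrip('/')}/api/connections/{encoded}/sessions"


def fetch_sessions(base_url, connection_name, user, password, timeout=5.0):
    """GET the sessions of one AMQP 1.0 connection and return the parsed JSON."""
    request = urllib.request.Request(sessions_url(base_url, connection_name))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, `fetch_sessions("http://localhost:15672", "[::1]:51729 -> [::1]:5672", "guest", "guest")` would query a local node with default credentials. A 500 response, as in the mixed-version case, surfaces as `urllib.error.HTTPError`.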
Mixed version notes
Note that due to the bugs described in the 2nd commit message of this PR and present on <= `v4.0.3`, opening the connection page of an AMQP 1.0 connection in a mixed version cluster on a node >= v4.1.0 with other nodes running on <= v4.0.3 results in a 500 error and a crash logged by the RabbitMQ node.

This seems acceptable. Introducing a feature flag for this rare and harmless case seems to be overkill.
Also note that we partially backported this PR to `v4.0.x` in #12700. This means that the Management UI will only return error code 500, with a corresponding RabbitMQ log entry, if the AMQP 1.0 connection is opened against a >= 4.0.4 (but < 4.1) node while the connection page is opened in the Management UI on a >= 4.1 node.