lib: Memory spike reduction for sh cmds at scale #16498
Conversation
Force-pushed from 1af9950 to cff3985.
I have attached the UT logs with and without the fix.
The logs below were captured for 250k IPv4 and 250k IPv6 routes.
Force-pushed from cff3985 to 48a44b9.
Thanks for digging into this - still had some questions
Force-pushed from 48a44b9 to 794acc8.
Force-pushed from 3687a52 to 8a0b052.
Force-pushed from 8a0b052 to a8de843.
Could you fix frrbot styling?
Force-pushed from a8de843 to 1c1fb79.
Done
lib/vty.c (outdated)
	/* print without crlf replacement */
	buffer_put(vty->obuf, (uint8_t *)filtered, strlen(filtered));
	/* For every 32 * 4 KB invoke vtysh_flush where we
I'll repeat the earlier comment: please don't hard-code values like these into comments. Over time, they may become stale and confusing.
lib/vty.c (outdated)
	 * socket and free the vty->obuf data. This way, we are flushing
	 * data every 128KB for the show command.
	 */
	if (vty->vty_buf_size_written >= 32 * 4096)
repeating: please don't hard-code this threshold value in this way.
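
As a rough illustration of what's being asked for (all names here are hypothetical; the PR later introduces MAX_INTERMEDIATE_FLUSH, and a reviewer asks for a VTY_ prefix): derive the threshold from named quantities so neither the code nor the comments carry a magic number that can go stale.

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. Deriving the limit from the
 * chunk size keeps the code self-documenting; comments no longer need to
 * repeat "128 KB". */
#define VTY_OBUF_CHUNK_SIZE 4096                          /* one obuf element */
#define VTY_INTERMEDIATE_FLUSH (32 * VTY_OBUF_CHUNK_SIZE) /* 128 KB */

static bool vty_should_flush(size_t bytes_accumulated)
{
	return bytes_accumulated >= VTY_INTERMEDIATE_FLUSH;
}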
lib/vty.c (outdated)
	/* Setting the SND buffer size to 16MB to increase the buffer size
	 * the socket can hold before sending it to VTY shell
	 */
	uint32_t sndbufsize = 4096 * 4096;
again, please don't open-code the limit here
vtysh/vtysh.c (outdated)
	/* Setting the RCV buffer size to 16MB to increase the buffer size
	 * the socket can hold after receiving from other process
	 */
	uint32_t rcvbufsize = 4096 * 4096;
hard-coded value
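
For reference, a minimal sketch of how both buffer sizes might come from one named constant (VTYSH_SOCKBUF_SIZE is an assumed name; the PR later defines SEND_BUF_MAX and RCV_BUF_MAX; setsockopt() with SO_SNDBUF/SO_RCVBUF is the standard POSIX mechanism):

#include <stdint.h>
#include <sys/socket.h>

/* Assumed name for illustration; the PR uses SEND_BUF_MAX / RCV_BUF_MAX. */
#define VTYSH_SOCKBUF_SIZE (16 * 1024 * 1024) /* 16 MB */

/* Enlarge both kernel socket buffers so large show-command output can be
 * queued without blocking the daemon. Returns -1 on failure. */
static int vtysh_set_sockbuf(int sock)
{
	uint32_t bufsize = VTYSH_SOCKBUF_SIZE;

	if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &bufsize,
		       sizeof(bufsize)) < 0)
		return -1;
	return setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bufsize,
			  sizeof(bufsize));
}

Note that on Linux the kernel silently caps the requested sizes at net.core.wmem_max / net.core.rmem_max, so a 16 MB request is not guaranteed to take effect.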
@@ -2227,6 +2250,7 @@ static int vtysh_flush(struct vty *vty)
		vty_close(vty);
		return -1;
	case BUFFER_EMPTY:
		vty->vty_buf_size_written = 0;
so the counter is only reset if obuf is empty? if the write partially succeeds, we'll stay in a "flush" loop with every vty_out() call - is that ... what you want?
Yes. Resetting the counter whenever it hits 128 KB only cost an extra 2-3 MB, and we still keep the 25-30% memory reduction, so I changed it: the counter is now reset in both places. We won't be in a flush loop on every vty_out() call; the flush happens only once another 128 KB chunk has accumulated.
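
A minimal sketch of the two reset points being described (stub types; the real code manipulates struct vty in lib/vty.c and the BUFFER_EMPTY case of vtysh_flush()):

#include <stddef.h>

/* Stub stand-in for the relevant struct vty field. */
struct vty_stub {
	size_t buf_size_written; /* bytes accumulated since the last flush */
};

/* Reset point 1: the intermediate-flush path clears the counter as soon as
 * a flush is attempted, so subsequent vty_out() calls accumulate a fresh
 * 128 KB chunk instead of re-entering the flush every time. */
static void intermediate_flush(struct vty_stub *vty)
{
	/* ... hand the buffered obuf data to the socket here ... */
	vty->buf_size_written = 0;
}

/* Reset point 2: the BUFFER_EMPTY case in vtysh_flush(), once all pending
 * data has actually been drained. */
static void on_buffer_empty(struct vty_stub *vty)
{
	vty->buf_size_written = 0;
}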
@@ -2227,6 +2250,7 @@ static int vtysh_flush(struct vty *vty)
		vty_close(vty);
		return -1;
	case BUFFER_EMPTY:
		vty->vty_buf_size_written = 0;
if the buffer is empty, should we cancel a "vtysh_write" event?
Correct me if I've understood your question right: you're suggesting we should cancel the vtysh_write event when the buffer is empty?
In the existing code, vtysh_flush is called at the end of command execution (including outputs of less than 128 KB). After that, vty_close is called, which cancels the vtysh_write event, since it is the end of command execution. I thought it better to leave the existing code as is, so I did not add intermediate cancelling and restarting of the vtysh_write event.
Force-pushed from 3e84e0a to 8b9384e.
The output buffer vty->obuf is a linked list where each element is 4 KB. Currently, when a huge show command like "show ip route json" is executed at large scale, all the vty_out() calls are processed and the entire output is accumulated. Only after the entire vty execution does vtysh_flush process this data and put it on the socket (131 KB at a time). The problem is the memory spike for such heavy-duty show commands. The fix is to chunkify the output on the VTY shell by flushing intermediately for every 128 KB of output accumulated and freeing the memory allocated for the buffer data. This way, we achieve a ~25-30% reduction in the memory spike.
Signed-off-by: Rajasekar Raja <[email protected]>
Signed-off-by: Srujana <[email protected]>
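
A hedged sketch of the control flow the commit message describes (buffer_put() and vtysh_flush() are the real FRR primitives; they are stubbed here so only the accumulate-then-flush shape is visible):

#include <stddef.h>
#include <string.h>

#define FLUSH_THRESHOLD (32 * 4096) /* 128 KB, as in the commit message */

struct out_stub {
	size_t accumulated; /* bytes buffered since the last flush */
};

static void stub_buffer_put(const char *data, size_t len)
{
	(void)data;
	(void)len; /* real code appends to the vty->obuf linked list */
}

static void stub_flush(struct out_stub *out)
{
	out->accumulated = 0; /* real code writes to the socket and frees obuf */
}

/* Each "vty_out" appends to the buffer; once 128 KB has accumulated, the
 * output is flushed and the buffer memory released, bounding the spike
 * instead of holding the whole command output in memory at once. */
static void emit(struct out_stub *out, const char *line)
{
	size_t len = strlen(line);

	stub_buffer_put(line, len);
	out->accumulated += len;
	if (out->accumulated >= FLUSH_THRESHOLD)
		stub_flush(out);
}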
Force-pushed from 8b9384e to e5f4acd.
The "show bgp ipv4 detail json" command takes a lot of memory: on Linux, when dumping a full routing table, the resulting VM size of the bgp daemon increased from 10.2 GB to 19.2 GB (the measurement was done with the patch from the link below). The hypothesis is that memory management is the culprit: on Linux, pages are allocated to host memory blocks, and when those memory blocks are no longer used, any empty page should be made available again. If a page is not freed, this probably means that some memory blocks on it have not been freed. Troubleshooting shows that some data of BGP entries stores the JSON format when that usage is requested. That would imply that, for a given page, some memory blocks remain after the show and others do not. In order to separate both usages of a shared page, choose to allocate the JSON output of bgp aspath, bgp communities, and bgp large communities at startup. The number of additional allocations done at startup is measured:
- 130472 aspaths
- 18471 communities
- 1308 large communities
The memory measurement is given for information and does not show any significant improvement, since only the initial VM size has changed, from 1663832 KB to 2137220 KB.
> root@dut-sureau-nianticvf:~# ps -aux | grep bgpd
> root 7702 83.3 14.5 1663832 1485640 ? Ssl 08:14 1:56 /usr/bin/bgpd -A 127.0.0.1 -M snmp -M rpki -M bmp
> root@dut-sureau-nianticvf:~# time vtysh -c "show bgp ipv4 json detail" > /tmp/showbgpipv4detailjson.txt
>
> real 0m30.286s
> user 0m2.796s
> sys 0m5.565s
> root@dut-sureau-nianticvf:~# ps -aux | grep bgpd
> root 7702 75.7 19.2 2137220 1959064 ? Ssl 08:14 2:23 /usr/bin/bgpd -A 127.0.0.1 -M snmp -M rpki -M bmp
Link: FRRouting#16498
Signed-off-by: Philippe Guibert <[email protected]>
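
A rough sketch of the separation idea in this commit message (structure and function names are illustrative stand-ins, not the real bgpd code): building the long-lived JSON string when the attribute itself is created keeps it off heap pages that transient show-command allocations would otherwise share and then pin.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in for a BGP aspath; the real structure lives in
 * bgpd/bgp_aspath.h. */
struct aspath_stub {
	char *str;      /* human-readable form, e.g. "65001 65002" */
	char *json_str; /* JSON form, preallocated up front */
};

/* Allocate both representations together, at attribute-creation time, so
 * the JSON string's lifetime matches the aspath's and its page is never
 * shared with short-lived show-command allocations. */
static struct aspath_stub *aspath_stub_new(const char *str)
{
	struct aspath_stub *as = calloc(1, sizeof(*as));

	if (!as)
		return NULL;
	as->str = strdup(str);
	/* two quotes + string + NUL: enough for a quoted JSON scalar */
	as->json_str = malloc(strlen(str) + 3);
	if (!as->str || !as->json_str) {
		free(as->str);
		free(as->json_str);
		free(as);
		return NULL;
	}
	sprintf(as->json_str, "\"%s\"", str);
	return as;
}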
/* Vty flush intermediate size (32 * 4096). */
#define MAX_INTERMEDIATE_FLUSH 131072
if you want to put these in the public vty.h header, please follow the naming pattern and call them VTY_xxx
Will do
@@ -35,6 +35,7 @@ extern struct event_loop *master;
#define VTYSH_PATHD 0x80000
#define VTYSH_PIM6D 0x100000
#define VTYSH_MGMTD 0x200000
#define RCV_BUF_MAX 16777216
please follow the naming convention used: VTYSH_xxx
will do
@@ -2141,6 +2150,21 @@ static void vtysh_accept(struct event *thread)
		close(sock);
		return;
	}
	/* Setting the SND buffer size to 16MB to increase the buffer size
please don't embed the current value in the comment
Will do
@@ -4721,6 +4721,22 @@ static int vtysh_connect(struct vtysh_client *vclient)
		close(sock);
		return -1;
	}
	/* Setting the RCV buffer size to 16MB to increase the buffer size
please don't embed the value in the comment
Will do
	/* Setting the SND buffer size to 16MB to increase the buffer size
	 * the socket can hold before sending it to VTY shell
	 */
	uint32_t sndbufsize = SEND_BUF_MAX;
please declare the locals at the top, follow the existing style
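
For illustration, the shape being asked for (a hypothetical skeleton, not the actual vtysh_accept() body): the declaration sits at the top of the block, before any statements, matching the surrounding FRR style.

#include <stdint.h>
#include <sys/socket.h>

#define SEND_BUF_MAX 16777216

static int accept_example(int sock)
{
	/* Locals grouped at the top of the block, per the review. */
	uint32_t sndbufsize = SEND_BUF_MAX;

	return setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &sndbufsize,
			  sizeof(sndbufsize));
}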
Will do
	/* Setting the RCV buffer size to 16MB to increase the buffer size
	 * the socket can hold after receiving from other process
	 */
	uint32_t rcvbufsize = RCV_BUF_MAX;
please declare the locals at the top, follow the existing style.
Will do
	 * socket and free the vty->obuf data.
	 */
	if (vty->vty_buf_size_written >= MAX_INTERMEDIATE_FLUSH) {
		vty->vty_buf_size_written = 0;
still unclear about the meaning of this new variable.
it's set to zero here, even if nothing was written (or less than the amount of data pending), so it no longer represents the amount of data buffered?
it's set to zero (again) on a complete send.
what is it supposed to represent - it's not the "buf_size_written", right?
The idea is that we keep accumulating data (a few vty_out() calls) until we hit 128 KB. Once we hit that limit,
- we call vtysh_flush, where the data (128 KB) is written into the socket;
- we reset vty->vty_buf_size_written to 0 so we start accumulating the next 128 KB.
vty_buf_size_written is essentially there to tell me how much output has accumulated. If renaming it makes life easier, would vty_buf_size_accumulated work?
It would be good to rebase to an updated master - you're way behind, and I think the CI failure that you're seeing is fixed.
The output buffer vty->obuf is a linked list where each element is 4 KB. Currently, when a huge show command like "show ip route json" is executed at large scale, all the vty_out() calls are processed and the entire output is accumulated. Only after the entire vty execution does vtysh_flush process this data and put it on the socket (131 KB at a time). The problem is the memory spike for such heavy-duty show commands. The fix is to chunkify the output on the VTY shell by flushing it intermediately for every 128 KB of output accumulated and freeing the memory allocated for the buffer data. This way, we achieve a ~25-30% reduction in the memory spike.
Fixes: FRRouting#16498
Note: This is a continuation of MR FRRouting#16498
Signed-off-by: Srujana <[email protected]>
Signed-off-by: Rajasekar Raja <[email protected]>
Need to close this; new PR #16672 has been raised.
The output buffer vty->obuf is a linked list where each element is 4 KB. Currently, when a huge show command like "show ip route json" is executed at large scale, all the vty_out() calls are processed and the entire output is accumulated. Only after the entire vty execution does vtysh_flush process this data and put it on the socket (131 KB at a time). The problem is the memory spike for such heavy-duty show commands. The fix is to chunkify the output on the VTY shell by flushing it intermediately for every 128 KB of output accumulated and freeing the memory allocated for the buffer data. This way, we achieve a ~25-30% reduction in the memory spike.
Fixes: FRRouting#16498
Note: This is a continuation of MR FRRouting#16498
Ticket: #4050898
Signed-off-by: Srujana <[email protected]>
Signed-off-by: Rajasekar Raja <[email protected]>