
lib: Memory spike reduction for sh cmds at scale #16498

Closed

Conversation

skanchisa
Contributor

The output buffer vty->obuf is a linked list where each element is 4 KB.
Currently, when a heavy show command like “show ip route json” is executed at scale, all the vty_out() calls are processed and the entire output is accumulated in memory.
Only after the entire vty execution does vtysh_flush process this data and write it to the socket (131 KB at a time).

The problem here is the memory spike caused by such heavy-duty show commands.

The fix is to chunk the output to the VTY shell by flushing it intermediately for every 128 KB of accumulated output and freeing the memory allocated for the buffer data.

This way, we achieve a ~25-30% reduction in the memory spike.
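
A minimal sketch of the idea in C (illustrative, not the exact patch: vty_out_chunk() is a hypothetical wrapper, while buffer_put(), buffer_flush_available(), vty->obuf, and vty->wfd are the existing lib/buffer.h and lib/vty.h names, and vty_buf_size_written is the counter this PR adds):

#define VTY_INTERMEDIATE_FLUSH (32 * 4096) /* flush every ~128 KB */

static void vty_out_chunk(struct vty *vty, const char *buf, size_t len)
{
        /* Accumulate output in the 4 KB-element obuf list as before. */
        buffer_put(vty->obuf, buf, len);
        vty->vty_buf_size_written += len;

        /* Once ~128 KB has piled up, write what we can to the vtysh
         * socket and free the drained buffer elements, instead of
         * letting the whole command output accumulate in memory. */
        if (vty->vty_buf_size_written >= VTY_INTERMEDIATE_FLUSH) {
                vty->vty_buf_size_written = 0;
                buffer_flush_available(vty->obuf, vty->wfd);
        }
}

Peak buffered memory is then bounded by roughly one chunk plus whatever the socket cannot immediately absorb, which is where the enlarged socket buffers discussed later in this review come in.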

@frrbot frrbot bot added the libfrr label Jul 30, 2024
@skanchisa skanchisa force-pushed the srujana/memory_save_vty_out branch 4 times, most recently from 1af9950 to cff3985 Compare July 30, 2024 22:02
@skanchisa
Contributor Author

I have attached the UT logs with and without the fix.
Example:
For 500k IPv4 routes, buffer memory used for the command “sh ip route json”:

  • Before: 255 MB
  • Now: 199 MB

The logs below were captured for 250k IPv4 and 250k IPv6 routes:
Memory_spike_UT.xlsx

@skanchisa skanchisa force-pushed the srujana/memory_save_vty_out branch from cff3985 to 48a44b9 Compare July 31, 2024 19:21
@mjstapp mjstapp left a comment (Contributor)

Thanks for digging into this - still had some questions

@skanchisa skanchisa force-pushed the srujana/memory_save_vty_out branch from 48a44b9 to 794acc8 Compare August 1, 2024 22:16
@github-actions github-actions bot added size/M and removed size/L labels Aug 1, 2024
@skanchisa skanchisa force-pushed the srujana/memory_save_vty_out branch 3 times, most recently from 3687a52 to 8a0b052 Compare August 1, 2024 22:57
@skanchisa skanchisa requested review from mjstapp and ton31337 August 5, 2024 21:32
@skanchisa skanchisa force-pushed the srujana/memory_save_vty_out branch from 8a0b052 to a8de843 Compare August 8, 2024 17:50
@ton31337
Member

Could you fix frrbot styling?

@skanchisa skanchisa force-pushed the srujana/memory_save_vty_out branch from a8de843 to 1c1fb79 Compare August 12, 2024 17:48
@github-actions github-actions bot added the rebase PR needs rebase label Aug 12, 2024
@skanchisa
Contributor Author

> Could you fix frrbot styling?

Done

@skanchisa skanchisa closed this Aug 12, 2024
@skanchisa skanchisa reopened this Aug 12, 2024
lib/vty.c Outdated
/* print without crlf replacement */
buffer_put(vty->obuf, (uint8_t *)filtered, strlen(filtered));
/* For every 32 *4 KB invoke vtysh_flush where we
Contributor

I'll repeat the earlier comment: please don't hard-code values like these into comments. Over time, they may become stale and confusing.

lib/vty.c Outdated
* socket and free the vty->obuf data. This way, we are flushing
* data every 128KB for the show command.
*/
if (vty->vty_buf_size_written >= 32 * 4096)
Contributor

repeating: please don't hard-code this threshold value in this way.

lib/vty.c Outdated
/* Setting the SND buffer size to 16MB to increase the buffer size
* the socket can hold before sending it to VTY shell
*/
uint32_t sndbufsize = 4096 * 4096;
Contributor

again, please don't open-code the limit here

vtysh/vtysh.c Outdated
/* Setting the RCV buffer size to 16MB to increase the buffer size
 * the socket can hold after receiving from another process
*/
uint32_t rcvbufsize = 4096 * 4096;
Contributor

hard-coded value
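
For context, a minimal sketch of the socket-buffer enlargement under review, assuming the open-coded 4096 * 4096 moves behind a named constant (VTYSH_SND_BUF_MAX and vtysh_grow_sndbuf() here are illustrative; the thread below settles on SEND_BUF_MAX / RCV_BUF_MAX):

#include <stdint.h>
#include <sys/socket.h>

#define VTYSH_SND_BUF_MAX (16 * 1024 * 1024) /* 16 MB */

static int vtysh_grow_sndbuf(int sock)
{
        uint32_t sndbufsize = VTYSH_SND_BUF_MAX;

        /* Ask the kernel for a larger send buffer so flushed 128 KB
         * chunks queue in the socket rather than in vty->obuf. */
        if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &sndbufsize,
                       sizeof(sndbufsize)) < 0)
                return -1;
        return 0;
}

Note that Linux silently caps SO_SNDBUF at net.core.wmem_max for unprivileged processes, so the effective size may be smaller than requested.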

@@ -2227,6 +2250,7 @@ static int vtysh_flush(struct vty *vty)
vty_close(vty);
return -1;
case BUFFER_EMPTY:
vty->vty_buf_size_written = 0;
Contributor

so the counter is only reset if obuf is empty? if the write partially succeeds, we'll stay in a "flush" loop with every vty_out() call - is that ... what you want?

Contributor Author

Yes. Resetting only when we hit the 128 KB mark used an extra 2-3 MB, and we still get our 25-30% memory reduction, so I changed it: the counter now resets in both places, we won't be in a flush loop for every vty_out() call, and a flush is triggered only after another 128 KB chunk has accumulated.
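
A condensed sketch of the resolved behavior (buffer_flush_available() and the BUFFER_* codes are the existing lib/buffer.h API; vty_maybe_flush() is an illustrative stand-in for the vty_out() path, not a function in the patch):

static int vtysh_flush(struct vty *vty)
{
        switch (buffer_flush_available(vty->obuf, vty->wfd)) {
        case BUFFER_ERROR:
                vty_close(vty);
                return -1;
        case BUFFER_EMPTY:
                /* Fully drained: the counter also returns to zero here. */
                vty->vty_buf_size_written = 0;
                break;
        case BUFFER_PENDING:
                /* Partial write: the remainder drains on a later event. */
                break;
        }
        return 0;
}

static void vty_maybe_flush(struct vty *vty)
{
        /* Reset before triggering the intermediate flush, so a partial
         * write does not re-trigger a flush on every subsequent
         * vty_out() call. */
        if (vty->vty_buf_size_written >= MAX_INTERMEDIATE_FLUSH) {
                vty->vty_buf_size_written = 0;
                vtysh_flush(vty);
        }
}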

@@ -2227,6 +2250,7 @@ static int vtysh_flush(struct vty *vty)
vty_close(vty);
return -1;
case BUFFER_EMPTY:
vty->vty_buf_size_written = 0;
Contributor

if the buffer is empty, should we cancel a "vtysh_write" event?

Contributor Author

Correct me if I understood your question right: you are suggesting we should cancel the vtysh_write event when the buffer is empty?
In the existing code, vtysh_flush is called at the end of command execution (including for outputs smaller than 128 KB). After that, vty_close is called, which cancels the vtysh_write event since it is the end of command execution. I thought it better to leave the existing code as is rather than add intermediate cancelling and restarting of the vtysh_write event.

@skanchisa skanchisa force-pushed the srujana/memory_save_vty_out branch 2 times, most recently from 3e84e0a to 8b9384e Compare August 12, 2024 19:17
The output buffer vty->obuf is a linked list where
each element is 4 KB.
Currently, when a heavy show command like “show ip route json”
is executed at scale, all the vty_out() calls are
processed and the entire output is accumulated.
After the entire vty execution, vtysh_flush processes
this data and puts it in the socket (131 KB at a time).

The problem here is the memory spike for such heavy-duty
show commands.

The fix here is to chunk the output to the VTY shell by
flushing it intermediately for every 128 KB of output
accumulated and freeing the memory allocated for the buffer data.

This way, we achieve a ~25-30% reduction in the memory spike.

Signed-off-by: Rajasekar Raja <[email protected]>
Signed-off-by: Srujana <[email protected]>
@raja-rajasekar
Contributor

@mjstapp @ton31337 Anything pending to be addressed on this commit?

pguibert6WIND added a commit to pguibert6WIND/frr that referenced this pull request Aug 26, 2024
The show bgp ipv4 detail json command takes a lot of memory, and
on Linux, when reading a full route table, the resulting VM size of the
bgp daemon has increased from 10.2 GB to 19.2 GB (the measurement
was done with the patch from the link below).

The hypothesis is that memory management is the culprit: on Linux,
pages are allocated to host memory blocks. When those memory
blocks are no longer used, any empty page should be made available.

If a page is not freed, this probably means that some memory
blocks have not been freed. Troubleshooting shows that some
data of bgp entries store the json format when that usage is
requested. That would imply that for a given page, some memory
blocks will remain after the show, and others not.

In order to separate both usages on a shared page, choose to
allocate the json output of bgp aspath, bgp communities, and
bgp large communities at startup.

The number of additional allocations done at startup is measured:
- 130472 aspaths
- 18471 communities
- 1308 large communities

The memory measurement is given for information and does not show
any significant improvement, since only the initial VM size has
changed from 10.2 to 14.5 GB.

> root@dut-sureau-nianticvf:~# ps -aux | grep bgpd
> root        7702 83.3 14.5 1663832 1485640 ?     Ssl  08:14   1:56 /usr/bin/bgpd -A 127.0.0.1 -M snmp -M rpki -M bmp
> root@dut-sureau-nianticvf:~# time vtysh -c "show bgp ipv4 json detail" > /tmp/showbgpipv4detailjson.txt
>
> real    0m30.286s
> user    0m2.796s
> sys     0m5.565s
> root@dut-sureau-nianticvf:~# ps -aux | grep bgpd
> root        7702 75.7 19.2 2137220 1959064 ?     Ssl  08:14   2:23 /usr/bin/bgpd -A 127.0.0.1 -M snmp -M rpki -M bmp

Link: FRRouting#16498

Signed-off-by: Philippe Guibert <[email protected]>
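
A generic illustration of the lifetime-segregation idea this commit describes (struct attr_entry and render_json() are hypothetical stand-ins, not the real bgpd structures): long-lived JSON renderings are created once at startup, so the pages backing them are never interleaved with short-lived show-command allocations that would otherwise keep those pages pinned.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct attr_entry {
        char *str;  /* canonical string form, built when the entry is interned */
        char *json; /* JSON form, normally rendered lazily on first use */
};

/* Stand-in for the real per-attribute JSON renderers in bgpd. */
static char *render_json(const char *str)
{
        size_t len = strlen(str) + 3;
        char *out = malloc(len);

        if (out)
                snprintf(out, len, "\"%s\"", str);
        return out;
}

/* Called for every cached aspath/community entry at daemon startup,
 * instead of on the first "show ... json". */
static void attr_prerender_json(struct attr_entry *e)
{
        if (!e->json)
                e->json = render_json(e->str);
}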

/* Vty flush intermediate size (32 * 4096). */
#define MAX_INTERMEDIATE_FLUSH 131072

Contributor

if you want to put these in the public vty.h header, please follow the naming pattern and call them VTY_xxx

Contributor

Will do
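
The likely shape of that rename (the VTY_ prefix is the reviewer's suggested pattern; the final name is not shown in this thread):

/* In the public vty.h header, following the existing VTY_xxx pattern: */
#define VTY_MAX_INTERMEDIATE_FLUSH (32 * 4096)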

@@ -35,6 +35,7 @@ extern struct event_loop *master;
#define VTYSH_PATHD 0x80000
#define VTYSH_PIM6D 0x100000
#define VTYSH_MGMTD 0x200000
#define RCV_BUF_MAX 16777216
Contributor

please follow the naming convention used VTYSH_xxx

Contributor

will do

@@ -2141,6 +2150,21 @@ static void vtysh_accept(struct event *thread)
close(sock);
return;
}
/* Setting the SND buffer size to 16MB to increase the buffer size
Contributor

please don't embed the current value in the comment

Contributor

Will do

@@ -4721,6 +4721,22 @@ static int vtysh_connect(struct vtysh_client *vclient)
close(sock);
return -1;
}
/* Setting the RCV buffer size to 16MB to increase the buffer size
Contributor

please don't embed the value in the comment

Contributor

Will do

/* Setting the SND buffer size to 16MB to increase the buffer size
* the socket can hold before sending it to VTY shell
*/
uint32_t sndbufsize = SEND_BUF_MAX;
Contributor

please declare the locals at the top, follow the existing style

Contributor

Will do

/* Setting the RCV buffer size to 16MB to increase the buffer size
 * the socket can hold after receiving from another process
*/
uint32_t rcvbufsize = RCV_BUF_MAX;
Contributor

please declare the locals at the top, follow the existing style.

Contributor

Will do

* socket and free the vty->obuf data.
*/
if (vty->vty_buf_size_written >= MAX_INTERMEDIATE_FLUSH) {
vty->vty_buf_size_written = 0;
Contributor

still unclear about the meaning of this new variable.
it's set to zero here, even if nothing was written (or less than the amount of data pending), so it no longer represents the amount of data buffered?
it's set to zero (again) on a complete send.
what is it supposed to represent - it's not the "buf_size_written", right?

Contributor

The idea is that we keep accumulating data (a few vty_out() calls) until we hit 128 KB (32 × 4 KB), and once we hit that limit,

  • we call vtysh_flush, where the data (128 KB) is written into the socket
  • we reset vty->vty_buf_size_written to 0 so we can accumulate another 128 KB.

vty_buf_size_written essentially tracks how much output has accumulated. If renaming it makes life easier, would vty_buf_size_accumulated work?

@mjstapp
Contributor

mjstapp commented Aug 26, 2024

It would be good to rebase to an updated master - you're way behind, and I think the CI failure that you're seeing is fixed.

raja-rajasekar pushed a commit to raja-rajasekar/frr that referenced this pull request Aug 27, 2024
The output buffer vty->obuf is a linked list where
each element is 4 KB.
Currently, when a heavy show command like <show ip route json>
is executed at scale, all the vty_out() calls are
processed and the entire output is accumulated.
After the entire vty execution, vtysh_flush processes
this data and puts it in the socket (131 KB at a time).

The problem here is the memory spike for such heavy-duty
show commands.

The fix here is to chunk the output to the VTY shell by
flushing it intermediately for every 128 KB of output
accumulated and freeing the memory allocated for the buffer data.

This way, we achieve a ~25-30% reduction in the memory spike.

Fixes: FRRouting#16498
Note: This is a continuation of MR FRRouting#16498

Signed-off-by: Srujana <[email protected]>

Signed-off-by: Rajasekar Raja <[email protected]>
@raja-rajasekar
Contributor

raja-rajasekar commented Aug 27, 2024

Need to close this; new PR #16672 has been raised.

@ton31337 ton31337 closed this Aug 28, 2024
donaldsharp pushed a commit to donaldsharp/frr that referenced this pull request Sep 20, 2024
The output buffer vty->obuf is a linked list where
each element is 4 KB.
Currently, when a heavy show command like <show ip route json>
is executed at scale, all the vty_out() calls are
processed and the entire output is accumulated.
After the entire vty execution, vtysh_flush processes
this data and puts it in the socket (131 KB at a time).

The problem here is the memory spike for such heavy-duty
show commands.

The fix here is to chunk the output to the VTY shell by
flushing it intermediately for every 128 KB of output
accumulated and freeing the memory allocated for the buffer data.

This way, we achieve a ~25-30% reduction in the memory spike.

Fixes: FRRouting#16498
Note: This is a continuation of MR FRRouting#16498

Ticket :#4050898

Signed-off-by: Srujana <[email protected]>

Signed-off-by: Rajasekar Raja <[email protected]>