Pools leaked at backtrace - question #36
Comments
@pavelnemirovsky Seems like your nginx binary lacks debug symbols (aka debuginfo)? |
@pavelnemirovsky BTW, official OpenResty or NGINX distributions always build with debug symbols out of the box (note: it has nothing to do with debugging builds with |
Thanks a lot agentzh, that's my concern, since these are the original OpenResty packages installed from your repo:
|
Looks like here is the problem:
|
@pavelnemirovsky Okay, seems like you have already installed the I guess your Please try the following patch for
diff --git a/ngx-leaked-pools b/ngx-leaked-pools
index ec39086..5fc02e5 100755
--- a/ngx-leaked-pools
+++ b/ngx-leaked-pools
@@ -108,7 +108,7 @@ probe end {
     hits = 0
     foreach (bt in btcount- limit 10) {
-        printf("%d pools leaked at backtrace %s\\n", btcount[bt], bt)
+        printf("%d pools leaked at backtrace:\\n%s\\n", btcount[bt], print_ustack(bt))
         hits++
     }
Please let me know if this patch works for you. Thanks! |
Done
|
[root@server nginx-systemtap-toolkit]# rpm -q -a | grep systemtap |
@pavelnemirovsky Strange. Maybe you should try building the latest version of systemtap and elfutils from source? |
Maybe this is the problem?
|
@pavelnemirovsky This doc might be helpful. But you also need to update the version numbers there: https://openresty.org/en/build-systemtap.html BTW, also please ensure that you have installed those kernel-* packages matching your current running kernel: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/5/html/SystemTap_Beginners_Guide/using-systemtap.html |
I actually followed the guide in the first place:
|
@pavelnemirovsky BTW, you can also use
Your current gdb session needs to get attached to one of your nginx processes first though. |
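For readers following along, here is a minimal sketch of how a raw backtrace (a list of hex addresses) could be turned into gdb "info symbol" commands, one per address. The helper name bt_to_gdb_commands and the file name cmds.gdb are made up for illustration; they are not part of the toolkit.

```shell
# Hypothetical helper, not part of nginx-systemtap-toolkit.
# stdin:  a raw backtrace, i.e. hex addresses separated by spaces or tabs
# stdout: one gdb "info symbol" command per address
bt_to_gdb_commands() {
    # Put each address on its own line, keep only well-formed hex
    # addresses, then prefix each one with the gdb command.
    tr -s ' \t' '\n\n' | grep -E '^0x[0-9a-fA-F]+$' | sed 's/^/info symbol /'
}

# Possible use (the pid is a placeholder for one of your nginx processes):
#   echo "0x41e5da" | bt_to_gdb_commands > cmds.gdb
#   gdb -p <pid> -batch -x cmds.gdb
```

The commands only resolve to meaningful names if gdb can find the debug symbols for the attached process.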
Thanks a lot, that's what I get: (gdb) info symbol 0x41e5da |
@pavelnemirovsky Good, gdb works, to some extent. But the backtraces are still incomplete. The debuginfo does not really take effect here. Strange. BTW, your backtrace above is a false positive due to the sampling window boundaries of your last run of the |
I think you should give more details about your operating system. Is it CentOS? CentOS 6.x? What version? |
Great, here it is: CentOS release 6.8 (Final) |
This concerns me:
Just compiled stap with elf
[root@server systemtap-3.0]# ./stap -v -e 'probe vfs.read {printf("read performed\n"); exit()}'
semantic error: probe point mismatch (similar: procfs, end, never, perf, begin): identifier 'vfs' at :1:7
Pass 2: analyzed script: 0 probes, 0 functions, 0 embeds, 0 globals using 131048virt/7116res/2424shr/4936data kb, in 0usr/10sys/3real ms.
|
@pavelnemirovsky Maybe your running kernel does not match the kernel-debuginfo* packages you have installed. Check with the |
|
@pavelnemirovsky Indeed. You have no kernel-debuginfo-2.6.32-642.6.1.el6.x86_64 installed in your system. You should check it out yourself. I don't really have the time to do everything for you. |
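A minimal sketch of the check described above, assuming an RPM-based system where `rpm -qa` prints full package names and `uname -r` prints the running kernel release. The function name has_matching_kernel_debuginfo is made up for illustration. It matches whole lines only, so a similarly named package such as kernel-debug-debuginfo does not count.

```shell
# Hypothetical helper: succeed only if the kernel-debuginfo package for the
# given kernel release appears in the list of installed packages.
# $1:    kernel release, as printed by `uname -r`
# stdin: installed package names, one per line, as printed by `rpm -qa`
has_matching_kernel_debuginfo() (
    release=$1
    # -x matches the whole line and -F treats the name as a literal string,
    # so kernel-debug-debuginfo-<release> is not mistaken for a match.
    grep -qxF "kernel-debuginfo-$release"
)

# Possible use:
#   rpm -qa | has_matching_kernel_debuginfo "$(uname -r)" \
#       && echo "matching kernel-debuginfo found" \
#       || echo "matching kernel-debuginfo missing"
```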
This package was missing; I ran "yum install kernel-debuginfo-2.6.32-642.6.1.el6.x86_64", but I still see the same result. |
@pavelnemirovsky Then you should really ask on the systemtap mailing list instead. It's already OT here. |
We were in sync; I missed it on the CLI because of "kernel-debug-debuginfo" |
I will, but systemtap looks fine to me right now; this one still doesn't work.
|
@pavelnemirovsky If your addr2line does not support separate debuginfo files, then there is nothing we could do to make it work in If it's that the debuginfo package of |
Got you, checking ... |
Looks like this is the issue: addr2line is looking in a different place from where rpm installs it ...
|
Meanwhile, I had to do this on my machine: "ln -s /usr/lib/debug/ /usr/lib64/debug". Do you want me to submit a PR that changes the spec file to support the lib64 path for openresty-debuginfo? |
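The workaround above can be sketched as a small check that only suggests the symlink when it is actually needed. The function name suggest_debug_symlink is made up for illustration, and the two paths are simply the ones quoted in this thread; whether a given addr2line build searches the second path is an assumption to verify on your own system.

```shell
# Hypothetical helper: print the `ln -s` command from this thread, but only
# when the debuginfo directory exists and the searched path does not.
# $1: directory where the debuginfo files actually live
# $2: directory the tool is believed to search
suggest_debug_symlink() (
    have=$1
    want=$2
    if [ -d "$have" ] && [ ! -e "$want" ] && [ ! -L "$want" ]; then
        printf 'ln -s %s %s\n' "$have" "$want"
    fi
)

# Possible use with the paths from this thread:
#   suggest_debug_symlink /usr/lib/debug /usr/lib64/debug
```

Printing the command instead of running it keeps the helper free of side effects, so the result can be reviewed before anything under /usr is changed.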
Now I see the proper backtrace. Thank you so much; I really appreciate your help and will definitely support your project. Regarding the output: we constantly see that the nginx workers are leaking memory, meaning that process memory consumption keeps increasing. We use a few packages that were installed without debug info; I'll prepare a proper output later to resolve the missing backtrace symbols.
|
@pavelnemirovsky There are no real leaked memory pools according to your output. You can increase your sampling time window to see if the counts in the output increase proportionally; otherwise it's just noise due to your time window boundaries. |
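A rough sketch of the proportionality check suggested above, comparing the leaked-pool count of one backtrace across two runs with different sampling windows. The function name leak_verdict and the "at least half of proportional growth" threshold are assumptions chosen for illustration, not something the toolkit defines.

```shell
# Hypothetical helper: compare the count for the same backtrace from a short
# run (c1 pools in t1 seconds) and a longer run (c2 pools in t2 seconds).
# Usage: leak_verdict c1 t1 c2 t2
leak_verdict() (
    c1=$1 t1=$2 c2=$3 t2=$4
    # Proportional growth means c2/t2 == c1/t1, i.e. c2*t1 == c1*t2.
    # Cross-multiplying keeps everything in integer arithmetic. The factor 2
    # accepts anything that reaches at least half of proportional growth;
    # this threshold is an arbitrary assumption.
    if [ "$c2" -gt 0 ] && [ $(( c2 * t1 * 2 )) -ge $(( c1 * t2 )) ]; then
        echo "possible leak"
    else
        echo "likely noise"
    fi
)

# Example: 3 pools in a 5 second window, then 4 pools in a 50 second window.
#   leak_verdict 3 5 4 50
```

A count that stays nearly flat while the window grows tenfold points to pools that were simply still alive at the window boundary.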
@pavelnemirovsky It's important to find out if your nginx worker's RES memory size can indeed grow without bound. Otherwise the freed memory may just get cached by your glibc library in userland for future allocations, or simply cannot be returned to the OS for other reasons (like memory fragmentation).
The next version of OpenResty (1.11.2.2 RC1 atm) introduces the lua_malloc_trim directive, which often makes the memory footprint much smaller after traffic peaks than previous versions (including 1.11.2.1) by forcing glibc to return cached free memory back to the OS. If it's due to glibc caching or memory fragmentation, then such high memory usage is not really a leak.
For tracking real memory leaks, the following tools are more useful than
https://github.com/openresty/stapxx#sample-bt-leaks
https://github.com/openresty/stapxx#lj-gc
https://github.com/openresty/stapxx#lj-gc-objs |
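One way to see whether a worker's resident memory really grows without bound is to sample it over time. This is a minimal sketch that assumes a Linux /proc filesystem; the function name rss_kb and the 60 second interval are made up for illustration.

```shell
# Hypothetical helper: extract the resident set size in kB.
# stdin:  the contents of /proc/<pid>/status
# stdout: the numeric VmRSS value in kB
rss_kb() {
    awk '/^VmRSS:/ { print $2 }'
}

# Possible use (the pid is a placeholder for one nginx worker process):
#   while sleep 60; do
#       printf '%s %s\n' "$(date +%s)" "$(rss_kb < /proc/<pid>/status)"
#   done
```

If the sampled values level off after traffic peaks, the growth is more likely allocator caching or fragmentation than a leak.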
@pavelnemirovsky The |
Thanks again agentzh, your input is extremely valuable. Actually, I tested the lua_malloc_trim functionality as part of "lua-nginx-module 0.10.7", but I didn't notice any change in terms of memory consumption or release; maybe I am missing something. We observe that NGINX workers start to grow right after startup and allocate more and more memory. When we reach 2.2 GB per worker, the GC cycle appears to work more aggressively in terms of CPU consumption, so we manually restart the whole nginx service at 4-hour intervals. You can see below how this looks on a graph. As for the rest of the things you proposed, I'll try to gather more info and will reply later. |
Here is the output of the 3 tools you proposed:
sample-bt-leaks.sxx looks as below: |
I'm afraid we are already well off topic here. |
Yes, I agree. I apologize; I'll open another one |
Can you help me to see the mistake?
[root@1_201 sh]# ps aux | grep nginx
[root@1_201 sh]# ls -l /data/app/game_platform_union/lua/resty/utils/db/mysql_help.lua
Why does it hint that this does not exist? |
Hello,
Can you please tell me why I was unable to identify the leaked pool using "ngx-backtrace"? The output appears as below. We are running the latest version of OpenResty, not the debug one!
Thanks in advance,
P