siege hangs sometimes #4
This is likely a kernel problem, see:
|
Similar problem here: siege hangs after the log file notification, waiting on a futex (why does it need that after dumping the stats and writing the logfile?). This was the first time it got stuck after ~20 normal runs with the same parameters. debian10. (Full strace and pstack output is quoted in the reply below.)
|
This was fixed a while ago. It looks like you're running v4.0.4. The latest
version is v4.0.7.
https://github.com/JoeDog/siege
…On Mon, Feb 8, 2021 at 4:02 PM Mark-O. Wolter ***@***.***> wrote:
Similar problem here, siege hangs *after* the log file notification,
waiting on futex (*why* does it need that after dumping the stats and
writing the logfile?):
# strace -p 27807
strace: Process 27807 attached
futex(0x558c41f0e408, FUTEX_WAIT_PRIVATE, 2, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
Name: siege
Umask: 0022
State: S (sleeping)
Tgid: 27807
Ngid: 0
Pid: 27807
PPid: 9104
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 256
Groups: 0
NStgid: 27807
NSpid: 27807
NSpgid: 27807
NSsid: 9104
VmPeak: 7591400 kB
VmSize: 6739116 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 300748 kB
VmRSS: 252644 kB
RssAnon: 247216 kB
RssFile: 5428 kB
RssShmem: 0 kB
VmData: 310732 kB
VmStk: 132 kB
VmExe: 156 kB
VmLib: 3748 kB
VmPTE: 800 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
CoreDumping: 0
Threads: 1
SigQ: 2/257265
SigPnd: 0000000000000000
ShdPnd: 0000000000000003
SigBlk: 0000000000007003
SigIgn: 0000000000000000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: 0000003fffffffff
CapEff: 0000003fffffffff
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Speculation_Store_Bypass: thread vulnerable
Cpus_allowed: ffffffff
Cpus_allowed_list: 0-31
Mems_allowed: 00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 31
nonvoluntary_ctxt_switches: 3
# pstack 27807
27807: siege -f caching-liste --log=siegelog -i -c 100 -t 2M
(No symbols found)
0x7fe5344c629c: ???? (558c41f0e440, 2, 100006cda, 0, 0, 0) + 2eeb50
0x00000000: ???? (558c4224d8a0, 31, 100002800, 0, 558c421fcfd0, 0) + 2a58f1b51780
0x558c42316fa0: ???? (7fe533d4e740, 1, 0, d9390842b5ab500, 7abd2b5425136d8d, 0) + ffffffffffffffc0
0x7fe533d4f0a0: ???? (7fe533d4e740, 1, 0, d9390842b5ab500, 7abd2b5425136d8d, 0) + ffffffffffffffc0
0x7fe533d4f0a0: ???? (7fe533d4e740, 1, 0, d9390842b5ab500, 7abd2b5425136d8d, 0) + ffffffffffffffc0
0x7fe533d4f0a0: ???? (7fe533d4e740, 1, 0, d9390842b5ab500, 7abd2b5425136d8d, 0) + ffffffffffffffc0
(repeated until ctrl-c)
# cat /proc/27807/stack
[<0>] futex_wait_queue_me+0xc1/0x120
[<0>] futex_wait+0x13f/0x240
[<0>] do_futex+0x3f6/0xbe0
[<0>] __x64_sys_futex+0x143/0x180
[<0>] do_syscall_64+0x53/0x110
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] 0xffffffffffffffff
This was the first time it got stuck after ~20 normal runs with the same
parameters.
Had to kill it with -9.
debian10
siege 4.0.4-1
4.19.0-13-amd64 #1 SMP Debian 4.19.160-2 (2020-11-28)
--
Jeff Fulmer
1-717-799-8226
https://www.joedog.org/
He codes
|
[root@localhost home]# siege -v
Copyright (C) 2022 by Jeffrey Fulmer, et al.
[root@localhost home]# uname -r
[root@localhost home]# strace -p 1000474
strace: Process 1000474 detached
[root@localhost home]# pstack 1000474
[root@localhost home]# cat /proc/1000474/status
(Full output for each command is quoted in the reply below.)
|
In your siege source directory, what's the value of HAVE_LOCALTIME_R in
include/config.h?
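For context on why that macro matters: the pstack output below shows write_to_log() blocked in glibc's __tz_convert(), which localtime() reaches while glibc holds a process-wide timezone lock. When HAVE_LOCALTIME_R is defined, siege can call the reentrant localtime_r() instead, which at least avoids the shared static result buffer. A rough sketch of the difference (an illustrative timestamp helper, not siege's actual logging code):

#include <stdio.h>
#include <time.h>

/* Illustrative helper only; siege's write_to_log() differs. */
void log_timestamp(FILE *log) {
  time_t now = time(NULL);
  char stamp[32];
#ifdef HAVE_LOCALTIME_R
  struct tm tm_buf;
  struct tm *t = localtime_r(&now, &tm_buf); /* caller-supplied buffer */
#else
  struct tm *t = localtime(&now);            /* shared static buffer */
#endif
  if (t != NULL && strftime(stamp, sizeof(stamp), "%Y-%m-%d %H:%M:%S", t) > 0)
    fprintf(log, "%s\n", stamp);
}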
…On Sun, Apr 23, 2023 at 7:32 PM Rajendrakumar Chinnaiyan < ***@***.***> wrote:
***@***.*** home]# siege -v
*SIEGE 4.1.2*
Usage: siege [options]
siege [options] URL
siege -g URL
Options:
-V, --version VERSION, prints the version number.
-h, --help HELP, prints this section.
-C, --config CONFIGURATION, show the current config.
-v, --verbose VERBOSE, prints notification to screen.
-q, --quiet QUIET turns verbose off and suppresses output.
-g, --get GET, pull down HTTP headers and display the
transaction. Great for application debugging.
-p, --print PRINT, like GET only it prints the entire page.
-c, --concurrent=NUM CONCURRENT users, default is 10
-r, --reps=NUM REPS, number of times to run the test.
-t, --time=NUMm TIMED testing where "m" is modifier S, M, or H
ex: --time=1H, one hour test.
-d, --delay=NUM Time DELAY, random delay before each request
between .001 and NUM. (NOT COUNTED IN STATS)
-b, --benchmark BENCHMARK: no delays between requests.
-i, --internet INTERNET user simulation, hits URLs randomly.
-f, --file=FILE FILE, select a specific URLS FILE.
-R, --rc=FILE RC, specify an siegerc file
-l, --log[=FILE] LOG to FILE. If FILE is not specified, the
default is used: PREFIX/var/siege.log
-m, --mark="text" MARK, mark the log file with a string.
-H, --header="text" Add a header to request (can be many)
-A, --user-agent="text" Sets User-Agent in request
-T, --content-type="text" Sets Content-Type in request
-j, --json-output JSON OUTPUT, print final stats to stdout as JSON
--no-parser NO PARSER, turn off the HTML page parser
--no-follow NO FOLLOW, do not follow HTTP redirects
Copyright (C) 2022 by Jeffrey Fulmer, et al.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE.
***@***.*** home]# uname -r
*6.2.12-1.el8.elrepo.x86_64*
***@***.*** home]#
***@***.*** home]# *strace -p 1000474*
strace: Process 1000474 attached
*futex(0x7fcc84bc2380, FUTEX_WAIT_PRIVATE, 2, NULL*
strace: Process 1000474 detached
<detached ...>
***@***.*** home]# pstack 1000474
#0 0x00007fcc8482418c in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x00007fcc848ea1c6 in __tz_convert () from /lib64/libc.so.6
#2 0x0000564571c166c0 in write_to_log ()
#3 0x0000564571c0793e in main ()
***@***.*** home]# cat /proc/1000474/sta
stack stat statm status
***@***.*** home]# cat /proc/1000474/status
Name: siege
Umask: 0022
*State: S (sleeping)*
Tgid: 1000474
Ngid: 0
Pid: 1000474
PPid: 1000473
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 512
Groups: 0
NStgid: 1000474
NSpid: 1000474
NSpgid: 669512
NSsid: 6062
VmPeak: 19625852 kB
VmSize: 16849904 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 82876 kB
VmRSS: 76460 kB
RssAnon: 72364 kB
RssFile: 4096 kB
RssShmem: 0 kB
VmData: 337384 kB
VmStk: 132 kB
VmExe: 164 kB
VmLib: 5624 kB
VmPTE: 1232 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
CoreDumping: 0
THP_enabled: 1
Threads: 1
SigQ: 1/3094693
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000007003
SigIgn: 0000000000000000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: 000001ffffffffff
CapEff: 000001ffffffffff
CapBnd: 000001ffffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Seccomp_filters: 0
Speculation_Store_Bypass: thread vulnerable
SpeculationIndirectBranch: conditional enabled
Cpus_allowed: ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff
Cpus_allowed_list: 0-255
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,0000000f
Mems_allowed_list: 0-3
voluntary_ctxt_switches: 25
nonvoluntary_ctxt_switches: 0
|
I also tried the siege source from https://download.joedog.org/siege/siege-4.1.5.tar.gz. That contains the HAVE_LOCALTIME_R value as 1.
Built & installed using the following commands: ./configure; make; make install. That installed the version on the system.
Additional info: using CentOS Stream 8,
with GCC 8.5.0-18 installed via the "Development tools" group.
|
I'm confused. Was this in the version for which you sent the stack trace
above?
#define HAVE_LOCALTIME_R 1
Or is that in the one you built from source? BTW: I only support the
source code.
If that's from the source version, could you send me a stack trace of
that version when it's hung?
…On Sun, Apr 23, 2023 at 11:46 PM Rajendrakumar Chinnaiyan < ***@***.***> wrote:
I also tried the siege source from
https://download.joedog.org/siege/siege-4.1.5.tar.gz
*That contains HAVE_LOCALTIME_R value as 1*
/* Define to 1 if you have the `localtime_r' function. */
#define HAVE_LOCALTIME_R 1
*Built & installed using following command*
./configure
make
make install
Additional info
*Using Cent OS Stream 8:*
NAME="CentOS Stream"
VERSION="8"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Stream 8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"
*with GCC 8.5.0-18 installed via "Development tools" group.*
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/8/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --disable-libmpx --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
gcc version 8.5.0 20210514 (Red Hat 8.5.0-18) (GCC)
|
Sorry for the confusion. To clarify: I initially used the 4.1.2 version from the CentOS distribution and later tried the latest 4.1.5 from source; both showed the hang issue. Here is the dump from 4.1.5 where the hang was observed with #define HAVE_LOCALTIME_R 1.
|
Can you give me a stack trace of 4.1.5?
…On Mon, Apr 24, 2023 at 12:47 PM Rajendrakumar Chinnaiyan < ***@***.***> wrote:
Sorry for the confusion. To clarify: I initially used the 4.1.2 version from
the CentOS distribution and later tried the latest 4.1.5 from source; both
showed the hang issue.
Here is the dump from 4.1.5 where the hang was observed with #define
HAVE_LOCALTIME_R 1
***@***.*** ~]# siege -v
SIEGE 4.1.5
Usage: siege [options]
siege [options] URL
siege -g URL
Options:
-V, --version VERSION, prints the version number.
-h, --help HELP, prints this section.
-C, --config CONFIGURATION, show the current config.
-v, --verbose VERBOSE, prints notification to screen.
-q, --quiet QUIET turns verbose off and suppresses output.
-g, --get GET, pull down HTTP headers and display the
transaction. Great for application debugging.
-p, --print PRINT, like GET only it prints the entire page.
-c, --concurrent=NUM CONCURRENT users, default is 10
-r, --reps=NUM REPS, number of times to run the test.
-t, --time=NUMm TIMED testing where "m" is modifier S, M, or H
ex: --time=1H, one hour test.
-d, --delay=NUM Time DELAY, random delay before each request
between .001 and NUM. (NOT COUNTED IN STATS)
-b, --benchmark BENCHMARK: no delays between requests.
-i, --internet INTERNET user simulation, hits URLs randomly.
-f, --file=FILE FILE, select a specific URLS FILE.
-R, --rc=FILE RC, specify an siegerc file
-l, --log[=FILE] LOG to FILE. If FILE is not specified, the
default is used: PREFIX/var/siege.log
-m, --mark="text" MARK, mark the log file with a string.
-H, --header="text" Add a header to request (can be many)
-A, --user-agent="text" Sets User-Agent in request
-T, --content-type="text" Sets Content-Type in request
-j, --json-output JSON OUTPUT, print final stats to stdout as JSON
--no-parser NO PARSER, turn off the HTML page parser
--no-follow NO FOLLOW, do not follow HTTP redirects
Copyright (C) 2022 by Jeffrey Fulmer, et al.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE.
***@***.*** ~]# cat /proc/2988096/status
Name: siege
Umask: 0022
State: S (sleeping)
Tgid: 2988096
Ngid: 0
Pid: 2988096
PPid: 2988095
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 512
Groups: 0
NStgid: 2988096
NSpid: 2988096
NSpgid: 2678321
NSsid: 1038661
VmPeak: 18970528 kB
VmSize: 17653096 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 25532 kB
VmRSS: 21320 kB
RssAnon: 17224 kB
RssFile: 4096 kB
RssShmem: 0 kB
VmData: 992796 kB
VmStk: 132 kB
VmExe: 148 kB
VmLib: 5624 kB
VmPTE: 1632 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
CoreDumping: 0
THP_enabled: 1
Threads: 3
SigQ: 1/3094693
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000007003
SigIgn: 0000000000000000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: 000001ffffffffff
CapEff: 000001ffffffffff
CapBnd: 000001ffffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Seccomp_filters: 0
Speculation_Store_Bypass: thread vulnerable
SpeculationIndirectBranch: conditional enabled
Cpus_allowed: ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff
Cpus_allowed_list: 0-255
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,0000000f
Mems_allowed_list: 0-3
voluntary_ctxt_switches: 12
nonvoluntary_ctxt_switches: 0
***@***.*** ~]# strace -p 2988096
strace: Process 2988096 attached
futex(0x7f5bf21629d0, FUTEX_WAIT, 2988255, NULL
^Cstrace: Process 2988096 detached
<detached ...>
***@***.*** ~]# uname -r
6.2.12-1.el8.elrepo.x86_64
***@***.*** ~]#
|
Here is the one. I'm seeing it getting stuck at this same place multiple times.
|
I'm trying to figure out why I can't reproduce it.
…On Mon, Apr 24, 2023 at 3:38 PM Rajendrakumar Chinnaiyan < ***@***.***> wrote:
Here is the one.. Seeing it multiple times getting stuck at this same
place.
***@***.*** ~]# pstack 3663817
Thread 2 (Thread 0x7fe7b5b7a700 (LWP 3663951)):
#0 0x00007fe7f842418c in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x00007fe7f854ab89 in __check_pf () from /lib64/libc.so.6
#2 0x00007fe7f8519324 in getaddrinfo () from /lib64/libc.so.6
#3 0x000000000041b3db in new_socket (C=0x7fe698000b60, hostparam=0x1fdd480 "localhost", portparam=8000) at sock.c:156
#4 0x0000000000408e85 in __init_connection (this=0x200ed10, U=0x1fddb90) at browser.c:906
#5 0x0000000000407563 in __http (this=0x200ed10, U=0x1fddb90) at browser.c:472
#6 0x00000000004073b9 in __request (this=0x200ed10, U=0x1fddb90) at browser.c:410
#7 0x0000000000406eb6 in start (this=0x200ed10) at browser.c:294
#8 0x000000000040cd89 in crew_thread (crew=0x2020f50) at crew.c:141
#9 0x00007fe7f9a081ca in start_thread () from /lib64/libpthread.so.0
#10 0x00007fe7f8439e73 in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7fe7fa154740 (LWP 3663817)):
#0 0x00007fe7f9a096cd in __pthread_timedjoin_ex () from /lib64/libpthread.so.0
#1 0x000000000040d26d in crew_join (crew=0x2020f50, finish=boolean_true, payload=0x7fffbb4613b0) at crew.c:280
#2 0x0000000000415f28 in main (argc=10, argv=0x7fffbb461528) at main.c:507
|
So when you say you see that a lot, do you mean you see this combination:
#2 0x00007fe7f8519324 in getaddrinfo () from /lib64/libc.so.6
#3 0x000000000041b3db in new_socket (C=0x7fe698000b60,
hostparam=0x1fdd480 "localhost", portparam=8000) at sock.c:156
getaddrinfo is supposed to be thread-safe on Linux as long as the locale
and environment don't change while multiple threads are running. I'd
be shocked if that was happening in your case.
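For reference, the failing pattern is many worker threads resolving the same host at once, roughly as in this minimal sketch (the shape of siege's crew threads, not its actual code; host, port, and thread count are made up):

#include <netdb.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>

/* Every worker resolves the same name, as siege's crew threads do via
   new_socket(). getaddrinfo() itself is documented thread-safe; the hang
   in the trace above is inside glibc's internal __check_pf lock. */
static void *worker(void *arg) {
  (void)arg;
  struct addrinfo hints, *res = NULL;
  memset(&hints, 0, sizeof(hints));
  hints.ai_family   = AF_UNSPEC;   /* AF_UNSPEC makes glibc consult __check_pf */
  hints.ai_socktype = SOCK_STREAM;
  if (getaddrinfo("localhost", "8000", &hints, &res) == 0)
    freeaddrinfo(res);
  return NULL;
}

int main(void) {
  enum { NTHREADS = 100 };
  pthread_t t[NTHREADS];
  for (int i = 0; i < NTHREADS; i++)
    pthread_create(&t[i], NULL, worker, NULL);
  for (int i = 0; i < NTHREADS; i++)
    pthread_join(t[i], NULL);
  return 0;
}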
|
Maybe something related?
https://bugzilla.redhat.com/show_bug.cgi?id=1209433
https://sourceware.org/bugzilla/show_bug.cgi?id=20975
|
Could you put
pthread_testcancel();
on the line before getaddrinfo(), which should be at sock.c line 156,
And see if you still get this problem?
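Concretely, the suggestion amounts to something like this in new_socket() (a sketch: the variable names and the exact getaddrinfo() call are paraphrased from the stack trace, not copied from siege's sock.c):

/* sock.c, inside new_socket(), just before the resolver call.
   pthread_testcancel() lets a pending cancellation fire here rather than
   inside getaddrinfo() while glibc holds its internal __check_pf lock. */
pthread_testcancel();
if (getaddrinfo(hostparam, NULL, &hints, &result) != 0) {
  /* error handling elided */
}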
|
Still stuck in a similar way... (pstack output is quoted in the reply below.)
|
I'm not sure how to fix this:
https://bugzilla.redhat.com/show_bug.cgi?id=1405071
In the source code of glibc, sysdeps/unix/sysv/linux/check_pf.c, between
L322-L356 there are pthread cancellation points in __socket, __bind, or
make_request. If we get pthread_cancel while the code is in L322-L356, the
check_pf lock is left locked.
By the way, the upstream glibc seems to have no such issue.
Your work-around would be to use repetition-based testing instead of
timed testing.
Use -r NUM/--reps=NUM instead of -tNUM/--time=NUM
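For example, using the flags from the help output above (the URL file name is illustrative):

siege -c 100 -r 50 -f urls.txt   # repetition-based: workers finish and exit on their own
siege -c 100 -t 2M -f urls.txt   # timed: workers are cancelled when time expires, which can trip the bug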
…On Mon, Apr 24, 2023 at 7:25 PM Rajendrakumar Chinnaiyan < ***@***.***> wrote:
Still stuck in similar way...
***@***.*** ~]# pstack 1339484
Thread 2 (Thread 0x7fed5112e700 (LWP 1339694)):
#0 0x00007fedb9a2418c in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x00007fedb9b4ab89 in __check_pf () from /lib64/libc.so.6
#2 0x00007fedb9b19324 in getaddrinfo () from /lib64/libc.so.6
#3 0x0000000000416446 in new_socket ***@***.***=0x7fea8c000b60, hostparam=<optimized out>, ***@***.***=8000) at sock.c:156
#4 0x00000000004068e6 in __init_connection ***@***.***=0x96ba70, ***@***.***=0x92e770) at browser.c:906
#5 0x0000000000407090 in __http (U=0x92e770, this=0x96ba70) at browser.c:472
#6 __request ***@***.***=0x96ba70, ***@***.***=0x92e770) at browser.c:410
#7 0x0000000000408415 in start (this=0x96ba70) at browser.c:294
#8 0x000000000040ae85 in crew_thread (crew=0x962f60) at crew.c:141
#9 0x00007fedbb0081ca in start_thread () from /lib64/libpthread.so.0
#10 0x00007fedb9a39e73 in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7fedbb68a740 (LWP 1339484)):
#0 0x00007fedbb0096cd in __pthread_timedjoin_ex () from /lib64/libpthread.so.0
#1 0x000000000040b429 in crew_join ***@***.***=0x962f60, ***@***.***=boolean_true, ***@***.***=0x7ffc68575f50) at crew.c:280
#2 0x0000000000403965 in main (argc=<optimized out>, argv=<optimized out>) at main.c:507
|
Anyone who hits this issue might want to check https://sourceware.org/pipermail/libc-alpha/2023-April/147654.html
|
Thanks for posting that. I appreciate it.
It's important to note that this only occurs on time-based testing. You can
work around this issue with -r NUM / --reps=NUM
|
There are reports for hang in __check_pf: JoeDog/siege#4
It is reproducible only under specific configurations:
1. Large number of cores (>= 64) and large number of threads (> 3X of the number of cores) with long lived socket connection.
2. Low power (frequency) mode.
3. Power management is enabled.
While holding the lock, __check_pf calls make_request, which calls __sendto and __recvmsg. Since __sendto and __recvmsg are cancellation points, the lock held by __check_pf won't be released and can cause deadlock when thread cancellation happens in __sendto or __recvmsg. Add a cancellation cleanup handler for __check_pf to unlock the lock when cancelled by another thread. This fixes BZ #20975 and the siege hang issue.
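The fix described in that commit message is the standard POSIX cancellation-cleanup idiom: register a handler that releases the lock if the thread is cancelled while holding it. A minimal illustration of the pattern (not the actual glibc __check_pf code):

#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Cleanup handler: runs if the thread is cancelled while the lock is
   held, so the lock is never left orphaned for other threads. */
static void unlock_cleanup(void *arg) {
  pthread_mutex_unlock((pthread_mutex_t *)arg);
}

void guarded_work(void) {
  pthread_mutex_lock(&lock);
  pthread_cleanup_push(unlock_cleanup, &lock);

  /* ... work that may hit a cancellation point, e.g. sendto()/recvmsg()
     as in glibc's make_request() ... */

  pthread_cleanup_pop(1); /* 1 = run the handler (i.e. unlock) now */
}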
siege works on:
siege is run with this command:
sh -c "/usr/bin/siege -c 4 -f /tmp/load_testser_0.955169454146.urls -d 10 -r1 -i;" 2>&1
Sometimes the siege process goes into sleep mode:
Trying to view some info via pstack:
siege waits for "futex_wait_queue_me" to complete