Stalled processes not cleared on IBM i #2937
(I'm assuming we can manually clear the stalled process to get the CI passing but it would be good if the automation in the build scripts just worked.) |
This is very strange, indeed! Strangely:
So an easy fix would be to simply change the Makefile to use Regardless, I'm still trying to figure out the root cause. IBM i has two different types of signals: ILE and PASE (Node.js runs in PASE), and the numerical representations differ:
|
This works:

clear-stalled:
	$(info Clean up any leftover processes but don't error if found.)
	ps awwx | grep Release/node | grep -v grep | cat
	@PS_OUT=`ps awwx | grep Release/node | grep -v grep | awk '{print $$1}'`; \
	if [ "$${PS_OUT}" ]; then \
	  kill -9 $${PS_OUT}; \
	fi

as does (as mentioned):

clear-stalled:
	$(info Clean up any leftover processes but don't error if found.)
	ps awwx | grep Release/node | grep -v grep | cat
	@PS_OUT=`ps awwx | grep Release/node | grep -v grep | awk '{print $$1}'`; \
	if [ "$${PS_OUT}" ]; then \
	  echo $${PS_OUT} | xargs -t kill -KILL; \
	fi

In my experimentation, it seems that |
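The PID-collection step shared by both variants can be tried outside of make. The following is a minimal sketch, assuming `ps awwx`-style output with the PID in the first column; the function name `stalled_pids` and the sample input are made up for illustration and are not part of the Node.js Makefile.

```shell
#!/bin/sh
# Hypothetical helper (not in the Node.js Makefile): read `ps awwx`-style
# output on stdin and print the PIDs of leftover Release/node processes,
# space-separated on one line.
stalled_pids() {
  grep 'Release/node' | grep -v grep | awk '{print $1}' | tr '\n' ' ' | sed 's/ *$//'
}

# Made-up sample of `ps awwx` output, for illustration only.
sample_ps='  101 pts/0  S  0:00 /bin/sh -c make test
  202 pts/0  S  0:03 out/Release/node test/parallel/test-foo.js
  303 pts/0  S  0:00 grep Release/node
  404 pts/0  S  0:01 out/Release/node test/parallel/test-bar.js'

pids=$(printf '%s\n' "$sample_ps" | stalled_pids)
echo "stalled: ${pids:-none}"
```

Feeding canned text instead of live `ps` output makes it easy to confirm that the `grep -v grep` filter drops the pipeline's own `grep` process before anything is passed to `kill`.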
We debugged this today and discovered the root cause turns out to be a bug in the GNU This affects using GNU kill with pretty much any numeric value, not just I'm working on an update with the fix, but due to some infrastructure issues this won't be available for a while. |
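Until a fixed package is installed, one workaround consistent with the comments above is to pass the signal by name rather than by number. This is a minimal sketch, assuming a POSIX shell; the function name `kill_by_name` is hypothetical, and `sleep` stands in for a stalled node process.

```shell
#!/bin/sh
# Hypothetical helper: send SIGKILL by name (avoiding numeric signal
# arguments) to each PID given, ignoring errors for PIDs that are gone.
kill_by_name() {
  for pid in "$@"; do
    kill -s KILL "$pid" 2>/dev/null || true
  done
}

# `sleep` stands in for a stalled out/Release/node process.
sleep 300 &
victim=$!

kill_by_name "$victim"
wait "$victim" 2>/dev/null   # reap the child so the PID is really gone

# kill -0 sends no signal; it only checks whether the process exists.
if kill -0 "$victim" 2>/dev/null; then
  status="still running"
else
  status="terminated"
fi
echo "$status"
```

Note that inside a shell, `kill` is usually a builtin, whereas `xargs kill` runs the external binary. That may be why the behavior differed between the two Makefile variants, though this sketch does not verify that on IBM i.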
This issue is stale because it has been open many days with no activity. It will be closed soon unless the stale label is removed or a comment is made. |
@abmusse is this something you could take a look at? |
Yes, I'll take a look at this one |
Looks like we pushed the fix up in |
@abmusse On test-iinthecloud-ibmi73-ppc64_be-1:
-bash-5.1$ yum info coreutils-gnu
Installed Packages
Name : coreutils-gnu
Arch : ppc64
Version : 8.25
Release : 6
Size : 118 M
Repo : installed
From repo : ibm
Summary : GNU coreutils
URL : https://www.gnu.org/software/coreutils
License : GPL-3.0-or-later
Description : The GNU Core Utilities are the basic file, shell and text manipulation utilities
: of the GNU operating system. These are the core utilities which are expected to
: exist on every operating system.
-bash-5.1$ yum upgrade coreutils-gnu
Setting up Upgrade Process
No Packages marked for Update
-bash-5.1$ |
What repos does this box have?
yum repolist all
We migrated base repos last year. This box may need the ibmi-repos upgrade. https://ibmi-oss-docs.readthedocs.io/en/latest/yum/IBM_REPOS.html#transition |
-bash-5.1$ yum repolist all
repo id repo name status
ibm ibm enabled: 1002
ibm-7.3 ibm-7.3 disabled
ibmi-base IBM i base enabled: 1002
ibmi-release IBM i 7.3 enabled: 67
repolist: 2071
-bash-5.1$ |
What URL does ibmi-base point to?
I suspect it's outdated and the |
We need to upgrade the ibmi-repos package:
yum upgrade ibmi-repos
Then we should also disable the old ibm repo:
yum-config-manager --disable ibm
After that the latest coreutils-gnu should be installable! |
@abmusse thanks for taking a look, and great to see you and @richardlau moving it forward. |
@richardlau |
Update to use current yum repositories for IBM i 7.3. Install Python 3.9, and use it to install `tap2junit`. Do not set group on the `.ssh` directory on platforms such as IBM i and z/OS where we do not create a group. Refs: nodejs#2937 Refs: https://ibmi-oss-docs.readthedocs.io/en/latest/yum/IBM_REPOS.html
Ansible changes, including using the correct yum repositories: #3358 |
We are now using the correct IBM i yum repositories and |
IBM i builds have been failing on test-iinthecloud-ibmi73-ppc64_be-1 since https://ci.nodejs.org/job/node-test-commit-ibmi/743/nodes=ibmi73-ppc64/ due to a dangling node process, i.e. https://ci.nodejs.org/job/node-test-commit-ibmi/743/nodes=ibmi73-ppc64/consoleFull

This process is leftover from https://ci.nodejs.org/job/node-test-commit-ibmi/742/nodes=ibmi73-ppc64/ where parallel/test-child-process-exec-abortcontroller-promisified timed out -- the test spawns the process in https://github.com/nodejs/node/blob/e46c680bf2b211bbd52cf959ca17ee98c7f657f5/test/parallel/test-child-process-exec-abortcontroller-promisified.js#L15

The Node.js Makefile is supposed to be able to clear stalled/dangling out/Release/node processes in clear-stalled: https://github.com/nodejs/node/blob/68fb0bf553e2af3e0b61733d29e1e9ba7f73d9b2/Makefile#L460-L466

but it looks like on IBM i this isn't killing the process:

If I add some debug into the Makefile I can see that xargs gets the process ID but it looks like kill -9 isn't terminating the process?

@ThePrez Any ideas?