[ci] Kill hanged docker build process to avoid build timeout issue. #13726
do
sleep 120
now=$(date +%s)
pids=$(ps aux | grep -v grep | grep -E "^.{,100}docker build" | awk '{print$2}')
You want to list the docker PIDs, right? A small improvement.
ps -C docker -o pid,etime,args | grep "docker build"
for line in lines:
do something
Also, etime can be used, so it is not necessary to print the time again.
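The suggestion above could be sketched roughly as follows. This is only an illustration, not code from the PR: `list_procs` and its parameters are hypothetical names, and the loop body just prints each match in place of "do something".

```shell
# list_procs is a hypothetical helper, not code from this PR.
# $1: command name passed to "ps -C" (default: docker)
# $2: substring to look for in the command line (default: "docker build")
list_procs() {
    local comm="${1:-docker}"
    local pattern="${2:-docker build}"
    local pid etime args
    # The trailing "=" on each column suppresses the header line.
    ps -C "$comm" -o pid=,etime=,args= | grep -F -- "$pattern" |
        while read -r pid etime args
        do
            # Placeholder for "do something": report each match.
            echo "pid=$pid etime=$etime args=$args"
        done
}
```

Because `ps -C` selects by command name, the `grep` process itself is never listed, so the `grep -v grep` step is no longer needed.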
I think 'ps -C' is good, but etime is awkward to use.
etime has the format '[[dd-]hh:]mm:ss'. To compare it with 1 hour, I would have to check how many fields the etime string has.
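For reference, one possible way to handle the variable-length etime format is to convert it to seconds with awk. This is a sketch only: `etime_to_seconds` is a hypothetical helper and is not what the PR does.

```shell
# etime_to_seconds is a hypothetical helper, not code from this PR.
# It converts a ps "etime" value ([[dd-]hh:]mm:ss) to seconds.
etime_to_seconds() {
    # Split on "-" and ":" so the number of fields tells us the layout.
    echo "$1" | awk -F'[-:]' '
        NF == 2 { print $1 * 60 + $2 }
        NF == 3 { print $1 * 3600 + $2 * 60 + $3 }
        NF == 4 { print $1 * 86400 + $2 * 3600 + $3 * 60 + $4 }'
}
```

A 1 hour check would then look like `[ "$(etime_to_seconds "$etime")" -gt 3600 ]`. Where procps-ng is available, `ps -o etimes` reports the elapsed time in seconds directly and avoids the parsing.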
pids=$(ps -C docker -o pid,etime,args | grep "docker build" | cut -d" " -f1)
for pid in $pids
do
start=$(date --date="$(ls -dl /proc/$pid --time-style full-iso | awk '{print$6,$7}')" +%s)
ps -o "lstart" $pid
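A minimal sketch of how `lstart` could replace the `ls -dl /proc/$pid` lookup, assuming GNU `date` (already used in the diff) can parse the timestamp that `ps` prints. `proc_start_epoch` is a hypothetical name.

```shell
# proc_start_epoch is a hypothetical helper, not code from this PR.
# It prints the start time of process $1 as seconds since the epoch.
proc_start_epoch() {
    local pid="$1"
    local lstart
    # "lstart=" prints the full start timestamp without a header line.
    lstart=$(ps -o lstart= -p "$pid") || return 1
    # GNU date converts the timestamp to epoch seconds.
    date --date="$lstart" +%s
}
```

The running time is then `$(( now - $(proc_start_epoch "$pid") ))`, with `now=$(date +%s)` as in the diff.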
@@ -110,6 +110,25 @@ jobs:
buildSteps:
- template: template-skipvstest.yml
- bash: |
(
Can we add the step to a template, so that the release branches can use the same template?
…onic-net#13726) Why I did it Docker build has a low rate of hanging up. It hangs on different steps. So, it looks like a bug in docker daemon. How I did it Start a daemon process to scan running time more than 1 hours, and kill the process. How to verify it
Related work items: sonic-net#276, sonic-net#305, sonic-net#332, sonic-net#338, sonic-net#339, sonic-net#1188, sonic-net#1192, sonic-net#1197, sonic-net#1206, sonic-net#1685, sonic-net#1690, sonic-net#1696, sonic-net#1699, sonic-net#1709, sonic-net#1727, sonic-net#1737, sonic-net#1741, sonic-net#1742, sonic-net#2511, sonic-net#2512, sonic-net#2532, sonic-net#2559, sonic-net#2626, sonic-net#2638, sonic-net#2645, sonic-net#2649, sonic-net#2660, sonic-net#2669, sonic-net#2670, sonic-net#2678, sonic-net#10084, sonic-net#11442, sonic-net#11873, sonic-net#12047, sonic-net#12110, sonic-net#12207, sonic-net#12529, sonic-net#12678, sonic-net#13235, sonic-net#13287, sonic-net#13372, sonic-net#13395, sonic-net#13456, sonic-net#13497, sonic-net#13522, sonic-net#13545, sonic-net#13547, sonic-net#13552, sonic-net#13569, sonic-net#13572, sonic-net#13578, sonic-net#13591, sonic-net#13611, sonic-net#13647, sonic-net#13649, sonic-net#13660, sonic-net#13710, sonic-net#13716, sonic-net#13724, sonic-net#13726, sonic-net#13732, sonic-net#13735, sonic-net#13739, sonic-net#13757, sonic-net#13786, sonic-net#13792, sonic-net#13800, sonic-net#13801, sonic-net#13802, sonic-net#13805, sonic-net#13806, sonic-net#13812, sonic-net#13814, sonic-net#13822, sonic-net#13831, sonic-net#13834, sonic-net#13847, sonic-net#13870, sonic-net#13882, sonic-net#13884, sonic-net#13885, sonic-net#13894, sonic-net#13895, sonic-net#13926, sonic-net#13932, sonic-net#13935, sonic-net#13942, sonic-net#13951, sonic-net#13953, sonic-net#13964
Why I did it
Docker build occasionally hangs.
It hangs at different steps, so it looks like a bug in the Docker daemon.
How I did it
Start a daemon process that scans for docker build processes that have been running for more than 1 hour and kills them.
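As an illustration of the approach described above, one scan pass could look roughly like this. It is a sketch under assumptions, not the code merged in this PR: `kill_stale` and its parameters are hypothetical, `etimes` requires procps-ng `ps`, and the use of `kill -9` is an assumption.

```shell
# kill_stale is a hypothetical helper, not the code merged in this PR.
# $1: command name for "ps -C" (default: docker)
# $2: substring of the command line to match (default: "docker build")
# $3: age limit in seconds (default: 3600, i.e. 1 hour)
kill_stale() {
    local comm="${1:-docker}"
    local pattern="${2:-docker build}"
    local limit="${3:-3600}"
    local pid age args
    # etimes is the elapsed time in whole seconds (procps-ng ps).
    ps -C "$comm" -o pid=,etimes=,args= |
        while read -r pid age args
        do
            # Skip processes whose command line does not match.
            case $args in
                *"$pattern"*) ;;
                *) continue ;;
            esac
            if [ "$age" -gt "$limit" ]; then
                echo "killing pid $pid after ${age}s: $args"
                # The process may have exited in the meantime.
                kill -9 "$pid" || true
            fi
        done
}
```

The background daemon would then be a loop such as `while true; do sleep 120; kill_stale docker "docker build" 3600; done`, started in a subshell before the build step.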
How to verify it