[WIP] template/router: explicitly handle "no child processes" error #78
|
/hold Until I'm utterly convinced by my assertions and reality. |
|
/test e2e-aws |
|
/test images |
|
Latest hack looks like:
diff --git a/Makefile b/Makefile
index e866dff..173e4f2 100644
--- a/Makefile
+++ b/Makefile
@@ -20,7 +20,7 @@ define version-ldflags
-X $(1).buildDate="$(shell date -u +'%Y-%m-%dT%H:%M:%SZ')"
endef
GO_LD_EXTRAFLAGS ?=
-GO_LDFLAGS ?=-ldflags "-s -w $(call version-ldflags,$(PACKAGE)/pkg/version) $(GO_LD_EXTRAFLAGS)"
+GO_LDFLAGS ?=-ldflags "$(call version-ldflags,$(PACKAGE)/pkg/version) $(GO_LD_EXTRAFLAGS)"
GO=GO111MODULE=on GOFLAGS=-mod=vendor go
GO_BUILD_RECIPE=CGO_ENABLED=0 $(GO) build -o $(BIN) $(GO_GCFLAGS) $(GO_LDFLAGS) $(MAIN_PACKAGE)
@@ -30,7 +30,7 @@ all: build
build:
$(GO_BUILD_RECIPE)
-images/router/*/Dockerfile: images/router/base/Dockerfile
+images/router/*/Dockerfile:
imagebuilder -t registry.svc.ci.openshift.org/openshift/origin-v4.0:`basename $(@D)`-router -f images/router/`basename $(@D)`/Dockerfile .
images/router/*/Dockerfile.rhel: images/router/base/Dockerfile.rhel
diff --git a/images/router/haproxy/Dockerfile b/images/router/haproxy/Dockerfile
index 1a646b1..4cb8328 100644
--- a/images/router/haproxy/Dockerfile
+++ b/images/router/haproxy/Dockerfile
@@ -1,5 +1,11 @@
-FROM registry.svc.ci.openshift.org/openshift/origin-v4.0:base-router
-RUN INSTALL_PKGS="haproxy20 rsyslog sysvinit-tools" && \
+# FROM registry.access.redhat.com/ubi8/ubi-init
+# RUN yum -y install strace
+
+FROM centos:centos7
+RUN yum -y install strace wget
+RUN rpm -ivh http://spicy.frobware.com/~aim/x86_64/haproxy21-2.1.2-1.el7.x86_64.rpm
+RUN /usr/sbin/haproxy -vv
+RUN INSTALL_PKGS="strace procps-ng socat rsyslog sysvinit-tools" && \
yum install -y $INSTALL_PKGS && \
rpm -V $INSTALL_PKGS && \
yum clean all && \
@@ -9,7 +15,10 @@ RUN INSTALL_PKGS="haproxy20 rsyslog sysvinit-tools" && \
setcap 'cap_net_bind_service=ep' /usr/sbin/haproxy && \
chown -R :0 /var/lib/haproxy && \
chmod -R g+w /var/lib/haproxy
+RUN wget -O /usr/local/bin/dumb-init https://github.com/Yelp/dumb-init/releases/download/v1.2.2/dumb-init_1.2.2_amd64
+RUN chmod +x /usr/local/bin/dumb-init
COPY images/router/haproxy/* /var/lib/haproxy/
+COPY openshift-router /usr/bin/openshift-router
LABEL io.k8s.display-name="OpenShift HAProxy Router" \
io.k8s.description="This component offers ingress to an OpenShift cluster via Ingress and Route rules." \
io.openshift.tags="openshift,router,haproxy"
@@ -18,4 +27,5 @@ EXPOSE 80 443
WORKDIR /var/lib/haproxy/conf
ENV TEMPLATE_FILE=/var/lib/haproxy/conf/haproxy-config.template \
RELOAD_SCRIPT=/var/lib/haproxy/reload-haproxy
-ENTRYPOINT ["/usr/bin/openshift-router"]
+ENTRYPOINT ["/usr/local/bin/dumb-init", "--"]
+CMD ["/usr/bin/openshift-router"]
diff --git a/pkg/router/template/router.go b/pkg/router/template/router.go
index 7122cee..6e3788f 100644
--- a/pkg/router/template/router.go
+++ b/pkg/router/template/router.go
@@ -549,7 +549,7 @@ func (r *templateRouter) reloadRouter() error {
// CombinedOutput(). The logic there calls Start(), then
// Wait() and that could be racy if there is a GC pause (or
// other scheduling activity).
- if err != nil && !noChildProcessesRegExp.MatchString(err.Error()) {
+ if err != nil {
return fmt.Errorf("error reloading router: %v\n%s", err, string(out))
}
log.V(0).Info("router reloaded", "output", string(out)) |
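For context on the router.go hunk above: the check being removed matched the error text against noChildProcessesRegExp. A minimal sketch of the same classification without string matching follows. It assumes the error coming back from the reload wraps syscall.ECHILD, which this thread does not confirm. isNoChildProcesses and simulatedReapedError are hypothetical names used only for illustration and are not part of the router code.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// isNoChildProcesses reports whether err wraps ECHILD ("no child processes").
// Hypothetical helper: it stands in for a regexp match on err.Error().
func isNoChildProcesses(err error) bool {
	return errors.Is(err, syscall.ECHILD)
}

// simulatedReapedError builds an error shaped like a failed wait syscall.
// Assumption: the real reload error wraps ECHILD in a similar way.
func simulatedReapedError() error {
	return os.NewSyscallError("wait", syscall.ECHILD)
}

func main() {
	fmt.Println("reaped elsewhere:", isNoChildProcesses(simulatedReapedError()))
	fmt.Println("other failure:", isNoChildProcesses(errors.New("exit status 1")))
}
```

Comparing against the errno rather than the message avoids depending on the exact wording of the error string, but it only works if nothing between the wait call and the caller flattens the error into plain text.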
|
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: frobware |
|
I don’t think we should do this. If this is a “reap happens too fast and impacts the app” bug, let’s just fix reap. Pid 1 isn’t hard. |
But doesn't this slow it down for all consumers now? Is this desirable? IIRC, this library was being used by, or was developed for, |
The issue we have is that sometimes the |
|
|
Can be closed by #111 |
DO NOT MERGE.
We were getting races when reloading haproxy via the reload-haproxy script because we had our own process reaper (StartReaper). Occasionally the reload would report "no child processes", and this happened when StartReaper had already reaped the reload script that we were independently waiting on elsewhere.
This summarises the situation we currently have: krallin/tini#8 (comment)
It seems better to separate these concerns, so this uses catatonit [1] to do that, so: catatonit
TODO: decide how/where we get catatonit from, options are:
[1] https://github.com/openSUSE/catatonit
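The race described above can be reproduced deterministically in a small sketch: a child is started, something other than its owner reaps it, and the owner's later Wait() fails. This is an illustration only, not the router's code. The direct syscall.Wait4 call stands in for StartReaper, the function names are hypothetical, and it assumes a Unix-like system with a `true` binary on PATH.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"syscall"
)

// waitAfterExternalReap starts a short-lived child, lets a stand-in "reaper"
// collect it first, and then calls Wait() the way the owner of the command would.
func waitAfterExternalReap() error {
	cmd := exec.Command("true") // assumption: `true` is available on PATH
	if err := cmd.Start(); err != nil {
		return err
	}
	// Stand-in for the in-process reaper: block until the child exits and
	// reap it before the owner gets a chance to wait on it.
	var status syscall.WaitStatus
	if _, err := syscall.Wait4(cmd.Process.Pid, &status, 0, nil); err != nil {
		return err
	}
	// The child is already gone, so the kernel has nothing left to report.
	return cmd.Wait()
}

// lostRaceReportsECHILD reports whether the owner's Wait() failed with ECHILD.
func lostRaceReportsECHILD() bool {
	return errors.Is(waitAfterExternalReap(), syscall.ECHILD)
}

func main() {
	fmt.Println("wait error:", waitAfterExternalReap())
	fmt.Println("is ECHILD:", lostRaceReportsECHILD())
}
```

In the real router the reaper runs concurrently, so the owner only loses occasionally. Moving reaping into a separate init process removes the second waiter from the router process altogether.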