Unable to complete deployment with awx-operator #275

Closed
rwalsh3711 opened this issue May 3, 2021 · 7 comments · Fixed by #303
Labels
type:bug Something isn't working


@rwalsh3711

ISSUE TYPE
  • Bug Report
SUMMARY

The awx-operator builds the postgres container, but the awx containers never get past 'Pending'. The awx-operator logs loop through steps and a "Reconciler error" repeats through the process.

ENVIRONMENT
  • AWX version: 19.1.0
  • Operator version: 0.9.0
  • Kubernetes version: Minikube 1.18.1
  • AWX install method: Minikube deployment on VM running CentOS v7.9.2009
STEPS TO REPRODUCE

The following commands are run as root:

# minikube start --driver=none --addons=ingress
# kubectl apply -f https://raw.githubusercontent.com/ansible/awx-operator/0.9.0/deploy/awx-operator.yaml
# minikube kubectl -- apply -f lab_awx.yml
# minikube kubectl logs -- -f awx-operator-5595d6fc57-gwntw

lab_awx.yml file contents:

apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
EXPECTED RESULTS

Expecting the kubectl apply command to complete and awx pods to be in 'Running' state

ACTUAL RESULTS

The logs loop through the deployment steps indefinitely and the awx pods never proceed past 'Pending'.

NAME                            READY   STATUS    RESTARTS   AGE
awx-5b58db49c-h6sqn             0/4     Pending   0          18m
awx-operator-5595d6fc57-gwntw   1/1     Running   0          61m
awx-postgres-0                  1/1     Running   0          18m
ADDITIONAL INFORMATION

The error I find in the logs seems to point to a "Reconciler error". I've attempted the fix offered in issue #205, where I delete "ingress-nginx-admission", but the issue remains.

kubectl delete -A ValidatingWebhookConfiguration ingress-nginx-admission
AWX-OPERATOR LOGS

The following error keeps repeating:

{"level":"error","ts":1620071907.235778,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"awx-controller","request":"default/awx","error":"event runner on failed","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tpkg/mod/github.com/go-logr/[email protected]/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tpkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tpkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\tpkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\tpkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\tpkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\tpkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\tpkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:90"}
@emdzej1987

The same issue for me.

@FullofQuarks

I too am unable to deploy with the awx-operator.

AWX Operator version: 0.9.0
Kubernetes version: Bare metal 1.19.8

This is the formatted error that I am getting:

The error was: 'tower_loadbalancer_annotations' is undefined

The error appears to be in '/opt/ansible/roles/installer/tasks/resources_configuration.yml': line 20, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:


- name: Apply Resources
  ^ here
  \"}
  
  PLAY RECAP *********************************************************************
  localhost                  : ok=27   changed=0    unreachable=0    failed=1    skipped=25   rescued=0    ignored=0   
  
  ","job":"2015796113853353331","name":"awx-tower","namespace":"awx","error":"exit status 2","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error
    pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
  github.com/operator-framework/operator-sdk/pkg/ansible/runner.(*runner).Run.func1
    src/github.com/operator-framework/operator-sdk/pkg/ansible/runner/runner.go:239"}

{"level":"error","ts":1620152374.7633853,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"awx-controller","request":"awx/awx-tower","error":"event runner on failed","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error
    pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:258
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
    pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211
  k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1
    pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155
  k8s.io/apimachinery/pkg/util/wait.BackoffUntil
    pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156
  k8s.io/apimachinery/pkg/util/wait.JitterUntil
    pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133
  k8s.io/apimachinery/pkg/util/wait.Until
    pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:90"

@tchellomello tchellomello self-assigned this May 5, 2021
@tchellomello
Contributor

I'll install a minikube tonight and investigate this issue. I cannot reproduce it on a vanilla k8s 1.20.6

@emdzej1987

emdzej1987 commented May 6, 2021

I was able to deploy successfully now using the default configuration (not using LoadBalancer). I was able to sync the Demo Project, but the Demo job is not working. It launches but stays stuck in the Running state. The awx-ee container shows these errors:

INFO 2021/05/06 06:58:03 Client connected to control service
INFO 2021/05/06 06:58:06 Stdout complete - closing channel
ERROR 2021/05/06 06:58:06 Read error in control service: read unix /var/run/receptor/receptor.sock->@: use of closed network connection
INFO 2021/05/06 06:58:06 Client disconnected from control service
ERROR 2021/05/06 06:58:06 Error closing connection: close unix /var/run/receptor/receptor.sock->@: use of closed network connection
INFO 2021/05/06 06:58:06 Client connected to control service
INFO 2021/05/06 06:58:06 Control service closed
INFO 2021/05/06 06:58:06 Client disconnected from control service

EDIT: OK, so in my case the issue is that the awx-job pod is created on the wrong node. Is there a way to select which node the job pods are created on?

EDIT2: If you want to use LoadBalancer, you need to add an empty annotations line:
tower_ingress_type: LoadBalancer
tower_loadbalancer_annotations: |

EDIT3: OK found it in Instance groups and Customize pod specification.
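
For readers looking for the same setting, here is a rough sketch of what a customized pod specification with node pinning might look like. It is not taken from this thread: the namespace, the image tag, the container arguments, and the node name `worker-node-1` are all assumptions for illustration, and the default spec shown in your AWX version may differ. Only `nodeSelector` and the `kubernetes.io/hostname` label are standard Kubernetes.

```yaml
# Hypothetical custom pod spec for an AWX container (instance) group.
# Start from the default spec shown in the AWX UI and add only nodeSelector.
apiVersion: v1
kind: Pod
metadata:
  namespace: awx              # assumption: namespace where job pods should run
spec:
  nodeSelector:
    # assumption: replace worker-node-1 with the hostname label of your node
    kubernetes.io/hostname: worker-node-1
  containers:
    - name: worker
      image: quay.io/ansible/awx-ee:latest   # assumption: keep the image from your default spec
      args:                                  # assumption: keep the args from your default spec
        - ansible-runner
        - worker
        - '--private-data-dir=/runner'
```

The safest approach is probably to copy the default specification that AWX displays and change nothing except the `nodeSelector` entry.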

@FullofQuarks

@emdzej1987 is correct. To use the LoadBalancer ingress type, you have to set tower_loadbalancer_annotations, even if it's only an empty line. While this allows the deployment to work, I'm unsure it's useful when I have no annotations to add (and the documentation does not state that it is required).

@tchellomello
Contributor

Yes, that is a bug. I've added a PR to address this. In the meantime, you can add this to your spec:

spec:
  tower_ingress_type: LoadBalancer
  tower_loadbalancer_annotations: ''
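
As a sketch of how this fits into a complete manifest, the fragment above can be combined with the minimal `lab_awx.yml` from the original report. The resource name `awx` comes from that report; whether you need `LoadBalancer` at all depends on your cluster, so treat this as an illustration of the workaround for operator 0.9.0 rather than a recommended configuration.

```yaml
# lab_awx.yml with the workaround applied (operator 0.9.0).
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
spec:
  tower_ingress_type: LoadBalancer
  # Workaround: define the variable explicitly, even when empty, so the
  # installer role does not fail with
  # "'tower_loadbalancer_annotations' is undefined".
  tower_loadbalancer_annotations: ''
```

It can be applied the same way as in the reproduction steps, for example with `minikube kubectl -- apply -f lab_awx.yml`.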

@Vitexus

Vitexus commented Nov 22, 2022

AWX 21.9.0 Operator 1.1.0

{"level":"error","ts":1669130659.0954497,"msg":"Reconciler error","controller":"awx-controller","object":{"name":"awx","namespace":"awx-test-01-app"},"namespace":"awx-test-01-app","name":"awx","reconcileID":"918b81de-22fe-40d7-92c3-2a72b176f0d0","error":"event runner on failed","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234"}
