ansible: upgrade test-nearform_intel-ubuntu1604-x64-2 #3122

Merged
merged 2 commits into from
Dec 21, 2022


targos
Member

@targos targos commented Dec 18, 2022

Update the second benchmark machine to Ubuntu 18.04.

@targos
Member Author

targos commented Dec 18, 2022

I am currently blocked on this error when running the playbook.

apt update manually runs without errors.

fatal: [test-nearform_intel-ubuntu1804-x64-2]: FAILED! => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "invocation": {
        "module_args": {
            "allow_change_held_packages": false,
            "allow_downgrade": false,
            "allow_unauthenticated": false,
            "autoclean": false,
            "autoremove": false,
            "cache_valid_time": 0,
            "clean": false,
            "deb": null,
            "default_release": null,
            "dpkg_options": "force-confdef,force-confold",
            "fail_on_autoremove": false,
            "force": false,
            "force_apt_get": false,
            "install_recommends": null,
            "lock_timeout": 60,
            "only_upgrade": false,
            "package": null,
            "policy_rc_d": null,
            "purge": false,
            "state": "present",
            "update_cache": true,
            "update_cache_retries": 5,
            "update_cache_retry_max_delay": 12,
            "upgrade": "dist"
        }
    },
    "msg": "Failed to update apt cache: unknown reason"
}

@targos targos marked this pull request as draft December 18, 2022 19:25
@sxa
Member

sxa commented Dec 19, 2022

I am currently blocked on this error when running the playbook.

@targos To clarify: was this a normal distribution upgrade, so that apart from the hostname the machine should still be fully functional as it was before? Or is the machine itself blocked because the playbook is not working?

@targos
Member Author

targos commented Dec 19, 2022

I did a do-release-upgrade, which went fine, but I don't know if the system is in a usable state for benchmarking jobs.

@richardlau
Member

I am currently blocked on this error when running the playbook.

apt update manually runs without errors.

That's annoying 😞. When this task fails, running the underlying apt command manually usually shows a warning or error message; a bad repository (one that no longer exists) or an expired GPG key are the ones I remember hitting before. I think the bad repository case may have been a warning rather than an error (i.e. the apt update completed), but Ansible treated the warning as an error.
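One way to check for such messages is to filter the manual apt output down to its warning and error lines, which apt prefixes with `W: ` and `E: `. The following is a minimal sketch; the helper name `apt_problems` is hypothetical and not part of the playbook.

```shell
#!/bin/sh
# Hypothetical helper: read apt output on stdin and print only the
# warning (W:) and error (E:) lines. Returns success even when there
# are no such lines, so it is safe under `set -e`.
apt_problems() {
    grep -E '^(W|E): ' || true
}

# Usage on the affected host (not executed here):
#   sudo apt-get update 2>&1 | apt_problems
```

If this prints nothing, the manual update really is clean and the cause is more likely on the Ansible side.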

@targos
Member Author

targos commented Dec 20, 2022

If I comment out this step, the next error is even weirder:

...
"cmd": "None -y -o Dpkg::Options::=--force-confdef -o Dpkg::Options::=--force-confold autoremove",
...
"msg": "[Errno 2] No such file or directory: b'None': b'None'",

@targos
Member Author

targos commented Dec 20, 2022

Wow, I found the problem 🤦🏻‍♂️

The first error I got was

TASK [check if secret is properly set] **************************************************************************************************************************************************************************
task path: /home/mzasso/git/nodejs/build/ansible/playbooks/jenkins/worker/create.yml:27
fatal: [test-nearform_intel-ubuntu1804-x64-2]: FAILED! => {
    "changed": false,
    "failed_when_result": "The conditional check 'not secret' failed. The error was: error while evaluating conditional (not secret): 'secret' is undefined",
    "msg": "Failed as requested from task"
}

To fix it, I just copied and renamed the template. After deleting all variables except the placeholder secret, the playbook works better.

@targos targos marked this pull request as ready for review December 20, 2022 08:12
@targos
Member Author

targos commented Dec 20, 2022

I'm almost done. It just seems that it's running with the wrong secret? Not sure how to reconnect it to Jenkins.

@sxa
Member

sxa commented Dec 20, 2022

I'm almost done. It just seems that it's running with the wrong secret? Not sure how to reconnect it to Jenkins.

Ah, you've renamed it in Jenkins, which means it gets a new secret. The Jenkins agent is still trying to connect under the old name, test-nearform_intel-ubuntu1604-x64-2, so the two are inconsistent. The inventory.yml in the secrets repo will need to be updated so that the secret key matches the one Jenkins now shows.

@targos
Member Author

targos commented Dec 20, 2022

Ok, thanks! Now how do I "install" the new key on the agent? I tried to rerun the playbook after updating the secrets repo but it's still using the old key.

@sxa
Member

sxa commented Dec 20, 2022

Ok, thanks! Now how do I "install" the new key on the agent? I tried to rerun the playbook after updating the secrets repo but it's still using the old key.

The simplest fix would be to update /etc/systemd/system/multi-user.target.wants/jenkins.service on the machine, which still seems to contain the JENKINS_SECRET text that has not been replaced.
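To confirm whether the unit file still carries the placeholder, a fixed-string search is enough. This is a minimal sketch; the helper name `has_placeholder` is hypothetical, and it assumes the placeholder appears in the file as the literal string passed as the second argument.

```shell
#!/bin/sh
# Hypothetical helper: has_placeholder FILE PLACEHOLDER
# Exits 0 if FILE still contains the literal PLACEHOLDER string,
# non-zero otherwise. -F treats the placeholder as a fixed string.
has_placeholder() {
    grep -qF -- "$2" "$1"
}

# Usage on the agent (not executed here):
#   has_placeholder /etc/systemd/system/multi-user.target.wants/jenkins.service JENKINS_SECRET \
#     && echo "placeholder not replaced"
```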

@targos
Member Author

targos commented Dec 20, 2022

Fixed with

# vim /etc/systemd/system/multi-user.target.wants/jenkins.service
# systemctl daemon-reload
# systemctl restart jenkins

@richardlau
Member

Ok, thanks! Now how do I "install" the new key on the agent? I tried to rerun the playbook after updating the secrets repo but it's still using the old key.

Did you still have the copy of the renamed template present?

@targos
Member Author

targos commented Dec 20, 2022

Ok, thanks! Now how do I "install" the new key on the agent? I tried to rerun the playbook after updating the secrets repo but it's still using the old key.

Did you still have the copy of the renamed template present?

Yes. TBH I don't understand why I had to create this in the first place. I was able to create a macOS host from scratch without it.
Also, while it seems that the placeholder key was put in the system config, the agent wasn't running with it, but with the previous key (from before the rename)

@richardlau
Member

Ok, thanks! Now how do I "install" the new key on the agent? I tried to rerun the playbook after updating the secrets repo but it's still using the old key.

Did you still have the copy of the renamed template present?

Yes. TBH I don't understand why I had to create this in the first place. I was able to create a macOS host from scratch without it. Also, while it seems that the placeholder key was put in the system config, the agent wasn't running with it, but with the previous key (from before the rename)

The templates are not needed. Perhaps we should remove them (they were used to set the Jenkins agent secret before the secrets repo existed). I suspect the agent still running with the previous key would have been resolved by systemctl daemon-reload, which makes systemd see the updated service definition.
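systemd itself reports when a unit file on disk is newer than the definition it has loaded, through the `NeedDaemonReload` property of `systemctl show`. The following is a minimal sketch of a check built on that; the helper name `needs_daemon_reload` is hypothetical, and the unit name `jenkins` is taken from the commands used earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read the output of
#   systemctl show --property=NeedDaemonReload UNIT
# on stdin and exit 0 if systemd reports that the unit file changed
# on disk since it was last loaded.
needs_daemon_reload() {
    grep -qx 'NeedDaemonReload=yes'
}

# Usage on the agent (not executed here):
#   systemctl show --property=NeedDaemonReload jenkins | needs_daemon_reload \
#     && systemctl daemon-reload && systemctl restart jenkins
```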

@targos
Member Author

targos commented Dec 21, 2022

Can I merge this?

@richardlau
Member

Can I merge this?

Yeah, looks okay to merge. I'm surprised we don't need the benchmarking role on this machine but the inventory didn't have it before 🤷.

@targos
Member Author

targos commented Dec 21, 2022

I'm surprised we don't need the benchmarking role on this machine but the inventory didn't have it before

I'm still not sure if we need it. Note that I ran the playbook before removing it (also for #3135, I forgot to pull the last change done on GH).

@targos targos merged commit a684475 into nodejs:main Dec 21, 2022
@targos targos deleted the ubuntu18-bench-2 branch December 21, 2022 16:53
@anonrig
Member

anonrig commented Dec 21, 2022

Thanks @targos, for the work you did on this!
