Set up new Ampere Altra machines in the build farm to replace older packetnet systems #2729
Comments
Successful test-commit run on two docker images on the machine, however there were issues running the playbooks:
|
You could copy
If using Docker it might make sense to follow a similar framework to what we do on the x64 containers? See
These run the Jenkins agent directly in the containers, see the templates in https://github.com/nodejs/build/tree/master/ansible/roles/docker/templates. The host running the containers gets a service to start the containers.
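As a rough illustration of that arrangement (not the actual templates from the repository), the sketch below builds the `docker run` command a host-side service might issue for each agent container. The container names and image tags are hypothetical; the only Docker CLI options used are the standard `--detach`, `--restart` and `--name`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentContainer:
    """One Jenkins agent container; both fields are hypothetical examples."""
    name: str   # container name, assumed here to match the Jenkins node name
    image: str  # locally built image tag


def docker_run_command(container: AgentContainer) -> list[str]:
    """Build the argv a host service could use to start one agent container."""
    return [
        "docker", "run",
        "--detach",             # run in the background
        "--restart", "always",  # let Docker restart the agent if it exits
        "--name", container.name,
        container.image,
    ]


# Hypothetical layout: two containers on one host.
CONTAINERS = [
    AgentContainer("test-example-ubuntu2004-arm64-1", "node-ci:ubuntu2004"),
    AgentContainer("test-example-ubuntu1804-arm64-1", "node-ci:ubuntu1804"),
]

if __name__ == "__main__":
    for container in CONTAINERS:
        print(" ".join(docker_run_command(container)))
```

Keeping the container list as data means adding another OS image is a one-line change, which is roughly the benefit of templating this per host rather than hand-writing each service.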
|
Yep that will probably be the better option :-) |
Note: Also required the |
@sxa please try and use or extend the existing docker-host setup we have in Ansible; if you see opportunity for improvement then that's great and of course there'll be some things that are necessarily different for the new environment. What we want to avoid is duplication, doing essentially the same thing but in multiple ways .. we have enough of that with our mixed infra already! |
Absolutely @rvagg - this was the first time I'd used the ansible scripts directly myself so I wanted to do some experiments/education within statically generated docker containers and separate jenkins jobs on the system while I had a clean machine to play with before integrating the machine fully, since they're easy to regenerate. Certainly don't want to add unnecessary complexity.
Hopefully not too many :-) |
@rvagg I've stuck a couple of DRAFT PRs in to support creating containers on the new system and am setting them up accordingly as I write this (nothing that can't be undone obviously). We'll have three of these new servers to replace the previous ones. We'll need to look at what to do to replace |
Current allocation of packetnet ThunderX arm64 machines:
Would @nodejs/build be ok with reducing this to two of the new systems running docker containers and the third as a release machine? I don't believe they can have CentOS7 directly installed just now, so we may have to run the release image within a container too. For the normal machines we then have to choose between using Ubuntu 20.04 as the host OS on each, or using CentOS8 (as long as it's supported!) on one and Ubuntu 20.04 on the other. As far as possible I'd probably choose not to run things directly on the host since it's likely quite wasteful of such a large machine, unless we want to be completely sure that we're running some tests on the bare metal in case certain problems are masked in docker containers. I've set one of them up with 8 containers - four each of Ubuntu 18.04 and Ubuntu 20.04. Of each group of 4 one is from the normal dockerfile, and three are from the |
+1 to dropping 16.04 +1 to setting up the release machine with docker and releasing from within a centos container, although there might be some difficulties getting that working because there's additional steps to getting ssh config set up properly for releasing—some of which is typically done manually! I don't think we even have an Ansible way to get the ssh key in there.. So doing it in a Docker container might have challenges. Although we have cross-compile containers in our mix doing releases, so I suppose I got that working somehow! The rest sounds good, along with the other comments I had via email. Sorry I haven't been very responsive, a lot going on at the moment. |
ok, so another thought - the 18.04 containers may not even be necessary, I wouldn't object if you just wanted to go with 20.04 and multiply them a bit. Your call. |
No worries - that response is great and means I'll look at getting this stuff merged and live today and try and get some of the older systems decommissioned for Packet to take them back. I'll have a chat with Richard on the release machine setup.
My gut feel is always that it's nice to at least try and support things that are currently in service so I'll stick with the current split on this machine and possibly only do 20.04 on the second machine. |
@sxa, all sounds good to me as well. One thing is that with the security release next week it might make sense to try to hold on to the old machines until just after that goes out? |
As of yesterday one of the Ubuntu 16.04 machines has been decommissioned (it had been marked offline for a few weeks): #2708 - the rest were still fully active during the security release. The following docker images on the first Altra are now live:
The Ubuntu ones have been live for a few days and seem to be working ok, subject to needing the The CentOS one has just been made live today after adding the I have marked the old CentOS7 boxes offline in jenkins for now to test all runs on the new CentOS7 system.
I've also removed the |
Is there a plan to run something on the sharedlib containers? |
The node-test-commit-arm jobs should be able to run on them - the first one I'd set up with the tags was |
Just one container for centos7 ? Sounds like we had a couple of machines before? |
@mhdawson That would (eventually) be one container per machine giving us two containers in the test CI (replacing the two existing Packet centos7 machines). |
@richardlau k, got it. |
Updated the following jobs so they will be ok on the new systems.
The libuv job needed some extra prereqs as per #2744 I'm going to formally decommission the following today:
They have been offline for a few days and no extra problems have been observed. |
At this point the outstanding actions will be to set up the second Altra for build/test and the extra machine to replace the release ThunderX box. |
(Splitting off the release machine into a separate issue as shown above) The second build/test Altra, test-equinix-centos8-arm64-1 (139.178.85.13), has now been provisioned. It will need to be tested to see how well the dockerhost setup works on CentOS8, but worst case we can switch it to Ubuntu 20.04 like the other one :-) |
Release machine now decommissioned:
|
I've been experimenting today with one of the other things I mentioned in the description of this issue - running armv7l OS images in containers on these new hosts. We've done this successfully at the Adoptium project and it's looking promising here so far too: #2775 |
arm32 container live and included in the main node-test-commit-arm job under the |
@sxa We're currently setting up RHEL 8 CI instances to replace CentOS 7 for Node.js. Any thoughts on how we should handle the release machine for ARM 64? We'll need to be able to keep building Node.js 12/14/16 on CentOS 7 but also build Node.js 18 onwards on RHEL 8. We have hit one issue in V8 canary (nodejs/node-v8#220) that suggested an incompatibility with the kernel (too old) in CentOS 7 which I think means we wouldn't be able to run a RHEL 8 container on top of CentOS 7 as it would still be using the older kernel. |
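Because a container shares the kernel of its host, the concern above can be checked from inside any container by looking at the running kernel release. A minimal sketch, assuming a hypothetical minimum of 4.18 (the RHEL 8 kernel series) purely for illustration; the actual requirement behind nodejs/node-v8#220 is not stated in this thread.

```python
import platform
import re


def kernel_version(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '3.10.0-1160.el7.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


# Hypothetical threshold, chosen for illustration only.
MIN_KERNEL = (4, 18)


def kernel_is_new_enough(release: str, minimum: tuple[int, int] = MIN_KERNEL) -> bool:
    """Return True when the kernel release is at or above the given minimum."""
    return kernel_version(release) >= minimum


if __name__ == "__main__":
    # On Linux this reports the host kernel, even when run inside a container.
    release = platform.release()
    print(release, "ok" if kernel_is_new_enough(release) else "too old")
```

Run inside a RHEL 8 container on a CentOS 7 host, a check like this would still see the 3.10 kernel of the host, which is the incompatibility described above.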
This issue is stale because it has been open many days with no activity. It will be closed soon unless the stale label is removed or a comment is made. |
Equinix (formerly Packet) are replacing their older aarch64 hardware and consolidating us on new 160-core Ampere Altra systems. This issue will track progress/deployments/migration on the new machines.
My intention is to prototype creating multiple docker containers underneath a base OS of Ubuntu 20.04, so that a wider range of OSs can be used for testing. Unlike the ThunderX systems we had previously, these also support the 32-bit armv7 instruction set, so we can also look at running docker containers that could build and test our armv7l builds on these systems, which could allow us to reduce our reliance on the raspberry pi systems. (Ref #2708 as the machine that will likely be the first to be decommissioned)