
[WIP] Add ARM64 support #803

Closed
wants to merge 11 commits into from

Conversation

@Alxandr commented Feb 8, 2019

What this PR does / why we need it: It adds ARM64 support to Brigade, allowing it to run on ARM (and hybrid ARM64/AMD64) clusters.

If applicable:

  • this PR contains documentation
  • this PR contains unit tests
  • this PR has been tested for backwards compatibility

No code changes were made, so this should neither need unit tests nor break backwards compatibility. It might warrant a docs update just to note that ARM is supported, though?

Resolves #773

@msftclas commented Feb 8, 2019

CLA assistant check
All CLA requirements met.

@radu-matei (Contributor)

Question: I'm getting some warnings while building the worker:

Step 5/6 : RUN yarn install && yarn build
 ---> Running in 3301a56a3fa4
Unknown host QEMU_IFLA type: 40
Unknown host QEMU_IFLA type: 41
Unknown host QEMU_IFLA type: 40
Unknown host QEMU_IFLA type: 41
Unknown QEMU_IFLA_INFO_KIND ipip
Unknown host QEMU_IFLA type: 40
Unknown host QEMU_IFLA type: 41
Unknown QEMU_IFLA_INFO_KIND ip6tnl
Unknown host QEMU_IFLA type: 40
Unknown host QEMU_IFLA type: 41
yarn install v1.12.3
Unknown host QEMU_IFLA type: 40
Unknown host QEMU_IFLA type: 41
Unknown host QEMU_IFLA type: 40
Unknown host QEMU_IFLA type: 41
Unknown QEMU_IFLA_INFO_KIND ipip
Unknown host QEMU_IFLA type: 40
Unknown host QEMU_IFLA type: 41
Unknown QEMU_IFLA_INFO_KIND ip6tnl
Unknown host QEMU_IFLA type: 40
Unknown host QEMU_IFLA type: 41

However, they don't seem to directly affect building the image. Do we at least know what is causing them?

@Alxandr (Author) commented Feb 8, 2019

I have the same warnings, and no, I'm not quite sure. As you said, they don't seem to be affecting the image. Google has been most unhelpful. What operating system are you running on? I was hoping that building on Linux might not give this error (or might give a different one).

That being said, thinking about it (while typing): I would guess the issue is yarn downloading native (pre-built) dependencies of some sort?
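
For context: these warnings come from user-mode QEMU emulating the ARM build on a non-ARM host. A common prerequisite for this kind of build (one well-known approach, not necessarily the exact setup used here) is registering QEMU's binfmt handlers with the host kernel:

# Register QEMU binfmt handlers so the host kernel can exec foreign-arch
# binaries; --reset clears stale registrations, -p yes makes them persistent
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes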

@radu-matei (Contributor)

I'm currently testing on Windows, but in a couple of hours I can also test on Linux and macOS.
Yeah, it appears to have to do with pulling the Node packages. As long as the resulting worker image is functional (and we open an issue so we remember to actually track this at some point), I'm fine with merging.

@Alxandr (Author) commented Feb 8, 2019

I need to actually test that the resulting ARM images work first. I'm looking at the Helm chart to see what values I need to modify to deploy to my cluster.

@radu-matei (Contributor) commented Feb 8, 2019

I tested the build, build images, push images, and push manifest bits of this PR.

Let us know how actually deploying Brigade on an ARM cluster goes, and I'd also like to ping @vdice and/or @adamreese since this PR pokes at the Makefile.

Thanks!

@Alxandr (Author) commented Feb 8, 2019

Also on the TODO list: the brigade.js file might need changing. I haven't really looked at it, but I assume it also builds the project?

@radu-matei (Contributor)

Yeah, and all releases are actually done through Brigade, so that will also need updating.

@Alxandr (Author) commented Feb 9, 2019

I'm getting the following error from vacuum:

/usr/bin/brigade-vacuum: line 1: syntax error: unterminated quoted string

Any idea what could be causing this?

@Alxandr (Author) commented Feb 9, 2019

Figured it out. I was putting the amd64 binary in the arm image -.-

I can now run jobs successfully:
https://asciinema.org/a/Y8rreMLgk7Voo48CAt2pkfBG9
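
For reference: that "unterminated quoted string" message is typically what BusyBox's shell emits when the kernel can't exec a wrong-architecture binary and the shell falls back to interpreting the ELF bytes as a script. A quick sanity check for this class of mistake (the path below is hypothetical):

# Confirm the binary baked into the ARM image really is ARM
file rootfs/brigade-vacuum
# want: ELF 64-bit LSB executable, ARM aarch64, ...
# an amd64 binary would instead report x86-64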

@radu-matei (Contributor)

@vdice - could you have a look at this when you get a couple of minutes?
Also, the question about the changes in brigade.js seems right up your alley 😂

Thanks!

@vdice (Contributor) left a comment

Cool stuff! Let's hold off on changes needed in brigade.js until some of the questions/concerns around the changes in Makefile get hashed out.

Makefile: three resolved review threads (outdated)

@Alxandr (Author) commented Feb 13, 2019

@vdice I did a major overhaul of the Makefile. Targets are now based on files (the way make is supposed to work) as far as possible. I've not tested the manifest creation yet, but building single or multiple arches works. The Dockerfiles now also take arguments, so they will not go out of sync. As a downside, though, the Makefile is now harder to understand (although I guess that's subjective). Please take a look at it and see if you like it; if not, I can go back to scratch and see if it can be done another way. Though considering how you're using make (all of the targets are .PHONY), switching to another, easier-to-handle task runner might be a suggestion?

@vdice (Contributor) left a comment

This is fantastic work, @Alxandr ! I, for one, am all in favor of an overhaul of the Makefile if it means improved, more make-appropriate usage. The immediate benefits I see are:

  1. Finally harnessing the power/functionality of Make with file-based targets where appropriate, thereby only processing/re-building files when their source has changed.
  2. Dynamic generation of arch-specific Dockerfiles
  3. Target generation for the various combinations of Brigade component, OS, and arch is now fully captured by re-usable functions and corresponding foreach loops

As you've mentioned, this approach definitely requires a bit more time to understand. I'm hoping adding more comments (with examples), as suggested in a PR comment here, helps in this regard.

There remain a few important points that I think we want to ensure before going this direction:

  1. make targets continue to work for all platforms. Here, I'm most interested in @radu-matei 's WSL experience and not so much worried about Mac (my testing looks great w/ some caveats mentioned in PR comments) -- and certainly less so w/ Linux :)
  2. Ensure make build by default only builds the os-arch combination (for brigade component binaries in bin/) matching the developer's host machine. I'm still seeing all three of linux, darwin and windows being built when I run this target. (Also mentioned in a PR comment on this review.)
  3. Ensure the momentum to cut/release Brigade v1.0.0 isn't disrupted (not super worried, just anticipating the CI/brigade.js changes)

I'm glad to help sorting out changes needed in brigade.js when we get some of the details above sorted out. Again, the work put in on this overhaul (make-over 😄) is very much appreciated and I personally do like the direction.

Makefile Outdated
# This is for debugging the makefile
.PHONY: list-targets
list-targets:
@$(MAKE) -pRrq -f $(lastword $(MAKEFILE_LIST)) : 2>/dev/null | awk -v RS= -F: '/^# File/,/^# Finished Make data base/ {if ($$1 !~ "^[#.]") {print $$1}}' | sort | egrep -v -e '^[^[:alnum:]]' -e '^$@$$' | xargs

@vdice (Contributor):

Nice -- and crucial -- given how the targets are now generated.

Can we separate each target by a newline? I did so via xargs -0.
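
For illustration, the newline-separated variant swaps only the final xargs (a sketch; dropping the trailing xargs entirely would also leave one target per line, since sort's output is already newline-delimited):

.PHONY: list-targets
list-targets:
	@$(MAKE) -pRrq -f $(lastword $(MAKEFILE_LIST)) : 2>/dev/null | awk -v RS= -F: '/^# File/,/^# Finished Make data base/ {if ($$1 !~ "^[#.]") {print $$1}}' | sort | egrep -v -e '^[^[:alnum:]]' -e '^$@$$' | xargs -0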

Makefile Outdated

.PHONY: build build-images push-images dockerfiles

define GO_CMD_TARGETS

@vdice (Contributor):

I'm thinking some further comments around each of the define sections would help new users -- and even us after time goes by! -- more quickly grok how it all ties together.

For example, maybe we add, godoc style:

# GO_CMD_TARGETS generates go build targets using the provided 
# Brigade component ($1) and architecture ($2), e.g., brig/rootfs/amd64/rootfs
#
# See the `foreach` loop that invokes this function - elsewhere in this file - for how these targets are generated
define GO_CMD_TARGETS
...

Definitely open as to wording, but at a high-level being sure to mention what each arg ($1, $2, etc.) corresponds to and a literal example.

Makefile Outdated
$(foreach cmd,$(BINS),$(eval $(call BRIG_STATIC_ARCH_TARGETS,$2,$1,$(cmd))))
endef

define GO_ARCH_TARGETS

@vdice (Contributor):

I'm wondering if we should move lines 128-140 further up, to the top, before the definitions of the functions that are invoked in each foreach statement (BRIG_STATIC_TARGETS, etc.).

Then the flow to understand would be: seeing these meta targets and foreach loops, then digging in to each looped function for how each parameterized target is built...

@Alxandr (Author):

I'm an F#-er, so I'm used to reading files with the exports at the bottom and the implementation details at the top, hence the ordering (up to the point where I confused myself, I think). I'll try to move the targets that are more likely to be called higher up.

Makefile Outdated
$1: $1-images

create-$1-manifest: push-$1-images
# Remove old manifest if present

@vdice (Contributor):

Are there any drawbacks in removing the need for this cleanup logic and instead adding --amend to the docker manifest create?

I explored this option as the cleanup logic broke down for me when my registry was a non-docker.io registry (ACR) -- so the manifest file was named differently ($(DOCKER_REGISTRY)_$1-$(IMAGE_VERSION)). Instead of updating the cleanup logic, I thought it'd be great if we could get rid of it altogether if docker manifest create --amend ... would work for us.

@Alxandr (Author):

Honestly, I don't know. The docker manifest commands are so badly documented that I'm not even sure what they really do; I've just found a way to use them that works 😛. My main concern with --amend is that repeated builds might just keep elongating the manifest list. That shouldn't be a problem as long as the manifest tag and the image tags are the same, but I would have to test it. Also, when doing manifest push, we can add --purge (with an option to disable it for debugging purposes), which will purge after pushing.
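
A sketch of the amend-plus-purge flow under discussion (image names are hypothetical; the flags are the documented ones from the experimental docker manifest CLI):

# --amend extends an existing local manifest list instead of failing,
# which removes the need for the "remove old manifest" cleanup step
docker manifest create --amend example/brigade-worker:v1 \
	example/brigade-worker:v1-amd64 \
	example/brigade-worker:v1-arm64

# Annotate the ARM entry (mirrors arm64_DOCKER_ANNOTATIONS below)
docker manifest annotate example/brigade-worker:v1 \
	example/brigade-worker:v1-arm64 --os linux --arch arm64 --variant v8

# --purge deletes the local manifest list after a successful push,
# so repeated builds start from a clean slate
docker manifest push --purge example/brigade-worker:v1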

Makefile Outdated
arm64_DOCKER_ANNOTATIONS = --os linux --arch arm64 --variant v8
arm32_DOCKER_ANNOTATIONS = --os linux --arch arm

.PHONY: build build-images push-images dockerfiles

@vdice (Contributor):

Should we add a top-level manifests target as well? (Or update-manifests, etc., to follow the <verb>-<noun> pattern.)

Then, we could add the following in the DOCKER_MANIFEST_TARGETS function:

manifests: create-$1-manifest

Flow for CI would then be: make build-images push-images manifests for building/pushing/updating manifests.
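
A minimal sketch of that aggregation (names other than DOCKER_MANIFEST_TARGETS, such as IMAGES, are placeholders for whatever this Makefile already loops over):

# Empty aggregate target; each generated manifest rule attaches itself to it
.PHONY: manifests
manifests:

define DOCKER_MANIFEST_TARGETS
# ... existing create-$1-manifest / push rules ...
manifests: create-$1-manifest
endef

# IMAGES stands in for the actual image list used elsewhere in the file
$(foreach img,$(IMAGES),$(eval $(call DOCKER_MANIFEST_TARGETS,$(img))))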

@Alxandr (Author):

Yeah, the names should probably be walked over (put in a list, drawn in a tree, and then painted on a wall a few times, to get a good feel for them 😛). I spent quite a while on this, re-discovering make as I went (it's been a while since I last wrote a Makefile), so my state of mind probably changed quite a bit while writing it, leading to inconsistencies. A dependency graph in comments at the top of the file would probably be a good idea.

@@ -0,0 +1,13 @@
FROM arm64v8/alpine:3.8

@vdice (Contributor):

Was there a reason to keep this arch-specific Dockerfile around? Or can it be removed now that they are dynamically generated?

@Alxandr (Author):

Oversight. Will remove it.

Makefile Outdated

# Create arch dockerfile
$1/Dockerfile.$2: $1/Dockerfile
@cat $1/Dockerfile | ARCH="$2" QEMU_IMPORT="$$($2_QEMU_IMPORT)" DOCKER_ARCH="$$($2_DOCKER_ARCH)" envsubst '$$$${ARCH} $$$${QEMU_IMPORT} $$$${DOCKER_ARCH}' > $1/Dockerfile.$2

@vdice (Contributor):

We'll need to check for the existence of envsubst and either install in a platform-agnostic manner if it doesn't exist -- or exit out and inform the user that it is a required dependency to run the targets needing it.

As a Mac user, I didn't have it by default, although I did have the package that contained it (gettext). For me, the key was the latter command from: brew install gettext && brew link --force gettext
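
One way to express that guard (a sketch; the target name and message are placeholders):

# Fail fast with install instructions instead of a cryptic envsubst error
.PHONY: check-envsubst
check-envsubst:
	@command -v envsubst >/dev/null 2>&1 || { \
		echo "envsubst is required (macOS: brew install gettext && brew link --force gettext)"; \
		exit 1; \
	}

# Wire it in as an order-only prerequisite so it runs without making the
# generated Dockerfiles perpetually out of date:
#   $1/Dockerfile.$2: $1/Dockerfile | check-envsubst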

@Alxandr (Author):

I did the same command :). Alternatively, using something like go templates might be a better idea?

@Alxandr (Author):

Just from half a minute of googling: go get github.com/hairyhenderson/gomplate/cmd/gomplate, maybe?

@vdice (Contributor):

Nice -- if that utility works, a go get-able option would be a good choice.
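
A sketch of what the gomplate-based substitution might look like (flags and template syntax per gomplate's documentation; the file names and values mirror the envsubst rule above and are purely illustrative):

# Dockerfile.tmpl would use {{ getenv "DOCKER_ARCH" }} where the
# envsubst version uses ${DOCKER_ARCH}
ARCH=arm64 DOCKER_ARCH=arm64v8 \
	gomplate -f Dockerfile.tmpl -o Dockerfile.arm64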

Makefile Outdated

.PHONY: helm-install
helm-install: helm-upgrade

.PHONY: helm-upgrade
helm-upgrade:
helm upgrade --install $(BRIGADE_RELEASE) brigade/brigade --namespace $(BRIGADE_NAMESPACE) \
--set brigade-github-app.enabled=true \
--set gw.service.type=ClusterIP \

@vdice (Contributor):

I'd like to go back to how master looks for this target (gw.* sets removed in place of --set brigade-github-app.enabled=true), as we hope to deprecate the classic GH gateway (gw) in favor of the brigade-github-app gateway.

@Alxandr (Author):

This was probably just a merge conflict. Pretty sure I didn't intentionally touch the helm-upgrade target.

Makefile Outdated

define GO_CMD_TARGETS
# Build command
$1/rootfs/$2/$1: vendor $(shell find $1 -type f -name '*.go')

@vdice (Contributor) commented Feb 14, 2019

I must admit I'm still trying to figure out where the change is needed :) but I'm seeing these (docker binary) image targets being called when I run make build.

I think I prefer maintaining the pattern of make build only building go binaries (and placing into /bin) corresponding to the os+arch of the host machine.

Then, these docker binary build targets (putting the linux+arch binaries in /rootfs) would only be run, as needed, by make build-images (and/or, if we had the need to run them separately, e.g. via make build-docker-bins).
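
A sketch of pinning make build to the host platform (detection via go env; the bin path layout is illustrative, not lifted from this Makefile):

HOST_OS   := $(shell go env GOOS)
HOST_ARCH := $(shell go env GOARCH)

# build depends only on host-matching binaries in bin/; the linux+arch
# rootfs/ binaries stay reachable solely through build-images
build: $(foreach cmd,$(BINS),bin/$(cmd)-$(HOST_OS)-$(HOST_ARCH))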

@Alxandr (Author):

Fair enough. I'll try drawing a dependency graph with the names; then we can discuss how it should work once it's more visible. That makes it easier to make naming changes like this.

@vdice (Contributor) commented Feb 19, 2019

Hi @Alxandr , an update from today's (2/19/19) Brigade Dev Meeting. Although we/I will still continue to test/review the functionality presented in this pull request, the main concern at this time is to reduce the potential for speed bumps/issues in releasing Brigade 1.0.0. As the Makefile has been fully overhauled and the core maintainers have yet to fully understand its new design/mechanisms, the expectation is that the earliest this work would be merged is after that milestone.

@Alxandr (Author) commented Feb 19, 2019

Fair enough. I have things running on my cluster, which was my main concern. I have one issue, though, which I would probably label a "bug" (though it would not matter if this branch got merged): I have to modify the worker of every new project I create, even though I set the worker in the values for the Helm template. Other than that, things are working just as they should on my end.

@odidev commented Feb 15, 2021

@vdice

I am also interested in a Linux/ARM64 Docker image release for ‘brigade’.

According to the previous comments, it seems the Linux/ARM64 Docker image addition was in your backlog, but the activity was put on hold due to the v1.0.0 milestone.

May I know why this activity is still on hold?
If you are still interested in this PR, could you please test and review the changes?

Kindly let me know if there is anything I can do, I will be happy to help and contribute.

@krancour (Contributor)

@odidev I cannot speak directly to why this PR stalled out two years ago. I can tell you that, currently, all our available resources are being directed at Brigade 2.0.

We would love to add ARM64 support, and it's certain to happen in some version 2.x > 2.0.

I'm also certain the state of the art in building multi-architecture Docker images has advanced in the two years since this PR was touched.

Assistance is welcome with multi-arch builds for 2.0. See the v2 branch.

@radu-matei (Contributor)

One issue was that we didn't have the proper infrastructure for CI - and I'm still not sure if we have it now.

I believe @squillace may have some more info around ARM CI?

@Alxandr (Author) commented Feb 16, 2021

I've (personally) moved from ARM back to x86, so I don't really have a stake in this anymore, but I did keep up to date on multi-arch images and such, so I can say there are several new(ish) things that make this far easier than when I started on that journey (I don't remember where in the spectrum this image falls). For anyone who wants to look at this, I recommend looking at "docker buildx bake". I will close this PR as I have no idea what's in it anymore :P.
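
For anyone picking this up later, the modern flow Alxandr alludes to collapses the per-arch build, push, and manifest steps into one command (image name and platform list are illustrative; a bake file adds a declarative layer on top of the same mechanism):

# One-time setup: a BuildKit builder that can cross-build via QEMU
docker buildx create --use --name multiarch

# Build both architectures and push a manifest list in a single step,
# replacing the manual docker manifest create/annotate/push dance above
docker buildx build --platform linux/amd64,linux/arm64 \
	-t example/brigade-worker:latest --push .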

@Alxandr closed this Feb 16, 2021

@odidev commented Feb 16, 2021

@radu-matei @krancour @Alxandr

Thank you for the clarifications.

I tried building the package (v2 branch) locally on Linux x86_64 and ARM64 machines using the ‘make’ command. The build succeeded on the x86_64 machine, and amd64 binaries were generated in the ‘bin’ folder.
But the build failed on the Linux/ARM64 machine because qemu was unable to detect the host platform (linux/arm64/v8), as shown below:

qemu: uncaught target signal 11 (Segmentation fault) - core dumped

May I know if you are planning to use GitHub Actions to add the Linux jobs to the CI on the development branch ‘v2’? If yes, it would be feasible to add the Linux/ARM64 Docker image build and release process via GitHub Actions CI.

Also, are you planning to add Linux/ARM64 support in the 2.0 release? If yes, I will be happy to contribute to the ARM64 Docker image release process.

Kindly share your opinion on this.

@krancour (Contributor)

May I know if you are planning to use GitHub Actions to add the Linux jobs to the CI on the development branch ‘v2’?

We have used GitHub Actions for only one thing -- building the Windows log agent image. And the reason we're doing that is that building a Windows-based Docker image inside another Windows-based Docker container does not seem to be possible or practical yet. With this one exception, we're avoiding GitHub Actions for the sake of "dogfooding" Brigade. Our CI and CD are, and will remain, based on Brigade. (Full transparency: we're currently using Brigade 1.4.something to build the v2 branch, but that will change as other things in the Brigade v2 "orbit", like the GitHub gateway, start to mature.)

Also, are you planning to add Linux/ARM64 support in the 2.0 release?

Reiterating what I already said:

We would love to add ARM64 support, and it's certain to happen in some version 2.x > 2.0.

Also, @radu-matei has raised an excellent point that we may still be lacking the infrastructure to accomplish this. I.e., I'm not sure we have access to ARM64 VMs right now.

@odidev commented Mar 9, 2021

Also, @radu-matei has raised an excellent point that we may still be lacking the infrastructure to accomplish this. I.e., I'm not sure we have access to ARM64 VMs right now.

One issue was that we didn't have the proper infrastructure for CI - and I'm still not sure if we have it now.

I believe @squillace may have some more info around ARM CI?

@krancour @radu-matei, can't we use Travis CI for this, since it has support for multi-architecture builds?

@krancour (Contributor) commented Mar 9, 2021

@odidev While Brigade isn't exclusively for CI, CI is a popular use case for Brigade. To the greatest extent we're able, we prefer to implement our CI by "dogfooding" our own product. I think I've previously indicated this preference for dogfooding. For the time being, using Travis or Circle or similar is off the table.

I can see you're keen to get ARM64 support. You're welcome to explore whatever options may exist that don't violate our preference for dogfooding.

@squillace

Per #803 (comment), @radu-matei, I will insist on ARM outputs at some point, and I will arrange to pay for them as well. Lemme look into this more deeply. (I'm on Team Pinebook over here.)

As to anything but Brigade in the Brigade automation, I'm completely on Team Dogfood. FWIW and if anyone cares.

@odidev commented Mar 24, 2021

@krancour , I've built the package on my local ARM64 system. It builds successfully there, but doing so requires Docker images of the "brigadecore/kaniko", "brigadecore/helm-tools", and "brigadecore/go-tools" packages for the ARM64 platform on Docker Hub. I've raised issues about this in the respective repositories. Could you please look into those issues?
