
@DaAwesomeP (Member)

This is just a draft! Still lots to do, but it should go quickly once I fix this pesky GH Actions cancel issue.

The only problem right now is getting GitHub not to cancel the build after about 45 minutes. It must be OOM or some other resource limit; GitHub usually allows jobs to run for a very, very long time.

Example cancelled build (not by me!):

@DaAwesomeP (Member Author)

OK, I tried make check and this is fascinating... it also gets stuck compiling somewhere around libolausbdmx and the usbdmx and usbpro plugins. Maybe I can try turning those off.

@DaAwesomeP (Member Author) commented Apr 30, 2023

OK, I tried it, and I am having an issue where the following command does not disable usbdmx:

```
./configure --enable-rdm-tests --enable-java-libs --enable-ja-rule --enable-e133 --disable-usbdmx --disable-usbpro CPPFLAGS=-Wno-deprecated-declarations
```

The configure summary still lists usbdmx as enabled:

```
OLA Version 0.10.9

Prefix: '/usr/local'
Compiler: 'g++ -g -O2 -std=gnu++11 -pthread '
Linker: '/usr/bin/ld -m elf_x86_64   -ldl '
Python: /usr/bin/python3

Python API: yes
Java API: yes
Enable HTTP Server: yes
RDM Responder Tests: yes
Ja Rule: yes
Enabled Plugins: artnet dummy e131 espnet ftdidmx gpio karate kinet milinst opendmx openpixelcontrol osc pathport renard sandnet shownet spi stageprofi uartdmx usbdmx
UUCP Lock Directory: /var/lock
```

I am able to reproduce this locally.

UPDATE: It appears ja-rule causes this. Disabling for now.
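
For reference, the adjusted configure step comes out roughly like this (a sketch only: the flags besides --disable-ja-rule are copied from the command above, the step wrapper is illustrative, and --disable-ja-rule relies on autoconf generating a matching --disable-* spelling for every --enable-* option):

```yaml
# Sketch of the adjusted configure step; the step name is illustrative.
- name: Configure (ja-rule disabled)
  run: |
    ./configure --enable-rdm-tests --enable-java-libs --enable-e133 \
                --disable-ja-rule --disable-usbdmx --disable-usbpro \
                CPPFLAGS=-Wno-deprecated-declarations
```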

@DaAwesomeP (Member Author)

Hmmm, disabling these gets it further but not to completion. I'm thinking it must be running out of memory. I'll try disabling some of the parallelization.

@DaAwesomeP (Member Author)

Alright, finally! It took 20 minutes to run, but disabling parallelization (forcing -j1) gets it to the end of make check. This might fix distcheck too!
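
For illustration, the serialized step is as simple as this sketch (the step name is mine; only the -j1 forcing comes from this thread):

```yaml
# Sketch: force a single make job so the runner doesn't exhaust memory.
- name: Run tests serially
  run: make check -j1
```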

One check does fail (CredentialsTest) because the tests are running as root. What is this test testing? Why can't it run as root?

@DaAwesomeP (Member Author)

@peternewman @kripton Hey look at that! Fully working make distcheck and make check! I'm going to split verify-trees into a separate job after distcheck, and then this is good to go!

Looks like maybe nobody has run that verify trees check in a while.

I am no longer running make dist after make distcheck because it appears to be redundant: https://www.gnu.org/software/automake/manual/html_node/Preparing-Distributions.html

We still run make check separately because it also does gcov. It also has different configure flags that I copied from the Travis file.
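
Structurally, the split looks roughly like this sketch (the runner image and step bodies are placeholders, and the coverage-related configure flags copied from the Travis file aren't reproduced here):

```yaml
# Rough shape of the job layout described above; details are illustrative.
jobs:
  distcheck:
    runs-on: ubuntu-latest
    steps:
      - run: ./configure && make distcheck  # distcheck already exercises dist internally
  verify-trees:
    needs: distcheck                        # runs as a separate job after distcheck
    runs-on: ubuntu-latest
    steps:
      - run: echo "verify trees here"       # placeholder
  check:
    runs-on: ubuntu-latest
    steps:
      - run: ./configure && make check      # different configure flags here so gcov data is produced
```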

@DaAwesomeP changed the title from "GitHub Actions Build: Distcheck and Build" to "GitHub Actions Build: distcheck, check, verify-trees, coverage, coveralls" on May 1, 2023
@peternewman (Member)

> UPDATE: It appears ja-rule causes this. Disabling for now.

Yeah, sorry, I could have told you that: ja-rule needs the usbdmx plugin to talk to the interfaces and send DMX.

> Alright, finally! It took 20 minutes to run, but disabling parallelization (forcing -j1) gets it to the end of make check.

That's the same sort of runtime as Travis used to be.

> This might fix distcheck too!
🤞

> One check does fail (CredentialsTest) because the tests are running as root. What is this test testing? Why can't it run as root?

Because running as root is bad and means our code is at risk of being responsible for a security breach. There's a configure option to run as root (which I assume skips that test), but not running the check as root would be better!

> @peternewman @kripton Hey look at that! Fully working make distcheck and make check! I'm going to split verify-trees into a separate job after distcheck, and then this is good to go!

Amazing, just the minimally working check/distcheck would be great progress for now! For the short term that would probably be better than even verify trees etc., which could go in a later PR.

> Looks like maybe nobody has run that verify trees check in a while.

Yeah our CI has been broken for quite a while! I missed some bits in the release because it was AWOL (and I didn't run it manually). Some of that stuff hasn't changed though, so I'm slightly suspicious about other results.

> I am no longer running make dist after make distcheck because it appears to be redundant: https://www.gnu.org/software/automake/manual/html_node/Preparing-Distributions.html

That seems about as I vaguely understood.

> We still run make check separately because it also does gcov. It also has different configure flags that I copied from the Travis file.

That makes sense, but again leaving the gcov until later would be fine and probably makes sense too.

Is it worth updating this issue as it sounds like you've found the cause? actions/runner-images#7004

I wonder if our Docker usage has made it worse? Can we tell Docker to offer more memory/use more cores?

@DaAwesomeP (Member Author)

> Because running as root is bad and means our code is at risk of being responsible for a security breach. There's a configure option to run as root (which I assume skips that test), but not running the check as root would be better!

Fixed; I now create a new user and run the tests the same way the Debian build does.
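
Concretely, the non-root arrangement is roughly this sketch (the builduser name also appears in the later venv discussion; the exact steps are assumptions):

```yaml
# Sketch: run the test suite as an unprivileged user so CredentialsTest passes.
- name: Create build user
  run: useradd --create-home builduser
- name: Run tests as builduser
  run: sudo -u builduser make check -j1
```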

> Amazing, just the minimally working check/distcheck would be great progress for now! For the short term that would probably be better than even verify trees etc., which could go in a later PR.

Separated it out already! It has the same results as before separating.

> Yeah our CI has been broken for quite a while! I missed some bits in the release because it was AWOL (and I didn't run it manually). Some of that stuff hasn't changed though, so I'm slightly suspicious about other results.

I think I'm not going to fix the source of these failures in this pull, especially because I (:sob:) need to rebase this to 0.10.

> That makes sense, but again leaving the gcov until later would be fine and probably makes sense too.

Nearly done with this too, just fixing the Coveralls upload bit. Should be done in an hour or so I think.

> Is it worth updating this issue as it sounds like you've found the cause? actions/runner-images#7004

I know I should (and I will), though their lack of support with issues like these is not promising.

> I wonder if our Docker usage has made it worse? Can we tell Docker to offer more memory/use more cores?

Unsure. I can maybe try this out later.

@DaAwesomeP (Member Author)

Testing this is a funny process. I have a chain of different commits trying different things every 5 minutes, but because it takes 20 to 30 minutes to run I have to keep track of what was what in many tabs. Before I solved the OOM issue I had to keep these tabs open or I would lose the logs too.

@DaAwesomeP requested a review from peternewman on May 2, 2023 15:58
@DaAwesomeP (Member Author)

Hey @peternewman anything else to change or are we good to go?

@DaAwesomeP (Member Author)

@peternewman polite ping!

@peternewman (Member) left a comment

Sorry, a few more comments...

@DaAwesomeP (Member Author)

Debian 12 is now Debian stable, and it handles Python 3 system dependencies/pip differently, so I need to fix that. Working on that now!

@DaAwesomeP (Member Author) commented Jun 13, 2023

OK, I think I fixed the Python3 venv/system-wide issue (all builds are passing the configure stage). We'll know for sure in about 35 minutes.

Here is what needed to change:

  1. Debian 12 no longer allows pip to install packages system-wide alongside python3-* packages installed by APT.
  2. We now create a clean venv for the build. This venv is set to also use system packages, so we can still use the Debian-provided packages as they would be used for building. Like before, we also install the newer tools that we want from pip.
  3. When we run the build as builduser (since we can't run as root or we fail an OLA test), we must pass along the $PATH variable (which activates the venv) and also the CC/CXX variables. The --preserve-env flag won't pass $PATH, so we pass that with the env helper command (see the sketch after this list).
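
Put together, those three points come out to something like this sketch (the venv path and step names are illustrative assumptions):

```yaml
# Sketch of the Debian 12 venv + non-root test flow described above.
- name: Create a venv that can still see APT-installed python3-* packages
  run: |
    python3 -m venv --system-site-packages /tmp/ola-venv
    echo "/tmp/ola-venv/bin" >> "$GITHUB_PATH"   # venv goes first on $PATH for later steps
- name: Run tests as builduser
  # sudo's --preserve-env keeps CC/CXX but resets $PATH, so $PATH is
  # handed through explicitly with the env helper command:
  run: sudo --preserve-env=CC,CXX -u builduser env "PATH=$PATH" make check
```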

@DaAwesomeP (Member Author)

Passing! I was simply missing `id:` on the number-of-cores step.
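
For anyone who hits the same thing: a step's outputs can only be referenced when the step has an id, e.g. (a minimal sketch; the output name and the consuming step are assumptions, not the workflow's actual contents):

```yaml
- name: Count cores
  id: num-cores                                    # without this id:, the output below is unreachable
  run: echo "count=$(nproc)" >> "$GITHUB_OUTPUT"
- name: Build
  run: make -j "${{ steps.num-cores.outputs.count }}"
```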

@DaAwesomeP requested a review from peternewman on June 13, 2023 22:43
@peternewman (Member) left a comment

Just the one TODO comment, which I'll auto-merge.

Thanks again for this @DaAwesomeP! It's awesome!

@peternewman enabled auto-merge on June 18, 2023 16:53
@peternewman merged commit 175ea59 into OpenLightingProject:0.10 on Jun 18, 2023
@peternewman (Member)

Coverage wasn't in the required list, but I figure it probably doesn't need to be unless we're being really strict!

I guess we'll now see if it pushes our coverage status correctly...

@DaAwesomeP (Member Author)

> Coverage wasn't in the required list, but I figure it probably doesn't need to be unless we're being really strict!

It might be possible to set up rules that block merging if coverage dips by X percent in total? I'm not sure.

> I guess we'll now see if it pushes our coverage status correctly...

Passed! See https://coveralls.io/github/OpenLightingProject/ola?branch=0.10 and https://coveralls.io/jobs/123479842

@peternewman added this to the 0.10.10 milestone on Jun 18, 2023
@peternewman (Member)

>> Coverage wasn't in the required list, but I figure it probably doesn't need to be unless we're being really strict!

> It might be possible to set up rules that block merging if coverage dips by X percent in total? I'm not sure.

Maybe if we can just get it to comment if it drops by say 5% or something? Then it will only be usefully spammy!
