Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add oic auth e2e #1701

Merged
merged 22 commits into from
Sep 11, 2024
Merged

Add oic auth e2e #1701

merged 22 commits into from
Sep 11, 2024

Conversation

fcojfernandez
Copy link
Member

Adding a test for the oic-auth plugin. The test relies on test-containers to start a container for keycloak to provide the user, groups and roles.

  1. Configures the keycloak container
    1. Create a test realm with a couple of roles
    2. Create a client for Jenkins
    3. Create users and groups, adding users to the groups and assigning roles to the groups
  2. Configures the SecurityRealm to use keycloak. A new SecurityRealm class has been created for this purpose
  3. Adds two tests so the login and logout is tested from Jenkins to Keycloak and from Keycloak to Jenkins.
  4. As part of the use check, the users are checked with their roles, which comes from the groups proving the user details are provided by the OICD provider through the oic-auth plugin. To check easily the roles, the whoAmI endpoint is used

Testing done

Submitter checklist

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests - that demonstrates feature works or fixes the issue

@jtnord
Copy link
Member

jtnord commented Sep 2, 2024

tests are failing as there is no docker available... (which is kinda fun for the ATH running here!)

Tests should skip if docker is not available, but the ATH run in ci.jenkins.io surely should have docker!!

org.testcontainers.containers.ContainerFetchException: Can't get Docker image: RemoteDockerImage(imageName=quay.io/keycloak/keycloak:25.0, imagePullPolicy=DefaultPullPolicy(), imageNameSubstitutor=org.testcontainers.utility.ImageNameSubstitutor$LogWrappedImageNameSubstitutor@5c7bfdc1)
	at org.testcontainers.containers.GenericContainer.getDockerImageName(GenericContainer.java:1364)
	at org.testcontainers.containers.GenericContainer.logger(GenericContainer.java:671)
	at dasniko.testcontainers.keycloak.ExtendableKeycloakContainer.<init>(ExtendableKeycloakContainer.java:138)
	at dasniko.testcontainers.keycloak.ExtendableKeycloakContainer.<init>(ExtendableKeycloakContainer.java:126)
	at dasniko.testcontainers.keycloak.KeycloakContainer.<init>(KeycloakContainer.java:24)
	at plugins.OicAuthPluginTest.<init>(OicAuthPluginTest.java:46)
	at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486)
	at org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:250)
	at org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:260)
	at org.junit.runners.BlockJUnit4ClassRunner$2.runReflectiveCall(BlockJUnit4ClassRunner.java:309)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:306)
	at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
	at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:316)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:240)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:214)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:155)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:385)
	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:162)
	at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:507)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:495)
Caused by: java.lang.IllegalStateException: Could not find a valid Docker environment. Please see logs and check configuration
	at org.testcontainers.dockerclient.DockerClientProviderStrategy.lambda$getFirstValidStrategy$7(DockerClientProviderStrategy.java:277)
	at java.base/java.util.Optional.orElseThrow(Optional.java:403)
	at org.testcontainers.dockerclient.DockerClientProviderStrategy.getFirstValidStrategy(DockerClientProviderStrategy.java:268)
	at org.testcontainers.DockerClientFactory.getOrInitializeStrategy(DockerClientFactory.java:152)
	at org.testcontainers.DockerClientFactory.client(DockerClientFactory.java:194)
	at org.testcontainers.images.LocalImagesCache.get(LocalImagesCache.java:31)
	at org.testcontainers.images.AbstractImagePullPolicy.shouldPull(AbstractImagePullPolicy.java:18)
	at org.testcontainers.images.RemoteDockerImage.resolve(RemoteDockerImage.java:75)
	at org.testcontainers.images.RemoteDockerImage.resolve(RemoteDockerImage.java:33)
	at org.testcontainers.utility.LazyFuture.getResolvedValue(LazyFuture.java:20)
	at org.testcontainers.utility.LazyFuture.get(LazyFuture.java:41)
	at org.testcontainers.containers.GenericContainer.getDockerImageName(GenericContainer.java:1362)
	... 32 more

@fcojfernandez fcojfernandez requested a review from jtnord September 2, 2024 15:54
@basil
Copy link
Member

basil commented Sep 2, 2024

tests are failing as there is no docker available... (which is kinda fun for the ATH running here!)
Tests should skip if docker is not available, but the ATH run in ci.jenkins.io surely should have docker!!

The ATH run in ci.jenkins.io definitely has Docker installed, and it is used via docker-fixtures in other tests. I have confirmed recently that this is working as expected.

@olamy
Copy link
Member

olamy commented Sep 2, 2024

@fcojfernandez the Jenkinsfile is not standard compared to plugins. In plugins using testcontainers needs useContainerAgent: false. So I'm not quite sure about what needs to be done in this scripted Jenkinsfile.
Looking here it looks to be a different label to use https://github.com/jenkins-infra/pipeline-library/blob/afa988b6beb40944005f02b841492d34c026e268/vars/buildPlugin.groovy#L50
Maybe @dduportal can confirm?

@dduportal
Copy link
Collaborator

Hello folks, thanks for the pings!

I'm not sure about the root problem: the only log I can get from testcontainer is:

WARNING: DOCKER_HOST unix:///var/run/docker.sock is not listening

I have 2 workings hypothesis for now:

  • Might be an issue due to Docker CE 27.2: we upgraded from 27.1.x to 27.2.0 in the VM agents templates yesterday: Bump Packer Agent Templates (All-In-One) Version to 1.92.0 jenkins-infra/jenkins-infra#3635. But this version seems buggy (we saw failure in Docker on other jobs of ci.jenkins.io). We reverted to Docker CE 27.1 a few minutes ago (Bump Packer Agent Templates (All-In-One) Version to 1.94.0 jenkins-infra/jenkins-infra#3638) and I've triggered a new build (number 4) to see the result.
    • Note: I did NOT spend much time analyzing for now. The infra team will spend the required time if this revert does not fix your build though!
  • Second hypothesis: the failure is due to testcontainer in the context of "DonD" (Docker on Docker). It's not nested containerization but a container with its own Docker clients (CLI, Jenkins pipeline native Docker, testcontainer native Docker client, etc.) and the Unix socket mounted.
    • You can see the "Unix socket" mount here:
      image.inside("-v /var/run/docker.sock:/var/run/docker.sock -v '${cwd}/target/ath-reports:/reports:rw' --shm-size 2g") {
    • I do NOT know why DonD is used instead of running natively. My best guess is "it's historical" because of the default JDK version which used to be JDK8 or JDK11 months ago. Not the case anymore though (use Jenkins tool for this).
    • Let's cover this hypothesis if the first one is evicted. There would be some facts to check: which user testcontainers uses to reach Docker? does this user has permissions to read and write the Unix socket? etc.

@dduportal
Copy link
Collaborator

Hello folks, thanks for the pings!

* All the "split tests" stages are running in a Linux VM agents with Docker available
  
  * https://github.com/jenkinsci/acceptance-test-harness/blob/fda9804fe3e68cf63946ed747bcddd7966af4f2f/Jenkinsfile#L117-L123
     defines the labels `docker-highmem<...>`proving this
  * If you reach the step (which the builds 1,2 and 3 of this PR did) https://github.com/jenkinsci/acceptance-test-harness/blob/fda9804fe3e68cf63946ed747bcddd7966af4f2f/Jenkinsfile#L128C1-L128C132, then it means Docker CE is there and ready.

* Container agents are used for other stages (such as the Launchable one) with labels `maven-21`. This is what @olamy (thanks!) described for the usual plugin builds with the pipeline library `buildPlugin()`. I confirm this function is NOT used in the ATH here.

I'm not sure about the root problem: the only log I can get from testcontainer is:

WARNING: DOCKER_HOST unix:///var/run/docker.sock is not listening

I have 2 workings hypothesis for now:

* Might be an issue due to Docker CE 27.2: we upgraded from 27.1.x to 27.2.0 in the VM agents templates yesterday: [Bump Packer Agent Templates (All-In-One) Version to 1.92.0 jenkins-infra/jenkins-infra#3635](https://github.com/jenkins-infra/jenkins-infra/pull/3635). But this version seems buggy (we saw failure in Docker on other jobs of ci.jenkins.io). We reverted to Docker CE 27.1 a few minutes ago ([Bump Packer Agent Templates (All-In-One) Version to 1.94.0 jenkins-infra/jenkins-infra#3638](https://github.com/jenkins-infra/jenkins-infra/pull/3638)) and I've triggered a new build (number **4**) to see the result.
  
  * Note: I did NOT spend much time analyzing for now. The infra team will spend the required time if this revert does not fix your build though!

* Second hypothesis: the failure is due to `testcontainer` in the context of "DonD" (Docker on Docker). It's not nested containerization but a container with its own Docker clients (CLI, Jenkins pipeline native Docker, testcontainer native Docker client, etc.) and the Unix socket mounted.
  
  * You can see the "Unix socket" mount here: https://github.com/jenkinsci/acceptance-test-harness/blob/fda9804fe3e68cf63946ed747bcddd7966af4f2f/Jenkinsfile#L128
  * I do NOT know **why** DonD is used instead of running natively. My best guess is "it's historical" because of the default JDK version which used to be JDK8 or JDK11 months ago. Not the case anymore though (use Jenkins tool for this).
  * Let's cover this hypothesis if the first one is evicted. There would be some facts to check: which user `testcontainers` uses to reach Docker? does this user has permissions to read and write the Unix socket? etc.

OK so we can remove the first hypothesis: it's NOT the Docker CE version.
=> It's clearly related to testcontainer in the ATH context.

As per https://java.testcontainers.org/supported_docker_environment/continuous_integration/dind_patterns/#docker-wormhole-pattern-sibling-docker-containers, the pattern of mounting the docker.sock socket is expected to work.

We might need to fine tune testcontainer for the host's hostname (see https://java.testcontainers.org/features/configuration/#customizing-docker-host-detection) but it will be after solving the problem here.

@fcojfernandez can you try running the same command with the ATH 's image (and it's DinD) on your laptop to see if you can reproduce the same issue?

@olamy
Copy link
Member

olamy commented Sep 3, 2024

Usually with dind image using testcontainer may need the following env var:

  DOCKER_HOST: tcp://localhost:2375
  DOCKER_TLS_VERIFY: 0

not sure exactly how the pods are configured though.

@dduportal
Copy link
Collaborator

Usually with dind image using testcontainer may need the following env var:

  DOCKER_HOST: tcp://localhost:2375
  DOCKER_TLS_VERIFY: 0

not sure exactly how the pods are configured though.

Be careful, in the case here (e.g. ATH in ci.jenkins.io).

  • It is "DonD", aka Docker on Docker (or "Docker wormhole" as per testcontainer). It's a different pattern than DinD (Docker in Docker). DinD is about nested container engines while DonD reuses the host's Docker engine. Both are really unsecured and bad practises honestly.
  • We do not use Pod for ATH but VM with a fully fledged Docker CE Engine

=> Check the pipeline, it is the source of truth (while my memory can betrays me :D) .

The PR here introduces a new pattern (testcontainer test harnesses next to existing Dockerfixture test harnesses) so it's not expected to work at first step. My understanding of testcontainer is that we should run the Maven command directly on the host (and not in a container based on jenkins/ath) but that requires the PR to change elements on the pipeline.

@dduportal
Copy link
Collaborator

I'm currently trying to understand which Maven command to run to reproduce the test (yeah Java is hard for me)

@fcojfernandez
Copy link
Member Author

can you try running the same command with the ATH 's image (and it's DinD) on your laptop to see if you can reproduce the same issue?

@dduportal it's reproducible using the ATH's image :/ I'll try to debug and see how the docker command is executed

@dduportal
Copy link
Collaborator

can you try running the same command with the ATH 's image (and it's DinD) on your laptop to see if you can reproduce the same issue?

@dduportal it's reproducible using the ATH's image :/ I'll try to debug and see how the docker command is executed

Cool thanks! I just got the same result (reproduction on my local Docker Desktop) as well. I'm a bit busy this week but we can pair Friday on it if you need help

@dduportal
Copy link
Collaborator

For documentation sake, here are the steps for local reproduction (on a Silicon Mac laptop with Docker Desktop, but also an Intel Windows 11 with Docker Desktop in WSL Linux containers mode):

docker pull --platform=linux/amd64 jenkins/ath
# From repo root
docker run -ti --platform=linux/amd64 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v $(pwd)/target/ath-reports:/reports:rw \
  -v $(pwd):$(pwd) \
  -w $(pwd) \
  --shm-size 2g \
  jenkins/ath
> mvn -V -e -Plts -Dmaven.test.failure.ignore=true -DforkCount=1 -B -Dtest=OicAuthPluginTest

TestContainers does not use docker, but talks directly to the docker
socket.  The permissions on this socket come from the host where it is
mapped and the docker groupid may not match what we have in the
container.
So allow th arg to be passed through at build time and add the ath-user
to the docker group so it has the permissions.

We retain the legacy suid on the docker binary as we publish the
container and there is only a single test so far using this
test-containers.  (this can be revistied if required).
@jtnord
Copy link
Member

jtnord commented Sep 4, 2024

So test containers uses the docker socket directly, and this has incorrect permissions. docker-fixtures uses the docker cli and that is SUID in the container to work around the permission issue for it.

This obviously fails for anyone directly talking to docker. 067ae95 should hopefully fix this.
Tested locally from a WSL ubuntu session (I also had to export TESTCONTAINERS_HOST_OVERRIDE=host.docker.internal so that test-containers knew where to find the running container ports.

the fix changes the build options so that the pipeline passes in the docker user group so that the ath-user can be added correctly to this group in the container, but we retain the SUID as given we are publishing this container, without that no docker tests would be runnable (unless you got lucky with the docker group id!) This can be re-visted over time if an when we have more test-container based tests.

@jtnord
Copy link
Member

jtnord commented Sep 4, 2024

WARNING: DOCKER_HOST unix:///var/run/docker.sock is not listening

is misleading, filed testcontainers/testcontainers-java#9195

basil
basil previously requested changes Sep 4, 2024
src/main/resources/ath-container/Dockerfile Outdated Show resolved Hide resolved
@basil basil marked this pull request as draft September 5, 2024 18:20
@basil
Copy link
Member

basil commented Sep 5, 2024

Updated to consume the incremental build of jenkinsci/jenkins#9696 (and converted to draft until that is released)

pom.xml Outdated Show resolved Hide resolved
@basil basil dismissed their stale review September 6, 2024 14:29

Problem was fixed

pom.xml Outdated Show resolved Hide resolved
@basil basil marked this pull request as ready for review September 10, 2024 17:00
Copy link
Member

@jtnord jtnord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't work correctly with a remote-webdriver

the keycloak URL is set to
http://localhost:54940/realms/test-realm/ but this is not accessible, unclear why, most likely this is not the mapped port or the port it not correctly exposed.

Additionally the image should be tied an updated by dependabot.

(I am working on both issues)

docs/BROWSER.md Outdated Show resolved Hide resolved
docs/DOCKER.md Outdated Show resolved Hide resolved
Testcontainers is with a lower case c
@jtnord
Copy link
Member

jtnord commented Sep 11, 2024

Doesn't work correctly with a remote-webdriver

the keycloak URL is set to http://localhost:54940/realms/test-realm/ but this is not accessible, unclear why, most likely this is not the mapped port or the port it not correctly exposed.

Additionally the image should be tied an updated by dependabot.

(I am working on both issues)

addressed in da7973e / d9fa1de

@jtnord jtnord enabled auto-merge (squash) September 11, 2024 16:48
private static final String REALM = "test-realm";
private static final String CLIENT = "jenkins";

private static final String KEYCLOAK_IMAGE="keycloak/keycloak:25.0.4@sha256:bf788a3b7fd737143f98d4cb514cb9599c896acee01a26b2117a10bd99e23e11";
Copy link
Member

@jtnord jtnord Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is specifically not the latest to double check renovate config.
was checked locally with LOG_LEVEL=debug RENOVATE_CONFIG_FILE=/usr/src/app/.github/renovate.json renovate --platform=local which did output the new versions, however I still want to see propose an update in CI.

Comment on lines +86 to +89
"fileMatch": ["src/test/java/plugins/OicAuthPluginTest.java"],
"matchStrings": [".* KEYCLOAK_IMAGE =\n\\s*\"(?<repo>.*?):(?<currentValue>.*?)@(?<currentDigest>sha256:.*?)\";\n"],
"depNameTemplate": "{{{repo}}}",
"datasourceTemplate": "docker"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOG_LEVEL=debug RENOVATE_CONFIG_FILE=/usr/src/app/.github/renovate.json renovate --platform=local shows this will match:

             "deps": [
               {
                 "depName": "keycloak/keycloak",
                 "currentValue": "25.0.4",
                 "currentDigest": "sha256:bf788a3b7fd737143f98d4cb514cb9599c896acee01a26b2117a10bd99e23e11",
                 "datasource": "docker",
                 "replaceString": "    private static final String KEYCLOAK_IMAGE =\n            \"keycloak/keycloak:25.0.4@sha256:bf788a3b7fd737143f98d4cb514cb9599c896acee01a26b2117a10bd99e23e11\";\n",
                 "updates": [
                   {
                     "bucket": "non-major",
                     "newVersion": "25.0.5",
                     "newValue": "25.0.5",
                     "newDigest": "sha256:410fce4b9b40e1f4e7f90b44acedbaa6d935bacea432a6884769067b253d46fb",
                     "releaseTimestamp": "2024-09-10T05:21:31.638Z",
                     "newMajor": 25,
                     "newMinor": 0,
                     "newPatch": 5,
                     "updateType": "patch",
                     "branchName": "renovate/keycloak-keycloak-25.x"
                   }
                 ],
                 "packageName": "keycloak/keycloak",
                 "versioning": "docker",
                 "warnings": [],
                 "sourceUrl": "https://github.com/keycloak-rel/keycloak-rel",
                 "registryUrl": "https://index.docker.io",
                 "currentVersion": "25.0.4",
                 "currentVersionTimestamp": "2024-08-19T09:21:17.507Z",
                 "isSingleVersion": true,
                 "fixedVersion": "25.0.4"
               }

@jtnord jtnord merged commit 39030e8 into jenkinsci:master Sep 11, 2024
25 checks passed
@fcojfernandez fcojfernandez deleted the add-oic-auth-e2e branch September 20, 2024 08:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants