[JENKINS-65829] Fix WorkspaceCleanupThread to consider suffixed workspaces even if original is inexistent #9083

Merged Apr 14, 2024 (3 commits)

@Dohbedoh Dohbedoh commented Mar 27, 2024

See JENKINS-65829. The WorkspaceCleanupThread did not consider suffixed workspaces if the original workspace did not exist. This typically affects @libs workspaces on a controller.

Reworked the WorkspaceCleanupThread to reduce the number of remoting calls by creating a MasterToSlaveFileCallable that does the filtering and deletion. Previously there were multiple calls: one to check whether the directory exists, one to get the last modified date, one to delete recursively, and one to delete the suffixed workspaces.

Now reduced to only one remoting call.

https://github.com/Dohbedoh/jenkins/blob/JENKINS-65829/core/src/main/java/hudson/model/WorkspaceCleanupThread.java#L102

Note: the retention is now checked on each directory. Before, we checked the last modified date of the WORKSPACE directory and, if it was older than 30 days, deleted it and all suffixed workspaces. Now we check the last modified date of each directory, which makes sense to me. WDYT?
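The single-pass filter and delete described above can be sketched roughly as follows. This is a minimal, standalone illustration using only `java.nio`, not the code from the PR: the class and method names are hypothetical, the `@` separator is assumed to match `WorkspaceList.COMBINATOR`, and no remoting is involved. In the PR the equivalent logic runs inside a MasterToSlaveFileCallable so that it costs one remoting call.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class WorkspaceCleanupSketch {
    // Assumption: separator between a workspace name and its suffix ("job@libs").
    static final String COMBINATOR = "@";

    /**
     * Deletes the workspace directory and its suffixed siblings under parent,
     * checking the retention against each directory's own last-modified time.
     * Returns the directories that were deleted.
     */
    static List<Path> deleteExpired(Path parent, String baseName, Duration retention, Instant now)
            throws IOException {
        List<Path> candidates = new ArrayList<>();
        if (!Files.isDirectory(parent)) {
            return candidates;
        }
        Instant cutoff = now.minus(retention);
        try (DirectoryStream<Path> children = Files.newDirectoryStream(parent)) {
            for (Path dir : children) {
                String name = dir.getFileName().toString();
                // The original workspace does not have to exist for suffixed ones to match.
                boolean related = name.equals(baseName) || name.startsWith(baseName + COMBINATOR);
                if (related && Files.isDirectory(dir)
                        && !Files.getLastModifiedTime(dir).toInstant().isAfter(cutoff)) {
                    candidates.add(dir);
                }
            }
        }
        for (Path dir : candidates) {
            deleteRecursively(dir);
        }
        return candidates;
    }

    // Deletes children before parents by visiting paths in reverse order.
    static void deleteRecursively(Path root) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

Because each directory is checked on its own, a stale `job@libs` can be removed even when `job` itself is absent or still fresh.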

Testing done

Test 1

  • Start Jenkins
  • Set up a remote agent
  • Create a pipeline that clones a library:
@Library("testLibs") _
node ('test') {
    writeFile(file: "test.txt", text: "test")
}
  • Build the job
  • Run the following in the script console to execute the thread run:
hudson.model.WorkspaceCleanupThread.retainForDays=0
hudson.model.AsyncPeriodicWork.all().get(WorkspaceCleanupThread.class).run()

--> Validate that the $WORKSPACE@libs on the controller is removed

Test 2

  • Following Test 1, create suffixed workspaces in the agent workspace directory: $WORKSPACE@tmp, $WORKSPACE@test.
  • Edit the pipeline to run on the built-in node. We need to do this because the cleanup thread preserves the workspace of the last build, and we want to test that everything on the agent (previous build) gets cleaned up.
  • Run the following in the script console to execute the thread run:
hudson.model.WorkspaceCleanupThread.retainForDays=0
hudson.model.AsyncPeriodicWork.all().get(WorkspaceCleanupThread.class).run()

--> Validate that the $WORKSPACE, $WORKSPACE@libs and $WORKSPACE@tmp on the agent are removed

Proposed changelog entries

  • JENKINS-65829, Fix WorkspaceCleanupThread to consider workspaces with suffixes even if the original is inexistent
  • Reduce number of remoting calls made by WorkspaceCleanupThread

Proposed upgrade guidelines

N/A


@mawinter69 mawinter69 left a comment


The workspace cleaner is broken in my eyes. It already was before this change, and this change doesn't fix the problems. It totally neglects concurrent build directories. The approach of checking the timestamp of the workspace root is also problematic. The timestamp of a directory only changes when you create or delete files in it; if you just modify files, the timestamp might not change. Adding or removing files in subdirectories will not change the timestamp either.
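To illustrate the timestamp concern, here is a small sketch that computes the newest modification time anywhere in a directory tree instead of reading only the root directory's own timestamp. It is a hypothetical alternative, not something this PR implements, and walking a large workspace this way has a real I/O cost.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class NewestModificationSketch {
    /** Returns the most recent last-modified time of root or anything below it. */
    static Instant newestModification(Path root) throws IOException {
        List<Path> entries;
        try (Stream<Path> walk = Files.walk(root)) {
            entries = walk.collect(Collectors.toList());
        }
        // Files.walk includes the root itself, so start from its own timestamp.
        Instant newest = Files.getLastModifiedTime(root).toInstant();
        for (Path entry : entries) {
            Instant modified = Files.getLastModifiedTime(entry).toInstant();
            if (modified.isAfter(newest)) {
                newest = modified;
            }
        }
        return newest;
    }
}
```

A file edited deep inside the workspace moves this value forward even though the root directory's timestamp stays unchanged.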
Another bug is the check for whether a job is currently building. It uses job.isBuilding(), but that only returns true if the last build is running. When I have concurrent builds and my last build failed but the previous one is still running, it will falsely report that the job is not building. The correct approach would probably be to loop over the executors of a node and check whether any of them corresponds to the current job.
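The difference between the two checks can be shown with a toy model. Everything below is hypothetical: `ExecutorSlot` is a stand-in for an executor and is not the Jenkins `Executor` API, and `lastBuildIsRunning` only mirrors the behaviour of `job.isBuilding()` as described in this comment.

```java
import java.util.Arrays;
import java.util.List;

final class BusyJobCheckSketch {
    /** Hypothetical model of one executor slot; runningJob is null when the slot is idle. */
    static final class ExecutorSlot {
        final String runningJob;

        ExecutorSlot(String runningJob) {
            this.runningJob = runningJob;
        }
    }

    /** Mirrors the described behaviour: only the newest build is consulted. */
    static boolean lastBuildIsRunning(List<Boolean> runningFlagsOldestFirst) {
        if (runningFlagsOldestFirst.isEmpty()) {
            return false;
        }
        return runningFlagsOldestFirst.get(runningFlagsOldestFirst.size() - 1);
    }

    /** Suggested alternative: scan every executor of the node for the job. */
    static boolean anyExecutorRuns(List<ExecutorSlot> executors, String jobName) {
        for (ExecutorSlot slot : executors) {
            if (jobName.equals(slot.runningJob)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Build #1 is still running on an executor, build #2 (the last one) already failed.
        List<Boolean> builds = Arrays.asList(true, false);
        List<ExecutorSlot> executors = Arrays.asList(new ExecutorSlot("job"), new ExecutorSlot(null));
        System.out.println("last build only: " + lastBuildIsRunning(builds));
        System.out.println("executor scan:   " + anyExecutorRuns(executors, "job"));
    }
}
```

In this scenario the last-build check says the job is idle while the executor scan still finds it running, which is the case where a workspace in use could be deleted.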

So the worst case might be that you delete a workspace that is currently in use. The risk is not very high, but I've had some situations where Jenkins was not properly releasing the lease of workspaces. The result was that a build was running in a workspace with @2 on an agent that has only 1 executor. In such situations the risk of deleting an in-use workspace is much higher.

}

// if not the workspace or a workspace suffix
if (!dir.getName().equals(workspaceBaseName) && !dir.getName().startsWith(workspaceBaseName + WorkspaceList.COMBINATOR)) {

This would also match against build directories of concurrent builds.
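A quick sketch of why the prefix check also catches concurrent build directories such as `job@2`. The `@` separator is assumed to match `WorkspaceList.COMBINATOR`, and `looksLikeConcurrentBuildDir` is a hypothetical heuristic (an all-digit suffix), not an existing Jenkins method.

```java
final class SuffixMatchSketch {
    // Assumption: same separator as WorkspaceList.COMBINATOR.
    static final String COMBINATOR = "@";

    /** Positive form of the condition in the reviewed line. */
    static boolean matchesWorkspaceOrSuffix(String dirName, String baseName) {
        return dirName.equals(baseName) || dirName.startsWith(baseName + COMBINATOR);
    }

    /** Hypothetical heuristic: a purely numeric suffix ("job@2") marks a concurrent build workspace. */
    static boolean looksLikeConcurrentBuildDir(String dirName, String baseName) {
        String prefix = baseName + COMBINATOR;
        if (!dirName.startsWith(prefix)) {
            return false;
        }
        String suffix = dirName.substring(prefix.length());
        return !suffix.isEmpty() && suffix.chars().allMatch(Character::isDigit);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"job", "job@libs", "job@tmp", "job@2", "jobber"}) {
            System.out.println(name
                    + " matches=" + matchesWorkspaceOrSuffix(name, "job")
                    + " concurrent=" + looksLikeConcurrentBuildDir(name, "job"));
        }
    }
}
```

Both `job@libs` and `job@2` pass the prefix check, so the cleanup cannot tell a library checkout from the workspace of a concurrently running build without some additional test.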

@mawinter69

Another problem I see is that the deletion is not atomic (or near atomic). For large workspaces the deletion might take quite some time, and in the meantime a new build could have started.
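One common way to shrink that window, shown here purely as an illustration and not as something this PR does, is to rename the directory first and then delete the renamed copy. The rename is a single filesystem operation, so a build starting afterwards gets a fresh directory under the original name. The `.deleting-` naming scheme is hypothetical, and the sketch assumes the rename stays on the same filesystem.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class RenameThenDeleteSketch {
    /** Moves the workspace aside in one step, then deletes the moved copy at leisure. */
    static void retireAndDelete(Path workspace) throws IOException {
        // Hypothetical naming scheme for the retired directory.
        Path retired = workspace.resolveSibling(
                workspace.getFileName() + ".deleting-" + UUID.randomUUID());
        // Fails with an exception instead of falling back to a slow copy.
        Files.move(workspace, retired, StandardCopyOption.ATOMIC_MOVE);

        List<Path> paths;
        try (Stream<Path> walk = Files.walk(retired)) {
            // Reverse order visits children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

This does not remove the race entirely: a build that already holds the workspace when the rename happens would still lose its files, so a check that the workspace is not in use is needed as well.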

@basil basil added the bug For changelog: Minor bug. Will be listed after features label Mar 27, 2024

@basil basil left a comment


Checking the last modified date of each directory makes sense to me. Thanks for the PR!


@timja timja left a comment


Thanks!


/label ready-for-merge


This PR is now ready for merge, after ~24 hours, we will merge it if there's no negative feedback.

Thanks!

@comment-ops-bot comment-ops-bot bot added the ready-for-merge The PR is ready to go, and it will be merged soon if there is no negative feedback label Apr 6, 2024
@NotMyFault NotMyFault merged commit 323ad6a into jenkinsci:master Apr 14, 2024
17 checks passed
@Dohbedoh Dohbedoh deleted the JENKINS-65829 branch June 26, 2024 01:23
@jglick jglick changed the title [JENKINS-65829] Fix WorkspaceCleanupThread to consider suffixed works… [JENKINS-65829] Fix WorkspaceCleanupThread to consider suffixed workspaces even if original is inexistent Aug 27, 2024