Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect shallow clones for git plugin to avoid reading a wrong date range (which first year being wrong) #252

Closed
mathieucarbou opened this issue Jan 12, 2022 · 7 comments · Fixed by #253
Labels
in:git MLP Git integration is:enhancement Enhancement to an existing feature todo Accepted items from the backlog which can be worked on
Milestone

Comments

@mathieucarbou
Copy link
Owner

mathieucarbou commented Jan 12, 2022

git rev-parse --is-shallow-repository which will print false (not shallow) or true (shallow):

See #240 for a use case and problem.

@mathieucarbou mathieucarbou added is:enhancement Enhancement to an existing feature todo Accepted items from the backlog which can be worked on in:git MLP Git integration labels Jan 12, 2022
@mathieucarbou mathieucarbou added this to the 4.3 milestone Jan 12, 2022
@rvesse
Copy link
Contributor

rvesse commented Jan 13, 2022

I took a quick look at this yesterday, it seems JGit does expose some information that may allow detecting shallow clones without having to shell out to git at all but I didn't get to the point of being able to actually test that

One problem is that if people are using the plugin in their CI environments then they're almost always going to do a shallow clone. Therefore the plugin will pretty much always fail CI if they are trying to use the new Copyright Existence Years feature (#240) in conjunction with CI.

It may warrant documenting this as an explicit limitation as well as applying any code fixes

@hazendaz
Copy link
Collaborator

I use this in CI all the time, it does require some special handling, mainly just checking out the branch that is being run when this is needed to run. I know recently some documentation was added due to the release plugin being changed to shallow cloning that simply cannot be gotten around. But this certainly can work on a shallow clone by checkout out the associated branch along with a fetch to know about the branches beforehand. So its really shallow plus a bit of foo processing but not deep. The reason I do this specificially is that I have it automated in case a developer forgets to run the plugin before committing ;) So it can be done and I agree special needs would need documented as such.

@dbwiddis
Copy link
Contributor

I took a quick look at this yesterday, it seems JGit does expose some information that may allow detecting shallow clones without having to shell out to git at all but I didn't get to the point of being able to actually test that

Huh, I looked and didn't see this, so I did the shell version in #253. Could you point me to where that was, @rvesse ?

Also, an alternative option to detecting shallow --> fail/skip, is explicitly executing git fetch --unshallow (or a Jgit equivalent) in the cases where either of the file-creation substitution patterns are used.

@mathieucarbou mathieucarbou modified the milestones: 4.3, 4.2 Jan 16, 2022
@rvesse
Copy link
Contributor

rvesse commented Jan 17, 2022

@dbwiddis In the constructor of GitLookup the code already calls this.repository.getObjectDatabase().newReader().getShallowCommits() which according to the comments there is a workaround for a concurrency bug in JGit. That actually returns a Set<ObjectId> which could be stored and then later consulted to see whether a given commit is shallow

But like I said I've no idea if that actually works reliably as a way to detect shallow clones

@dbwiddis
Copy link
Contributor

That actually returns a Set<ObjectId> which could be stored and then later consulted to see whether a given commit is shallow

Wow, I had seen that line but ignored it as a "workaround". In fact that Set<ObjectId> is exactly the contents of .git/shallow, so an empty set means unshallow. I'll update my PR.

@dbwiddis
Copy link
Contributor

Related: The contents of .git/shallow (and thus the same ObjectId's in the set) are the hashes of the "oldest" commits on each branch.

So if iterating from HEAD (which we do for year of last change) we could stop when we reached that commit. However, for any single file (other than files involved in that commit) it could be skipped over in iteration. We could in theory compare timestamps but there is no requirement for commits to be chronological.

And if we iterate from the oldest commit (which we do for creation author/email and year of creation) we wouldn't ever see it anyway.

@astubbs
Copy link

astubbs commented Jan 21, 2022

Typo in issue title shadow vs shallow?

@mathieucarbou mathieucarbou changed the title Detect shadow clones for git plugin to avoid reading a wrong date range (which first year being wrong) Detect shallow clones for git plugin to avoid reading a wrong date range (which first year being wrong) Jan 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in:git MLP Git integration is:enhancement Enhancement to an existing feature todo Accepted items from the backlog which can be worked on
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants