Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add flag --submodules to add git submodules in the project folder as available packages. #1780

Conversation

FeepingCreature
Copy link
Contributor

Also, adjust git version determination to support submodule folders with the -C flag.
(Submodules are folders that have distinct git state (tags etc.) but do not contain a .git folder, since they share the .git folder of the parent project.)

@dlang-bot
Copy link
Collaborator

Thanks for your pull request and interest in making D better, @FeepingCreature! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Oct 8, 2019

Note: I am not sure how the change in git version detection plays out in practice. I fixed it for submodules, but I don't have a testcase for the original usecase that motivated it.

Would appreciate testing.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Oct 8, 2019

Hm, CI doesn't seem to link in the new file..?

edit: oops

@FeepingCreature FeepingCreature force-pushed the feature/git-submodules-as-local-package branch from 9e2ea0b to 5b103bc Compare October 8, 2019 07:15
Copy link
Member

@wilzbach wilzbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are a bare minimum.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Oct 8, 2019

Not sure how. Do I actually add a submodule in the test folder?

edit: Or do I check out some test repo in the relevant test sh?

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Oct 8, 2019

I think the reason it's failing is it's now using the dub git version for the version of the test folders. Hm.

Need to redesign a bit.

@FeepingCreature FeepingCreature force-pushed the feature/git-submodules-as-local-package branch 2 times, most recently from 461b219 to f3b0e19 Compare October 8, 2019 11:19
@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Oct 8, 2019

Okay, now it uses the --git-dir flag again, but I've changed it so it can pass along an override folder for the git dir so it doesn't always assume it's root/.git. This is because with submodules, the git folder is root/.git/modules/[submodule]/.

Let's see if it passes, then we can think about testing.

@FeepingCreature
Copy link
Contributor Author

@wilzbach Okay, how do I test this? Only way I can think of is initialize a git repo, initialize a second git repo, make a commit, make a tag, add the second git repo as a submodule to the first, and see if dub --submodules picks it up. Does that sound good? It seems kind of overkill...

@FeepingCreature FeepingCreature force-pushed the feature/git-submodules-as-local-package branch 3 times, most recently from b652afe to 3fc1c76 Compare October 8, 2019 13:33
@FeepingCreature FeepingCreature changed the title Add flag -s, --submodules to add git submodules in the project folder as available packages. Add flag --submodules to add git submodules in the project folder as available packages. Oct 8, 2019
@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Oct 8, 2019

Added a test that does just that.

Seems to pass..?

edit: Seems to fail. But why?

@FeepingCreature FeepingCreature force-pushed the feature/git-submodules-as-local-package branch from 31ea0fe to f84058c Compare October 8, 2019 18:27
@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Oct 8, 2019

Tests are green now.

edit: The problem was that git is apparently pretty particular about how it wants to be called to find submodules. I now use both -C and --git-dir, to ensure that git doesn't accidentally find the version of a parent repository in a submodule folder. This should make version detection more reliable in general.

@rikkimax
Copy link
Contributor

rikkimax commented Oct 9, 2019

Is there any reason why this needs a flag that is manually run?
Can this not be done automatically on build if the file .gitmodules is found?

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Oct 9, 2019

It can absolutely be done automatically, but I didn't want to opt the users in to a new feature, especially one that changes the semantics of package lookup, without requiring some affirmative action. My intention was to set alias dub=dub --submodules in our .bashrc.

An alternative would be to set it as a flag in the dub.json? edit: But it's a fact about the root project, not the project as a dependency. I don't want [edit: recursive] submodules to be used to satisfy transitive dependencies, because that's setting up for version conflicts. edit: And also too sensitive to whether you did a recursive checkout or not.

@FeepingCreature
Copy link
Contributor Author

Ping @wilzbach

@FeepingCreature
Copy link
Contributor Author

Suggestions for a later PR:

Should dub consider all tags in the submodule as available versions?

Should dub upgrade change the submodule's checked out tag?

@Geod24
Copy link
Member

Geod24 commented Oct 11, 2019

Is this really necessary ? From what I understand (the description is rather short), this will only affect the user's machine, which means anyone checking out the project will need to run this command, right ?

We are using submodules for our project, and the best way we could come up with was just to use path dependency, because users of git submodules use dub solely as a build tool, not for dependency management (although a mixed scheme is possible, I doubt it's desirable).

So what problem is this solving that path dependencies is not solving ?

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Oct 11, 2019

We specifically want to use dub as a dependency management tool. That is, we often have submodules that have dependencies that can clash with our own dependencies.

The alternative would be to give up submodules, and they're just too convenient.

The usecase is "there's a deep graph of repos, and they are all semver versioned, but they're all maintained inhouse and often updated together, and their updates sometimes break compatibility."

Because they're maintained by the same group of people, it's usually the case that you want to test a library upgrade in the context it'll be used. Hence submodules.

With external dub libraries I've sometimes done hacks like just copying my changes over the ~/.dub folder just to test if it does what I need. The point here is to not need to do things like that, but also be able to use Jenkins (hence local overrides being insufficient), and also not having to add and remove submodules during development (a workflow that git really does not like).

@Geod24
Copy link
Member

Geod24 commented Oct 11, 2019

So do you mix DUB dependencies and submodules ?

The alternative would be to give up submodules, and they're just too convenient.

I suggest the opposite: Go all in on the submodules.

This is what we do:

Once you work past the initial pain (that is, realize you need some dummy dependencies because dub tries to download everything, even unused stuff, e.g. https://github.com/bpfkorea/agora/blob/fbabf723fe8837bedde7af4f21eb52d935841e09/submodules/botan/dub.json), it's fairy land and I never had a dependency problem since I started using it.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Oct 11, 2019

I suggest the opposite: Go all in on the submodules.

The problem is recursive submodules.

Ie. A -> X1, A -> B -> X2, and enforcing that X1 and X2 are semver compatible.

Just to elucidate the problem, one of our internal utility libraries is on version 19. (That's semver major. Yes that's 19 generations of breaking changes.)

So do you mix DUB dependencies and submodules ?

Right now, yes. Of course, right now we build with an internal buildsystem that uses dub to build each dependency individually, so there's no version enforcement at all. That's why we want to switch to dub.

@Geod24
Copy link
Member

Geod24 commented Oct 11, 2019

The problem is recursive submodules.

So do you do recursive checkout of submodules ? What we do (and something we took from Sociomantic) is that we never, ever do recursive submodules. If App depends on Lib2 and Lib1, and Lib2 depends on Lib1, we just put them in submodules and let our build system use this.

Ie. A -> X1, A -> B -> X1, and enforcing that X1 and X2 are semver compatible.

That doesn't make too much sense for me: DUB doesn't have to select a version, since the version are already selected (and fixed). At least when it comes to submodules.

Right now, yes.

IMO that's your real problem. If you use both systems, they have to be aware of each other, which is a pain, and will never be perfect.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Oct 11, 2019

So do you do recursive checkout of submodules?

Not for building; every recursive dependency ends up in the root module. This is usually not a problem since the library dependencies tend to share the same five or so utility libraries.

That doesn't make too much sense for me: DUB doesn't have to select a version

See my comment about allowing dub to check out a different version of a submodule. But no, for now I'd be happy if it'd error in case of incompatible versions.

IMO that's your real problem. If you use both systems, they have to be aware of each other, which is a pain, and will never be perfect.

This is why we want to transition to dub only - and hence, this PR.

FeepingCreature and others added 3 commits October 22, 2019 12:41
…available packages.

Adjust git version determination to support submodule folders
…ead of absolute --git-dir.

With absolute --git-dir, we are unable to find the .gitmodules.
@FeepingCreature FeepingCreature force-pushed the feature/git-submodules-as-local-package branch from 3b16e2a to b590ef7 Compare October 22, 2019 10:42
@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Oct 25, 2019

@wilzbach I know you have other priorities, but then could you please resolve your change request for tests now that there's been a test for over two weeks?

@FeepingCreature
Copy link
Contributor Author

Ping.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Nov 4, 2019

~ p i n g ~ 
~ ~ ~ ~ ~ ~
~ b ᴉ u მ ~

Can somebody please give me a pass/fail/change request on this? This PR is holding up work (specifically, our internal dub transition).

@FeepingCreature
Copy link
Contributor Author

@thewilsonator Thank you!

@wilzbach
Copy link
Member

@thewilsonator: there was clearly NO consensus on merging this flag with multiple contributors being unsure about the usefulness of this flag. What was your motivation for merging this?

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Nov 22, 2019

To be quite fair, there was clearly no functioning mechanism for arriving at a consensus either.

This PR was open for two months. Over this time, the following pattern emerged: somebody comes by and leaves unsure feedback, I answer the feedback, and the person never shows up again.

What exactly was I supposed to be doing here?

edit: Regarding the flag: yes, it has limited usefulness. It's tied to our internal workflow with submodules. I don't know if there's another company that does submodules, because nobody talks about their internal workflows, here or otherwise. However, it may be instructive that at least one company decided that they're literally better off starting from scratch rather than working with dub, and frankly, PR workflows like this one may be why.

Why submodules and not a package repository? Because in microservice development, dependencies usually change in pairs: upstream and downstream, at once. This cannot be tested in a tagged, semver-release-based workflow; you have to be able to commit an incremental state. That means we either need a way to set local overrides pointing to git commits in another repo, or submodules - in other words, either submodules or fake submodules manually.

edit: Why work with upstream dub at all? Pretty much one reason only: IDE integration requires dub. Sure we could fork as funkwerk-dub and hope that all IDE plugins allow swapping out the dub binary, but that's also kind of a dissatisfactory workflow. The simple fact is, dub has no good answer for incremental development in internal ecosystems. (And yes, I am aware of local overrides, see above re Jenkins.) The point of this PR is to give it one, or at least the start of one.

@wilzbach
Copy link
Member

To be quite fair, there was clearly no functioning mechanism for arriving at a consensus either.

Yes, but that doesn't NOT mean that Nicholas can just arbitrarily merge things. Especially without providing any motivation or reasoning!

This PR was open for two months. Over this time, the following pattern emerged: somebody comes by and leaves unsure feedback, I answer the feedback, and the person never shows up again.

Yes, but it doesn't mean that this PR can be merged without motivation / reasoning.

What exactly was I supposed to be doing here?

You should have pinged someone with actual authority and time to approve things -> @atilaneves.

Can somebody please give me a pass/fail/change request on this?

My review was a FAIL (!)

  1. This feature is specially tailored for your internal use-case. It can't be used by any dub package on the dub registry (as dub packages can't have submodules) and thus adds more baggage to dub that needs to be supported.
  2. It should be possible with path-based dependencies.

If Dep was added as a path-based dependency in Foo, we would have trouble getting it to use the Dep in Bar.

How so?
What you're trying to achieve can be handled with the following:

dependency "dep1" version="0.10.0" path="./dep1"
dependency "dep2" version=">=0.5.0" path="./dep2"

Which uses an existing feature and is more general-purpose as it doesn't solely apply to git submodules.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Nov 22, 2019

My review was a FAIL (!)

No it was not! Your review was a request for a specific enhancement, ie. a test. The test was added.

It can't be used by any dub package on the dub registry

Sure it can. You can have packages that are both on the registry and provided as submodules, ie. providing master, which lets you combine incremental development and external usage.

It should be possible with path-based dependencies.

I think that mixes two things: accepted versions and provided versions. Path-based dependencies will break once the package is used as a submodule in another package.

@wilzbach
Copy link
Member

Path-based dependencies will break once the package is used as a submodule in another package.

Why?

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Nov 22, 2019

Why?

I mean, I guess you could ignore the path? Not sure if dub actually does this right now? The point is that if you have two submodules that both use the same transitive dependency, you need some way to advertise this dependency in the root package that uses both of them, or in any case get them to use the same version. So dub would have to ignore path one way or another except when the package is the root package.

edit: Wanna take this to dlang slack? This seems like the sort of debate that's not well suited for an asynchronous comments-based exchange.

@wilzbach
Copy link
Member

To be quite fair, there was clearly no functioning mechanism for arriving at a consensus either.

BTW I do understand the problem, but there's nothing I can do about (as you have noticed my time is very limited). Maybe a NG discussion is a good start?

The point is that if you have two submodules that both use the same transitive dependency, you need some way to advertise this dependency in the root package that uses both of them.

Are we talking about this?

- main package
+ submodules
  - dep1 (depends on dep3)
  - dep2 (depends on dep3)
  - dep3
  1. Vibe.d solves this problem neatly with subpackages which can depend on each other

https://github.com/vibe-d/vibe.d/blob/master/web/dub.sdl

  1. Path-based dependencies declared by the main package could overwrite the paths for the same respective package dependent upon in e.g. dep{1,2}. I understand that dub doesn't do this correctly at the moment, but it's what I understood this PR was trying to achieve.

I do think that the mindset of "this feature could work, but it's broken for this specific use-case. Hence, let's add a new feature" doesn't not scale long-term

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Nov 22, 2019

Are we talking about this?

Yes, exactly.

Vibe.d solves this problem neatly with subpackages which can depend on each other

I don't believe that subpackages can have their own versions.

Path-based dependencies declared by the main package could overwrite the paths for the same respective package dependent upon in e.g. dep{1,2}. I understand that dub doesn't do this correctly at the moment, but it's what I understood this PR was trying to achieve.

I mean, I could implement that, but at this point I'm expecting that some people give feedback, the PR will lie dead for months of insistent pings, then somebody click merge and you'll revert it. My only hope is that you'll be more reluctant to revert a PR that you like.

If the D foundation apparently isn't gonna designate someone, could I just straight up pay you money to take on reviewer duties for that PR? I'm getting a bit frustrated here.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Nov 22, 2019

Update: I remember what the problem with path= was! Submodules don't work like that; the version cannot be determined from the path of the submodule alone, you'd need to look back and find the SCM root path which is in .git/modules. So it'd be "this PR but worse"; you'd need to detect that the path points into a submodule then walk parent folders until you can find the main .git dir the submodule .git file that then points you at the main git folder.

edit: Though I guess that is mostly an extension of determineVersionFromSCM.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Nov 22, 2019

Action plan for submodule support

SUPERSEDED BY LATER COMMENT

@FeepingCreature
Copy link
Contributor Author

Hang on, path doesn't work like that.

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Nov 22, 2019

Action plan for submodule support

Going by feedback, the path most likely to have success is augmenting the path= override on dependencies to support determining submodule versions to use them for the dependency, and handle cases where path= itself appears in dependencies correctly if those paths are not available, ie. because they're not checked out. As such, there's several steps that will be separate PRs:

  1. Change path behavior to not ignore version but determine an actual version from inspecting the specified path, or else defaulting to master. This is a breaking change, but only in projects that 1. already use submodules, and 2. do it wrong (ie. specify a version spec that doesn't match the actual checked-out submodule).
    1. Augment determineVersionFromSCM to support determining versions of path= dependencies that are submodules, by reading the .git file that git creates as a pointer at the actual git scm folder.
  2. Override path-based dependencies if the root package defines path-based dependencies for the same path. This should allow dub to tolerate cases where a transitive dependency defines a path= override that points at a folder that doesn't exist, as is the case in nontransitive submodule checkouts.

@wilzbach Does this seem likely to have success?

@atilaneves
Copy link
Contributor

What's deficient with adding the submodule as a path package in the package description?

@FeepingCreature
Copy link
Contributor Author

FeepingCreature commented Nov 22, 2019

path= is not compatible with version= or submodule versions. As a result, path= is useless for dependency resolution, the one thing we need dub for.

edit: Also I am not sure how dub handles overriding transitive submodule dependencies, or the case where the path of a transitive dependency doesn't exist, ie. non-recursive submodule checkout. But I'm willing to bet it's broken too. (two hours later: yep, it's very, very broken)

edit:

- main package
  - dep1
    - dep3 specified with `path=`, meaning `version=` is ignored and
      also the path doesn't exist because we didn't do a recursive checkout.
  - dep2
    - dep3 specified with `path=`, meaning `version=` is ignored and
      also the path doesn't exist because we didn't do a recursive checkout.
  - dep3
    - `version=` is ignored here as well, but it doesn't even matter
      because dub can't even determine the scm version of this even
      though it has a valid `path=`.

Copy link
Member

@CyberShadow CyberShadow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stopping here - this patch seems much more complicated than it needs to be. As far as I can see, Git provides facilities to query information which this patch requires to be either explicitly specified in the configuration or clumsily attempts to guess it through string concatenation.

More importantly, it seems to be based on a bad assumption:

(Submodules are folders that have distinct git state (tags etc.) but do not contain a .git folder, since they share the .git folder of the parent project.)

This is only true for absorbed submodules. Newer versions of Git absorb submodules' git dirs automatically but this is not required. (Personally I prefer to keep them unabsorbed as that makes going out to edit .git/config or .git/info/exclude simpler.)

const submoduleInfo = execute([
"git",
"-C", rootPath.toNativeString,
"--git-dir=" ~ (rootScmPath.relativeTo(rootPath)).toNativeString,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is --git-dir necessary here? It was necessary in conjunction with --work-tree before -C was introduced.


foreach (line; submoduleInfo.output.lines) {
const parts = line.split(" ").map!strip.filter!(a => !a.empty).array;
const subPath = rootPath ~ parts[1];
Copy link
Member

@CyberShadow CyberShadow Nov 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bash code to enumerate registered active submodules:

	git config -lz | \
		while IFS=$'\n' read -r -d '' name value
		do
			if [[ "$name" =~ ^submodule\.(.*)\.url$ || ( "$name" =~ ^submodule\.(.*)\.active$ && "$value" == true ) ]]
			then
				printf '%s\0' "${BASH_REMATCH[1]}"
			fi
		done | \
			sort -uz | \
			mapfile -d '' -t list

See https://git-scm.com/docs/gitsubmodules#_active_submodules for reference.

To get the submodule git dir (of an absorbed module):

git rev-parse --git-path modules/"$m"

where $m is the submodule name as per above list.

To get the submodule work dir:

git -C "$md" rev-parse --show-toplevel

where md is the git dir above.

@@ -75,20 +75,24 @@ class Package {
root = The directory in which the package resides (if any).
parent = Reference to the parent package, if the new package is a
sub package.
scm_path = The directory in which the VCS (Git) stores its state.
Different than root/.git for submodules.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unnecessary

Comment on lines -82 to +95
this(Json json_recipe, NativePath root = NativePath(), Package parent = null, string version_override = "")
this(Json json_recipe, NativePath root = NativePath(), Package parent = null,
NativePath scm_path = NativePath(), string version_override = "")
{
import dub.recipe.json;

PackageRecipe recipe;
parseJson(recipe, json_recipe, parent ? parent.name : null);
this(recipe, root, parent, version_override);
this(recipe, root, parent, version_override, scm_path);
}
/// ditto
this(PackageRecipe recipe, NativePath root = NativePath(), Package parent = null, string version_override = "")
this(PackageRecipe recipe, NativePath root = NativePath(), Package parent = null,
string version_override = "", NativePath scm_path = NativePath())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Comment on lines -101 to +105
try recipe.version_ = determineVersionFromSCM(root);
try recipe.version_ = determineVersionFromSCM(root, scm_path);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Comment on lines +153 to +154
scm_path = The directory in which the VCS (Git) stores its state.
Different than root/.git for submodules!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Comment on lines -153 to +161
static Package load(NativePath root, NativePath recipe_file = NativePath.init, Package parent = null, string version_override = "")
static Package load(NativePath root,
NativePath recipe_file = NativePath.init, Package parent = null,
string version_override = "", NativePath scm_path = NativePath.init)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Comment on lines -166 to +174
auto ret = new Package(recipe, root, parent, version_override);
auto ret = new Package(recipe, root, parent, version_override, scm_path);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Comment on lines +755 to +757
if (scm_path.empty) {
scm_path = path ~ ".git";
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't do this

Comment on lines -755 to +766
auto hpath = (path ~ ".git/HEAD").toNativeString();
auto hpath = (scm_path ~ "HEAD").toNativeString();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was bad code and this is not an improvement :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants