
import: Handle non-DVC Git repositories #2977

Closed
Baranowski opened this issue Dec 18, 2019 · 4 comments · Fixed by #3020
Labels
feature request Requesting a new feature p2-medium Medium priority, should be done, but less important

Comments

@Baranowski
Contributor

After #2889, dvc import can also import files that are tracked by Git but not by DVC. DVC still requires that they come from a DVC repository rather than an arbitrary Git repository, although there is no longer any need for that restriction.

@triage-new-issues triage-new-issues bot added the triage Needs to be triaged label Dec 18, 2019
@efiop efiop added feature request Requesting a new feature p2-medium Medium priority, should be done, but less important labels Dec 19, 2019
@triage-new-issues triage-new-issues bot removed the triage Needs to be triaged label Dec 19, 2019
@chatcannon
Contributor

Hello all, @shcheklein asked me to work on this as a recruitment test. I have tracked down the reason why this is not yet working: the import process tries to create a Repo instance from a temporary local clone of the url argument (i.e. the source repo from which the file should be imported). Since this directory does not contain a '.dvc' subdirectory, the Repo.find_root function fails.
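The failure described above can be sketched in a few lines. This is only an illustration, not DVC's actual code: find_dvc_root and NotDvcRepoError are hypothetical names, and the upward search through parent directories is an assumption about how such a lookup typically behaves.

```python
import os


class NotDvcRepoError(Exception):
    """Hypothetical error: no '.dvc' directory was found."""


def find_dvc_root(start):
    """Walk upward from `start` until a directory containing '.dvc' is found."""
    current = os.path.realpath(start)
    while True:
        if os.path.isdir(os.path.join(current, ".dvc")):
            return current
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the filesystem root without finding '.dvc'.
            raise NotDvcRepoError(f"no '.dvc' directory found from {start!r}")
        current = parent
```

A temporary clone of a plain Git repository has a '.git' directory but no '.dvc' directory, so a lookup like this raises before any import logic gets a chance to run.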

@chatcannon
Contributor

chatcannon commented Dec 24, 2019

Here are some ideas for resolving this; I'm not sure which fits better with the overall design:

  • Add code to Repo.find_root to allow creating a Repo instance from a git repo with no '.dvc' directory
  • Run the equivalent of dvc init in the temporary clone before calling external_repo
  • Have a completely separate code path for importing non-DVC repos

@efiop
Contributor

efiop commented Dec 25, 2019

@chatcannon Sorry for the delay! The first two ideas are pretty hacky; the third one would ideally be the way to go, unless there is a good reason to go with the hacks, of course 🙂 But I haven't seen any. It looks like you could gracefully catch the case where the repo is not a DVC repo and fall back to Git logic. Or you could clone the Git repo first and then check whether it is a DVC repo. Maybe there is a better way to do it too, your call 🙂
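The "catch and fall back" suggestion could look roughly like the sketch below. Every name here (NotDvcRepoError, import_from_dvc_repo, import_from_git_repo, import_file) is hypothetical and stands in for whatever the real import code does; the plain file copy is a placeholder, not DVC's behavior.

```python
import os
import shutil


class NotDvcRepoError(Exception):
    """Hypothetical error: the cloned source has no '.dvc' directory."""


def import_from_dvc_repo(clone_dir, path, out):
    """Placeholder for the DVC-aware import; refuses plain Git clones."""
    if not os.path.isdir(os.path.join(clone_dir, ".dvc")):
        raise NotDvcRepoError(clone_dir)
    shutil.copy(os.path.join(clone_dir, path), out)
    return "dvc"


def import_from_git_repo(clone_dir, path, out):
    """Placeholder for the Git-only import: copy the file from the clone."""
    shutil.copy(os.path.join(clone_dir, path), out)
    return "git"


def import_file(clone_dir, path, out):
    """Try the DVC path first and fall back to Git logic if it is not a DVC repo."""
    try:
        return import_from_dvc_repo(clone_dir, path, out)
    except NotDvcRepoError:
        return import_from_git_repo(clone_dir, path, out)
```

The alternative mentioned in the comment, cloning first and then checking, would replace the try/except with an explicit test for the '.dvc' directory before choosing a branch.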

@Suor
Contributor

Suor commented Dec 25, 2019

@chatcannon I would say we can integrate this into the existing code path. Otherwise we will need to duplicate lots of things. Just don't create a Repo if there is no .dvc dir in there.

P.S. Some refactoring might be needed, both to keep it from getting messy and to avoid code duplication.
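For contrast, the single-code-path idea might be sketched as follows, assuming the shared logic only needs to know whether a '.dvc' directory exists. SourceRepo, open_source and import_path are hypothetical names, and the step labels are placeholders for real work.

```python
import os
import shutil
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SourceRepo:
    """Hypothetical handle for a cloned source; dvc_dir is None for plain Git."""
    root: str
    dvc_dir: Optional[str]

    @property
    def is_dvc(self) -> bool:
        return self.dvc_dir is not None


def open_source(clone_dir: str) -> SourceRepo:
    """Describe the clone without requiring it to be a DVC repository."""
    dvc_dir = os.path.join(clone_dir, ".dvc")
    return SourceRepo(clone_dir, dvc_dir if os.path.isdir(dvc_dir) else None)


def import_path(clone_dir: str, path: str, out: str) -> List[str]:
    """One import path; the DVC-specific step is skipped for plain Git clones."""
    source = open_source(clone_dir)
    steps = []
    if source.is_dvc:
        # Placeholder: DVC-specific handling would go here.
        steps.append("dvc-specific step")
    shutil.copy(os.path.join(source.root, path), out)
    steps.append("copy file")
    return steps
```

Here the branch on is_dvc is the only place where the two kinds of source differ, which is the duplication-avoiding property the comment is after.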


4 participants