Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add: auto convert absolute_path => relative #2975

Merged
merged 15 commits into from
Feb 10, 2020
Merged

Conversation

casperdcl
Copy link
Contributor

@casperdcl casperdcl commented Dec 18, 2019

  • repo$ dvc add $PWD/file.bin should behave the same as dvc add file.bin (relative paths for files within the repo)
  • are there other dvc commands which should auto-covert absolute paths to relative?
  • are the better ways of doing the conversion?
  • fix (update) tests
  • add tests
  • fixes dvc add absolute_path #2955

@casperdcl casperdcl self-assigned this Dec 18, 2019
@casperdcl casperdcl added p2-medium Medium priority, should be done, but less important ui user interface / interaction labels Dec 18, 2019
@casperdcl
Copy link
Contributor Author

also not sure whether this is a bug/feature request/enhancement :D

@efiop
Copy link
Contributor

efiop commented Dec 18, 2019

dvc run comes to mind as well. Same with dvc import -o and dvc import-url. You might want to just do that convertion in the OutputBASE/LOCAL class, when we mutating the def_path.

@casperdcl
Copy link
Contributor Author

will that also affect paths written in *.dvc files?

@efiop
Copy link
Contributor

efiop commented Dec 19, 2019

@casperdcl Yes, but that is the point of this PR, right?

@efiop
Copy link
Contributor

efiop commented Jan 10, 2020

@casperdcl ?

@casperdcl
Copy link
Contributor Author

dvc run comes to mind as well. Same with dvc import -o and dvc import-url. You might want to just do that convertion in the OutputBASE/LOCAL class, when we mutating the def_path.

I think dvc.output.local.OutputLOCAL is already fine (there's a lot of complicated logic in it).

Just pushed an update for dvc.output.base.OutputBASE...

@casperdcl casperdcl requested a review from Suor January 16, 2020 22:17
@efiop
Copy link
Contributor

efiop commented Jan 17, 2020

Thanks @casperdcl ! Looks good! Please see some comments above 🙂

@casperdcl
Copy link
Contributor Author

@efiop ok the style inconsistencies are now annoying. I feel like opening a new PR to fix things like relative imports and linting bash scripts...

Would you prefer I revert imports here?

@efiop
Copy link
Contributor

efiop commented Jan 19, 2020

@casperdcl We have issues for both of those already IIRC.

Would you prefer I revert imports here?

Let them be this time, but please don't include things like that next time.

Copy link
Contributor

@Suor Suor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should understand that this is a backward incompatible change. Absolute paths are entirely legal in dvc, making an exception like absolute path is ok if it doesn't lie within a repo is wrong and the first reason is that repo may move:

cd /some/dira
dvc init

echo foo > /some/dirb/foo 
dvc add /some/dirb/foo

rm -rf /some/dirb && mv /some/dira /some/dirb
# now we have a .dvc file with absolute ref into the repo dir

So, I suggest dropping this.

@efiop
Copy link
Contributor

efiop commented Jan 20, 2020

@Suor Using abs paths to in-repo files will cause adding them as external local outputs, which is not right. This change will break cases where there was a misuse, which is ok. There is no reason to drop this PR.

@Suor
Copy link
Contributor

Suor commented Jan 20, 2020

@efiop but is within and out of repo dir is not constant.

EDIT. This PR assumes a user made a mistake, which we don't really know.

@efiop
Copy link
Contributor

efiop commented Jan 20, 2020

@Suor Git doesn't have "external" stuff, but same as it, anything inside of our repo is an in-repo data, even if user has specified an absolute path. We've received reports from confused users, hence why we are fixing it here this way.

@casperdcl
Copy link
Contributor Author

I think maybe we could print a warning about converting abs => relative?

Git doesn't have "external" stuff

Well actually I regularly do things like mess with --local core.worktree but in general I think it's safe to assume "external" suff isn't intended.

@casperdcl
Copy link
Contributor Author

I think there are more people who e.g. drag-drop files into terminals (autofilling abs paths) and then expect them to be treated as relative to the repo than people who deliberately want in-repo paths to be tracked as external/absolute paths.

@efiop
Copy link
Contributor

efiop commented Jan 20, 2020

I think maybe we could print a warning about converting abs => relative?

No need, when users add files this way they expect them to become in-repo ones. Our current behavior is a bug, no need to put a warning on top. 🙂

Well actually I regularly do things like mess with --local core.worktree but in general I think it's safe to assume "external" suff isn't intended.

That is not the same "external" stuff that I'm talking about. git add /absolute/out/of/repo/path won't work, but dvc add /absolute/out/of/repo/path does.

So let's add a test here and we can merge it 🙂

@Suor
Copy link
Contributor

Suor commented Feb 6, 2020

If that fixes this issue I'd say it's a bug because of

If def_path or more precisely dvc add argument is absolute it is not resolved as relative to anything. So I can't see any bug.

@efiop
Copy link
Contributor

efiop commented Feb 6, 2020

Stumbled upon this while doing some updates to dvcx publishing, looks like we need to make .is_in_repo say True for absolute paths when that path is in repo root. This will trigger automatic in repo handling for Stage.create(..., add=True), which will do the job naturally.

Good point, that might work, but I'm worried about def_path being not quire right until we actually write the dvc-file. Might be still suitable though.

@casperdcl
Copy link
Contributor Author

I decided not to touch is_in_repo for now; not sure if changing its behaviour will break anything else.

@efiop
Copy link
Contributor

efiop commented Feb 6, 2020

@casperdcl is_in_repo now works automatically because you've adjusted def_path. 👍

@Suor Adjusting is_in_repo will affect def_path only in dumpd, so effectively only after reloading, which is not the best way of handling it, as it will break the symmetry between "before" and "after" reloading.

@efiop efiop merged commit 3d5b8ec into iterative:master Feb 10, 2020
@efiop
Copy link
Contributor

efiop commented Feb 10, 2020

Thanks @casperdcl ! 🙏

@casperdcl casperdcl deleted the abs-path branch February 10, 2020 23:04
@casperdcl
Copy link
Contributor Author

casperdcl commented Feb 10, 2020

likely my highest churn ever... ~2 loc/commit and ~2 comments/loc 😮

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
p2-medium Medium priority, should be done, but less important ui user interface / interaction
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dvc add absolute_path
3 participants