Skip to content

Conversation

@daavoo
Copy link
Contributor

@daavoo daavoo commented Dec 27, 2022

@daavoo daavoo self-assigned this Dec 27, 2022
@daavoo daavoo requested a review from efiop December 27, 2022 10:43
@codecov-commenter
Copy link

codecov-commenter commented Dec 27, 2022

Codecov Report

Base: 71.78% // Head: 72.03% // Increases project coverage by +0.24% 🎉

Coverage data is based on head (9b16b80) compared to base (a1f8c64).
Patch coverage: 84.61% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #176      +/-   ##
==========================================
+ Coverage   71.78%   72.03%   +0.24%     
==========================================
  Files          25       25              
  Lines        1641     1659      +18     
  Branches      206      209       +3     
==========================================
+ Hits         1178     1195      +17     
+ Misses        433      432       -1     
- Partials       30       32       +2     
Impacted Files Coverage Δ
src/dvc_objects/fs/local.py 60.31% <50.00%> (+0.15%) ⬆️
src/dvc_objects/fs/base.py 57.40% <100.00%> (ø)
tests/fs/test_localfs.py 100.00% <100.00%> (ø)
src/dvc_objects/fs/path.py 53.46% <0.00%> (-3.68%) ⬇️
src/dvc_objects/db.py 97.86% <0.00%> (ø)
src/dvc_objects/fs/utils.py 41.66% <0.00%> (+4.16%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

To comply with fsspec.
self.fs.rm_file(path)

def rm(self, path: AnyFSPath) -> None:
self.fs.rm(path, recursive=True)
Copy link
Contributor Author

@daavoo daavoo Jan 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a reason to hardcode recursive=True here?

It doesn't comply with fsspec and was introducing even more overhead in the underlying expand_path call for each dvc-{x} filesystem

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not actually sure, given that we also define LocalFilesystem.rm(remove=False)

I think we should be exposing the flag properly here and defaulting to false

Copy link
Contributor

@efiop efiop Jan 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason is that our old filesystem's (aka remote aka tree) rm method was always recursive. Might be not needed anymore, but that will require checking

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked dvc repo and the only place potentially affected I have found is in output (treeverse/dvc#8825)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@daavoo Thank you for checking it!

@dberenbaum
Copy link

@daavoo I think it's not just about fsspec compliance but also performance, right? This would go a long way towards fixing treeverse/dvc#5961, right (it would also fix treeverse/dvc#8757)? Can we see what these changes look like in dvc-bench?

@daavoo
Copy link
Contributor Author

daavoo commented Jan 12, 2023

@daavoo I think it's not just about fsspec compliance but also performance, right? This would go a long way towards fixing iterative/dvc#5961, right

The actual P.R. fixing treeverse/dvc#5961 is treeverse/dvc-data#244 , this is a pre-requisite for that P.R. to don't break local gc.

it would also fix iterative/dvc#8757)

Indeed this P.R. should fix treeverse/dvc#8757

Can we see what these changes look like in dvc-bench?

As of today, no. I have opened treeverse/dvc-bench#407

@dberenbaum
Copy link

As of today, no. I have opened iterative/dvc-bench#407

By the way, I'd be curious to see the results of these changes for other benchmarks besides gc.

daavoo added a commit to treeverse/dvc that referenced this pull request Jan 16, 2023
daavoo added a commit to treeverse/dvc that referenced this pull request Jan 16, 2023
@efiop efiop merged commit c01b6c8 into main Jan 16, 2023
@daavoo daavoo deleted the rm-accept-list branch January 16, 2023 19:48
daavoo added a commit to treeverse/dvc that referenced this pull request Jan 18, 2023
efiop pushed a commit to treeverse/dvc that referenced this pull request Jan 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

external outputs: broken if pipeline output doesn't exist during stage initialization

5 participants