Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FileSystem.Unix: Directory.Delete: remove per item syscall. #59520

Merged
merged 6 commits into from
Nov 23, 2021

Conversation

tmds
Copy link
Member

@tmds tmds commented Sep 23, 2021

By recursing using FileSystemEnumerable we know the file type and
can omit the stat calls made by DirectoryInfo.Exists.

For the top level path, we can omit the call also and handle
non-directories when rmdir errno is ENOTDIR.

For the recursive case we can avoid recursion when the top level path rmdir
succeeds immediately.

FileSystemEntry is updated so IsSymbolicLink remembers the file is symbolic link
and does not make a syscall for it.

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Sep 23, 2021
@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.IO and removed community-contribution Indicates that the PR has been added by a community member labels Sep 23, 2021
@ghost
Copy link

ghost commented Sep 23, 2021

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Issue Details

By recursing using FileSystemEnumerable we know the file type and
can omit the stat calls made by DirectoryInfo.Exists.

For the top level path, we can omit the call also and handle
non-directories when rmdir errno is ENOTDIR.

For the recursive case we can avoid recursion when the top level path rmdir
succeeds immediately.

FileSystemEntry is updated so IsSymbolicLink remembers the file is symbolic link
and does not make a syscall for it.

Author: tmds
Assignees: -
Labels:

area-System.IO

Milestone: -

{
if (!DirectoryExists(fullPath))
{
throw Interop.GetExceptionForIoErrno(Interop.Error.ENOENT.Info(), fullPath, isDirectory: true);
Copy link
Member Author

@tmds tmds Sep 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept ENOENT for compatibility. ENOTDIR would be more appropriate since something exists.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamsitnik I would prefer to change this error code because I think the previous behavior was unintentional.
If you try to delete a socket file using Directory.Delete, the code used to throw ENOENT.
I think it is more appropriate to throw ENOTDIR. Do you agree?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamsitnik can you share your opinion?

@tmds
Copy link
Member Author

tmds commented Sep 23, 2021

CI fails are not related: RegexMatchTests.Match_ExcessPrefix(engine: SourceGenerated) times out on Windows and Linux.

@tmds
Copy link
Member Author

tmds commented Oct 13, 2021

Some changes overlap with #60214. Please review that first, and I'll rebase here to remove the overlap.

@adamsitnik
Copy link
Member

As soon as we merge #60214 I am going to review this PR.

@adamsitnik adamsitnik self-assigned this Nov 15, 2021
@adamsitnik adamsitnik added this to the 7.0.0 milestone Nov 15, 2021
@adamsitnik adamsitnik added the tenet-performance Performance related issue label Nov 15, 2021
@jeffhandley jeffhandley added the community-contribution Indicates that the PR has been added by a community member label Nov 15, 2021
@tmds tmds force-pushed the rmdir branch 2 times, most recently from 3a1d58e to 34a3869 Compare November 18, 2021 12:39
By recursing using FileSystemEnumerable we know the file type and
can omit the stat calls made by DirectoryInfo.Exists.

For the top level path, we can omit the call also and handle
non-directories when rmdir errno is ENOTDIR.

For the recursive case we can avoid recursion when the top level path rmdir
succeeds immediately.

FileSystemEntry is updated so IsSymbolicLink remembers the file is symbolic link
and does not make a syscall for it.
@tmds
Copy link
Member Author

tmds commented Nov 18, 2021

I've rebased and removed the overlap with #60214. This is up for review.

@carlossanlop carlossanlop self-assigned this Nov 18, 2021
Copy link
Member

@carlossanlop carlossanlop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far. I left some comments for you to consider. Thanks for the change, @tmds.

Can you please run the enumeration benchmarks to ensure there's no perf regression?

@tmds
Copy link
Member Author

tmds commented Nov 18, 2021

@adamsitnik @carlossanlop I've renamed the methods and added some comments. I hope it is more clear now what the implementation does. I still need to add the test.

Copy link
Member

@adamsitnik adamsitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tmds thank you for another great contribution!

I've benchmarked this PR for removing a copy of a dotnet/runtime repository with all the build artifacts.

On average it's 8% faster and allocates 50% less memory:

Method Job Mean Ratio Allocated
RecursiveDeleteDirectory before 2.550 s 1.00 78 MB
RecursiveDeleteDirectory after 2.356 s 0.92 34 MB

Just out of curiosity I've compared .NET to the the rm core utility, .NET is still 10-20% slower. Most likely because rm is not using recursive code (it uses fts):

https://github.com/coreutils/coreutils/blob/3e2d64448300ea91be258af5b64ed3b97b865bd6/src/rm.c#L17-L19

But the implementation is non-trivial ;)

@tmds
Copy link
Member Author

tmds commented Nov 22, 2021

@carlossanlop I've updated the test so it also runs against the DirectoryInfo.Delete API.
The PR should now be good to merge (when the CI run finishes).

@adamsitnik
Copy link
Member

@tmds some of the tests are failing now. Could you PTAL?

@tmds
Copy link
Member Author

tmds commented Nov 22, 2021

@adamsitnik it seems that this fails on Windows. Is that expected?

            var target = GetTestFilePath();
            Directory.CreateDirectory(target);

            var linkParent = GetTestFilePath();
            Directory.CreateDirectory(linkParent);

            var linkPath = Path.Combine(linkParent, GetTestFileName());
            Directory.CreateSymbolicLink(linkPath, target);

            Assert.True(Directory.Exists(linkPath), "linkPath should exist");

@adamsitnik
Copy link
Member

it seems that this fails on Windows. Is that expected?

It's not expected to me, but I am not a symlink expert. @carlossanlop @jozkee is that expected?

@tmds Since you have not touched the Windows part, could you please just mark this test as Unix-specific for now so we can merge the PR?

@tmds
Copy link
Member Author

tmds commented Nov 23, 2021

It's not expected to me, but I am not a symlink expert. @carlossanlop @jozkee is that expected?

nvmd, I had unintentionally made changes to another similar test which caused it to fail.

The test should pass now on Linux and Windows.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.IO community-contribution Indicates that the PR has been added by a community member tenet-performance Performance related issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants