Allow watch of symbolic links to folders on Unix #52679

Merged
merged 9 commits into from
May 26, 2021

Conversation

jozkee
Member

@jozkee jozkee commented May 12, 2021

Fixes #25078

@jozkee jozkee added this to the 6.0.0 milestone May 12, 2021
@jozkee jozkee requested a review from carlossanlop May 12, 2021 23:35
@jozkee jozkee self-assigned this May 12, 2021
@ghost

ghost commented May 12, 2021

Tagging subscribers to this area: @carlossanlop
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes #25078

Author: Jozkee
Assignees: Jozkee
Labels:

area-System.IO

Milestone: 6.0.0

@@ -28,11 +31,11 @@ public void FileSystemWatcher_Directory_Changed_WatchedFolder()
{
using (var testDirectory = new TempDirectory(GetTestFilePath()))
using (var dir = new TempDirectory(Path.Combine(testDirectory.Path, "dir")))
- using (var watcher = new FileSystemWatcher(dir.Path, "*"))
+ using (var watcher = new FileSystemWatcher(GetDirectoryOrSymLink(dir.Path), "*"))
{
Action action = () => Directory.SetLastWriteTime(dir.Path, DateTime.Now + TimeSpan.FromSeconds(10));
Member

Not related to your change, but I'm having trouble understanding why this test is not expecting any events after changing the last write time.

Member Author

@jozkee jozkee May 13, 2021

The FSW in this case is watching the same folder whose LastWriteTime is being changed and this test demonstrates that changes to the watched directory are not being caught.

Member Author

On second thought, I think these tests that use ExpectEvent(watcher, 0, ...) are wrong and should use ExpectNoEvent instead.

@carlossanlop
Member

@jozkee, in the AddJsonFile issue #36091 , @ericstj shared a workaround for the symlinks issue. The full working program can be found here: https://github.com/ericstj/sample-code/tree/runtime36091/symLinkConfig

I suggest you clone the code from that program, remove the workaround code, consume your compiled bits containing the changes in this PR, and see if the bug is fixed.

@ericstj
Member

ericstj commented May 13, 2021

remove the workaround code, consume your compiled bits containing the changes in this PR, and see if the bug is fixed.

This won't fix that bug. This fix is only allowing the watched directory to be a symlink. It's not following symlinks in child directories, nor following symlink'ed files. There's even a more complicated scenario with #36091 where a chain of symlinks is changed in the middle. I think the best fix for #36091 would be to make it use a polling file watcher that follows symlinks to poll the target.
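A minimal sketch of the polling idea mentioned above, assuming .NET 6 or later, where `File.ResolveLinkTarget` is available. The helper names `GetFinalTargetWriteTimeUtc` and `HasChanged` are hypothetical and are not part of this PR or of any configuration provider; this only illustrates "follow the symlink chain, then poll the target".

```csharp
using System;
using System.IO;

// Returns the last-write time of the file that `path` ultimately points to,
// following a chain of symbolic links; null if the target cannot be read.
static DateTime? GetFinalTargetWriteTimeUtc(string path)
{
    try
    {
        // ResolveLinkTarget returns null when `path` is not a link.
        var target = File.ResolveLinkTarget(path, returnFinalTarget: true);
        string resolved = target?.FullName ?? path;
        if (!File.Exists(resolved)) return null;
        return File.GetLastWriteTimeUtc(resolved);
    }
    catch (IOException)
    {
        return null; // treated as "no readable target" in this sketch
    }
}

// One polling step (hypothetical helper): reports whether the observed write
// time differs from the previous poll, and remembers the new value.
static bool HasChanged(string path, ref DateTime? lastSeen)
{
    DateTime? current = GetFinalTargetWriteTimeUtc(path);
    bool changed = current != lastSeen;
    lastSeen = current;
    return changed;
}
```

A caller would invoke `HasChanged` on a timer and reload when it returns true. Because the link is re-resolved on every poll, a link in the middle of a chain being repointed shows up as a change in the final target's timestamp, provided the old and new targets differ in write time.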

@jozkee
Member Author

jozkee commented May 13, 2021

It's not following symlinks in child directories, nor following symlink'ed files.

This fix is actually allowing symlinks within the watched directory to be followed, since it removes IN_DONT_FOLLOW.

But I am wondering if doing so could be a problem, especially for cyclic symlinks. It appears to me that cycles are not a problem, since the code uses a set to track the child directories that are being watched:

// Then store the path information into our map.
WatchedDirectory? directoryEntry;
bool isNewDirectory = false;
if (_wdToPathMap.TryGetValue(wd, out directoryEntry))

@ericstj do you think that could be a problem?
As far as I have observed, the only issue with this change is that it will make Linux (I haven't checked OSX) behave differently from Windows, which doesn't follow symlinks within the watched directory.

@ericstj
Member

ericstj commented May 13, 2021

This fix is actually allowing symlinks within the watched directory to be followed, since it removes IN_DONT_FOLLOW.

That wasn't my understanding on the docs on INotify, though I'm far from a Linux expert. The way I read the docs they were talking about pathname which I read to mean only the directory passed in. If a file within that directory is a symlink outside the directory are you seeing that it's still watched? I guess child directories would be followed since those require independent additions https://github.com/dotnet/runtime/blob/6a09b93a2cf72049b896dcfe6c437732bc78a6fc/src/libraries/System.IO.FileSystem.Watcher/src/System/IO/FileSystemWatcher.Linux.cs#L428-L430

@ericstj do you think that could be a problem?

Test it out and see 😄 I guess it depends on this:
https://github.com/dotnet/runtime/blob/6a09b93a2cf72049b896dcfe6c437732bc78a6fc/src/libraries/System.IO.FileSystem.Watcher/src/System/IO/FileSystemWatcher.Linux.cs#L342-L344
So if that gives the same result in case of a symlink cycle then I think the dictionary will break the cycle.
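A small sketch of why a lookup keyed on a stable identity would break a symlink cycle. This is a hypothetical in-memory model, not the code in FileSystemWatcher.Linux.cs: `getId` stands in for whatever identity the watch-add call returns, under the assumption that two paths reaching the same directory yield the same value, and `AddWatches` is an illustrative name.

```csharp
using System;
using System.Collections.Generic;

// Walks a directory tree and "adds a watch" for each directory, skipping any
// directory whose id is already present in the map.
static List<string> AddWatches(
    string root,
    Func<string, int> getId,
    Func<string, IEnumerable<string>> getChildren)
{
    var idToPath = new Dictionary<int, string>();
    var added = new List<string>();
    var pending = new Stack<string>();
    pending.Push(root);

    while (pending.Count > 0)
    {
        string path = pending.Pop();
        int id = getId(path);

        // Already tracked under another path: stop here, which ends the cycle.
        if (idToPath.ContainsKey(id)) continue;

        idToPath[id] = path;
        added.Add(path);
        foreach (string child in getChildren(path)) pending.Push(child);
    }

    return added;
}
```

If the identity differed per path instead of per directory, the same walk would not terminate on a cycle, which is the condition the comment above is pointing at.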

WRT the differences between platforms, we could make this a capability that's just different between the platforms, or we could introduce an API and have that API throw on platforms where we can't honor it. Do you imagine there are folks who would want to disallow this behavior: that would speak to having an API? I remember @GrabYourPitchforks having an opinion about things that follow symlinks by default.

@jozkee
Member Author

jozkee commented May 19, 2021

This fix is actually allowing symlinks within the watched directory to be followed, since it removes IN_DONT_FOLLOW.

That wasn't my understanding on the docs on INotify, though I'm far from a Linux expert. The way I read the docs they were talking about pathname which I read to mean only the directory passed in. If a file within that directory is a symlink outside the directory are you seeing that it's still watched? I guess child directories would be followed since those require independent additions

Ah yes, the symlinks within the watched directory were being followed with my previous change because we manually call inotify_add_watch for each subdirectory when IncludeSubdirectories is set. Thus my statement was wrong. I have undone the changes that caused child symlinks to be followed by default, in order to behave similarly to Windows.

we could introduce an API and have that API throw on platforms where we can't honor it.

Agree, following symlinks should not be the default, or at least should not be addressed by this PR; it would be a discussion of its own.

Do you imagine there are folks who would want to disallow this behavior

We already have a very similar discussion in the issue for symbolic link APIs:
#24271 (comment)

It seems there are people on both sides of "having follow semantics as the default" or "make follow links opt-in", so I think it would not be great to allow that here.
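To illustrate the "root may be a link, children that are links are not followed" behavior described above, here is a hedged sketch in managed code. It is not the inotify-based implementation: `CollectDirectoriesToWatch` is a hypothetical helper, and it assumes .NET 6 or later for `FileSystemInfo.LinkTarget`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Collects the root plus every real subdirectory beneath it. Subdirectories
// that are symbolic links are skipped, so only the root itself may be a link.
static List<string> CollectDirectoriesToWatch(string root)
{
    var result = new List<string> { root };
    var pending = new Stack<string>();
    pending.Push(root);

    while (pending.Count > 0)
    {
        var current = new DirectoryInfo(pending.Pop());
        foreach (DirectoryInfo dir in current.EnumerateDirectories())
        {
            // LinkTarget is non-null for links; child links are not followed.
            if (dir.LinkTarget != null) continue;

            result.Add(dir.FullName);
            pending.Push(dir.FullName);
        }
    }

    return result;
}
```

An opt-in "follow symlinks" mode, as discussed in this thread, would amount to relaxing the `LinkTarget` check and adding cycle protection.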

@jozkee jozkee marked this pull request as ready for review May 19, 2021 19:33
Member

@carlossanlop carlossanlop left a comment

LGTM. I left some small, optional suggestions and a couple of questions.

using var watcher = new FileSystemWatcher(linkPath);

// Act - Assert
Assert.Throws<FileNotFoundException>(() => watcher.EnableRaisingEvents = true);
Member

Just curious: What's the exception message? I find it strange that we throw FileNotFoundException instead of DirectoryNotFoundException. The symbolic link was created with isDirectory: true, and the file exists, and we already know it doesn't make sense to watch a file, so I would've expected an error telling me a directory could not be found.

Same question applies for line 55, where the link targets itself, but it's created with isDirectory: true too.

Member Author

What's the exception message?

Error reading the {directory name} directory.

Same question applies for line 55, where the link targets itself, but it's created with isDirectory: true too.

Same error is thrown.

@ericstj
Member

ericstj commented May 21, 2021

I have undone the changes that caused child symlinks to be followed by default, in order to behave similarly to Windows.

So then this PR is just making a change to follow the symlink if it is the root directory watched, is that correct? Is that the same as Windows?

I didn't want to imply any sort of right/wrong judgement with my previous comment around following symlinks of child directories. I actually think that could be seen as an interesting feature that is easier for us to do on Linux due to the way the watching APIs work (we already need to traverse and add, and there is a flag for following). The feature could also be added to the Windows implementation if we shared some of the code that traverses; rather than adding the directories it finds (not necessary since ReadDirectoryChanges can be recursive), it could add the symlinks it finds. Interestingly enough, if we did that, it would actually be similar to what would be needed to make a complete symlink-following solution for Linux as well, since it could cover the symlink-ed file case.

It seems there are people on both sides of "having follow semantics as the default" or "make follow links opt-in", so I think it would not be great to allow that here.

Agreed. I think it'd be possible to have a complete symlink-following file watcher, though it should probably be opt-in. IMHO such a feature would be a nice complement to the other symlink API that's being worked on and would probably be more impactful than this change. You might consider doing it instead of this change if this change was causing our behaviors to diverge on platforms. If not then this seems fine.

@jozkee
Member Author

jozkee commented May 21, 2021

So then this PR is just making a change to follow the symlink if it is the root directory watched, is that correct? Is that the same as Windows?

After my last changes, that's correct.

You might consider doing it instead of this change if this change was causing our behaviors to diverge on platforms. If not then this seems fine.

It is no longer diverging, so I think we can merge this and consider adding the "follow symlinks" feature flag on top.
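For anyone who wants to try the root-symlink scenario by hand, a rough sketch follows. `SeesCreatedEvent` is a hypothetical helper, not one of the test utilities in this PR; it watches one path, creates a file through another path, and reports whether a Created event arrived in time.

```csharp
using System;
using System.IO;
using System.Threading;

// Watches `watchedPath`, creates `fileName` inside `realDirectory`, and returns
// true if a Created event for that file name is raised before the timeout.
static bool SeesCreatedEvent(string watchedPath, string realDirectory, string fileName, TimeSpan timeout)
{
    using var signal = new ManualResetEventSlim(false);
    using var watcher = new FileSystemWatcher(watchedPath, "*");

    watcher.Created += (sender, e) =>
    {
        // e.Name is relative to the watched directory.
        if (e.Name == fileName) signal.Set();
    };
    watcher.EnableRaisingEvents = true;

    File.WriteAllText(Path.Combine(realDirectory, fileName), "x");
    return signal.Wait(timeout);
}
```

To exercise the case this PR targets, one could create a link with `Directory.CreateSymbolicLink(linkPath, realDirectory)` (available from .NET 6) and pass `linkPath` as `watchedPath`. Whether the event is raised then depends on the platform and on running a build that includes this change.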

@ericstj
Member

ericstj commented May 21, 2021

consider adding the "follow symlinks" feature flag on top.

Such a feature would help solve #36091, when not using polling, and is probably in line with a user's expectations when they don't know anything about the symlinks in the file-system they run on.

@jozkee
Member Author

jozkee commented May 21, 2021

@jaredpar, infra question: many OSX builds are failing to unzip test assets, do you know if this is a known issue? I already tried re-running a couple times and the error persists.

@jozkee
Member Author

jozkee commented May 21, 2021

cc @dotnet/runtime-infrastructure.

@ericstj
Member

ericstj commented May 21, 2021

I believe what happened is that the zip produced by libraries build has a problem:
https://dev.azure.com/dnceng/public/_build/results?buildId=1149196&view=logs&j=0ce95488-260d-5e86-47b9-82bc4e703e0b&t=a9a5369a-5269-5579-8bb0-b76ef1819668

Since that build was successful your reruns aren't fixing it. The legs which are failing are all trying to consume the artifact produced by this build leg.

It's not clear to me why that artifact is bad, I don't see a failure in zipping. Maybe we can examine the artifact in AzDo and see if something is wrong with it.

@ericstj
Member

ericstj commented May 21, 2021

I can reproduce locally consuming that artifact:

x helix/tests/OSX.AnyCPU.Debug/System.Globalization.CalendarsWithConfigSwitch.Tests.zip: gzip decompression failed
tar: Error exit delayed from previous errors.

@steveisok
Member

steveisok commented May 21, 2021

There's a failure in the iOS simulator leg that may be significant

Actually it's in iOS, tvOS, and MacCatalyst. A crash in System.IO.FileSystem.Watcher.Tests

https://dev.azure.com/dnceng/public/_build/results?buildId=1149199&view=ms.vss-test-web.build-test-results-tab

@jozkee
Member Author

jozkee commented May 21, 2021

It is failing without even running tests, right? [link]:

[23:21:17] dbug: 23:21:17.4174790 Xamarin.Hosting: Launching net.dot.System.IO.FileSystem.Watcher.Tests async on 'tvOS 13.4 (17L255) - Apple TV' with: {
[23:21:17] dbug: 23:21:17.4175660     arguments =     (
[23:21:17] dbug: 23:21:17.4175820     );
[23:21:17] dbug: 23:21:17.4175920     environment =     {
[23:21:17] dbug: 23:21:17.4176030         NSUnbufferedIO = YES;
[23:21:17] dbug: 23:21:17.4176150         "NUNIT_AUTOEXIT" = true;
[23:21:17] dbug: 23:21:17.4176280         "NUNIT_ENABLE_XML_OUTPUT" = true;
[23:21:17] dbug: 23:21:17.4176390         "NUNIT_HOSTNAME" = "127.0.0.1";
[23:21:17] dbug: 23:21:17.4176510         "NUNIT_HOSTPORT" = 61627;
[23:21:17] dbug: 23:21:17.4176710         "NUNIT_XML_VERSION" = xUnit;
[23:21:17] dbug: 23:21:17.4176840         "OS_ACTIVITY_DT_MODE" = YES;
[23:21:17] dbug: 23:21:17.4176960     };
[23:21:17] dbug: 23:21:17.4177080     stderr = "/tmp/helix/working/B2B20951/w/AC3709FC/uploads/net.dot.System.IO.FileSystem.Watcher.Tests.err.log";
[23:21:17] dbug: 23:21:17.4177220     stdout = "/tmp/helix/working/B2B20951/w/AC3709FC/uploads/net.dot.System.IO.FileSystem.Watcher.Tests.log";
[23:21:17] dbug: 23:21:17.4177430 }
[23:21:17] dbug: 23:21:17.5383860 Xamarin.Hosting: Launched net.dot.System.IO.FileSystem.Watcher.Tests with pid 57969
[23:21:24] dbug: 23:21:24.0286310 Connection from 127.0.0.1:61628 saving logs to /tmp/helix/working/B2B20951/w/AC3709FC/uploads/test-tvos-simulator-20210520_232116.log
[23:21:24] dbug: 23:21:24.0292290 Test execution started
[23:21:24] dbug: 23:21:24.5487130 Test log server listening on: 0.0.0.0:61627
[23:21:24] dbug: 23:21:24.6867240 Xamarin.Hosting: Simulated process has exited.
[23:21:25] dbug: 23:21:25.0267760 Process mlaunch exited with 0
[23:21:25] dbug: 23:21:25.0288540 Test run completed
[23:21:26] dbug: 23:21:26.1915890 Test run crashed before it started (no log file produced)
[23:21:26] dbug: 23:21:26.1943200 No crash reports, waiting 30 seconds for the crash report service...
[23:21:56] fail: Application run crashed
                 No test log file was produced
                 
                 Check logs for more information

In "Apple TV (tvOS 13.4) - created by XHarness.log", what does this mean?:

May 20 23:21:17 dci-mac-build-139 System.IO.FileSystem.Watcher.Tests[57969]: assertion failed: 19H2 17L255: libxpc.dylib + 83746 [F3BF99B7-EDF1-3A8E-818A-284AAB332A13]: 0x7d

The source changes I made should only affect Linux.
The new tests I added are supposedly running on all platforms, but the log says that tests are not even running.

I'm kinda lost now, any guidance on how to repro these errors locally?

@steveisok
Member

The tests run, but crash and unfortunately the logs don't tell much. I'll try and pull your branch to see if I can tell.

@ericstj
Member

ericstj commented May 21, 2021

FWIW I think @steveisok and I are looking at two different things. @steveisok is looking at test failures from the staging build that don't actually fail the PR, though we'd want to understand them if this PR were introducing them. I was commenting on the failures in the build leg that launches tests on OSX; those appear to stem from a corrupt test tar-ball, which is corrupt in the build artifacts.

If you wanted to see if that corrupt tar-ball was a fluke you could try to retrigger the entire build. My guess is there was an unreported corruption in upload of the tar-ball, which would be an AzDo task issue. cc @MattGal in case he's seen that before.

@ericstj
Member

ericstj commented May 21, 2021

/azp run runtime

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@steveisok
Member

FWIW I think @steveisok and I are looking at two different things.

Correct, I was curious if there were any staging pipeline failures with this change and wanted to help validate.

@jozkee, it looks like what you're doing in the symlinks tests isn't supported on iOS. I wish this was printed in the logs and that's something I can take away from this and try to improve upon.

System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
 ---> System.PlatformNotSupportedException: Operation is not supported on this platform.
   at System.Diagnostics.Process.StartCore(ProcessStartInfo startInfo) in System.Diagnostics.Process.dll:token 0x6000105+0xe
   at System.Diagnostics.Process.Start() in System.Diagnostics.Process.dll:token 0x60000de+0xab
   at System.IO.Tests.FileSystemWatcherTest.CreateSymLink(String targetPath, String linkPath, Boolean isDirectory) in System.IO.FileSystem.Watcher.Tests.dll:token 0x6000129+0x8c

@jozkee
Member Author

jozkee commented May 21, 2021

@steveisok that's weird, I was supposedly using [ConditionalClass] in order to skip the tests inside the class if SymLinks are not supported.
https://github.com/dotnet/runtime/blob/68e64589ccceb2ff07e7f9cc94f610bef983dad4/src/libraries/System.IO.FileSystem.Watcher/tests/FileSystemWatcher.SymbolicLink.cs#L9-L10

Is the attribute not working properly?

@carlossanlop
Member

Is the attribute not working properly?

Apparently not. My symlinks PR had the same issue: I was using ConditionalClass and the CI just failed. I switched to ConditionalTheory and ConditionalFact, just like all the other System.IO.FileSystem tests that depend on creating symlinks (none of them uses ConditionalClass, probably due to the same root cause).

@steveisok
Member

I think the attribute is working, it's just that CanCreateSymbolicLinks is trying to create a process.

@jozkee
Member Author

jozkee commented May 24, 2021

I've avoided hitting Process.Start() on unsupported platforms (iOS, tvOS and MacCatalyst) in order to fix the CI errors.
0d586e5

@@ -461,6 +461,11 @@ protected static bool CanCreateSymbolicLinks

public static bool CreateSymLink(string targetPath, string linkPath, bool isDirectory)
{
if (OperatingSystem.IsIOS() || OperatingSystem.IsTvOS() || OperatingSystem.IsMacCatalyst()) // OSes that don't support Process.Start()
Member

Android and Browser can't start processes either:

Suggested change
- if (OperatingSystem.IsIOS() || OperatingSystem.IsTvOS() || OperatingSystem.IsMacCatalyst()) // OSes that don't support Process.Start()
+ if (OperatingSystem.IsIOS() || OperatingSystem.IsTvOS() || OperatingSystem.IsMacCatalyst() || OperatingSystem.IsAndroid() || OperatingSystem.IsBrowser()) // OSes that don't support Process.Start()

Member Author

Shouldn't those platforms be specified with the attribute as the other three?

[UnsupportedOSPlatform("ios")]
[UnsupportedOSPlatform("maccatalyst")]
[UnsupportedOSPlatform("tvos")]
public bool Start()

cc @jeffhandley
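One way the platform check discussed above could be factored out, as a sketch only. `CanStartProcesses` and `TryRun` are hypothetical names and are not helpers from the test project; the list of platforms is the one proposed in this review thread, not an authoritative statement of where `Process.Start()` is supported.

```csharp
using System;

// False on the platforms named in this thread as unable to start processes.
static bool CanStartProcesses() =>
    !(OperatingSystem.IsIOS()
      || OperatingSystem.IsTvOS()
      || OperatingSystem.IsMacCatalyst()
      || OperatingSystem.IsAndroid()
      || OperatingSystem.IsBrowser());

// Runs a process-based action only where starting a process is expected to
// work, and returns false instead of hitting PlatformNotSupportedException.
static bool TryRun(Func<bool> startProcess) =>
    CanStartProcesses() && startProcess();
```

A symlink-creation helper that shells out could then call `TryRun(...)` and report "cannot create symlinks" on those platforms, so that a capability check used by conditional test attributes evaluates to false without ever reaching `Process.Start()`.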

@jozkee jozkee merged commit 3bd75ba into dotnet:main May 26, 2021
@jozkee jozkee deleted the fsw-symlink branch May 26, 2021 14:47
@ghost ghost locked as resolved and limited conversation to collaborators Jun 25, 2021
Successfully merging this pull request may close these issues.

FileSystemWatcher does not raise events when target directory is symlink (on linux)
5 participants