-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8293067: (fs) Implement WatchService using system library (macOS) #10140
Conversation
👋 Welcome back mkartashev! A progress list of the required criteria for merging this PR into |
@mkartashev The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
Thanks for this PR, I plan to review this but it will take some time. Just to set expectations that the patch will likely through through several iterations as there will be many issues to discuss. |
I think it is still used on on AIX but there is someone from IBM working on that. It may be that we will have to keep it for new ports. For macOS then we might have to introduce a system property to use it, at least until there is confidence with the new implementation. |
I actually had that property, but pulled it out before submitting the pull request, so it's going to be very easy to re-introduce it. |
In our CI system |
Can you provide the |
I can't give the whole .jtr file due to internal content, but here are some excerpts.
The node in question is a MacPro6_1 running Mac_OS_X_12.4 on a Intel_R__Xeon_R__CPU_E5-1620_v2___3.70GHz with total memory 17179869184 (16.00 GB). Sadly, with timeouts, there is often little information to go on. |
Mailing list message from Michael Hall on nio-dev: For me. move subtree elapsed 1035 Elapsed times in millis -------------- next part -------------- |
I did a first pass over this. It's a good start and demonstrates that the file system events / FSEventStream* API can be used to implement WatchService. It's really unfortunate that it requires calling CFRunLoopRun to do the run loop as that means an upcall and JNI code that we would normally try to keep out of this area. One concern is that there is a lot of UnixPath (bytes) <-> String <-> CFString conversions going on and these will need time to work through (even with UTF-8). Also One architectural issue is that the WatchService implementation should not be using Path.of or Files.* methods as that will cause a loop when interposing on the default provider. The provider implementation should only use use the Unix provider methods directly. The use of toRealPath().normalize() looks odd, I would expect the normalize to be a no-op here. This is an area that will require careful study as it's just not clear how the macOS work when the sym link is to another directory on another file system. Look at ptr_to_jlong and jlong_to_jptr to go between points and jlong and that should eliminate some of the casts. The code includes SUBTREE support. I had hope this wouldn't be included initially as it adds a bit more complexity and could be added later. Style-wise the code is very different to the existing code and I think we'll need to do a few cleanup passes (overly long lines, naming, formatting, overdue of final, ... all minor stuff and there is a lot to go through before these things). |
@bplb Thanks for the log. It's really strange, though, that there is nothing from |
Mailing list message from Michael Hall on nio-dev: Excuse my unofficial interest. But I had attempted a kqueue based OS/X watch service sometime back that failed, so it?s interesting to see one working. While again the Move test runs in like? So why would tests consistently time out in Move runs when LotsOfEvents takes longer for me? I am of course unfamiliar with the JDK testing but this seems odd. |
@AlanBateman
Done,
Duly noted and, I think, corrected.
I may be misinterpreting you here, but to me it seems
All such cases were eliminated.
The recursive subdirectory watch support was removed (for now); let's focus on the initial support for FSEvents.
I also reformatted the native code and a bit of Java mostly to shorten the lines. The rest will have to be on case-by-case basis, I'm afraid. This time around I only ran the |
@mkartashev When a test times out, often not a lot of information is captured, including |
I reran the |
Thanks! Let's hope it stays this way. BTW, my initial version did have a thread cleanup problem that was caught by other tests. That was because I assumed the run loop would exit as soon as all input sources are pulled from it; that assumption proved to be false, so now there's |
Mailing list message from Brian Burkhalter on nio-dev: On Sep 12, 2022, at 4:24 AM, Michael Hall <mik3hall at gmail.com<mailto:mik3hall at gmail.com>> wrote: I had attempted a kqueue based OS/X watch service sometime back that failed, so it?s interesting to see one working. Concerning kqueue, please see my comment here: https://bugs.openjdk.org/browse/JDK-7133447?focusedCommentId=14522734&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14522734 Brian |
Mailing list message from Michael Hall on nio-dev:
I suppose it could of been number of file descriptors. Do you remember specific problems with the LotsOfEvents test? I remember the java would block waiting for something to happen. The kqueue native thread would loop on it?s on thread waiting for events but the kernel would just stop sending them. It?s been a while since I looked at the code so I don?t remember if I had any cleanup that would release file descriptors. I probably should have. But I don?t remember on what conditions it ran if I did. |
* Invoked on the CFRunLoopThread by the native code to report directories | ||
* that need to be re-scanned. | ||
*/ | ||
private void callback(final long eventStreamRef, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be named something more descriptive than callback
, perhaps even handleEvents
even though there is a method of the same name defined in MacOSXWatchKey
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
|
||
if (!relativeRootPath.equals(path)) { | ||
// Ignore events from subdirectories for now. | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should a line
eventFlagsPtr += SIZEOF_FS_EVENT_STREAM_EVENT_FLAGS;
be added here? (The definition of SIZEOF_FS_EVENT_STREAM_EVENT_FLAGS
would have to be moved up.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, most certainly. Thanks for catching this!
Thanks! No luck, then... |
Would using the non-deprecated API be something worth investigating? In the meantime, I will try and find some more time to see if we can narrow down these failures on the systems where this is failing. I don't know how soon I can get to that. |
Mailing list message from Michael Hall on nio-dev:
I assume this infrastructure is not generally available? |
Mailing list message from Michael Hall on nio-dev:
Are there any errors after 12.2. I am 12.6. I am not sure what Maxim is. Is it possible it was an Apple issue that is fixed in recent releases? -------------- next part -------------- |
Mailing list message from Maxim Kartashev on nio-dev: FWIW I'm on 10.15 On Wed, Nov 30, 2022 at 4:31 PM Michael Hall <mik3hall at gmail.com> wrote:
-------------- next part -------------- |
That's on my to-do list.
Thanks anyway, much appreciated! |
Mailing list message from Michael Hall on nio-dev:
I guess we?re lucky or there?s something different about the failing infrastructure. |
Mailing list message from Michael Hall on nio-dev:
Another, maybe remote, possibility - if it?s not the testing framework, or a fixed Apple bug, possibly it?s some quirk in different Xcode versions being used to build the jdk? https://git.openjdk.org/jdk/pull/11115 <https://git.openjdk.org/jdk/pull/11115> -------------- next part -------------- |
@mkartashev This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
@mkartashev This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the |
Mailing list message from Michael Hall on nio-dev: Is there any chance this will be made available in some other way? |
JetBrains Runtime has FSEvents-based implementation of WatchService on macOS. |
Is this similar to what was proposed here, I'm curious if the same issues were encountered when running the tests across a wider range of macOS releases. |
@AlanBateman It is the implementation this one was based upon. No known bugs exist there at this moment in time (tests were run on IntelliJ infrastructure, of course). |
@mkartashev Do you intend to continue work on this PR? If not, then we could potentially take over the work if that is acceptable. |
Mailing list message from Michael Hall on nio-dev: Thanks, |
As was previously discussed (I think) I don't believe that a |
I plan to have a shot at using newer (not declared deprecated) API for scheduling the event stream. If that fails, this implementation is up for grabs for all I care. |
Mailing list message from Michael Hall on nio-dev: This was some time ago and I have no current intention to continue work on that although I?m not sure I ever removed the code from the project. |
Mailing list message from Michael Hall on nio-dev: As I remember working on this was difficult because Maxim and I weren?t able to reproduce whatever errors the jdk testing framework was getting. If you say you are done, and the PR has been closed by openjdk, I?ll grab a copy. It would probably be easier than trying to extract it from the GitHub project you posted. How will your changes be tested if the PR has been closed? |
I guess I'll re-open it, perhaps as a new one if necessary. As you correctly noted, I can't test this fully by myself so I have to make the code available to others somehow. |
I think that we can help with testing. |
Mailing list message from Michael Hall on nio-dev: I decided to try and do a standalone version along with my OS/X default FileSystemProvider project. Just finished throwing something together for that. I based it on the JetBrains runtime since the ownership status here seems still sort of undetermined. If this isn?t the place to discuss this let me know. If that is the case, would you be interested at all in discussing it off-list @Maxim? My first real test with no debugging is getting? Exception in thread "FileSystemWatcher" Exception in thread "FileSystemWatcher" java.lang.InternalError: platform encoding not initialized Googling the error turns up occurrences but nothing I?m really understanding as connecting to what I?m doing. Any thoughts on this? |
Mailing list message from Michael Hall on nio-dev:
I had to add a JNI_OnLoad to resolve jvm references which seemed to work. Initializing the encoding from there also seems to work. |
Mailing list message from Michael Hall on nio-dev:
Still a little thrown together but I have updated my GitHub project for this. https://github.com/mik3hall/trz <https://github.com/mik3hall/trz> My original code is a custom default FileSystemProvider. It is almost all pass through except I added some OS/X specific FileAttributeView?s for file related native api's. For your code this means you are basically a plugin WatchService replacing the default platform polling one for whatever jdk you choose to use it on. No other actual changes to the platform provider are involved. I have test now with the nio Move, Basic, and a modified extra heavy duty LotsOfEvents. It seems to work fine. Surprisingly easily. So, this should mean you or anyone else who wants to can try this out against any recent openjdk version. If anyone does, I would of course be interested in hearing about it. Especially if there are any problems related to what I?ve done. |
Mailing list message from Michael Hall on nio-dev:
And to possibly make it a little easier for anyone who does want to try it I added a release for the first time. -------------- next part -------------- |
For the record: I finally got around to trying the new dispatch queues API. This reduces the complexity of the current implementation quite a bit and passes all tests except one. The obstacle I have not been able to overcome is this: the new implementation consistently fails the test I have not been able to find any solution to this; perhaps we are hitting some OS resource limit, even though not that many streams are active at any given moment (sometimes you can also see the "too many open files" exception from that test when it creates new temporary files). So far it loos like this new API is no go. |
This is an implementation of
WatchService
based on File System Events API that is capable of generating events whenever a change occurs in an interesting directory or underneath it. Since the API naturally supports "recursive" watch, theFILE_TREE
is supported by the watch service.Some things of note:
WatchService
instance that is inactive unless changes occur in the watched directory. The changes are grouped by introducing a time delay between when they occurred and when they are reported, which is controlled by the sensitivity modifier of the watch service.BasicFileAttributes.lastModifiedTime()
).Move.java
).This commit leaves
PollingWatchService
unused. I'm not sure if I should/can do anything about it. Any advice is welcomed.Testing
test/jdk/java/nio/file
andtest/jdk/jdk/nio
on MacOS 10.15.7 (x64) andtest/jdk/java/nio/
plustest/jdk/jdk/nio
on MacOS 12.5.1 (aarch64).Progress
Issues
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10140/head:pull/10140
$ git checkout pull/10140
Update a local copy of the PR:
$ git checkout pull/10140
$ git pull https://git.openjdk.org/jdk pull/10140/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 10140
View PR using the GUI difftool:
$ git pr show -t 10140
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10140.diff