Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

filewatch: rewrite inotify backend #339

Merged
merged 9 commits into from
Jul 11, 2022
Merged

filewatch: rewrite inotify backend #339

merged 9 commits into from
Jul 11, 2022

Conversation

Akaricchi
Copy link
Member

It's now based on directory monitoring, which is more complicated, but also much more reliable, especially when dealing with move events and hard links. It is also closer to the semantics we actually want (monitoring fs paths for changes, not inodes). It will still get confused if the directory itself gets moved (directly or indirectly), though, but handling that correctly is absurdly difficult and not important for our use-case.

Importantly, FileWatch no longer becomes useless after reporting a FILEWATCH_FILE_DELETED event: if another file is placed onto the same path later, the FileWatch will report FILEWATCH_FILE_UPDATED. Some client logic can be simplified because of that.

@StarWitch
Copy link
Member

StarWitch commented Feb 28, 2022

Currently, stage reloading fails with the following error on macOS:

9297      F: filewatch_process_events: ../src/filewatch/filewatch_inotify.c:384: assertion `((uintptr_t)_assume_aligned_ptr & ((_Alignof(struct inotify_event)) - 1)) == 0` failed

This is with a simple operation like removing a INVOKE_TASK call from a timeline.c.

edit: Now with debug statements:

43553     D: filewatch_process: READ 69
43553     D: filewatch_process_events: wd 9 :: 4913 :: IN_CREATE
43553     D: filewatch_process_events: Match dir /Users/aliced/Code/dev-taisei/taisei.upstream/src/stages/stage1
43553     D: filewatch_process_events: * 0x6000027244e0 draw.h != 4913
43553     D: filewatch_process_events: * 0x6000027244b0 timeline.h != 4913
43553     D: filewatch_process_events: * 0x600002724480 misc.c != 4913
43553     D: filewatch_process_events: * 0x600002724240 cirno.c != 4913
43553     D: filewatch_process_events: * 0x600002724210 stage1.h != 4913
43553     D: filewatch_process_events: * 0x6000027241e0 background_anim.c != 4913
43553     D: filewatch_process_events: * 0x6000027241b0 entities.h != 4913
43553     D: filewatch_process_events: * 0x600002724180 meson.build != 4913
43553     D: filewatch_process_events: * 0x600002724150 timeline.c != 4913
43553     D: filewatch_process_events: * 0x600002724120 draw.c != 4913
43553     D: filewatch_process_events: * 0x6000027240f0 misc.h != 4913
43553     D: filewatch_process_events: * 0x6000027240c0 cirno.h != 4913
43553     D: filewatch_process_events: * 0x600002724090 background_anim.h != 4913
43553     D: filewatch_process_events: * 0x600002747f60 stage1.c != 4913
43553     F: filewatch_process_events: ../src/filewatch/filewatch_inotify.c:385: assertion `((uintptr_t)_assume_aligned_ptr & ((_Alignof(struct inotify_event)) - 1)) == 0` failed

@Akaricchi Akaricchi force-pushed the rewrite-filewatch branch from 6af0e2a to eed6aba Compare March 5, 2022 20:07
@Akaricchi
Copy link
Member Author

The alignment issue smells like a bug in libinotify-kqueue. According to the Linux man page inotify(7) the name field should be appropriately zero-padded to preserve alignment. I've added a workaround; if that fixes the issue then we should file a bug report.

@StarWitch
Copy link
Member

Compiles without issue, and mostly-works now.

Things that do work:

  • recompiling the stages
  • stage changes working upon selecting Restart Game from the in-game menu, or using F3 with a quicksave

Things that are still a little bit janky:

  • I was editing Stage 1, mainly removing/re-adding INVOKE_TASK_DELAYED(100, burst_fairies_1); from the main timeline TASK, but it seems to recompile all stages every single time, including spellcards for other stage's bosses. (i.e: I can see src/stages/stage6/spells/toe.c being recompiled as well), which adds a few seconds to each loop
  • The watcher doesn't seem to reload at the last "quicksave" replay point set with F4 automatically, you have to either Restart Game or hit F3 to get the changes to appear

Aside from that, it did indeed fix the bug and compilation issue. Seems like a bug report is in order for the alignment issue.

@Akaricchi
Copy link
Member Author

I was editing Stage 1, mainly removing/re-adding INVOKE_TASK_DELAYED(100, burst_fairies_1); from the main timeline TASK, but it seems to recompile all stages every single time, including spellcards for other stage's bosses. (i.e: I can see src/stages/stage6/spells/toe.c being recompiled as well), which adds a few seconds to each loop

Try editing the stage while the game is not running and then run meson compile -v taisei-stages; do that a few times. Does it actually invoke the compiler for unmodified sources? If it happens every time then it's probably a ninja or meson bug specific to macOS, though that'd be surprising. I can't reproduce it on my system. Maybe what you saw was the src/stages/stage6/spells/toe.c.o on the linker command line?

The watcher doesn't seem to reload at the last "quicksave" replay point set with F4 automatically, you have to either Restart Game or hit F3 to get the changes to appear

That sounds like it doesn't detect the dynamic library file being updated/replaced, for whatever reason. To be honest, I'm losing my patience with libinotify-kqueue-related issues. I'll fix it if you can debug it and find out what's going on, but otherwise worksforme. You'll have to find out how your compiler/linker performs the update, which can happen in a frustrating myriad of ways. For example, on my system clang/lld writes the target into a temporary file (in the same directory), then deletes the original and moves the new file in its place, while gcc/gold yolos it and just opens the original for writing (risking a race condition if a reload is attempted before its done writing). You may want to enable FW_DEBUG and possibly add IN_ALL_EVENTS to WATCH_FLAGS to see all inotify events.

It's also possible that my alignment fix is just not correct and every event beyond the first in the queue is garbage. Compile with ASan and maybe add an assertion to check that rec is not NULL in filewatch_process_event (I can't think of any good reason why it would be, except if e->wd is garbage).

Alternatively, feel free to write a macOS-specific filewatch backend using Apple's blessed APIs.

@StarWitch
Copy link
Member

StarWitch commented Mar 8, 2022

It was always a long shot to get it running on macOS, so I'm fine with shipping it as-is, considering I primarily work on Linux now. If I come up with some fixes in the future, I'll create another PR.

@StarWitch StarWitch self-requested a review March 8, 2022 17:43
@StarWitch
Copy link
Member

Solved one mystery: it was precompiled headers causing the constant regenerate-everything behaviour. Setting -Db_pch=false fixes that at least.

@StarWitch
Copy link
Member

I don't understand the underlying mechanism all that thoroughly, but it seems that in dynstage_filewatch_event() the FILEWATCH_FILE_DELETED switch gets called instead of FILEWATCH_FILE_UPDATED for whatever reason. This might be a filesystem-level operation where the file is marked as deleted instead of overwritten, perhaps? Either way (and I know this is probably kludge-y), adding a second dynstage_bump_lib_generation() to the FILEWATCH_FILE_DELETED does cause the replay-reload functionality to work as expected on macOS.

In summary:

  1. setting -Db_pch=false in src/meson.build fixes the source-regeneration issue
  2. FILEWATCH_FILE_DELETED seems to be the event returned instead of FILEWATCH_FILE_UPDATED, and forcing a reload in FILEWATCH_FILE_DELETED fixes the replay-reload functionality

@Akaricchi
Copy link
Member Author

I don't understand the underlying mechanism all that thoroughly, but it seems that in dynstage_filewatch_event() the FILEWATCH_FILE_DELETED switch gets called instead of FILEWATCH_FILE_UPDATED for whatever reason.

You should be getting a FILEWATCH_FILE_UPDATED almost immediately after the FILEWATCH_FILE_DELETED, but it gets lost for some reason. I suggest logging all inotify events and verifying the alignment fix, as I said in the previous comment. If there are no inotify events corresponding to the creation/moving-into-place of the new file, and the alignment fix actually works (i.e. rec != NULL), then it must be an inotify-kqueue bug.

Either way (and I know this is probably kludge-y), adding a second dynstage_bump_lib_generation() to the FILEWATCH_FILE_DELETED does cause the replay-reload functionality to work as expected on macOS.

If we have to deal with missing creation events, a better way to do it (in filewatch_inotify.c) is to stat() the file shortly after a FILEWATCH_FILE_DELETED event, and synthesize a FILEWATCH_FILE_UPDATED if the file exists. Not pretty, but at least the filewatch API clients won't have to deal with this nonsense.

Akaricchi added 9 commits July 7, 2022 05:52
It's now based on directory monitoring, which is more complicated, but
also much more reliable, especially when dealing with move events and
hard links. It is also closer to the semantics we actually want
(monitoring fs paths for changes, not inodes). It will still get
confused if the directory itself gets moved (directly or indirectly),
though, but handling that correctly is absurdly difficult and not
important for our use-case.

Importantly, FileWatch no longer becomes useless after reporting a
FILEWATCH_FILE_DELETED event: if another file is placed onto the same
path later, the FileWatch will report FILEWATCH_FILE_UPDATED. Some
client logic can be simplified because of that.
This function is meant to flush any internal buffers and (hopefully)
ensure the data has reached its ultimate destination (e.g. a file on
disk) before returning. This is currently implemented as a crude hack
and only supports stdio-based RWops. It calls fflush() on the internal
FILE* pointer, and on POSIX-systems it also calls fsync() on the file
descriptor.
Previously this half-worked: fflush() was called on log_fatal, but not
in case of assertion failure, leaving the on-disk log truncated.
This shouldn't be needed, as according to the Linux inotify(7) man page,
the entries are supposed to be aligned. It looks like libinotify-kqueue
doesn't respect this and doesn't properly zero-pad the name field.
@Akaricchi Akaricchi force-pushed the rewrite-filewatch branch from f716f8e to c2f5ca5 Compare July 11, 2022 22:59
@Akaricchi Akaricchi merged commit c2f5ca5 into master Jul 11, 2022
@Akaricchi Akaricchi deleted the rewrite-filewatch branch August 30, 2024 10:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants