-
Notifications
You must be signed in to change notification settings - Fork 18
Simplify build log and let store set its location #90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- don't duplicate the log file file descriptor but switch to reading the file when the log finishes; - replace the `Finished state with the `Readonly state.
Checking the result for the Docker backend requires asynchronous calls to Docker.
For the currently available stores, the original location is retained.
talex5
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure how this works. I would expect that simply reopening the same log path (but read-only) would have the same problem.
I would expect get_build to:
- Create a new
Openlog file. - Perform the build.
- Close the file and change the log state to
Finalising. - Snapshot the result.
- Set the log state to
Readonlywith the final location, and wake any readers waiting for the end of theFinalisingstate.
Or possibly the Finalising state could simply be Busy. That would also work for creating the log, where we have a similar problem (currently solved messily by using set_log to resolve an Lwt promise).
| dst (Bytes.sub_string buf 0 n); | ||
| aux (i + avail) | ||
| | `Readonly path -> readonly_tail path buf i | ||
| | _ -> Lwt_result.return () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| | _ -> Lwt_result.return () | |
| | `Empty -> Lwt_result.return () |
| | `Open (fd, cond) -> | ||
| t.state <- `Finished; | ||
| | `Open (path, fd, cond) -> | ||
| t.state <- `Readonly path; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this has the same problem: a reader will open the path in the temporary clone, preventing it from being removed (unless a read-only FD is OK somehow?).
Instead, I think this needs to transition to a new Finalising of unit Lwt.t state or similar, where the promise resolves once the log is ready for reading in its new location.
| (** [delete t id] removes [id] from the store, if present. *) | ||
|
|
||
| val result : t -> id -> string option | ||
| val result : t -> id -> string option Lwt.t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this need to be async? It seems useful to have an atomic way of finding out whether something exists in the store. Otherwise, how do we know the result is still valid by the time it has returned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the Docker backend I'm making a call to Docker to check whether the result as a Docker image exists. I use functions from Os which calls Lwt_process.exec and they're asynchronous. I could call Unix.create_process instead and wait for the termination to fix the TOCTOU.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move that to a separate PR then. It doesn't have anything to do with the build log problem.
| val result : t -> id -> string option Lwt.t | ||
| (** [result t id] is the path of the build result for [id], if present. *) | ||
|
|
||
| val log_file : t -> id -> string Lwt.t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we want a different log location for each store?
|
Thanks @MisterDA :)) I'm not sure if this fixes the log problem I was describing before (the moving log files bit is useful for patricoferris#5 however). I mean having log files outside of the build directory should in theory solve the problem, I'm just not sure if there is a reason the logs currently get saved in the build directory. The problem I was seeing is Line 193 in c89b876
but as far as I can tell at that point the log file is still open so trying to destroy the dataset gets a |
The first step is to remove the
dupfile in the log tailing to be sure that when/if we want to move the log file, there are no open file descriptors to it. This can be done by switching toReadonlywhen the log is finished.The second step is to let the store decide the location of the log file. There's a bit of a chicken-and-egg problem here: the store was responsible for moving the log file, but the
Db_storemodule creates it a sets its location, and the store doesn't have access to the location. By letting the store set the log file location, it becomes able to set it to a location independent of the results directories, or to the original location and move it if there's no problem with it.cc @patricoferris