fix: do not call git checkout unless necessary #64
The core fix looks fine. However, we still need to suppress direct piping of git output, as these functions may be run inside a build process that is not supposed to write to stderr directly. To properly preserve the output, the log monad would need to trickle down to these utilities.
I'm not sure I understand what you're saying. We're already printing all kinds of informational messages to stderr (such as the current "trying to update" message). The git commands are also executed sequentially, so we don't need to buffer the output to prevent mixing up messages either. Does writing to stderr break anything? Or are you suggesting to use The big issue with buffering the git output is that this breaks the progress information. Some packages (like Unicode.lean) take a long time to download, and it's nice to see what's going on.
It shouldn't be. If it is, that is a bug. The "trying to update" message is logged. Whether that log is actually printed to stderr is dependent upon the CLI configuration (and, more generally, the Log monad provided).
I think this could be better resolved by having Lake provide its own form of progress information (or maybe parsing Git's?). Furthermore, this will still break in many of Lake's current applications (e.g.,
I'm not sure I follow, the
At the moment, yes, this is how it is practically used. But I do have plans to have more variation in log leveling (e.g., -q
This was about the progress information, which as you mentioned is broken when buffered. The stderr for More generally, at a conceptual level, it is (in my understanding) good practice in functional programming to relegate side effects to a monad, as it allows the code to be easily adapted to different environments. Lake does this by relegating all logging to the log monad. I am not alone in this; Shake uses a similar design. In my view, any logging, including that of the git output, should go through Lake's logging mechanisms. I am not against expanding these mechanisms to be able to preserve / pass through more information if necessary, but these functions should not just be writing directly to the terminal.
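To make the design point above concrete, here is a minimal sketch of routing all messages through an injected logger instead of writing to the terminal. It is written in Python for brevity and is not Lake's code: `Log`, `stderr_log`, `collecting_log`, and `update_dependency` are hypothetical names chosen for illustration.

```python
import sys
from typing import Callable, List

# Hypothetical logging interface: anything that accepts a message.
Log = Callable[[str], None]


def stderr_log(message: str) -> None:
    """A CLI-style logger that writes straight to stderr."""
    print(message, file=sys.stderr)


def collecting_log(sink: List[str]) -> Log:
    """A logger that stores messages, e.g. for a server or GUI front end."""
    def log(message: str) -> None:
        sink.append(message)
    return log


def update_dependency(name: str, log: Log) -> None:
    """The utility never touches stderr itself; the caller decides where output goes."""
    log(f"{name}: trying to update")
```

The same `update_dependency` can then be called with `stderr_log` from a command line tool or with `collecting_log(messages)` from an environment that must not write to stderr.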
To clarify: by broken I simply meant that you don't see how far along the download is if you display the output only after git is finished. The only broken part is the delay in progress reporting. There is never any broken formatting. Like most programs, git detects if it is connected to a tty and disables any fancy terminal features otherwise. This is what the server sees:
While this is what you see on the command line:
Good practices are typically a tradeoff, and should only matter insofar as they result in good outcomes. Disabling progress output from long-running commands so that it fits into a logging API seems like a questionable tradeoff to me, in particular when this adds no value to the user. One option could be to add a
AFAICT Shake only does that in actions (to prevent interleaving),
Ah, sorry, this is just miscommunication on my part. The absence of the progress information is what I meant by "broken formatting".
I disagree that it adds no potential value to the user. It makes it easier for users of the Lake library to redirect output of running processes should they desire. For instance, the server could eventually use this to split up the log lines into VSCode info/warning/error messages. The same is true for a potential GUI-based user of the library. It also makes Lake's logs follow a consistent format, which is easier for end users to parse / filter. I am not against showing progress information; I would just like to be able to lift that information to Lake's level and present it in a consistent form, rather than just letting whatever git decides to output get piped through (and thus ending any chance of consistent formatting).
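One way to reconcile both positions is to forward a child process's stderr to the logger line by line as it arrives, rather than buffering it until the process exits. The sketch below is a Python illustration under that assumption, not anything Lake implements; `stream_stderr` is a hypothetical helper. Note that git only reports progress to a non-terminal stderr when asked to (for example with `--progress` on `git clone` or `git fetch`).

```python
import subprocess
import sys
from typing import Callable, List


def stream_stderr(cmd: List[str], log: Callable[[str], None]) -> int:
    """Run cmd and pass each stderr line to log as soon as it is produced.

    Returns the process exit code. stdout is discarded in this sketch.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,  # decode bytes and normalize line endings
    )
    assert proc.stderr is not None
    for line in proc.stderr:
        log(line.rstrip("\n"))
    return proc.wait()


if __name__ == "__main__":
    # Example: prefix every line so the output follows one consistent format.
    stream_stderr(
        [sys.executable, "-c", "import sys; print('working', file=sys.stderr)"],
        lambda line: print(f"info: {line}", file=sys.stderr),
    )
```

Because the caller supplies `log`, the same lines could be shown live in a terminal, tagged with a severity, or collected by a server.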
That is what I meant. Note that the Git utilities are available to user builds as well, and I have already seen instances where libraries are using them. The same interleaving problem can thus arise in Lake. Furthermore, it is already possible for Lake to be checking out multiple dependencies simultaneously, thus interleaving can already occur there. In summary, I think I understand where you are coming from and I share your desire to provide more information about the Git processes. I just don't think this is the way to do it at the moment.
Okay, I've split off the stderr changes to #67. I'm not sure we'll reach a consensus on that point any time soon.
This is a reasonable point, but I've never seen lake check out dependencies in parallel. Is there anything I have to do to enable it?
It is just as consistent as the output of the build commands FWIW. There we also have a header
I checked over the code and it appears I was mistaken. I guess I confused what I wanted to do with what I have done so far. So I guess it is not a good reason at the moment, but it will hopefully be eventually.
Note that command output for builds does go through the logging interface, though. See the code for Regardless, thanks for splitting it up! The remaining changes look good so I will merge,
If the dependency is already checked out at the right revision, then we only call `git rev-parse HEAD`. In particular, we don't call `git checkout --detach $rev`, which must not be called from multiple processes in parallel. See #63.
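The behavior described in this pull request can be sketched roughly as follows. This is a Python illustration, not the actual Lake change: `GitRunner`, `run_git`, and `ensure_revision` are hypothetical names, and the sketch assumes `rev` is a full commit hash so that it can be compared directly with the output of `git rev-parse HEAD`.

```python
import subprocess
from typing import Callable, List

# Hypothetical type: runs a git subcommand and returns its trimmed stdout.
GitRunner = Callable[[List[str]], str]


def run_git(repo_dir: str) -> GitRunner:
    """Build a runner that executes git inside repo_dir."""
    def runner(args: List[str]) -> str:
        result = subprocess.run(
            ["git", *args],
            cwd=repo_dir,
            check=True,           # raise if git exits with an error
            capture_output=True,  # keep git's output off the terminal
            text=True,
        )
        return result.stdout.strip()
    return runner


def ensure_revision(git: GitRunner, rev: str) -> bool:
    """Check out rev only if HEAD is not already there.

    Returns True if a checkout was performed, False if it was skipped.
    """
    head = git(["rev-parse", "HEAD"])
    if head == rev:
        return False  # read-only query was enough; no checkout needed
    git(["checkout", "--detach", rev])
    return True
```

In the common case where the dependency is already at the requested revision, only the read-only `rev-parse` call runs, so concurrent processes never reach the checkout step.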