Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DISCUSS: Policy on including lock files in repo #4795

Closed
snoyberg opened this issue May 2, 2019 · 16 comments
Closed

DISCUSS: Policy on including lock files in repo #4795

snoyberg opened this issue May 2, 2019 · 16 comments

Comments

@snoyberg
Copy link
Contributor

snoyberg commented May 2, 2019

Now that lock files have landed (yay!), we need to establish a policy for the Stack repo on how to deal with them. We should also consider whether we make this policy a general recommendation for Stack users. Possible options I see:

  1. Don't check in the lock files at all, no other change. I don't like this: it means we lose out on reproducible build plans.
  2. Obviate the need for lock files by insisting that all additional hashes be included in the snapshot files. I don't like this either: the new format is backwards incompatible with Stack <2 (at least producing warnings), and adding all of the hashes manually is a pain.
  3. Check in the lock files in the repo. This is what I favor. However I'd like to take it one half-step further: add a step to the CI scripts to ensure that no new lock files have been generated during the build, and that no existing lock files have been modified. This ensures that the fully reproducible plan is checked into the repo.

CC @borsboom @qrilka @psibi

@snoyberg snoyberg added this to the Stack 2 milestone May 2, 2019
@borsboom
Copy link
Contributor

borsboom commented May 2, 2019

Our recommendation should probably be similar to that used by Rust's cargo tool: check in lock files for binaries, but not for libraries, so in the case of stack that would be option 3.

I agree it'd make sense to enforce no addition/modification of lock files in CI. Not sure if you already have thoughts on how to implement that, but here's what I'm coming up with: run something like find . -name '*.lock' -print0|xargs -0 shasum -a 256 before and after the build, and compare the output. Alternatively, could just check git for any dirty or untracked/unignored files after the build, which would also catch other mistakes (like forgetting to update .gitignore), but might also lead to false positives.

Might it make sense to add an option to stack that will cause it to fail if it would need to add or update lock files? I could see this coming in handy for lots of projects, and the shell script logic above is a bit hairy.

@snoyberg
Copy link
Contributor Author

snoyberg commented May 2, 2019

Alternatively, could just check git for any dirty or untracked/unignored files after the build, which would also catch other mistakes (like forgetting to update .gitignore), but might also lead to false positives.

I was leaning in this direction. We may actually consider a forgotten .gitignore update to be a proper bug that blocks CI from passing.

Might it make sense to add an option to stack that will cause it to fail if it would need to add or update lock files? I could see this coming in handy for lots of projects, and the shell script logic above is a bit hairy.

I thought about this too. It should be pretty easy to add such a feature, essentially checking a flag before writing the lock file. I don't see a downside to this, so we may as well add it. How about having a --lock-file flag with a bunch of different options:

  • default: what we do now, reading and writing lock files
  • ignore: don't read the lock file or write it
  • error-on-write: the behavior we just described

@snoyberg
Copy link
Contributor Author

snoyberg commented May 2, 2019

Actually, some more thoughts to go along with the --lock-file concept. There are likely some flags which should imply that a lock file should not be written, such as overriding the --resolver or --compiler. Perhaps we can tie in that logic with a new flag.

@mihaimaruseac
Copy link
Contributor

I'm in favor of checking them in with a comment that they are autogenerated. And checking that no process writes to them during built.

@qrilka
Copy link
Contributor

qrilka commented May 6, 2019

@mihaimaruseac why do you want to add some extra care for other processes writing to a lock file during builds? Prevent some race condition maybe?
I like the idea of --lock-file option and agree with @borsboom about reusing cargo's recommendations.
@snoyberg what's the idea of flags preventing lock file to be changed? Is it about CLI flags changing project's source map to prevent them from modifying the "default" source map by overwriting a lock file? I suppose we don't have any issues about that yet?

@snoyberg
Copy link
Contributor Author

snoyberg commented May 6, 2019

what's the idea of flags preventing lock file to be changed? Is it about CLI flags changing project's source map to prevent them from modifying the "default" source map by overwriting a lock file? I suppose we don't have any issues about that yet?

No issues yet, this is the discussion for whether we want the feature. The idea is that, in CI, we would want to prevent Stack from making updates to the lock file. Any such an update would indicate that the files in the repo do not guarantee a fully reproducible build.

@qrilka
Copy link
Contributor

qrilka commented May 6, 2019

Oh, it's the other way around, aimed at reproducibility, that makes sense

@dbaynard
Copy link
Contributor

dbaynard commented May 6, 2019

Our recommendation should probably be similar to that used by Rust's cargo tool: check in lock files for binaries, but not for libraries, so in the case of stack that would be option 3.

👍 from me — although the yaml package, for example, provides both the yaml library and the json2yaml and yaml2json executables. What should we recommend, there?

(Edit: I guess the main purpose is the library, so you'd probably avoid checking in the lock file?)

snoyberg added a commit that referenced this issue May 7, 2019
@snoyberg
Copy link
Contributor Author

snoyberg commented May 7, 2019

I very quickly threw together a PR for a --lock-file option, this may still warrant some improvements #4811

snoyberg added a commit that referenced this issue May 7, 2019
@borsboom
Copy link
Contributor

borsboom commented May 7, 2019

although the yaml package, for example, provides both the yaml library and the json2yaml and yaml2json executables. What should we recommend, there?

I suppose it comes down to what the primary purpose of the package is. In the case of yaml it's probably as a library (I didn't even know about those two exes -- although now that I do I will probably use them instead of the ruby script I hacked together once to do these conversions). stack provides a library too, but the primary purpose is clearly the executable.

Rust's documentation uses the term "end product" rather than "executable":

If you’re building a non-end product, such as a rust library that other rust packages will depend on, put Cargo.lock in your .gitignore. If you’re building an end product, which are executable like command-line tool or an application, or a system library with crate-type of staticlib or cdylib, check Cargo.lock into git

snoyberg added a commit that referenced this issue May 8, 2019
@snoyberg
Copy link
Contributor Author

I'm going to move this from the Stack 2 milestone to P0. I suggest we complete this issue once Stack 2 is released, and we can update our own CI to demonstrate how things should work.

@lehins
Copy link
Contributor

lehins commented Jul 15, 2019

There is an issue I ran into with committing lock files into the repository. Specifically when the dependency doesn't come from hackage and uses package.yaml instead of .cabal file. In such a situation hash of cabal file and thus the lock file contents depend on the version of the hpack, which not necessarily make the builds non-reproducable, but it does affect the contents of the lock file. Here is a good example of what I mean from a git diff:

     url: https://github.com/tclem/twirp-haskell/archive/b7f2b30cfc4d4aa29af8d907426f12df45260e60.tar.gz
     cabal-file:
       size: 3184
-      sha256: 2fac6fb4e49677372ef748f5731f527c5a0f137a2ba797462f7507d13d4f2ac8
+      sha256: 986d75b90f1ba84024f2c684ebb20e71e26abfef0a79c3269e880d1547827c0b
     name: twirp
     version: 0.1.0.0
     sha256: 106cf3aa46a4ce7f82a0b896670cea6477c690d4c77f2387d4428348027cf3ca

As you can see, the git commit didn't change but the sha256 of cabal file did.

Making it a policy of including the lock file into the repo would make it very inconvenient for the projects with many developer, who are very likely to be using different versions of stack and hpack.

@snoyberg Any thoughts on this? One potential workaround I see around this, while still enforcing this policy would be to insert sha256 and size of package.yaml instead of project-name.cabal into the lock file when the former is present. If that is a reasonable way to go about it I can create a separate issue for this suggestion.

@borsboom
Copy link
Contributor

Yeah, using the SHA of silently auto-generated file seems like a bad idea, and it does seem like it would make more sense to hash the actual source. I could see there being some weird edge cases though, so for example it'd be important to match Stack's build behaviour when there's a remote repo that has both a package.yaml and a .cabal file (I recall there was some discussion about this, but I can't remember what the conclusion was).

@lehins
Copy link
Contributor

lehins commented Jul 15, 2019

Oh yeah, hpack will refuse to generate a new cabal file from the package.yaml if the cabal was modified or created manually, which means without asking hpack we do not know which file to use if they are both present. @borsboom, I am not familiar with the discussion you are mentioning, but I am sure it discusses something related to this.

I know that this ticket is not dedicated to the discussion of hpack and stack integration, but considering it directly affects reproducible builds, I think it is kinda related and somewhat important. My opinion is that this issue could be solved in two ways:

  • at hpack level: if a package.yaml file is present and the cabal file was modified manually, hpack should issue a warning that the cabal file doesn't match, make a backup copy of the modified cabal file, but write a new cabal file anyways. CC @sol, any suggestions?
  • at stack level: stack should stop the build with an error when hpack is complaining. When it does, then the solution is simple delete one of the two.

egor-tensin added a commit to egor-tensin/windows-env that referenced this issue Aug 21, 2019
Stack 2.1.1 introduced lock files. I'm not sure whether to include them
in the repository, here's some discussion on the subject:
commercialhaskell/stack#4795
egor-tensin added a commit to egor-tensin/windows-env that referenced this issue Aug 21, 2019
Stack 2.1.1 introduced lock files. I'm not sure whether to include them
in the repository, here's some discussion on the subject:
commercialhaskell/stack#4795
@NorfairKing
Copy link
Contributor

I've now changed my mind on whether we should include this file. It's caused real problems with clients already and they don't know that they can just remove the lock file when those happen.

michaelheyman added a commit to michaelheyman/haskell-wav that referenced this issue Apr 21, 2020
Per this comment, the recommendation is to not commit lock files for
libraries:

commercialhaskell/stack#4795 (comment)
tumist added a commit to bjartur/master that referenced this issue Oct 1, 2020
.stack-work: These are temporary build files
*.cabal: The cabal files are generated by stack
stack.yaml.lock: commercialhaskell/stack#4795
tumist added a commit to bjartur/master that referenced this issue Oct 1, 2020
.stack-work: These are temporary build files
*.cabal: The cabal files are generated by stack
stack.yaml.lock: commercialhaskell/stack#4795
iustin added a commit to iustin/corydalis that referenced this issue May 29, 2021
There is a long discussion in commercialhaskell/stack/issues/4795
about whether this should be done or not (especially see comments in
Jan 2020), but for a single-user like me where hpack version is
stable, committing this to the repo saves 2½ minutes of the total CI
build time, which is currently dominating (total time: 4m10s). So it
is definitely worth doing…
@mpilgrem
Copy link
Member

mpilgrem commented Nov 3, 2023

I am going to close this issue, because lock files have been committed here for over a year, and nobody has complained about that practice.

@mpilgrem mpilgrem closed this as completed Nov 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants