Bundler cache (including compact index) is written to /app and into the slug #1117

Open
edmorley opened this issue Feb 1, 2021 · 5 comments

Comments

@edmorley
Member

edmorley commented Feb 1, 2021

I'm currently auditing official/popular buildpacks for compatibility with potentially changing the build directory to /app in the future.

One potential source of problems with such a move is that files written to /app (or to $HOME, which is /app during the build) will now be included in the slug, when previously they were not. As such, I'm checking what files are left behind by buildpacks in /app, using this buildpack, which lists the contents of /app at build time:
https://github.com/edmorley/heroku-buildpack-list-app-dir

Testing the Ruby getting started guide with the Ruby buildpack + the above buildpack, I see that the bundler cache (~/.bundle/cache/compact_index) is being written to /app. Once the build directory is /app, this would cause the slug size to increase, potentially pushing apps closer to the limit. For the getting started guide this cache is only 18MB, but it doesn't have as many dependencies as some typical Rails apps.
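
To get a rough sense of how much such a directory would add to a slug, one could sum the file sizes under it. This is a minimal sketch in Python, not something the buildpack does; it assumes the cache lives at ~/.bundle/cache/compact_index as observed above, and the helper name `dir_size_bytes` is hypothetical.

```python
import os
from pathlib import Path


def dir_size_bytes(root: Path) -> int:
    """Return the total size in bytes of regular files under root (0 if missing)."""
    if not root.is_dir():
        return 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # Skip symlinks so that link targets are not counted here.
            if not path.is_symlink():
                total += path.stat().st_size
    return total


if __name__ == "__main__":
    # Assumed location, taken from the path reported in this issue.
    cache = Path.home() / ".bundle" / "cache" / "compact_index"
    print(f"{cache}: {dir_size_bytes(cache) / (1024 * 1024):.1f} MiB")
```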

It seems there are a few options:

  1. If it's useful to actually keep the bundler cache (if it's not already being kept), move it to $CACHE_DIR instead
  2. If it's not useful to keep the bundler cache, then either (a) try and have it be written to a directory under /tmp instead of $HOME, or else (b) delete it from $HOME at the end of the build
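
Options 1 and 2(b) above could be sketched roughly as follows. This is an illustration in Python under stated assumptions, not how the Ruby buildpack is implemented: the function name `relocate_or_remove_cache` and the `bundler/compact_index` destination under the cache directory are hypothetical, and only the ~/.bundle/cache/compact_index source path comes from this issue.

```python
import shutil
from pathlib import Path
from typing import Optional


def relocate_or_remove_cache(home: Path, cache_dir: Optional[Path] = None) -> str:
    """Move the compact index cache out of home, or delete it.

    Returns "moved", "removed", or "absent" to describe what happened.
    """
    source = home / ".bundle" / "cache" / "compact_index"
    if not source.is_dir():
        return "absent"
    if cache_dir is None:
        # Option 2(b): the cache is not worth keeping, so delete it.
        shutil.rmtree(source)
        return "removed"
    # Option 1: keep the cache between builds by moving it under cache_dir.
    # The destination layout is an assumption made for this sketch.
    destination = cache_dir / "bundler" / "compact_index"
    if destination.exists():
        shutil.rmtree(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(destination))
    return "moved"
```

For option 1 to pay off, the next build would also need to copy the directory back into $HOME before bundle install runs; that step is left out here.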
@schneems
Contributor

schneems commented Feb 3, 2021

Related support ticket 958280

@schneems
Contributor

schneems commented Feb 4, 2021

I looked into this heavily for #1118

The cache is only used at bundle install time, and it appears to be used only when dependencies are not satisfied. I.e. if you deploy to Heroku, then do a heroku run bash followed by bundle install, it won't download the cache for "reasons". I'm assuming the reason is that its dependencies are already satisfied. However, I'm not totally sure of the behavior.

On first bundle install the cache is downloaded and written here https://github.com/rubygems/rubygems/blob/be08d8307eda3b61f0ec0460fe7fbcf647b526e6/bundler/lib/bundler/compact_index_client/updater.rb#L64

Where local_path is something in ~/.bundle/cache/compact_index. The path name includes an etag of the compact index. Before downloading a new index bundler will check to see if a prior index's etag is satisfactory.

Based on this, it seems that making these files available at runtime adds nothing (because people don't bundle install at runtime), so they could be stripped out before launch.

The other question is: Is it helpful to preserve these between deploys? It depends on how frequently the etag is invalidated. @hone knows more about the whole compact index so he might have some insight. My very unscientific attempt to answer this question was to deploy an app to heroku today and see if it has the same etag or not:

  • Yesterday: remote: /app/.bundle/cache/compact_index/rubygems.org.443.29b0360b937aa4d161703e6160654e47/versions.
  • Today: remote: /app/.bundle/cache/compact_index/rubygems.org.443.29b0360b937aa4d161703e6160654e47/versions.
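
The comparison of the two log lines above could be automated along these lines. This is a minimal sketch in Python; the function names are hypothetical, and it only compares the hex segment of the directory name as a string. It does not confirm what that segment actually encodes, so treating it as an etag remains an assumption.

```python
import re
from typing import Optional

# Matches a directory name such as
# compact_index/rubygems.org.443.29b0360b937aa4d161703e6160654e47/
_CACHE_DIR = re.compile(
    r"compact_index/(?P<host>[^/]+)\.(?P<port>\d+)\.(?P<digest>[0-9a-f]{32})/"
)


def cache_digest(log_line: str) -> Optional[str]:
    """Return the 32-character hex segment of the cache directory, if present."""
    match = _CACHE_DIR.search(log_line)
    return match.group("digest") if match else None


def same_cache_dir(first_line: str, second_line: str) -> bool:
    """True when both lines name a cache directory with the same hex segment."""
    first = cache_digest(first_line)
    second = cache_digest(second_line)
    return first is not None and first == second
```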

So it looks like there may be some benefit to keeping them around. I think it's worth benchmarking the download of the index. If it's already on us-east and coming from S3 then there's no speed benefit from putting it in the cache. For CNB where there's the local install case to think about, it's likely a good idea to cache it (even if it's fast).

This does make me vaguely wish there was some kind of cross-app cache or mechanism since it seems wasteful to duplicate this across N caches (where N is number of apps on the platform).

@schneems
Contributor

schneems commented Feb 5, 2021

Etag still valid:

remote:        `/app/.bundle/cache/compact_index/rubygems.org.443.29b0360b937aa4d161703e6160654e47/versions`.

@schneems
Contributor

schneems commented Feb 8, 2021

Etag still valid

remote:        `/app/.bundle/cache/compact_index/rubygems.org.443.29b0360b937aa4d161703e6160654e47/versions`.

@schneems changed the title from "Bundler cache is written to /app" to "Bundler cache (including compact index) is written to /app and into the slug" on Oct 6, 2023
@schneems
Contributor

schneems commented Oct 6, 2023

I'm unsure if this also happens in the CNB. Need to investigate whether this is still an issue.
