GitHub rate limit hit when running MFE image building in CI/CD pipelines #161

gabor-boros · 2023-10-30T17:55:49Z

It seems that the tutor-mfe plugin uses the GitHub API 1 that is limited to 60 requests per hour for unauthenticated requests (and 5k for authenticated ones).

Adding credentials to these requests seems to require tutor-mfe PR to add a curl or wget command call instead of the ADD defined in the Dockerfile at the moment. Docker documentation says:

If your URL files are protected using authentication, you need to use RUN wget, RUN curl or use another tool from within the container as the ADD instruction does not support authentication. – reference

Also, this would require adding optional build arguments to the Dockerfile to set the credentials.

-- related forum discussion: https://discuss.openedx.org/t/tutor-mfe-build-github-api-authentication/11542

gabor-boros · 2023-10-31T15:11:59Z

@regisb @arbrandes

In my opinion a possible solution could be (what I already mentioned in the issue description) to use RUN curl ... instead of ADD in the Dockerfile. We could have extra arguments for the Dockerfile to define the auth credentials for the call.

By default the auth credential argument could be empty, hence being backward compatible.

arbrandes · 2023-11-01T15:53:02Z

Presuming curl or wget is already in the image, it could work. Note the implied doubt: we're trying to second-guess the caching mechanism by forcing the file to be re-fetched via ADD at every single run. I don't know if RUN will have the same effect - it pretty much doesn't for the git clone line immediately following.

To test the change, you have to re-run the build after pushing a single commit to the target branch. If the commit makes it into the build, it worked. If it didn't... it didn't.

gabor-boros · 2023-11-02T03:46:03Z

@arbrandes It probably won't as the RUN command will be cached. The only way to work that around (and not disabling the cache) is to pass the commit reference directly to the Dockerfile instead of the branch name (i.e. do the api call right before building the image). That way we would still have the possibility to authenticate the API calls and to have proper layer caching.

FTR: I don't know how many Tutor plugins (or even the main Tutor repo) are using GitHub APIs like this, but the issue should be fixed at all occurrences.

gabor-boros · 2023-11-03T05:03:42Z

@regisb & @arbrandes Would you mind I'm opening a PR to work this around using the method I mentioned above?

regisb · 2023-11-07T06:25:57Z

@gabor-boros the whole point of this ADD statement is that it is not cached. Cache for subsequent steps will be cleared if there was a change upstream. This feature has proved very useful in re-building the mfe image when we target a branch, and not a tag. Such a use case happens for instance when running on nightly, or building the candidate release branch (open-release/quince.master).

Please see this PR for more information: #131

I understand that hitting the GitHub rate limit is an issue, and we need to address that. Can we do that in a way that preserves this feature?

gabor-boros · 2023-11-07T06:31:27Z

@regisb how about what I wrote above?

pass the commit reference directly to the Dockerfile instead of the branch name (i.e. do the api call right before building the image). That way we would still have the possibility to authenticate the API calls and to have proper layer caching.

regisb · 2023-11-07T06:42:13Z

I usually lean very strongly towards pinned dependencies, and this includes explicit commit sha1 or tags. But in the present case we do not have access to the commit sha1, precisely because we are building the image for a branch. This is unusual, but a useful feature nonetheless. Does that make sense? If not, maybe we can talk about it in a call.

gabor-boros · 2023-11-07T06:58:27Z

[..] in the present case we do not have access to the commit sha1 [...]

@regisb the API response explicitly contains the commit SHA1. At least the call to https://api.github.com/repos/openedx/frontend-app-profile/git/refs/heads/master returns the following:

{
  "ref": "refs/heads/master",
  "node_id": "MDM6UmVmMTY1MTA2NDMwOnJlZnMvaGVhZHMvbWFzdGVy",
  "url": "https://api.github.com/repos/openedx/frontend-app-profile/git/refs/heads/master",
  "object": {
    "sha": "c6c5521ecf720dafe292874327716bf3b70ac7c0",
    "type": "commit",
    "url": "https://api.github.com/repos/openedx/frontend-app-profile/git/commits/c6c5521ecf720dafe292874327716bf3b70ac7c0"
  }
}

So the following would do the trick to retrieve the SHA1 from the API:

$ curl -s https://api.github.com/repos/openedx/frontend-app-profile/git/refs/heads/master | jq -r '.object.sha'
c6c5521ecf720dafe292874327716bf3b70ac7c0

Did I misunderstand something? If you feel a call would be beneficial, it is good for me too.

regisb · 2023-11-07T07:37:00Z

Just got out of a chat with @gabor-boros. Thanks for taking the time Gábor!

@arbrandes The ADD statement is causing two important issues:

GitHub API rate limits during build.
Breaks the build with private GitHub repositories.

For these reasons, I would lean towards the solution proposed by Gábor: move the API call outside of the build, run it on the host (only for openedx/* repos and open.release/*.master branches) and pass the git commit sha1 to the Docker build as an argument that would bust the cache.

Am I making sense? @gabor-boros would you like to open a PR or should we do it?

gabor-boros · 2023-11-07T07:55:25Z

@regisb It was a really useful discussion I believe!

Am I making sense?

Absolutely, just one question (I probably missed something): Why only for openedx/* repos and open.release/*.master? Although that's not an issue yet, that wouldn't fix the second point: private GitHub repositories.

So if the private repo is github.com/example/frontend-app-something you wouldn't do an (optionally) authenticated call? How would you handle those cases?

regisb · 2023-11-07T08:30:29Z

Why only for openedx/* repos and open.release/*.master?

The goal would be to address the case of GitHub private repositories. Basically, we would use that caching feature only for open source repositories and branches.

So if the private repo is github.com/example/frontend-app-something you wouldn't do an (optionally) authenticated call?

That is correct.

How would you handle those cases?

I wouldn't. Private repositories would not benefit from the caching feature.

gabor-boros · 2023-11-07T09:15:58Z

I wouldn't. Private repositories would not benefit from the caching feature.

@regisb Fair enough, though it would mean that the API calls can still be authenticated, right?

regisb · 2023-11-07T09:57:23Z

I wouldn't even try to make authenticated calls. It's just not worth the trouble. Would you still be affected by the rate limit though?

gabor-boros · 2023-11-07T10:34:26Z

So you mean to build the image, it will be the responsibility of the caller to provide the commit sha1 as a build argument for the Dockerfile. Right? By caller, I mean the CI job or developer who is calling tutor MFE build using CLI.

If that's the case, then I guess that covers everyone's needs as authentication can be done when getting the commit sha1 from GitHub API.

gabor-boros · 2023-11-13T14:13:58Z

would you like to open a PR or should we do it?

I'm happy to do it.

@regisb I'm back to work. Would you have some time tomorrow to sync about the exact solution to be on the same page before getting started on the implementation?

regisb · 2023-11-14T11:13:48Z

We agreed to change the refs dynamically in plugin.py, such that GitHub API calls are only made when we build the "open-release/*.master" or "master" branches from the openedx upstream repos. Thus, something like:

@MFE_APPS.add(priority=tutor_hooks.priorities.LOW)
def _add_mfe_apps_refs(apps: dict[str, MFE_ATTRS_TYPE]) -> dict[str, MFE_ATTRS_TYPE]:
    for mfe, app in apps.items():
        if app["repository"].startswith("https://github.com/openedx/") and app["version"] == "master":
            app["refs"] = app["repository"].replace("github.com", "api.github.com/repos") + "/git/refs/heads/" + app["version"]
    return apps

See the extended conversation here: #163 Close #161

regisb · 2023-12-09T14:47:32Z

Closed by c4ed70c

regisb added this to Tutor project management Nov 7, 2023

github-project-automation bot moved this to Backlog in Tutor project management Nov 7, 2023

regisb added the bug Something isn't working label Nov 7, 2023

regisb self-assigned this Nov 14, 2023

regisb moved this from Backlog to In Progress in Tutor project management Nov 14, 2023

regisb assigned gabor-boros Nov 14, 2023

gabor-boros mentioned this issue Nov 14, 2023

refactor: use MFE refs caching for master branches #163

Closed

regisb removed their assignment Nov 14, 2023

regisb pushed a commit that referenced this issue Dec 8, 2023

fix: rate limiting caused by git repos cache check

c4ed70c

See the extended conversation here: #163 Close #161

regisb closed this as completed Dec 9, 2023

github-project-automation bot moved this from In Progress to Done in Tutor project management Dec 9, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GitHub rate limit hit when running MFE image building in CI/CD pipelines #161

GitHub rate limit hit when running MFE image building in CI/CD pipelines #161

gabor-boros commented Oct 30, 2023

gabor-boros commented Oct 31, 2023 •

edited

Loading

arbrandes commented Nov 1, 2023 •

edited

Loading

gabor-boros commented Nov 2, 2023

gabor-boros commented Nov 3, 2023

regisb commented Nov 7, 2023

gabor-boros commented Nov 7, 2023

regisb commented Nov 7, 2023

gabor-boros commented Nov 7, 2023

regisb commented Nov 7, 2023

gabor-boros commented Nov 7, 2023

regisb commented Nov 7, 2023

gabor-boros commented Nov 7, 2023

regisb commented Nov 7, 2023

gabor-boros commented Nov 7, 2023

gabor-boros commented Nov 13, 2023

regisb commented Nov 14, 2023

regisb commented Dec 9, 2023

GitHub rate limit hit when running MFE image building in CI/CD pipelines #161

GitHub rate limit hit when running MFE image building in CI/CD pipelines #161

Comments

gabor-boros commented Oct 30, 2023

gabor-boros commented Oct 31, 2023 • edited Loading

arbrandes commented Nov 1, 2023 • edited Loading

gabor-boros commented Nov 2, 2023

gabor-boros commented Nov 3, 2023

regisb commented Nov 7, 2023

gabor-boros commented Nov 7, 2023

regisb commented Nov 7, 2023

gabor-boros commented Nov 7, 2023

regisb commented Nov 7, 2023

gabor-boros commented Nov 7, 2023

regisb commented Nov 7, 2023

gabor-boros commented Nov 7, 2023

regisb commented Nov 7, 2023

gabor-boros commented Nov 7, 2023

gabor-boros commented Nov 13, 2023

regisb commented Nov 14, 2023

regisb commented Dec 9, 2023

gabor-boros commented Oct 31, 2023 •

edited

Loading

arbrandes commented Nov 1, 2023 •

edited

Loading