Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ci] make MiKTeX downloads more reliable #3124

Merged
merged 6 commits into from
Jun 2, 2020

Conversation

jameslamb
Copy link
Collaborator

Moved this over from #3119 once we decided not to introduce Windows tests in that PR.


I learned in #3113 that the download link from the main MiKTeX site redirects to individual mirrors. I'm not sure what strategy it used but it seems to change pretty frequently. That makes me think the strategy is some very simple random load balancing that doesn't account for the health of the target mirror.

Here's an example from four requests executed within 15 seconds:

image

I think that pegging to a single mirror will reduce the frequency of errors like this downloading MiKTeX:

image

In theory that redirect mechanism they have is great because it means the download link should work whenever a particular mirror fails, but as we've seen it regularly routes us to a mirror that we can't successfully download from (for whatever reason).

Since the r-package jobs already rely on the University of Illinois mirror being up (to install additional LaTeX packages), I'd like to propose that we should just go directly to that mirror for MiKTeX too.

@jameslamb jameslamb requested a review from StrikerRUS May 28, 2020 01:23
@jameslamb jameslamb requested a review from Laurae2 as a code owner May 28, 2020 01:23
@jameslamb
Copy link
Collaborator Author

Ok this did fail on AppVeyor with an error related to MiKTeX but NOT one that is caused by (or could be solved by) the changes in this PR.

image

This PR occurs after the MIKTeX installer has been downloaded, when its being set up.

Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer doing it via PS script as a general maintenance routine.

.ci/test_r_package_windows.ps1 Outdated Show resolved Hide resolved
.ci/test_r_package_windows.ps1 Outdated Show resolved Hide resolved
@jameslamb
Copy link
Collaborator Author

Since this is only used by R builds, what is the benefit of re-writing it into Powershell? Web scraping in Powershell does not sound fun to me

@StrikerRUS
Copy link
Collaborator

Since this is only used by R builds, what is the benefit of re-writing it into Powershell?

As this particular script is needed only for R builds, it is OK from the first glance. But in general we should avoid using any specific languages for administration stuff. It will allow collaborators without knowledge of one specific programming language to maintain those tools in the future.

Going this way we may end up using dozens of languages calling one script from another for repo maintenance purposes and one ought to know all of them to just fix failing CI job.

Some other bad examples of unjustified use of Python:
https://github.com/microsoft/LightGBM/tree/master/helpers
https://github.com/microsoft/LightGBM/tree/master/.nuget

@jameslamb
Copy link
Collaborator Author

Since this is only used by R builds, what is the benefit of re-writing it into Powershell?

As this particular script is needed only for R builds, it is OK from the first glance. But in general we should avoid using any specific languages for administration stuff. It will allow collaborators without knowledge of one specific programming language to maintain those tools in the future.

Going this way we may end up using dozens of languages calling one script from another for repo maintenance purposes and one ought to know all of them to just fix failing CI job.

Some other bad examples of unjustified use of Python:
https://github.com/microsoft/LightGBM/tree/master/helpers
https://github.com/microsoft/LightGBM/tree/master/.nuget

Makes sense, thank you for the explanation!

In this case I really do think R code makes sense since this is specific to the R package, and since the task (scraping the source of an archive page to find the name of a file) is one that I personally don't know how to do in Powershell. I know it's possible, but it would take me a while and I'm guessing that what I come up with would be error prone.

We already have other R code in R-specific CI administration (https://github.com/microsoft/LightGBM/blob/master/.ci/lint_r_code.R), and I know that any time r-package tasks fail in CI, I'll get @-ed.

@StrikerRUS
Copy link
Collaborator

StrikerRUS commented May 29, 2020

We already have other R code in R-specific CI administration

That script uses API of lintr package, so it is required to be written in R.

and I know that any time r-package tasks fail in CI, I'll get @-ed.

Yeah, this is exactly what I'm saying about! And it is not good, because maintainers can leave projects for some reasdon and it is a natural process. It will result in a some kind of "bus factor" (e.g. similar problems: dmlc/xgboost#4958).

Please treat all my thoughts above as a my general opinion to the whole repo management. I'm not against that this particular PR will add parsing a web page code in R.

@jameslamb
Copy link
Collaborator Author

I'm not against that this particular PR will add parsing a web page code in R.

Thanks for this @StrikerRUS . I decided to take learning how to do this in Powershell as a learning opportunity, and I'm really glad that I did! It was a lot easier than I expected. See the changes in 23cf95a

It was easier than I expected because I learned something amazing...you can run Powershell on a Mac!

image

This made it really easy to test and experiment in my usual dev environment 😀

Write-Output "Installing dependencies"
$packages = "c('data.table', 'jsonlite', 'Matrix', 'processx', 'R6', 'testthat'), dependencies = c('Imports', 'Depends', 'LinkingTo')"
Rscript --vanilla -e "options(install.packages.check.source = 'no'); install.packages($packages, repos = '$env:CRAN_MIRROR', type = 'binary', lib = '$env:R_LIB_PATH')" ; Check-Output $?

Copy link
Collaborator Author

@jameslamb jameslamb May 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll undo this change. This doesn't have to move now that we aren't installing {httr} for the MiKTeX download!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

undid these changes in 10e32fa, so now all that is left is the MiKTeX stuff.

If it build successfully I'll merge this! Hopefully it will help reduce frequency failing R builds that affect other PRs

@StrikerRUS
Copy link
Collaborator

@jameslamb

Thanks a lot for rewriting this in PowerShell!

It was easier than I expected because I learned something amazing...you can run Powershell on a Mac!

Yeah! PS Core is a great thing along with .NET Core 😃 ,

Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clean solution!

.ci/test_r_package_windows.ps1 Outdated Show resolved Hide resolved
.ci/test_r_package_windows.ps1 Show resolved Hide resolved
@jameslamb jameslamb force-pushed the ci/download-miktex branch from a01dfd6 to cf93d50 Compare June 2, 2020 01:42
@jameslamb
Copy link
Collaborator Author

I just rebased to master to get the latest changes, especially #3133

Happy to see that GitHub Actions got triggered on this PR!

image

Since it isn't required yet (#3119 (comment)), I'll be sure to check them manually before merging.

@jameslamb jameslamb merged commit 075513f into microsoft:master Jun 2, 2020
ChipKerchner pushed a commit to ChipKerchner/LightGBM that referenced this pull request Jun 10, 2020
* [ci] make MiKTeX downloads more reliable

* switch to Powershell

* remove  extra slash

* fixing stuff

* revert install change

* Update .ci/test_r_package_windows.ps1

Co-authored-by: Nikita Titov <[email protected]>

Co-authored-by: Nikita Titov <[email protected]>
ChipKerchner pushed a commit to ChipKerchner/LightGBM that referenced this pull request Jun 11, 2020
* [ci] make MiKTeX downloads more reliable

* switch to Powershell

* remove  extra slash

* fixing stuff

* revert install change

* Update .ci/test_r_package_windows.ps1

Co-authored-by: Nikita Titov <[email protected]>

Co-authored-by: Nikita Titov <[email protected]>
@jameslamb jameslamb deleted the ci/download-miktex branch September 7, 2020 05:26
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants