skip some gpt_neox tests that require 80G RAM #17923
sgugger merged 4 commits into huggingface:main from
sgugger
left a comment
We will never have a runner with 80GB of RAM, so if there is no better alternative, we should just delete those tests.
|
What Sylvain said, and I'd ask an even more basic question: why are we running the same test on many identical models of different sizes? The purpose of our test suite is not to test models on the hub, it's to test the model's code. So such tests should never have been there in the first place.
Of course, the % breakdown is symbolic; the point I was trying to convey is that most tests should be really fast to download and execute. If there is a need to test models on the hub, there should be a separate CI whose only job is to load the models and perform some basic tests on them. That CI would need a lot of CPU and GPU memory and several GPUs for obvious reasons, e.g. t5-11b and other huge models. |
|
Hi @stas00 The related tests here are decorated with For Note that on scheduled CI, we use a cache server (FileStore on GCP), so there is no real downloading (i.e. the downloading is very fast, since it happens within GCP's network). The runners also have 16 vCPUs and 60G RAM. |
|
Ah, good point, I missed probably should write out explicitly the desired smallest real model then and perhaps it's small enough to fit? |
|
The main point is that GPT-NeoX does not come with a smaller pretrained model; there is only the 20B version. |
I think this is from old code. We don't want to maintain I will just remove the 2 tests here. |
|
Removed. Will rebase on main later to see if all tests pass. |
|
I am ready for the merge :-) |
* skip some gpt_neox tests that require 80G RAM
* remove tests
* fix quality
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
What does this PR do?
GPT-NeoX requires ~80G RAM to run. Our CI runners have only 60G RAM. Skip a few tests for now.
Do you think it's better to use something like
@unittest.skipUnless(psutil.virtual_memory().total / 1024 ** 3 > 80, "GPT-NeoX requires 80G RAM for testing")
The problem is that psutil is not in the requirements.
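For reference, here is a minimal sketch of how such a memory-based skip could work without making psutil a hard requirement. It is an illustration only, not what was merged (the tests were removed in the end). The helper names `total_ram_gib` and `require_ram_gib` are hypothetical, and the `os.sysconf` fallback is an assumption that only holds on Unix-like systems; where RAM cannot be determined, the test is skipped.

```python
import os
import unittest


def total_ram_gib():
    """Return total physical RAM in GiB, or None if it cannot be determined."""
    try:
        import psutil  # optional dependency; used only if installed
        return psutil.virtual_memory().total / 1024 ** 3
    except ImportError:
        pass
    try:
        # Fallback for Unix-like systems (not available on Windows).
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
        return page_size * page_count / 1024 ** 3
    except (AttributeError, ValueError, OSError):
        return None


def require_ram_gib(min_gib):
    """Hypothetical decorator: skip a test unless more than `min_gib` GiB of RAM is present."""
    total = total_ram_gib()
    enough = total is not None and total > min_gib
    return unittest.skipUnless(enough, f"test requires more than {min_gib} GiB of RAM")


class ExampleTest(unittest.TestCase):
    @require_ram_gib(80)
    def test_large_model(self):
        # Placeholder body; a real test would load and run the large model here.
        self.assertTrue(True)
```

On a 60G runner the decorated test would be reported as skipped rather than failing with an out-of-memory error.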