Nightly CUDA test takes too long #1073

Open
msaroufim opened this issue Oct 14, 2024 · 0 comments
@msaroufim
Member

Right now the nightly CUDA test exceeds the default timeout of 60 min. That isn't great, and there are some simple solutions:

  1. Parallel tests
  2. Automatically sharded tests
  3. Manually sharded tests

We tried 1 and 2 before, and both gave surprising results, so 3 feels like the most natural option to me.
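For illustration, option 3 could look roughly like the sketch below: a hand-written mapping from shard name to test paths, plus a check that nothing is left out. The shard names, the `test/group_*` paths, and both helper functions are hypothetical placeholders, not the layout of this repo or an existing CI script.

```python
from typing import Dict, List

# Hypothetical shard layout: names and paths are placeholders, not real directories.
SHARDS: Dict[str, List[str]] = {
    "shard_a": ["test/group_a"],
    "shard_b": ["test/group_b", "test/group_c"],
}


def pytest_command(shard: str) -> List[str]:
    """Build the pytest invocation for one manually defined shard."""
    return ["pytest", *SHARDS[shard]]


def unassigned(all_paths: List[str]) -> List[str]:
    """Return test paths that no shard covers, so nothing is silently skipped."""
    covered = {path for paths in SHARDS.values() for path in paths}
    return sorted(set(all_paths) - covered)
```

Each CI job would then run the command for a single shard, and a cheap check on `unassigned(...)` could fail the build when a new test directory is added without being assigned to any shard.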

@jainapurva jainapurva added the ci label Oct 18, 2024
yanbing-j pushed a commit to yanbing-j/ao that referenced this issue Dec 9, 2024
…at/ folder (pytorch#1076)

* [Hackability Refactor] Move known_model_params under torchchat (pytorch#1073)

* [Hackability Refactor] Migrate CLI call sites to explicitly go through torchchat.py (pytorch#1075)

* [Hackability Refactor] Move model.py underneath torchchat/ (pytorch#1077)

* Move model.py

* Clear out init to avoid package circular import

* [Hackability Refactor] Move select top level docs into folders within torchchat (pytorch#1080)

* [Hackability Refactor] Move the top level util folder into torchchat/utils (pytorch#1079)

* [Hackability Refactor] Move the top level util file into torchchat/utils/

* Cleared out init to avoid packing

* [Hackability Refactor] Collapse gguf_util into gguf_loader (pytorch#1078)

* [Hackability Refactor] Collapse gguf_util into gguf_loader

* Update bad import

* [Hackability Refactor] Move model_config into torchchat/model_config (pytorch#1082)

* [Hackability Refactor] Move cli related files under torchchat/cli (pytorch#1083)

* [Hackability Refactor] Move build/util into torchchat/utils (pytorch#1084)

* [Hackability Refactor] Easy Moves: eval, gguf_loader, quantize, model_dist (pytorch#1085)

* [Hackability Refactor] Easy Cheap Moves: eval, gguf_loader, quantize, model_dist

* Update eval.py call sites that slipped through the initial pass

* [Hackability Refactor] Update missed direct file calls to use torchchat.py (pytorch#1088)

* [Hackability Refactor] Move export and generate under torchchat/ (pytorch#1089)

* [Hackability Refactor] Move scripts under torchchat/utils (pytorch#1090)

* [Hackability Refactor] Move scripts under torchchat/utils

* Fix install script for AOTI

* Update referenced path in build_android

* Adding missing utils path

* Add another layer for torchchat

* Move the source command depending on if TC root is defined

* [Hackability Refactor] Move installation related files into install/ (pytorch#1081)

* [Hackability Refactor] Move installation related files into install/

* Fix install req path

* Test fix with install path for bash

* Debug messages

* Remove changes to install in et_python_libs

* Remove debug echo

* Fix pin path for et

* [Hackability Refactor] Restricted Lint (pytorch#1091)

* [Hackability Refactor] Removing __main__ from export/generate/eval (pytorch#1092)