Add a 2-slice pallas training test in pre-submit CI #8850

Open
tengyifei opened this issue Mar 18, 2025 · 0 comments
Labels
testing Testing and coverage related issues. xla:tpu TPU specific issues and PRs

Comments

@tengyifei
Collaborator

We should add a test that trains a very simple model with a Pallas kernel across two TPU v4 slices and checks that training doesn't hang.
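As a rough sketch of the "doesn't hang" check, assuming the 2-slice training job can be launched as a single command: run it in a subprocess with a deadline and treat a timeout as a hang. `run_with_hang_check` and the placeholder commands are hypothetical, not part of torch_xla or the existing CI scripts.

```python
import subprocess
import sys


def run_with_hang_check(cmd, timeout_s):
    """Run `cmd`; return (ok, reason). A timeout is reported as a hang."""
    try:
        # subprocess.run kills the child and raises if the deadline passes.
        result = subprocess.run(
            cmd, timeout=timeout_s, capture_output=True, text=True
        )
    except subprocess.TimeoutExpired:
        return False, "hang"
    if result.returncode != 0:
        return False, "failed"
    return True, "ok"


if __name__ == "__main__":
    # Placeholder command; the real test would launch the multi-slice
    # training script here (hypothetical, not specified in this issue).
    ok, reason = run_with_hang_check([sys.executable, "-c", "pass"], 60)
    print(ok, reason)
```

The timeout would need to be generous enough to cover compilation and slice startup, so that a slow run is not reported as a hang.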

Currently, our pre-submit CI only runs on a single TPU v4 slice, so it doesn't cover cases like multi-slice training.

Post-submit CI relies on human diligence to monitor results and revert bad changes, which has proven ineffective. As long as we can afford it, we should test in pre-submit rather than post-submit.

@ysiraichi ysiraichi added testing Testing and coverage related issues. xla:tpu TPU specific issues and PRs labels Mar 20, 2025