Issues: pytorch/torchtitan

Issues list

Mitigation to HuggingFace Trainer [enhancement]
#824 opened Feb 6, 2025 by huyiwen
HSDP causes loss instability [module: fsdp, question]
#813 opened Jan 31, 2025 by apkumar
Loss metrics dramatically change after resuming from checkpoint [bug, enhancement, module: checkpoint, release_blocking]
#809 opened Jan 28, 2025 by darkmirage (milestone: torchtitan v1.0.0 release)
unable to run 8b llama [bug]
#807 opened Jan 27, 2025 by asahni04
Gradient Scaling With Pipeline Parallelism [module: pipelining, question]
#803 opened Jan 24, 2025 by windsornguyen
ZBVZeroBubble error [bug, module: pipelining]
#774 opened Jan 3, 2025 by hhaAndroid
PP InterleavedZeroBubble schedule shows low TPS and high memory usage [bug, module: pipelining, release_blocking]
#773 opened Jan 3, 2025 by tianyu-l (milestone: torchtitan v1.0.0 release)
HunyuanVideo Support [enhancement]
#768 opened Jan 2, 2025 by bendanzzc
FSDP 2 doesn't pad tensors? [question]
#764 opened Dec 29, 2024 by cassanof
DeepSeek V3 Support [enhancement]
#760 opened Dec 26, 2024 by casper-hansen
Checkpoint conversion [module: checkpoint, question]
#758 opened Dec 20, 2024 by MaxiBoether
Any plans to support DPO training? [enhancement]
#756 opened Dec 20, 2024 by xs1997zju
JobConfig does not support typing [enhancement]
#753 opened Dec 18, 2024 by greeneggsandyaml
Model init with HuggingFace model [bug, question]
#743 opened Dec 16, 2024 by neeldani
Low bit Optimizers & FA-3 [bug, question]
#742 opened Dec 16, 2024 by asahni04
Issue: Loss Discrepancy Between FSDP1 and FSDP2 with AdamW Optimizer [question]
#724 opened Dec 9, 2024 by Teng-xu