Issues: pytorch/torchtune
Issues list
GPU Middle Class?
discussion
Start a discussion
distributed
Anything related to distributed env (multi-GPU, multi-node)
triaged
This issue has been assigned an owner and appropriate label
#2161
opened Dec 16, 2024 by
EugenHotaj
DPO not saving adapter config
bug
Something isn't working
#2159
opened Dec 15, 2024 by
SalmanMohammadi
Move update_recipe_state to its own util
best practice
Things we should be doing but aren't
triaged
This issue has been assigned an owner and appropriate label
#2158
opened Dec 13, 2024 by
joecummings
what should I do if I want to improve the performance of hellaswag?
discussion
Start a discussion
#2154
opened Dec 12, 2024 by
mathCrazyy
Qwen2.5 does not support Qlora?
discussion
Start a discussion
enhancement
New feature or request
#2153
opened Dec 12, 2024 by
mathCrazyy
Invalid kwarg fused passed to bitsandbytes AdamW8bit
better engineering
Tasks which help improve eng productivity e.g. building tools, cleaning up code, writing docs
#2152
opened Dec 12, 2024 by
mlazos
How to retrieve the distilled model in a manner similar to the OpenAI API interface ?
discussion
Start a discussion
#2148
opened Dec 11, 2024 by
lingq1
Loss becomes NaN during finetuning when turning on optimizer_in_bwd=True
#2145
opened Dec 10, 2024 by
acisseJZhong
70B Fine-tuning GPUs Utilization
discussion
Start a discussion
distributed
Anything related to distributed env (multi-GPU, multi-node)
#2142
opened Dec 10, 2024 by
fabiogeraci
Are there any plans to support context parallel?
enhancement
New feature or request
#2141
opened Dec 10, 2024 by
dz1iang
[small bug + generalization] saving config.yaml to output_dir
better engineering
Tasks which help improve eng productivity e.g. building tools, cleaning up code, writing docs
bug
Something isn't working
#2137
opened Dec 9, 2024 by
felipemello1
Query on Gradient accumulation
discussion
Start a discussion
#2134
opened Dec 9, 2024 by
Vattikondadheeraj
want to fine-tuned llama3.2.1b on MMLU and Arc_challenge and gsm8k(maths)
discussion
Start a discussion
#2132
opened Dec 8, 2024 by
sorobedio
Make it possible to distill into a full finetune model
enhancement
New feature or request
#2122
opened Dec 6, 2024 by
joecummings
Does it support distillation for large models like Qwen2-72B and LLaMA 3.1-70B?
discussion
Start a discussion
#2117
opened Dec 6, 2024 by
lingq1
[RFC] Remove automatic weight merging when training LoRA
discussion
Start a discussion
#2115
opened Dec 5, 2024 by
felipemello1
[RFC] Unify activation checkpointing APIs
rfc
Request for comments
#2114
opened Dec 5, 2024 by
ebsmothers
Multi GPU timeout on save checkpoint (WorkNCCL, Watchdog, timeout)
distributed
Anything related to distributed env (multi-GPU, multi-node)
inference
Anything related to our inference capabilities
#2093
opened Nov 29, 2024 by
albertbn
TensorBoardLogger: AttributeError: module 'tensorflow' has no attribute 'io'
bug
Something isn't working
#2090
opened Nov 28, 2024 by
fabiogeraci