Faster dataloader merge #1

Merged
shoeybi merged 2 commits into master from fast_dataloader on Apr 23, 2019

Conversation

@raulpuric
Contributor

No description provided.

@shoeybi
Contributor

shoeybi commented Apr 23, 2019

LGTM

@shoeybi shoeybi merged commit 66719e9 into master Apr 23, 2019
@raulpuric raulpuric deleted the fast_dataloader branch April 23, 2019 21:00
shjwudp pushed a commit to shjwudp/Megatron-LM that referenced this pull request Apr 18, 2022
Megatron + DeepSpeed + Pipeline Parallelism
shjwudp pushed a commit to shjwudp/Megatron-LM that referenced this pull request Apr 18, 2022
rraminen added a commit to rraminen/Megatron-LM that referenced this pull request Aug 30, 2022
* Enable Megatron workload on ROCm

* Added ds_pretrain_gpt_350M_dense_pipeclean.sh

* removed a file

* Removed an extra line

* Fix to resolve the below rsqrtf() error on ROCm

/root/Megatron-DeepSpeed/megatron/fused_kernels/layer_norm_hip_kernel.hip:298:10: error: no matching function for call to 'rsqrtf'
  return rsqrtf(v);
         ^~~~~~
/opt/rocm-5.2.0/llvm/lib/clang/14.0.0/include/__clang_hip_math.h:521:7: note: candidate function not viable: call to __device__ function from __host__ function
float rsqrtf(float __x) { return __ocml_rsqrt_f32(__x); }
      ^
thomasw21 pushed a commit to thomasw21/Megatron-LM that referenced this pull request Mar 20, 2023
szhengac pushed a commit to szhengac/Megatron-LM that referenced this pull request Dec 21, 2023
haidark pushed a commit to haidark/Megatron-LM that referenced this pull request Mar 8, 2024
…rans_from_scratch

exp script of llama_en_reasoning_ar_with_trans_from_scratch
haidark pushed a commit to haidark/Megatron-LM that referenced this pull request Mar 8, 2024
Edenzzzz pushed a commit to Edenzzzz/Megatron-LM that referenced this pull request Aug 20, 2024
Don't generate precede matrix if ILP is not required
shjwudp added a commit to shjwudp/Megatron-LM that referenced this pull request Nov 8, 2024
kunlunl referenced this pull request in kunlunl/Megatron-LM May 7, 2025
jiemingz pushed a commit to jiemingz/Megatron-LM that referenced this pull request Jul 28, 2025
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Co-authored-by: oliver könig <okoenig@nvidia.com>
copy-pr-bot bot pushed a commit that referenced this pull request Oct 28, 2025
shjwudp added a commit to shjwudp/Megatron-LM that referenced this pull request Nov 6, 2025
init fused linear cross-entropy interface
shjwudp added a commit to shjwudp/Megatron-LM that referenced this pull request Nov 21, 2025
init fused linear cross-entropy interface
copy-pr-bot bot pushed a commit that referenced this pull request Dec 12, 2025
Add option to only log inference every N steps
copy-pr-bot bot pushed a commit that referenced this pull request Dec 17, 2025
copy-pr-bot bot pushed a commit that referenced this pull request Jan 6, 2026
[MoE] Apply grouped gemm bias before unpadding for FP8
AndyBug0 referenced this pull request in xiaoyao0115/Megatron-LM Jan 8, 2026
asFeng added a commit to Graph-and-Geometric-Learning/Megatron-LM that referenced this pull request Feb 4, 2026
- test_lorentz_gpt_e2e.py: Full E2E test suite with:
  - LorentzGPTQwen3: Complete model (embedding, layers, output)
  - LorentzSelfAttention: GQA-compatible hyperbolic attention
  - LorentzTransformerLayer: Pre-norm transformer layer
  - Tests: forward, backward, manifold constraint, training step

- DEBUG_AND_FIXES.md: Document tracking bugs and fixes
  - Issue NVIDIA#1: Tensor view incompatibility after slicing time coord
  - Fix: Add .contiguous() before view()

All tests pass: manifold error < 1e-4, gradients flow correctly.
copy-pr-bot bot pushed a commit that referenced this pull request Feb 25, 2026
Fix regression in linear cross-entropy fusion caused by merging main branch
copy-pr-bot bot pushed a commit that referenced this pull request Mar 11, 2026
…transformer-engine

Fix conditional TransformerEngine imports to properly detect package availability

2 participants