diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index 3e3e4fb3fe..b83ec70073 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -10,15 +10,15 @@ List issues that this PR closes ([syntax](https://docs.github.com/en/issues/trac * **You can potentially add a usage example below** ```python -# Add a code snippet demonstrating how to use this +# Add a code snippet demonstrating how to use this ``` # Before your PR is "Ready for review" **Pre checks**: -- [ ] Make sure you read and followed [Contributor guidelines](/NVIDIA/NeMo-RL/blob/main/CONTRIBUTING.md) +- [ ] Make sure you read and followed [Contributor guidelines](/NVIDIA-NeMo/RL/blob/main/CONTRIBUTING.md) - [ ] Did you write any new necessary tests? -- [ ] Did you run the unit tests and functional tests locally? Visit our [Testing Guide](/NVIDIA/NeMo-RL/blob/main/docs/testing.md) for how to run tests -- [ ] Did you add or update any necessary documentation? Visit our [Document Development Guide](/NVIDIA/NeMo-RL/blob/main/docs/documentation.md) for how to write, build and test the docs. +- [ ] Did you run the unit tests and functional tests locally? Visit our [Testing Guide](/NVIDIA-NeMo/RL/blob/main/docs/testing.md) for how to run tests +- [ ] Did you add or update any necessary documentation? Visit our [Document Development Guide](/NVIDIA-NeMo/RL/blob/main/docs/documentation.md) for how to write, build and test the docs. # Additional Information * ... diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 2cc6a3051b..3dc065655a 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -31,7 +31,7 @@ We follow a direct clone and branch workflow for now: 1. 
Clone the repository directly: ```bash - git clone https://github.com/NVIDIA/NeMo-RL + git clone https://github.com/NVIDIA-NeMo/RL nemo-rl cd nemo-rl ``` diff --git a/README.md b/README.md index 9f605b1b5c..cdf9404834 100644 --- a/README.md +++ b/README.md @@ -73,7 +73,7 @@ cd nemo-rl # by running (This is not necessary if you are using the pure Pytorch/DTensor path): git submodule update --init --recursive -# Different branches of the repo can have different pinned versions of these third-party submodules. Ensure +# Different branches of the repo can have different pinned versions of these third-party submodules. Ensure # submodules are automatically updated after switching branches or pulling updates by configuring git with: # git config submodule.recurse true @@ -226,7 +226,7 @@ sbatch \ We also support multi-turn generation and training (tool use, games, etc.). Reference example for training to play a Sliding Puzzle Game: ```sh -uv run python examples/run_grpo_sliding_puzzle.py +uv run python examples/run_grpo_sliding_puzzle.py ``` ## Supervised Fine-Tuning (SFT) @@ -409,7 +409,7 @@ If you use NeMo RL in your research, please cite it using the following BibTeX e ```bibtex @misc{nemo-rl, title = {NeMo RL: A Scalable and Efficient Post-Training Library}, -howpublished = {\url{https://github.com/NVIDIA/NeMo-RL}}, +howpublished = {\url{https://github.com/NVIDIA-NeMo/RL}}, year = {2025}, note = {GitHub repository}, } @@ -417,8 +417,8 @@ note = {GitHub repository}, ## Contributing -We welcome contributions to NeMo RL\! Please see our [Contributing Guidelines](https://github.com/NVIDIA/NeMo-RL/blob/main/CONTRIBUTING.md) for more information on how to get involved. +We welcome contributions to NeMo RL\! Please see our [Contributing Guidelines](https://github.com/NVIDIA-NeMo/RL/blob/main/CONTRIBUTING.md) for more information on how to get involved. ## Licenses -NVIDIA NeMo RL is licensed under the [Apache License 2.0](https://github.com/NVIDIA/NeMo-RL/blob/main/LICENSE).
+NVIDIA NeMo RL is licensed under the [Apache License 2.0](https://github.com/NVIDIA-NeMo/RL/blob/main/LICENSE). diff --git a/docs/adding-new-models.md b/docs/adding-new-models.md index 155a012f47..e0de97ae40 100644 --- a/docs/adding-new-models.md +++ b/docs/adding-new-models.md @@ -12,7 +12,7 @@ $$\text{KL} = E_{x \sim \pi}[\pi(x) - \pi_{\text{ref}}(x)]$$ When summed/integrated, replacing the $x \sim \pi$ with $x \sim \pi_{\text{wrong}}$ leads to an error of: -$$\sum_{x} \left( \pi(x) - \pi_{\text{ref}}(x) \right) \left( \pi_{\text{wrong}}(x) - \pi(x) \right)$$ +$$\sum_{x} \left( \pi(x) - \pi_{\text{ref}}(x) \right) \left( \pi_{\text{wrong}}(x) - \pi(x) \right)$$ So, to verify correctness, we calculate: @@ -65,28 +65,28 @@ When investigating discrepancies beyond the acceptable threshold, focus on these When validating Hugging Face-based models, perform the following checks: -- **Compare log probabilities** +- **Compare log probabilities** Ensure the generation log probabilities from inference backends like **vLLM** match those computed by Hugging Face. This comparison helps diagnose potential mismatches. -- **Test parallelism** +- **Test parallelism** Verify consistency with other parallelism settings. -- **Variance** +- **Variance** Repeat tests multiple times (e.g., 10 runs) to confirm that behavior is deterministic or within acceptable variance. -- **Check sequence lengths** - Perform inference on sequence lengths of 100, 1,000, and 10,000 tokens. +- **Check sequence lengths** + Perform inference on sequence lengths of 100, 1,000, and 10,000 tokens. Ensure the model behaves consistently at each length. -- **Use real and dummy data** - - **Real data:** Tokenize and generate from actual text samples. +- **Use real and dummy data** + - **Real data:** Tokenize and generate from actual text samples. - **Dummy data:** Simple numeric sequences to test basic generation. -- **Vary sampling parameters** - Test both greedy and sampling generation modes. 
+- **Vary sampling parameters** + Test both greedy and sampling generation modes. Adjust temperature and top-p to confirm output consistency across backends. -- **Test different batch sizes** +- **Test different batch sizes** Try with batch sizes of 1, 8, and 32 to ensure consistent behavior across different batch configurations. --- @@ -95,11 +95,11 @@ When validating Hugging Face-based models, perform the following checks: ### Additional Validation -- **Compare Megatron outputs** +- **Compare Megatron outputs** Ensure the Megatron forward pass aligns with Hugging Face and the generation log probabilities from inference backends like **vLLM**. -- **Parallel settings** - Match the same parallelism configurations used for the HuggingFace-based tests. +- **Parallel settings** + Match the same parallelism configurations used for the HuggingFace-based tests. Confirm outputs remain consistent across repeated runs. --- @@ -128,7 +128,7 @@ By following these validation steps and ensuring your model's outputs remain con We also maintain a set of standalone scripts that can be used to diagnose issues related to correctness that we have encountered before. -## [1.max_model_len_respected.py](https://github.com/NVIDIA/NeMo-RL/blob/main/tools/model_diagnostics/1.max_model_len_respected.py) +## [1.max_model_len_respected.py](https://github.com/NVIDIA-NeMo/RL/blob/main/tools/model_diagnostics/1.max_model_len_respected.py) Test if a new model respects the `max_model_len` passed to vllm: @@ -142,7 +142,7 @@ uv run --extra vllm tools/model_diagnostics/1.max_model_len_respected.py Qwen/Qw # [Qwen/Qwen2.5-1.5B] ALL GOOD! 
``` -## [2.long_generation_decode_vs_prefill](https://github.com/NVIDIA/NeMo-RL/blob/main/tools/model_diagnostics/2.long_generation_decode_vs_prefill.py) +## [2.long_generation_decode_vs_prefill](https://github.com/NVIDIA-NeMo/RL/blob/main/tools/model_diagnostics/2.long_generation_decode_vs_prefill.py) Test that vLLM yields near-identical token log-probabilities when comparing decoding with a single prefill pass across multiple prompts. diff --git a/docs/model-quirks.md b/docs/model-quirks.md index fa2b181c7e..ca08b2741b 100644 --- a/docs/model-quirks.md +++ b/docs/model-quirks.md @@ -6,7 +6,7 @@ This document outlines special cases and model-specific behaviors that require c ### Tied Weights -Weight tying between the embedding layer (`model.embed_tokens`) and output layer (`lm_head`) is currently not respected when using the FSDP1 policy or the DTensor policy when TP > 1 (See [this issue](https://github.com/NVIDIA/NeMo-RL/issues/227)). To avoid errors when training these models, we only allow training models with tied weights using the DTensor policy with TP=1. For Llama-3 and Qwen2.5 models, weight-tying is only enabled for the smaller models (< 2B), which can typically be trained without tensor parallelism. For Gemma-3, all model sizes have weight-tying enabled, including the larger models which require tensor parallelism. To support training of these models, we specially handle the Gemma-3 models by allowing training using the DTensor policy with TP > 1. +Weight tying between the embedding layer (`model.embed_tokens`) and output layer (`lm_head`) is currently not respected when using the FSDP1 policy or the DTensor policy when TP > 1 (See [this issue](https://github.com/NVIDIA-NeMo/RL/issues/227)). To avoid errors when training these models, we only allow training models with tied weights using the DTensor policy with TP=1. 
For Llama-3 and Qwen2.5 models, weight-tying is only enabled for the smaller models (< 2B), which can typically be trained without tensor parallelism. For Gemma-3, all model sizes have weight-tying enabled, including the larger models which require tensor parallelism. To support training of these models, we specially handle the Gemma-3 models by allowing training using the DTensor policy with TP > 1. **Special Handling:** - We skip the tied weights check for all Gemma-3 models when using the DTensor policy, allowing training using TP > 1. diff --git a/examples/configs/grpo-deepscaler-1.5b-8K.yaml b/examples/configs/grpo-deepscaler-1.5b-8K.yaml index 96bc7f2e76..1013f3d4c2 100644 --- a/examples/configs/grpo-deepscaler-1.5b-8K.yaml +++ b/examples/configs/grpo-deepscaler-1.5b-8K.yaml @@ -30,7 +30,7 @@ checkpointing: save_period: 10 policy: - # Qwen/Qwen2.5-1.5B has tied weights which are only supported with dtensor policy with tp size 1 (https://github.com/NVIDIA/NeMo-RL/issues/227) + # Qwen/Qwen2.5-1.5B has tied weights which are only supported with dtensor policy with tp size 1 (https://github.com/NVIDIA-NeMo/RL/issues/227) model_name: "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B" tokenizer: name: ${policy.model_name} ## specify if you'd like to use a tokenizer different from the model's default diff --git a/examples/configs/grpo_math_1B.yaml b/examples/configs/grpo_math_1B.yaml index 85cc620b62..1842b01497 100644 --- a/examples/configs/grpo_math_1B.yaml +++ b/examples/configs/grpo_math_1B.yaml @@ -30,7 +30,7 @@ checkpointing: save_period: 10 policy: - # Qwen/Qwen2.5-1.5B has tied weights which are only supported with dtensor policy with tp size 1 (https://github.com/NVIDIA/NeMo-RL/issues/227) + # Qwen/Qwen2.5-1.5B has tied weights which are only supported with dtensor policy with tp size 1 (https://github.com/NVIDIA-NeMo/RL/issues/227) model_name: "Qwen/Qwen2.5-1.5B" tokenizer: name: ${policy.model_name} ## specify if you'd like to use a tokenizer different from the 
model's default diff --git a/examples/configs/grpo_sliding_puzzle.yaml b/examples/configs/grpo_sliding_puzzle.yaml index 0b99e750e8..8493bfc40e 100644 --- a/examples/configs/grpo_sliding_puzzle.yaml +++ b/examples/configs/grpo_sliding_puzzle.yaml @@ -24,7 +24,7 @@ policy: max_new_tokens: ${policy.max_total_sequence_length} temperature: 1.0 # Setting top_p/top_k to 0.999/10000 to strip out Qwen's special/illegal tokens - # https://github.com/NVIDIA/NeMo-RL/issues/237 + # https://github.com/NVIDIA-NeMo/RL/issues/237 top_p: 0.999 top_k: 10000 stop_token_ids: null @@ -38,7 +38,7 @@ policy: data: add_system_prompt: false - + env: sliding_puzzle_game: cfg: diff --git a/examples/configs/recipes/llm/grpo-gemma3-27b-it-16n8g-fsdp2tp8sp-actckpt-long.yaml b/examples/configs/recipes/llm/grpo-gemma3-27b-it-16n8g-fsdp2tp8sp-actckpt-long.yaml index 0ec0ef477a..2458739e2e 100644 --- a/examples/configs/recipes/llm/grpo-gemma3-27b-it-16n8g-fsdp2tp8sp-actckpt-long.yaml +++ b/examples/configs/recipes/llm/grpo-gemma3-27b-it-16n8g-fsdp2tp8sp-actckpt-long.yaml @@ -45,7 +45,7 @@ policy: context_parallel_size: 1 custom_parallel_plan: null dynamic_batching: - # TODO: OOMs if enabled https://github.com/NVIDIA/NeMo-RL/issues/383 + # TODO: OOMs if enabled https://github.com/NVIDIA-NeMo/RL/issues/383 enabled: False train_mb_tokens: ${mul:${policy.max_total_sequence_length}, ${policy.train_micro_batch_size}} logprob_mb_tokens: ${mul:${policy.max_total_sequence_length}, ${policy.logprob_batch_size}} diff --git a/examples/converters/convert_dcp_to_hf.py b/examples/converters/convert_dcp_to_hf.py index fc53418696..d87d97a64e 100644 --- a/examples/converters/convert_dcp_to_hf.py +++ b/examples/converters/convert_dcp_to_hf.py @@ -51,7 +51,7 @@ def main(): model_name_or_path = config["policy"]["model_name"] # TODO: After the following PR gets merged: - # https://github.com/NVIDIA/NeMo-RL/pull/148/files + # https://github.com/NVIDIA-NeMo/RL/pull/148/files # tokenizer should be copied from 
policy/tokenizer/* instead of relying on the model name # We can expose a arg at the top level --tokenizer_path to plumb that through. # This is more stable than relying on the current NeMo-RL get_tokenizer() which can diff --git a/nemo_rl/algorithms/dpo.py b/nemo_rl/algorithms/dpo.py index c7b3de9f5f..3883328216 100644 --- a/nemo_rl/algorithms/dpo.py +++ b/nemo_rl/algorithms/dpo.py @@ -71,7 +71,7 @@ class DPOConfig(TypedDict): preference_average_log_probs: bool sft_average_log_probs: bool ## TODO(@ashors) support other loss functions - ## https://github.com/NVIDIA/NeMo-RL/issues/193 + ## https://github.com/NVIDIA-NeMo/RL/issues/193 # preference_loss: str # gt_reward_scale: float preference_loss_weight: float diff --git a/nemo_rl/distributed/ray_actor_environment_registry.py b/nemo_rl/distributed/ray_actor_environment_registry.py index 4c7eebee13..1f1937729d 100644 --- a/nemo_rl/distributed/ray_actor_environment_registry.py +++ b/nemo_rl/distributed/ray_actor_environment_registry.py @@ -17,7 +17,7 @@ ACTOR_ENVIRONMENT_REGISTRY: dict[str, str] = { "nemo_rl.models.generation.vllm.VllmGenerationWorker": PY_EXECUTABLES.VLLM, # Temporary workaround for the coupled implementation of DTensorPolicyWorker and vLLM. - # This will be reverted to PY_EXECUTABLES.BASE once https://github.com/NVIDIA/NeMo-RL/issues/501 is resolved. + # This will be reverted to PY_EXECUTABLES.BASE once https://github.com/NVIDIA-NeMo/RL/issues/501 is resolved. 
"nemo_rl.models.policy.dtensor_policy_worker.DTensorPolicyWorker": PY_EXECUTABLES.VLLM, "nemo_rl.models.policy.fsdp1_policy_worker.FSDP1PolicyWorker": PY_EXECUTABLES.BASE, "nemo_rl.models.policy.megatron_policy_worker.MegatronPolicyWorker": PY_EXECUTABLES.MCORE, diff --git a/nemo_rl/models/generation/vllm.py b/nemo_rl/models/generation/vllm.py index f0cd5eb50b..966ad0a205 100644 --- a/nemo_rl/models/generation/vllm.py +++ b/nemo_rl/models/generation/vllm.py @@ -330,7 +330,7 @@ def _patch_vllm_init_workers_ray(): enable_prefix_caching=torch.cuda.get_device_capability()[0] >= 8, dtype=self.cfg["vllm_cfg"]["precision"], seed=seed, - # Don't use cuda-graph by default as it leads to convergence issues (see https://github.com/NVIDIA/NeMo-RL/issues/186) + # Don't use cuda-graph by default as it leads to convergence issues (see https://github.com/NVIDIA-NeMo/RL/issues/186) enforce_eager=True, max_model_len=self.cfg["vllm_cfg"]["max_model_len"], trust_remote_code=True, diff --git a/nemo_rl/models/policy/dtensor_policy_worker.py b/nemo_rl/models/policy/dtensor_policy_worker.py index 61dcd9a127..a5e1d9259d 100644 --- a/nemo_rl/models/policy/dtensor_policy_worker.py +++ b/nemo_rl/models/policy/dtensor_policy_worker.py @@ -162,7 +162,7 @@ def __init__( device_map="cpu", # load weights onto CPU initially # Always load the model in float32 to keep master weights in float32. # Keeping the master weights in lower precision has shown to cause issues with convergence. - # https://github.com/NVIDIA/NeMo-RL/issues/279 will fix the issue of CPU OOM for larger models. + # https://github.com/NVIDIA-NeMo/RL/issues/279 will fix the issue of CPU OOM for larger models. 
torch_dtype=torch.float32, trust_remote_code=True, **sliding_window_overwrite( @@ -381,7 +381,7 @@ def train( and not self.skip_tie_check ): raise ValueError( - f"Using dtensor policy with tp size {self.cfg['dtensor_cfg']['tensor_parallel_size']} for model ({self.cfg['model_name']}) that has tied weights (num_tied_weights={self.num_tied_weights}) is not supported (https://github.com/NVIDIA/NeMo-RL/issues/227). Please use dtensor policy with tensor parallel == 1 instead." + f"Using dtensor policy with tp size {self.cfg['dtensor_cfg']['tensor_parallel_size']} for model ({self.cfg['model_name']}) that has tied weights (num_tied_weights={self.num_tied_weights}) is not supported (https://github.com/NVIDIA-NeMo/RL/issues/227). Please use dtensor policy with tensor parallel == 1 instead." ) if gbs is None: gbs = self.cfg["train_global_batch_size"] diff --git a/nemo_rl/models/policy/fsdp1_policy_worker.py b/nemo_rl/models/policy/fsdp1_policy_worker.py index ef0eb98720..f4ec53daa0 100644 --- a/nemo_rl/models/policy/fsdp1_policy_worker.py +++ b/nemo_rl/models/policy/fsdp1_policy_worker.py @@ -96,7 +96,7 @@ def __init__( device_map="cpu", # load weights onto CPU initially # Always load the model in float32 to keep master weights in float32. # Keeping the master weights in lower precision has shown to cause issues with convergence. - # https://github.com/NVIDIA/NeMo-RL/issues/279 will fix the issue of CPU OOM for larger models. + # https://github.com/NVIDIA-NeMo/RL/issues/279 will fix the issue of CPU OOM for larger models. 
torch_dtype=torch.float32, trust_remote_code=True, **sliding_window_overwrite( @@ -110,7 +110,7 @@ def __init__( self.reference_model = AutoModelForCausalLM.from_pretrained( model_name, device_map="cpu", # load weights onto CPU initially - torch_dtype=torch.float32, # use full precision in sft until https://github.com/NVIDIA/nemo-rl/issues/13 is fixed + torch_dtype=torch.float32, # use full precision in sft until https://github.com/NVIDIA-NeMo/RL/issues/13 is fixed trust_remote_code=True, **sliding_window_overwrite( model_name @@ -249,7 +249,7 @@ def train( skip_tie_check = os.environ.get("NRL_SKIP_TIED_WEIGHT_CHECK") if self.num_tied_weights != 0 and not skip_tie_check: raise ValueError( - f"Using FSP1 with a model ({self.cfg['model_name']}) that has tied weights (num_tied_weights={self.num_tied_weights}) is not supported (https://github.com/NVIDIA/NeMo-RL/issues/227). Please use dtensor policy with tensor parallel == 1 instead." + f"Using FSDP1 with a model ({self.cfg['model_name']}) that has tied weights (num_tied_weights={self.num_tied_weights}) is not supported (https://github.com/NVIDIA-NeMo/RL/issues/227). Please use dtensor policy with tensor parallel == 1 instead."
) if gbs is None: diff --git a/nemo_rl/package_info.py b/nemo_rl/package_info.py index 3fcefc1375..29883366db 100644 --- a/nemo_rl/package_info.py +++ b/nemo_rl/package_info.py @@ -28,8 +28,8 @@ __contact_names__ = "NVIDIA" __contact_emails__ = "nemo-tookit@nvidia.com" __homepage__ = "https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/" -__repository_url__ = "https://github.com/NVIDIA/NeMo-RL" -__download_url__ = "https://github.com/NVIDIA/NeMo-RL/releases" +__repository_url__ = "https://github.com/NVIDIA-NeMo/RL" +__download_url__ = "https://github.com/NVIDIA-NeMo/RL/releases" __description__ = "NeMo-RL - a toolkit for model alignment" __license__ = "Apache2" __keywords__ = "deep learning, machine learning, gpu, NLP, NeMo, nvidia, pytorch, torch, language, reinforcement learning, RLHF, preference modeling, SteerLM, DPO" diff --git a/nemo_rl/utils/native_checkpoint.py b/nemo_rl/utils/native_checkpoint.py index b857264d31..43d511bd74 100644 --- a/nemo_rl/utils/native_checkpoint.py +++ b/nemo_rl/utils/native_checkpoint.py @@ -248,7 +248,7 @@ def convert_dcp_to_hf( config.save_pretrained(hf_ckpt_path) # TODO: After the following PR gets merged: - # https://github.com/NVIDIA/NeMo-RL/pull/148/files + # https://github.com/NVIDIA-NeMo/RL/pull/148/files # tokenizer should be copied from policy/tokenizer/* instead of relying on the model name # We can expose a arg at the top level --tokenizer_path to plumb that through. 
# This is more stable than relying on the current NeMo-RL get_tokenizer() which can diff --git a/pyproject.toml b/pyproject.toml index 62095ae9fb..6b1371de83 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -101,7 +101,7 @@ test = [ megatron-core = { workspace = true } nemo-tron = { workspace = true } # The NeMo Run source to be used by nemo-tron -nemo_run = { git = "https://github.com/NVIDIA/NeMo-Run", rev = "414f0077c648fde2c71bb1186e97ccbf96d6844c" } +nemo_run = { git = "https://github.com/NVIDIA-NeMo/Run", rev = "414f0077c648fde2c71bb1186e97ccbf96d6844c" } # torch/torchvision/triton all come from the torch index in order to pick up aarch64 wheels torch = [ { index = "pytorch-cu128" }, diff --git a/tests/functional/dpo.sh b/tests/functional/dpo.sh index 562f62a0b8..b03b611b25 100755 --- a/tests/functional/dpo.sh +++ b/tests/functional/dpo.sh @@ -36,7 +36,7 @@ uv run $PROJECT_ROOT/examples/run_dpo.py \ uv run tests/json_dump_tb_logs.py $LOG_DIR --output_path $JSON_METRICS # TODO: threshold set higher since test is flaky -# https://github.com/NVIDIA/NeMo-RL/issues/370 +# https://github.com/NVIDIA-NeMo/RL/issues/370 uv run tests/check_metrics.py $JSON_METRICS \ 'data["train/loss"]["3"] < 0.8' diff --git a/tests/functional/test_converter_roundtrip.py b/tests/functional/test_converter_roundtrip.py index e551d0e6b5..8767e564f5 100644 --- a/tests/functional/test_converter_roundtrip.py +++ b/tests/functional/test_converter_roundtrip.py @@ -1,3 +1,16 @@ +# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. #!/usr/bin/env python3 """ Functional test for converter roundtrip functionality. diff --git a/tests/test_suites/llm/sft-llama3.1-8b-instruct-1n8g-fsdp2tp1-long.v2.sh b/tests/test_suites/llm/sft-llama3.1-8b-instruct-1n8g-fsdp2tp1-long.v2.sh index 9e3a004460..b22c00dec0 100755 --- a/tests/test_suites/llm/sft-llama3.1-8b-instruct-1n8g-fsdp2tp1-long.v2.sh +++ b/tests/test_suites/llm/sft-llama3.1-8b-instruct-1n8g-fsdp2tp1-long.v2.sh @@ -32,7 +32,7 @@ uv run examples/run_sft.py \ # Convert tensorboard logs to json uv run tests/json_dump_tb_logs.py $LOG_DIR --output_path $JSON_METRICS -# TODO: the memory check is known to OOM. see https://github.com/NVIDIA/NeMo-RL/issues/263 +# TODO: the memory check is known to OOM. see https://github.com/NVIDIA-NeMo/RL/issues/263 # Only run metrics if the target step is reached if [[ $(jq 'to_entries | .[] | select(.key == "train/loss") | .value | keys | map(tonumber) | max' $JSON_METRICS) -ge $MAX_STEPS ]]; then # TODO: FIGURE OUT CORRECT METRICS diff --git a/tests/test_suites/llm/sft-llama3.1-8b-instruct-1n8g-fsdp2tp2sp.v2.sh b/tests/test_suites/llm/sft-llama3.1-8b-instruct-1n8g-fsdp2tp2sp.v2.sh index 26c78649c8..abed80e5ed 100755 --- a/tests/test_suites/llm/sft-llama3.1-8b-instruct-1n8g-fsdp2tp2sp.v2.sh +++ b/tests/test_suites/llm/sft-llama3.1-8b-instruct-1n8g-fsdp2tp2sp.v2.sh @@ -31,7 +31,7 @@ uv run examples/run_sft.py \ # Convert tensorboard logs to json uv run tests/json_dump_tb_logs.py $LOG_DIR --output_path $JSON_METRICS -# TODO: memory check will fail due to OOM tracked here https://github.com/NVIDIA/NeMo-RL/issues/263 +# TODO: memory check will fail due to OOM tracked here https://github.com/NVIDIA-NeMo/RL/issues/263 # Only run metrics if the target step is reached if [[ $(jq 'to_entries | .[] | select(.key == "train/loss") | .value | keys | map(tonumber) | max' $JSON_METRICS) -ge $MAX_STEPS ]]; then diff --git 
a/tests/test_suites/llm/sft-qwen2.5-32b-4n8g-fsdp2tp8sp-actckpt.v2.sh b/tests/test_suites/llm/sft-qwen2.5-32b-4n8g-fsdp2tp8sp-actckpt.v2.sh index eeaa9c8025..257add6fc5 100755 --- a/tests/test_suites/llm/sft-qwen2.5-32b-4n8g-fsdp2tp8sp-actckpt.v2.sh +++ b/tests/test_suites/llm/sft-qwen2.5-32b-4n8g-fsdp2tp8sp-actckpt.v2.sh @@ -3,7 +3,7 @@ SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd) source $SCRIPT_DIR/common.env # TODO: this config can crash on OOM -# https://github.com/NVIDIA/NeMo-RL/issues/263 +# https://github.com/NVIDIA-NeMo/RL/issues/263 # ===== BEGIN CONFIG ===== NUM_NODES=4 diff --git a/tests/unit/models/generation/test_vllm_generation.py b/tests/unit/models/generation/test_vllm_generation.py index dc1de1b123..1404b02337 100644 --- a/tests/unit/models/generation/test_vllm_generation.py +++ b/tests/unit/models/generation/test_vllm_generation.py @@ -475,7 +475,7 @@ async def test_vllm_policy_generation_async( @pytest.mark.skip( - reason="Skipping for now, will be fixed in https://github.com/NVIDIA/NeMo-RL/issues/408" + reason="Skipping for now, will be fixed in https://github.com/NVIDIA-NeMo/RL/issues/408" ) def test_vllm_worker_seed_behavior(cluster, tokenizer): """ diff --git a/tests/unit/utils/test_native_checkpoint.py b/tests/unit/utils/test_native_checkpoint.py index 88356d2dba..7df7f8543b 100755 --- a/tests/unit/utils/test_native_checkpoint.py +++ b/tests/unit/utils/test_native_checkpoint.py @@ -330,7 +330,7 @@ def test_convert_dcp_to_hf(policy, num_gpus): os.path.join(tmp_dir, "test_hf_and_dcp-hf-offline"), simple_policy_config["model_name"], # TODO: After the following PR gets merged: - # https://github.com/NVIDIA/NeMo-RL/pull/148/files + # https://github.com/NVIDIA-NeMo/RL/pull/148/files # tokenizer should be copied from policy/tokenizer/* instead of relying on the model name # We can expose a arg at the top level --tokenizer_path to plumb that through. 
# This is more stable than relying on the current NeMo-RL get_tokenizer() which can diff --git a/uv.lock b/uv.lock index 9b50767fac..e2a02bdd23 100644 --- a/uv.lock +++ b/uv.lock @@ -2427,7 +2427,7 @@ test = [ [[package]] name = "nemo-run" version = "0.5.0rc0.dev0" -source = { git = "https://github.com/NVIDIA/NeMo-Run?rev=414f0077c648fde2c71bb1186e97ccbf96d6844c#414f0077c648fde2c71bb1186e97ccbf96d6844c" } +source = { git = "https://github.com/NVIDIA-NeMo/Run?rev=414f0077c648fde2c71bb1186e97ccbf96d6844c#414f0077c648fde2c71bb1186e97ccbf96d6844c" } dependencies = [ { name = "catalogue" }, { name = "cryptography" }, @@ -2473,7 +2473,7 @@ requires-dist = [ { name = "ijson" }, { name = "lightning" }, { name = "matplotlib" }, - { name = "nemo-run", git = "https://github.com/NVIDIA/NeMo-Run?rev=414f0077c648fde2c71bb1186e97ccbf96d6844c" }, + { name = "nemo-run", git = "https://github.com/NVIDIA-NeMo/Run?rev=414f0077c648fde2c71bb1186e97ccbf96d6844c" }, { name = "onnx" }, { name = "scikit-learn" }, { name = "webdataset" },