
Commit 0dfc817

ertkonuk, marcromeyn, michal2409, akoumpa, and pablo-garay authored and committed
Adds Tiktoken tokenizer for Nemotron-Mistral 12B (NVIDIA#9797)
This commit squashes the following changes:

* Adding context- & expert-parallism to MegatronStrategy (#9525)
* Add CICD test for Stable Diffusion (#9464)
* Akoumparouli/nemo ux mixtral (#9446)
* update mcoreddp call (#9345)
* [NeMo-UX] Llama and Gemma (#9528)
* [NeMo-UX] minor logging bug fixes (#9529)
* mcore distOpt restore fix (#9421)
* Custom Tiktoken tokenizer
* Fixed the tokenizer decoding on special tokens
* Added token_to_id() method
* Update neva conversion script from and to HF (#9296)
* vLLM Export Support (#9381)
* PL: Delete precision if using plugin; TODO switch to MegatronTrainerBuilder (#9535)
* Add page context fmha (#9526)
* extend get_gpt_layer_modelopt_spec to support MoE (#9532)
* fix mock data generation for legacy dataset (#9530)
* [Nemo-UX] IO fixes (#9512)
* Test C++ runtime on demand in nemo_export.py to avoid possible OOMs (#9544)
* Fix lhotse tests for v1.24.2 (#9546)
* gpu_unitTests_notOptional (#9551)
* add reset learning rate functionality (#9372)
* Add Python AIStore SDK to container and bump min Lhotse version (#9537)
* Adding 'use_dynamo' option for export to use onnx.dynamo_export() instead of onnx.export() (#9147)
* [NeMo-UX] Fix tokenizer IO (#9555)
* [NeMo UX] Move mistral_7b.py to mistral.py (#9545)
* Use closed-formula to round by multiple (#9307)
* ci: Do not attempt to send slack on fork (#9556)
* Fix nemo export test (#9547)
* Fix SDXL incorrect name in docs (#9534)
* GPU unit tests: Mark flaky tests to be fixed (#9559)
* Bump PTL version (#9557)
* [Resiliency] Straggler detection (#9473)
* switch to torch_dist as default dist checkpointing backend (#9541)
* [NeMo-UX] Checkpointing bug fixes (#9562)
* Add tps and pps params to the export script (#9558)
* Consolidate gpt continue training script into pretraining script (#9413)
* Add support to change Multi task model prompt (#9542)
* Add Multimodal Exporter (#9256)
* Enable encoder adapters for Canary and MultiTaskAED models (#9409)
* pass option through (#9570)
* PTQ refinements (#9574)
* Audio model collection (#9263)
* [NeMo-UX] Fix Trainer serialization (#9571)
* Update click version requirement (#9580)
* [Fault tolerance] Heartbeat detection (#9352)
* Add ModelOpt QAT example for Llama2 SFT model (#9326)
* Set TE flag in legacy -> mcore conversion script (#9585)
* [Nemo-UX] Add fabric-API for manual forward-pass (#9577)
* [Nemo-UX] Add SDK-factories to llm-collection (#9589)
* Multimodal projection layer adapter fix for PP>1 (#9445)
* Add offline quantization script for QLoRA deployment (#9455)
* qlora support more models (#9488)
* [NeMo-UX] Some improvements to NeMoLogger (#9591)
* Set n_gpu to None in nemo export (#9593)
* Inflight nemo model export support (#9527)
* vLLM Export Improvements (#9596)
* Set finalize_model_grads_func in on_fit_start instead to make sure it's being called (#9599)
* Set no_sync_func & grad_sync_fucn (#9601)
* small nemo logger bug fix (#9607)
* fix the dict format returned by scheduler method (#9609)
* [NeMo-UX] Dataloading enhancements and bug fixes (#9595)
* Fix serialization of AutoResume (#9616)
* Chat template support for megatron_gpt_eval.py (#9354)
* Jsonl support (#9611)
* [NeMo-UX] Add PEFT (#9490)
* Akoumparouli/mistral import instruct chat template fix (#9567)
* Remove .cuda calls, use device isntead (#9602)
* fix converter defautl args (#9565)
* mixtral export (#9603)
* fix: remove non_blocking from PTL's .cuda call (#9618)
* Alit/mamba tmp (#9612)
* TitaNet Batch Verify Speaker (#9337)
* Enable MCore checkpointing optimizations (#9505)
* Change mixtral moe key name for trt-llm (#9620)
* fix ckpt load bug (#9621)
* NeVA Minor Fixes (#9608)
* fix pretrianing data sizes and weights (#9627)
* Alit/mamba (#9575)
* [NeMo-UX] async checkpointing support (#9466)
* Fix the arguments of forward_for_export function in msdd_models (#9624)
* Change default parallel_save to False (#9632)
* Unwrap ckpt_io for model opt (async save) (#9622)
* MCore T5 support for NeMo - Training (#9432)
* [Nemo-UX] Expose transformer_layer_spec inside GPTConfig (#9592)
* Update NeMo Clip to Use MCore Modules (#9594) …
1 parent 06df64f commit 0dfc817

5 files changed: +207 −1 lines changed

nemo/collections/common/tokenizers/__init__.py (+1)

@@ -19,6 +19,7 @@
 from nemo.collections.common.tokenizers.huggingface.auto_tokenizer import AutoTokenizer
 from nemo.collections.common.tokenizers.regex_tokenizer import RegExTokenizer
 from nemo.collections.common.tokenizers.sentencepiece_tokenizer import SentencePieceTokenizer
+from nemo.collections.common.tokenizers.tiktoken_tokenizer import TiktokenTokenizer
 from nemo.collections.common.tokenizers.tokenizer_spec import TokenizerSpec
 from nemo.collections.common.tokenizers.word_tokenizer import WordTokenizer
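With this export, the new tokenizer is importable straight from the tokenizers package. A minimal usage sketch (the vocab path is a placeholder, not part of the commit):

    from nemo.collections.common.tokenizers import TiktokenTokenizer

    tokenizer = TiktokenTokenizer(vocab_file="/path/to/tokenizer_vocab.json")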

nemo/collections/common/tokenizers/tiktoken_tokenizer.py (new file, +200)

@@ -0,0 +1,200 @@
# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import base64
import json
import os
from pathlib import Path
from typing import Dict, List, Optional

try:
    import tiktoken
except ImportError:
    pass

from nemo.collections.common.tokenizers.tokenizer_spec import TokenizerSpec

__all__ = ['TiktokenTokenizer']
def reload_mergeable_ranks(
    path: str,
    max_vocab: Optional[int] = None,
) -> Dict[bytes, int]:
    """
    Reload the tokenizer JSON file and convert it to Tiktoken format.
    """
    assert path.endswith(".json")

    # reload vocab
    with open(path, "r") as f:
        vocab = json.load(f)
    assert isinstance(vocab, list)
    print(f"Vocab size: {len(vocab)}")
    if max_vocab is not None:
        vocab = vocab[:max_vocab]
        print(f"Cutting vocab to first {len(vocab)} tokens.")

    # build ranks
    ranks: Dict[bytes, int] = {}
    for i, x in enumerate(vocab):
        assert x.keys() == {"rank", "token_bytes", "token_str"}
        assert x["rank"] == i
        merge = base64.b64decode(x["token_bytes"])
        assert i >= 256 or merge == bytes([i])
        ranks[merge] = x["rank"]

    # sanity check
    assert len(ranks) == len(vocab)
    assert set(ranks.values()) == set(range(len(ranks)))

    return ranks

PATTERN_TIKTOKEN = "[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
DEFAULT_TIKTOKEN_MAX_VOCAB = 2**17  # 131072
SPECIAL_TOKENS = ["<unk>", "<s>", "</s>"]
SPECIAL_TOKEN_TEMPLATE = "<SPECIAL_{id}>"

class TiktokenTokenizer(TokenizerSpec):
    """
    TiktokenTokenizer https://github.com/openai/tiktoken.

    Args:
        vocab_file: path to the tokenizer vocabulary JSON file
        pattern: regex pattern used to split the text
        vocab_size: total vocabulary size, including special tokens
        num_special_tokens: number of special tokens to generate
        special_tokens: list of user-defined special tokens (must include <unk>, <s>, </s>)
    """

    def __init__(
        self,
        vocab_file: str,
        pattern: str = PATTERN_TIKTOKEN,
        vocab_size: int = DEFAULT_TIKTOKEN_MAX_VOCAB,  # 131072
        num_special_tokens: int = 1000,
        special_tokens: Optional[List[str]] = None,
    ):
        if not vocab_file or not os.path.exists(vocab_file):
            raise ValueError(f"vocab_file: {vocab_file} is invalid")

        if special_tokens is None:
            special_tokens = SPECIAL_TOKENS.copy()

        assert len(special_tokens) == len(set(special_tokens)), f"Special tokens should be unique: {special_tokens}"
        assert len(special_tokens) <= num_special_tokens < vocab_size
        assert set(SPECIAL_TOKENS) <= set(special_tokens), f"Custom special tokens should include {SPECIAL_TOKENS}"

        self._unk_id = special_tokens.index("<unk>")
        self._bos_id = special_tokens.index("<s>")
        self._eos_id = special_tokens.index("</s>")

        self._vocab_size = vocab_size
        print(f'{self._vocab_size = }')
        self.num_special_tokens = num_special_tokens
        special_filler = [SPECIAL_TOKEN_TEMPLATE.format(id=i) for i in range(len(special_tokens), num_special_tokens)]
        if special_filler:
            print(f"Adding special tokens {special_filler[0]}, ..., {special_filler[-1]}")
        self.special_tokens = special_tokens + special_filler
        assert len(set(self.special_tokens)) == len(self.special_tokens) == num_special_tokens, self.special_tokens
        self.inner_vocab_size = vocab_size - num_special_tokens

        # reload vocab
        self.token2id = reload_mergeable_ranks(vocab_file, max_vocab=self.inner_vocab_size)
        self.id2token = {v: k for k, v in self.token2id.items()}
        assert set(range(self.inner_vocab_size)) == set(self.id2token.keys())

        self.shifted_id2token = {i: tok for i, tok in enumerate(self.special_tokens)}
        for key, value in self.id2token.items():
            self.shifted_id2token[key + self.num_special_tokens] = value

        self.tokenizer = tiktoken.Encoding(
            name=Path(vocab_file).parent.name,
            pat_str=pattern,
            mergeable_ranks=self.token2id,
            special_tokens={},  # special tokens are handled manually
        )

    def text_to_tokens(self, text: str):
        token_ids = self.tokenizer.encode(text)
        return [self.tokenizer.decode_single_token_bytes(token) for token in token_ids]

    def tokens_to_text(self, tokens: List[bytes]):
        token_ids = [self.tokenizer.encode_single_token(token) for token in tokens]
        return self.tokenizer.decode(token_ids)

    def token_to_id(self, token):
        return self.tokenizer.encode_single_token(token)

    def tokens_to_ids(self, tokens):
        return [self.tokenizer.encode_single_token(token) for token in tokens]

    def ids_to_tokens(self, token_ids):
        tokens = []
        for token_id in token_ids:
            if token_id < self.num_special_tokens:
                tokens.append(self.special_tokens[token_id])
            else:
                token_id -= self.num_special_tokens
                token_bytes = self.tokenizer.decode_single_token_bytes(token_id)
                tokens.append(token_bytes.decode('utf-8', errors='replace'))
        return tokens

    def text_to_ids(self, text: str):
        tokens = self.tokenizer.encode(text)
        tokens = [t + self.num_special_tokens for t in tokens]
        return tokens

    def ids_to_text(self, tokens: List[int]):
        # Filter out special tokens and adjust the remaining tokens
        adjusted_tokens = [
            t - self.num_special_tokens
            for t in tokens
            if t not in {self.bos_id, self.eos_id} and t >= self.num_special_tokens
        ]

        # Decode only if there are tokens left after filtering
        if adjusted_tokens:
            return self.tokenizer.decode(adjusted_tokens)
        else:
            return ""  # Return an empty string if all tokens were filtered out

    @property
    def bos_id(self):
        return self._bos_id

    @property
    def eos_id(self):
        return self._eos_id

    @property
    def unk_id(self):
        return self._unk_id

    @property
    def vocab(self):
        return self.token2id

    @property
    def decoder(self):
        return self.shifted_id2token

    @property
    def encoder(self):
        return self.vocab

    @property
    def vocab_size(self) -> int:
        return self._vocab_size
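Two properties of this file are worth demonstrating. First, the asserts in reload_mergeable_ranks pin down the vocab layout: entry i carries rank i, and the first 256 entries must be the raw single bytes 0..255, base64-encoded. A sketch of a toy vocab that passes those checks (toy_vocab.json is a hypothetical path, not part of the commit):

    import base64
    import json

    from nemo.collections.common.tokenizers.tiktoken_tokenizer import (
        TiktokenTokenizer,
        reload_mergeable_ranks,
    )

    entries = []
    for i in range(256):
        b = bytes([i])  # entries below rank 256 must decode to the single byte i
        entries.append(
            {
                "rank": i,
                "token_bytes": base64.b64encode(b).decode("ascii"),
                "token_str": b.decode("utf-8", errors="replace"),
            }
        )
    # one learned BPE merge stacked on top of the byte alphabet
    entries.append({"rank": 256, "token_bytes": base64.b64encode(b"th").decode("ascii"), "token_str": "th"})

    with open("toy_vocab.json", "w") as f:
        json.dump(entries, f)

    ranks = reload_mergeable_ranks("toy_vocab.json")  # {b'\x00': 0, ..., b'th': 256}

Second, ids are laid out so special tokens occupy [0, num_special_tokens) and every BPE id is shifted up by num_special_tokens, so the two ranges never collide; text_to_ids and ids_to_text are inverses up to special-token filtering. A round-trip sketch (the vocab path is a placeholder for a real Nemotron-style vocab):

    tok = TiktokenTokenizer(vocab_file="/path/to/tokenizer_vocab.json")

    ids = tok.text_to_ids("Hello world")         # every id here is >= tok.num_special_tokens
    framed = [tok.bos_id] + ids + [tok.eos_id]   # bos=1, eos=2; the Encoding itself carries no special tokens
    assert tok.ids_to_text(framed) == "Hello world"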

nemo/collections/nlp/modules/common/tokenizer_utils.py (+5)

@@ -22,6 +22,7 @@
 from nemo.collections.common.tokenizers.huggingface.auto_tokenizer import AutoTokenizer
 from nemo.collections.common.tokenizers.regex_tokenizer import RegExTokenizer
 from nemo.collections.common.tokenizers.tabular_tokenizer import TabularTokenizer
+from nemo.collections.common.tokenizers.tiktoken_tokenizer import TiktokenTokenizer
 from nemo.collections.common.tokenizers.word_tokenizer import WordTokenizer
 from nemo.collections.nlp.modules.common.huggingface.huggingface_utils import get_huggingface_pretrained_lm_models_list
 from nemo.collections.nlp.modules.common.lm_utils import get_pretrained_lm_models_list

@@ -122,6 +123,8 @@ def get_tokenizer(
             legacy=True,
             chat_template=chat_template,
         )
+    elif tokenizer_name == 'tiktoken':
+        return nemo.collections.common.tokenizers.tiktoken_tokenizer.TiktokenTokenizer(vocab_file=vocab_file)
     elif tokenizer_name == 'word':
         return WordTokenizer(vocab_file=vocab_file, **special_tokens_dict)
     elif tokenizer_name == 'char':

@@ -221,6 +224,8 @@ def get_nmt_tokenizer(
         )
     elif library == 'tabular':
         return TabularTokenizer(vocab_file, delimiter=delimiter)
+    elif library == 'tiktoken':
+        return TiktokenTokenizer(vocab_file=vocab_file)
     else:
         raise NotImplementedError(
             'Currently we only support "huggingface", "sentencepiece", "megatron", and "byte-level" tokenizer'
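Both factory entry points now reach the new tokenizer. A minimal sketch of the get_nmt_tokenizer route (the vocab path is a placeholder):

    from nemo.collections.nlp.modules.common.tokenizer_utils import get_nmt_tokenizer

    tokenizer = get_nmt_tokenizer(library='tiktoken', vocab_file='/path/to/tokenizer_vocab.json')
    ids = tokenizer.text_to_ids("Hello world")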

nemo/export/multimodal/run.py (-1)

@@ -80,7 +80,6 @@ def init_tokenizer(self, llm_engine_dir):

         self.tokenizer = AutoTokenizer.from_pretrained(os.path.join(llm_engine_dir, 'huggingface_tokenizer'))
         self.tokenizer.pad_token = self.tokenizer.eos_token
-
         if self.model_type == 'vita':
             self.tokenizer.im_start_id = self.tokenizer.convert_tokens_to_ids("<extra_id_4>")
             self.tokenizer.im_end_id = self.tokenizer.convert_tokens_to_ids("<extra_id_5>")

requirements/requirements_nlp.txt (+1)

@@ -20,4 +20,5 @@ rouge_score
 sacrebleu # manually install sacrebleu[ja] for Japanese support; MeCab is unsupported in Python 3.11+
 sentence_transformers
 tensorstore<0.1.46
+tiktoken==0.7.0
 zarr
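To confirm the pinned version is what actually resolved in the environment, the standard library's importlib.metadata suffices (a quick check, not part of the commit):

    from importlib.metadata import version

    assert version("tiktoken") == "0.7.0"  # matches the pin added above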
