Add PP support in NeVA along with few bug fixes #11170

yaoyu-33 · 2024-11-05T17:53:49Z

What does this PR do?

Adds support for pipeline parallelism (PP) in NeVA training. Two options are supplied:

(1) The ViT shares the first PP stage with the LLM (encoder_pp=0).
(2) The ViT takes the entire first PP stage to itself (encoder_pp=1).
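To make the difference between the two options concrete, here is a minimal sketch of how the modules could be laid out across pipeline stages. It is not NeMo's partitioning code: `plan_stages`, its parameters, the even layer split, and the convention that `num_stages` counts the ViT-only stage are all assumptions made for illustration.

```python
from typing import Dict, List


def plan_stages(num_llm_layers: int, num_stages: int, encoder_pp: int) -> List[Dict[str, object]]:
    """Illustrative stage layout only (hypothetical helper, not a NeMo API).

    num_stages is assumed to be the total number of pipeline stages,
    including the ViT-only stage when encoder_pp == 1.
    """
    if encoder_pp not in (0, 1):
        raise ValueError("this sketch only covers encoder_pp=0 or encoder_pp=1")
    llm_stages = num_stages - encoder_pp
    if llm_stages < 1 or num_llm_layers % llm_stages != 0:
        raise ValueError("LLM layers must split evenly across the LLM stages")
    per_stage = num_llm_layers // llm_stages

    stages: List[Dict[str, object]] = []
    if encoder_pp == 1:
        # Option 2: the ViT owns stage 0 and no LLM layers live there.
        stages.append({"vit": True, "llm_layers": []})
    for i in range(llm_stages):
        layers = list(range(i * per_stage, (i + 1) * per_stage))
        # Option 1: the ViT is co-located with the first chunk of LLM layers.
        has_vit = encoder_pp == 0 and i == 0
        stages.append({"vit": has_vit, "llm_layers": layers})
    return stages


if __name__ == "__main__":
    for encoder_pp, num_stages in ((0, 2), (1, 3)):
        print(f"encoder_pp={encoder_pp}")
        for idx, stage in enumerate(plan_stages(8, num_stages, encoder_pp)):
            print(f"  stage {idx}: {stage}")
```

In this sketch, sharing (encoder_pp=0) keeps the stage count equal to the number of LLM chunks, while the dedicated layout (encoder_pp=1) spends one extra stage on the vision encoder alone.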

Collection: [multimodal]

Signed-off-by: yaoyu-33 <[email protected]>

# Conflicts:
#   nemo/collections/multimodal/data/energon/base.py
#   nemo/collections/vlm/__init__.py
#   nemo/collections/vlm/mllama/data/mock.py
#   nemo/collections/vlm/mllama/model/base.py
#   nemo/collections/vlm/neva/data/__init__.py
#   nemo/collections/vlm/neva/data/conversation.py
#   nemo/collections/vlm/neva/data/lazy.py
#   nemo/collections/vlm/peft/lora.py
#   nemo/collections/vlm/recipes/__init__.py

examples/vlm/neva_finetune.py

examples/vlm/neva_finetune_70b.py

examples/vlm/neva_pretrain.py

cuichenx

LGTM

github-actions · 2024-11-20T23:40:08Z

beep boop 🤖: 🚨 The following files must be fixed before merge!

Your code was analyzed with PyLint. The following annotations have been identified:


------------------------------------
Your code has been rated at 10.00/10

Thank you for improving NeMo's documentation!

github-actions · 2024-11-20T23:40:12Z

beep boop 🤖: 🙏 The following files have warnings. In case you are familiar with these, please try helping us to improve the code base.

Your code was analyzed with PyLint. The following annotations have been identified:

************* Module nemo.collections.vlm.neva.data.llava_next_energon
nemo/collections/vlm/neva/data/llava_next_energon.py:28:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/neva/data/llava_next_energon.py:33:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/neva/data/llava_next_energon.py:37:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/neva/data/llava_next_energon.py:104:0: C0115: Missing class docstring (missing-class-docstring)
************* Module nemo.collections.llm.fn.activation
nemo/collections/llm/fn/activation.py:25:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/fn/activation.py:30:0: C0116: Missing function or method docstring (missing-function-docstring)
************* Module nemo.collections.llm.gpt.model.base
nemo/collections/llm/gpt/model/base.py:226:0: C0301: Line too long (122/119) (line-too-long)
nemo/collections/llm/gpt/model/base.py:51:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:94:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:115:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:123:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:131:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:139:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:149:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:159:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/base.py:182:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:252:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/base.py:264:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/base.py:276:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/base.py:288:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/base.py:300:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/base.py:312:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/base.py:326:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/base.py:344:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:348:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:371:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:374:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:377:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:381:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:386:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:407:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:414:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:421:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:449:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/base.py:43:4: W0611: Unused import fused_weight_gradient_mlp_cuda (unused-import)
************* Module nemo.collections.llm.gpt.model.llama
nemo/collections/llm/gpt/model/llama.py:483:0: C0301: Line too long (158/119) (line-too-long)
nemo/collections/llm/gpt/model/llama.py:42:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:62:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:71:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:80:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:89:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:111:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:132:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:142:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:154:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:164:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:175:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:186:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:192:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:198:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:209:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:213:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:225:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:245:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/llama.py:259:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/llama.py:265:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/llama.py:302:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/llama.py:321:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/llama.py:338:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/llama.py:342:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/llama.py:475:0: C0116: Missing function or method docstring (missing-function-docstring)
************* Module nemo.collections.llm.gpt.model.ssm
nemo/collections/llm/gpt/model/ssm.py:190:0: C0301: Line too long (129/119) (line-too-long)
nemo/collections/llm/gpt/model/ssm.py:193:0: C0301: Line too long (131/119) (line-too-long)
nemo/collections/llm/gpt/model/ssm.py:194:0: C0301: Line too long (151/119) (line-too-long)
nemo/collections/llm/gpt/model/ssm.py:195:0: C0301: Line too long (129/119) (line-too-long)
nemo/collections/llm/gpt/model/ssm.py:39:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/ssm.py:51:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/ssm.py:89:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/ssm.py:110:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/ssm.py:127:8: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/ssm.py:131:12: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/ssm.py:134:12: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/ssm.py:155:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/ssm.py:203:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/ssm.py:216:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/ssm.py:221:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/ssm.py:235:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/ssm.py:249:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/ssm.py:263:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/ssm.py:277:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/ssm.py:291:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/ssm.py:305:0: C0115: Missing class docstring (missing-class-docstring)
************* Module nemo.collections.vlm.mllama.data.mock
nemo/collections/vlm/mllama/data/mock.py:27:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/mllama/data/mock.py:68:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/data/mock.py:79:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/data/mock.py:84:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/data/mock.py:89:4: C0116: Missing function or method docstring (missing-function-docstring)
************* Module nemo.collections.vlm.mllama.model.base
nemo/collections/vlm/mllama/model/base.py:50:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:99:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:116:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:121:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/mllama/model/base.py:152:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:155:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:162:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/mllama/model/base.py:227:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/mllama/model/base.py:248:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:276:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/mllama/model/base.py:305:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:315:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:319:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/mllama/model/base.py:361:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:364:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:398:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:491:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/mllama/model/base.py:508:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:512:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:541:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:544:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:547:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:551:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:557:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/mllama/model/base.py:564:4: C0116: Missing function or method docstring (missing-function-docstring)
************* Module nemo.collections.vlm.neva.data.conversation
nemo/collections/vlm/neva/data/conversation.py:93:0: C0301: Line too long (255/119) (line-too-long)
nemo/collections/vlm/neva/data/conversation.py:135:0: C0301: Line too long (198/119) (line-too-long)
nemo/collections/vlm/neva/data/conversation.py:358:0: C0301: Line too long (176/119) (line-too-long)
nemo/collections/vlm/neva/data/conversation.py:426:0: C0301: Line too long (314/119) (line-too-long)
nemo/collections/vlm/neva/data/conversation.py:428:0: C0301: Line too long (210/119) (line-too-long)
nemo/collections/vlm/neva/data/conversation.py:571:0: C0301: Line too long (152/119) (line-too-long)
nemo/collections/vlm/neva/data/conversation.py:595:0: C0301: Line too long (152/119) (line-too-long)
nemo/collections/vlm/neva/data/conversation.py:620:0: C0301: Line too long (152/119) (line-too-long)
nemo/collections/vlm/neva/data/conversation.py:63:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/conversation.py:67:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/conversation.py:78:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/conversation.py:237:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/conversation.py:240:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/conversation.py:286:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/conversation.py:300:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/conversation.py:324:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/conversation.py:336:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/conversation.py:18:0: W0611: Unused defaultdict imported from collections (unused-import)
************* Module nemo.collections.vlm.neva.data.lazy
nemo/collections/vlm/neva/data/lazy.py:564:0: C0301: Line too long (128/119) (line-too-long)
nemo/collections/vlm/neva/data/lazy.py:67:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/lazy.py:72:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/lazy.py:115:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/lazy.py:120:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/lazy.py:134:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/lazy.py:161:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/lazy.py:236:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/lazy.py:246:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/neva/data/lazy.py:418:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/lazy.py:493:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/neva/data/lazy.py:557:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/lazy.py:568:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/lazy.py:571:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/lazy.py:574:4: C0116: Missing function or method docstring (missing-function-docstring)
************* Module nemo.collections.vlm.neva.data.mock
nemo/collections/vlm/neva/data/mock.py:29:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/neva/data/mock.py:72:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/mock.py:83:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/mock.py:88:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/data/mock.py:93:4: C0116: Missing function or method docstring (missing-function-docstring)
************* Module nemo.collections.vlm.neva.model.api
nemo/collections/vlm/neva/model/api.py:20:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/api.py:24:0: C0116: Missing function or method docstring (missing-function-docstring)
************* Module nemo.collections.vlm.neva.model.base
nemo/collections/vlm/neva/model/base.py:470:0: C0301: Line too long (139/119) (line-too-long)
nemo/collections/vlm/neva/model/base.py:473:0: C0301: Line too long (128/119) (line-too-long)
nemo/collections/vlm/neva/model/base.py:481:0: C0301: Line too long (125/119) (line-too-long)
nemo/collections/vlm/neva/model/base.py:490:0: C0301: Line too long (122/119) (line-too-long)
nemo/collections/vlm/neva/model/base.py:93:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/base.py:134:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/base.py:151:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/base.py:172:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/base.py:227:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/base.py:242:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/neva/model/base.py:261:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/base.py:282:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/neva/model/base.py:313:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/base.py:351:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/base.py:364:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/neva/model/base.py:619:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/neva/model/base.py:636:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/base.py:640:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/base.py:664:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/base.py:667:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/base.py:670:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/base.py:674:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/base.py:680:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/base.py:687:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/base.py:27:0: W0611: Unused TEDotProductAttention imported from megatron.core.extensions.transformer_engine (unused-import)
************* Module nemo.collections.vlm.neva.model.llava
nemo/collections/vlm/neva/model/llava.py:108:0: C0301: Line too long (137/119) (line-too-long)
nemo/collections/vlm/neva/model/llava.py:109:0: C0301: Line too long (122/119) (line-too-long)
nemo/collections/vlm/neva/model/llava.py:110:0: C0301: Line too long (146/119) (line-too-long)
nemo/collections/vlm/neva/model/llava.py:111:0: C0301: Line too long (144/119) (line-too-long)
nemo/collections/vlm/neva/model/llava.py:146:0: C0301: Line too long (128/119) (line-too-long)
nemo/collections/vlm/neva/model/llava.py:147:0: C0301: Line too long (161/119) (line-too-long)
nemo/collections/vlm/neva/model/llava.py:148:0: C0301: Line too long (157/119) (line-too-long)
nemo/collections/vlm/neva/model/llava.py:149:0: C0301: Line too long (150/119) (line-too-long)
nemo/collections/vlm/neva/model/llava.py:150:0: C0301: Line too long (146/119) (line-too-long)
nemo/collections/vlm/neva/model/llava.py:151:0: C0301: Line too long (158/119) (line-too-long)
nemo/collections/vlm/neva/model/llava.py:152:0: C0301: Line too long (154/119) (line-too-long)
nemo/collections/vlm/neva/model/llava.py:153:0: C0301: Line too long (135/119) (line-too-long)
nemo/collections/vlm/neva/model/llava.py:154:0: C0301: Line too long (131/119) (line-too-long)
nemo/collections/vlm/neva/model/llava.py:155:0: C0301: Line too long (135/119) (line-too-long)
nemo/collections/vlm/neva/model/llava.py:156:0: C0301: Line too long (131/119) (line-too-long)
nemo/collections/vlm/neva/model/llava.py:41:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/neva/model/llava.py:46:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/neva/model/llava.py:59:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/neva/model/llava.py:71:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/neva/model/llava.py:83:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/vlm/neva/model/llava.py:105:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/llava.py:177:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/llava.py:183:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/vlm/neva/model/llava.py:223:0: C0116: Missing function or method docstring (missing-function-docstring)
************* Module nemo.lightning.io.state
nemo/lightning/io/state.py:31:0: C0115: Missing class docstring (missing-class-docstring)
nemo/lightning/io/state.py:334:4: C0116: Missing function or method docstring (missing-function-docstring)

-----------------------------------
Your code has been rated at 9.17/10

Thank you for improving NeMo's documentation!
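For triage, the annotations above can be tallied by message type. The following is a minimal sketch, assuming the `path:line:col: code: message (symbol)` layout seen in this log; `PYLINT_LINE` and `tally_by_symbol` are hypothetical names, not part of PyLint or the NeMo CI.

```python
import re
from collections import Counter

# Matches annotation lines such as
# "pkg/mod.py:28:0: C0115: Missing class docstring (missing-class-docstring)".
# Module banner lines ("************* Module ...") do not match and are skipped.
PYLINT_LINE = re.compile(
    r"^(?P<path>[^:\s]+):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<code>[A-Z]\d{4}): (?P<msg>.*) \((?P<symbol>[a-z0-9-]+)\)$"
)


def tally_by_symbol(report: str) -> Counter:
    """Count annotations per PyLint symbol in a text report."""
    counts: Counter = Counter()
    for raw in report.splitlines():
        match = PYLINT_LINE.match(raw.strip())
        if match:
            counts[match.group("symbol")] += 1
    return counts


if __name__ == "__main__":
    sample = "\n".join(
        [
            "************* Module nemo.collections.llm.gpt.model.base",
            "nemo/collections/llm/gpt/model/base.py:226:0: C0301: Line too long (122/119) (line-too-long)",
            "nemo/collections/llm/gpt/model/base.py:51:0: C0116: Missing function or method docstring (missing-function-docstring)",
        ]
    )
    for symbol, count in tally_by_symbol(sample).most_common():
        print(symbol, count)
```

Pasting the full bot comment into `tally_by_symbol` gives a per-symbol count, which makes it easier to see whether the warnings are dominated by missing docstrings or by long lines.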

github-actions · 2024-11-21T02:52:19Z

[🤖]: Hi @yaoyu-33 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully.

So it might be time to merge this PR or get some approvals.

I'm just a bot, so I'll leave it up to you what to do next.

//cc @pablo-garay @ko3n1g

* evian3 update Signed-off-by: yaoyu-33 <[email protected]>
* add encoder parallel default config Signed-off-by: yaoyu-33 <[email protected]>
* add encoder parallel default config Signed-off-by: yaoyu-33 <[email protected]>
* clean up Signed-off-by: yaoyu-33 <[email protected]>
* add aspect ratio in model
* support energon dataloader
* some pp update Signed-off-by: yaoyu-33 <[email protected]>
* fixes Signed-off-by: yaoyu-33 <[email protected]>
* fix kv merging Signed-off-by: yaoyu-33 <[email protected]>
* fix get_key_value_tensors Signed-off-by: yaoyu-33 <[email protected]>
* rename files Signed-off-by: yaoyu-33 <[email protected]>
* update to HF style position embedding Signed-off-by: yaoyu-33 <[email protected]>
* fix energon dataloader and support batching
* update forward args Signed-off-by: yaoyu-33 <[email protected]>
* clean up and move to aspect_ratio_ids Signed-off-by: yaoyu-33 <[email protected]>
* rename back to language.py Signed-off-by: yaoyu-33 <[email protected]>
* fix loss function Signed-off-by: yaoyu-33 <[email protected]>
* update and fix energon Signed-off-by: yaoyu-33 <[email protected]>
* Add hf import
* Fix type
* Change config
* update energon pretrain Signed-off-by: yaoyu-33 <[email protected]>
* clean up
* clean up
* reformat Signed-off-by: yaoyu-33 <[email protected]>
* update inference files for new code
* update to instruct
* update to instruct
* update few names Signed-off-by: yaoyu-33 <[email protected]>
* update generation Signed-off-by: yaoyu-33 <[email protected]>
* fix importer embedding.weight
* few fixes Signed-off-by: yaoyu-33 <[email protected]>
* add hf script Signed-off-by: yaoyu-33 <[email protected]>
* fix kv import
* remove interleaved
* fixes and updates Signed-off-by: yaoyu-33 <[email protected]>
* lora fixes Signed-off-by: yaoyu-33 <[email protected]>
* some code clean ups Signed-off-by: yaoyu-33 <[email protected]>
* update training scripts Signed-off-by: yaoyu-33 <[email protected]>
* refactors Signed-off-by: yaoyu-33 <[email protected]>
* add LoRA finetuning
* fixes and nemo update Signed-off-by: yaoyu-33 <[email protected]>
* fix importer registering issue by adding 11B and 90B configs
* update `decoder_seq_len` Signed-off-by: yaoyu-33 <[email protected]>
* science vqa script Signed-off-by: yaoyu-33 <[email protected]>
* clean up script name Signed-off-by: yaoyu-33 <[email protected]>
* fix ckpt save serialization issue
* fix predefined config classes
* add num_chunks in input Signed-off-by: yaoyu-33 <[email protected]>
* fix format Signed-off-by: yaoyu-33 <[email protected]>
* update finetuning scripts for PEFT
* add 11b recipe (need NVIDIA#10645 to test)
* fix mask generation Signed-off-by: yaoyu-33 <[email protected]>
* minor fix code style Signed-off-by: yaoyu-33 <[email protected]>
* Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]>
* Support no image inference
* add llama svqa eval
* fix masking Signed-off-by: yaoyu-33 <[email protected]>
* Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]>
* fix generation Signed-off-by: yaoyu-33 <[email protected]>
* Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]>
* add 90b recipe and revise 11b recipe
* Apply isort and black reformatting Signed-off-by: cuichenx <[email protected]>
* clean up typing
* add option to disable vision padding
* Apply isort and black reformatting Signed-off-by: cuichenx <[email protected]>
* base model finetuning (does not work yet)
* Apply isort and black reformatting Signed-off-by: cuichenx <[email protected]>
* fixed default conversation template config for MLLama
* Update svqa
* add multinode
* bot happy
* Apply isort and black reformatting Signed-off-by: cuichenx <[email protected]>
* Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]>
* Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]>
* Apply isort and black reformatting Signed-off-by: artbataev <[email protected]>
* Perf improvements. Mainly from XAttn mask calculation (NVIDIA#10901)
  * Perf improvements. Mainly from XAttn mask calculation
  * Apply isort and black reformatting Signed-off-by: parthmannan <[email protected]>
  --------- Signed-off-by: parthmannan <[email protected]> Co-authored-by: parthmannan <[email protected]>
* fix existing issues Signed-off-by: yaoyu-33 <[email protected]>
* fix scripts Signed-off-by: yaoyu-33 <[email protected]>
* Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]>
* fix lora
* few fixes for non image support Signed-off-by: yaoyu-33 <[email protected]>
* update masking gen Signed-off-by: yaoyu-33 <[email protected]>
* update lazy dataset Signed-off-by: yaoyu-33 <[email protected]>
* fix data sampler and loading issue Signed-off-by: yaoyu-33 <[email protected]>
* Add vlm generation
* Apply isort and black reformatting Signed-off-by: meatybobby <[email protected]>
* Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]>
* generation update Signed-off-by: yaoyu-33 <[email protected]>
* update lazy dataset Signed-off-by: yaoyu-33 <[email protected]>
* Fix _strategy_lib.py Signed-off-by: Yu Yao <[email protected]>
* Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]>
* fix warning Signed-off-by: yaoyu-33 <[email protected]>
* hide vlm examples Signed-off-by: yaoyu-33 <[email protected]>
* Revert "Add vlm generation" This reverts commit 4711c75 Signed-off-by: yaoyu-33 <[email protected]>
* Fix VisionEncoder multi-batch bug
* update mcore parallelism initialization Signed-off-by: yaoyu-33 <[email protected]>
* Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]>
* Update megatron_init.py Signed-off-by: Yu Yao <[email protected]>
* add encoder parallel default config Signed-off-by: yaoyu-33 <[email protected]>
* Fix _strategy_lib.py Signed-off-by: Yu Yao <[email protected]>
* llm.generate fixes (NVIDIA#10983)
  * fix context path, disable optimizer init, add tp Signed-off-by: HuiyingLi <[email protected]>
  * format Signed-off-by: HuiyingLi <[email protected]>
  * address comments, require user to provide trainer Signed-off-by: HuiyingLi <[email protected]>
  * minor fix Signed-off-by: HuiyingLi <[email protected]>
  * minor fixes Signed-off-by: HuiyingLi <[email protected]>
  --------- Signed-off-by: HuiyingLi <[email protected]>
* use __dict__ in check (NVIDIA#11012)
  * check is_hf_model in leaf module Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * Apply isort and black reformatting Signed-off-by: akoumpa <[email protected]>
  * disable getattr alternative path Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * fix Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * undo; Signed-off-by: Alexandros Koumparoulis <[email protected]>
  --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: akoumpa <[email protected]> Co-authored-by: akoumpa <[email protected]>
* LoRA support for HF::AutoModelForCausalLM (NVIDIA#10982)
  * add LinearAdapter Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * add hf lora example Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * remove unused imports Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * fix Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * fix Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * subclass mixin Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * remove stale imports Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * undo Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * fix scale Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * regex selector for peft Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * move lora Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * fmt Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * hf_auto_model_for_causal_lm finetune recipe Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * Apply isort and black reformatting Signed-off-by: akoumpa <[email protected]>
  --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: akoumpa <[email protected]> Co-authored-by: akoumpa <[email protected]>
* Change default for always_save_context to True (NVIDIA#11014) Signed-off-by: Abhishree <[email protected]> Co-authored-by: Pablo Garay <[email protected]>
* Add a build option to load_context (NVIDIA#10713)
  * Add a build option to load_context Signed-off-by: Marc Romeijn <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * Adding test Signed-off-by: Marc Romeijn <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * Trying to fix failing CPU test Signed-off-by: Marc Romeijn <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]>
  * cherry-pick fix Signed-off-by: Alexandros Koumparoulis <[email protected]>
  --------- Signed-off-by: Marc Romeijn <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]>
* Fix pip install (NVIDIA#11026)
  * Move AutoTokenizer inline Signed-off-by: Marc Romeyn <[email protected]>
  * Move einops to common requirements Signed-off-by: Marc Romeyn <[email protected]>
  * Move AutoTokenizer import to top-level again in fine_tuning Signed-off-by: Marc Romeyn <[email protected]>
  * Move megatron init inside nemo.lightning Signed-off-by: Marc Romeyn <[email protected]>
  * Make megatron_lazy_init_context work when transformer-engine is not installed Signed-off-by: Marc Romeyn <[email protected]>
  * Only import get_nmt_tokenizer when needed Signed-off-by: Marc Romeyn <[email protected]>
  * Apply isort and black reformatting Signed-off-by: marcromeyn <[email protected]>
  --------- Signed-off-by: Marc Romeyn <[email protected]> Signed-off-by: marcromeyn <[email protected]> Co-authored-by: marcromeyn <[email
protected]> * [WIP] Add docs for NEST SSL (NVIDIA#10804) * add docs Signed-off-by: stevehuang52 <[email protected]> * update doc and fix missing param Signed-off-by: stevehuang52 <[email protected]> --------- Signed-off-by: stevehuang52 <[email protected]> * Change dist ckpt defaults (NVIDIA#10913) * Enable ckpt features by default (async ckpt), ckpt every 15mins and reduce preemption time to 1min Signed-off-by: Shriya Palsamudram <[email protected]> * fix ssm tests Signed-off-by: Shriya Palsamudram <[email protected]> * Make note that ckpt_async_save is disabled for SSMs Signed-off-by: Shriya Palsamudram <[email protected]> * Enable async ckpt for SSMs with fix Signed-off-by: Shriya Palsamudram <[email protected]> * Disable async ckpt in the peft test as it is a known bug, add note. Signed-off-by: Shriya Palsamudram <[email protected]> * Fix failing unit tests Signed-off-by: Shriya Palsamudram <[email protected]> * Ashors/peft async ckpt (NVIDIA#11010) * [WIP] prototype for supporting async checkpointing with peft Signed-off-by: ashors1 <[email protected]> Signed-off-by: Shriya Palsamudram <[email protected]> * Enable async ckpt for the peft test Signed-off-by: Shriya Palsamudram <[email protected]> * Fix peft setup test Signed-off-by: Shriya Palsamudram <[email protected]> --------- Signed-off-by: Shriya Palsamudram <[email protected]> Signed-off-by: ashors1 <[email protected]> Co-authored-by: ataghibakhsh <[email protected]> * Akoumparouli/mixtral recipe fix r2.0.0 (NVIDIA#10994) * Mixtral TP8 EP1 Signed-off-by: Alexandros Koumparoulis <[email protected]> * Apply isort and black reformatting Signed-off-by: akoumpa <[email protected]> --------- Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: akoumpa <[email protected]> Co-authored-by: akoumpa <[email protected]> * Fix _strategy_lib tests (NVIDIA#11033) * fix world size and don't mock Signed-off-by: Maanu Grover <[email protected]> * cleanup global state Signed-off-by: Maanu Grover 
<[email protected]> * check app state instead Signed-off-by: Maanu Grover <[email protected]> * fix syntax nemo logger test Signed-off-by: Maanu Grover <[email protected]> --------- Signed-off-by: Maanu Grover <[email protected]> * Update `BaseMegatronSampler` for compatibility with PTL's `_BatchProgress` (NVIDIA#11016) * Revert "[NeMo-UX] Use custom `BatchProgress` class which does not restore states (NVIDIA#10383)" This reverts commit b5798de. * make megatron sampler return the total number of batches in the dataset Signed-off-by: ashors1 <[email protected]> --------- Signed-off-by: ashors1 <[email protected]> * PTQ example for NeMo 2.0 (NVIDIA#10642) * initial commit Signed-off-by: Piotr Kaminski <[email protected]> * create Quantizer for NeMo 2.0 Signed-off-by: Piotr Kaminski <[email protected]> * refactor Signed-off-by: Piotr Kaminski <[email protected]> * Call quantize on an unwrapped mcore model Signed-off-by: Piotr Kaminski <[email protected]> * Apply isort and black reformatting Signed-off-by: Laplasjan107 <[email protected]> * Add tests, adjust unwrapping Signed-off-by: Piotr Kaminski <[email protected]> * Apply isort and black reformatting Signed-off-by: Laplasjan107 <[email protected]> * fix export Signed-off-by: Piotr Kaminski <[email protected]> * Apply isort and black reformatting Signed-off-by: Laplasjan107 <[email protected]> * Apply isort and black reformatting Signed-off-by: artbataev <[email protected]> * Fix output_path argument for HF import Signed-off-by: Piotr Kamiński <[email protected]> * fix fabric ckpt loading Signed-off-by: Piotr Kaminski <[email protected]> * Apply isort and black reformatting Signed-off-by: Laplasjan107 <[email protected]> * code review suggestions Signed-off-by: Piotr Kaminski <[email protected]> * Apply isort and black reformatting Signed-off-by: Laplasjan107 <[email protected]> * remove unused import Signed-off-by: Piotr Kaminski <[email protected]> * use cnn dataset in github ci Signed-off-by: Piotr Kaminski 
<[email protected]> * applied code review Signed-off-by: Piotr Kaminski <[email protected]> * code review changes Signed-off-by: Piotr Kaminski <[email protected]> * Apply isort and black reformatting Signed-off-by: Laplasjan107 <[email protected]> * simplify interface for data iterator Signed-off-by: Piotr Kaminski <[email protected]> * Apply isort and black reformatting Signed-off-by: Laplasjan107 <[email protected]> * (partial) PP fix Signed-off-by: Piotr Kaminski <[email protected]> * Apply isort and black reformatting Signed-off-by: Laplasjan107 <[email protected]> --------- Signed-off-by: Piotr Kaminski <[email protected]> Signed-off-by: Laplasjan107 <[email protected]> Signed-off-by: Piotr Kamiński <[email protected]> Signed-off-by: artbataev <[email protected]> Co-authored-by: Piotr Kaminski <[email protected]> Co-authored-by: Laplasjan107 <[email protected]> Co-authored-by: artbataev <[email protected]> * TDT compute timestamps option and Extra Whitespace handling for SPE (NVIDIA#10875) * add token duration Signed-off-by: monica-sekoyan <[email protected]> * revert rnnt change Signed-off-by: monica-sekoyan <[email protected]> * add remove_extra_whitespaces arg to spe tokenizer Signed-off-by: monica-sekoyan <[email protected]> * add token duration retrieval Signed-off-by: monica-sekoyan <[email protected]> * add ignore_extra_whitespace to spe Signed-off-by: monica-sekoyan <[email protected]> * add compute_timestamp support for tdt Signed-off-by: monica-sekoyan <[email protected]> * fix config field name Signed-off-by: monica-sekoyan <[email protected]> * add refinement for tdt timestamps Signed-off-by: monica-sekoyan <[email protected]> * add segments timestamp support and refinement for ctc Signed-off-by: monica-sekoyan <[email protected]> * modify tests for ctc decoding timestamps Signed-off-by: monica-sekoyan <[email protected]> * add rnnt timestamp tests Signed-off-by: monica-sekoyan <[email protected]> * updated doc Signed-off-by: monica-sekoyan 
<[email protected]> * fix in test Signed-off-by: monica-sekoyan <[email protected]> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <[email protected]> * fix of unicode char Signed-off-by: monica-sekoyan <[email protected]> * fix rnnt_decoding test Signed-off-by: monica-sekoyan <[email protected]> * workaround for tesst tokenizer Signed-off-by: monica-sekoyan <[email protected]> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <[email protected]> * modify segments formation Signed-off-by: monica-sekoyan <[email protected]> * modify segments for ctc Signed-off-by: monica-sekoyan <[email protected]> * fix in ctc refinement Signed-off-by: monica-sekoyan <[email protected]> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <[email protected]> * minor changes Signed-off-by: monica-sekoyan <[email protected]> * reverse offset change Signed-off-by: monica-sekoyan <[email protected]> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <[email protected]> * warning mode=once Signed-off-by: monica-sekoyan <[email protected]> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <[email protected]> * make ignore_extrawhitespaces false Signed-off-by: monica-sekoyan <[email protected]> * minor changes Signed-off-by: monica-sekoyan <[email protected]> * adjust changes to the tests Signed-off-by: monica-sekoyan <[email protected]> * modify prompt_formatter tests Signed-off-by: monica-sekoyan <[email protected]> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <[email protected]> --------- Signed-off-by: monica-sekoyan <[email protected]> Signed-off-by: monica-sekoyan <[email protected]> Co-authored-by: monica-sekoyan <[email protected]> * Basic online dynamic FP8 quantization with vLLM (NVIDIA#10904) * Basic online dynamic quantization with vLLM Signed-off-by: Jan Lasek <[email protected]> * Apply isort and black reformatting Signed-off-by: janekl <[email protected]> * vllm 0.6.3 
updates Signed-off-by: Jan Lasek <[email protected]> * Pass quantization param in deploy_vllm_triton.py script Signed-off-by: Jan Lasek <[email protected]> --------- Signed-off-by: Jan Lasek <[email protected]> Signed-off-by: janekl <[email protected]> Co-authored-by: janekl <[email protected]> * ci: Improve VM maintenance (NVIDIA#10758) * ci: Improve VM maintenance Signed-off-by: Oliver Koenig <[email protected]> * rename stuff Signed-off-by: Oliver Koenig <[email protected]> * title Signed-off-by: Oliver Koenig <[email protected]> * use team Signed-off-by: Oliver Koenig <[email protected]> * run on failure too Signed-off-by: Oliver Koenig <[email protected]> * fix Signed-off-by: Oliver Koenig <[email protected]> * yrdy Signed-off-by: Oliver Koenig <[email protected]> * f Signed-off-by: Oliver Koenig <[email protected]> * test Signed-off-by: Oliver Koenig <[email protected]> * fix Signed-off-by: Oliver Koenig <[email protected]> * f Signed-off-by: Oliver Koenig <[email protected]> * f Signed-off-by: Oliver Koenig <[email protected]> * f Signed-off-by: Oliver Koenig <[email protected]> --------- Signed-off-by: Oliver Koenig <[email protected]> * neva update Signed-off-by: yaoyu-33 <[email protected]> * Add comment for vision transpose * update megatron_init.py inside lightning Signed-off-by: yaoyu-33 <[email protected]> * Fix PP Signed-off-by: yaoyu-33 <[email protected]> * add examples Signed-off-by: yaoyu-33 <[email protected]> * fix test Signed-off-by: yaoyu-33 <[email protected]> * try fix test Signed-off-by: yaoyu-33 <[email protected]> * try fix test Signed-off-by: yaoyu-33 <[email protected]> * Fix megatron megatron_init.py dp Signed-off-by: Yu Yao <[email protected]> * Update lightning megatron_init.py dp Signed-off-by: Yu Yao <[email protected]> * make it possible to update pre_preprocess and post_process for llm, required in vlm Signed-off-by: yaoyu-33 <[email protected]> * Fixes for neva to run with PP Signed-off-by: yaoyu-33 <[email protected]> * Add 
mcore vit support, and checkpoint conversion Signed-off-by: yaoyu-33 <[email protected]> * fix checkpoint loading for epp Signed-off-by: yaoyu-33 <[email protected]> * update script Signed-off-by: yaoyu-33 <[email protected]> * rename llama to mllama folder name Signed-off-by: yaoyu-33 <[email protected]> * update to attention bias Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * added datamodule for llava-next * modified state dict transform * neva model changes to support llava-next * remove accidentally checked in files Signed-off-by: Yashaswi Karnati <[email protected]> * Apply isort and black reformatting Signed-off-by: yashaswikarnati <[email protected]> * remove unused imports * added io_init to not save task_encoder and image_processor * Apply isort and black reformatting Signed-off-by: yashaswikarnati <[email protected]> * added scripts for pretrain and finetune Signed-off-by: Yashaswi Karnati <[email protected]> * Apply isort and black reformatting Signed-off-by: yashaswikarnati <[email protected]> * generation example * Apply isort and black reformatting Signed-off-by: yashaswikarnati <[email protected]> * small change in llava next example * llava next end-end train * Apply isort and black reformatting Signed-off-by: yashaswikarnati <[email protected]> * finetune changes * Apply isort and black reformatting Signed-off-by: yashaswikarnati <[email protected]> * finetune debug changes * update dropout to 0 Signed-off-by: yaoyu-33 <[email protected]> * added example generation script * added doc strings, formating, remove debug statemens and unsued imports * remove example scripts * fix attention bias Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * remove disable_vision_padding since we now have a fix Signed-off-by: 
yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * Update init for mllama Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * Address comments Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * fix copyright title Signed-off-by: yaoyu-33 <[email protected]> * multiple fixes Signed-off-by: yaoyu-33 <[email protected]> * bug fix Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * fix code scan Signed-off-by: yaoyu-33 <[email protected]> * Fix for SP Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * update vision code Signed-off-by: yaoyu-33 <[email protected]> * revert attention bias changes until latest MLM code got merged Signed-off-by: yaoyu-33 <[email protected]> * fix warning Signed-off-by: yaoyu-33 <[email protected]> * Turn off system message check, as it's "" now Signed-off-by: yaoyu-33 <[email protected]> * Update layer spec and add siglip support Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * update pretrain script Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * Fix scripts Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * add neva training recipes Signed-off-by: yaoyu-33 <[email protected]> * fix mllama mock ds Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * fix recipe Signed-off-by: yaoyu-33 <[email protected]> * fix pp Signed-off-by: yaoyu-33 <[email protected]> * scripts update 
Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * scripts update Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * update config api Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * few updates Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * update 70b Signed-off-by: yaoyu-33 <[email protected]> * hide examples for pr Signed-off-by: yaoyu-33 <[email protected]> * fix few issues Signed-off-by: yaoyu-33 <[email protected]> * add docstring layer spec Signed-off-by: yaoyu-33 <[email protected]> * add docstring to vit config Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * fix copyright Signed-off-by: yaoyu-33 <[email protected]> * fix Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: cuichenx <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: artbataev <[email protected]> Signed-off-by: parthmannan <[email protected]> Signed-off-by: meatybobby <[email protected]> Signed-off-by: Yu Yao <[email protected]> Signed-off-by: HuiyingLi <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: akoumpa <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Marc Romeijn <[email protected]> Signed-off-by: Marc Romeyn <[email protected]> Signed-off-by: marcromeyn <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: Shriya Palsamudram <[email protected]> Signed-off-by: ashors1 <[email protected]> Signed-off-by: Maanu Grover <[email protected]> Signed-off-by: Piotr Kaminski 
<[email protected]> Signed-off-by: Laplasjan107 <[email protected]> Signed-off-by: Piotr Kamiński <[email protected]> Signed-off-by: monica-sekoyan <[email protected]> Signed-off-by: monica-sekoyan <[email protected]> Signed-off-by: Jan Lasek <[email protected]> Signed-off-by: janekl <[email protected]> Signed-off-by: Oliver Koenig <[email protected]> Signed-off-by: Yashaswi Karnati <[email protected]> Signed-off-by: yashaswikarnati <[email protected]> Signed-off-by: Yashaswi Karnati <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: cuichenx <[email protected]> Co-authored-by: Yashaswi Karnati <[email protected]> Co-authored-by: artbataev <[email protected]> Co-authored-by: Parth Mannan <[email protected]> Co-authored-by: parthmannan <[email protected]> Co-authored-by: meatybobby <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Marc Romeyn <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: marcromeyn <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Shriya Rishab <[email protected]> Co-authored-by: ataghibakhsh <[email protected]> Co-authored-by: Maanu Grover <[email protected]> Co-authored-by: Anna Shors <[email protected]> Co-authored-by: Piotr Kamiński <[email protected]> Co-authored-by: Piotr Kaminski <[email protected]> Co-authored-by: Laplasjan107 <[email protected]> Co-authored-by: monica-sekoyan <[email protected]> Co-authored-by: monica-sekoyan <[email protected]> Co-authored-by: Jan Lasek <[email protected]> Co-authored-by: janekl <[email protected]> Co-authored-by: oliver könig <[email protected]> Co-authored-by: 
Pablo Garay <[email protected]> Co-authored-by: ykarnati <[email protected]> Co-authored-by: Yashaswi Karnati <[email protected]> Co-authored-by: yashaswikarnati <[email protected]>

<[email protected]> * applied code review Signed-off-by: Piotr Kaminski <[email protected]> * code review changes Signed-off-by: Piotr Kaminski <[email protected]> * Apply isort and black reformatting Signed-off-by: Laplasjan107 <[email protected]> * simplify interface for data iterator Signed-off-by: Piotr Kaminski <[email protected]> * Apply isort and black reformatting Signed-off-by: Laplasjan107 <[email protected]> * (partial) PP fix Signed-off-by: Piotr Kaminski <[email protected]> * Apply isort and black reformatting Signed-off-by: Laplasjan107 <[email protected]> --------- Signed-off-by: Piotr Kaminski <[email protected]> Signed-off-by: Laplasjan107 <[email protected]> Signed-off-by: Piotr Kamiński <[email protected]> Signed-off-by: artbataev <[email protected]> Co-authored-by: Piotr Kaminski <[email protected]> Co-authored-by: Laplasjan107 <[email protected]> Co-authored-by: artbataev <[email protected]> * TDT compute timestamps option and Extra Whitespace handling for SPE (NVIDIA#10875) * add token duration Signed-off-by: monica-sekoyan <[email protected]> * revert rnnt change Signed-off-by: monica-sekoyan <[email protected]> * add remove_extra_whitespaces arg to spe tokenizer Signed-off-by: monica-sekoyan <[email protected]> * add token duration retrieval Signed-off-by: monica-sekoyan <[email protected]> * add ignore_extra_whitespace to spe Signed-off-by: monica-sekoyan <[email protected]> * add compute_timestamp support for tdt Signed-off-by: monica-sekoyan <[email protected]> * fix config field name Signed-off-by: monica-sekoyan <[email protected]> * add refinement for tdt timestamps Signed-off-by: monica-sekoyan <[email protected]> * add segments timestamp support and refinement for ctc Signed-off-by: monica-sekoyan <[email protected]> * modify tests for ctc decoding timestamps Signed-off-by: monica-sekoyan <[email protected]> * add rnnt timestamp tests Signed-off-by: monica-sekoyan <[email protected]> * updated doc Signed-off-by: monica-sekoyan 
<[email protected]> * fix in test Signed-off-by: monica-sekoyan <[email protected]> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <[email protected]> * fix of unicode char Signed-off-by: monica-sekoyan <[email protected]> * fix rnnt_decoding test Signed-off-by: monica-sekoyan <[email protected]> * workaround for tesst tokenizer Signed-off-by: monica-sekoyan <[email protected]> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <[email protected]> * modify segments formation Signed-off-by: monica-sekoyan <[email protected]> * modify segments for ctc Signed-off-by: monica-sekoyan <[email protected]> * fix in ctc refinement Signed-off-by: monica-sekoyan <[email protected]> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <[email protected]> * minor changes Signed-off-by: monica-sekoyan <[email protected]> * reverse offset change Signed-off-by: monica-sekoyan <[email protected]> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <[email protected]> * warning mode=once Signed-off-by: monica-sekoyan <[email protected]> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <[email protected]> * make ignore_extrawhitespaces false Signed-off-by: monica-sekoyan <[email protected]> * minor changes Signed-off-by: monica-sekoyan <[email protected]> * adjust changes to the tests Signed-off-by: monica-sekoyan <[email protected]> * modify prompt_formatter tests Signed-off-by: monica-sekoyan <[email protected]> * Apply isort and black reformatting Signed-off-by: monica-sekoyan <[email protected]> --------- Signed-off-by: monica-sekoyan <[email protected]> Signed-off-by: monica-sekoyan <[email protected]> Co-authored-by: monica-sekoyan <[email protected]> * Basic online dynamic FP8 quantization with vLLM (NVIDIA#10904) * Basic online dynamic quantization with vLLM Signed-off-by: Jan Lasek <[email protected]> * Apply isort and black reformatting Signed-off-by: janekl <[email protected]> * vllm 0.6.3 
updates Signed-off-by: Jan Lasek <[email protected]> * Pass quantization param in deploy_vllm_triton.py script Signed-off-by: Jan Lasek <[email protected]> --------- Signed-off-by: Jan Lasek <[email protected]> Signed-off-by: janekl <[email protected]> Co-authored-by: janekl <[email protected]> * ci: Improve VM maintenance (NVIDIA#10758) * ci: Improve VM maintenance Signed-off-by: Oliver Koenig <[email protected]> * rename stuff Signed-off-by: Oliver Koenig <[email protected]> * title Signed-off-by: Oliver Koenig <[email protected]> * use team Signed-off-by: Oliver Koenig <[email protected]> * run on failure too Signed-off-by: Oliver Koenig <[email protected]> * fix Signed-off-by: Oliver Koenig <[email protected]> * yrdy Signed-off-by: Oliver Koenig <[email protected]> * f Signed-off-by: Oliver Koenig <[email protected]> * test Signed-off-by: Oliver Koenig <[email protected]> * fix Signed-off-by: Oliver Koenig <[email protected]> * f Signed-off-by: Oliver Koenig <[email protected]> * f Signed-off-by: Oliver Koenig <[email protected]> * f Signed-off-by: Oliver Koenig <[email protected]> --------- Signed-off-by: Oliver Koenig <[email protected]> * neva update Signed-off-by: yaoyu-33 <[email protected]> * Add comment for vision transpose * update megatron_init.py inside lightning Signed-off-by: yaoyu-33 <[email protected]> * Fix PP Signed-off-by: yaoyu-33 <[email protected]> * add examples Signed-off-by: yaoyu-33 <[email protected]> * fix test Signed-off-by: yaoyu-33 <[email protected]> * try fix test Signed-off-by: yaoyu-33 <[email protected]> * try fix test Signed-off-by: yaoyu-33 <[email protected]> * Fix megatron megatron_init.py dp Signed-off-by: Yu Yao <[email protected]> * Update lightning megatron_init.py dp Signed-off-by: Yu Yao <[email protected]> * make it possible to update pre_preprocess and post_process for llm, required in vlm Signed-off-by: yaoyu-33 <[email protected]> * Fixes for neva to run with PP Signed-off-by: yaoyu-33 <[email protected]> * Add 
mcore vit support, and checkpoint conversion Signed-off-by: yaoyu-33 <[email protected]> * fix checkpoint loading for epp Signed-off-by: yaoyu-33 <[email protected]> * update script Signed-off-by: yaoyu-33 <[email protected]> * rename llama to mllama folder name Signed-off-by: yaoyu-33 <[email protected]> * update to attention bias Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * added datamodule for llava-next * modified state dict transform * neva model changes to support llava-next * remove accidentally checked in files Signed-off-by: Yashaswi Karnati <[email protected]> * Apply isort and black reformatting Signed-off-by: yashaswikarnati <[email protected]> * remove unused imports * added io_init to not save task_encoder and image_processor * Apply isort and black reformatting Signed-off-by: yashaswikarnati <[email protected]> * added scripts for pretrain and finetune Signed-off-by: Yashaswi Karnati <[email protected]> * Apply isort and black reformatting Signed-off-by: yashaswikarnati <[email protected]> * generation example * Apply isort and black reformatting Signed-off-by: yashaswikarnati <[email protected]> * small change in llava next example * llava next end-end train * Apply isort and black reformatting Signed-off-by: yashaswikarnati <[email protected]> * finetune changes * Apply isort and black reformatting Signed-off-by: yashaswikarnati <[email protected]> * finetune debug changes * update dropout to 0 Signed-off-by: yaoyu-33 <[email protected]> * added example generation script * added doc strings, formating, remove debug statemens and unsued imports * remove example scripts * fix attention bias Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * remove disable_vision_padding since we now have a fix Signed-off-by: 
yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * Update init for mllama Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * Address comments Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * fix copyright title Signed-off-by: yaoyu-33 <[email protected]> * multiple fixes Signed-off-by: yaoyu-33 <[email protected]> * bug fix Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * fix code scan Signed-off-by: yaoyu-33 <[email protected]> * Fix for SP Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * update vision code Signed-off-by: yaoyu-33 <[email protected]> * revert attention bias changes until latest MLM code got merged Signed-off-by: yaoyu-33 <[email protected]> * fix warning Signed-off-by: yaoyu-33 <[email protected]> * Turn off system message check, as it's "" now Signed-off-by: yaoyu-33 <[email protected]> * Update layer spec and add siglip support Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * update pretrain script Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * Fix scripts Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * add neva training recipes Signed-off-by: yaoyu-33 <[email protected]> * fix mllama mock ds Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * fix recipe Signed-off-by: yaoyu-33 <[email protected]> * fix pp Signed-off-by: yaoyu-33 <[email protected]> * scripts update 
Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * scripts update Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * update config api Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * few updates Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * update 70b Signed-off-by: yaoyu-33 <[email protected]> * hide examples for pr Signed-off-by: yaoyu-33 <[email protected]> * fix few issues Signed-off-by: yaoyu-33 <[email protected]> * add docstring layer spec Signed-off-by: yaoyu-33 <[email protected]> * add docstring to vit config Signed-off-by: yaoyu-33 <[email protected]> * Apply isort and black reformatting Signed-off-by: yaoyu-33 <[email protected]> * fix copyright Signed-off-by: yaoyu-33 <[email protected]> * fix Signed-off-by: yaoyu-33 <[email protected]> --------- Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: cuichenx <[email protected]> Signed-off-by: Chen Cui <[email protected]> Signed-off-by: artbataev <[email protected]> Signed-off-by: parthmannan <[email protected]> Signed-off-by: meatybobby <[email protected]> Signed-off-by: Yu Yao <[email protected]> Signed-off-by: HuiyingLi <[email protected]> Signed-off-by: Alexandros Koumparoulis <[email protected]> Signed-off-by: akoumpa <[email protected]> Signed-off-by: Abhishree <[email protected]> Signed-off-by: Marc Romeijn <[email protected]> Signed-off-by: Marc Romeyn <[email protected]> Signed-off-by: marcromeyn <[email protected]> Signed-off-by: stevehuang52 <[email protected]> Signed-off-by: Shriya Palsamudram <[email protected]> Signed-off-by: ashors1 <[email protected]> Signed-off-by: Maanu Grover <[email protected]> Signed-off-by: Piotr Kaminski 
<[email protected]> Signed-off-by: Laplasjan107 <[email protected]> Signed-off-by: Piotr Kamiński <[email protected]> Signed-off-by: monica-sekoyan <[email protected]> Signed-off-by: monica-sekoyan <[email protected]> Signed-off-by: Jan Lasek <[email protected]> Signed-off-by: janekl <[email protected]> Signed-off-by: Oliver Koenig <[email protected]> Signed-off-by: Yashaswi Karnati <[email protected]> Signed-off-by: yashaswikarnati <[email protected]> Signed-off-by: Yashaswi Karnati <[email protected]> Co-authored-by: Chen Cui <[email protected]> Co-authored-by: Bobby Chen <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: cuichenx <[email protected]> Co-authored-by: Yashaswi Karnati <[email protected]> Co-authored-by: artbataev <[email protected]> Co-authored-by: Parth Mannan <[email protected]> Co-authored-by: parthmannan <[email protected]> Co-authored-by: meatybobby <[email protected]> Co-authored-by: Huiying <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: akoumpa <[email protected]> Co-authored-by: Abhishree Thittenamane <[email protected]> Co-authored-by: Pablo Garay <[email protected]> Co-authored-by: Marc Romeyn <[email protected]> Co-authored-by: Alexandros Koumparoulis <[email protected]> Co-authored-by: marcromeyn <[email protected]> Co-authored-by: He Huang (Steve) <[email protected]> Co-authored-by: Shriya Rishab <[email protected]> Co-authored-by: ataghibakhsh <[email protected]> Co-authored-by: Maanu Grover <[email protected]> Co-authored-by: Anna Shors <[email protected]> Co-authored-by: Piotr Kamiński <[email protected]> Co-authored-by: Piotr Kaminski <[email protected]> Co-authored-by: Laplasjan107 <[email protected]> Co-authored-by: monica-sekoyan <[email protected]> Co-authored-by: monica-sekoyan <[email protected]> Co-authored-by: Jan Lasek <[email protected]> Co-authored-by: janekl <[email protected]> Co-authored-by: oliver könig <[email protected]> Co-authored-by: 
Pablo Garay <[email protected]> Co-authored-by: ykarnati <[email protected]> Co-authored-by: Yashaswi Karnati <[email protected]> Co-authored-by: yashaswikarnati <[email protected]> Signed-off-by: Youngeun Kwon <[email protected]>

yaoyu-33 and others added 30 commits September 27, 2024 10:29

evian3 update

f5549f3

Signed-off-by: yaoyu-33 <[email protected]>

add encoder parallel default config

6e9566a

Signed-off-by: yaoyu-33 <[email protected]>

add encoder parallel default config

8bc8823

Signed-off-by: yaoyu-33 <[email protected]>

clean up

d0ff08b

Signed-off-by: yaoyu-33 <[email protected]>

add aspect ratio in model

54923f0

Merge remote-tracking branch 'internal/yuya/add_llama_vlm' into yuya/add_llama_vlm

5843c04

support energon dataloader

e0e5f00

some pp update

44cabfa

Signed-off-by: yaoyu-33 <[email protected]>

Merge remote-tracking branch 'internal/yuya/add_llama_vlm' into yuya/add_llama_vlm

5e54a29

fixes

3ad96e5

Signed-off-by: yaoyu-33 <[email protected]>

fix kv merging

21e41ca

Signed-off-by: yaoyu-33 <[email protected]>

fix get_key_value_tensors

ff143ff

Signed-off-by: yaoyu-33 <[email protected]>

rename files

25c7781

Signed-off-by: yaoyu-33 <[email protected]>

update to HF style position embedding

4dbe2e3

Signed-off-by: yaoyu-33 <[email protected]>

fix energon dataloader and support batching

ca10c21

update forward args

cca1acb

Signed-off-by: yaoyu-33 <[email protected]>

Merge remote-tracking branch 'internal/yuya/add_llama_vlm' into yuya/add_llama_vlm_hf

f8cc794

clean up and move to aspect_ratio_ids

f35e2f4

Signed-off-by: yaoyu-33 <[email protected]>

rename back to language.py

428403e

Signed-off-by: yaoyu-33 <[email protected]>

fix loss function

ffcb9df

Signed-off-by: yaoyu-33 <[email protected]>

update and fix energon

8f65450

Signed-off-by: yaoyu-33 <[email protected]>

Add hf import

7c33686

Fix type

11338d9

Change config

28a2f84

update energon pretrain

d5ea385

Signed-off-by: yaoyu-33 <[email protected]>

Merge remote-tracking branch 'internal/yuya/add_llama_vlm_hf' into yuya/add_llama_vlm_hf

23db738

clean up

412facb

Merge remote-tracking branch 'origin/yuya/add_llama_vlm_hf' into yuya/add_llama_vlm_hf

1881095

clean up

5441d38

reformat

e67fe30

Signed-off-by: yaoyu-33 <[email protected]>

yaoyu-33 and others added 2 commits November 13, 2024 18:37

Apply isort and black reformatting

379779c

Signed-off-by: yaoyu-33 <[email protected]>

github-actions bot removed the Multi Modal label Nov 13, 2024

github-advanced-security bot found potential problems Nov 13, 2024

yaoyu-33 and others added 2 commits November 14, 2024 11:41

scripts update

b85649c

Signed-off-by: yaoyu-33 <[email protected]>

Apply isort and black reformatting

879a2d7

Signed-off-by: yaoyu-33 <[email protected]>

github-advanced-security bot found potential problems Nov 14, 2024

examples/vlm/neva_finetune.py (Fixed)

yaoyu-33 and others added 11 commits November 14, 2024 15:24

update config api

85eceb4

Signed-off-by: yaoyu-33 <[email protected]>

Merge remote-tracking branch 'origin/yuya/add_neva_pp' into yuya/add_neva_pp

30fe7f8

Apply isort and black reformatting

b3687d1

Signed-off-by: yaoyu-33 <[email protected]>

few updates

e5e0448

Signed-off-by: yaoyu-33 <[email protected]>

Apply isort and black reformatting

4293e5a

Signed-off-by: yaoyu-33 <[email protected]>

update 70b

bb891f8

Signed-off-by: yaoyu-33 <[email protected]>

hide examples for pr

22cc8f6

Signed-off-by: yaoyu-33 <[email protected]>

fix few issues

376b2df

Signed-off-by: yaoyu-33 <[email protected]>

add docstring layer spec

b13e5a4

Signed-off-by: yaoyu-33 <[email protected]>

add docstring to vit config

5a88d8f

Signed-off-by: yaoyu-33 <[email protected]>

Apply isort and black reformatting

a58a043

Signed-off-by: yaoyu-33 <[email protected]>

cuichenx self-requested a review November 18, 2024 19:30

yaoyu-33 added 2 commits November 18, 2024 12:32

fix copyright

1389696

Signed-off-by: yaoyu-33 <[email protected]>

fix

1c698a8

Signed-off-by: yaoyu-33 <[email protected]>

cuichenx approved these changes Nov 20, 2024

yaoyu-33 added the Run CICD label Nov 20, 2024

yaoyu-33 merged commit 773590c into main Nov 21, 2024
167 of 168 checks passed

yaoyu-33 deleted the yuya/add_neva_pp branch November 21, 2024 17:58
