Dkorzekwa/puzzletron use importance hooks from prune#1115
Conversation
…om_prune Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## feature/puzzletron #1115 +/- ##
======================================================
- Coverage 70.22% 70.19% -0.03%
======================================================
Files 228 228
Lines 25952 25956 +4
======================================================
- Hits 18224 18221 -3
- Misses 7728 7735 +7 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Suggestion to keep model-specific hooks inside the model descriptor so someone wanting to add a new model support can see all model-specific stuff in single place
There was a problem hiding this comment.
I thought the same, but let's rethink it, added to TODOs
There was a problem hiding this comment.
Wasnt this change already merged to feature/puzzletron in your previous PR?
There was a problem hiding this comment.
maybe it auto-resolved wrongly during your merge from main to feature/puzzletron? I saw the same when merging from main to dkorzekwa/puzzletron_use_importance_hooks_from_prune and I think I noticed it.
### What does this PR do? Implement puzzletron compression algorithm based on Puzzle paper (https://arxiv.org/abs/2411.19146) <details> <summary> Th list of reviewed and merged MRs that resulted in the feature/puzzletron branch</summary> Merging dkorzekwa/any_model to feature/puzzletron [Add anymodel directories to feature/puzzletron by danielkorzekwa · Pull Request #974 · NVIDIA/Model-Optimizer](#974) - merged [Draft: anymodel activation scoring by danielkorzekwa · Pull Request #989 · NVIDIA/Model-Optimizer](#989) - merged [Draft: Merge anymodel pruning by danielkorzekwa · Pull Request #990 · NVIDIA/Model-Optimizer](#990) - merged [Draft: Merging anymodel:build_library_and_stats by danielkorzekwa · Pull Request #993 · NVIDIA/Model-Optimizer](#993) - merged [Dkorzekwa/any model calc one block scores by danielkorzekwa · Pull Request #994 · NVIDIA/Model-Optimizer](#994) - merged [Draft: merge any_model: mip_and_realize_models by danielkorzekwa · Pull Request #995 · NVIDIA/Model-Optimizer](#995) - merged [Dkorzekwa/any model other modeqls by danielkorztiekwa · Pull Request #1007 · NVIDIA/Model-Optimizer](#1007) - merged PR to 1007: #1039 - merged [Dkorzekwa/anymodel gptoss by danielkorzekwa · Pull Request #1020 · NVIDIA/Model-Optimizer](#1020) - merged [Merge any_model tutorial by danielkorzekwa · Pull Request #1035 · NVIDIA/Model-Optimizer](#1035) - merged [Merge mbridge distillation for any_model by danielkorzekwa · Pull Request #1036 · NVIDIA/Model-Optimizer](#1036) - merged [MR branch for the remaining difference between dkorzekwa/any_model an… by danielkorzekwa · Pull Request #1047 · NVIDIA/Model-Optimizer](#1047) - merged [Dkorzekwa/decilm hf code cleanup by danielkorzekwa · Pull Request #1071 · NVIDIA/Model-Optimizer](#1071) - merged [Dkorzekwa/decilm hf code cleanup 2 by danielkorzekwa · Pull Request #1073 · NVIDIA/Model-Optimizer](#1073) - merged [Dkorzekwa/anymodel subblock stats by danielkorzekwa · Pull Request #1085 · NVIDIA/Model-Optimizer](#1085) - merged [Dkorzekwa/anymodel subblock stats nodecilm by danielkorzekwa · Pull Request #1102 · NVIDIA/Model-Optimizer](#1102) - merged [Dkorzekwa/decilm cleanup post subblockstats by danielkorzekwa · Pull Request #1103 · NVIDIA/Model-Optimizer](#1103) - merged [code clean up by danielkorzekwa · Pull Request #1110 · NVIDIA/Model-Optimizer](#1110) - merged Merging into main: [Activation hooks redesign (reuse hooks component across both minitron and puzzletron) by danielkorzekwa · Pull Request #1022 · NVIDIA/Model-Optimizer](#1022) - merged [Dkorzekwa/puzzletron use importance hooks from prune by danielkorzekwa · Pull Request #1115 · NVIDIA/Model-Optimizer](#1115) - merged </details> <!-- Details about the change. --> ### Usage Puzzletron tutorial: https://github.com/NVIDIA/Model-Optimizer/tree/feature/puzzletron/examples/puzzletron ### Testing The main e2e test for compressing 9 models with Puzzletron: https://github.com/NVIDIA/Model-Optimizer/blob/feature/puzzletron/tests/gpu/torch/puzzletron/test_puzzletron.py 2-gpu nightly tests: - https://github.com/NVIDIA/Model-Optimizer/actions/runs/24468209205/job/71501061203 - https://github.com/NVIDIA/Model-Optimizer/actions/runs/24470214159/job/71508152952 ### Before your PR is "*Ready for review*" - Is this change backward compatible?: ✅ - If you copied code from any other sources or added a new PIP dependency, did you follow guidance in `CONTRIBUTING.md`: ✅ - Did you write any new necessary tests?: ✅ - Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?: ✅ <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Added Puzzletron: end-to-end heterogeneous pruning & NAS workflow with AnyModel support, example pipelines, deployment and evaluation utilities, and tools for converting/pruning and exporting compressed checkpoints. * **Documentation** * Comprehensive Puzzletron tutorials, model-specific guides, evaluator instructions, example configs, and changelog entry. * **Chores** * CI/workflow updates (extras installation, longer GPU test timeout), pre-commit hook exclusion updated, and CODEOWNERS entries added. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com> Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Liana Mikaelyan <lmikaelyan@nvidia.com> Signed-off-by: Liana Mikaelyan <45925959+LianaMikael@users.noreply.github.com> Signed-off-by: Daniel Korzekwa <daniel.korzekwa@gmail.com> Signed-off-by: jrausch <jrausch@nvidia.com> Signed-off-by: root <root@pool0-00848.cm.cluster> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com> Co-authored-by: Liana Mikaelyan <lmikaelyan@nvidia.com> Co-authored-by: Liana Mikaelyan <45925959+LianaMikael@users.noreply.github.com> Co-authored-by: J Rausch <38429553+j-rausch@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
What does this PR do?
Use pruning importance hooks from prune/importance_hooks/ in modelopt.torch.puzzletron