@lucaslie lucaslie commented Sep 24, 2025

Work in progress: partial graph capture, a prerequisite for CUDA graph capture for VLMs.

Summary

Two major changes are required:

  1. Nested graph inside a larger nn.Module --> make all transforms aware of the nesting and differentiate between transforms that run on the full model and transforms that run on the individual subgraphs
  2. Switching the infra from positional args to kwargs-only for inputs. Positional args are not a scalable way forward, as the subgraphs are most certainly going to be called with keyword arguments.
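
To make the kwargs-only point concrete, here is a minimal sketch of normalizing a mixed positional/keyword call into keyword arguments only. It uses only the standard library; `to_kwargs_only` and `language_model_forward` are hypothetical names for illustration and are not part of this PR or of any library.

```python
import inspect
from typing import Any, Callable, Dict, Tuple


def to_kwargs_only(
    fn: Callable[..., Any], args: Tuple[Any, ...], kwargs: Dict[str, Any]
) -> Dict[str, Any]:
    """Map a mixed positional/keyword call onto the parameter names of `fn`.

    Hypothetical helper: it only illustrates the calling convention.
    """
    sig = inspect.signature(fn)
    bound = sig.bind(*args, **kwargs)  # raises TypeError if the call is invalid
    named: Dict[str, Any] = {}
    for name, value in bound.arguments.items():
        kind = sig.parameters[name].kind
        if kind is inspect.Parameter.VAR_KEYWORD:
            # Flatten **extra into the top-level keyword dict.
            named.update(value)
        elif kind in (
            inspect.Parameter.VAR_POSITIONAL,
            inspect.Parameter.POSITIONAL_ONLY,
        ):
            # These cannot be passed by name, so a kwargs-only call is impossible.
            raise TypeError(f"cannot express parameter {name!r} as a keyword argument")
        else:
            named[name] = value
    return named


def language_model_forward(input_ids, position_ids=None, **extra):
    """Stand-in for a sub-forward function; it just echoes its inputs."""
    return {"input_ids": input_ids, "position_ids": position_ids, **extra}


call_kwargs = to_kwargs_only(
    language_model_forward,
    ([1, 2, 3],),
    {"position_ids": [0, 1, 2], "pixel_values": None},
)
result = language_model_forward(**call_kwargs)
```

Once every input is addressed by name, the caller no longer depends on the parameter order of each sub-forward function, which is the scaling concern raised in item 2.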

Other thoughts:

  1. Thinking about auto-capturing the subgraph based on the inputs to the sub-forward function --> hard-coded right now
  2. Better args/kwargs handling, with extra flexibility there...
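
One possible reading of the first thought, sketched with the standard library only: choose the subgraph to capture by checking which sub-forward function accepts a given set of input names, instead of hard-coding the submodule. All names here (`find_capture_candidates`, the toy forward functions, the input names) are assumptions made for illustration and do not describe how this PR selects subgraphs.

```python
import inspect
from typing import Any, Callable, Dict, Iterable, List


def find_capture_candidates(
    sub_forwards: Dict[str, Callable[..., Any]], required_inputs: Iterable[str]
) -> List[str]:
    """Return the names of sub-forward functions that accept every required input."""
    required = list(required_inputs)
    matches: List[str] = []
    for name, forward in sub_forwards.items():
        params = inspect.signature(forward).parameters
        if all(r in params for r in required):
            matches.append(name)
    return matches


# Toy stand-ins for the sub-forward functions of a VLM (hypothetical).
def vision_tower(pixel_values):
    return pixel_values


def projector(image_features):
    return image_features


def language_model(inputs_embeds, position_ids=None):
    return inputs_embeds


candidates = find_capture_candidates(
    {
        "vision_tower": vision_tower,
        "projector": projector,
        "language_model": language_model,
    },
    required_inputs=["inputs_embeds", "position_ids"],
)
```

With these toy functions only `language_model` declares both inputs, so it is the single candidate. A real implementation would more likely inspect the inputs observed at runtime than the declared signatures.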

config.yaml to play around with

# model: meta-llama/Meta-Llama-3.1-8B-Instruct
# model: mistralai/Magistral-Small-2507
# model: Qwen/Qwen2.5-VL-7B-Instruct
# model: meta-llama/Llama-4-Scout-17B-16E-Instruct
model: mistralai/Mistral-Small-3.1-24B-Instruct-2503
args:
  # mode: graph
  world_size: 0
  runtime: demollm
  compile_backend: torch-opt
  attn_page_size: 64
  max_input_len: 4096
  max_seq_len: 8192
  attn_backend: flashinfer
  # model_factory: AutoModelForImageTextToText
  # model_factory: AutoModelForCausalLM
  model_factory: Mistral3VLM
  skip_loading_weights: true
  model_kwargs:
    text_config:
      num_hidden_layers: 2
    # tp_plan: auto
    # tp_plan: null
    # device_map: cuda
    # num_hidden_layers: 3
    # _attn_implementation: eager
benchmark:
  enabled: false
dry_run: false
# prompt:
#   batch_size: 4
#   queries:
#     - "How big is the universe? "
#     - {"prompt": "In simple words and a single sentence, explain the concept of gravity: "}
#     # see for chat template format: https://huggingface.co/docs/transformers/en/chat_templating_multimodal
#     - - role: user
#         content:
#           - type: text
#             text: How to fix slicing in golf?
#     - - role: user
#         content:
#           - type: text
#             text: Please describe the natural scenery you see in the following images
#           - type: image
#             url: https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/seashore.png
#           - type: image
#             url: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/inpaint.png

karljang and others added 30 commits September 19, 2025 08:42
lucaslie and others added 10 commits October 3, 2025 15:39
lucaslie force-pushed the ll/subgraphs branch 6 times, most recently from 5bf9cde to 024fd2f, on October 8, 2025 20:49
lucaslie commented Oct 9, 2025

see NVIDIA#8203

lucaslie closed this Oct 9, 2025