[Bugfix][NPU][XPU] Use platform-aware profiler activities for trace generation#1542

Closed
lishunyang12 wants to merge 3 commits into vllm-project:main from lishunyang12:fix/npu-profiler-activity

@lishunyang12
Collaborator

Summary

  • The diffusion TorchProfiler hardcodes ProfilerActivity.CUDA, which fails on NPU (Ascend) devices since CUDA activity is not available there.
  • This extracts activity selection into a helper that checks current_omni_platform.device_type and uses ProfilerActivity.NPU (provided by torch_npu) on NPU devices, falling back to ProfilerActivity.CUDA otherwise.
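
A minimal sketch of the helper described in the summary, under stated assumptions: the name `select_profiler_activities` is hypothetical, the device type is passed in as a string instead of being read from `current_omni_platform.device_type`, and a small stand-in enum replaces `torch.profiler.ProfilerActivity` so the snippet runs without PyTorch. It illustrates the dispatch only and is not the PR's actual code.

```python
from enum import Enum


class FakeActivity(Enum):
    """Stand-in for torch.profiler.ProfilerActivity (illustration only).

    The numeric values are arbitrary, and the NPU member is assumed to
    exist here because the summary expects torch_npu to provide it.
    """

    CPU = 0
    CUDA = 2
    NPU = 3


def select_profiler_activities(activity_enum, device_type: str) -> list:
    """Return the CPU activity plus the device activity for device_type."""
    activities = [activity_enum.CPU]
    if device_type == "npu":
        # Ascend devices: use the NPU activity instead of CUDA.
        activities.append(activity_enum.NPU)
    else:
        # Every other device type falls back to CUDA, as the summary states.
        activities.append(activity_enum.CUDA)
    return activities


print(select_profiler_activities(FakeActivity, "npu"))
print(select_profiler_activities(FakeActivity, "cuda"))
```

Direct attribute access such as `activity_enum.NPU` raises `AttributeError` when the member is missing, so this form is only safe if the enum is known to carry that member.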

Fixes #1484

Test plan

  • On NPU: profiler should now start without error and export_chrome_trace should produce a valid trace file.
  • On CUDA: no behavior change — ProfilerActivity.CUDA is still used.

cc @gcanlin

@gcanlin
Collaborator

gcanlin commented Feb 27, 2026

Thanks! I will test it on NPU.

@lishunyang12 lishunyang12 force-pushed the fix/npu-profiler-activity branch from 352bb61 to 3b5b321 Compare February 27, 2026 15:03

activities = [ProfilerActivity.CPU]
device_type = current_omni_platform.device_type
if device_type == "npu":
Collaborator

Does it support other platforms (ROCm, XPU)? Do they require adaptation?

Collaborator Author

ROCm uses ProfilerActivity.CUDA in PyTorch's profiler API, so the else branch already covers it. XPU support can be added when needed; this PR is scoped to fix #1484.

Collaborator Author

@lishunyang12 lishunyang12 Feb 28, 2026

Good point — I'll add XPU support too. ROCm already works with the CUDA fallback since PyTorch maps it to ProfilerActivity.CUDA.

@lishunyang12
Collaborator Author

@xuechendi PTAL

@lishunyang12 lishunyang12 changed the title [Bugfix][NPU] Use platform-aware profiler activities for trace generation [Bugfix][NPU][XPU] Use platform-aware profiler activities for trace generation Feb 28, 2026
Contributor

@xuechendi xuechendi left a comment

LGTM

activities.append(getattr(ProfilerActivity, "NPU"))
elif device_type == "xpu":
# Intel XPU support
activities.append(getattr(ProfilerActivity, "XPU"))
Contributor

This is how we did it in the vLLM main repo:

TorchProfilerActivity = Literal["CPU", "CUDA", "XPU"]
TorchProfilerActivityMap = {
    "CPU": torch.profiler.ProfilerActivity.CPU,
    "CUDA": torch.profiler.ProfilerActivity.CUDA,
    "XPU": torch.profiler.ProfilerActivity.XPU,
}

The current code with getattr also works. Thanks for adding XPU.

@david6666666
Collaborator

please fix DCO

@gcanlin gcanlin mentioned this pull request Feb 28, 2026
5 tasks
@JustQJ
Contributor

JustQJ commented Feb 28, 2026

Hi, I still encounter an error:

[Stage-0] INFO 02-28 08:59:07 [diffusion_engine.py:227] Starting diffusion profiling → /mnt/deepseek/cloudide/tpcode/profile/stage_0_diffusion_1772269147*.json
[Stage-0] INFO 02-28 08:59:07 [torch_profiler.py:66] [Rank 0] Starting End-to-End Torch profiler

[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:684] Error executing method 'start_profile'. This might cause issues in distributed execution.
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:684] Traceback (most recent call last):
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:684]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/worker/diffusion_worker.py", line 680, in execute_method
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:684]     return func(*args, **kwargs)
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:684]            ^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:684]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/worker/diffusion_worker.py", line 172, in start_profile
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:684]     return CurrentProfiler.start(trace_path_template)
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:684]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:684]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/profiler/torch_profiler.py", line 90, in start
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:684]     activities=_get_profiler_activities(),
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:684]                ^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:684]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/profiler/torch_profiler.py", line 25, in _get_profiler_activities
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:684]     activities.append(getattr(ProfilerActivity, "NPU"))
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:684]                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:684] AttributeError: type object 'torch._C._profiler.ProfilerActivity' has no attribute 'NPU'
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401] Error executing RPC: type object 'torch._C._profiler.ProfilerActivity' has no attribute 'NPU'
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401] Traceback (most recent call last):
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/worker/diffusion_worker.py", line 398, in execute_rpc
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]     result = self.worker.execute_method(method, *args, **kwargs)
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/worker/diffusion_worker.py", line 685, in execute_method
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]     raise e
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/worker/diffusion_worker.py", line 680, in execute_method
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]     return func(*args, **kwargs)
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]            ^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/worker/diffusion_worker.py", line 172, in start_profile
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]     return CurrentProfiler.start(trace_path_template)
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/profiler/torch_profiler.py", line 90, in start
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]     activities=_get_profiler_activities(),
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]                ^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/profiler/torch_profiler.py", line 25, in _get_profiler_activities
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]     activities.append(getattr(ProfilerActivity, "NPU"))
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401]                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:401] AttributeError: type object 'torch._C._profiler.ProfilerActivity' has no attribute 'NPU'
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430] Error processing RPC: type object 'torch._C._profiler.ProfilerActivity' has no attribute 'NPU'
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430] Traceback (most recent call last):
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/worker/diffusion_worker.py", line 426, in worker_busy_loop
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]     result, should_reply = self.execute_rpc(msg)
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]                            ^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/worker/diffusion_worker.py", line 402, in execute_rpc
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]     raise e
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/worker/diffusion_worker.py", line 398, in execute_rpc
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]     result = self.worker.execute_method(method, *args, **kwargs)
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/worker/diffusion_worker.py", line 685, in execute_method
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]     raise e
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/worker/diffusion_worker.py", line 680, in execute_method
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]     return func(*args, **kwargs)
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]            ^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/worker/diffusion_worker.py", line 172, in start_profile
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]     return CurrentProfiler.start(trace_path_template)
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/profiler/torch_profiler.py", line 90, in start
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]     activities=_get_profiler_activities(),
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]                ^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]   File "/mnt/deepseek/cloudide/tpcode/omni-qwen2512/vllm_omni/diffusion/profiler/torch_profiler.py", line 25, in _get_profiler_activities
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]     activities.append(getattr(ProfilerActivity, "NPU"))
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430]                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Stage-0] ERROR 02-28 08:59:07 [diffusion_worker.py:430] AttributeError: type object 'torch._C._profiler.ProfilerActivity' has no attribute 'NPU'
[Stage-0] INFO 02-28 08:59:07 [omni_stage.py:805] [Stage-0] Diffusion Torch profiler started

@lishunyang12 lishunyang12 force-pushed the fix/npu-profiler-activity branch from ac733e2 to c9961c9 Compare February 28, 2026 12:50
@lishunyang12
Collaborator Author

@JustQJ Please try it again. I made some changes based on your bug report. :)

@JustQJ
Contributor

JustQJ commented Mar 2, 2026

@JustQJ Please try it again. I made some changes based on your bug report. :)

Hi, in my test, hasattr(ProfilerActivity, "NPU") is still False after importing torch_npu.

>>> import torch
/usr/local/python3.11.14/lib/python3.11/site-packages/torch_npu/__init__.py:309: UserWarning: On the interactive interface, the value of TASK_QUEUE_ENABLE is set to 0 by default.                      Do not set it to 1 to prevent some unknown errors
  warnings.warn("On the interactive interface, the value of TASK_QUEUE_ENABLE is set to 0 by default. \
>>> import torch_npu
>>> from torch.profiler import ProfilerActivity
>>> ProfilerActivity.CPU
<ProfilerActivity.CPU: 0>
>>> ProfilerActivity.CUDA
<ProfilerActivity.CUDA: 2>
>>> ProfilerActivity.NPU
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'torch._C._profiler.ProfilerActivity' has no attribute 'NPU'. Did you mean: 'CPU'?
>>> hasattr(ProfilerActivity, "NPU")
False

My environment:

accelerate                        1.12.0
aenum                             3.1.16
aiofiles                          24.1.0
aiohappyeyeballs                  2.6.1
aiohttp                           3.13.3
aiosignal                         1.4.0
annotated-doc                     0.0.4
annotated-types                   0.7.0
anthropic                         0.71.0
antlr4-python3-runtime            4.9.3
anyio                             4.12.1
arctic_inference                  0.1.1
asc_op_compile_base               0.1.0
asc_opc_tool                      0.1.0
astor                             0.8.1
attrs                             25.4.0
audioread                         3.1.0
auto_tune                         0.1.0
blake3                            1.0.8
blinker                           1.9.0
brotli                            1.2.0
cache_dit                         1.2.0
cachetools                        6.2.6
cbor2                             5.8.0
certifi                           2026.1.4
cffi                              2.0.0
charset-normalizer                3.4.4
click                             8.3.1
cloudpickle                       3.1.2
cmake                             4.2.1
coloredlogs                       15.0.1
compressed-tensors                0.13.0
cryptography                      46.0.4
dataflow                          0.0.1
decorator                         5.2.1
depyf                             0.20.0
diffusers                         0.36.0
dill                              0.4.1
diskcache                         5.6.3
distro                            1.9.0
dnspython                         2.8.0
docstring_parser                  0.17.0
einops                            0.8.2
email-validator                   2.3.0
es_math                           1.0.0
fastapi                           0.123.10
fastapi-cli                       0.0.20
fastapi-cloud-cli                 0.11.0
fastar                            0.8.0
ffmpy                             1.0.0
filelock                          3.20.3
Flask                             3.1.2
flatbuffers                       25.12.19
frozenlist                        1.8.0
fsspec                            2026.1.0
ge-py                             0.0.1
gguf                              0.17.1
gradio                            5.50.0
gradio_client                     1.14.0
groovy                            0.1.2
grpcio                            1.76.0
grpcio-reflection                 1.76.0
h11                               0.16.0
h2                                4.3.0
hccl                              0.1.0
hf-xet                            1.2.0
hpack                             4.1.0
httpcore                          1.0.9
httptools                         0.7.1
httpx                             0.28.1
httpx-sse                         0.4.3
huggingface-hub                   0.36.0
humanfriendly                     10.0
Hypercorn                         0.18.0
hyperframe                        6.1.0
idna                              3.11
ijson                             3.4.0.post0
ImageIO                           2.37.2
imageio-ffmpeg                    0.6.0
importlib_metadata                8.7.1
interegular                       0.3.3
itsdangerous                      2.2.0
Jinja2                            3.1.6
jiter                             0.12.0
jmespath                          1.1.0
joblib                            1.5.3
jsonschema                        4.26.0
jsonschema-specifications         2025.9.1
lark                              1.2.2
lazy_loader                       0.4
librosa                           0.11.0
llguidance                        1.3.0
llm_datadist                      0.0.1
llm_datadist_v1                   0.0.1
llvmlite                          0.46.0
lm-format-enforcer                0.11.3
loguru                            0.7.3
markdown-it-py                    4.0.0
MarkupSafe                        3.0.3
mcp                               1.26.0
mdurl                             0.1.2
mindiesd                          2.3.0
mistral_common                    1.9.0
model-hosting-container-standards 0.1.13
modelscope                        1.34.0
more-itertools                    10.8.0
mpmath                            1.3.0
msgpack                           1.1.2
msgspec                           0.20.0
msobjdump                         0.1.0
mspti                             0.0.1
multidict                         6.7.1
networkx                          3.6.1
ninja                             1.13.0
numba                             0.63.1
numpy                             2.3.5
omegaconf                         2.3.0
onnxruntime-cann                  1.23.2
op_compile_tool                   0.1.0
op_gen                            0.1
op_test_frame                     0.1
opc_tool                          0.1.0
openai                            2.16.0
openai-harmony                    0.0.8
openai-whisper                    20250625
opencv-python-headless            4.13.0.92
orjson                            3.11.7
outlines_core                     0.2.11
packaging                         26.0
pandas                            2.3.3
pandas-stubs                      2.3.3.260113
partial-json-parser               0.2.1.1.post7
pillow                            11.3.0
pip                               25.3
platformdirs                      4.5.1
pooch                             1.8.2
prettytable                       3.17.0
priority                          2.0.0
prometheus_client                 0.24.1
prometheus-fastapi-instrumentator 7.1.0
propcache                         0.4.1
protobuf                          6.33.5
psutil                            7.2.2
py-cpuinfo                        9.0.0
pybase64                          1.4.3
pybind11                          3.0.1
pycountry                         24.6.1
pycparser                         3.0
pydantic                          2.12.3
pydantic_core                     2.41.4
pydantic-extra-types              2.11.0
pydantic-settings                 2.12.0
pydub                             0.25.1
Pygments                          2.19.2
PyJWT                             2.10.1
python-dateutil                   2.9.0.post0
python-dotenv                     1.2.1
python-json-logger                4.0.0
python-multipart                  0.0.22
pytz                              2025.2
PyYAML                            6.0.3
pyzmq                             27.1.0
Quart                             0.20.0
ray                               2.48.0
referencing                       0.37.0
regex                             2026.1.15
requests                          2.32.5
resampy                           0.4.3
rich                              14.3.1
rich-toolkit                      0.17.1
rignore                           0.7.6
rpds-py                           0.30.0
ruff                              0.15.4
safehttpx                         0.1.7
safetensors                       0.7.0
schedule_search                   0.0.1
scikit-learn                      1.8.0
scipy                             1.17.0
semantic-version                  2.10.0
sentencepiece                     0.2.1
sentry-sdk                        2.51.0
setproctitle                      1.3.7
setuptools                        79.0.1
setuptools-scm                    9.2.2
shellingham                       1.5.4
show_kernel_debug_data            0.1.0
six                               1.17.0
sniffio                           1.3.1
soundfile                         0.13.1
sox                               1.5.0
soxr                              1.0.0
sse-starlette                     3.2.0
starlette                         0.50.0
superkernel                       0.1.0
supervisor                        4.3.0
sympy                             1.14.0
te                                0.4.0
threadpoolctl                     3.6.0
tiktoken                          0.12.0
tokenizers                        0.22.2
tomlkit                           0.13.3
torch                             2.9.0+cpu
torch_npu                         2.9.0
torchaudio                        2.9.0
torchsde                          0.2.6
torchvision                       0.24.0+cpu
tqdm                              4.67.1
trampoline                        0.1.2
transformers                      4.57.6
triton                            3.6.0
triton-ascend                     3.2.0
typer                             0.21.1
types-pytz                        2025.2.0.20251108
typing_extensions                 4.15.0
typing-inspection                 0.4.2
tzdata                            2025.3
urllib3                           2.6.3
uvicorn                           0.40.0
uvloop                            0.22.1
watchfiles                        1.1.1
wcwidth                           0.6.0
websockets                        15.0.1
Werkzeug                          3.1.5
wheel                             0.46.3
wsproto                           1.3.2
xgrammar                          0.1.29
yarl                              1.22.0
zipp                              3.23.0

@david6666666
Collaborator

@lishunyang12 @JustQJ any progress?

…eneration

Signed-off-by: lishunyang <lishunyang12@163.com>
@lishunyang12 lishunyang12 force-pushed the fix/npu-profiler-activity branch from c9961c9 to f433b1a Compare March 4, 2026 16:32
@lishunyang12
Collaborator Author

@JustQJ The crash is fixed — the hasattr check prevents the AttributeError you hit. On your env, profiling will fall back to CPU-only (trace file is still generated, just without NPU kernel data).

The reason ProfilerActivity.NPU doesn't appear is likely your torch 2.9.0+cpu build — torch_npu can't patch ProfilerActivity on a CPU-only torch. With a proper NPU torch build, import torch_npu should register ProfilerActivity.NPU.
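
The guarded fallback described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `get_profiler_activities`, `_DEVICE_ACTIVITY_NAMES`, and the `StockActivity` stand-in enum are hypothetical names, and the stand-in mimics an enum without an `NPU` member, like the one in the REPL session earlier in the thread.

```python
import logging
from enum import Enum

logger = logging.getLogger(__name__)

# Hypothetical mapping from platform device type to enum member name.
# Device types not listed here use "CUDA" (ROCm also maps to CUDA per the thread).
_DEVICE_ACTIVITY_NAMES = {"npu": "NPU", "xpu": "XPU"}


def get_profiler_activities(activity_enum, device_type: str) -> list:
    """Return CPU plus the device activity, or CPU only if it is unavailable."""
    activities = [activity_enum.CPU]
    name = _DEVICE_ACTIVITY_NAMES.get(device_type, "CUDA")
    # getattr with a default avoids the AttributeError seen in the traceback.
    member = getattr(activity_enum, name, None)
    if member is None:
        logger.warning(
            "ProfilerActivity.%s is not available; profiling CPU only.", name
        )
    else:
        activities.append(member)
    return activities


class StockActivity(Enum):
    """Stand-in enum that lacks NPU and XPU members (illustration only)."""

    CPU = 0
    CUDA = 2


print(get_profiler_activities(StockActivity, "npu"))   # CPU only, with a warning
print(get_profiler_activities(StockActivity, "cuda"))  # CPU and CUDA
```

The trade-off is that a missing device activity degrades silently to a CPU-only trace apart from the warning, so the trace file exists but carries no device kernel data.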

@david6666666 DCO is fixed, squashed into one commit. Ready to go.

@lishunyang12
Collaborator Author

lishunyang12 commented Mar 12, 2026

Should we accelerate this PR? @david6666666 @gcanlin

@Gaohan123 Gaohan123 added this to the v0.18.0 milestone Mar 14, 2026
@Gaohan123 Gaohan123 added the ready label to trigger buildkite CI label Mar 14, 2026
@gcanlin
Collaborator

gcanlin commented Mar 18, 2026

@lishunyang12 Thanks for your contribution. #1261 is integrating NPU activities. After checking, NPU needs to use the torch_npu API to get the activities. Please take a look. I will add you as a co-author in #1261 :)
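
A rough sketch of what obtaining the activities through the torch_npu API might look like. It assumes torch_npu ships its own `torch_npu.profiler.ProfilerActivity` enum with `CPU` and `NPU` members; that is an assumption based on the Ascend profiler documentation, not something confirmed in this thread, and #1261 may do this differently. The function name `npu_profiler_activities` is hypothetical.

```python
import importlib


def npu_profiler_activities():
    """Return [CPU, NPU] activities from torch_npu, or None if it is not installed.

    Assumes torch_npu.profiler.ProfilerActivity exists with CPU and NPU members.
    """
    try:
        # Imported lazily so this module still loads on non-NPU machines.
        npu_profiler = importlib.import_module("torch_npu.profiler")
    except ImportError:
        return None
    activity = npu_profiler.ProfilerActivity
    return [activity.CPU, activity.NPU]


activities = npu_profiler_activities()
if activities is None:
    print("torch_npu is not installed; no NPU activities available")
else:
    print(activities)
```

Under the same assumption, these activities would presumably be handed to torch_npu's own profiler entry point instead of `torch.profiler.profile`.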

@lishunyang12
Collaborator Author

@lishunyang12 Thanks for your contribution. #1261 is integrating NPU activities. After checking, NPU needs to use the torch_npu API to get the activities. Please take a look. I will add you as a co-author in #1261 :)

Thanks @gcanlin

Labels

ready label to trigger buildkite CI

Development

Successfully merging this pull request may close these issues.

[Bug][NPU]: When I use an offline script for profile analysis, I am unable to generate a trace file.

7 participants