Zgc/ditorch add process monitor tool #58
Merged
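Periodically records important host-side and device-side information about a training process, such as memory usage and chip utilization, without intrusively modifying the training process itself.
During debugging there has been a recurring need to monitor memory and device utilization, so the tool developed along the way is submitted here for future use.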
process_monitor_result_camb_pid129082_2024-10-11-11-54-21.csv
process_monitor_result_ascend_pid871524_2024-10-11-11-57-03.csv
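For readers who want a feel for the approach, here is a minimal sketch of a non-intrusive monitor: it samples a target PID from the outside at a fixed interval and appends rows to a CSV. It is not the code in this PR. The names `result_filename`, `read_host_rss_kb`, and `monitor` are hypothetical; the file name pattern is inferred from the two attachments above; host memory is read from `/proc/<pid>/status` (Linux only); and device-side metrics are left as pluggable callables, since how they are collected on each vendor's hardware is not stated here.

```python
import csv
import io
import time
from datetime import datetime


def result_filename(device: str, pid: int, started: datetime) -> str:
    # Pattern inferred from the attached result files (an assumption).
    return f"process_monitor_result_{device}_pid{pid}_{started:%Y-%m-%d-%H-%M-%S}.csv"


def read_host_rss_kb(pid: int) -> int:
    """Resident set size in kB from /proc (Linux only); -1 if unavailable."""
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return -1


def monitor(pid, out, samplers, interval_s=1.0, max_samples=3, sleep=time.sleep):
    """Write one CSV row per sample; `samplers` maps column name -> fn(pid).

    Device-side samplers (hypothetical) would be added to `samplers`
    alongside the host-side one; the target process is never modified.
    """
    writer = csv.writer(out)
    writer.writerow(["time", "pid", *samplers])
    for i in range(max_samples):
        row = [datetime.now().isoformat(timespec="seconds"), pid]
        row += [fn(pid) for fn in samplers.values()]
        writer.writerow(row)
        if i + 1 < max_samples:
            sleep(interval_s)
```

A typical call under these assumptions would open `result_filename("camb", pid, datetime.now())` for writing and pass `{"host_rss_kb": read_host_rss_kb}` as `samplers`. Keeping the samplers as plain callables means a vendor-specific device query can be added without touching the loop.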