[NPU] cherry-pick gc/dataloader/save&load/optimization from ascendrc to develop #32294

zhiqiu · 2021-04-15T06:57:12Z

PR types

Others

PR changes

Others

Describe

[NPU] cherry-pick gc/dataloader/save&load/optimization from ascendrc to develop

* support GarbageCollector for npu
* fix typo
* fix gather_grad
* disable NPUDefaultStreamGarbageCollector on NPU

* support npu for memcpy op
* add ut
* fix ut
* fix typo

* support npu profiler
* add python api
* fix bugs
* add wrapper for incomplete type
* update profile proto
* record npu wait
* add xpu placeholder

…dle#31956)
* enable async copy and add wait before sync operation
* remove unnecessary wait
* add FillNpuTensorWithConstant
* refine
* fix fill_constant
* make TensorFromVector/TensorToVector sync

* fix npu kernel of cast op to handle casting to same dtype
* add comments

* fix compile problem on cann 20.3
* fix ut
* fix test_mul
* fix check_finite_and_scale
* fix lookup_table_v2_grad
* fix cmake
* support print op

* support save load for NPU
* add save load npu unittest
* support np.array transform in NPU
* fix errors
* delete dygraph in unittest
* add Wait
* fix unittest
* fix review comment
* fix unittest problem
* fix little problem

…rformance (PaddlePaddle#32196)
* change aclrtSynchronizeDevice to aclrtSynchronizeStream for better performance
* refine code

* fix NPUDeviceContext in all c++ unittest
* refine log

Co-authored-by: pangyoki <[email protected]>

paddle-bot-old · 2021-04-15T06:57:14Z

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

…r better performance (PaddlePaddle#31994)
* enable async copy and add wait before sync operation
* remove unnecessary wait
* add FillNpuTensorWithConstant
* refine
* fix fill_constant
* change TensorFromVector to FillNpuTensorWithConstant
* fix ignored api
* delete extra unittest
* fix little error
* fix update_loss_scaling_op_npu and check_finite_and_unscale_op_npu
* change TensorCopySync to TensorCopy
* delete useless Wait and add StreamWait
* fix npu_stream error
* fix check_finite_and_unscale_op_npu TensorCopy
* only save stream wait
* fix NPUDeviceContext in all c++ unittest
* delete wait

Co-authored-by: zhiqiu <[email protected]>

wanghuancoder

LGTM

zhiqiu and others added 14 commits April 15, 2021 03:28

[NPU] support GarbageCollector for npu (PaddlePaddle#31874)

65a680a

* support GarbageCollector for npu
* fix typo
* fix gather_grad
* disable NPUDefaultStreamGarbageCollector on NPU

[NPU] support npu for memcpy op (PaddlePaddle#31808)

f5c50b5

* support npu for memcpy op
* add ut
* fix ut
* fix typo

【NPU】fix bug of using temp vector (PaddlePaddle#31963)

3489ae0

fix bug when beta1_pow on cpu (PaddlePaddle#31995)

4668f1e

[NPU] support npu profiler (PaddlePaddle#31684)

2b4d669

* support npu profiler
* add python api
* fix bugs
* add wrapper for incomplete type
* update profile proto
* record npu wait
* add xpu placeholder

fix adam (PaddlePaddle#32016)

7c05f46

[NPU] enable async copy and add wait before sync operation (PaddlePad…

8077a33

…dle#31956)
* enable async copy and add wait before sync operation
* remove unnecessary wait
* add FillNpuTensorWithConstant
* refine
* fix fill_constant
* make TensorFromVector/TensorToVector sync
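
The "enable async copy and add wait before sync operation" item can be illustrated with a small host-only sketch. This is not the Paddle or ACL implementation: `ToyStream`, `copy_async`, `wait` and `tensor_to_vector` are hypothetical names, and a single worker thread stands in for an in-order device stream. The point it shows is only that a copy can be queued without blocking, as long as any synchronous read waits on the stream first.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyStream:
    """Hypothetical stand-in for a device stream: runs queued work in order."""

    def __init__(self):
        # One worker thread preserves submission order, like a single stream.
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending = []

    def copy_async(self, src, dst):
        # Queue the copy (dst[:] = src) and return immediately.
        future = self._pool.submit(dst.__setitem__, slice(None), src)
        self._pending.append(future)

    def wait(self):
        # Block until every queued copy has finished.
        for future in self._pending:
            future.result()
        self._pending.clear()

    def close(self):
        self._pool.shutdown()


def tensor_to_vector(stream, device_buf):
    """Synchronous read (hypothetical helper): wait first, then read."""
    stream.wait()
    return list(device_buf)


stream = ToyStream()
device_buf = [0, 0, 0, 0]
stream.copy_async([1, 2, 3, 4], device_buf)       # does not block the host
host_view = tensor_to_vector(stream, device_buf)  # waits before reading
stream.close()
```

Without the `stream.wait()` call inside `tensor_to_vector`, the read could race with the queued copy, which is the kind of hazard the "add wait before sync operation" wording suggests.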

[NPU] Support dataloader on npu place. (PaddlePaddle#31867)

467b10f

[NPU] Wait on NPUPlace (PaddlePaddle#32086)

d1a2fad

[NPU] fix cast op (PaddlePaddle#32121)

c55646c

* fix npu kernel of cast op to handle casting to same dtype
* add comments
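
A minimal sketch of what "handle casting to same dtype" could look like, assuming the fix is a special case that bypasses the conversion kernel when source and target dtypes match. The commit text does not spell out the mechanism, so that is an assumption; `cast` and `convert_kernel` are hypothetical names and plain Python lists stand in for tensors.

```python
def convert_kernel(values, dst_dtype):
    """Hypothetical stand-in for the device cast kernel."""
    converters = {"int32": int, "float32": float}
    return [converters[dst_dtype](v) for v in values]


def cast(values, src_dtype, dst_dtype):
    if src_dtype == dst_dtype:
        # Same dtype: nothing to convert, so return an independent copy
        # instead of invoking the conversion kernel.
        return list(values)
    return convert_kernel(values, dst_dtype)


same = cast([1.5, 2.5], "float32", "float32")     # takes the copy path
converted = cast([1.5, 2.5], "float32", "int32")  # takes the kernel path
```

Returning a copy rather than the input itself keeps the output independent of the input, so later in-place writes to one do not affect the other.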

[NPU] support cann 20.3 (PaddlePaddle#32044)

1d699b2

* fix compile problem on cann 20.3
* fix ut
* fix test_mul
* fix check_finite_and_scale
* fix lookup_table_v2_grad
* fix cmake
* support print op

[NPU] Support npu save load (PaddlePaddle#31893)

1ef2b93

* support save load for NPU
* add save load npu unittest
* support np.array transform in NPU
* fix errors
* delete dygraph in unittest
* add Wait
* fix unittest
* fix review comment
* fix unittest problem
* fix little problem

change aclrtSynchronizeDevice to aclrtSynchronizeStream for better pe…

62d42f0

…rformance (PaddlePaddle#32196)
* change aclrtSynchronizeDevice to aclrtSynchronizeStream for better performance
* refine code
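
The performance argument behind this change can be sketched with a toy model: a device-wide synchronize waits for every stream, while a stream-level synchronize waits only for the stream the caller cares about. The code below does not call ACL; `ToyDevice`, `launch`, `synchronize_stream` and `synchronize_device` are hypothetical names, with one worker thread per stream.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyDevice:
    """Hypothetical device with several independent in-order streams."""

    def __init__(self, num_streams):
        self.streams = [ThreadPoolExecutor(max_workers=1)
                        for _ in range(num_streams)]
        self.pending = [[] for _ in range(num_streams)]

    def launch(self, stream_id, fn, *args):
        # Queue work on one stream without blocking the host.
        future = self.streams[stream_id].submit(fn, *args)
        self.pending[stream_id].append(future)

    def synchronize_stream(self, stream_id):
        # Wait only for work queued on the given stream.
        for future in self.pending[stream_id]:
            future.result()
        self.pending[stream_id].clear()

    def synchronize_device(self):
        # Wait for work queued on every stream.
        for stream_id in range(len(self.streams)):
            self.synchronize_stream(stream_id)


device = ToyDevice(num_streams=2)
gate = threading.Event()
results = []

device.launch(0, results.append, "copy done")
device.launch(1, gate.wait)  # long-running work on an unrelated stream

device.synchronize_stream(0)  # returns as soon as stream 0 is drained
stream1_still_busy = not device.pending[1][0].done()

gate.set()                    # let the unrelated work finish
device.synchronize_device()   # now waits for every stream
for pool in device.streams:
    pool.shutdown()
```

In this model, calling `synchronize_device()` in place of `synchronize_stream(0)` would have blocked until `gate` was set, even though the caller only needed the result from stream 0.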

fix NPUDeviceContext in all c++ unittest (PaddlePaddle#32198)

b046809

* fix NPUDeviceContext in all c++ unittest
* refine log

Co-authored-by: pangyoki <[email protected]>

pangyoki and others added 8 commits April 15, 2021 07:00

delete useless unittest file (PaddlePaddle#32206)

032a512

Fix op test (PaddlePaddle#32231)

25a2e73

fix conditional block (PaddlePaddle#32243)

a3ac600

fix adam bug again (PaddlePaddle#32246)

9e393e7

fix compile

565fc73

fix ut

eb6a7a1

fix ut

d4bbb4c

zhiqiu requested review from wanghuancoder, phlrain and raindrops2sea April 19, 2021 03:34

wanghuancoder approved these changes Apr 19, 2021

zhiqiu requested a review from Xreki April 19, 2021 06:46

phlrain approved these changes Apr 19, 2021

raindrops2sea approved these changes Apr 19, 2021

zhiqiu merged commit cbe5c9f into PaddlePaddle:develop Apr 19, 2021
