[asr] optimize the cache used for attention; 0-dim tensor for model export #2124
Conversation
    xs = paddle.cat((subsampling_cache, xs), dim=1)
else:
    cache_size = 0
elayers = paddle.shape(att_cache)[0]
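The last line is the export-relevant change: `paddle.shape(att_cache)[0]` yields a 0-dim tensor instead of a Python int, so the value stays symbolic when the graph is traced. A minimal sketch (the cache shape below is illustrative, not the model's real layout):

```python
import paddle

# Hypothetical cache layout: (elayers, head, cache_t, d_k * 2).
att_cache = paddle.zeros([12, 4, 16, 128])

# att_cache.shape[0] is a plain Python int and gets baked into a traced
# graph; paddle.shape(att_cache)[0] is a 0-dim tensor that stays dynamic
# under paddle.jit.to_static.
elayers = paddle.shape(att_cache)[0]
```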
Is subsampling_cache no longer used?
The computation for this cache is indeed small, but removing it may still save some time.
It has been merged into att_cache, which cuts down some computation.
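A hedged sketch of what the merged cache can look like, assuming att_cache holds the concatenated key/value per layer; the function and shapes below are illustrative, not the PR's exact code:

```python
import paddle

def attention_with_cache(q, k, v, att_cache):
    # q, k, v: (head, time, d_k); att_cache: (head, cache_t, d_k * 2).
    k_cache, v_cache = paddle.split(att_cache, 2, axis=-1)
    k = paddle.concat([k_cache, k], axis=1)  # prepend cached keys
    v = paddle.concat([v_cache, v], axis=1)  # prepend cached values
    new_att_cache = paddle.concat([k, v], axis=-1)  # cache for the next chunk
    scores = paddle.matmul(q, k, transpose_y=True) / (k.shape[-1] ** 0.5)
    out = paddle.matmul(paddle.nn.functional.softmax(scores, axis=-1), v)
    return out, new_att_cache
```

Keeping everything in one per-layer cache drops the separate concat of cached subsampled features with xs before the encoder, which is where the small saving comes from.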
OK, understood.
LGTM
LGTM
PR types
Performance optimization
PR changes
Models
Describe
Optimize the attention cache (subsampling_cache is merged into att_cache) and use 0-dim tensors so the conformer model can be exported with to_static.
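For reference, a minimal to_static export sketch, assuming a toy encoder with a growing cache; names and shapes here are illustrative, while the real export wires up the conformer encoder's streaming forward:

```python
import paddle
from paddle.static import InputSpec

class TinyEncoder(paddle.nn.Layer):
    def __init__(self):
        super().__init__()
        self.linear = paddle.nn.Linear(80, 256)

    def forward(self, xs, att_cache):
        ys = self.linear(xs)
        # Append the new frames to the cache along the time axis.
        new_cache = paddle.concat([att_cache, ys], axis=1)
        return ys, new_cache

model = TinyEncoder()
static_model = paddle.jit.to_static(
    model,
    input_spec=[
        InputSpec(shape=[1, None, 80], dtype='float32'),   # variable chunk length
        InputSpec(shape=[1, None, 256], dtype='float32'),  # growing attention cache
    ])
paddle.jit.save(static_model, 'exported/conformer_sketch')
```

paddle.jit.save then writes the inference program and parameters that deployment runtimes load; the None dimensions in the InputSpec keep the chunk length and cache size dynamic in the exported graph.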