From e6cbcca3e220b3b2ae869055f0771b48958b512b Mon Sep 17 00:00:00 2001
From: TianYuan <white-sky@qq.com>
Date: Fri, 16 Sep 2022 16:23:47 +0800
Subject: [PATCH] fix ERNIE-SAT README, test=doc (#2392)

---
 examples/aishell3/ernie_sat/README.md      | 13 ++++++-------
 examples/aishell3_vctk/ernie_sat/README.md | 13 ++++++-------
 examples/vctk/ernie_sat/README.md          | 11 +++++------
 3 files changed, 17 insertions(+), 20 deletions(-)
diff --git a/examples/aishell3/ernie_sat/README.md b/examples/aishell3/ernie_sat/README.md
index 707ee13814a..eb867ab75ee 100644
--- a/examples/aishell3/ernie_sat/README.md
+++ b/examples/aishell3/ernie_sat/README.md
@@ -1,11 +1,10 @@
-# ERNIE-SAT with AISHELL3 dataset
+# ERNIE-SAT with AISHELL3 dataset
+ERNIE-SAT is a speech-text joint pretraining framework that achieves state-of-the-art results in cross-lingual multi-speaker speech synthesis and cross-lingual speech editing tasks. It can be applied to a series of scenarios such as speech editing, personalized speech synthesis, and voice cloning.
 
-ERNIE-SAT 是可以同时处理中英文的跨语言的语音-语言跨模态大模型，其在语音编辑、个性化语音合成以及跨语言的语音合成等多个任务取得了领先效果。可以应用于语音编辑、个性化合成、语音克隆、同传翻译等一系列场景，该项目供研究使用。
-
-## 模型框架
-ERNIE-SAT 中我们提出了两项创新：
-- 在预训练过程中将中英双语对应的音素作为输入，实现了跨语言、个性化的软音素映射
-- 采用语言和语音的联合掩码学习实现了语言和语音的对齐
+## Model Framework
+In ERNIE-SAT, we propose two innovations:
+- During pretraining, the phonemes corresponding to both Chinese and English are used as input to achieve cross-lingual, personalized soft phoneme mapping
+- Joint mask learning of speech and text is used to align speech with text
 
 <p align="center">
     <img src="https://user-images.githubusercontent.com/24568452/186110814-1b9c6618-a0ab-4c0c-bb3d-3d860b0e8cc2.png" />
diff --git a/examples/aishell3_vctk/ernie_sat/README.md b/examples/aishell3_vctk/ernie_sat/README.md
index a849488d552..d55af67568d 100644
--- a/examples/aishell3_vctk/ernie_sat/README.md
+++ b/examples/aishell3_vctk/ernie_sat/README.md
@@ -1,11 +1,10 @@
-# ERNIE-SAT with AISHELL3 and VCTK dataset
+# ERNIE-SAT with AISHELL3 and VCTK dataset
+ERNIE-SAT is a speech-text joint pretraining framework that achieves state-of-the-art results in cross-lingual multi-speaker speech synthesis and cross-lingual speech editing tasks. It can be applied to a series of scenarios such as speech editing, personalized speech synthesis, and voice cloning.
 
-ERNIE-SAT 是可以同时处理中英文的跨语言的语音-语言跨模态大模型，其在语音编辑、个性化语音合成以及跨语言的语音合成等多个任务取得了领先效果。可以应用于语音编辑、个性化合成、语音克隆、同传翻译等一系列场景，该项目供研究使用。
-
-## 模型框架
-ERNIE-SAT 中我们提出了两项创新：
-- 在预训练过程中将中英双语对应的音素作为输入，实现了跨语言、个性化的软音素映射
-- 采用语言和语音的联合掩码学习实现了语言和语音的对齐
+## Model Framework
+In ERNIE-SAT, we propose two innovations:
+- During pretraining, the phonemes corresponding to both Chinese and English are used as input to achieve cross-lingual, personalized soft phoneme mapping
+- Joint mask learning of speech and text is used to align speech with text
 
 <p align="center">
     <img src="https://user-images.githubusercontent.com/24568452/186110814-1b9c6618-a0ab-4c0c-bb3d-3d860b0e8cc2.png" />
diff --git a/examples/vctk/ernie_sat/README.md b/examples/vctk/ernie_sat/README.md
index 0a2f9359e0a..94c7ae25d7d 100644
--- a/examples/vctk/ernie_sat/README.md
+++ b/examples/vctk/ernie_sat/README.md
@@ -1,11 +1,10 @@
 # ERNIE-SAT with VCTK dataset
+ERNIE-SAT is a speech-text joint pretraining framework that achieves state-of-the-art results in cross-lingual multi-speaker speech synthesis and cross-lingual speech editing tasks. It can be applied to a series of scenarios such as speech editing, personalized speech synthesis, and voice cloning.
 
-ERNIE-SAT 是可以同时处理中英文的跨语言的语音-语言跨模态大模型，其在语音编辑、个性化语音合成以及跨语言的语音合成等多个任务取得了领先效果。可以应用于语音编辑、个性化合成、语音克隆、同传翻译等一系列场景，该项目供研究使用。
-
-## 模型框架
-ERNIE-SAT 中我们提出了两项创新：
-- 在预训练过程中将中英双语对应的音素作为输入，实现了跨语言、个性化的软音素映射
-- 采用语言和语音的联合掩码学习实现了语言和语音的对齐
+## Model Framework
+In ERNIE-SAT, we propose two innovations:
+- During pretraining, the phonemes corresponding to both Chinese and English are used as input to achieve cross-lingual, personalized soft phoneme mapping
+- Joint mask learning of speech and text is used to align speech with text
 
 <p align="center">
     <img src="https://user-images.githubusercontent.com/24568452/186110814-1b9c6618-a0ab-4c0c-bb3d-3d860b0e8cc2.png" />