Feature #4155 Text To Speech in Python (#36)
Co-authored-by: rubynguyen1510 <[email protected]>
Mushmou and rubynguyen1510 authored Aug 3, 2023
1 parent 92ea32b commit 12462c1
Showing 5 changed files with 685 additions and 0 deletions.
85 changes: 85 additions & 0 deletions python/text-to-speech/README.md
@@ -0,0 +1,85 @@
# 🗣️ Text To Speech with Google, Azure and AWS API

A Python cloud function for text to speech synthesis using [Google](https://cloud.google.com/text-to-speech), [Azure](https://azure.microsoft.com/en-us/products/ai-services/text-to-speech) and [AWS](https://docs.aws.amazon.com/polly/latest/dg/API_SynthesizeSpeech.html).

### Supported Providers and Language Codes
| Providers | Language Code (BCP-47) |
| ----------- | ----------- |
| Google |[Google Language Code](https://cloud.google.com/text-to-speech/docs/voices) |
| Azure |[Azure Language Code](https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support?tabs=stt) |
| AWS |[AWS Language Code](https://docs.aws.amazon.com/polly/latest/dg/API_SynthesizeSpeech.html) |

### Example input:
```json
{
  "provider": "<YOUR_PROVIDER_HERE>",
  "language": "<YOUR_LANGUAGE_CODE>",
  "text": "Hello world!"
}
```
### Example output:
```json
{
  "success": true,
  "audio_bytes": "iVBORw0KGgoAAAANSUhE...o6Ie+UAAAAASU5CYII="
}
```
### Example error output:
```json
{
  "success": false,
  "error": "Missing API_KEY."
}
```
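
The `audio_bytes` field holds the synthesized audio as base64 text, and all three providers in `main.py` request MP3 output. The following is a minimal sketch of how a caller might decode a response; the helper names `decode_audio` and `looks_like_mp3` are illustrative and not part of this function, and the MP3 check is only a heuristic.

```python
import base64
import json


def decode_audio(response_body: str) -> bytes:
    """Return the raw audio bytes from the function's JSON response.

    Raises RuntimeError when the response reports a failure.
    """
    data = json.loads(response_body)
    if not data.get("success"):
        raise RuntimeError(data.get("error", "Unknown error"))
    return base64.b64decode(data["audio_bytes"])


def looks_like_mp3(audio: bytes) -> bool:
    """Heuristic: the data starts with an ID3 tag or an MPEG frame sync."""
    if audio[:3] == b"ID3":
        return True
    return len(audio) >= 2 and audio[0] == 0xFF and (audio[1] & 0xE0) == 0xE0


# Hypothetical usage with a response body received from the function:
#     audio = decode_audio(body)
#     with open("speech.mp3", "wb") as handle:
#         handle.write(audio)
```

Raising on `"success": false` keeps the error message from the function visible to the caller instead of failing later on a missing `audio_bytes` key.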

## 📝 Environment Variables
List of environment variables used by this cloud function:
- **API_KEY** - Required for Google, Azure, and AWS.
- **PROJECT_ID** - Required for Google.
- **SECRET_API_KEY** - Required for AWS.

| **Google** | **AWS** | **Azure** |
| -------- | -------- | -------- |
| API_KEY | API_KEY | API_KEY |
| PROJECT_ID | SECRET_API_KEY | |

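The per-provider requirements above can be expressed as a small lookup table. This is a sketch only: `REQUIRED_VARIABLES` and `missing_variables` are hypothetical names, and `main.py` performs the equivalent checks inside each provider class instead.

```python
# Variables each provider needs, mirroring the table above.
REQUIRED_VARIABLES = {
    "google": ("API_KEY", "PROJECT_ID"),
    "azure": ("API_KEY",),
    "aws": ("API_KEY", "SECRET_API_KEY"),
}


def missing_variables(provider: str, variables: dict) -> list[str]:
    """Return the names of required variables that are absent or empty."""
    required = REQUIRED_VARIABLES.get(provider.lower())
    if required is None:
        raise ValueError("Invalid Provider.")
    return [name for name in required if not variables.get(name)]
```

For example, `missing_variables("aws", {"API_KEY": "abc"})` reports that `SECRET_API_KEY` is still needed.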

## 🚀 Deployment

1. Clone this repository, and enter this function folder:

```bash
git clone https://github.com/open-runtimes/examples.git && cd examples
cd python/text-to-speech
```

2. Build the code:
```bash
docker run --rm --interactive --tty --volume $PWD:/usr/code openruntimes/python:v2-3.10 sh /usr/local/src/build.sh
```
As a result, a `code.tar.gz` file will be generated.

3. Start the Open Runtime:
```bash
docker run -p 3000:3000 -e INTERNAL_RUNTIME_KEY=secret-key -e INTERNAL_RUNTIME_ENTRYPOINT=main.py --rm --interactive --tty --volume $PWD/code.tar.gz:/tmp/code.tar.gz:ro openruntimes/python:v2-3.10 sh /usr/local/src/start.sh
```

Your function is now listening on port `3000`, and you can execute it by sending a `POST` request with the appropriate authorization headers. To learn more about the runtime, visit the Python runtime [README](https://github.com/open-runtimes/open-runtimes/tree/main/openruntimes/python:v2-3.10).

> In the requests in step 4, replace placeholders such as `<YOUR_API_KEY>` with your own credentials.

4. Send a request with cURL.
> Google cURL example (requires `API_KEY` and `PROJECT_ID` in `variables`)
```bash
curl http://localhost:3000/ -H "X-Internal-Challenge: secret-key" -H "Content-Type: application/json" -d '{"payload": {"provider": "google", "language": "en-US", "text": "Hello World!"}, "variables": {"API_KEY": "<YOUR_API_KEY>", "PROJECT_ID": "<YOUR_PROJECT_ID>"}}'
```
> Azure cURL example (requires `API_KEY` in `variables`)
```bash
curl http://localhost:3000/ -H "X-Internal-Challenge: secret-key" -H "Content-Type: application/json" -d '{"payload": {"provider": "azure", "language":"en-US", "text": "Hello World!"}, "variables": {"API_KEY": "<YOUR_API_KEY>"}}'
```
> AWS cURL example (requires `API_KEY` and `SECRET_API_KEY` in `variables`)
```bash
curl http://localhost:3000/ -H "X-Internal-Challenge: secret-key" -H "Content-Type: application/json" -d '{"payload": {"provider": "aws", "language":"en-US", "text":"Hello World!"}, "variables": {"API_KEY": "<YOUR_API_KEY>", "SECRET_API_KEY": "<YOUR_SECRET_API_KEY>"}}'
```
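
The same request can be built from Python with only the standard library. This is a sketch that mirrors the cURL examples above; `build_request` is a hypothetical helper, and the default URL and runtime key are the local values used in step 3.

```python
import json
import urllib.request


def build_request(provider, language, text, variables,
                  url="http://localhost:3000/", runtime_key="secret-key"):
    """Build a POST request with the same body and headers as the cURL calls."""
    body = json.dumps({
        "payload": {"provider": provider, "language": language, "text": text},
        "variables": variables,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "X-Internal-Challenge": runtime_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical usage against a locally running runtime:
#     request = build_request("azure", "en-US", "Hello World!",
#                             {"API_KEY": "<YOUR_API_KEY>"})
#     with urllib.request.urlopen(request, timeout=30) as response:
#         print(response.read().decode())
```

Separating request construction from sending makes it possible to inspect the body and headers without a running runtime.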
## 📝 Notes
- This function is designed for use with Appwrite Cloud Functions. You can learn more about it in [Appwrite docs](https://appwrite.io/docs/functions).
- This example is compatible with Python 3.10. Other versions may work but have not been tested.
294 changes: 294 additions & 0 deletions python/text-to-speech/main.py
@@ -0,0 +1,294 @@
"""Synthesize text to speech using Google, Azure and AWS API."""
# Standard library
import abc
import base64

# Third party
import boto3
import requests
from google.cloud import texttospeech


class TextToSpeech(abc.ABC):
    """Base class for Text to Speech."""

    def __init__(self, req: requests) -> None:
        """Validate the request when a provider is instantiated."""
        self.validate_request(req)

    @abc.abstractmethod
    def validate_request(self, req: requests) -> None:
        """Abstract validate request method for providers."""

    @abc.abstractmethod
    def speech(self, text: str, language: str) -> bytes:
        """Abstract speech method for providers."""


class Google(TextToSpeech):
    """Represent the implementation of Google text to speech."""

    def validate_request(self, req: requests) -> None:
        """
        Validate the request data for Google text to speech.

        Input:
            req (request): The request provided by the user.
        Raises:
            ValueError: If any required value is missing or invalid.
        """
        if not req.variables.get("API_KEY"):
            raise ValueError("Missing API_KEY.")
        if not req.variables.get("PROJECT_ID"):
            raise ValueError("Missing PROJECT_ID.")
        self.api_key = req.variables.get("API_KEY")
        self.project_id = req.variables.get("PROJECT_ID")

    def speech(self, text: str, language: str) -> bytes:
        """
        Convert the given text into speech with the Google text to speech API.

        Input:
            text: The text to be converted into speech.
            language: The language code (BCP-47 format).
        Returns:
            bytes: The synthesized speech in bytes.
        """
        # Instantiate a client.
        client = texttospeech.TextToSpeechClient(
            client_options={
                "api_key": self.api_key,
                "quota_project_id": self.project_id,
            }
        )
        # Set the text input to be synthesized.
        synthesis_input = texttospeech.SynthesisInput(text=text)
        # Build the voice request: use the requested language code
        # (for example "en-US") and a neutral SSML voice gender.
        voice = texttospeech.VoiceSelectionParams(
            language_code=language,
            ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL,
        )
        # Select the type of audio file you want returned.
        audio_config = texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.MP3,
        )
        # Perform the text-to-speech request on the text input
        # with the selected voice parameters and audio file type.
        response = client.synthesize_speech(
            input=synthesis_input,
            voice=voice,
            audio_config=audio_config,
        )
        return response.audio_content


class Azure(TextToSpeech):
    """Represent the implementation of Azure text to speech."""

    VOICE = "en-US-ChristopherNeural"
    GENDER = "Male"
    REGION = "westus"
    FETCH_TOKEN_URL = (
        "https://westus.api.cognitive.microsoft.com/sts/v1.0/issuetoken"
    )

    def validate_request(self, req: requests) -> None:
        """
        Validate the request data for Azure text to speech.

        Input:
            req (request): The request provided by the user.
        Raises:
            ValueError: If any required value is missing or invalid.
        """
        if not req.variables.get("API_KEY"):
            raise ValueError("Missing API_KEY.")
        self.api_key = req.variables.get("API_KEY")

    def get_token(self, subscription_key: str) -> str:
        """Return an Azure token for a given subscription key."""
        headers = {
            "Ocp-Apim-Subscription-Key": subscription_key
        }
        # Send request with subscription key.
        response = requests.post(
            self.FETCH_TOKEN_URL,
            headers=headers,
            timeout=10,
        )
        response.raise_for_status()
        # The response body is the access token, valid for 10 minutes.
        return response.text

    def speech(self, text: str, language: str) -> bytes:
        """
        Convert the given text into speech with the Azure text to speech API.

        Input:
            text: The text to be converted into speech.
            language: The language code (BCP-47 format).
        Returns:
            bytes: The synthesized speech in bytes.
        """
        # Endpoint for the cognitive services speech API.
        url = (
            f"https://{self.REGION}.tts."
            "speech.microsoft.com/cognitiveservices/v1"
        )
        # Headers and auth for request.
        headers_azure = {
            "Content-type": "application/ssml+xml",
            "Authorization": "Bearer " + self.get_token(self.api_key),
            "X-Microsoft-OutputFormat": "audio-16khz-32kbitrate-mono-mp3",
        }
        # SSML document describing the voice and the text to speak.
        data_azure = (
            f"<speak version='1.0' xml:lang='{language}'><voice "
            f"xml:lang='{language}' xml:gender='{self.GENDER}' "
            f"name='{self.VOICE}'>{text}</voice></speak>"
        )
        response = requests.post(
            url,
            headers=headers_azure,
            data=data_azure,
            timeout=10,
        )
        response.raise_for_status()
        return response.content


class AWS(TextToSpeech):
    """Represent the implementation of AWS text to speech."""

    VOICE_ID = "Joanna"
    REGION = "us-west-2"

    def validate_request(self, req: requests) -> None:
        """
        Validate the request data for AWS text to speech.

        Input:
            req (request): The request provided by the user.
        Raises:
            ValueError: If any required value is missing or invalid.
        """
        if not req.variables.get("API_KEY"):
            raise ValueError("Missing API_KEY.")
        if not req.variables.get("SECRET_API_KEY"):
            raise ValueError("Missing SECRET_API_KEY.")
        self.api_key = req.variables.get("API_KEY")
        self.secret_api_key = req.variables.get("SECRET_API_KEY")

    def speech(self, text: str, language: str) -> bytes:
        """
        Convert the given text into speech with the AWS text to speech API.

        Input:
            text: The text to be converted into speech.
            language: The language code (BCP-47 format).
        Returns:
            bytes: The synthesized speech in bytes.
        """
        # Create a Polly client from a boto3 session.
        polly_client = boto3.Session(
            aws_access_key_id=self.api_key,
            aws_secret_access_key=self.secret_api_key,
            region_name=self.REGION,
        ).client("polly")

        # Get response from polly client.
        response = polly_client.synthesize_speech(
            VoiceId=AWS.VOICE_ID,
            OutputFormat="mp3",
            Text=text,
            LanguageCode=language,
        )
        return response["AudioStream"].read()


def validate_common(req: requests) -> tuple[str, str, str]:
    """
    Validate common fields in request.

    Input:
        req (request): The request provided by the user.
    Returns:
        (tuple): A tuple containing the lowercased provider, the text
            and the language from the request.
    Raises:
        ValueError: If the payload, the variables, or any of the common
            fields (provider, text, language) are missing.
    """
    # Check if the payload is empty.
    if not req.payload:
        raise ValueError("Missing Payload.")

    # Check if variables is empty.
    if not req.variables:
        raise ValueError("Missing Variables.")

    # Check if provider is empty.
    if not req.payload.get("provider"):
        raise ValueError("Missing Provider.")

    # Check if text is empty.
    if not req.payload.get("text"):
        raise ValueError("Missing Text.")

    # Check if language is empty.
    if not req.payload.get("language"):
        raise ValueError("Missing Language.")

    # Return the provider, text and language.
    return (
        req.payload.get("provider").lower(),
        req.payload.get("text"),
        req.payload.get("language"),
    )


def main(req: requests, res: str) -> str:
    """
    Main Function for Text to Speech.

    Input:
        req(request): The request from the user.
        res(json): The response for the user.
    Returns:
        (json): JSON representing the success value of the text to speech api
            containing the synthesized audio in base64 encoded format.
    """
    try:
        provider, text, language = validate_common(req)
        if provider == "google":
            provider_class = Google(req)
        elif provider == "azure":
            provider_class = Azure(req)
        elif provider == "aws":
            provider_class = AWS(req)
        else:
            raise ValueError("Invalid Provider.")
    except ValueError as value_error:
        return res.json({
            "success": False,
            "error": str(value_error),
        })
    try:
        audio_bytes = provider_class.speech(text, language)
    except Exception as error:
        return res.json({
            "success": False,
            "error": f"{type(error).__name__}: {error}",
        })
    return res.json({
        "success": True,
        "audio_bytes": base64.b64encode(audio_bytes).decode(),
    })
5 changes: 5 additions & 0 deletions python/text-to-speech/requirements.txt
@@ -0,0 +1,5 @@
boto3==1.28.9
botocore==1.31.9
google-cloud-texttospeech==2.14.1
parameterized==0.9.0
requests==2.31.0
1 change: 1 addition & 0 deletions python/text-to-speech/results.txt
@@ -0,0 +1 @@
//NExAASoAEkAUAAAXwGQ/p/gDhGY/4fADM/x/ADAzMfwfAM6/o/AyB+Y/N4A7pkc+A7AyOeH/AMwZD/0+A4P8/4DgDo/4fADo/w/gBgZ4f4/AR63R+AhfA81AUf47wD//NExAgUiyowAZNoAIHidDXtgtAXBH/EmB2CZkr/4WsZZff/+tOHPLw8P/8S8Yc3RN3Hp//+XEEXQNzM3////qNHMzcwLiCac0/////zM3bWncwNC+boKhAsSKDAV90A//NExAgS4cKcAYNQAM+vABFdTDxmRCqimDwaEZaIkBAUgNs/NLnbGEhMNPzrVIFPvJm/ONfYwkE+Ri5l/+Y11WLnAAXV+n6HjBQmc//+tLaagQESJOt3hyTZiQnjMvw6//NExA8TMWKoAc9IAFYEPV3dawoSzDBkasnJWCooQpTjVuz/P9/zz/3/fWQ/j3LQPH1nZe+++0EQcPlKrkoFWGBCMftpbLOTvFRagKXtjORQvVLmEjQcfnk70D7pU44b//NExBUWuTagAMvMcDtwm6jSR/sJNlQTAxWVzG+QdAJrAomaWUlU/yzNX+T3z//9m1medMBsUYkgeGBEYgFAbSFGO0OOAuQiizvfUOD0uoe6fx3u7JlQ3nZYLMs3UOao//NExA0VgWawAMYMlKw4YVjA12JJyT0fBI0uKzxl83MlCXgQieqwO2nyudv6pLP4Um/zt4973h+2fx+9/T03XRCSZPHJp1AIMjsoo0EPtWz/2F3AQwrvLoMXLesIVzrT//NExAoU8W60AMYQlCR77CH4FFF5QIEOLk6ALq4t1JeV0sKALV+zDpq9luT6/3VHl9Bz/sfzfp8L9+8XBrYdg8DUkB5lUJpkSKowQzVc3A1nctiP/xwx39CitfwVQwyz//NExAkSEVq8AMPMlGLWpsVa02Kdgz46LEOgRyRwm07wNAY0rsvMVTqaNSaNaBLqk1fdtfX/////d23VmoGgthZ1zc1CLJHC5Iwd+lXFogOmaGSeL2E8uLOd1kyECQp+//NExBMRIVK8AHvMlQ+zheivslKjEZU+DDLC2nWswY8a943vjWr43vb43b//vrM9MTpIBgqBA2uf6V1A6h+l3xIBQ8u7JU/BYnzwQGKPgGCQmpwHEcgkLWpzrH6vjRP4//NExCESeWa4AHvKlIeQc2jrfKlxfzTyRZtbrjOldd1/qlHFhpzCYsLsGi1GMPMokZxSeu/S3QHENVuJbVvNaxfC+qgvg1lYykKqXQlx/F4LkujlMU41Cca2Jqc4mA8q//NExCoQqS64AHvOcXi00oapjnJtRf/3Z0UdID4lF2jRXMIZNY3dcKxY4X0aFLncnn0n3LFmGErC9H2iDBLYrRNR/k9KxHHyrTdblCaBtgNCwYDxoh5Io7nL6O/+iW9R//NExDoSESasAMPOcKoPBwiJnnR4YDa///9iVbijphYgkTSaw1ltIChmtbtuzeBsrOSMQDMEwSHw4TiQwJKIqCMTAyENltitougYm95rITet3eZC3DO8SlmflZhSkg+c//NExEQSKVqkAMMElP9P//yq0quA2ohqww/Sw/UlkPxyOUTAkojm5Hgotq1QQFREOmA2uC/BAyCQPl0IJhsL6jyfheXFO5y8v7r7J02CGmoQLHQ2FHtO0////JC9kpDN//NExE4SqRakAMPScCRKAx25MZxaO1owbRoARBXA0NBUDMCC2kWrKyUXIVSDBKAoH4FC70CiFfHJt1BiN05fARmQVcPdKkv////91iPR/qXoEElSkhA0OySG4jFK0yWH//NExFYRYPagAMJScMyF0JMIDZKabcwpclYV40vrriFJiUA6fKsmuIzlQulz27tnbT7qFaOoHLmUBUjQKX////zPbSrEypUxIgujlFb0Tuy9/C+lWcqAYWnLdWNI/kta//NExGMRmQqcAMJYcLnXtaUtjdQNLSbQWGdFmSN5azCHw74RtZxtSY4yJlAYIgdH////Q7///0V+DkRh
YkxqkghyLSmH5lyn/hxaroW4NooZzz1+HbM/s8/8bV+1eCpd//NExG8RqRKYAMPScOeDQBSwJiwJAqPhVEySSHCAUpszVSn2xcMjCRUBHTVn////pvX29P+UPPDM4HNw/TEZVLxQGpgXdRNjmgcfaU02He+SJPJVIcJhz54mlxax8Lmp//NExHsUQR6QAVhIACczCuCXAKIZYXQTwE/AgKzIyWcF8DOBGi6J6MMLQXygkPUexk7VLMiEamwww9lj2JUpGxK0tfmrLrSqrQuzf/91oqN0klHupa0Eta//1MnP9kTY//NExH0hEqJ4AZpoAdb/bIFpG9ZVgkPqlevVgL5Om9klcaBqFjLNI5RxmxKalnDHdnn/tMvtZtMc5Nevu7ekBeYHYRUIxFqupOqVicgAsbol8L5rXO29ad0wct7aU91A//NExEsbMWoUAdhgAGtcIFUxp9qDKHvUTbKkhEQJE6mXVi1Jo8oDXLJJUKKYp61BEa1KIwLTSqjjMMxqmrTCrDVCpUKoHway4xry8rjHfHNj8lKcSElSCwqRKCoVPQoc//NExDEYwUYAAMJScJaWBIEg0TJiklrylcY+TSjxUioGn15INPLA1iIeCrhMDT+uIpGdeInnmDXM5Iqtj6gaCYhmCuKPnnWWtyq4hGg+XJD7LTKIhLHF2NyeSiqtdXJV//NExCESkbWMAGJElCTXhrOxioh2//oq/KiqjtuUyKn/3KGBgwTMmQEK/xUW/+Kijf6haoWFVUxBTTXYUNNntBQ28i7Tc/GCN14Jo3UyUTcUPONmtTU0xF9lFfBVi5jj//NExCkAAANIAAAAAF5UP5IJoGgaCohx+SxSiw7F3fQXPuixe0kFAeGFnkGIji9wn/yWLDv6Abn5/oiIiIn//T4iJXcO9dw4GLeaF13d3Qrz+Cw8PH+8/4AH8AQnAABH//NExHwAAANIAAAAAIAYe072jOCP8zxCBSiPGaUSFnSoXydOnhkDLKzVx9ejtbWzWtGjkgVKZkCDo6C/j/4q5LBRR++nDZYbjvm2VZFRfG7hdWSy/Oacyi/v//lRJuX+//NExMwH4CQAAPe8AEuqxZZf4+F5V3m/Kb/6/GoAqLBUhFduI/76O27EYo8cL+NGlM7PUnCRQo8y4djSij4dpOEgQEfDtqiL/9nKinb5TBlI7OVP/+YoYGDIfULC4rFR//NExP8Y6f3sAHoGmVEj//rFcVFjVbP4qKcWTEFNRTMuMTAwqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq//NExO4V0LIAAHpMTaqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq//NExOkUQbmYAMGElKqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq