Merged
25 changes: 25 additions & 0 deletions .kokoro/functions/speech-to-speech.cfg
@@ -0,0 +1,25 @@
# Format: //devtools/kokoro/config/proto/build.proto

# Set the folder in which the tests are run
env_vars: {
  key: "PROJECT"
  value: "functions/speech-to-speech"
}

# Tell the trampoline which build file to use.
env_vars: {
  key: "TRAMPOLINE_BUILD_FILE"
  value: "github/nodejs-docs-samples/.kokoro/build.sh"
}

# Environment values for tests that Kokoro doesn't provide natively
env_vars: {
  key: "OUTPUT_BUCKET"
  value: "6fa8d42c-a0f5-474e-a52b-687eb54c3f01"
}

env_vars: {
  key: "SUPPORTED_LANGUAGE_CODES"
  value: "en,es,fr"
}

16 changes: 16 additions & 0 deletions functions/speech-to-speech/.gcloudignore
@@ -0,0 +1,16 @@
# This file specifies files that are *not* uploaded to Google Cloud Platform
# using gcloud. It follows the same syntax as .gitignore, with the addition of
# "#!include" directives (which insert the entries of the given .gitignore-style
# file at that point).
#
# For more information, run:
# $ gcloud topic gcloudignore
#
.gcloudignore
# If you would like to upload your .git directory, .gitignore file or files
# from your .gitignore file, remove the corresponding line
# below:
.git
.gitignore

node_modules
1 change: 1 addition & 0 deletions functions/speech-to-speech/.nvmrc
@@ -0,0 +1 @@
v6.14.4
127 changes: 127 additions & 0 deletions functions/speech-to-speech/README.md
@@ -0,0 +1,127 @@
# Speech-to-Speech Translation Sample

The Speech-to-Speech Translation sample uses the [Speech-to-Text][1],
[Translation][2], and [Text-to-Speech][3] APIs to translate an audio message to
another language. The sample uses [Google Cloud Functions][4] to wrap up the
calls to the APIs to show how you can incrementally add features to your
existing apps, whether they're hosted on Google Cloud Platform or not.
The sample receives the input audio message as base64-encoded text and writes
the translated audio messages to [Google Cloud Storage][5], where existing apps
can retrieve them.

## Prerequisites

Before using the sample app, make sure that you have the following
prerequisites:

* A [Google Cloud Platform][0] (GCP) account with the following APIs enabled:
  * Cloud Speech API
  * Cloud Text-to-Speech API
  * Cloud Translation API
* An API key file for a service account that has permissions to use the APIs
  mentioned in the previous prerequisite. For more information, see [Using API
  Keys][8].
* [Node Version Manager][6] (NVM)

## Configuring the sample

To configure the sample, you must declare the required environment variables,
set up NVM, and install the [Cloud Functions Node.js emulator][7].

The sample requires the following environment variables:

* `GCF_REGION`: The region where your Cloud Function is deployed. For available
  regions, see [Cloud Functions Locations][11] in the Functions documentation.
* `GOOGLE_CLOUD_PROJECT`: The project ID of your GCP project.
* `BASE_URL`: The URL that the Cloud Functions emulator uses to serve requests.
* `OUTPUT_BUCKET`: A bucket where the sample stores translated files. The
  test script creates this bucket if it doesn't exist.
* `GOOGLE_APPLICATION_CREDENTIALS`: The path to your API key file.
* `SUPPORTED_LANGUAGE_CODES`: Comma-separated list of languages that the sample
  translates messages to.

Use the following commands to declare the required environment variables:

```
export GCF_REGION=us-central1
export GOOGLE_CLOUD_PROJECT=[your-GCP-project-id]
export BASE_URL=http://localhost:8010/$GOOGLE_CLOUD_PROJECT/$GCF_REGION
export OUTPUT_BUCKET=[your-Google-Cloud-Storage-bucket]
export GOOGLE_APPLICATION_CREDENTIALS=[path-to-your-API-key-file]
export SUPPORTED_LANGUAGE_CODES=en,es,fr
```
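
Because `index.js` reads values such as `SUPPORTED_LANGUAGE_CODES` and
`OUTPUT_BUCKET` from `process.env` when it loads, a missing variable only
surfaces at runtime. The following is a minimal sketch, not part of the sample,
of a check that you could run before starting the emulator. The
`findMissingEnvVars` helper name is hypothetical.

```javascript
'use strict';

// Names taken from the list of required environment variables above.
const REQUIRED_ENV_VARS = [
  'GCF_REGION',
  'GOOGLE_CLOUD_PROJECT',
  'BASE_URL',
  'OUTPUT_BUCKET',
  'GOOGLE_APPLICATION_CREDENTIALS',
  'SUPPORTED_LANGUAGE_CODES'
];

// Hypothetical helper: returns the names of the required variables that are
// unset or empty in the given environment object.
function findMissingEnvVars (env) {
  return REQUIRED_ENV_VARS.filter(name => !env[name]);
}

const missing = findMissingEnvVars(process.env);
if (missing.length > 0) {
  console.error(`Missing environment variables: ${missing.join(', ')}`);
}
```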

The sample includes an `.nvmrc` file that declares the version of Node.js that
you should use to run the app. Run the following command to set up NVM to work
with the Node.js version declared in the `.nvmrc` file:

```
nvm install && nvm use
```

Run the following commands to install and start the Cloud Functions emulator:

```
npm install -g @google-cloud/functions-emulator
functions-emulator start
```

## Running the tests

The test script performs the following tasks:

1. Runs the linter.
1. Deploys the function to the emulator.
1. Runs tests that don't perform any calls to the Google Cloud APIs.
1. Creates the output bucket if it doesn't exist.
1. Runs tests that perform calls to the Google Cloud APIs and drop the
translated messages to the bucket.
1. Deletes the files created during the tests.

To run the tests, use the following command from the
`functions/speech-to-speech` folder:

```
npm install && npm test
```

## Sending a request to the emulator

After the tests have run, you can send a request to the emulator using an HTTP
tool, such as [curl][10]. Before sending a request, make sure that the
`OUTPUT_BUCKET` environment variable points to an existing bucket. If you update
the environment variables, you must restart the emulator to apply the new
values. Use the following command to restart the emulator:

```
functions-emulator restart
```

The sample includes a `test/request-body.json` file containing a JSON object
that represents the body of a valid request, including the base64-encoded audio
message. Run the following command to send a request to the emulator:

```
curl --request POST --header "Content-Type:application/json" \
--data @test/request-body.json $BASE_URL/speechTranslate
```
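
To send your own audio instead of the bundled file, you need a body with the
four fields that the function validates: `encoding`, `sampleRateHertz`,
`languageCode`, and `audioContent`. The following is a minimal sketch of how
such a body could be built in Node.js. The `buildRequestBody` helper and the
example values (`LINEAR16`, `16000`, `en`) are assumptions for illustration, not
part of the sample.

```javascript
'use strict';

// Hypothetical helper: builds the JSON body that the function validates
// (encoding, sampleRateHertz, languageCode, and audioContent).
function buildRequestBody (audioBuffer, encoding, sampleRateHertz, languageCode) {
  return {
    encoding: encoding,
    sampleRateHertz: sampleRateHertz,
    languageCode: languageCode,
    // The function expects the audio as base64-encoded text.
    audioContent: audioBuffer.toString('base64')
  };
}

// Placeholder bytes and example values; use the ones that match your recording.
const body = buildRequestBody(Buffer.from('raw audio bytes'), 'LINEAR16', 16000, 'en');
console.log(JSON.stringify(body));
```

In practice, you would likely read the buffer from an audio file with
`fs.readFileSync` and write the resulting JSON to a file that you pass to
`curl` with `--data`.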

The command returns a JSON object with information about the translated message.
You can also see the logs using the following command:

```
functions-emulator logs read
```
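
Based on `index.js`, the JSON object returned by `speechTranslate` contains the
`transcription`, the `gcsBucket`, and a `translations` array in which each
entry has a `languageCode` and either a `gcsPath` or an `error`. The following
is a minimal sketch of how a client might summarize that response. The
`summarizeTranslations` helper and the example response values are made up for
illustration.

```javascript
'use strict';

// Hypothetical helper: turns a response body into one line per language.
function summarizeTranslations (responseBody) {
  return responseBody.translations.map(translation => {
    if (translation.error) {
      return `${translation.languageCode}: failed (${translation.error})`;
    }
    return `${translation.languageCode}: gs://${responseBody.gcsBucket}/${translation.gcsPath}`;
  });
}

// Made-up response used only to illustrate the shape.
const exampleResponse = {
  transcription: 'hello',
  gcsBucket: 'my-bucket',
  translations: [
    { languageCode: 'es', text: 'hola', gcsPath: 'es/example.mp3' },
    { languageCode: 'fr', error: 'Quota exceeded' }
  ]
};
console.log(summarizeTranslations(exampleResponse).join('\n'));
```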

[0]: https://cloud.google.com
[1]: https://cloud.google.com/speech-to-text/
[2]: https://cloud.google.com/translate/
[3]: https://cloud.google.com/text-to-speech/
[4]: https://cloud.google.com/functions/
[5]: https://cloud.google.com/storage/
[6]: https://github.com/creationix/nvm
[7]: https://cloud.google.com/functions/docs/emulator
[8]: https://cloud.google.com/docs/authentication/api-keys
[10]: https://curl.haxx.se/
[11]: https://cloud.google.com/functions/docs/locations
177 changes: 177 additions & 0 deletions functions/speech-to-speech/index.js
@@ -0,0 +1,177 @@
/**
* Copyright 2018, Google, LLC
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

'use strict';

// This sample uses the UUID library to generate the output filename.
const uuid = require('uuid/v4');

const googleCloudProject = process.env.GOOGLE_CLOUD_PROJECT;
const supportedLanguageCodes = process.env.SUPPORTED_LANGUAGE_CODES.split(',');
const outputBucket = process.env.OUTPUT_BUCKET;
const outputAudioEncoding = 'MP3';
const voiceSsmlGender = 'NEUTRAL';
// Declare the API clients as global variables to allow them to initialize at cold start.
const speechToTextClient = getSpeechToTextClient();
const textTranslationClient = getTextTranslationClient();
const textToSpeechClient = getTextToSpeechClient();
const storageClient = getStorageClient();

exports.speechTranslate = (request, response) => {
  let responseBody = {};

  validateRequest(request).then(() => {
    const inputEncoding = request.body.encoding;
    const inputSampleRateHertz = request.body.sampleRateHertz;
    const inputLanguageCode = request.body.languageCode;
    const inputAudioContent = request.body.audioContent;

    console.log(`Input encoding: ${inputEncoding}`);
    console.log(`Input sample rate hertz: ${inputSampleRateHertz}`);
    console.log(`Input language code: ${inputLanguageCode}`);

    return callSpeechToText(
      inputAudioContent,
      inputEncoding,
      inputSampleRateHertz,
      inputLanguageCode
    );
  }).then(data => {
    const sttResponse = data[0];
    // The data object contains one or more recognition alternatives ordered by accuracy.
    const transcription = sttResponse.results
      .map(result => result.alternatives[0].transcript)
      .join('\n');
    responseBody.transcription = transcription;
    responseBody.gcsBucket = outputBucket;

    let translations = [];
    supportedLanguageCodes.forEach(languageCode => {
      let translation = { languageCode: languageCode };
      const filenameUUID = uuid();
      const filename = filenameUUID + '.' + outputAudioEncoding.toLowerCase();
      callTextTranslation(languageCode, transcription).then(data => {
        const textTranslation = data[0];
        translation.text = textTranslation;
        return callTextToSpeech(languageCode, textTranslation);
      }).then(data => {
        const path = languageCode + '/' + filename;
        return uploadToCloudStorage(path, data[0].audioContent);
      }).then(() => {
        console.log(`Successfully translated input to ${languageCode}.`);
        translation.gcsPath = languageCode + '/' + filename;
        translations.push(translation);
        if (translations.length === supportedLanguageCodes.length) {
          responseBody.translations = translations;
          console.log(`Response: ${JSON.stringify(responseBody)}`);
          response.status(200).send(responseBody);
        }
      }).catch(error => {
        console.error(`Partial error in translation to ${languageCode}: ${error}`);
        translation.error = error.message;
        translations.push(translation);
        if (translations.length === supportedLanguageCodes.length) {
          responseBody.translations = translations;
          console.log(`Response: ${JSON.stringify(responseBody)}`);
          response.status(200).send(responseBody);
        }
      });
    });
  }).catch(error => {
    console.error(error);
    response.status(400).send(error.message);
  });
};

function callSpeechToText (audioContent, encoding, sampleRateHertz, languageCode) {
  console.log(`Processing speech from audio content in ${languageCode}.`);

  const request = {
    config: {
      encoding: encoding,
      sampleRateHertz: sampleRateHertz,
      languageCode: languageCode
    },
    audio: { content: audioContent }
  };

  return speechToTextClient.recognize(request);
}

function callTextTranslation (targetLangCode, data) {
  console.log(`Translating text to ${targetLangCode}: ${data}`);

  return textTranslationClient.translate(data, targetLangCode);
}

function callTextToSpeech (targetLocale, data) {
  console.log(`Converting to speech in ${targetLocale}: ${data}`);

  const request = {
    input: { text: data },
    voice: { languageCode: targetLocale, ssmlGender: voiceSsmlGender },
    audioConfig: { audioEncoding: outputAudioEncoding }
  };

  return textToSpeechClient.synthesizeSpeech(request);
}

function uploadToCloudStorage (path, contents) {
  console.log(`Uploading audio file to ${path}`);

  return storageClient
    .bucket(outputBucket)
    .file(path)
    .save(contents);
}

function validateRequest (request) {
  return new Promise(function (resolve, reject) {
    // Return after the first failed check so that the later checks don't run.
    if (!request.body.encoding) {
      return reject(new Error('Invalid encoding.'));
    }
    if (!request.body.sampleRateHertz || isNaN(request.body.sampleRateHertz)) {
      return reject(new Error('Sample rate hertz must be numeric.'));
    }
    if (!request.body.languageCode) {
      return reject(new Error('Invalid language code.'));
    }
    if (!request.body.audioContent) {
      return reject(new Error('Invalid audio content.'));
    }

    resolve();
  });
}

function getSpeechToTextClient () {
  const speech = require('@google-cloud/speech');
  return new speech.SpeechClient();
}

function getTextTranslationClient () {
  const { Translate } = require('@google-cloud/translate');
  return new Translate({ projectId: googleCloudProject });
}

function getTextToSpeechClient () {
  const textToSpeech = require('@google-cloud/text-to-speech');
  return new textToSpeech.TextToSpeechClient();
}

function getStorageClient () {
  const { Storage } = require('@google-cloud/storage');
  return new Storage({ projectId: googleCloudProject });
}
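
The `speechTranslate` handler in `index.js` sends its response once a shared
`translations` array reaches the number of supported languages, and it repeats
that check in both the success and error branches. As an alternative, not the
approach this change takes, the same fan-out could be sketched with
`Promise.all`, converting each per-language failure into a result entry. The
`translateAll` and `fakeTranslateOne` names are hypothetical; `fakeTranslateOne`
stands in for the translate, synthesize, and upload chain.

```javascript
'use strict';

// Hypothetical helper: runs translateOne for every language code and resolves
// with one result per language, in the same order as languageCodes.
// translateOne(languageCode) must return a promise for a translation object.
function translateAll (languageCodes, translateOne) {
  return Promise.all(languageCodes.map(languageCode =>
    translateOne(languageCode).catch(error => {
      // Record the failure instead of rejecting, so other languages still finish.
      return { languageCode: languageCode, error: error.message };
    })
  ));
}

// Stand-in for the translate, synthesize, and upload chain; 'fr' fails on purpose.
function fakeTranslateOne (languageCode) {
  if (languageCode === 'fr') {
    return Promise.reject(new Error('synthesis failed'));
  }
  return Promise.resolve({
    languageCode: languageCode,
    gcsPath: `${languageCode}/example.mp3`
  });
}

translateAll(['en', 'es', 'fr'], fakeTranslateOne).then(translations => {
  console.log(JSON.stringify(translations));
});
```

With this shape, the handler would send a single response in one `.then`
callback, and the entries would follow the order of `SUPPORTED_LANGUAGE_CODES`
rather than completion order.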