Merged
Changes from 3 commits
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -12,7 +12,7 @@
resource-manager/api
runtimeconfig/usage
spanner/usage
speech/usage
speech/index
error-reporting/usage
monitoring/usage
logging/usage
7 changes: 0 additions & 7 deletions docs/speech/alternative.rst

This file was deleted.

7 changes: 0 additions & 7 deletions docs/speech/client.rst

This file was deleted.

7 changes: 0 additions & 7 deletions docs/speech/encoding.rst

This file was deleted.

6 changes: 6 additions & 0 deletions docs/speech/gapic/api.rst
@@ -0,0 +1,6 @@
Speech Client API
=================

.. automodule:: google.cloud.speech_v1
:members:
:inherited-members:
5 changes: 5 additions & 0 deletions docs/speech/gapic/types.rst
@@ -0,0 +1,5 @@
Speech Client Types
===================

.. automodule:: google.cloud.speech_v1.types
:members:
224 changes: 138 additions & 86 deletions docs/speech/usage.rst → docs/speech/index.rst
@@ -1,49 +1,41 @@
######
Speech
======

.. toctree::
:maxdepth: 2
:hidden:

client
encoding
operation
result
sample
alternative
######

The `Google Speech`_ API enables developers to convert audio to text.
The API recognizes over 80 languages and variants, to support your global user
base.

.. _Google Speech: https://cloud.google.com/speech/docs/getting-started

Client
------

:class:`~google.cloud.speech.client.Client` objects provide a
Authentication and Configuration
--------------------------------

:class:`~google.cloud.speech_v1.SpeechClient` objects provide a
means to configure your application. Each instance holds
an authenticated connection to the Cloud Speech Service.

For an overview of authentication in ``google-cloud-python``, see
:doc:`/core/auth`.

Assuming your environment is set up as described in that document,
create an instance of :class:`~google.cloud.speech.client.Client`.
create an instance of :class:`~.google.cloud.speech.SpeechClient`.

.. code-block:: python

>>> from google.cloud import speech
>>> client = speech.Client()
>>> client = speech.SpeechClient()


Asynchronous Recognition
------------------------

The :meth:`~google.cloud.speech.Client.long_running_recognize` sends audio
data to the Speech API and initiates a Long Running Operation. Using this
operation, you can periodically poll for recognition results. Use asynchronous
requests for audio data of any duration up to 80 minutes.
The :meth:`~.google.cloud.speech.SpeechClient.long_running_recognize` method
sends audio data to the Speech API and initiates a Long Running Operation.

Using this operation, you can periodically poll for recognition results.
Use asynchronous requests for audio data of any duration up to 80 minutes.
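
The polling described above can be sketched without touching the network. This is only an illustration under stated assumptions: ``wait_for_operation`` and ``FakeOperation`` are hypothetical names, and the ``done()`` method is an assumed interface standing in for whatever completion check the returned operation object exposes.

```python
import time


def wait_for_operation(operation, max_polls=100, delay=0.0):
    """Poll ``operation`` until it reports completion or the polls run out.

    ``operation`` is any object exposing ``done()``. This is an assumed
    interface, used here as a stand-in for the long-running operation.
    """
    for _ in range(max_polls):
        if operation.done():
            return True
        time.sleep(delay)
    # One last check after the final delay.
    return operation.done()


class FakeOperation:
    """Hypothetical stand-in that completes after a fixed number of polls."""

    def __init__(self, polls_needed):
        self.polls_needed = polls_needed
        self.polls = 0

    def done(self):
        self.polls += 1
        return self.polls >= self.polls_needed


finished = wait_for_operation(FakeOperation(polls_needed=3))
timed_out = wait_for_operation(FakeOperation(polls_needed=50), max_polls=5)
```

Bounding the number of polls keeps a stalled operation from blocking the caller forever; a real caller would likely use a non-zero ``delay``.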

See: `Speech Asynchronous Recognize`_

@@ -52,13 +44,16 @@ See: `Speech Asynchronous Recognize`_

>>> import time
>>> from google.cloud import speech
>>> client = speech.Client()
>>> sample = client.sample(source_uri='gs://my-bucket/recording.flac',
... encoding=speech.Encoding.LINEAR16,
... sample_rate_hertz=44100)
>>> operation = sample.long_running_recognize(
... language_code='en-US',
... max_alternatives=2,
>>> client = speech.SpeechClient()
>>> operation = client.long_running_recognize(
... audio=speech.types.RecognitionAudio(
... uri='gs://my-bucket/recording.flac',
... ),
... config=speech.types.RecognitionConfig(
... encoding='FLAC',
... language_code='en-US',
... sample_rate_hertz=44100,
... ),
... )
>>> retry_count = 100
>>> while retry_count > 0 and not operation.complete:
@@ -89,12 +84,17 @@ Great Britain.
.. code-block:: python

>>> from google.cloud import speech
>>> client = speech.Client()
>>> sample = client.sample(source_uri='gs://my-bucket/recording.flac',
... encoding=speech.Encoding.FLAC,
... sample_rate_hertz=44100)
>>> results = sample.recognize(
... language_code='en-GB', max_alternatives=2)
>>> client = speech.SpeechClient()
>>> results = client.recognize(
...     audio=speech.types.RecognitionAudio(
...         uri='gs://my-bucket/recording.flac',
...     ),
...     config=speech.types.RecognitionConfig(
...         encoding='FLAC',
...         language_code='en-GB',
...         sample_rate_hertz=44100,
...     ),
... ).results
>>> for result in results:
...     for alternative in result.alternatives:
...         print('=' * 20)
@@ -112,14 +112,17 @@ Example of using the profanity filter.
.. code-block:: python

>>> from google.cloud import speech
>>> client = speech.Client()
>>> sample = client.sample(source_uri='gs://my-bucket/recording.flac',
... encoding=speech.Encoding.FLAC,
... sample_rate_hertz=44100)
>>> results = sample.recognize(
... language_code='en-US',
... max_alternatives=1,
... profanity_filter=True,
>>> client = speech.SpeechClient()
>>> results = client.recognize(
...     audio=speech.types.RecognitionAudio(
...         uri='gs://my-bucket/recording.flac',
...     ),
...     config=speech.types.RecognitionConfig(
...         encoding='FLAC',
...         language_code='en-US',
...         profanity_filter=True,
...         sample_rate_hertz=44100,
...     ),
... ).results
>>> for result in results:
...     for alternative in result.alternatives:
@@ -137,15 +140,20 @@ words to the vocabulary of the recognizer.
.. code-block:: python

>>> from google.cloud import speech
>>> client = speech.Client()
>>> sample = client.sample(source_uri='gs://my-bucket/recording.flac',
... encoding=speech.Encoding.FLAC,
... sample_rate_hertz=44100)
>>> hints = ['hi', 'good afternoon']
>>> results = sample.recognize(
... language_code='en-US',
... max_alternatives=2,
... speech_contexts=hints,
>>> client = speech.SpeechClient()
>>> results = client.recognize(
...     audio=speech.types.RecognitionAudio(
...         uri='gs://my-bucket/recording.flac',
...     ),
...     config=speech.types.RecognitionConfig(
...         encoding='FLAC',
...         language_code='en-US',
...         sample_rate_hertz=44100,
...         speech_contexts=[speech.types.SpeechContext(
...             phrases=['hi', 'good afternoon'],
...         )],
...     ),
... ).results
>>> for result in results:
...     for alternative in result.alternatives:
@@ -170,18 +178,27 @@ speech data to possible text alternatives on the fly.

.. code-block:: python

>>> import io
>>> from google.cloud import speech
>>> client = speech.Client()
>>> with open('./hello.wav', 'rb') as stream:
... sample = client.sample(stream=stream,
... encoding=speech.Encoding.LINEAR16,
... sample_rate_hertz=16000)
... results = sample.streaming_recognize(language_code='en-US')
... for result in results:
... for alternative in result.alternatives:
... print('=' * 20)
... print('transcript: ' + alternative.transcript)
... print('confidence: ' + str(alternative.confidence))
>>> client = speech.SpeechClient()
>>> config = speech.types.RecognitionConfig(
...     encoding='LINEAR16',
...     language_code='en-US',
...     sample_rate_hertz=44100,
... )
>>> with io.open('./hello.wav', 'rb') as stream:
...     requests = [speech.types.StreamingRecognizeRequest(
...         audio_content=stream.read(),
...     )]
>>> responses = client.streaming_recognize(
...     config=speech.types.StreamingRecognitionConfig(config=config),
...     requests=requests,
... )
>>> for response in responses:
...     for result in response.results:
...         for alternative in result.alternatives:
...             print('=' * 20)
...             print('transcript: ' + alternative.transcript)
...             print('confidence: ' + str(alternative.confidence))
====================
transcript: hello thank you for using Google Cloud platform
confidence: 0.927983105183
@@ -193,20 +210,36 @@ until the client closes the output stream or until the maximum time limit has
been reached.
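
Because recognition continues for as long as audio keeps arriving, it can help to feed the stream in pieces rather than as one large read. The following is a minimal sketch, assuming a hypothetical helper named ``iter_chunks`` and an arbitrary chunk size of 32 KiB; each yielded chunk could then be wrapped in its own ``StreamingRecognizeRequest(audio_content=chunk)``.

```python
import io


def iter_chunks(stream, chunk_size=32 * 1024):
    """Yield successive byte chunks from a binary stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk


# In-memory stand-in for an audio file; the length is arbitrary.
audio = io.BytesIO(b'\x00' * 70000)
chunk_lengths = [len(chunk) for chunk in iter_chunks(audio)]
```

A generator keeps memory use flat regardless of file size, at the cost of the stream having to stay open while requests are being produced.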

If you only want to recognize a single utterance you can set
``single_utterance`` to :data:`True` and only one result will be returned.
``single_utterance`` to :data:`True` and only one result will be returned.

See: `Single Utterance`_

.. code-block:: python

>>> with open('./hello_pause_goodbye.wav', 'rb') as stream:
... sample = client.sample(stream=stream,
... encoding=speech.Encoding.LINEAR16,
... sample_rate_hertz=16000)
... results = sample.streaming_recognize(
... language_code='en-US',
... single_utterance=True,
... )
>>> import io
>>> from google.cloud import speech
>>> client = speech.SpeechClient()
>>> config = speech.types.RecognitionConfig(
...     encoding='LINEAR16',
...     language_code='en-US',
...     sample_rate_hertz=44100,
... )
>>> with io.open('./hello-pause-goodbye.wav', 'rb') as stream:
...     requests = [speech.types.StreamingRecognizeRequest(
...         audio_content=stream.read(),
...     )]
>>> responses = client.streaming_recognize(
...     config=speech.types.StreamingRecognitionConfig(
...         config=config,
...         single_utterance=True,
...     ),
...     requests=requests,
... )
>>> for response in responses:
...     for result in response.results:
...         for alternative in result.alternatives:
...             print('=' * 20)
...             print('transcript: ' + alternative.transcript)
...             print('confidence: ' + str(alternative.confidence))
... for result in results:
... for alternative in result.alternatives:
... print('=' * 20)
@@ -221,22 +254,31 @@ If ``interim_results`` is set to :data:`True`, interim results

.. code-block:: python

>>> import io
>>> from google.cloud import speech
>>> client = speech.Client()
>>> with open('./hello.wav', 'rb') as stream:
... sample = client.sample(stream=stream,
... encoding=speech.Encoding.LINEAR16,
... sample_rate=16000)
... results = sample.streaming_recognize(
... interim_results=True,
... language_code='en-US',
... )
... for result in results:
... for alternative in result.alternatives:
... print('=' * 20)
... print('transcript: ' + alternative.transcript)
... print('confidence: ' + str(alternative.confidence))
... print('is_final:' + str(result.is_final))
>>> client = speech.SpeechClient()
>>> config = speech.types.RecognitionConfig(
...     encoding='LINEAR16',
...     language_code='en-US',
...     sample_rate_hertz=44100,
... )
>>> with io.open('./hello.wav', 'rb') as stream:
...     requests = [speech.types.StreamingRecognizeRequest(
...         audio_content=stream.read(),
...     )]
>>> responses = client.streaming_recognize(
...     config=speech.types.StreamingRecognitionConfig(
...         config=config,
...         interim_results=True,
...     ),
...     requests=requests,
... )
>>> for response in responses:
...     for result in response.results:
...         for alternative in result.alternatives:
...             print('=' * 20)
...             print('transcript: ' + alternative.transcript)
...             print('confidence: ' + str(alternative.confidence))
...             print('is_final:' + str(result.is_final))
====================
'he'
None
@@ -254,3 +296,13 @@ If ``interim_results`` is set to :data:`True`, interim results
.. _Single Utterance: https://cloud.google.com/speech/reference/rpc/google.cloud.speech.v1beta1#streamingrecognitionconfig
.. _sync_recognize: https://cloud.google.com/speech/reference/rest/v1beta1/speech/syncrecognize
.. _Speech Asynchronous Recognize: https://cloud.google.com/speech/reference/rest/v1beta1/speech/asyncrecognize
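
When interim results are enabled, a caller often wants only the transcripts flagged as final. Here is a minimal sketch of that filtering, assuming a hypothetical ``Result`` tuple that models just the ``transcript`` and ``is_final`` fields used in the streaming examples; it is not the library's own result type.

```python
from collections import namedtuple

# Hypothetical stand-in for a streaming result; only two fields are modelled.
Result = namedtuple('Result', ['transcript', 'is_final'])


def final_transcripts(results):
    """Return the transcripts of results flagged as final, in arrival order."""
    return [result.transcript for result in results if result.is_final]


# Made-up sequence of interim results followed by one final result.
stream = [
    Result('he', False),
    Result('hell', False),
    Result('hello', True),
]
finals = final_transcripts(stream)
```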


API Reference
-------------

.. toctree::
:maxdepth: 2

gapic/api
gapic/types
7 changes: 0 additions & 7 deletions docs/speech/operation.rst

This file was deleted.

7 changes: 0 additions & 7 deletions docs/speech/result.rst

This file was deleted.

7 changes: 0 additions & 7 deletions docs/speech/sample.rst

This file was deleted.

2 changes: 1 addition & 1 deletion nox.py
@@ -30,13 +30,13 @@ def docs(session):
# Install Sphinx and also all of the google-cloud-* packages.
session.chdir(os.path.realpath(os.path.dirname(__file__)))
session.install('Sphinx >= 1.6.2', 'sphinx_rtd_theme')
session.install('-e', '.')
session.install(
'core/', 'bigquery/', 'bigtable/', 'datastore/', 'dns/', 'language/',
'logging/', 'error_reporting/', 'monitoring/', 'pubsub/',
'resource_manager/', 'runtimeconfig/', 'spanner/', 'speech/',
'storage/', 'translate/', 'vision/',
)
session.install('-e', '.')

# Build the docs!
session.run('bash', './test_utils/scripts/update_docs.sh')