src/transformers/utils/doc.py: 63 additions & 0 deletions

@@ -794,6 +794,67 @@ def _prepare_output_docstrings(output_type, config_class, min_indent=None):
```
"""

TF_SPEECH_BASE_MODEL_SAMPLE = r"""
Example:

```python
>>> from transformers import {processor_class}, {model_class}
>>> from datasets import load_dataset

>>> dataset = load_dataset("hf-internal-testing/librispeech_asr_demo", "clean", split="validation")
>>> dataset = dataset.sort("id")
>>> sampling_rate = dataset.features["audio"].sampling_rate

>>> processor = {processor_class}.from_pretrained("{checkpoint}")
>>> model = {model_class}.from_pretrained("{checkpoint}")

>>> # audio file is decoded on the fly
>>> inputs = processor(dataset[0]["audio"]["array"], sampling_rate=sampling_rate, return_tensors="tf")
>>> outputs = model(**inputs)

>>> last_hidden_states = outputs.last_hidden_state
>>> list(last_hidden_states.shape)
{expected_output}
```
"""

TF_SPEECH_CTC_SAMPLE = r"""
Example:

```python
>>> from transformers import {processor_class}, {model_class}
>>> from datasets import load_dataset
>>> import tensorflow as tf

>>> dataset = load_dataset("hf-internal-testing/librispeech_asr_demo", "clean", split="validation")
>>> dataset = dataset.sort("id")
>>> sampling_rate = dataset.features["audio"].sampling_rate

>>> processor = {processor_class}.from_pretrained("{checkpoint}")
>>> model = {model_class}.from_pretrained("{checkpoint}")

>>> # audio file is decoded on the fly
>>> inputs = processor(dataset[0]["audio"]["array"], sampling_rate=sampling_rate, return_tensors="tf")
>>> logits = model(**inputs).logits
>>> predicted_ids = tf.math.argmax(logits, axis=-1)

>>> # transcribe speech
>>> transcription = processor.batch_decode(predicted_ids)
>>> transcription[0]
{expected_output}
```

```python
>>> with processor.as_target_processor():
... inputs["labels"] = processor(dataset[0]["text"], return_tensors="tf").input_ids

>>> # compute loss
>>> loss = model(**inputs).loss
>>> round(float(loss), 2)
{expected_loss}
```
"""

TF_VISION_BASE_MODEL_SAMPLE = r"""
Example:

@@ -848,6 +909,8 @@ def _prepare_output_docstrings(output_type, config_class, min_indent=None):
"MaskedLM": TF_MASKED_LM_SAMPLE,
"LMHead": TF_CAUSAL_LM_SAMPLE,
"BaseModel": TF_BASE_MODEL_SAMPLE,
"SpeechBaseModel": TF_SPEECH_BASE_MODEL_SAMPLE,
"CTC": TF_SPEECH_CTC_SAMPLE,
"VisionBaseModel": TF_VISION_BASE_MODEL_SAMPLE,
"ImageClassification": TF_VISION_SEQ_CLASS_SAMPLE,
}
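
Context for reviewers (not part of the diff): these templates are plain Python format strings, and the code-sample docstring decorator in this module fills the `{processor_class}`, `{model_class}`, `{checkpoint}`, `{expected_output}` and `{expected_loss}` placeholders when it renders a model's docstring. Below is a minimal sketch of how the new CTC template could be rendered with `str.format`; the processor, model class, checkpoint and expected values are illustrative assumptions, not values taken from this PR.

```python
# Minimal sketch (not part of the PR): render TF_SPEECH_CTC_SAMPLE via
# str.format, the same kind of substitution the docstring decorator performs.
# All values below are illustrative assumptions, not taken from this diff.
from transformers.utils.doc import TF_SPEECH_CTC_SAMPLE

filled_sample = TF_SPEECH_CTC_SAMPLE.format(
    processor_class="Wav2Vec2Processor",
    model_class="TFWav2Vec2ForCTC",
    checkpoint="facebook/wav2vec2-base-960h",
    expected_output="'MISTER QUILTER IS THE APOSTLE OF THE MIDDLE CLASSES'",
    expected_loss="10.0",
)
print(filled_sample)  # the filled-in example that ends up in the model docstring
```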