diff --git a/docs/source/en/main_classes/pipelines.mdx b/docs/source/en/main_classes/pipelines.mdx
index daed2f42dc17..ecb8891bf6b3 100644
--- a/docs/source/en/main_classes/pipelines.mdx
+++ b/docs/source/en/main_classes/pipelines.mdx
@@ -20,31 +20,7 @@ Recognition, Masked Language Modeling, Sentiment Analysis, Feature Extraction an
 There are two categories of pipeline abstractions to be aware about:
 
 - The [`pipeline`] which is the most powerful object encapsulating all other pipelines.
-- The other task-specific pipelines:
-
-    - [`AudioClassificationPipeline`]
-    - [`AutomaticSpeechRecognitionPipeline`]
-    - [`ConversationalPipeline`]
-    - [`DepthEstimationPipeline`]
-    - [`DocumentQuestionAnsweringPipeline`]
-    - [`FeatureExtractionPipeline`]
-    - [`FillMaskPipeline`]
-    - [`ImageClassificationPipeline`]
-    - [`ImageSegmentationPipeline`]
-    - [`ImageToTextPipeline`]
-    - [`ObjectDetectionPipeline`]
-    - [`QuestionAnsweringPipeline`]
-    - [`SummarizationPipeline`]
-    - [`TableQuestionAnsweringPipeline`]
-    - [`TextClassificationPipeline`]
-    - [`TextGenerationPipeline`]
-    - [`Text2TextGenerationPipeline`]
-    - [`TokenClassificationPipeline`]
-    - [`TranslationPipeline`]
-    - [`VisualQuestionAnsweringPipeline`]
-    - [`ZeroShotClassificationPipeline`]
-    - [`ZeroShotImageClassificationPipeline`]
-    - [`ZeroShotObjectDetectionPipeline`]
+- Task-specific pipelines are available for [audio](#audio), [computer vision](#computer-vision), [natural language processing](#natural-language-processing), and [multimodal](#multimodal) tasks.
 
 ## The pipeline abstraction
 
@@ -322,8 +298,9 @@ That should enable you to do all the custom code you want.
 
 [Implementing a new pipeline](../add_new_pipeline)
 
-## The task specific pipelines
+## Audio
 
+Pipelines available for audio tasks include the following.
 
 ### AudioClassificationPipeline
 
@@ -337,51 +314,60 @@ That should enable you to do all the custom code you want.
     - __call__
     - all
 
-### ConversationalPipeline
+## Computer vision
 
-[[autodoc]] Conversation
+Pipelines available for computer vision tasks include the following.
 
-[[autodoc]] ConversationalPipeline
+### DepthEstimationPipeline
+[[autodoc]] DepthEstimationPipeline
     - __call__
     - all
 
-### DepthEstimationPipeline
-[[autodoc]] DepthEstimationPipeline
+### ImageClassificationPipeline
+
+[[autodoc]] ImageClassificationPipeline
     - __call__
-    - all
+    - all
 
-### DocumentQuestionAnsweringPipeline
+### ImageSegmentationPipeline
 
-[[autodoc]] DocumentQuestionAnsweringPipeline
+[[autodoc]] ImageSegmentationPipeline
     - __call__
     - all
 
-### FeatureExtractionPipeline
-[[autodoc]] FeatureExtractionPipeline
+### ObjectDetectionPipeline
+
+[[autodoc]] ObjectDetectionPipeline
     - __call__
     - all
 
-### FillMaskPipeline
+### ZeroShotImageClassificationPipeline
 
-[[autodoc]] FillMaskPipeline
+[[autodoc]] ZeroShotImageClassificationPipeline
     - __call__
     - all
 
-### ImageClassificationPipeline
+### ZeroShotObjectDetectionPipeline
 
-[[autodoc]] ImageClassificationPipeline
+[[autodoc]] ZeroShotObjectDetectionPipeline
     - __call__
     - all
 
-### ImageSegmentationPipeline
+## Natural Language Processing
 
-[[autodoc]] ImageSegmentationPipeline
+Pipelines available for natural language processing tasks include the following.
+
+### ConversationalPipeline
+
+[[autodoc]] Conversation
+
+[[autodoc]] ConversationalPipeline
     - __call__
     - all
 
-### ImageToTextPipeline
+### FillMaskPipeline
 
-[[autodoc]] ImageToTextPipeline
+[[autodoc]] FillMaskPipeline
     - __call__
     - all
 
@@ -391,12 +377,6 @@ That should enable you to do all the custom code you want.
 
 See [`TokenClassificationPipeline`] for all details.
 
-### ObjectDetectionPipeline
-
-[[autodoc]] ObjectDetectionPipeline
-    - __call__
-    - all
-
 ### QuestionAnsweringPipeline
 
 [[autodoc]] QuestionAnsweringPipeline
@@ -444,27 +424,37 @@ See [`TokenClassificationPipeline`] for all details.
     - __call__
     - all
 
-### VisualQuestionAnsweringPipeline
+### ZeroShotClassificationPipeline
 
-[[autodoc]] VisualQuestionAnsweringPipeline
+[[autodoc]] ZeroShotClassificationPipeline
     - __call__
     - all
 
-### ZeroShotClassificationPipeline
+## Multimodal
 
-[[autodoc]] ZeroShotClassificationPipeline
+Pipelines available for multimodal tasks include the following.
+
+### DocumentQuestionAnsweringPipeline
+
+[[autodoc]] DocumentQuestionAnsweringPipeline
     - __call__
     - all
 
-### ZeroShotImageClassificationPipeline
+### FeatureExtractionPipeline
 
-[[autodoc]] ZeroShotImageClassificationPipeline
+[[autodoc]] FeatureExtractionPipeline
     - __call__
     - all
 
-### ZeroShotObjectDetectionPipeline
+### ImageToTextPipeline
 
-[[autodoc]] ZeroShotObjectDetectionPipeline
+[[autodoc]] ImageToTextPipeline
+    - __call__
+    - all
+
+### VisualQuestionAnsweringPipeline
+
+[[autodoc]] VisualQuestionAnsweringPipeline
     - __call__
     - all
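
Since this patch mostly moves `###` sections under new `##` category headings, one way a reviewer might sanity-check it is to compare the set of `###` headings on removed lines with the set on added lines; if the two sets match, no section was dropped or invented by the reordering. The following is a minimal sketch under that assumption. The names `heading_changes` and `sample` are hypothetical helpers for illustration, not part of `transformers` or of this patch, and `sample` is an abbreviated excerpt in the shape of the lines in this patch, not the full diff.

```python
import re

# Matches a removed or added level-3 heading line in a unified diff,
# e.g. "-### FillMaskPipeline" or "+### FillMaskPipeline".
# File header lines ("--- a/...", "+++ b/...") and "##" headings do not match.
HEADING = re.compile(r"^([+-])###\s+(\S+)\s*$")


def heading_changes(diff_text):
    """Return (removed, added) sets of `###` heading names found in a unified diff."""
    removed, added = set(), set()
    for line in diff_text.splitlines():
        match = HEADING.match(line)
        if match is None:
            continue
        sign, name = match.groups()
        (removed if sign == "-" else added).add(name)
    return removed, added


# Hypothetical, abbreviated excerpt (assumption: real input would be the full patch text).
sample = """\
-### ConversationalPipeline
+## Computer vision
+### DepthEstimationPipeline
-### DepthEstimationPipeline
+### ConversationalPipeline
"""

removed, added = heading_changes(sample)
# Headings that were only removed, then headings that were only added.
print(sorted(removed - added), sorted(added - removed))  # prints [] []
```

Two empty lists mean every removed section heading reappears as an added one. A non-empty first list would point at a section the reordering lost.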