js/src/lib/interfaces/Types.ts: 74 changes (38 additions & 36 deletions)
@@ -1,3 +1,5 @@
+import type { ModelLibrary } from "./Libraries";
Member: i would maybe put LIBRARY_TASK_MAPPING_EXCLUDING_TRANSFORMERS into a new file to prevent the circular imports between Libraries.ts and Types.ts (even if it's just types)

Member: like for the already existing TASKS_MODEL_LIBRARIES which is very similar, BTW (cc @xianbaoqian)

Member: (we can change that in the future though – no need to do it in this PR)
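
A possible shape for the split suggested in this thread, as a minimal sketch only: the file name LibraryToTasks.ts is hypothetical, and it assumes ModelLibrary stays in Libraries.ts and PipelineType stays in Types.ts. None of this is part of the present PR.

```ts
// LibraryToTasks.ts: hypothetical new module (name is illustrative only).
// Because both imports are type-only, no runtime import cycle is created,
// and Libraries.ts / Types.ts no longer need to reference each other.
import type { ModelLibrary } from "./Libraries";
import type { PipelineType } from "./Types";

export const LIBRARY_TASK_MAPPING_EXCLUDING_TRANSFORMERS: Partial<
	Record<keyof typeof ModelLibrary, PipelineType[]>
> = {
	"adapter-transformers": ["question-answering", "text-classification", "token-classification"],
	"allennlp": ["question-answering"],
	// ...remaining entries unchanged from the mapping in the diff below
};
```

Types.ts could then re-export the constant to keep existing import paths working, if that matters.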


 // Warning: order of modalities here determine how they are listed on the /tasks page
 export const MODALITIES = [
 	"cv",
@@ -758,88 +760,88 @@ export interface TransformersInfo {
  * This mapping is generated automatically by "python-api-export-tasks" action in huggingface/api-inference-community repo upon merge.
  * Ref: https://github.com/huggingface/api-inference-community/pull/158
  */
-export const LIBRARY_TASK_MAPPING_EXCLUDING_TRANSFORMERS: Record<string, Array<string>> = {
-	"adapter_transformers": [
+export const LIBRARY_TASK_MAPPING_EXCLUDING_TRANSFORMERS: Partial<Record<keyof typeof ModelLibrary, PipelineType[]>> = {
+	"adapter-transformers": [
 		"question-answering",
 		"text-classification",
-		"token-classification"
+		"token-classification",
 	],
 	"allennlp": [
-		"question-answering"
+		"question-answering",
 	],
 	"asteroid": [
-		"audio-source-separation",
-		"audio-to-audio"
+		// "audio-source-separation",
Member: (not a valid pipeline tag)

"audio-to-audio",
],
"diffusers": [
"text-to-image"
],
"doctr": [
"object-detection"
"text-to-image",
],
// "doctr": [
// "object-detection",
// ],
Contributor (PR author): Commented out because it's absent from ModelLibrary

Contributor (PR author): Same for superb & k2_sherpa

Member: Should it be handled from the source huggingface/api-inference-community? (see huggingface/api-inference-community#158)

Contributor: @osanseviero Any thoughts on how to handle the discrepancy? Is the inference API working for doctr, superb & k2_sherpa? Or do we need to add these to ModelLibrary?

I didn't find anything related to k2_sherpa on the Hub. I had a quick glance at a few superb models and they seem to be using transformers for the inference API (?). Most doctr repos are dummy models and the inference API is unable to decide the pipeline_tag.

Contributor:
> Is the inference API working for doctr, superb & k2_sherpa? Or do we need to add these to ModelLibrary?

  • Superb repos use the Community Inference API, but given they are not a library and their usage is more for a competition, they were not added to Libraries.ts (confirmed with @lewtun). From a discussion, we think https://huggingface.co/models?library=superb&sort=modified are mostly old legacy models and it's ok to remove them from the Inference API, as the official ones now use transformers. (The old integration allowed users to specify their own requirements and inference code, but that was deprecated.)
  • I don't have context on the k2-sherpa integration, so maybe it makes sense to remove it entirely from the community API or exclude it from the exporter.
  • https://huggingface.co/models?library=doctr&sort=modified is just one model, so I'm fine removing doctr.

Member: for me this PR is ready to merge

Contributor: Thanks! I think adding the two definitely makes sense.

Another issue is that people have been using k2 in HuggingFace tags and k2_sherpa in api-inference-community. I think the easiest fix is to rename k2_sherpa to k2. I'll propose a PR in api-inference-community!

Member: and yes renaming k2_sherpa to k2 sounds perfect

Contributor: 64e8088 was exactly what I had in mind when we discussed, @xianbaoqian. :) Thanks for doing it @julien-c!

"espnet": [
"text-to-speech",
"automatic-speech-recognition"
"automatic-speech-recognition",
],
"fairseq": [
"text-to-speech",
"audio-to-audio"
"audio-to-audio",
],
"fastai": [
"image-classification"
"image-classification",
],
"fasttext": [
"feature-extraction",
"text-classification"
"text-classification",
],
"flair": [
"token-classification"
],
"k2_sherpa": [
"automatic-speech-recognition"
"token-classification",
],
// "k2_sherpa": [
// "automatic-speech-recognition",
// ],
"keras": [
"image-classification"
"image-classification",
],
"nemo": [
"automatic-speech-recognition"
"automatic-speech-recognition",
],
"paddlenlp": [
"conversational",
"fill-mask"
"fill-mask",
],
"pyannote_audio": [
"automatic-speech-recognition"
"pyannote-audio": [
"automatic-speech-recognition",
],
"sentence_transformers": [
"sentence-transformers": [
"feature-extraction",
"sentence-similarity"
"sentence-similarity",
],
"sklearn": [
"tabular-classification",
"tabular-regression",
"text-classification"
"text-classification",
],
"spacy": [
"token-classification",
"text-classification",
"sentence-similarity"
"sentence-similarity",
],
"speechbrain": [
"audio-classification",
"audio-to-audio",
"automatic-speech-recognition",
"text-to-speech",
"text2text-generation"
"text2text-generation",
],
"stanza": [
"token-classification"
],
"superb": [
"automatic-speech-recognition",
"speech-segmentation"
"token-classification",
],
// "superb": [
// "automatic-speech-recognition",
// "speech-segmentation",
// ],
"timm": [
"image-classification"
]
}
"image-classification",
],
};
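
For readers less familiar with the TypeScript involved, the sketch below illustrates what the new `Partial<Record<keyof typeof ModelLibrary, PipelineType[]>>` annotation enforces compared with the old `Record<string, Array<string>>`. It uses deliberately simplified stand-ins for ModelLibrary and PipelineType; the real definitions in Libraries.ts and Types.ts contain many more members.

```ts
// Simplified stand-ins (NOT the real definitions from Libraries.ts / Types.ts).
enum ModelLibrary {
	"adapter-transformers" = "Adapter Transformers",
	"allennlp" = "allenNLP",
	"timm" = "Timm",
}
type PipelineType = "question-answering" | "text-classification" | "image-classification";

const MAPPING: Partial<Record<keyof typeof ModelLibrary, PipelineType[]>> = {
	"adapter-transformers": ["question-answering", "text-classification"], // OK: known library, valid tasks
	"timm": ["image-classification"], // OK
	// "doctr": ["image-classification"],
	//   ^ would be a compile error: "doctr" is not a key of ModelLibrary,
	//     which is why doctr, superb and k2_sherpa are commented out in this PR.
	// "allennlp": ["speech-segmentation"],
	//   ^ would be a compile error: "speech-segmentation" is not a PipelineType in this sketch.
};
```

The old `Record<string, Array<string>>` annotation would have accepted both commented-out entries, and any typo in a key or task name, without complaint.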