Merged
2 changes: 2 additions & 0 deletions docs/hub/_toctree.yml
@@ -77,6 +77,8 @@
title: Sentence Transformers
- local: spacy
title: spaCy
- local: span_marker
title: SpanMarker
- local: speechbrain
title: SpeechBrain
- local: stable-baselines3
1 change: 1 addition & 0 deletions docs/hub/models-libraries.md
@@ -26,6 +26,7 @@ The table below summarizes the supported libraries and their level of integratio
| [Sample Factory](https://github.com/alex-petrenko/sample-factory) | Codebase for high throughput asynchronous reinforcement learning. | ❌ | ✅ | ✅ | ✅ |
| [Sentence Transformers](https://github.com/UKPLab/sentence-transformers) | Compute dense vector representations for sentences, paragraphs, and images. | ✅ | ✅ | ✅ | ✅ |
| [spaCy](https://github.com/explosion/spaCy) | Advanced Natural Language Processing in Python and Cython. | ✅ | ✅ | ✅ | ✅ |
| [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) | Familiar, simple and state-of-the-art Named Entity Recognition. | ✅ | ✅ | ✅ | ✅ |
| [Scikit Learn (using skops)](https://skops.readthedocs.io/en/stable/) | Machine Learning in Python. | ✅ | ✅ | ✅ | ✅ |
| [Speechbrain](https://speechbrain.github.io/) | A PyTorch Powered Speech Toolkit. | ✅ | ✅ | ✅ | ❌ |
| [Stable-Baselines3](https://github.com/DLR-RM/stable-baselines3) | Set of reliable implementations of deep reinforcement learning algorithms in PyTorch | ❌ | ✅ | ✅ | ✅ |
63 changes: 63 additions & 0 deletions docs/hub/span_marker.md
@@ -0,0 +1,63 @@
# Using SpanMarker at Hugging Face

[SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) is a framework for training powerful Named Entity Recognition models using familiar encoders such as BERT, RoBERTa and DeBERTa. It is implemented on top of the 🤗 Transformers library and builds directly on its functionality, so SpanMarker will be intuitive to use for anyone familiar with Transformers.

## Exploring SpanMarker in the Hub

You can find `span_marker` models by using the library filter on the left of the [models page](https://huggingface.co/models?library=span_marker).

All models on the Hub come with these useful features:
1. An automatically generated model card with a brief description.
2. An interactive widget you can use to play with the model directly in the browser.
3. An Inference API that allows you to make inference requests.

## Installation

To get started, you can follow the [SpanMarker installation guide](https://tomaarsen.github.io/SpanMarkerNER/install.html). You can also use the following one-line install through pip:

```bash
pip install -U span_marker
```

## Using existing models

All `span_marker` models can easily be loaded from the Hub.

```py
from span_marker import SpanMarkerModel

model = SpanMarkerModel.from_pretrained("tomaarsen/span-marker-bert-base-fewnerd-fine-super")
```

Once loaded, you can use [`SpanMarkerModel.predict`](https://tomaarsen.github.io/SpanMarkerNER/api/span_marker.modeling.html#span_marker.modeling.SpanMarkerModel.predict) to perform inference.

```py
model.predict("Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris.")
```
```json
[
{"span": "Amelia Earhart", "label": "person-other", "score": 0.7629689574241638, "char_start_index": 0, "char_end_index": 14},
{"span": "Lockheed Vega 5B", "label": "product-airplane", "score": 0.9833564758300781, "char_start_index": 38, "char_end_index": 54},
{"span": "Atlantic", "label": "location-bodiesofwater", "score": 0.7621214389801025, "char_start_index": 66, "char_end_index": 74},
{"span": "Paris", "label": "location-GPE", "score": 0.9807717204093933, "char_start_index": 78, "char_end_index": 83}
]
```
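
Each prediction is a plain dictionary, so it can be post-processed with ordinary Python. The sketch below reuses the output shown above to check that the character offsets index into the input text (with an exclusive end index) and to keep only high-confidence entities. The helper `confident_entities` and its `0.9` threshold are hypothetical names chosen for illustration; they are not part of the `span_marker` API.

```python
text = "Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris."

# Output of model.predict(text), copied from the example above.
entities = [
    {"span": "Amelia Earhart", "label": "person-other", "score": 0.7629689574241638, "char_start_index": 0, "char_end_index": 14},
    {"span": "Lockheed Vega 5B", "label": "product-airplane", "score": 0.9833564758300781, "char_start_index": 38, "char_end_index": 54},
    {"span": "Atlantic", "label": "location-bodiesofwater", "score": 0.7621214389801025, "char_start_index": 66, "char_end_index": 74},
    {"span": "Paris", "label": "location-GPE", "score": 0.9807717204093933, "char_start_index": 78, "char_end_index": 83},
]


def confident_entities(entities, threshold=0.9):
    """Hypothetical helper: keep predictions whose score is at least `threshold`."""
    return [entity for entity in entities if entity["score"] >= threshold]


# Slicing the input with the reported offsets recovers each span.
for entity in entities:
    start, end = entity["char_start_index"], entity["char_end_index"]
    assert text[start:end] == entity["span"]

# Print only the entities the model is most sure about.
for entity in confident_entities(entities):
    print(f'{entity["span"]} -> {entity["label"]} ({entity["score"]:.2f})')
```

A suitable threshold depends on the model and the data, so treat the value here as a placeholder to tune.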

If you want to load a specific SpanMarker model, you can click `Use in SpanMarker` and you will be given a working snippet!

<!--
TODO: Add this, but then with SpanMarker
<div class="flex justify-center">
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/libraries-speechbrain_snippet1.png"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/libraries-speechbrain_snippet1-dark.png"/>
</div>
<div class="flex justify-center">
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/libraries-speechbrain_snippet2.png"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/libraries-speechbrain_snippet2-dark.png"/>
</div>
-->

## Additional resources

* SpanMarker [repository](https://github.com/tomaarsen/SpanMarkerNER)
* SpanMarker [docs](https://tomaarsen.github.io/SpanMarkerNER)
12 changes: 12 additions & 0 deletions js/src/lib/interfaces/Libraries.ts
@@ -23,6 +23,7 @@ export enum ModelLibrary {
"sentence-transformers" = "Sentence Transformers",
"sklearn" = "Scikit-learn",
"spacy" = "spaCy",
"span-marker" = "SpanMarker",
"speechbrain" = "speechbrain",
"tensorflowtts" = "TensorFlowTTS",
"timm" = "Timm",
@@ -314,6 +315,11 @@ nlp = spacy.load("${nameWithoutNamespace(model.id)}")
import ${nameWithoutNamespace(model.id)}
nlp = ${nameWithoutNamespace(model.id)}.load()`;

const span_marker = (model: ModelData) =>
`from span_marker import SpanMarkerModel

model = SpanMarkerModel.from_pretrained("${model.id}")`;

const stanza = (model: ModelData) =>
`import stanza

@@ -528,6 +534,12 @@ export const MODEL_LIBRARIES_UI_ELEMENTS: Partial<Record<ModelLibraryKey, Librar
repoUrl: "https://github.com/explosion/spaCy",
snippet: spacy,
},
"span-marker": {
btnLabel: "SpanMarker",
repoName: "SpanMarkerNER",
repoUrl: "https://github.com/tomaarsen/SpanMarkerNER",
snippet: span_marker,
},
"speechbrain": {
btnLabel: "speechbrain",
repoName: "speechbrain",
3 changes: 3 additions & 0 deletions js/src/lib/interfaces/LibrariesToTasks.ts
@@ -82,6 +82,9 @@ export const LIBRARY_TASK_MAPPING_EXCLUDING_TRANSFORMERS: Partial<Record<ModelLi
"text-classification",
"sentence-similarity",
],
"span-marker": [
"token-classification",
],
"speechbrain": [
"audio-classification",
"audio-to-audio",
2 changes: 1 addition & 1 deletion tasks/src/const.ts
@@ -40,7 +40,7 @@ export const TASKS_MODEL_LIBRARIES: Record<PipelineType, ModelLibraryKey[]> = {
"text-to-video": [],
"text2text-generation": ["transformers"],
"time-series-forecasting": [],
- "token-classification": ["adapter-transformers", "flair", "spacy", "stanza", "transformers"],
+ "token-classification": ["adapter-transformers", "flair", "spacy", "span-marker", "stanza", "transformers"],
"translation": ["transformers"],
"unconditional-image-generation": [],
"visual-question-answering": [],