Merged
Changes from 2 commits
2 changes: 2 additions & 0 deletions docs/hub/_toctree.yml
@@ -77,6 +77,8 @@
title: Sentence Transformers
- local: spacy
title: spaCy
- local: span_marker
title: SpanMarker
- local: speechbrain
title: SpeechBrain
- local: stable-baselines3
1 change: 1 addition & 0 deletions docs/hub/models-libraries.md
@@ -26,6 +26,7 @@ The table below summarizes the supported libraries and their level of integratio
| [Sample Factory](https://github.com/alex-petrenko/sample-factory) | Codebase for high throughput asynchronous reinforcement learning. | ❌ | ✅ | ✅ | ✅ |
| [Sentence Transformers](https://github.com/UKPLab/sentence-transformers) | Compute dense vector representations for sentences, paragraphs, and images. | ✅ | ✅ | ✅ | ✅ |
| [spaCy](https://github.com/explosion/spaCy) | Advanced Natural Language Processing in Python and Cython. | ✅ | ✅ | ✅ | ✅ |
| [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) | Familiar, simple and state-of-the-art Named Entity Recognition. | ✅ | ✅ | ✅ | ✅ |
| [Scikit Learn (using skops)](https://skops.readthedocs.io/en/stable/) | Machine Learning in Python. | ✅ | ✅ | ✅ | ✅ |
| [Speechbrain](https://speechbrain.github.io/) | A PyTorch Powered Speech Toolkit. | ✅ | ✅ | ✅ | ❌ |
| [Stable-Baselines3](https://github.com/DLR-RM/stable-baselines3) | Set of reliable implementations of deep reinforcement learning algorithms in PyTorch | ❌ | ✅ | ✅ | ✅ |
63 changes: 63 additions & 0 deletions docs/hub/span_marker.md
@@ -0,0 +1,63 @@
# Using SpanMarker at Hugging Face

SpanMarker is a framework for training powerful Named Entity Recognition models using familiar encoders such as BERT, RoBERTa and DeBERTa. It is implemented tightly on top of the 🤗 Transformers library, so it can reuse much of that library's functionality.

## Exploring SpanMarker in the Hub

You can find `span_marker` models by filtering by library on the left side of the [models page](https://huggingface.co/models?library=span_marker).

All models on the Hub come with these useful features:
1. An automatically generated model card with a brief description.
2. An interactive widget you can use to play with the model directly in the browser.
3. An Inference API that allows you to make inference requests.
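
As a rough illustration of the third point, the sketch below builds (but does not send) an inference request using only the Python standard library. The endpoint pattern, the `{"inputs": ...}` payload shape, and the helper name `build_inference_request` are assumptions made for this example and are not taken from this page; check the Inference API documentation for the current details.

```py
import json
import urllib.request

# Assumed endpoint pattern for the hosted Inference API; verify before use.
API_URL_TEMPLATE = "https://api-inference.huggingface.co/models/{model_id}"


def build_inference_request(model_id: str, text: str, token: str) -> urllib.request.Request:
    """Build a POST request for a single input text (hypothetical helper)."""
    # Assumed payload shape: a JSON object with the text under "inputs".
    payload = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL_TEMPLATE.format(model_id=model_id),
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_inference_request(
    "tomaarsen/span-marker-bert-base-fewnerd-fine-super",
    "Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris.",
    "hf_xxx",  # placeholder token, replace with your own
)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the example runnable offline and makes the URL and payload easy to inspect.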

## Installation

To get started, you can follow the [SpanMarker installation guide](https://tomaarsen.github.io/SpanMarkerNER/install.html). You can also use the following one-line install through pip:

```
pip install -U span_marker
```

## Using existing models

All `span_marker` models can easily be loaded from the Hub.

```py
from span_marker import SpanMarkerModel

model = SpanMarkerModel.from_pretrained("tomaarsen/span-marker-bert-base-fewnerd-fine-super")
```

Once loaded, you can use [`SpanMarkerModel.predict`](https://tomaarsen.github.io/SpanMarkerNER/api/span_marker.modeling.html#span_marker.modeling.SpanMarkerModel.predict) to perform inference.

```py
model.predict("Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris.")
```
```json
[
{"span": "Amelia Earhart", "label": "person-other", "score": 0.7629689574241638, "char_start_index": 0, "char_end_index": 14},
{"span": "Lockheed Vega 5B", "label": "product-airplane", "score": 0.9833564758300781, "char_start_index": 38, "char_end_index": 54},
{"span": "Atlantic", "label": "location-bodiesofwater", "score": 0.7621214389801025, "char_start_index": 66, "char_end_index": 74},
{"span": "Paris", "label": "location-GPE", "score": 0.9807717204093933, "char_start_index": 78, "char_end_index": 83}
]
```
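
The output above is a list of dictionaries, so it can be post-processed with plain Python. The following sketch uses the sample output copied from this example rather than calling the model; the helper `confident_entities` and the `0.9` threshold are hypothetical choices made for illustration. It also checks that `char_start_index` and `char_end_index` slice the input text back to each predicted span.

```py
# Input text and predictions copied from the example output above.
text = "Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris."
predictions = [
    {"span": "Amelia Earhart", "label": "person-other", "score": 0.7629689574241638, "char_start_index": 0, "char_end_index": 14},
    {"span": "Lockheed Vega 5B", "label": "product-airplane", "score": 0.9833564758300781, "char_start_index": 38, "char_end_index": 54},
    {"span": "Atlantic", "label": "location-bodiesofwater", "score": 0.7621214389801025, "char_start_index": 66, "char_end_index": 74},
    {"span": "Paris", "label": "location-GPE", "score": 0.9807717204093933, "char_start_index": 78, "char_end_index": 83},
]


def confident_entities(predictions, threshold=0.9):
    """Return (span, label) pairs whose score is at least `threshold` (hypothetical helper)."""
    return [(p["span"], p["label"]) for p in predictions if p["score"] >= threshold]


# The character offsets index into the original text (end index is exclusive).
for p in predictions:
    assert text[p["char_start_index"]:p["char_end_index"]] == p["span"]

print(confident_entities(predictions))
# → [('Lockheed Vega 5B', 'product-airplane'), ('Paris', 'location-GPE')]
```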

If you want to load a specific SpanMarker model, you can click `Use in SpanMarker` and you will be given a working snippet!

<!--
TODO: Add this, but then with SpanMarker
<div class="flex justify-center">
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/libraries-speechbrain_snippet1.png"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/libraries-speechbrain_snippet1-dark.png"/>
</div>
<div class="flex justify-center">
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/libraries-speechbrain_snippet2.png"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/libraries-speechbrain_snippet2-dark.png"/>
</div>
-->

## Additional resources

* SpanMarker [repository](https://github.com/tomaarsen/SpanMarkerNER).
* SpanMarker [docs](https://tomaarsen.github.io/SpanMarkerNER).
12 changes: 12 additions & 0 deletions js/src/lib/interfaces/Libraries.ts
@@ -23,6 +23,7 @@ export enum ModelLibrary {
"sentence-transformers" = "Sentence Transformers",
"sklearn" = "Scikit-learn",
"spacy" = "spaCy",
"span_marker" = "SpanMarker",
"speechbrain" = "speechbrain",
"tensorflowtts" = "TensorFlowTTS",
"timm" = "Timm",
@@ -314,6 +315,11 @@ nlp = spacy.load("${nameWithoutNamespace(model.id)}")
import ${nameWithoutNamespace(model.id)}
nlp = ${nameWithoutNamespace(model.id)}.load()`;

const span_marker = (model: ModelData) =>
`from span_marker import SpanMarkerModel

model = SpanMarkerModel.from_pretrained("${model.id}")`;

const stanza = (model: ModelData) =>
`import stanza

@@ -528,6 +534,12 @@ export const MODEL_LIBRARIES_UI_ELEMENTS: Partial<Record<ModelLibraryKey, Librar
repoUrl: "https://github.com/explosion/spaCy",
snippet: spacy,
},
"span_marker": {
btnLabel: "SpanMarker",
repoName: "SpanMarkerNER",
repoUrl: "https://github.com/tomaarsen/SpanMarkerNER",
snippet: span_marker,
},
"speechbrain": {
btnLabel: "speechbrain",
repoName: "speechbrain",
3 changes: 3 additions & 0 deletions js/src/lib/interfaces/LibrariesToTasks.ts
@@ -77,6 +77,9 @@ export const LIBRARY_TASK_MAPPING_EXCLUDING_TRANSFORMERS: Partial<Record<ModelLi
"text-classification",
"sentence-similarity",
],
"span_marker": [
"token-classification",
],
"speechbrain": [
"audio-classification",
"audio-to-audio",
2 changes: 1 addition & 1 deletion tasks/src/const.ts
@@ -40,7 +40,7 @@ export const TASKS_MODEL_LIBRARIES: Record<PipelineType, ModelLibraryKey[]> = {
"text-to-video": [],
"text2text-generation": ["transformers"],
"time-series-forecasting": [],
- "token-classification": ["adapter-transformers", "flair", "spacy", "stanza", "transformers"],
+ "token-classification": ["adapter-transformers", "flair", "spacy", "span_marker", "stanza", "transformers"],
"translation": ["transformers"],
"unconditional-image-generation": [],
"visual-question-answering": [],