Added ELECTRA as a thin wrapper around BERT #358
Conversation
StringTransformations.regex_sub((r"\.gamma$", ".weight"), backward=None), | ||
StringTransformations.regex_sub((r"\.beta$", ".bias"), backward=None), | ||
# Prefixes. | ||
StringTransformations.remove_prefix("electra.", reversible=False), |
There are only two notable differences from BERT here: this line, plus embeddings_project.weight and embeddings_project.bias.
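For context, a minimal self-contained sketch of what these remappings do when converting HF ELECTRA parameter names to the names the BERT loader expects (plain `re` here, not the actual StringTransformations API):

```python
import re

def remap_electra_key(name: str) -> str:
    # HF ELECTRA checkpoints use ".gamma"/".beta" for LayerNorm parameters
    # and prefix all encoder parameters with "electra.".
    name = re.sub(r"\.gamma$", ".weight", name)
    name = re.sub(r"\.beta$", ".bias", name)
    if name.startswith("electra."):
        name = name[len("electra."):]
    return name

assert (
    remap_electra_key("electra.encoder.layer.0.output.LayerNorm.gamma")
    == "encoder.layer.0.output.LayerNorm.weight"
)
```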
HF_CONFIG_KEYS: List[Tuple[HFConfigKey, Optional[HFConfigKeyDefault]]] = [
    (CommonHFKeys.ATTENTION_PROBS_DROPOUT_PROB, None),
    (CommonHFKeys.EMBEDDING_SIZE, None),
I had to add the embedding size here, which led to a conflict with the BERT model. This is what led to the thin wrapper class. It is definitely possible to avoid it by implementing if/else logic in _config_from_hf.
This seemed like a reasonable compromise between not duplicating functionality and avoiding coupling, though I could imagine you would want ELECTRA to be either a part of BERT or completely independent.
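A self-contained toy illustrating the trade-off (class and key names are hypothetical, not the actual curated-transformers API): the wrapper inherits everything from the BERT class and only extends the list of HF config keys it converts.

```python
from typing import Any, Dict

class ToyBertEncoder:
    # HF config keys copied over when building the model config.
    HF_CONFIG_KEYS = ["hidden_size", "num_hidden_layers", "num_attention_heads"]

    @classmethod
    def config_from_hf(cls, hf_config: Dict[str, Any]) -> Dict[str, Any]:
        return {key: hf_config[key] for key in cls.HF_CONFIG_KEYS}

class ToyElectraEncoder(ToyBertEncoder):
    # ELECTRA additionally needs the embedding size, since its embedding
    # width can differ from the hidden width (hence embeddings_project.*).
    HF_CONFIG_KEYS = ToyBertEncoder.HF_CONFIG_KEYS + ["embedding_size"]

hf_config = {
    "hidden_size": 256,
    "num_hidden_layers": 12,
    "num_attention_heads": 4,
    "embedding_size": 128,
}
print(ToyElectraEncoder.config_from_hf(hf_config))
```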
[
    "jonfd/electra-small-nordic",
    "Maltehb/aelaectra-danish-electra-small-cased",
    "google/electra-small-discriminator",
],
I checked using a variety of models, but I imagine you might want to replace these with a dummy ELECTRA model.
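If a dummy model is preferred, a tiny ELECTRA checkpoint could be created with the Hugging Face transformers API along these lines (hyperparameters chosen arbitrarily; hub upload omitted):

```python
from transformers import ElectraConfig, ElectraModel

# A deliberately tiny configuration so the test fixture stays small.
config = ElectraConfig(
    vocab_size=1024,
    embedding_size=32,
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=128,
)
ElectraModel(config).save_pretrained("tiny-electra-test")
```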
@pytest.mark.skipif(not has_hf_transformers, reason="requires huggingface transformers")
@pytest.mark.parametrize(
    "model_name",
    ["jonfd/electra-small-nordic", "Maltehb/aelaectra-danish-electra-small-cased", "google/electra-small-discriminator"],
)
def test_from_hf_hub_equals_hf_tokenizer(model_name: str, sample_texts):
    compare_tokenizer_outputs_with_hf_tokenizer(
        sample_texts, model_name, BERTTokenizer
    )
This test is simply to show that the ELECTRA models can use the BERT tokenizer.
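A quick way to sanity-check this outside the test suite is to load an ELECTRA checkpoint's WordPiece vocabulary with a BERT tokenizer via Hugging Face transformers (not the curated-transformers API):

```python
from transformers import BertTokenizerFast

# ELECTRA checkpoints ship a plain WordPiece vocab.txt, so a BERT tokenizer
# can load them directly.
tokenizer = BertTokenizerFast.from_pretrained("google/electra-small-discriminator")
print(tokenizer.tokenize("ELECTRA models reuse the BERT WordPiece tokenizer."))
```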
Awesome, thanks a lot! I hope to have some time over the coming days to review your PR.
Thanks a lot! Added a small comment.
Co-authored-by: Daniël de Kok <[email protected]>
Hi @danieldk, I see that it now causes an error. Sadly I can't see the error on Buildkite, and when I run it locally I can't reproduce it. By the way, I have also added a related PR over on spacy-curated-transformers (which I plan to finish up once this one is through).
The Buildkite CI failure appears to be unrelated to the PR; we'll look into getting it fixed. On a related note, I've now enabled the GitHub Actions CI for this PR, which seems to have unearthed a formatting issue.
@KennethEnevoldsen would you still like to work on this PR? Otherwise, I can also do the last bits to push it over the finish line.
Hi @danieldk, I would love to. I have a deadline this week, but I can come back to it next week (potentially earlier if I find the time).
@danieldk I have fixed the isort issues and formatted using black. I have run the tests locally as well and they pass (or are skipped).
@danieldk I checked the error, but it does not seem to be related to the PR (it is in a tokenizer test on rotary embeddings). It seems like there is an additional argument seq_len.
update from main
Thanks a lot, looks good! I'll make a small test model in a bit to replace the tests.
Description
Added support for loading ELECTRA models from the HF Hub using a thin wrapper around BERTEncoder (required because a few keys need to be mapped differently), re-using the same config.
The tests are a bit extensive (loading three different models); I imagine you might want to create your own dummy ELECTRA models for testing.
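Assuming the wrapper follows the existing from_hf_hub convention, usage might look roughly like this (the ElectraEncoder name and import path are assumptions, not confirmed by this PR):

```python
# Hypothetical usage sketch; the class name and import path are assumptions.
from curated_transformers.models import ElectraEncoder

encoder = ElectraEncoder.from_hf_hub(name="google/electra-small-discriminator")
```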
Checklist