Add model_cards #7969
Merged
@@ -0,0 +1,51 @@
---
language: de
license: mit
datasets:
- wikipedia
- OPUS
- OpenLegalData
---

# German BERT base

Released in October 2020, this is a German BERT language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka "bert-base-german-dbmdz-cased"). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model and show that it outperforms its predecessors.

## Overview
**Paper:** [here](https://arxiv.org/pdf/2010.10906.pdf)
**Architecture:** BERT base
**Language:** German

## Performance
```
GermEval18 Coarse: 78.17
GermEval18 Fine: 50.90
GermEval14: 87.98
```
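
As a quick usage check (our own sketch, not part of the original card; the example sentence is invented), the checkpoint can be queried with the `transformers` fill-mask pipeline:

```python
from transformers import pipeline

# deepset/gbert-base is a BERT-style masked LM, so [MASK] is the mask token.
fill_mask = pipeline("fill-mask", model="deepset/gbert-base")

for prediction in fill_mask("Die Hauptstadt von Deutschland ist [MASK]."):
    print(prediction["token_str"], prediction["score"])
```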

See also:
- deepset/gbert-base
- deepset/gbert-large
- deepset/gelectra-base
- deepset/gelectra-large
- deepset/gelectra-base-generator
- deepset/gelectra-large-generator

## Authors
Branden Chan: `branden.chan [at] deepset.ai`
Stefan Schweter: `stefan [at] schweter.eu`
Timo Möller: `timo.moeller [at] deepset.ai`

## About us


We bring NLP to industry via open source!
Our focus: industry-specific language models & large-scale QA systems.

Some of our work:
- [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
- [FARM](https://github.com/deepset-ai/FARM)
- [Haystack](https://github.com/deepset-ai/haystack/)

Get in touch:
[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Website](https://deepset.ai)
@@ -0,0 +1,54 @@
---
language: de
license: mit
datasets:
- wikipedia
- OPUS
- OpenLegalData
- OSCAR
---

# German BERT large

Released in October 2020, this is a German BERT language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka "bert-base-german-dbmdz-cased"). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model and show that it outperforms its predecessors.

## Overview
**Paper:** [here](https://arxiv.org/pdf/2010.10906.pdf)
**Architecture:** BERT large
**Language:** German

## Performance
```
GermEval18 Coarse: 80.08
GermEval18 Fine: 52.48
GermEval14: 88.16
```
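
A minimal fine-tuning sketch (our own illustration, not from the card): the checkpoint can serve as the backbone of a sequence classifier, e.g. for GermEval18-style tasks. The `num_labels=2` below is a placeholder assumption:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load gbert-large with a freshly initialized classification head.
tokenizer = AutoTokenizer.from_pretrained("deepset/gbert-large")
model = AutoModelForSequenceClassification.from_pretrained(
    "deepset/gbert-large",
    num_labels=2,  # placeholder; set to your task's number of classes
)

inputs = tokenizer("Das ist ein Beispielsatz.", return_tensors="pt")
logits = model(**inputs).logits  # the head is untrained: fine-tune before use
print(logits.shape)  # torch.Size([1, 2])
```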

See also:
- deepset/gbert-base
- deepset/gbert-large
- deepset/gelectra-base
- deepset/gelectra-large
- deepset/gelectra-base-generator
- deepset/gelectra-large-generator

## Authors
Branden Chan: `branden.chan [at] deepset.ai`
Stefan Schweter: `stefan [at] schweter.eu`
Timo Möller: `timo.moeller [at] deepset.ai`

## About us


We bring NLP to industry via open source!
Our focus: industry-specific language models & large-scale QA systems.

Some of our work:
- [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
- [FARM](https://github.com/deepset-ai/FARM)
- [Haystack](https://github.com/deepset-ai/haystack/)

Get in touch:
[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Website](https://deepset.ai)
@@ -0,0 +1,46 @@
---
language: de
license: mit
datasets:
- wikipedia
- OPUS
- OpenLegalData
---

# German ELECTRA base generator

Released in October 2020, this is the generator component of the German ELECTRA language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka "bert-base-german-dbmdz-cased"). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model.

The generator is useful for performing masking experiments. If you are looking for a regular language model for embedding extraction, or for downstream tasks like NER, classification, or QA, please use deepset/gelectra-base instead.
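
As a rough illustration of such a masking experiment (our own sketch, with an invented example sentence), the generator can be queried through the fill-mask pipeline:

```python
from transformers import pipeline

# The ELECTRA generator is a small masked LM, so fill-mask works directly.
fill_mask = pipeline("fill-mask", model="deepset/gelectra-base-generator")

for prediction in fill_mask("Heute ist ein schöner [MASK]."):
    print(prediction["token_str"], prediction["score"])
```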

## Overview
**Paper:** [here](https://arxiv.org/pdf/2010.10906.pdf)
**Architecture:** ELECTRA base (generator)
**Language:** German

See also:
- deepset/gbert-base
- deepset/gbert-large
- deepset/gelectra-base
- deepset/gelectra-large
- deepset/gelectra-base-generator
- deepset/gelectra-large-generator

## Authors
Branden Chan: `branden.chan [at] deepset.ai`
Stefan Schweter: `stefan [at] schweter.eu`
Timo Möller: `timo.moeller [at] deepset.ai`

## About us


We bring NLP to industry via open source!
Our focus: industry-specific language models & large-scale QA systems.

Some of our work:
- [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
- [FARM](https://github.com/deepset-ai/FARM)
- [Haystack](https://github.com/deepset-ai/haystack/)

Get in touch:
[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Website](https://deepset.ai)
@@ -0,0 +1,51 @@
---
language: de
license: mit
datasets:
- wikipedia
- OPUS
- OpenLegalData
---

# German ELECTRA base

Released in October 2020, this is a German ELECTRA language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka "bert-base-german-dbmdz-cased"). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model. Our evaluation suggests that this model is somewhat undertrained; for best performance from a base-sized model, we recommend deepset/gbert-base.

## Overview
**Paper:** [here](https://arxiv.org/pdf/2010.10906.pdf)
**Architecture:** ELECTRA base (discriminator)
**Language:** German

## Performance
```
GermEval18 Coarse: 76.02
GermEval18 Fine: 42.22
GermEval14: 86.02
```
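
Since the discriminator is the piece recommended for embedding extraction, here is a minimal sketch of that use (our own example, not from the card; the sentence is invented):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Token-embedding extraction with the ELECTRA discriminator body.
tokenizer = AutoTokenizer.from_pretrained("deepset/gelectra-base")
model = AutoModel.from_pretrained("deepset/gelectra-base")

inputs = tokenizer("Ein kurzer deutscher Beispielsatz.", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (batch, seq_len, hidden)
print(hidden.shape)
```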

See also:
- deepset/gbert-base
- deepset/gbert-large
- deepset/gelectra-base
- deepset/gelectra-large
- deepset/gelectra-base-generator
- deepset/gelectra-large-generator

## Authors
Branden Chan: `branden.chan [at] deepset.ai`
Stefan Schweter: `stefan [at] schweter.eu`
Timo Möller: `timo.moeller [at] deepset.ai`

## About us


We bring NLP to industry via open source!
Our focus: industry-specific language models & large-scale QA systems.

Some of our work:
- [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
- [FARM](https://github.com/deepset-ai/FARM)
- [Haystack](https://github.com/deepset-ai/haystack/)

Get in touch:
[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Website](https://deepset.ai)
@@ -0,0 +1,56 @@
---
language: de
license: mit
datasets:
- wikipedia
- OPUS
- OpenLegalData
- OSCAR
---

# German ELECTRA large generator

Released in October 2020, this is the generator component of the German ELECTRA language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka "bert-base-german-dbmdz-cased"). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model.

The generator is useful for performing masking experiments. If you are looking for a regular language model for embedding extraction, or for downstream tasks like NER, classification, or QA, please use deepset/gelectra-large instead.
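
A lower-level masking-experiment sketch (our own, with an invented example sentence), scoring candidates for a masked position directly from the generator logits:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepset/gelectra-large-generator")
model = AutoModelForMaskedLM.from_pretrained("deepset/gelectra-large-generator")

text = f"Berlin ist die {tokenizer.mask_token} von Deutschland."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and print the top-5 candidate tokens.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
top5 = torch.topk(logits[0, mask_pos], k=5).indices
print(tokenizer.convert_ids_to_tokens(top5.tolist()))
```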

## Overview
**Paper:** [here](https://arxiv.org/pdf/2010.10906.pdf)
**Architecture:** ELECTRA large (generator)
**Language:** German

## Performance
```
GermEval18 Coarse: 80.70
GermEval18 Fine: 55.16
GermEval14: 88.95
```

See also:
- deepset/gbert-base
- deepset/gbert-large
- deepset/gelectra-base
- deepset/gelectra-large
- deepset/gelectra-base-generator
- deepset/gelectra-large-generator

## Authors
Branden Chan: `branden.chan [at] deepset.ai`
Stefan Schweter: `stefan [at] schweter.eu`
Timo Möller: `timo.moeller [at] deepset.ai`

## About us


We bring NLP to industry via open source!
Our focus: industry-specific language models & large-scale QA systems.

Some of our work:
- [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
- [FARM](https://github.com/deepset-ai/FARM)
- [Haystack](https://github.com/deepset-ai/haystack/)

Get in touch:
[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Website](https://deepset.ai)
@@ -0,0 +1,52 @@
---
language: de
license: mit
datasets:
- wikipedia
- OPUS
- OpenLegalData
- OSCAR
---

# German ELECTRA large

Released in October 2020, this is a German ELECTRA language model trained collaboratively by the makers of the original German BERT (aka "bert-base-german-cased") and the dbmdz BERT (aka "bert-base-german-dbmdz-cased"). In our [paper](https://arxiv.org/pdf/2010.10906.pdf), we outline the steps taken to train our model and show that it is the state-of-the-art German language model.

## Overview
**Paper:** [here](https://arxiv.org/pdf/2010.10906.pdf)
**Architecture:** ELECTRA large (discriminator)
**Language:** German

## Performance
```
GermEval18 Coarse: 80.70
GermEval18 Fine: 55.16
GermEval14: 88.95
```
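
As an illustrative (non-official) starting point for a downstream task such as GermEval14-style NER, the discriminator can be loaded with a token-classification head; the label count below is a placeholder assumption:

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepset/gelectra-large")
model = AutoModelForTokenClassification.from_pretrained(
    "deepset/gelectra-large",
    num_labels=9,  # placeholder, e.g. a BIO tag set; adjust to your data
)

inputs = tokenizer("Angela Merkel besuchte Paris.", return_tensors="pt")
logits = model(**inputs).logits  # the head is untrained: fine-tune first
print(logits.shape)  # (1, seq_len, num_labels)
```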

See also:
- deepset/gbert-base
- deepset/gbert-large
- deepset/gelectra-base
- deepset/gelectra-large
- deepset/gelectra-base-generator
- deepset/gelectra-large-generator

## Authors
Branden Chan: `branden.chan [at] deepset.ai`
Stefan Schweter: `stefan [at] schweter.eu`
Timo Möller: `timo.moeller [at] deepset.ai`

## About us


We bring NLP to industry via open source!
Our focus: industry-specific language models & large-scale QA systems.

Some of our work:
- [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
- [FARM](https://github.com/deepset-ai/FARM)
- [Haystack](https://github.com/deepset-ai/haystack/)

Get in touch:
[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Website](https://deepset.ai)
Review comment: dataset ids should be lowercase; not sure if we post-process those ids in model cards.