chore(nlp): Bulgarian added to contentful #1250

elozano98 · 2021-01-15T16:02:40Z

Depends on #1249. Please review it first. ⚠️

Description

A Bulgarian tokenizer, stemmer, and stopword list have been added to the NLP module of the Contentful plugin.

Context

Adding them will make it possible to process Bulgarian text.

Approach taken / Explain the design

The tokenizer is the base-tokenizer from the nlpjs library.
The stemmer was implemented using this GitHub repository as a reference.
The stopwords were collected from here.
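
To illustrate how the three pieces fit together, here is a minimal sketch of a tokenize, filter, stem pipeline. It is not the code from this pull request: the names `tokenizeBg`, `stemBg` and `processBg` are hypothetical, the stopword and suffix lists are tiny illustrative samples, and a plain punctuation split stands in for the nlpjs base-tokenizer.

```typescript
// Illustrative sample only: a real stopword list is far longer.
const STOPWORDS_BG: readonly string[] = ['и', 'е', 'на', 'в', 'за', 'с', 'от']

// A few definite-article suffixes (assumed for this sketch), longest first
// so that the longest matching suffix is stripped.
const SUFFIXES_BG: readonly string[] = ['ият', 'ята', 'ите', 'ът', 'та', 'то', 'те']

// Stand-in tokenizer: lowercases and splits on whitespace and common punctuation.
function tokenizeBg(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[\s.,;:!?()"'-]+/)
    .filter(token => token.length > 0)
}

// Naive suffix-stripping stemmer: removes at most one suffix and keeps
// at least three characters of the word to avoid over-stripping.
function stemBg(token: string): string {
  for (const suffix of SUFFIXES_BG) {
    const stemLength = token.length - suffix.length
    if (stemLength >= 3 && token.slice(-suffix.length) === suffix) {
      return token.slice(0, stemLength)
    }
  }
  return token
}

// Full pipeline: tokenize, drop stopwords, stem what remains.
function processBg(text: string): string[] {
  return tokenizeBg(text)
    .filter(token => STOPWORDS_BG.indexOf(token) === -1)
    .map(stemBg)
}
```

With these sample lists, `processBg('Книгата е на масата.')` returns `['книга', 'маса']`: the stopwords "е" and "на" are dropped and the definite-article suffix "та" is stripped from the two nouns.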

Testing

The pull request...

has unit tests

vanbasten17

good job!! 💯 💯 💯

ericmarcos

https://www.youtube.com/watch?v=H2-6ochfjzk

codecov · 2021-01-15T16:06:36Z

Codecov Report

Merging #1250 (38e4179) into contentful/nl (a269138) will increase coverage by 0.11%.
The diff coverage is 88.00%.

@@                Coverage Diff                @@
##           contentful/nl    #1250      +/-   ##
=================================================
+ Coverage          64.62%   64.73%   +0.11%     
=================================================
  Files                232      236       +4     
  Lines               6443     6469      +26     
  Branches            1115     1118       +3     
=================================================
+ Hits                4164     4188      +24     
  Misses              1966     1966              
- Partials             313      315       +2

Flag	Coverage Δ
botonic-plugin-contentful	`71.32% <88.00%> (+0.14%)`	⬆️

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
...nlp/stemmers/transformations/transformations-bg.ts	`100.00% <ø> (ø)`
...c-plugin-contentful/src/nlp/stemmers/stemmer-bg.ts	`81.25% <81.25%> (ø)`
...kages/botonic-plugin-contentful/src/nlp/locales.ts	`100.00% <100.00%> (ø)`
...kages/botonic-plugin-contentful/src/nlp/stemmer.ts	`95.91% <100.00%> (+0.17%)`	⬆️
...lugin-contentful/src/nlp/stopwords/stopwords-bg.ts	`100.00% <100.00%> (ø)`
...ugin-contentful/src/nlp/tokenizers/tokenizer-bg.ts	`100.00% <100.00%> (ø)`
...ckages/botonic-plugin-contentful/src/nlp/tokens.ts	`96.07% <100.00%> (+0.11%)`	⬆️
... and 2 more

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a269138...38e4179.

elozano98 added 3 commits January 15, 2021 16:55

chore(nlp): tokenizer, stemmer and stopwords added.

37e9e07

chore(nlp): stemmer and normalizer tests implemented.

de2aa3d

chore(nlp): comment with the source of the stopwords.

38e4179

elozano98 requested review from asastre, dpinol, vanbasten17 and ericmarcos and removed request for asastre and dpinol January 15, 2021 16:02

elozano98 changed the base branch from master to contentful/nl January 15, 2021 16:03

vanbasten17 approved these changes Jan 15, 2021

ericmarcos approved these changes Jan 15, 2021

elozano98 requested a review from dpinol January 15, 2021 16:08

Base automatically changed from contentful/nl to master January 18, 2021 08:42

elozano98 merged commit 56a4557 into master Jan 18, 2021

elozano98 deleted the contentful/bg branch January 18, 2021 08:42

elozano98 added the documentation Documentation changes label Jan 19, 2021

snyk-bot mentioned this pull request Apr 5, 2023

[Snyk] Upgrade @nlpjs/lang-es from 4.22.0 to 4.26.1 #2329

Closed

manuelfidalgo mentioned this pull request Apr 19, 2023

[Snyk] Upgrade @nlpjs/ner from 4.22.0 to 4.26.1 #2338

Closed

This was referenced Apr 19, 2023

[Snyk] Upgrade @nlpjs/lang-it from 4.22.0 to 4.26.1 #2341

Closed

[Snyk] Upgrade @nlpjs/lang-en-min from 4.22.0 to 4.26.1 #2342

Closed

[Snyk] Upgrade @nlpjs/lang-ru from 4.22.0 to 4.26.1 #2343

Closed

This was referenced Apr 21, 2023

[Snyk] Upgrade @nlpjs/lang-fr from 4.22.0 to 4.26.1 #2350

Merged

[Snyk] Upgrade @nlpjs/lang-de from 4.22.0 to 4.26.1 #2351

Merged

castledom04 mentioned this pull request Apr 26, 2023

[Snyk] Upgrade @nlpjs/lang-de from 4.22.0 to 4.26.1 #2369

Closed

This was referenced May 10, 2023

[Snyk] Upgrade @nlpjs/lang-ca from 4.22.0 to 4.26.1 #2427

Closed

[Snyk] Upgrade @nlpjs/core from 4.22.0 to 4.26.1 #2428

Closed

[Snyk] Upgrade @nlpjs/lang-el from 4.22.0 to 4.26.1 #2429

Closed

This was referenced May 17, 2023

[Snyk] Upgrade @nlpjs/lang-fr from 4.22.0 to 4.26.1 #2481

Closed

[Snyk] Upgrade @nlpjs/lang-cs from 4.22.0 to 4.26.1 #2485

Closed

[Snyk] Upgrade @nlpjs/lang-hu from 4.22.0 to 4.26.1 #2486

Closed
